# 6.4.5 FullExtraction

FullExtraction is the pipeline in which the RVS spectra are cleaned and calibrated and bad data are filtered out. All the SpectroObservation, regardless of their magnitude and window geometry, are processed, together with the SpectroObservationVo and the spurious-detection transits. The faint spectra ($G_{\mathrm{RVS}}^{\rm{ext}}>14$ mag), the spurious-detection transits and the VOs are removed only after the star spectra have been deblended.

Like the Scatter and Calibration pipelines, FullExtraction starts with a Preprocessing step. The astrometric information (AGIS coordinates, OGA3 and the Gaia Ephemeris) is used to associate with each sample of the spectrum the FoV coordinates $(\eta,\zeta)$ of the source at the time when the sample crosses the CCD fiducial line, and the barycentric velocity correction (Section 3.4.6) is associated with each spectrum. The contaminant-transits are also identified: these are the transits of the contaminant-sources selected in SourceInit that are brighter than $G_{\mathrm{RVS}}^{\rm{ext}}\sim 15$ mag and have no RVS window. Some of the processing steps applied to the RVS spectra are similar to those applied to the calibrators during the Calibration Preparation. The major differences are that in FullExtraction the Calibration results are applied, all the RVS spectra are treated, the blended spectra are deblended, the criteria for rejecting spectra are less strict, and the epoch $G_{\rm RVS}$ is estimated.

• Bias and bias non-uniformity are first removed using the appropriate calibration coefficients (see Section 6.3.1).

• Saturated samples (65 535 ADU) are flagged. The spectra presenting more than 40 saturated samples are later removed from the pipeline.

• The fixed CCD gain, measured on ground, is applied.

• The dark current, measured on ground, is subtracted (see Section 6.3.2).

• The straylight background is removed by subtracting the flux from the cell in the RVS straylight map that corresponds to the processing window’s position, time and pixel sampling. Spectra whose total flux becomes negative because too much background was subtracted are removed from the pipeline, as are spectra for which the background is higher than 100 electrons pixel${}^{-1}$ s${}^{-1}$, or for which the background is higher than 40 electrons pixel${}^{-1}$ s${}^{-1}$ with an uncertainty higher than 0.4 electrons pixel${}^{-1}$ s${}^{-1}$.

• The flux loss outside the window is estimated using the LSF-AC calibration model obtained in the Calibration pipeline.

• Spectra that contain any columns from the cosmetic defect list (see Section 6.3.2) are flagged and removed from the pipeline.

• Spectra having a nearby relatively bright contaminant (i.e. a source with no RVS window, with a magnitude brighter than the target source magnitude + 3 mag, and within a relevant area of about 2500 AL $\times$ 20 AC pixels around the target source) are flagged and filtered from the pipeline.

• Cosmic rays are removed from both 2D and 1D windows. If pixels are saturated due to cosmic rays and the cosmic ray is successfully removed, the saturation flag is turned off. The samples affected by cosmic rays are flagged; if their number is $\geq 100$, the spectrum is removed from the pipeline, as are the spectra with more than 40 saturated samples.

• 2D windows are optimally collapsed into 1D spectra if there are no saturated pixels, otherwise the 2D windows are collapsed into 1D spectra with a simple summing in AC.

• The spectra that are blended with one or more other source spectra and have a truncated window are deblended. The deblending algorithm is described in Seabroke et al. (2021, Section 2.5). The samples that are badly deblended (based on the Reciprocal Condition Number of the deblending matrix) are flagged. Spectra that require deblending but could not be deblended are removed from the pipeline: more than half of the blended spectra (about 600 million) are removed for this reason. After deblending, only the star spectra with $G_{\mathrm{RVS}}^{\rm{ext}}\leq 14$ mag are kept. At the end of the FullExtraction pipeline, $\sim$70 % of the spectra are non-blended and $\sim$30 % are deblended.

• The trended wavelength coefficients are applied to the field angles, and the wavelength range is cut to 846–870 nm to avoid the RVS filter wings.

• The internal $G_{\rm RVS}$ magnitude is estimated from the flux of the spectrum integrated between 846 and 870 nm.

• The RVS filter response, measured on ground, is removed.

• The spectra are normalised: the spectra of the bright stars are normalised to their pseudo-continuum using a $2^{\mathrm{nd}}$-degree polynomial fit. The stellar lines are iteratively rejected using sigma-clipping with the interval $[-3,+10]\sigma$. For the faint stars ($G_{\mathrm{RVS}}^{\rm{ext}}>12$ mag) and for the very cool stars that show the molecular TiO band in their spectrum, the polynomial is replaced by a constant equal to the median of the fluxes.

• The emission-line spectra are detected and flagged (for these spectra, the radial velocity is not estimated). The spectra presenting artificial jumps and those presenting gradients are also flagged.

• The atmospheric parameters are determined (see Section 6.4.5) for the stars for which no information is available from the input data (Section 6.2.3 or Section 6.2.1).
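The numerical rejection criteria quoted in the list above can be gathered into a single check. The following is only a minimal sketch for illustration: the `SpectrumSummary` record, its field names and the `rejection_reasons` function are hypothetical and are not part of the pipeline, which applies these tests at different stages rather than all at once. Only the thresholds are taken from the text.

```python
from dataclasses import dataclass
from typing import List

# Thresholds quoted in the text.
MAX_SATURATED_SAMPLES = 40         # more than this: spectrum removed
COSMIC_RAY_SAMPLE_LIMIT = 100      # this many or more: spectrum removed
MAX_BACKGROUND = 100.0             # electrons pixel^-1 s^-1
MAX_UNCERTAIN_BACKGROUND = 40.0    # electrons pixel^-1 s^-1
MAX_BACKGROUND_UNCERTAINTY = 0.4   # electrons pixel^-1 s^-1


@dataclass
class SpectrumSummary:
    """Hypothetical per-spectrum summary; all field names are illustrative."""
    n_saturated: int               # saturated samples left after cosmic-ray removal
    n_cosmic_ray: int              # samples flagged as affected by cosmic rays
    total_flux: float              # total flux after background subtraction
    background: float              # subtracted straylight background
    background_uncertainty: float
    has_cosmetic_defect: bool      # window contains a column from the defect list
    has_bright_contaminant: bool   # nearby bright source without an RVS window
    needs_deblending: bool
    deblended: bool


def rejection_reasons(s: SpectrumSummary) -> List[str]:
    """Return the list of criteria under which the spectrum would be removed."""
    reasons = []
    if s.n_saturated > MAX_SATURATED_SAMPLES:
        reasons.append("saturation")
    if s.n_cosmic_ray >= COSMIC_RAY_SAMPLE_LIMIT:
        reasons.append("cosmic_rays")
    if s.total_flux < 0.0:
        reasons.append("negative_flux")
    if s.background > MAX_BACKGROUND or (
        s.background > MAX_UNCERTAIN_BACKGROUND
        and s.background_uncertainty > MAX_BACKGROUND_UNCERTAINTY
    ):
        reasons.append("background")
    if s.has_cosmetic_defect:
        reasons.append("cosmetic_defect")
    if s.has_bright_contaminant:
        reasons.append("contaminant")
    if s.needs_deblending and not s.deblended:
        reasons.append("not_deblended")
    return reasons
```

An empty list means that none of the quoted criteria would remove the spectrum. The magnitude cut at $G_{\mathrm{RVS}}^{\rm{ext}}\leq 14$ mag applied after deblending is deliberately left out of this sketch.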

The uncertainties associated with all these processes are propagated to each sample in each window/spectrum.
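As an illustration of one of the steps listed above, the pseudo-continuum normalisation can be sketched as follows. This is a simplified sketch under stated assumptions, not the pipeline code: the function name `normalise_spectrum`, the `use_constant` switch and the iteration cap `max_iter` are hypothetical. The clipping interval $[-3,+10]\sigma$ is assumed to apply to the residual (flux minus fitted continuum), so that absorption lines are rejected at $3\sigma$ below the fit and only strong positive outliers at $10\sigma$ above it.

```python
import numpy as np


def normalise_spectrum(wavelength, flux, use_constant=False, max_iter=10):
    """Return (normalised flux, continuum estimate).

    use_constant=True mimics the treatment of faint or TiO-band stars:
    the continuum is a constant equal to the median of the fluxes.
    """
    wavelength = np.asarray(wavelength, dtype=float)
    flux = np.asarray(flux, dtype=float)

    if use_constant:
        continuum = np.full_like(flux, np.median(flux))
        return flux / continuum, continuum

    # Centre the abscissa to keep the polynomial fit well conditioned.
    x = wavelength - wavelength.mean()
    keep = np.ones(flux.size, dtype=bool)

    for _ in range(max_iter):
        # Second-degree polynomial fitted to the currently accepted samples.
        coeffs = np.polyfit(x[keep], flux[keep], deg=2)
        continuum = np.polyval(coeffs, x)

        residual = flux - continuum
        sigma = np.std(residual[keep])
        if sigma == 0.0:
            break

        # Asymmetric sigma-clipping over the interval [-3, +10] sigma.
        new_keep = (residual >= -3.0 * sigma) & (residual <= 10.0 * sigma)
        if new_keep.sum() < 3 or np.array_equal(new_keep, keep):
            break
        keep = new_keep

    return flux / continuum, continuum
```

The asymmetric interval makes the fit follow the upper envelope of the spectrum, since the absorption lines pull a plain least-squares fit below the true continuum.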

The number of CCD spectra successfully treated by the FullExtraction pipeline is $\sim$2 billion, of which $\sim$540 million have been successfully deblended. The number of input spectra removed during the processing is $\sim$855 million: $\sim$600 million were removed because they could not be deblended, and another $\sim$135 million because they were contaminated by a relatively bright source with no RVS window.
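The rounded counts quoted above can be cross-checked with a few lines of arithmetic. The variable names are illustrative, and all figures are the approximate values from the text, in millions of CCD spectra.

```python
# Approximate counts from the text, in millions of CCD spectra.
treated = 2000                 # successfully treated by FullExtraction
deblended = 540                # of which successfully deblended
removed = 855                  # removed during the processing
removed_not_deblended = 600    # removed because deblending failed
removed_contaminated = 135     # removed because of a bright contaminant

# Fraction of the output spectra that were deblended.
deblended_fraction = deblended / treated  # 0.27

# Spectra removed for reasons other than the two quoted explicitly.
removed_other = removed - removed_not_deblended - removed_contaminated  # 120
```

The deblended fraction of about 27 % is consistent with the rounded figure of $\sim$30 % deblended spectra at the end of the pipeline. The remaining $\sim$120 million removed spectra are not broken down in the text; they presumably correspond to the other rejection criteria of the pipeline.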

The $\sim$2 billion cleaned and calibrated spectra successfully produced by FullExtraction are input to the STAMTA pipeline, and used in the single-transit analysis (Section 6.4.8).

## Determine atmospheric parameters

The astrophysical parameters ($T_{\rm eff}$, $\log g$ and $[\rm Fe/H]$) are needed in the further processing to select the appropriate synthetic spectrum (Section 6.2.3) for each observed spectrum. The first choice is the auxiliary data file (Section 6.2.3), which contains the atmospheric parameters for some stars. If this information is lacking for a given star, the atmospheric parameters from the input (Section 6.2.1) are used; these are computed with a preliminary version of Apsis (V3.1) (see Chapter 11 for a description of the Apsis pipeline). For some stars, however, this information is also lacking. For those with $G_{\mathrm{RVS}}^{\rm{ext}}\leq 12$ mag, the astrophysical parameters are determined from the Pearson correlation with all of the templates generated (Section 6.4.8) from the restricted set of synthetic spectra in Section 6.2.3. The template that gives the highest correlation peak determines the parameters for that star; for a more detailed description see Sartoretti et al. (2018, Section 6.5). The solar atmospheric parameters are assigned to the stars with $G_{\mathrm{RVS}}^{\rm{ext}}>12$ mag. The origin of the atmospheric parameters associated with these stars is set to rv_atm_param_origin = 333.
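The fallback described above, for stars without parameters from either input source, can be sketched as follows. This is a simplified illustration under stated assumptions, not the pipeline implementation: the function names, the dictionary of templates keyed by ($T_{\rm eff}$, $\log g$, $[\rm Fe/H]$) and the scan over integer sample shifts (standing in for the search of the correlation peak) are all hypothetical. The solar values are passed in by the caller because the text does not quote them.

```python
import numpy as np


def pearson_peak(observed, template, max_shift):
    """Highest Pearson coefficient over integer sample shifts in [-max_shift, max_shift]."""
    n = observed.size
    peak = -1.0
    for shift in range(-max_shift, max_shift + 1):
        # Compare only the overlapping parts of the two arrays.
        if shift >= 0:
            a, b = observed[shift:], template[:n - shift]
        else:
            a, b = observed[:n + shift], template[-shift:]
        r = np.corrcoef(a, b)[0, 1]
        if not np.isnan(r) and r > peak:
            peak = r
    return peak


def select_parameters(observed, templates, max_shift=5):
    """Return (parameters, peak) of the template with the highest correlation peak.

    templates: dict mapping (teff, logg, feh) tuples to template flux arrays
    sampled on the same grid as the observed spectrum.
    """
    observed = np.asarray(observed, dtype=float)
    best_params, best_peak = None, -np.inf
    for params, template in templates.items():
        peak = pearson_peak(observed, np.asarray(template, dtype=float), max_shift)
        if peak > best_peak:
            best_params, best_peak = params, peak
    return best_params, best_peak


def assign_parameters(grvs_ext, observed, templates, solar_params, max_shift=5):
    """Template matching for stars with G_RVS^ext <= 12 mag, solar values otherwise."""
    if grvs_ext > 12.0:
        return solar_params
    params, _ = select_parameters(observed, templates, max_shift)
    return params
```

Scanning over shifts matters because the observed spectrum is Doppler-shifted with respect to the rest-frame templates, so the correlation at zero shift alone could favour the wrong template.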