Gaia Data Release 3 documentation

6.1 Introduction

6.1.2 Overview of the spectroscopic processing

The RVS spectroscopic processing pipeline is the result of the work of CU6 (Section 1.2.2): the scientific teams provided the scientific codes to DPCC (CNES), which integrated them into the SAGA (System of Accommodation of Gaia Algorithms) host framework, developed by Thales. The RVS pipeline was run at DPCC (Section 1.3.5) on a Hadoop cluster with 2500 cores and 17 TB of RAM. The processing took approximately 3 million CPU hours and about 120 days of wall-clock time, and needed 300 TB of disk space.
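The resource figures above can be cross-checked with a little arithmetic. The sketch below is only illustrative: it treats the quoted (approximate) numbers as exact and assumes that all 2500 cores were available for the whole run, which the text does not state. The names `ideal_wall_clock_days` and `mean_core_occupancy` are introduced here for illustration only.

```python
# Figures quoted in the text (all approximate).
CPU_HOURS = 3_000_000      # total CPU time, in hours
CORES = 2500               # cores of the Hadoop cluster
WALL_CLOCK_DAYS = 120      # elapsed ("real") time, in days


def ideal_wall_clock_days(cpu_hours: float, cores: int) -> float:
    """Elapsed days if every core were busy for the whole run (an idealisation)."""
    return cpu_hours / cores / 24.0


ideal_days = ideal_wall_clock_days(CPU_HOURS, CORES)

# Ratio of the idealised duration to the quoted one: a rough estimate of the
# average fraction of cores kept busy, under the assumptions stated above.
mean_core_occupancy = ideal_days / WALL_CLOCK_DAYS

print(f"ideal duration: {ideal_days:.0f} days")
print(f"mean core occupancy: {mean_core_occupancy:.0%}")
```

Under these assumptions the run would have needed about 50 days with every core permanently busy, so the quoted 120 days corresponds to an average core occupancy of roughly 42%.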

The technical design of the DR2 RVS pipeline was completely revised to make it possible to process far more data (2.8 billion spectra in DR3 against 280 million in DR2) and to avoid storing voluminous intermediate data. The DR3 pipeline is composed of the six specialised pipelines shown in Figure 6.1.

Figure 6.1: The RVS pipeline. The DR3 pipeline is composed of six specialised pipelines which run independently but require as input the products of the upstream pipeline. SourceInit and EpochInit are technical pipelines, mostly in charge of preparing the data to be processed by the downstream pipelines. The EpochInit, Scatter, Calibration and FullExtraction pipelines process the data of each Trending Epoch per transit or per calibration unit (i.e. a dataset covering a fixed observation time interval and containing the calibration stars and the standard stars). STAMTA, instead, processes the data per source (the observations of the source in all the epochs are treated together). All the scientific pipelines end with the Automated Verification of their products (Section 6.5.1). The pipeline functionalities are described in Section 6.4. Figure by Antoine Guerrier.
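The structure described in the caption can be summarised as a small data model. This is a minimal sketch, not the SAGA implementation. It makes two assumptions: the strictly linear order below is inferred from the order in which the caption names the pipelines, and the four pipelines that the caption does not call technical are labelled as scientific. The `Pipeline` class and the helper functions are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Pipeline:
    name: str
    kind: str                   # "technical" or "scientific" (partly assumed)
    granularity: Optional[str]  # unit of work, where the caption states it


# ASSUMPTION: linear upstream-to-downstream order inferred from the caption.
PIPELINES: List[Pipeline] = [
    Pipeline("SourceInit", "technical", None),  # granularity not stated
    Pipeline("EpochInit", "technical", "transit or calibration unit"),
    Pipeline("Scatter", "scientific", "transit or calibration unit"),
    Pipeline("Calibration", "scientific", "transit or calibration unit"),
    Pipeline("FullExtraction", "scientific", "transit or calibration unit"),
    Pipeline("STAMTA", "scientific", "source"),
]


def upstream(name: str) -> Optional[str]:
    """Name of the pipeline whose products `name` needs, or None for the first."""
    names = [p.name for p in PIPELINES]
    index = names.index(name)
    return names[index - 1] if index > 0 else None


def ends_with_automated_verification(pipeline: Pipeline) -> bool:
    """The caption states that the scientific pipelines end with this step."""
    return pipeline.kind == "scientific"


for p in PIPELINES:
    print(p.name, "<-", upstream(p.name), "| per", p.granularity)
```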

New scientific functionalities introduced in the DR3 RVS pipeline

The main goal of the DR3 spectroscopic pipeline is to estimate the radial_velocity of as many stars as possible observed by the RVS and brighter than $G_{\rm RVS}^{\rm ext} = 14$ mag. This magnitude limit increases with each data release: it was $G_{\rm RVS}^{\rm ext} = 12$ mag in DR2, and it is expected to be $G_{\rm RVS}^{\rm ext} = 16$ mag in DR4. This is because with each data release the number of observations per source increases, and so does the expected signal-to-noise ratio of the combined spectra, which makes it possible to measure the radial_velocity of fainter stars. Listed here are the new scientific functionalities implemented in the pipeline, which made it possible to accomplish the DR3 objectives.

  • Radial velocities of faint stars: the radial velocities of the faint stars (fainter than grvs_mag = 12 mag, but brighter than $G_{\rm RVS}^{\rm ext} = 14$ mag) are estimated, thanks to the implementation of rv_method_used=2 in the STAMTA pipeline (Section 6.4.6). For the sources fainter than grvs_mag = 12 mag, the radial_velocity is estimated using the combined epoch cross-correlation functions (C-function, Section 6.4.8) rather than the epoch radial velocities, which are imprecise.

  • Straylight calibration (Section 6.4.3): the straylight map is estimated every 30 hours, using the borders of the spectra of the faint stars in addition to the Virtual Objects (VOs; Section 1.1.3). In DR2 a single straylight map was computed off-line using only the Ecliptic Poles Scanning Law (Section 1.1.4) data, and was used to process all the data. A better estimate of the straylight allows a better estimate of grvs_mag for the faint stars.

  • Deblending of the CCD spectra: the transit spectra contaminated by nearby sources with an RVS window are deblended and used to obtain the radial_velocity and the rvs_mean_spectrum of the source, whereas in DR2 they were removed from the pipeline. Deblending yields a larger number of epoch spectra per source, increases the SNR of the combined CCF and makes it possible to estimate the radial velocity of fainter stars. It also partly mitigates the contamination problems shown by some rectangular truncated windows in DR2, reported by Boubert et al. (2019). The deblending algorithm is described in Seabroke et al. (2021), their Section 2.5.

  • Calibration of the Line Spread Function (LSF): the along-scan LSF (LSF-AL) profile, and the across-scan LSF (LSF-AC) profile and peak position (AC-Peak), are calibrated (Section 6.3.4). The LSF-AL calibration has reduced the systematic shifts between the two FoVs which were present in the wavelength calibration zero point and were reflected in the epoch radial velocities, while the LSF-AC and AC-Peak are needed by the deblending implementation and by the estimation of the flux lost outside the window, which is taken into account in the computation of grvs_mag.

  • Broadening velocities: vbroad is computed for each transit (Section 6.4.8) by all the STA methods. The epoch vbroad values produced by RVFou (Section 6.4.8) are used to derive the median vbroad published in DR3.

  • Contamination from nearby sources without an RVS window: in DR2 the contamination from sources without an RVS window was not considered. In DR3, the contamination from sources without an RVS window and brighter than $G_{\rm RVS}^{\rm ext} = 15$ mag is taken into account as follows: when the contaminant lies in a relevant area around the target (about 2500 × 20 pixels) and its magnitude differs from that of the target by less than 3 mag, the target CCD spectrum is removed from the pipeline.

  • Quality indicators based on the combined cross-correlation functions were developed and used in Post-processing (Section 6.5.2) to identify spurious radial velocities and flag them as invalid.

  • Magnitude used for data selection: the data for the processing steps are selected using the magnitude $G_{\rm RVS}^{\rm ext}$, obtained from the DR2 $G$ and $G_{\rm RP}$ magnitudes, and no longer using the Initial Gaia Source List (IGSL) magnitude. When $G$ and $G_{\rm RP}$ are available, the transformation used is (Gaia Collaboration et al. 2018a):

    For $G - G_{\rm RP} < 1.4$ mag:

    For $1.4 \le G - G_{\rm RP} < 1.7$ mag:

    For about 2.5 % of the stars treated by the pipeline, GRP was not available (due to the sourceid changes since DR2) and the on-board magnitude was used. Note: grvs_mag is computed in the last processing step (MTA, Section 6.4.9), and it is not available for upstream data selection.
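The magnitude-based selection described in the last item can be sketched as follows. This is an illustration only, not the pipeline code. The polynomial coefficients are an assumption: they are quoted from memory of the colour transformation in Gaia Collaboration et al. (2018a) and should be checked against that paper before any use. The function names, the handling of colours of 1.7 mag or redder (not covered by the text, so the sketch returns `None`) and the strict inequality at the 14 mag limit are likewise choices made for this example.

```python
from typing import Optional, Sequence

# Coefficients of (G_RVS^ext - G_RP) as a polynomial in (G - G_RP),
# lowest order first.
# ASSUMPTION: values recalled from Gaia Collaboration et al. (2018a);
# verify against the paper.
COEFFS_BLUE: Sequence[float] = (0.042319, -0.65124, 1.0215, -1.3947, 0.53768)
COEFFS_RED: Sequence[float] = (132.32, -377.28, 402.32, -190.97, 34.026)

SELECTION_LIMIT_MAG = 14.0  # DR3 faint limit quoted in the text


def poly(coeffs: Sequence[float], x: float) -> float:
    """Evaluate sum(c_k * x**k) with Horner's scheme."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def grvs_ext(g: float, g_rp: float) -> Optional[float]:
    """Estimate G_RVS^ext from G and G_RP using the two colour ranges in the text."""
    colour = g - g_rp
    if colour < 1.4:   # the text gives no lower colour bound for this range
        return g_rp + poly(COEFFS_BLUE, colour)
    if colour < 1.7:
        return g_rp + poly(COEFFS_RED, colour)
    return None        # redder colours are not covered by the text


def selection_magnitude(g: float, g_rp: Optional[float],
                        onboard_mag: float) -> Optional[float]:
    """Use G_RVS^ext when G_RP is available, else the on-board magnitude."""
    if g_rp is None:
        return onboard_mag
    return grvs_ext(g, g_rp)


def is_selected(mag: Optional[float]) -> bool:
    """True if the source is brighter than the DR3 limit."""
    return mag is not None and mag < SELECTION_LIMIT_MAG


print(grvs_ext(10.0, 9.0))
print(is_selected(selection_magnitude(10.0, None, 14.6)))
```

With these coefficients the two polynomials agree to within a few millimagnitudes at $G - G_{\rm RP} = 1.4$ mag, which is a useful sanity check on the transcription.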