Author(s): Enrique Utrilla
The on-ground processing of the wealth of data produced by the Gaia spacecraft is complex and yields a number of complementary data products that other members of the DPAC can use in later processing cycles to further refine their own data. Consequently, some of the data produced by different teams, and at different times over the operational period of the mission, must be integrated into comprehensive reference libraries and added to the Main Data Base (MDB), so that they are available to all DPAC groups that may need them. This process is described in Section 13.2.1.
These integrated records and other MDB data products are the foundation of the data published in the Gaia Catalogue. Nevertheless, they contain data at different stages of maturity and quality, in a format that is convenient for internal DPAC processing but not suitable for storage in a database supporting queries from the scientific community. As an intermediate step in preparing for publication, these data are filtered to remove, as far as possible, data that do not meet some minimum quality criteria, and are converted to a format more convenient for presentation. The consolidated data are then ingested into a DPAC-internal instance of the Gaia Archive database and subjected to scientific validation in order to detect potential conversion errors, consistency issues, and problems in the data themselves. This process is described in Section 13.2.3.
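The filter-then-convert step can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the record fields, the quality threshold, and the output column names are hypothetical and do not reflect the actual MDB data model or the DPAC quality criteria.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, Optional


@dataclass
class MdbRecord:
    """Hypothetical internal record; field names are illustrative only."""
    source_id: int
    parallax_mas: Optional[float]
    parallax_error_mas: Optional[float]
    n_observations: int


def passes_quality_filter(rec: MdbRecord, min_observations: int = 5) -> bool:
    """Assumed minimum quality criteria: complete astrometry, a positive
    uncertainty, and enough observations. The threshold is made up."""
    if rec.parallax_mas is None or rec.parallax_error_mas is None:
        return False
    if rec.parallax_error_mas <= 0:
        return False
    return rec.n_observations >= min_observations


def to_archive_row(rec: MdbRecord) -> Dict[str, object]:
    """Convert an internal record into a flat row suited to a database table.
    The 'validation_' column stands in for a field kept only for validation."""
    return {
        "source_id": rec.source_id,
        "parallax": rec.parallax_mas,
        "parallax_error": rec.parallax_error_mas,
        "validation_n_obs": rec.n_observations,
    }


def filter_and_convert(records: Iterable[MdbRecord]) -> Iterator[Dict[str, object]]:
    """Yield archive rows only for records that pass the quality filter."""
    for rec in records:
        if passes_quality_filter(rec):
            yield to_archive_row(rec)
```

Keeping the filter and the converter as separate functions mirrors the text: each can be refined independently when validation uncovers a problem in one of them.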
This conversion-ingestion-validation cycle is repeated a number of times as new data become available, in order to refine both the conversion software and the quality filters. In parallel, some additional support datasets are computed, such as crossmatches between the sources in the new Gaia data and selected external catalogues.
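To make the idea of a crossmatch concrete, the following sketch pairs each Gaia source with its nearest external source within a fixed angular radius. It is a brute-force, purely positional illustration under assumed inputs (tuples of identifier, right ascension, and declination in degrees) and an assumed 1 arcsec radius; the algorithm actually used for the published crossmatches is not described in this section and is likely more elaborate.

```python
import math
from typing import List, Sequence, Tuple

Source = Tuple[object, float, float]  # (identifier, ra_deg, dec_deg), assumed layout


def angular_separation_deg(ra1: float, dec1: float, ra2: float, dec2: float) -> float:
    """Great-circle separation in degrees, using the haversine formula."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    # Clamp against rounding errors before taking the arcsine.
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a))))


def crossmatch(
    gaia_sources: Sequence[Source],
    external_sources: Sequence[Source],
    radius_arcsec: float = 1.0,  # hypothetical search radius
) -> List[Tuple[object, object, float]]:
    """Return (gaia_id, external_id, separation_arcsec) for the nearest
    external source within the radius, if any. O(N*M), illustration only."""
    radius_deg = radius_arcsec / 3600.0
    matches = []
    for gaia_id, gaia_ra, gaia_dec in gaia_sources:
        best = None
        for ext_id, ext_ra, ext_dec in external_sources:
            sep = angular_separation_deg(gaia_ra, gaia_dec, ext_ra, ext_dec)
            if sep <= radius_deg and (best is None or sep < best[1]):
                best = (ext_id, sep)
        if best is not None:
            matches.append((gaia_id, best[0], best[1] * 3600.0))
    return matches
```

A production crossmatch over catalogues of this size would need spatial indexing rather than a nested loop; the sketch only shows what the resulting support dataset contains, namely pairs of identifiers with their separation.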
The last steps of the publication process are the removal of some specific fields used internally for validation, the ingestion of the final data into the operational, publicly accessible Gaia Archive database, and the generation of a set of ECSV files that are made available for direct bulk download of the whole catalogue.
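As a rough illustration of these final steps, the sketch below drops validation-only fields from a row and writes the remaining columns to a small ECSV file. The `validation_` prefix, the column names, and the data types are hypothetical, and the header is a minimal one written by hand following the general ECSV layout (a `# %ECSV 1.0` line followed by a commented YAML block); it is not the exact header used in the published files.

```python
import csv
from typing import Dict, Iterable, List, TextIO, Tuple

VALIDATION_PREFIX = "validation_"  # hypothetical naming convention


def strip_validation_fields(row: Dict[str, object]) -> Dict[str, object]:
    """Remove the fields that were only needed for internal validation."""
    return {k: v for k, v in row.items() if not k.startswith(VALIDATION_PREFIX)}


def write_ecsv(
    rows: Iterable[Dict[str, object]],
    columns: List[Tuple[str, str]],  # (column name, ECSV datatype)
    out: TextIO,
) -> None:
    """Write rows as comma-delimited ECSV with a minimal YAML header."""
    out.write("# %ECSV 1.0\n")
    out.write("# ---\n")
    out.write("# delimiter: ','\n")
    out.write("# datatype:\n")
    for name, datatype in columns:
        out.write(f"# - {{name: {name}, datatype: {datatype}}}\n")
    names = [name for name, _ in columns]
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(names)  # column-name line follows the commented header
    for row in rows:
        writer.writerow([row[name] for name in names])
```

For example, passing `io.StringIO()` as `out` with the columns `[("source_id", "int64"), ("parallax", "float64")]` produces a commented header followed by an ordinary CSV body, which is what makes the format readable by both generic CSV tools and ECSV-aware libraries.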