11.4 Quality assessment and validation
Author(s): Morgan Fouesneau, René Andrae, Coryn A.L. Bailer-Jones, Ioannis Bellas-Velidis, Elisa Brugaletta, Ruth Carballo, Orlagh L. Creevey, Ludovic Delchambre, Thavisha Dharmawardena, Ronald Drimmel, Yves Frémat, Daniel Garabato, Ulrike Heiter, Georges Kordopatis, Andreas J. Korn, Alessandro Lanzafame, Alex Lobel, Minia Manteiga, Douglas J. Marshall, Mathias Schultheis, Rosanna Sordo, Caroline Soubiran, Antonella Vallenari
The CU8 validation involves analyzing the scientific results of the Apsis processing to report on their quality and enable improvements to the processing. The validation is an integral part of each data processing cycle and may trigger a (partial) reprocessing of the data during that cycle.
Validation in CU8 is a procedure that assesses the quality of the results produced by an algorithm. One way to do this is to compare values for various objects or outputs with expected ones, for example comparing temperature estimates of stars against literature values. A validation shall never modify the results themselves. Instead, the procedure produces a report that gives an overview of the tests and their results and draws conclusions about the quality of the data. These conclusions may include recommendations to change or update the code that produced the results.
In contrast, a (re-)calibration is a procedure that alters the data to correct for issues found during the validation of some results. For example, a comparison with external data may reveal discrepancies, and in some cases one may then apply corrections to the data.
For completeness, we briefly summarize the two types of validation procedure used in CU8.
External validation
The external validation requires all relevant input and output data on specific sources. The purpose is to compare Apsis APs with previously published APs and to use the input data to understand any discrepancies. This is only possible if the relationship between each source's input data and its output products is preserved.
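As an illustration only, the comparison with published APs can be sketched as a match on source identifier followed by robust summary statistics. The field names (`teff`, `g_mag`), the sample values, and the function `compare_to_literature` are hypothetical and not part of Apsis. The sketch reads its inputs without modifying them and returns a report, in line with the definition of validation above.

```python
from statistics import median


def compare_to_literature(apsis, literature, key="teff"):
    """Summarise Apsis-minus-literature differences for matched sources.

    Both arguments map a source identifier to a dict of parameters.
    Neither input is modified; the function only returns a report.
    """
    # Keep only sources present in both catalogues.
    common = sorted(set(apsis) & set(literature))
    diffs = {sid: apsis[sid][key] - literature[sid][key] for sid in common}
    if not diffs:
        return {"n_matched": 0, "median_offset": None, "mad": None,
                "worst_source": None, "worst_record": None}

    values = list(diffs.values())
    med = median(values)
    # Median absolute deviation: a scatter estimate robust to outliers.
    mad = median([abs(v - med) for v in values])
    # The most discrepant source, returned with its full record so that
    # the input data can be inspected alongside the output.
    worst = max(diffs, key=lambda sid: abs(diffs[sid] - med))
    return {"n_matched": len(common), "median_offset": med, "mad": mad,
            "worst_source": worst, "worst_record": dict(apsis[worst])}


# Hypothetical sample data: Apsis-like outputs together with an input (g_mag).
apsis = {
    1: {"teff": 5800.0, "g_mag": 12.1},
    2: {"teff": 4500.0, "g_mag": 14.3},
    3: {"teff": 6900.0, "g_mag": 16.8},
    4: {"teff": 5200.0, "g_mag": 13.0},
}
literature = {
    1: {"teff": 5750.0},
    2: {"teff": 4520.0},
    3: {"teff": 6400.0},
    5: {"teff": 6000.0},
}

report = compare_to_literature(apsis, literature)
print(report)
```

Keeping the full record of the most discrepant source in the report is one simple way to retain the link between inputs and outputs; a real validation would carry far more input data than a single magnitude.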
Internal validation
The purpose of internal validation is to check for internal consistency. One compares results with general expectations, e.g., that parameters lie within sensible ranges and that distributions and correlations are roughly as expected. During the internal validation, we select the Gaia sources either (a) using predefined criteria, e.g., all sources with some parameter value in a predefined range (that parameter being either Apsis input data, e.g., G magnitude, or Apsis output data, e.g., DSC class probability), or (b) using posterior criteria, e.g., all sources near an outlier cluster centre (produced by OA) or a specific number of sources with the smallest surface gravities. Again, both Apsis inputs and outputs are required.
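The two selection modes and a basic range check can be sketched as follows. This is a minimal illustration under stated assumptions: the record layout, the field names (`g_mag`, `teff`, `logg`), the sample values, and the limits in `limits` are all hypothetical and are not the thresholds used by CU8.

```python
def select_by_range(sources, key, low, high):
    """Predefined criterion: keep sources whose `key` lies in [low, high]."""
    return [s for s in sources if low <= s[key] <= high]


def select_smallest(sources, key, n):
    """Posterior criterion: the n sources with the smallest value of `key`."""
    return sorted(sources, key=lambda s: s[key])[:n]


def out_of_range(sources, limits):
    """Return (source_id, key, value) for each value outside its limits."""
    flagged = []
    for s in sources:
        for key, (low, high) in limits.items():
            if not low <= s[key] <= high:
                flagged.append((s["source_id"], key, s[key]))
    return flagged


# Hypothetical records combining an input (g_mag) with outputs (teff, logg).
sources = [
    {"source_id": 1, "g_mag": 12.1, "teff": 5800.0, "logg": 4.4},
    {"source_id": 2, "g_mag": 14.3, "teff": 4500.0, "logg": 2.1},
    {"source_id": 3, "g_mag": 16.8, "teff": 6900.0, "logg": 4.1},
    {"source_id": 4, "g_mag": 13.0, "teff": 52000.0, "logg": 0.3},
    {"source_id": 5, "g_mag": 18.9, "teff": 3900.0, "logg": 7.2},
]

# Illustrative "sensible ranges" only; not the limits applied in Apsis.
limits = {"teff": (2500.0, 50000.0), "logg": (-0.5, 5.5)}

bright = select_by_range(sources, "g_mag", 10.0, 15.0)
lowest_logg = select_smallest(sources, "logg", 2)
flagged = out_of_range(sources, limits)

print([s["source_id"] for s in bright])
print([s["source_id"] for s in lowest_logg])
print(flagged)
```

As with the definition of validation given earlier, the functions only select and flag sources; none of them alters the parameter values.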