
8.4.1 Summary of main validation results

Without going into further details, the most important scientific validation results from Andrae et al. (2018) are:

• For low-extinction stars in the range from 3000 K to 10000 K, we can estimate $T_{\rm eff}$ with an ‘error’ (strictly just a discrepancy) of about 324 K with respect to literature estimates. However, this error depends on the intrinsic temperature distribution of the dataset under consideration; it can be smaller or much larger. For instance, we achieve an RMS difference of only 73 K for solar analogues from Tucci Maia et al. (2016) and 230 K for Gaia benchmark stars from Heiter et al. (2015). For stars with extinction larger than that in the training sample (see Table 8.2), the $T_{\rm eff}$ estimates are systematically too cool. Conversely, our estimates are too high (hot) for stars of very low extinction, e.g. in the Galactic halo. This is also the case for stars with low metallicities. Our quoted percentiles appear to be reliable uncertainty estimates for $T_{\rm eff}$.
• The training sample for ExtraTrees has a non-uniform $T_{\rm eff}$ distribution. This leads to artificial stripes when plotting, for example, a Hertzsprung-Russell diagram.
• We estimate $A_{\rm G}$ and $E(G_{\rm BP}-G_{\rm RP})$ with errors of 0.46 mag and 0.23 mag, respectively. These error estimates apply after removing the outliers that originate from the degeneracy of extinction with the observables, as explained in Section 6.5 of Andrae et al. (2018).
• Given the large extinction errors, we recommend that extinctions not be used for individual stars, but only in an ensemble sense, e.g. to apply a dust correction to the colour-magnitude diagram or to make extinction maps that average over many stars (Andrae et al. 2018).
• ${\cal L}$ and ${\cal R}$ were estimated without considering extinction, as discussed in Section 8.3.3 and shown in Figure 8.7. This implies that ${\cal L}$ and ${\cal R}$ will be underestimated for those stars where extinction is not negligible. In Andrae et al. (2018), however, we show that the radius estimates should still be valid to within the uncertainties given. One can also use Equation 8.7 and Equation 8.8 to recompute ${\cal L}$ and ${\cal R}$ for the user’s preferred value of $A_{\rm G}$.
• Only 48% of sources with temperature estimates have estimates of ${\cal L}$ and ${\cal R}$, due to the post-processing filtering. One of these filters restricts luminosities to those with relative uncertainties of 0.3 or better. After applying these filters, the median uncertainty in luminosity is 15% for ${\cal L} \geq 1.0\,{\cal L}_{\odot}$. Similarly, we publish radii only for sources with relative temperature uncertainties less than 0.2. The resulting median uncertainty in radius is 7% over all radius values.
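
The recomputation of ${\cal L}$ and ${\cal R}$ for a chosen $A_{\rm G}$ mentioned in the list above can be sketched as a simple rescaling. This is only an illustration: Equations 8.7 and 8.8 are not reproduced in this section, so the sketch assumes the standard relations $-2.5\log_{10}{\cal L} = M_{\rm G} + BC_{\rm G} - M_{{\rm bol}\odot}$ and ${\cal R} \propto \sqrt{{\cal L}}/T_{\rm eff}^{2}$, with $T_{\rm eff}$ and the bolometric correction held fixed. The function name `rescale_for_extinction` is hypothetical.

```python
import math


def rescale_for_extinction(lum, radius, a_g):
    """Rescale L and R that were derived with A_G = 0 to an assumed A_G (mag).

    Assumption: T_eff and the bolometric correction are kept fixed, so the
    absolute magnitude becomes brighter by a_g magnitudes and nothing else
    changes.
    """
    # A star that is a_g mag brighter is 10**(0.4 * a_g) times more luminous.
    lum_new = lum * 10.0 ** (0.4 * a_g)
    # At fixed T_eff, R scales as sqrt(L), hence the factor 10**(0.2 * a_g).
    radius_new = radius * 10.0 ** (0.2 * a_g)
    return lum_new, radius_new


# Example: a source published with L = 1.0 L_sun and R = 1.0 R_sun,
# rescaled for an assumed extinction of 0.5 mag.
lum_new, radius_new = rescale_for_extinction(1.0, 1.0, 0.5)
print(f"L = {lum_new:.3f} L_sun, R = {radius_new:.3f} R_sun")
```

Because the single-star extinction errors are large, the value of $A_{\rm G}$ passed in would more sensibly come from an ensemble estimate or an external dust map than from the per-source estimate.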

Section 7 of Andrae et al. (2018) outlines the known limitations of our results and should be consulted before using the Apsis data products.
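
As an illustration of the ensemble use of extinctions recommended above, the following minimal sketch averages $A_{\rm G}$ over many stars in cells on the sky. The function name, the flat longitude/latitude grid (which is not equal-area), the cell size and the minimum number of stars per cell are all assumptions made for this example, not choices taken from Andrae et al. (2018).

```python
import numpy as np


def mean_extinction_map(lon_deg, lat_deg, a_g, bin_deg=1.0, min_stars=10):
    """Mean A_G per sky cell; cells with fewer than min_stars stars are NaN.

    bin_deg should divide 360 and 180 evenly so the grid covers the sky.
    """
    lon = np.asarray(lon_deg, dtype=float)
    lat = np.asarray(lat_deg, dtype=float)
    ext = np.asarray(a_g, dtype=float)

    lon_edges = np.arange(0.0, 360.0 + bin_deg, bin_deg)
    lat_edges = np.arange(-90.0, 90.0 + bin_deg, bin_deg)

    # Number of stars per cell, and the sum of their A_G values.
    counts, _, _ = np.histogram2d(lon, lat, bins=[lon_edges, lat_edges])
    sums, _, _ = np.histogram2d(lon, lat, bins=[lon_edges, lat_edges],
                                weights=ext)

    # Empty cells give 0/0; silence the warning and mask sparse cells as NaN.
    with np.errstate(invalid="ignore", divide="ignore"):
        mean = np.where(counts >= min_stars, sums / counts, np.nan)
    return mean, counts


# Tiny example with three stars that fall into the same 1-degree cell.
mean, counts = mean_extinction_map(
    [10.2, 10.7, 10.5], [5.1, 5.5, 5.9], [0.2, 0.4, 0.6], min_stars=3
)
```

The `min_stars` threshold reflects the point made above: a cell mean is only useful once it averages over enough stars for the individual errors to largely cancel.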