Gaia Data Release 3 Documentation

12.3 Characteristics of the extragalactic sources

12.3.4 Class overlap statistics

We focus here on the three classifiers providing results in the extragalactic tables. For each of them, it is interesting to check how the classes reported in the tables are distributed, both among the other modules and against the classes of the other classifiers (DSC, Vari-Classification and OA). For the former checks, we consider neither the Vari-Agn sample, as it is a sub-sample of Vari-Classification, nor the OA sample, as it is not used to build the source list. For the latter checks, statistics can be extracted for each pair of classifiers. In the following, the class labels reported correspond to those hosted by the classlabel_dsc, vari_best_class_name and classlabel_oa fields. It should also be noted that the respective classifiers can provide different classifications, and as such their quasar and galaxy labels do not necessarily agree.
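As an illustration of the pairwise statistics described above, the following minimal Python sketch cross-tabulates two label fields. The rows are invented for illustration and are not Gaia data; only the field names classlabel_dsc and vari_best_class_name come from the text, and the helper name cross_tabulate is hypothetical.

```python
from collections import Counter


def cross_tabulate(rows, field_a, field_b):
    """Count sources per (label_a, label_b) pair.

    A missing label (None) marks a source with no result from that classifier.
    """
    return Counter((row.get(field_a), row.get(field_b)) for row in rows)


# Toy rows, made up for this example (not real Gaia data).
rows = [
    {"classlabel_dsc": "quasar", "vari_best_class_name": "AGN"},
    {"classlabel_dsc": "quasar", "vari_best_class_name": None},
    {"classlabel_dsc": "star", "vari_best_class_name": "RR"},
    {"classlabel_dsc": "quasar", "vari_best_class_name": "AGN"},
]

table = cross_tabulate(rows, "classlabel_dsc", "vari_best_class_name")

# Print the pairs from most to least frequent.
for (dsc_label, vari_label), count in sorted(table.items(), key=lambda kv: -kv[1]):
    print(dsc_label, vari_label, count)
```

The pair with a missing second label plays the role of the ‘Not overlapping’ row in the tables of this section.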

qso_candidates table

DSC classes against other QSO modules

Table 12.7 shows the overlap between the various QSO modules and the DSC ‘Combmod’ classes. As can be seen, for all modules, the majority of sources overlapping with DSC are classified as quasars, with a minimum of nearly 80% for any module other than QSOC. This confirms the high completeness level of the DSC quasar sample. Unsurprisingly, the next most significant contributing class is that of ‘star’, which accounts for more than 1/3 of the QSOC overlaps. Overlaps with the galaxy class are relatively limited.

Table 12.7: Overlaps between the DSC ‘Combmod’ classes and the other QSO modules in the qso_candidates table. The percentages are given with respect to the total number of sources in a given module, i.e. not to those overlapping with DSC, so that the fraction of sources not overlapping with DSC overall is also given. The column corresponding to DSC itself indicates the fraction of sources classified as a given class in the full table. We recall that the presence in the table of sources not classified as quasar by DSC ‘Combmod’ is intrinsic to the way the integrated table is built, as explained in Section 12.2.
DSC Class DSC (%) CRF3 (%) Surface brightness (%) Vari-Classification (%) QSOC (%)
quasar 5243012 78.9 1274886 79.0 739633 79.9 935131 90.3 1038955 56.6
galaxy 156970 2.4 45371 2.8 28680 3.1 19026 1.8 36305 2.0
star 1089026 16.4 290371 18.0 156567 16.9 79402 7.7 705506 38.5
whitedwarf 51132 0.8 3 0.0 31 0.0 5 0.0 22727 1.2
physicalbinary 92040 1.4 13 0.0 55 0.0 5 0.0 25925 1.4
unclassified 15331 0.2 2003 0.1 926 0.1 1520 0.1 4700 0.3
Not overlapping 0 0 1526 0.1 47 0.0 118 0.0 0 0.0
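The percentage convention of Table 12.7 can be checked with a few lines of Python. The sketch below is only an illustration: it assumes that the module total is the sum of all rows of a column, including the ‘Not overlapping’ row, and the helper name percentages is hypothetical. The counts are those of the CRF3 column above.

```python
# Counts from the CRF3 column of Table 12.7.
crf3_counts = {
    "quasar": 1274886,
    "galaxy": 45371,
    "star": 290371,
    "whitedwarf": 3,
    "physicalbinary": 13,
    "unclassified": 2003,
    "Not overlapping": 1526,
}


def percentages(counts):
    """Express each count as a percentage of the column total.

    Assumption: the column total (all rows, overlapping or not) is the
    number of sources contributed by the module.
    """
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


crf3_pct = percentages(crf3_counts)
for label, pct in crf3_pct.items():
    print(f"{label:16s} {pct:5.1f}")
```

Under this assumption the quasar and star rows round to the 79.0% and 18.0% listed in the table.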

The same statistics using this time the classes from the DSC-Joint classification are shown in Table 12.8. Owing to the majority of DSC-Joint sources being classified as ‘unclassified’, the overall number of sources from the various modules overlapping with the DSC-Joint quasar class drops significantly. However, the relative fraction of sources from a given module overlapping with the DSC-Joint quasar class is higher compared to that of the full DSC sample (Table 12.7).

Table 12.8: Overlaps between the DSC-Joint classes and the other QSO modules in the qso_candidates table. The percentages have the same meaning as in Table 12.7.
DSC-Joint Class DSC-Joint (%) CRF3 (%) Surface brightness (%) Vari-Classification (%) QSOC (%)
quasar 547201 8.2 447140 27.7 298017 32.2 372116 35.9 324065 17.7
galaxy 12302 0.2 9281 0.6 6903 0.7 1963 0.2 417 0.0
unclassified 6088008 91.6 1156226 71.6 620972 67.1 661010 63.9 1509636 82.3
Not overlapping 0 0 1526 0.1 47 0.0 118 0.0 0 0.0

Vari-Classification classes against other QSO modules

We describe here the overlaps between the various classes featured in the qso_candidates table from the Vari-Classification. By construction, the majority of those correspond to the ‘AGN’ class, but other classes also enter the table via sources contributed by the other modules (see the vari_classifier_class_definition table for a definition of the codes used for these classes). Table 12.9 shows their distribution among the various QSO modules, showing that about 8% of all sources featuring such a classification are not labelled as ‘AGN’. Of those, the majority (5% of the total) are labelled as RR Lyrae, a variable type known to be mistaken for extragalactic sources in variability studies (Rimoldini et al. 2022). The overlap of the ‘AGN’ class with other modules does not exceed about half of the sources from those modules, but this is because most of the remaining sources do not overlap with the Vari-Classification. In effect, nearly all of the sources in common are labelled as ‘AGN’, showing that this sample is very pure but also achieves a good completeness, especially against the CRF3 and the Surface brightness samples.

Table 12.9: Overlaps between the Vari-Classification classes and the other QSO modules in the qso_candidates table. The percentages have the same meaning as in Table 12.7. See the vari_classifier_class_definition table for a description of what each of these classes corresponds to.
Class Vari-Classification (%) CRF3 (%) Surface brightness (%) DSC (%) QSOC (%)
ACV, etc 33 0.0 0 0.0 0 0.0 32 0.0 18 0.0
ACYG 16 0.0 0 0.0 0 0.0 7 0.0 14 0.0
AGN 1035207 92.2 833755 51.7 513084 55.4 944148 17.0 477971 26.1
BCEP 9 0.0 0 0.0 0 0.0 1 0.0 9 0.0
BE|GCAS|SDOR|WR 208 0.0 1 0.0 4 0.0 124 0.0 198 0.0
CEP 2235 0.2 0 0.0 53 0.0 1106 0.0 1995 0.1
CV 2443 0.2 32 0.0 54 0.0 2241 0.0 1533 0.1
DSCT|GDOR|SXPHE 6224 0.6 106 0.0 57 0.0 5246 0.1 2205 0.1
ECL 10937 1.0 183 0.0 248 0.0 8304 0.1 6143 0.3
GALAXY 4688 0.4 1289 0.1 1165 0.1 1895 0.0 751 0.0
LPV 279 0.0 30 0.0 221 0.0 46 0.0 33 0.0
MICROLENSING 1 0.0 0 0.0 1 0.0 0 0.0 0 0.0
RCB 51 0.0 0 0.0 1 0.0 50 0.0 45 0.0
RR 59054 5.3 85 0.0 68 0.0 42805 0.8 44648 2.4
RS 91 0.0 3 0.0 81 0.0 11 0.0 8 0.0
S 403 0.0 1 0.0 6 0.0 216 0.0 341 0.0
SDB 1 0.0 0 0.0 0 0.0 0 0.0 1 0.0
SN 299 0.0 0 0.0 0 0.0 247 0.0 85 0.0
SOLAR_LIKE 1 0.0 0 0.0 1 0.0 0 0.0 0 0.0
SYST 6 0.0 0 0.0 2 0.0 2 0.0 4 0.0
WD 23 0.0 0 0.0 0 0.0 18 0.0 20 0.0
YSO 152 0.0 2 0.0 142 0.0 8 0.0 8 0.0
Not overlapping 0 0 778619 48.2 410719 44.4 4536860 81.8 1297908 70.8
Notes. ‘ACV, etc’ stands for the ACV|CP|MCP|ROAM|ROAP|SXARI class.
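The statement that nearly all sources in common are labelled as ‘AGN’ can be made concrete with the CRF3 column of Table 12.9. The following sketch is an illustration only; the variable names are hypothetical, and rows with a zero count are omitted.

```python
# Non-zero counts from the CRF3 column of Table 12.9.
crf3_by_vari_class = {
    "AGN": 833755,
    "BE|GCAS|SDOR|WR": 1,
    "CV": 32,
    "DSCT|GDOR|SXPHE": 106,
    "ECL": 183,
    "GALAXY": 1289,
    "LPV": 30,
    "RR": 85,
    "RS": 3,
    "S": 1,
    "YSO": 2,
}
crf3_not_overlapping = 778619

# Sources that CRF3 has in common with the Vari-Classification.
overlapping = sum(crf3_by_vari_class.values())

# Share of the common sources labelled 'AGN'.
agn_share_of_overlap = 100.0 * crf3_by_vari_class["AGN"] / overlapping

# Share of all CRF3 sources labelled 'AGN' (the percentage quoted in the table).
agn_share_of_module = 100.0 * crf3_by_vari_class["AGN"] / (
    overlapping + crf3_not_overlapping
)

print(f"{agn_share_of_overlap:.1f}% of common sources are AGN")
print(f"{agn_share_of_module:.1f}% of all CRF3 sources are AGN")
```

The first figure measures how often the common sources carry the ‘AGN’ label, while the second, smaller figure is limited by the fraction of CRF3 sources without any Vari-Classification result.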

Outlier Analysis classes against other QSO modules

The OA classes featured in the qso_candidates table are here checked against the sources from the other QSO modules. It should be borne in mind that OA entries in this table are not based on any selection specific to the OA classes; instead, the OA information is simply populated for all sources selected as eligible by the other modules.

As explained in Section 11.3.12, several OA classes can be associated with a quasar classification because separate classes are used for different redshift intervals. To simplify the analysis, we group them here into meta-classes, each gathering all sources of a certain source type. We isolate four meta-classes: Quasar, Galaxy, White Dwarf, and any other star category different from White Dwarf. The distribution of the 2.8 million OA sources featured in qso_candidates over these meta-classes is shown in the second column of Table 12.10.
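A grouping of this kind could be sketched as follows. This is not the mapping used to build the tables: the prefixes QSO, GAL, WD and STAR, the example labels, and the function name oa_meta_class are all assumptions made for illustration, and the actual OA class labels should be taken from Section 11.3.12.

```python
def oa_meta_class(label):
    """Map an OA class label to one of the four meta-classes.

    Assumption: labels start with a type prefix (QSO, GAL, WD or STAR).
    Labels that match none of these prefixes, or a missing label,
    return None and are left out of the statistics.
    """
    if label is None:
        return None
    upper = label.upper()
    if upper.startswith("QSO"):
        return "OA Quasar"
    if upper.startswith("GAL"):
        return "OA Galaxy"
    if upper.startswith("WD"):
        return "OA WD"
    if upper.startswith("STAR"):
        return "OA Star"
    return None


# Hypothetical labels, for illustration only.
examples = ["QSO_LOW_Z", "QSO_HIGH_Z", "GAL_LOW_Z", "WD", "STAR_LATE", None]
grouped = [oa_meta_class(label) for label in examples]
print(grouped)
```

Merging the redshift-specific quasar and galaxy classes in this way is what allows a single ‘OA Quasar’ or ‘OA Galaxy’ row to be reported per table.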

Table 12.10: Overlaps between the OA classes and the QSO modules in the qso_candidates table. The percentages have the same meaning as in Table 12.7.
OA Class OA (%) CRF3 (%) Surface brightness (%) DSC (%) QSOC (%) Vari-Classification (%)
OA Star 1017843 36.3 145882 9.0 83963 9.1 710007 12.8 396259 21.6 57218 5.5
OA WD 555800 19.8 27536 1.7 13905 1.5 480024 8.7 117016 6.4 10098 1.0
OA Quasar 1097757 39.2 317683 19.7 148554 16.0 823980 14.9 345557 18.8 135869 13.1
OA Galaxy 131825 4.7 59706 3.7 31656 3.4 71543 1.3 37341 2.0 42133 4.1
Not overlapping 0 0 1063366 65.9 647861 70.0 3458342 62.4 937945 51.1 789889 76.3

Outlier Analysis classes against DSC and Vari-Classification classes

We check here how the OA classes compare to the respective DSC and Vari-Classification classes for the sources present in the qso_candidates table. We use the same meta-class scheme as above to organise the OA class labels. Table 12.11 and Table 12.12 show the corresponding class overlaps.

Table 12.11: Overlaps between the respective OA and DSC ‘Combmod’ classes in the qso_candidates table. The percentages have the same meaning as in Table 12.7.
DSC Class DSC (%) OA Star (%) OA WD (%) OA Quasar (%) OA Galaxy (%)
quasar 5243012 78.9 617511 60.7 453890 81.7 763200 69.5 48658 36.9
galaxy 156970 2.4 5971 0.6 1036 0.2 43163 3.9 44743 33.9
star 1089026 16.4 317260 31.2 67758 12.2 271527 24.7 35610 27.0
whitedwarf 51132 0.8 23846 2.3 11809 2.1 2805 0.3 0 0.0
physicalbinary 92040 1.4 51018 5.0 20754 3.7 12784 1.2 284 0.2
unclassified 15331 0.2 2237 0.2 553 0.1 4278 0.4 2530 1.9
Not overlapping 0 0 0 0.0 0 0.0 0 0.0 0 0.0

Table 12.12: Overlaps between the respective OA and Vari-Classification classes in the qso_candidates table. The percentages have the same meaning as in Table 12.7.
Class Vari-Classification (%) OA Star (%) OA WD (%) OA Quasar (%) OA Galaxy (%)
ACV, etc 33 0.0 9 0.0 0 0.0 23 0.0 0 0.0
ACYG 16 0.0 14 0.0 0 0.0 2 0.0 0 0.0
AGN 1035207 92.2 57218 5.6 10098 1.8 135869 12.4 42133 32.0
BCEP 9 0.0 9 0.0 0 0.0 0 0.0 0 0.0
BE|GCAS|SDOR|WR 208 0.0 185 0.0 0 0.0 8 0.0 0 0.0
CEP 2235 0.2 1546 0.2 8 0.0 532 0.0 9 0.0
CV 2443 0.2 448 0.0 63 0.0 885 0.1 115 0.1
DSCT|GDOR|SXPHE 6224 0.6 2335 0.2 856 0.2 337 0.0 0 0.0
ECL 10937 1.0 5839 0.6 971 0.2 1004 0.1 103 0.1
GALAXY 4688 0.4 15 0.0 1 0.0 960 0.1 1635 1.2
LPV 279 0.0 24 0.0 0 0.0 0 0.0 2 0.0
MICROLENSING 1 0.0 0 0.0 0 0.0 0 0.0 0 0.0
RCB 51 0.0 3 0.0 0 0.0 1 0.0 4 0.0
RR 59054 5.3 33727 3.3 1640 0.3 6235 0.6 139 0.1
RS 91 0.0 8 0.0 0 0.0 1 0.0 0 0.0
S 403 0.0 62 0.0 0 0.0 219 0.0 55 0.0
SDB 1 0.0 1 0.0 0 0.0 0 0.0 0 0.0
SN 299 0.0 40 0.0 29 0.0 96 0.0 6 0.0
SOLAR_LIKE 1 0.0 0 0.0 0 0.0 0 0.0 0 0.0
SYST 6 0.0 1 0.0 0 0.0 1 0.0 0 0.0
WD 23 0.0 17 0.0 1 0.0 3 0.0 0 0.0
YSO 152 0.0 6 0.0 0 0.0 1 0.0 0 0.0
Not overlapping 0 0 916265 90.0 542118 97.5 951413 86.7 87595 66.4
Notes. ‘ACV, etc’ stands for the ACV|CP|MCP|ROAM|ROAP|SXARI class.

Owing to the selection rules applied to build the DSC sample in the qso_candidates table, this comparison has several biases and is not necessarily representative of the effective classification performance of OA. On the one hand, there is a selection bias in that most of the sources featuring DSC results in the qso_candidates table are selected based on their quasar class, and about 80% of the DSC sources end up having this class in the table. On the other hand, as illustrated e.g. in Figure 12.7, it is expected that a large part of the DSC sources are in fact stellar. With these limitations in mind, the agreement regarding classification as quasar appears to be good, with about two thirds of the OA objects classified as quasar having the same label in DSC. The agreement is poorer for the other OA classes, for which the DSC quasar class is also the one most often assigned, probably for the reasons explained above. Concerning the comparison with the Vari-Classification, the largest source overlap corresponds to a match between the ‘AGN’ and the OA quasar classes (55% of the OA sources matched to the ‘AGN’ class). A full overview of the OA classification results is given in Section 11.3.12.
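The two agreement fractions quoted above can be recovered from the counts in Table 12.11 and Table 12.12. The sketch below is a simple arithmetic check; the variable names are hypothetical.

```python
# OA Quasar column of Table 12.11, split by DSC 'Combmod' class.
oa_quasar_by_dsc = {
    "quasar": 763200,
    "galaxy": 43163,
    "star": 271527,
    "whitedwarf": 2805,
    "physicalbinary": 12784,
    "unclassified": 4278,
}

# 'AGN' row of Table 12.12, split by OA meta-class.
agn_by_oa = {
    "OA Star": 57218,
    "OA WD": 10098,
    "OA Quasar": 135869,
    "OA Galaxy": 42133,
}

# Fraction of OA quasars that DSC also labels as quasar.
oa_quasar_total = sum(oa_quasar_by_dsc.values())
dsc_agreement = 100.0 * oa_quasar_by_dsc["quasar"] / oa_quasar_total

# Fraction of OA sources matched to 'AGN' that fall in the OA Quasar meta-class.
agn_matched_total = sum(agn_by_oa.values())
agn_agreement = 100.0 * agn_by_oa["OA Quasar"] / agn_matched_total

print(f"DSC quasar among OA quasars: {dsc_agreement:.1f}%")
print(f"OA quasar among AGN matches: {agn_agreement:.1f}%")
```

The first fraction uses the OA Quasar column as denominator, the second uses the ‘AGN’ row, which is why the two numbers are not directly comparable.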

galaxy_candidates table

DSC classes against other Galaxy modules

Table 12.13 shows the overlaps between the various Galaxy modules and the DSC ‘Combmod’ classes. The formatting is the same as that used for the corresponding quasar table (Table 12.7).

Table 12.13: Overlaps between the DSC ‘Combmod’ classes and the other Galaxy modules in the galaxy_candidates table. The percentages have the same meaning as in Table 12.7.
DSC Class DSC (%) Surface brightness (%) Vari-Classification (%) UGC (%)
quasar 12933 0.3 249 0.0 1706 0.1 2458 0.2
galaxy 3566085 73.7 529825 57.9 1529591 62.4 1350301 98.8
star 1254732 25.9 384172 42.0 920008 37.5 13461 1.0
whitedwarf 64 0.0 0 0.0 0 0.0 15 0.0
physicalbinary 1288 0.0 1 0.0 4 0.0 8 0.0
unclassified 6697 0.1 47 0.0 55 0.0 910 0.1
Not overlapping 0 0 543 0.1 0 0.0 0 0.0

Again, the majority of the sources of a given module overlap with the DSC ‘Combmod’ galaxy class, although the corresponding fractions are not as high as for the quasar modules, and do not exceed about 2/3 of the samples based on Surface brightness or Vari-Classification analyses. The overlap with UGC is very high simply because of the way UGC sources are selected (see Section 11.3.13). The rest of the sources are mostly labelled as stars, and a small fraction appear as quasar.

Similar to the quasar sample, Table 12.14 shows the same overlap matrix using this time the classes from the DSC-Joint sample in the galaxy_candidates table.

Table 12.14: Overlaps between the DSC-Joint classes and the other Galaxy modules in the galaxy_candidates table. The percentages have the same meaning as in Table 12.7.
DSC-Joint Class DSC-Joint (%) Surface brightness (%) Vari-Classification (%) UGC (%)
quasar 234 0.0 0 0.0 0 0.0 180 0.0
galaxy 251063 5.2 54931 6.0 70518 2.9 184724 13.5
unclassified 4590502 94.8 859363 93.9 2380846 97.1 1182249 86.5
Not overlapping 0 0 543 0.1 0 0.0 0 0.0

Vari-Classification classes against other Galaxy modules

Similar to Table 12.9, one can check the distribution of the various classes assigned by the Vari-Classification analysis against the other Galaxy modules. This is summarised in Table 12.15, showing that only a small fraction of sources are labelled differently from ‘GALAXY’, and that they essentially correspond to sources labelled as ‘AGN’. Again, essentially all sources overlapping with the other Galaxy modules are labelled as ‘GALAXY’, and their completeness level against these modules varies between 40% and 70%.

Table 12.15: Overlaps between the Vari-Classification classes and the other Galaxy modules in the galaxy_candidates table. The percentages have the same meaning as in Table 12.7. See the vari_classifier_class_definition table for a description of what each of these classes corresponds to.
Class Vari-Classification (%) Surface brightness (%) DSC (%) UGC (%)
AGN 20845 0.8 44 0.0 19688 0.5 9614 0.7
BE|GCAS|SDOR|WR 3 0.0 0 0.0 3 0.0 1 0.0
CEP 15 0.0 0 0.0 15 0.0 0 0.0
CV 166 0.0 0 0.0 163 0.0 5 0.0
DSCT|GDOR|SXPHE 292 0.0 0 0.0 292 0.0 0 0.0
ECL 3131 0.1 0 0.0 3107 0.1 135 0.0
GALAXY 2451364 98.9 634550 69.4 1529594 41.0 972929 71.2
LPV 28 0.0 0 0.0 27 0.0 10 0.0
RCB 2 0.0 0 0.0 2 0.0 0 0.0
RR 1271 0.1 29 0.0 1246 0.0 153 0.0
RS 2 0.0 0 0.0 2 0.0 0 0.0
S 114 0.0 0 0.0 113 0.0 8 0.0
SN 36 0.0 0 0.0 36 0.0 1 0.0
SYST 2 0.0 0 0.0 2 0.0 1 0.0
WD 2 0.0 0 0.0 2 0.0 0 0.0
Not overlapping 0 0 280214 30.6 2172134 58.3 384290 28.1

Outlier Analysis classes against other Galaxy modules

The same analysis as that shown in Table 12.10 is illustrated in Table 12.16, using the same meta-classes.

Table 12.16: Overlaps between the OA classes and the other Galaxy modules in the galaxy_candidates table. The percentages have the same meaning as in Table 12.7.
OA Class OA (%) Surface brightness (%) DSC (%) UGC (%) Vari-Classification (%)
OA Star 54847 2.9 7088 0.8 47826 1.3 4539 0.3 1815 0.1
OA WD 3596 0.2 4 0.0 3498 0.1 210 0.0 44 0.0
OA Quasar 85176 4.5 2962 0.3 76700 2.1 9121 0.7 5334 0.2
OA Galaxy 1757407 92.4 424826 46.4 712385 19.1 176713 12.9 1063672 43.4
Not overlapping 0 0 479957 52.5 2886139 77.4 1176570 86.1 1380499 56.3

Unlike for the qso_candidates table, the majority of the OA sources featured in the galaxy_candidates table are labelled as galaxy following our OA meta-class scheme. Also, nearly all of the sources in common with the other Galaxy modules are labelled as galaxy according to this scheme, indicating a good performance of the OA clustering for that particular class. The apparent completeness against DSC or UGC is poor, but this is more likely due to the large amount of stellar contamination in these samples.

Outlier Analysis classes against DSC and Vari-Classification classes

Similar to what is shown in Table 12.11 and Table 12.12, we check here how the OA classes compare to the respective DSC ‘Combmod’ and Vari-Classification classes for the sources present in the galaxy_candidates table. We use the same meta-class scheme as above to organise the OA class labels. Table 12.17 and Table 12.18 compile the corresponding source overlaps. Overall, the match between the OA Galaxy class and the corresponding galaxy classes of DSC and Vari-Classification is better than the equivalent for the quasar class: 90% (respectively 99%) of OA source matches to the DSC ‘galaxy’ class (respectively Vari-Classification ‘GALAXY’ class) correspond to the OA Galaxy class.

Table 12.17: Overlaps between the respective OA and DSC ‘Combmod’ classes for the galaxy_candidates table. The percentages have the same meaning as in Table 12.7.
DSC Class DSC (%) OA Star (%) OA WD (%) OA Quasar (%) OA Galaxy (%)
quasar 12933 0.3 1593 2.9 382 10.6 3128 3.7 2080 0.1
galaxy 3566085 73.7 30351 55.3 2542 70.7 73493 86.3 708253 40.3
star 1254732 25.9 22206 40.5 585 16.3 7911 9.3 1045455 59.5
whitedwarf 64 0.0 12 0.0 41 1.1 0 0.0 0 0.0
physicalbinary 1288 0.0 19 0.0 21 0.6 13 0.0 1084 0.1
unclassified 6697 0.1 666 1.2 25 0.7 631 0.7 535 0.0
Not overlapping 0 0 0 0.0 0 0.0 0 0.0 0 0.0

Table 12.18: Overlaps between the respective OA and Vari-Classification classes for the galaxy_candidates table. The percentages have the same meaning as in Table 12.7.
Class Vari-Classification (%) OA Star (%) OA WD (%) OA Quasar (%) OA Galaxy (%)
AGN 20845 0.8 326 0.6 6 0.2 1682 2.0 13821 0.8
BE|GCAS|SDOR|WR 3 0.0 0 0.0 0 0.0 1 0.0 0 0.0
CEP 15 0.0 1 0.0 0 0.0 6 0.0 2 0.0
CV 166 0.0 8 0.0 1 0.0 113 0.1 20 0.0
DSCT|GDOR|SXPHE 292 0.0 192 0.4 26 0.7 5 0.0 0 0.0
ECL 3131 0.1 512 0.9 32 0.9 343 0.4 304 0.0
GALAXY 2451364 98.9 1815 3.3 44 1.2 5334 6.3 1063672 60.5
LPV 28 0.0 0 0.0 0 0.0 0 0.0 2 0.0
RCB 2 0.0 0 0.0 0 0.0 0 0.0 1 0.0
RR 1271 0.1 23 0.0 2 0.1 253 0.3 186 0.0
RS 2 0.0 0 0.0 0 0.0 0 0.0 0 0.0
S 114 0.0 3 0.0 0 0.0 31 0.0 21 0.0
SN 36 0.0 0 0.0 0 0.0 0 0.0 13 0.0
SYST 2 0.0 0 0.0 0 0.0 0 0.0 0 0.0
WD 2 0.0 1 0.0 0 0.0 0 0.0 0 0.0
Not overlapping 0 0 51958 94.7 3485 96.9 77372 90.8 679339 38.7
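The galaxy agreement fractions discussed before Table 12.17 can be recomputed from the ‘galaxy’ row of Table 12.17 and the ‘GALAXY’ row of Table 12.18. The sketch below assumes that the relevant denominator is the sum of the four OA meta-class counts in each row; the source does not spell out the denominator, and the function name is hypothetical.

```python
def oa_galaxy_share(row):
    """Percentage of OA-matched sources in a row that are OA Galaxy.

    Assumption: the denominator is the sum over the four OA meta-classes.
    """
    return 100.0 * row["OA Galaxy"] / sum(row.values())


# 'galaxy' row of Table 12.17 (DSC 'Combmod' class).
dsc_galaxy_row = {
    "OA Star": 30351,
    "OA WD": 2542,
    "OA Quasar": 73493,
    "OA Galaxy": 708253,
}

# 'GALAXY' row of Table 12.18 (Vari-Classification class).
vari_galaxy_row = {
    "OA Star": 1815,
    "OA WD": 44,
    "OA Quasar": 5334,
    "OA Galaxy": 1063672,
}

dsc_share = oa_galaxy_share(dsc_galaxy_row)
vari_share = oa_galaxy_share(vari_galaxy_row)

print(f"DSC galaxy matched to OA Galaxy:  {dsc_share:.1f}%")
print(f"Vari GALAXY matched to OA Galaxy: {vari_share:.1f}%")
```

With this denominator the Vari-Classification value agrees with the quoted 99%, while the DSC value comes out close to 87%, somewhat below the rounded 90% given in the text; a different choice of denominator may explain the difference.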