Gaia Data Release 3 Documentation

13.2 Data Consolidation

13.2.2 Data accountability

Author(s): Alex Hutton and Enrique Utrilla

The integration process takes sources from different datasets and combines them based on their source ID. During this process, information provided by the IDU cross-match is used to update the source list. Previously existing sources may be deleted, for example if they were based on observations that have been identified as spurious. Other sources may be tagged as superseded, either because they were split into two or more new sources or because several old sources were merged into a single one.

Traceability of the sources

The result of the previously described operations is summarised in Table 13.1.

Table 13.1: Traceability of the sources.
| Type | Input | Deleted | Removed by Split | Removed by Merge | Added | Output |
|---|---|---|---|---|---|---|
| IGSL Source | 603 038 774 | 3 732 663 | 1 815 549 | 16 118 938 | | 581 371 624 |
| New source (IDU-01) | 1 333 804 445 | 105 501 144 | 5 597 468 | 41 726 981 | | 1 180 978 852 |
| HPM Addendum | 200 | 50 | 0 | 27 | | 123 |
| New source (IDU-02) | 645 771 010 | 81 006 080 | 4 348 988 | 35 842 395 | | 524 573 547 |
| New source (IDU-03): as a new source | | | | | 194 550 000 | 194 550 000 |
| New source (IDU-03): as result of merge | | | | | 45 887 691 | 45 887 691 |
| New source (IDU-03): as result of split | | | | | 25 260 950 | 25 260 950 |
| Total | 2 582 614 429 | 190 239 937 | 11 762 005 | 93 688 341 | 265 698 641 | 2 552 622 787 |

Combining the input data from data reduction cycle 02 with the new sources from IDU-03, and discarding the deleted and superseded sources, gives an output dataset of 2 552 622 787 sources. All sources were assigned at least one observation by the IDU cross-match.
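The bookkeeping in Table 13.1 can be checked with a short script. The following is a minimal sketch, not part of the Integrator itself: it assumes that each output count is the input minus the deleted and superseded sources plus the added ones, and that empty table cells stand for zero. The names `ROWS` and `output_count` are illustrative only.

```python
# Counts transcribed from Table 13.1, in the order
# (input, deleted, removed_by_split, removed_by_merge, added).
# Assumption: empty cells in the table are treated as zero.
ROWS = {
    "IGSL Source": (603_038_774, 3_732_663, 1_815_549, 16_118_938, 0),
    "New source (IDU-01)": (1_333_804_445, 105_501_144, 5_597_468, 41_726_981, 0),
    "HPM Addendum": (200, 50, 0, 27, 0),
    "New source (IDU-02)": (645_771_010, 81_006_080, 4_348_988, 35_842_395, 0),
    "IDU-03 as a new source": (0, 0, 0, 0, 194_550_000),
    "IDU-03 as result of merge": (0, 0, 0, 0, 45_887_691),
    "IDU-03 as result of split": (0, 0, 0, 0, 25_260_950),
}


def output_count(inp, deleted, split, merged, added):
    """Sources remaining after deletions and supersessions, plus additions."""
    return inp - deleted - split - merged + added


outputs = {name: output_count(*row) for name, row in ROWS.items()}
total_output = sum(outputs.values())

for name, count in outputs.items():
    print(f"{name:28s} {count:>15,d}")
print(f"{'Total':28s} {total_output:>15,d}")
```

Under these assumptions the per-row results reproduce the Output column, and their sum equals the 2 552 622 787 sources quoted above.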

Updates from AGIS-03

The outputs of the second AGIS run for cycle 3 (AGIS-03.2) were integrated into the dataset described in the previous subsection, which included sources with positions and proper motions supplied by different systems, including AGIS-01, AGIS-02, IDU-01, IDU-02 and IDU-03, as well as data from the IGSL catalogue.

The distribution of the AGIS-03 updates against the types of source records is shown in Table 13.2.

Table 13.2: Distribution of updates from AGIS-03.
| Source Type | Origin | 2-params | 5-params | 6-params | Not Converged | Not Processed | Total |
|---|---|---|---|---|---|---|---|
| NEW | IDU-03 | 179 039 547 | | 768 236 | 13 017 569 | 1 724 648 | 194 550 000 |
| NEW (merge) | IDU-03 | 12 472 283 | | 33 051 532 | 339 443 | 24 433 | 45 887 691 |
| NEW (split) | IDU-03 | 22 849 878 | | 1 688 465 | 673 951 | 48 656 | 25 260 950 |
| PERSISTENT | IDU-02 | 215 419 351 | 211 457 213 | 92 521 229 | 4 289 332 | 886 422 | 524 573 547 |
| PERSISTENT | HPM-Addendum | 13 | 104 | 3 | 3 | | 123 |
| PERSISTENT | IDU-01 | 531 101 741 | 95 298 720 | 519 412 259 | 33 613 607 | 1 552 525 | 1 180 978 852 |
| PERSISTENT | IGSL | 66 253 555 | 278 706 238 | 235 588 193 | 784 981 | 38 657 | 581 371 624 |
| Total | | 1 027 136 368 | 585 462 275 | 883 029 917 | 52 718 886 | 4 275 341 | 2 552 622 787 |
Notes. In this context, ‘Not Processed’ means that AGIS did not find any suitable observations matching the required quality criteria for the given source.

Of the sources retained in the integration, that is, those which have not been deleted or superseded by IDU-03, most received AGIS-03 updates. In the few cases where they did not, their previous values for positions and proper motions are maintained.

On a per-source level there is no mixing between AGIS solutions from different cycles. This means that a 2-parameter solution from AGIS-03 will remove any proper motions that were applied to a source from AGIS-02. Nevertheless, proper motions from IDU and the IGSL are retained to provide support for future cross-match processing.
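The update rule described above can be illustrated as follows. This is a hedged sketch rather than the Integrator's actual code: the classes `Source` and `Agis03Solution`, their fields, and the function `apply_agis03` are hypothetical. It also assumes that the "no mixing" rule applies to proper motions from any earlier AGIS cycle, whereas the text names only AGIS-02 explicitly, and it models sources without an AGIS-03 update by passing `None`.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple

# Assumption: every earlier AGIS cycle counts as an "AGIS solution";
# the text only names AGIS-02 explicitly.
AGIS_ORIGINS = {"AGIS-01", "AGIS-02", "AGIS-03"}


@dataclass(frozen=True)
class Source:
    """Hypothetical integrated source record (illustrative fields only)."""
    position: Tuple[float, float]
    position_origin: str
    proper_motion: Optional[Tuple[float, float]] = None
    proper_motion_origin: Optional[str] = None


@dataclass(frozen=True)
class Agis03Solution:
    """Hypothetical AGIS-03 result; proper_motion is None for 2 parameters."""
    position: Tuple[float, float]
    proper_motion: Optional[Tuple[float, float]] = None


def apply_agis03(source: Source, solution: Optional[Agis03Solution]) -> Source:
    """Return the source after applying an AGIS-03 update, if there is one."""
    if solution is None:
        # No AGIS-03 update: previous position and proper motion are kept.
        return source
    if solution.proper_motion is not None:
        # 5- or 6-parameter solution: proper motion comes from AGIS-03.
        pm, pm_origin = solution.proper_motion, "AGIS-03"
    elif source.proper_motion_origin in AGIS_ORIGINS:
        # 2-parameter solution: drop proper motions from an earlier AGIS cycle.
        pm, pm_origin = None, None
    else:
        # 2-parameter solution: IDU and IGSL proper motions are retained.
        pm, pm_origin = source.proper_motion, source.proper_motion_origin
    return replace(
        source,
        position=solution.position,
        position_origin="AGIS-03",
        proper_motion=pm,
        proper_motion_origin=pm_origin,
    )


old = Source((10.0, 20.0), "AGIS-02", (1.0, -2.0), "AGIS-02")
updated = apply_agis03(old, Agis03Solution((10.1, 20.1)))
print(updated.proper_motion, updated.proper_motion_origin)
```

In this sketch a source with an AGIS-02 proper motion loses it when AGIS-03 supplies only a 2-parameter solution, whereas a source whose proper motion came from the IGSL or IDU would keep it.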

Distribution of position and proper motion originators

The following two tables summarise the origin of the positions and proper motions for the sources in the dataset generated prior to filtering in the archive for Gaia EDR3.

The distribution of the positions of the sources is shown in Table 13.3.

Table 13.3: Distribution of position origins.
Position Origin Count Percentage
AGIS-03 2 495 628 560 97.77%
IDU-03 15 828 700 0.62%
AGIS-02 6 261 410 0.25%
IDU-02 3 249 943 0.13%
HPM-Addendum 1 0.00%
AGIS-01 4 027 522 0.16%
IDU-01 27 205 498 1.07%
IGSL 421 153 0.02%
Total 2 552 622 787 100.00%
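
The percentages in Table 13.3 follow directly from the counts. As a minimal sketch, assuming the percentages are each count divided by the table total and rounded to two decimals (the name `POSITION_ORIGINS` is illustrative), they can be recomputed as:

```python
# Counts transcribed from Table 13.3.
POSITION_ORIGINS = {
    "AGIS-03": 2_495_628_560,
    "IDU-03": 15_828_700,
    "AGIS-02": 6_261_410,
    "IDU-02": 3_249_943,
    "HPM-Addendum": 1,
    "AGIS-01": 4_027_522,
    "IDU-01": 27_205_498,
    "IGSL": 421_153,
}

total_positions = sum(POSITION_ORIGINS.values())

# Percentage of the total, formatted to two decimals as in the table.
percentages = {
    origin: f"{100 * count / total_positions:.2f}%"
    for origin, count in POSITION_ORIGINS.items()
}

for origin, pct in percentages.items():
    print(f"{origin:14s} {POSITION_ORIGINS[origin]:>15,d} {pct:>8s}")
print(f"{'Total':14s} {total_positions:>15,d}")
```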

The originators of the proper motions of the integrated sources are shown in Table 13.4.

Table 13.4: Distribution of proper motion origins.
Proper Motion Origin Count Percentage
AGIS-03.2 1 468 492 192 57.53%
IDU-03 14 182 341 0.56%
AGIS-02 360 423 0.01%
IDU-02 3 015 135 0.12%
HPM Addendum 12 0.00%
AGIS-01 108 0.00%
IDU-01 0 0.00%
IGSL 21 569 221 0.84%
None 1 045 003 366 40.94%
Total 2 552 622 798 100.00%

Population of radial velocity

The radial velocity in the integrated source is populated by the Integrator when it receives a CU6 spectroscopic barycentric radial velocity which has been marked as valid. The distribution of the sources with radial velocities marked as valid by source type is shown in Table 13.5. This is prior to filtering in the archive for Gaia DR3.

Table 13.5: Sources with radial velocity from CU6 in the integrated source.
Type Count
IGSL Source 5 789 569
New source (IDU-01) 449 140
HPM Addendum 56
New source (IDU-02) 26 917 575
New source (IDU-03)
    as a new source 7 302
    as result of merge 699 625
    as result of split 20 188
Total 33 883 455
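
The population rule for the radial velocity can be sketched as below. This is only an illustration of the rule stated in the text: the record type `Cu6RadialVelocity`, its field names, the km/s unit and the function `integrated_radial_velocity` are assumptions, not the actual data model.

```python
from typing import NamedTuple, Optional


class Cu6RadialVelocity(NamedTuple):
    """Hypothetical CU6 input record; field names and unit are assumptions."""
    value_km_s: float
    is_valid: bool


def integrated_radial_velocity(cu6: Optional[Cu6RadialVelocity]) -> Optional[float]:
    """Populate the radial velocity only when CU6 supplied a valid value."""
    if cu6 is not None and cu6.is_valid:
        return cu6.value_km_s
    # No CU6 input, or input not marked as valid: leave the field unpopulated.
    return None


print(integrated_radial_velocity(Cu6RadialVelocity(-12.5, True)))
print(integrated_radial_velocity(Cu6RadialVelocity(-12.5, False)))
print(integrated_radial_velocity(None))
```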

Data from the CUs in the integrated sources

The Integrator updates the integrated source with data from other CUs. Table 13.6 summarises the inputs received from the CUs for the integrated source. Note that this data represents the input prior to filtering in the archive for Gaia DR3.

Table 13.6: Data from other CUs in the integrated source.
IGSL Source 580 780 798 539 999 572 538 567 080 546 622 982 6 664 850 4 155 645 534 521 746
New source (IDU-01) 1 163 259 818 891 276 489 872 608 175 949 316 823 724 722 1 578 904 795 805 103
HPM Addendum 123 118 118 117 59 8 116
New source (IDU-02) 515 233 543 396 200 219 387 369 534 430 218 198 29 763 432 5 933 840 367 894 882
New source (IDU-03)
    as a new source 177 971 228 71 150 885 68 675 906 117 674 944 12 174 11 386 44 193 787
    as result of merge 45 340 150 39 594 956 39 493 629 40 030 787 1 035 054 366 218 38 606 046
    as result of split 24 006 082 8 310 908 7 820 615 10 651 757 26 713 22 785 6 793 574
Total 2 506 591 742 1 946 533 147 1 914 535 057 2 094 515 608 38 227 004 12 068 786 1 787 815 254
Notes. CU5 G, RP and BP refer to the photometric data in the G broad band and the integrated GBP and GRP bands, and SSC refers to the spectral-shape coefficients (see Chapter 5). DSC refers to the Apsis DSC module (see Section 11.3.2).

Integrated quasar and galaxy tables

The MDB Integrator also combines data from different CUs to produce integrated quasar and galaxy tables. See Chapter 12 for further details.

Integrated astrophysical parameter table

The Integrator collects the data provided by Apsis (see Section 11.3) to produce an integrated astrophysical parameter table. Not all sources receive input from every Apsis module. The number of sources for the different combinations of Apsis modules is shown in Table 13.7. As for other integrated tables, note that this represents the immediate output from Apsis which is subject to filtering prior to ingestion in the archive.

Table 13.7: Data input to the integrated astrophysical parameter table.
141 071
× 3 190 854
× 274
× × 41 790
× 79
× × 129
× × × 50
× 3 156 604
× × 192 174 980
× × 667 388
× × × 85 404 325
× × 12 270
× × × 5 315 500
× × × 13 862
× × × × 9 253 167
× × 11
× × × 5 071
× × × 41
× × × × 1 147
× × × 15
× × × × 250 190
× × × × 255 962
× × × × × 170 895 981
× 30 391
× × 2 147 132
× × 561
× × × 138 596
× × 26
× × × 210
× × × 1 483
× × × × 14 734
× × 855 070
× × × 27 039 477
× × × 196 544
× × × × 9 902 197
× × × 35 989
× × × × 4 443 557
× × × × 53 370
× × × × × 4 512 625
× × × 880
× × × × 24 565
× × × × 2 180
× × × × × 26 402
× × × × 911 790
× × × × × 21 601 305
× × × × × 50 080 499
× × × × × × 1 195 014 910
× × × × 1
× × × × × 275
× × × × × 18 457
× × × × × × 19 612
× × × × × × 11 097 866
× × × × × × × 753 671 322
Total 2 552 622 787