# 7.5.2 Data used

The tests use the following data:

• 2D KLD: We use the gbin files from GaiaSource to compare two datasets, edr3int3 and Gaia DR2. We do not apply any spatial selection, i.e., we consider the whole sky at once. To compute the KLD from (7.1), we adopt limits that contain 99% of the data in each subspace, in order to exclude outliers.

• 2D KLD subsets: We also perform tests on two magnitude-limited subsets of edr3int3: bt11 ($G<11$) and ft20 ($G>20$). Again, we do not apply any spatial selection to these, and we use the 99% limits to exclude outliers in the KLD calculation.

• KLD patches: Finally, we perform our tests on four circular regions on the sky, labelled ‘patch-a’, ‘patch-b’, ‘patch-c’, and ‘patch-d’, each with a radius of 5 deg and centred on ($l$, $b$) = (-90, -45), (-90, 45), (90, -45), and (90, 45), respectively. The regions contain $\sim 350000$ stars on average. They are chosen such that ‘patch-a’ and ‘patch-d’ cover regions with a high number of transits, while ‘patch-b’ and ‘patch-c’ have fewer transits. No filtering has been applied to the data, except that, since Gaia EDR3 covers a larger range of G magnitudes than Gaia DR2, we apply a cut at phot_g_mean_mag $<20$.
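
The selections described above can be illustrated with a minimal Python sketch. It is not the validation pipeline itself, and it rests on several assumptions that the text does not confirm: that (7.1) is the standard discrete form $\sum_i p_i \ln(p_i/q_i)$ evaluated on a 2D histogram, that the 99% limits are the central 99% interval taken per axis from the first dataset, that only bins populated in both datasets contribute, and that the patch centres are given in degrees. The function names `central_limits`, `kld_2d`, and `in_patch`, as well as the bin count, are hypothetical choices made for illustration.

```python
import numpy as np

# Patch centres (l, b) as listed in the text; degrees are assumed.
PATCH_CENTRES = {
    "patch-a": (-90.0, -45.0),
    "patch-b": (-90.0, 45.0),
    "patch-c": (90.0, -45.0),
    "patch-d": (90.0, 45.0),
}


def central_limits(x, fraction=0.99):
    """Interval containing the central `fraction` of x (assumed finite)."""
    tail = 100.0 * (1.0 - fraction) / 2.0
    lo, hi = np.percentile(x, [tail, 100.0 - tail])
    return float(lo), float(hi)


def kld_2d(p_xy, q_xy, bins=50, fraction=0.99):
    """Histogram-based KLD (in nats) between two samples of shape (N, 2)."""
    p_xy = np.asarray(p_xy, dtype=float)
    q_xy = np.asarray(q_xy, dtype=float)
    # Assumption: the limits come from the first sample, one axis at a time.
    ranges = [central_limits(p_xy[:, k], fraction) for k in range(2)]
    hp, _, _ = np.histogram2d(p_xy[:, 0], p_xy[:, 1], bins=bins, range=ranges)
    hq, _, _ = np.histogram2d(q_xy[:, 0], q_xy[:, 1], bins=bins, range=ranges)
    p = hp / hp.sum()
    q = hq / hq.sum()
    # Assumption: bins that are empty in either histogram are skipped.
    mask = (p > 0) & (q > 0)
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))


def in_patch(l_deg, b_deg, l0_deg, b0_deg, radius_deg=5.0):
    """True where (l, b) lies within radius_deg of the centre (l0, b0)."""
    l, b = np.radians(l_deg), np.radians(b_deg)
    l0, b0 = np.radians(l0_deg), np.radians(b0_deg)
    # Spherical law of cosines; the clip guards against rounding errors.
    cos_sep = np.sin(b) * np.sin(b0) + np.cos(b) * np.cos(b0) * np.cos(l - l0)
    sep_deg = np.degrees(np.arccos(np.clip(cos_sep, -1.0, 1.0)))
    return sep_deg <= radius_deg


# Usage with synthetic data standing in for two catalogue subspaces.
rng = np.random.default_rng(0)
sample_a = rng.normal(0.0, 1.0, size=(20000, 2))
sample_b = rng.normal(0.5, 1.0, size=(20000, 2))
print("KLD:", kld_2d(sample_a, sample_b, bins=20))

l0, b0 = PATCH_CENTRES["patch-a"]
g_mag = np.array([15.0, 20.5])
keep = in_patch(np.array([270.0, 270.0]), np.array([-44.0, -44.0]), l0, b0)
print("selected:", keep & (g_mag < 20.0))  # combines patch and magnitude cut
```

Taking the limits per axis means the retained 2D region holds slightly less than 99% of the points; a joint definition of the 99% limits would be an equally plausible reading of the text.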