# 7.5.1 Three-dimensional KLD: Gaia EDR3 vs Gaia DR2

We perform Kullback–Leibler divergence (KLD) tests to check for correlations and clustering between observables. This allows us to identify combinations of observables (subspaces) that exhibit unexpected behaviour. For two-dimensional subspaces, the KLD is given by

 $\mathrm{KLD} = -\int d^{2}x\, p(x)\, \log\left[\frac{p(x)}{q(x)}\right]$ (7.1)

where $x$ is a subspace of observables, $p(x)$ is the joint distribution of the observables in the dataset, and $q(x)=\Pi_{i}p_{i}(x_{i})$, i.e. the product of the marginalised 1D distributions of the individual observables.
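As an illustration of Eq. 7.1, a minimal sketch of a histogram-based estimate for a two-dimensional subspace is given below. The function name `kld_2d`, the number of bins, and the mock Gaussian samples are assumptions made for illustration and are not taken from the validation pipeline. The sketch evaluates the conventional non-negative form $\sum p\log(p/q)$, so that larger values correspond to stronger clustering; this overall sign convention is likewise an assumption.

```python
import numpy as np


def kld_2d(x, y, bins=20):
    """Histogram estimate of sum p*log(p/q) for the 2D subspace (x, y).

    Hypothetical helper: p is the binned joint distribution and q is the
    product of its two 1D marginals.
    """
    counts, _, _ = np.histogram2d(x, y, bins=bins)
    p = counts / counts.sum()              # joint distribution p(x)
    px = p.sum(axis=1, keepdims=True)      # marginal of the first observable
    py = p.sum(axis=0, keepdims=True)      # marginal of the second observable
    q = px * py                            # product of marginals q(x)
    mask = p > 0                           # empty cells contribute nothing
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))


# Mock observables (assumed, not Gaia data).
rng = np.random.default_rng(0)
a = rng.normal(size=50_000)
b = rng.normal(size=50_000)

kld_indep = kld_2d(a, b)            # independent pair: close to zero
kld_corr = kld_2d(a, a + 0.3 * b)   # correlated pair: clearly larger
print(f"independent: {kld_indep:.4f}  correlated: {kld_corr:.4f}")
```

For a two-dimensional subspace this quantity is the mutual information of the two observables. A binned estimate carries a small positive bias that depends on the number of bins and the sample size, which is one reason why only relative comparisons of values obtained with the same binning are informative.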

The KLD measures clustering in n-dimensional data. KLD values are only meaningful relative to each other. When KLD values from two datasets are compared, a one-to-one relation is expected if both datasets were drawn from the same underlying distributions. A higher (lower) KLD value means more (less) clustering. Here we focus on comparing Gaia EDR3 to Gaia DR2: we compute the 2D KLD for the whole of Gaia EDR3 and Gaia DR2, and the 3D KLD in small regions of the sky to explore the spatial dependence of various photometric and astrometric quantities.
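The expected one-to-one relation can be illustrated with a minimal sketch that extends the histogram estimate to three dimensions and applies it to two mock catalogues drawn from the same distribution. The names `kld_nd` and `mock_catalogue`, the binning, the common range, and the Gaussian toy model are all assumptions made for illustration; they do not reproduce the procedure applied to Gaia EDR3 and Gaia DR2. As a further assumption, the sketch uses the non-negative form $\sum p\log(p/q)$ so that larger values mean more clustering.

```python
import numpy as np


def kld_nd(sample, bins=8, ranges=None):
    """Histogram estimate of sum p*log(p/q) for an n-dimensional subspace.

    sample has shape (n_sources, n_observables); q is the product of the
    1D marginals of the binned joint distribution p. Hypothetical helper.
    """
    counts, _ = np.histogramdd(sample, bins=bins, range=ranges)
    p = counts / counts.sum()
    q = np.ones_like(p)
    for axis in range(p.ndim):
        other = tuple(i for i in range(p.ndim) if i != axis)
        q = q * p.sum(axis=other, keepdims=True)   # broadcast 1D marginal
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))


def mock_catalogue(rng, n, strength):
    """Three toy observables whose mutual correlation grows with strength."""
    z = rng.normal(size=(n, 3))
    z[:, 1] += strength * z[:, 0]
    z[:, 2] += strength * z[:, 1]
    return z


rng = np.random.default_rng(1)
strengths = [0.0, 0.3, 0.6, 1.0]      # one toy "subspace" per value
ranges = [(-5.0, 5.0)] * 3            # identical binning for both catalogues

kld_a = np.array([kld_nd(mock_catalogue(rng, 100_000, s), ranges=ranges)
                  for s in strengths])
kld_b = np.array([kld_nd(mock_catalogue(rng, 100_000, s), ranges=ranges)
                  for s in strengths])

for s, ka, kb in zip(strengths, kld_a, kld_b):
    print(f"strength={s:.1f}  catalogue A: {ka:.4f}  catalogue B: {kb:.4f}")
```

Because both mock catalogues share the same underlying distribution and the same bin edges, the paired values scatter around the one-to-one line; a systematic offset in one direction would indicate that one catalogue is more clustered than the other in that subspace.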

Summary of the results:

• We find that the two- and three-dimensional distributions are consistent between pairs of patches that are symmetric with respect to the scanning law.

• When compared with Gaia DR2, we find Gaia EDR3 to be systematically less clustered in subspaces containing positional information. We interpret this as an improvement on the systematics introduced by the scanning law.