# 7.6.5 Processing steps

## The main fitting procedure

The first step in the processing of an EB LC is to determine whether there is a minimum of 3 data points in each eclipse. If not, the source is rejected, as no reliable solution can generally be computed in such a case. This step relies on the Gaussian fits made during variability processing (Section 10.7.1), which provide a first determination of the positions and durations of the minima.

The next step invokes the global optimisation algorithm, which returns the top $N_{\mathrm{LC}}=10$ best-matching LCs (cf. Section 7.6.4). The eclipse positions and durations of these LCs do not necessarily coincide with those of the Gaussian curves from variability processing, so it is checked once again that there are at least 3 data points in each eclipse of the best-matching LC. If not, the source is rejected.
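As an illustration of the data-point check applied in the two steps above, the following Python sketch counts the observations that fall inside each eclipse window in orbital phase. The function names, the representation of an eclipse as a (centre, half-width) pair in phase units, and the way such a window would be derived from the fitted position and duration are assumptions made for this example; they are not taken from the processing code.

```python
import numpy as np

def count_points_in_eclipse(phases, centre, half_width):
    """Count observations within half_width of centre, with phase wrapping at 1.

    All quantities are in units of orbital phase (hypothetical convention).
    """
    phases = np.asarray(phases, dtype=float)
    # Signed circular distance to the eclipse centre, mapped to [-0.5, 0.5)
    delta = (phases - centre + 0.5) % 1.0 - 0.5
    return int(np.sum(np.abs(delta) <= half_width))

def has_enough_eclipse_points(phases, eclipses, min_points=3):
    """Return True if every eclipse window holds at least min_points observations.

    eclipses is an iterable of (centre, half_width) pairs.
    """
    return all(
        count_points_in_eclipse(phases, centre, half_width) >= min_points
        for centre, half_width in eclipses
    )

# Example: two eclipses near phases 0.0 and 0.5
phases = [0.98, 0.0, 0.02, 0.3, 0.49, 0.5, 0.52, 0.7]
accepted = has_enough_eclipse_points(phases, [(0.0, 0.03), (0.5, 0.03)])
```

The circular distance ensures that an eclipse centred near phase 0 also collects the points just below phase 1.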

The last step of the main processing involves the local optimisation, as described in Section 7.6.4. A solution is calculated for each of the $N_{\mathrm{LC}}=10$ LCs, and the one with the lowest $F2$ value is retained as the best. For completeness, it is noted that when the EB simulator is called during local optimisation, the requested number of surface elements for the bigger component is set to $N=3\,000$, the limb darkening is set to square root law, and the “accurate” algorithm is selected for the calculation of the mutual irradiation effects.

At this point, the LC solution has been obtained. However, a number of additional steps can be taken to bring the solution to a more useful form; these are described below.

## Relabeling of the components

During main processing, the components are labeled so that the primary component is eclipsed by the secondary component at $t_{0}$. However, for the sake of usability, the convention used in the published solution is that the first component is the one with the higher luminosity ($L_{1}>L_{2}$) in $G$. This can only be established after the solution has been found. Therefore, the indices 1 and 2 are swapped at this point if necessary.

## Temperature ratio versus luminosity ratio

As mentioned in Section 7.6.4, the relevant quantity when fitting a single-passband LC is the luminosity ratio of the components and not their individual temperatures. For this reason, the $G$-band luminosity ratio, $L_{2}/L_{1}$, is also output as a convenience for the end user. Since the first component was defined to be the more luminous one in $G$, it follows that $L_{2}/L_{1}\leq 1$.

However, the luminosity ratio is a derived quantity for which there is no entry in the covariance matrix of the solution (recall from Section 7.6.4 that it is the effective temperature of the first component that is fitted while that of the second component is held constant). In the interest of providing the covariance matrix of all the fitted parameters, it was decided to also output the ratio of the fitted temperature to the non-fitted one, taking care to appropriately scale the relevant covariance matrix elements. Temperature ratios are published rather than individual temperatures in order to avoid possible confusion. Indeed, individual temperatures have no physical relevance in this context, and their values are unrelated to the true temperatures of the observed systems (except by chance). The fact that they are calculated for a blackbody atmosphere also implies that it is probably more pertinent to directly employ luminosity ratios for any further analysis, unless there is an explicit interest in the ratio of the effective temperatures or the associated uncertainty.
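The scaling of the covariance elements can be sketched as follows, assuming standard linear error propagation: dividing the fitted temperature by a constant $T_{\mathrm{fixed}}$ divides its variance by $T_{\mathrm{fixed}}^{2}$ and its covariances with the other parameters by $T_{\mathrm{fixed}}$. The function name, the parameter ordering, and the numerical values are hypothetical and serve only to illustrate the arithmetic.

```python
import numpy as np

def rescale_covariance_for_ratio(cov, idx, t_fixed):
    """Covariance of the parameters after replacing T_fit by T_fit / t_fixed.

    cov     : covariance matrix of the fitted parameters
    idx     : index of the fitted temperature in that matrix (assumed known)
    t_fixed : temperature held constant during the fit
    """
    cov = np.asarray(cov, dtype=float)
    # Jacobian of the change of variables: identity except for the temperature row
    jac = np.eye(cov.shape[0])
    jac[idx, idx] = 1.0 / t_fixed
    return jac @ cov @ jac.T

# Toy example: parameter 0 is the fitted temperature (K), parameter 1 is arbitrary
cov = [[100.0, 2.0],
       [2.0, 0.04]]
cov_ratio = rescale_covariance_for_ratio(cov, idx=0, t_fixed=5000.0)
```

Elements that do not involve the temperature are left unchanged by this transformation.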

As mentioned above, at the end of local optimisation the labeling of the components is changed so that the first component is always the more luminous one in $G$. This means that star 1 is no longer necessarily the one whose temperature was adjusted. Consequently, the published temperature ratio sometimes refers to $T_{\mathrm{eff},1}/T_{\mathrm{eff},2}$ and sometimes to $T_{\mathrm{eff},2}/T_{\mathrm{eff},1}$. This is indicated by the bitIndex boolean mask, which is also listed in the solution. The luminosity ratio, calculated after the labeling has been finalised, does not have this complication.
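The bookkeeping implied by the relabeling can be sketched in a few lines of Python. The function and key names are hypothetical, and the sketch does not reproduce how the bitIndex mask encodes the information; it only shows that the luminosity ratio is always at most 1, while the sense of the temperature ratio depends on whether a swap took place.

```python
def relabel_by_luminosity(lum_a, lum_b):
    """Relabel the components so that star 1 is the more luminous one in G.

    lum_a, lum_b : G-band luminosities under the main-processing labels, where
                   component a is the one whose temperature was fitted.
    """
    swapped = lum_b > lum_a
    l1, l2 = (lum_b, lum_a) if swapped else (lum_a, lum_b)
    return {
        "swapped": swapped,
        "luminosity_ratio": l2 / l1,  # L2/L1 <= 1 by construction
        # The ratio fitted/non-fitted becomes T2/T1 when the labels are swapped
        "temperature_ratio_sense": "T2/T1" if swapped else "T1/T2",
    }

result = relabel_by_luminosity(1.0, 4.0)
```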

## Fraction of variance unexplained (g_rank)

The goodness of fit of a solution is normally quantified via the $F2$ statistic. However, the $F2$ values of the calculated EB solutions do not serve this purpose well, for reasons that are not well understood but are discussed in Section 7.6.6. To alleviate this problem, an additional statistic is computed, namely the g_rank. It is defined as a linear transformation of the logarithm of the fraction of variance unexplained (FVU):

 $\mathrm{g\_rank}=-0.11\left[\log_{10}(\mathrm{FVU})-3.45\right]\qquad\mathrm{with}\qquad\mathrm{FVU}=\frac{\sum_{i=1}^{N}\left(m_{i}^{\mathrm{obs}}-m_{i}^{\mathrm{model}}\right)^{2}}{(N-1)\,\mathrm{var}\{m_{i}^{\mathrm{obs}}\}},$ (7.45)

where $m_{i}^{\mathrm{obs}}$ and $m_{i}^{\mathrm{model}}$ are the magnitudes of the $i=1,\ldots,N$ observations and model predictions, respectively, and var$\{m_{i}^{\mathrm{obs}}\}$ is the biased variance of the observed LCs.

The g_rank statistic has been used for a similar purpose in variability processing. Since the FVU is defined in terms of unweighted observation values, the g_rank offers the advantage of being unaffected by possible inaccuracies in the determination of the photometric uncertainties, which is one of the possible explanations for the large values of $F2$ (cf. Section 7.6.6). The values of g_rank tend towards 0 when the goodness of fit is bad, and towards 1 when it is good.
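A minimal Python sketch of Eq. 7.45 is given below. It follows the text literally in using the biased variance (NumPy's default, `ddof=0`) in the denominator; the function names are chosen for this example only.

```python
import numpy as np

def fraction_of_variance_unexplained(m_obs, m_model):
    """FVU of Eq. 7.45 from observed and model magnitudes (unweighted)."""
    m_obs = np.asarray(m_obs, dtype=float)
    m_model = np.asarray(m_model, dtype=float)
    n = m_obs.size
    residual_ss = np.sum((m_obs - m_model) ** 2)
    # np.var defaults to ddof=0, i.e. the biased variance mentioned in the text
    return residual_ss / ((n - 1) * np.var(m_obs))

def g_rank(m_obs, m_model):
    """Linear transformation of log10(FVU), as in Eq. 7.45."""
    fvu = fraction_of_variance_unexplained(m_obs, m_model)
    return -0.11 * (np.log10(fvu) - 3.45)

# Toy light curve with small residuals
m_obs = [10.0, 10.2, 10.0, 10.2]
m_model = [10.01, 10.19, 10.01, 10.19]
rank = g_rank(m_obs, m_model)
```

From the definition, $\mathrm{FVU}=1$ corresponds to $\mathrm{g\_rank}\approx 0.38$, and $\mathrm{g\_rank}\geq 0.6$ requires $\log_{10}(\mathrm{FVU})\leq 3.45-0.6/0.11\approx -2.0$, that is, roughly 1% or less of the variance left unexplained.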

## Time of periastron (${\mathbf{t}_{Z}}$) instead of ${\mathbf{t}_{0}}$

The time of periastron passage, $t_{Z}$, is a more physically meaningful parameter to use as a time reference than $t_{0}$, and is therefore the one that is published. See, however, Section 7.6.6.

## Filtering of the solutions

At this point in the processing, approximately $1.6\times 10^{6}$ LC solutions have been calculated out of the $\sim\!\!2.2\times 10^{6}$ EBs identified by variability processing, the difference being due to systems rejected for not having enough data points in eclipse, or for failing to be properly processed for technical reasons. A preliminary appraisal of these solutions, both by statistical means and by visual inspection, revealed several issues that raised concerns about their quality. Some of these issues could be related to inadequacies of the EB processing code while others could be due to wrong periods, false EB detections, etc. Unfortunately, the available time to address these issues before Gaia DR3 was limited. It was therefore decided to apply instead a rather drastic filtering process with the aim of maximising the fraction of good solutions in the Gaia DR3 sample at the cost of severely limiting the completeness of the sample. In this context, the following filters were applied to the aforementioned $\sim\!\!1.6\times 10^{6}$ solutions (which had already passed the requirement of having at least 3 data points in eclipse):

1. The goodness of fit, as expressed by the $F2$ statistic, should be no greater than 50. This filter is applied to all NSS solutions.

2. The rank of the solution, as defined in Section 7.6.5, should be $\texttt{g\_rank}\geq 0.6$. This rather high value was selected on the basis of the experience obtained in variability processing, where limiting the rank of the Gaussian fits to $\geq 0.6$ resulted in a low number of false positives at the cost of excluding many good solutions. Applied to NSS processing, this criterion seems to have a similar effect, and is indeed responsible for the vast majority of the rejected solutions.

3. The efficiency of the solution should be in the range (0.2, 1]. The efficiency (Eichhorn 1989) is a measure of the amount of correlation among the estimated parameters. No correlation corresponds to $\texttt{efficiency}=1$, while values near 0 indicate a significant amount of correlation. By excluding low-efficiency solutions, the majority of LCs with an inadequate phase coverage (especially at critical segments of the LC) or otherwise badly constrained solutions are filtered out without having to resort to heuristic criteria that could have hard-to-understand side effects.
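The three filters listed above can be summarised as a single predicate. The sketch below is illustrative only: the function and argument names are hypothetical, and the values of $F2$, g_rank, and efficiency are assumed to have been computed beforehand.

```python
def passes_filters(f2, g_rank, efficiency):
    """Return True if a solution satisfies the three filters listed above."""
    good_fit = f2 <= 50.0                      # filter 1: F2 no greater than 50
    high_rank = g_rank >= 0.6                  # filter 2: g_rank >= 0.6
    well_constrained = 0.2 < efficiency <= 1.0  # filter 3: efficiency in (0.2, 1]
    return good_fit and high_rank and well_constrained

kept = passes_filters(f2=12.3, g_rank=0.72, efficiency=0.45)
```

Note that the efficiency interval is open at 0.2 and closed at 1, so a solution with an efficiency of exactly 0.2 is rejected.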

As a result of this filtering, the final number of Gaia DR3 EB solutions is $86\,918$. One obvious objective for future data releases is to maximise this number while keeping the number of bad solutions to a minimum.