# 3.3 Calibration models

Author(s): Sergei Klioner

## 3.3.1 Overview

Author(s): Lennart Lindegren

The astrometric principles for Gaia were outlined already in the Hipparcos Catalogue ESA (1997, Vol. 3, Ch. 23) where, based on the accumulated experience of the Hipparcos mission and the general principle of a global astrometric data analysis was succinctly formulated as the minimization problems (see Lindegren et al. (2012)):

 $\min_{\boldsymbol{s},\,\boldsymbol{n}}~{}\lVert\boldsymbol{f}^{\text{obs}}-% \boldsymbol{f}^{\text{calc}}(\boldsymbol{s},\boldsymbol{n})\rVert_{\,\mathcal{% M}}\,.$ (3.38)

Here $\boldsymbol{s}$ is the vector of unknowns (parameters) describing the barycentric motions of the ensemble of sources used in the astrometric solution, and $\boldsymbol{n}$ is a vector of ‘nuisance parameters’ describing the instrument and other incidental factors which are not of direct interest for the astronomical problem but are nevertheless required for realistic modelling of the data. The observations are represented by the vector $\boldsymbol{f}^{\text{obs}}$ which could for example contain the measured detector coordinates of all the stellar images at specific times. $\boldsymbol{f}^{\text{calc}}(\boldsymbol{s},\boldsymbol{n})$ is the observation model, e.g., the expected detector coordinates calculated as functions of the astrometric and nuisance parameters. The norm is calculated in a metric $\mathcal{M}$ defined by the statistics of the data; in practise the minimization will correspond to a weighted least-squares solution with due consideration of robustness issues. The statistical weight $W_{l}=w_{l}/(\sigma_{l}^{2}+\epsilon_{l}^{2})$ of individual observations $l$ is composed of a contribution from the formal standard uncertainty of the observation $\sigma_{l}$ and the excess noise $\epsilon_{l}$ represents modelling errors and should ideally be zero. However, it is unavoidable that some sources do not behave exactly according to the adopted astrometric model (Section 3.3.3), or that the attitude (Section 3.3.4) sometimes cannot be represented by the model used to sufficient accuracy.

The excess noise term $\epsilon_{l}$ is introduced to allow these cases to be handled in a reasonable way, i.e., by effectively reducing the statistical weight of the observations affected. It should be noted that these modelling errors are assumed to affect all the observations of a particular star, or all the observations in a given time interval. (By contrast, the down-weighting factor $w_{l}$ is intended to take care of isolated outliers, for example due to a cosmic-ray hit in one of the CCD samples.) This is reflected in the way the excess noise is modelled as the sum of two components,

 $\epsilon_{l}^{2}=\epsilon_{i}^{2}+\epsilon^{2}_{a}(t_{l})\,,$ (3.39)

where $\epsilon_{i}$ is the excess noise associated with source $i$ (if $l\in i$, that is, $l$ is an observation of source $i$), and $\epsilon_{a}(t)$ is the excess attitude noise, being a function of time. For a ‘good’ primary source, we should have $\epsilon_{i}=0$, and for a ‘good’ stretch of attitude data we may have $\epsilon_{a}(t)=0$. Calibration modelling errors are not explicitly introduced in Equation 3.39, but could be regarded as a more or less constant part of the excess attitude noise. The estimation of $\epsilon_{i}$ is described in Section 3.3.3, and the estimation of $\epsilon_{a}(t)$ in Section 3.3.4.

## 3.3.2 Aligning the Gaia reference frame

Author(s): Lennart Lindegren

Gaia measures the positions of star images in the focal plane, from which the astrometric solution reconstructs the celestial positions, proper motions, and parallaxes of the sources together with the attitude and calibration parameters. Because the measurements are essentially relative within the field of view, the resulting coordinate system of positions and proper motions is also relative in the sense that its orientation and spin with respect to the celestial reference system (ICRS) are not precisely defined from the measurements themselves. The source and attitude parameters obtained in AGIS are thus expressed in a reference system that is internally consistent and well-defined, but slightly different from ICRS.

The purpose of the frame alignment process is to correct the source and attitude parameters so that they are expressed in a reference system that coincides with ICRS as closely as possible. Gaia observations of extragalactic sources (quasars) are used for this. Quasars are sufficiently distant that their peculiar motions can be neglected; therefore, they define a kinematically non-rotating reference system. Moreover, a subset of the quasars, known as the ICRF, have accurate positions in ICRS determined by radio–interferometric (VLBI) observations. Gaia observations of their optical counterparts allow the origins of $\alpha$ and $\delta$ to be aligned with the ICRS.

Section 6.1 in Lindegren et al. (2012) gives the rigorous definition of the rotation correction, then derives a linear approximation applicable to the small corrections that exist in practise. Only the small-angle approximation is described here, as it is sufficient in (nearly) all practical cases and much easier to explain. A similar process was used to align the Hipparcos Catalogue with the extragalactic reference frame (Lindegren et al. 1992).

The ICRS may be represented by the vector triad $\mathsf{C}=[\boldsymbol{X}~{}\boldsymbol{Y}~{}\boldsymbol{Z}]$, where $\boldsymbol{X}$, $\boldsymbol{Y}$, and $\boldsymbol{Z}$ are unit vectors pointing towards $(\alpha,\,\delta)=(0,\,0)$, $(90^{\circ},\,0)$, and $(0,\,90^{\circ})$, respectively. Since ICRS is non-rotating relative to distant quasars, the directions of these vectors are fixed. The reference system of the source and attitude parameters resulting from the astrometric solution can similarly be represented by a vector triad $\mathsf{\tilde{C}}=[\boldsymbol{\tilde{X}}~{}\boldsymbol{\tilde{Y}}~{}% \boldsymbol{\tilde{Z}}]$ which deviates slightly from $\mathsf{C}$. Moreover, $\mathsf{\tilde{C}}$ may rotate slowly with respect to $\mathsf{C}$. At any particular time $t$ the difference between the two systems can be represented by a rotation vector $\boldsymbol{\varepsilon}$ such that

 $\mathsf{C}=\mathsf{\tilde{C}}+\boldsymbol{\varepsilon}\times\mathsf{\tilde{C}}% +O(\varepsilon^{2})\,.$ (3.40)

The last term indicates that we are in the regime of the small-angle approximation, meaning that terms of the order of $|\boldsymbol{\varepsilon}|^{2}$ can be neglected. If  $\mu$as precision is aimed at, that $|\boldsymbol{\varepsilon}|$ must consequently be less than about 0.5 ${}^{\prime\prime}$. Note that $\boldsymbol{\varepsilon}$ is time-dependent, due to the rotation of $\mathsf{\tilde{C}}$, and that it is defined in the sense of a correction to the orientation of $\mathsf{\tilde{C}}$.

The astrometric source model (Section 3.3.3) constrains the sources to move uniformly through space, which implies that the time-dependence of $\boldsymbol{\varepsilon}$ must be linear. As long as the correction remains small, it can therefore be written

 $\boldsymbol{\varepsilon}(t)=\boldsymbol{\varepsilon}_{0}+(t-t_{\text{ep}})% \boldsymbol{\omega}\,,$ (3.41)

where $\boldsymbol{\varepsilon}_{0}$ is the orientation correction at $t=t_{\text{ep}}$ (the reference epoch of the catalogue), and $\boldsymbol{\omega}$ is the spin correction.

The celestial position $(\alpha,\,\delta)$ and proper motion $(\mu_{\alpha*},\,\mu_{\delta})$ of a source in the ICRS (relative to $\mathsf{C}$) are defined by means of Equation 3.3 and Equation 3.4, where the components of the vectors $\boldsymbol{\bar{u}}_{\text{B}}(t_{\text{ep}})$ and $\text{d}\boldsymbol{\bar{u}}_{\text{B}}/\text{d}t|_{t=t_{\text{ep}}}$ are actually the projections of these vectors on $\mathsf{C}$. The corresponding source parameters obtained in the astrometric solution, denoted $(\tilde{\alpha},\,\tilde{\delta})$ and $(\tilde{\mu}_{\alpha*},\,\tilde{\mu}_{\delta})$, are similarly defined by the projections of the same vectors on $\mathsf{\tilde{C}}$. It is readily shown that

 \left.\begin{aligned}\displaystyle(\tilde{\alpha}-\alpha)\cos\delta&% \displaystyle=(\boldsymbol{\varepsilon}_{0}\times\boldsymbol{r})^{\prime}% \boldsymbol{p}=\phantom{-}\boldsymbol{q}^{\prime}\boldsymbol{\varepsilon}_{0}% \\ \displaystyle\tilde{\delta}-\delta&\displaystyle=(\boldsymbol{\varepsilon}_{0}% \times\boldsymbol{r})^{\prime}\boldsymbol{q}=-\boldsymbol{p}^{\prime}% \boldsymbol{\varepsilon}_{0}\end{aligned}\quad\right\} (3.42)

and

 \left.\begin{aligned}\displaystyle\tilde{\mu}_{\alpha*}-\mu_{\alpha*}&% \displaystyle=(\boldsymbol{\omega}\times\boldsymbol{r})^{\prime}\boldsymbol{p}% =\phantom{-}\boldsymbol{q}^{\prime}\boldsymbol{\omega}\\ \displaystyle\tilde{\mu}_{\delta}-\mu_{\delta}&\displaystyle=(\boldsymbol{% \omega}\times\boldsymbol{r})^{\prime}\boldsymbol{q}=-\boldsymbol{p}^{\prime}% \boldsymbol{\omega}\end{aligned}\quad\right\} (3.43)

where $\boldsymbol{p}$, $\boldsymbol{q}$, $\boldsymbol{r}$ are the unit vectors introduced in Section 3.3.3. With $\varepsilon_{0X}$, $\varepsilon_{0Y}$, $\varepsilon_{0Z}$, $\omega_{X}$, $\omega_{Y}$, $\omega_{Z}$ denoting the components of $\boldsymbol{\varepsilon}_{0}$ and $\boldsymbol{\omega}$ in $\mathsf{C}$, these relations can be written in matrix form as

 $\begin{bmatrix}(\tilde{\alpha}-\alpha)\cos\delta\\ \tilde{\delta}-\delta\end{bmatrix}=\begin{bmatrix}-\sin\delta\cos\alpha&-\sin% \delta\sin\alpha&\cos\delta\\ \sin\alpha&-\cos\alpha&0\end{bmatrix}\begin{bmatrix}\varepsilon_{0X}\\ \varepsilon_{0Y}\\ \varepsilon_{0Z}\end{bmatrix}$ (3.44)

and

 $\begin{bmatrix}\tilde{\mu}_{\alpha*}-\mu_{\alpha}\\ \tilde{\mu}_{\delta}-\mu_{\delta}\end{bmatrix}=\begin{bmatrix}-\sin\delta\cos% \alpha&-\sin\delta\sin\alpha&\cos\delta\\ \sin\alpha&-\cos\alpha&0\end{bmatrix}\begin{bmatrix}\omega_{X}\\ \omega_{Y}\\ \omega_{Z}\end{bmatrix}\,.$ (3.45)

Given the position and proper motion differences for the same sources expressed in $\mathsf{\tilde{C}}$ and $\mathsf{C}$, these equations can be used to obtain a least-squares estimate of $\boldsymbol{\varepsilon}_{0}$ and $\boldsymbol{\omega}$. The same equations can then be used to correct the source parameters so that they refer to $\mathsf{C}$ instead of $\mathsf{\tilde{C}}$.

The astrometric solution that produced source parameters relative to $\mathsf{\tilde{C}}$ also produced an attitude estimate in the same reference system. This attitude should also be aligned with $\mathsf{C}$ if it will be used for the further analysis (e.g., the secondary source update). This is done by applying a time-dependent transformation of the attitude parameters, as described in Section 6.1.3 of Lindegren et al. (2012).

In practise $\boldsymbol{\varepsilon}_{0}$ is obtained by comparing the radio positions of ICRF sources ($\alpha$ and $\delta$ in the above equations) with the positions of their optical counterparts as obtained in the astrometric solution ($\tilde{\alpha}$, $\tilde{\delta}$). The second version of the ICRF, ICRF2 (Fey et al. 2015) contains at least some 2000 sources that are optically bright enough to be measured by Gaia.

The determination of $\boldsymbol{\omega}$ can use a much larger set of quasars, because Equation 3.45 does not require their precise positions in ICRS to be known, only their proper motions in ICRS ($\mu_{\alpha*}$, $\mu_{\delta}$). The latter are expected to be very small, but not zero. Apart from various random effects, including the peculiar motions of quasars and optical variability causing centroid displacements (Bachchan et al. 2016), the most important systematic proper motion pattern for quasars is expected to be produced by the secular variation of stellar aberration due to the galactocentric acceleration of the solar-system barycentre (Kopeikin and Makarov 2006). The galactocentric acceleration vector $\boldsymbol{g}$ should have a magnitude of $g\simeq 2\times 10^{-10}$ m s${}^{-2}$, pointing more or less towards the galactic centre. The effect on the quasars is an apparent streaming motion towards the galactic centre, described by

 \left.\begin{aligned}\displaystyle\mu_{\alpha*}&\displaystyle=\boldsymbol{p}^{% \prime}\boldsymbol{a}\\ \displaystyle\mu_{\delta}&\displaystyle=\boldsymbol{q}^{\prime}\boldsymbol{a}% \end{aligned}\quad\right\} (3.46)

where $\boldsymbol{a}=\boldsymbol{g}/c$ and $c$ is the speed of light. The magnitude of the effect is thus $a=g/c\simeq 4.3$  $\mu$as. It is not wise to assume that this effect is precisely known; instead the components of $\boldsymbol{a}$ in the ICRS should be introduced as additional unknowns when solving for $\boldsymbol{\omega}$. This leads to an augmented version of Equation 3.45 with six unknowns,

 $\begin{bmatrix}\tilde{\mu}_{\alpha*}\\ \tilde{\mu}_{\delta}\end{bmatrix}=\begin{bmatrix}-\sin\delta\cos\alpha&-\sin% \delta\sin\alpha&\cos\delta&-\sin\alpha&\cos\alpha&0\\ \sin\alpha&-\cos\alpha&0&-\sin\delta\cos\alpha&-\sin\delta\sin\alpha&\cos% \delta\end{bmatrix}\begin{bmatrix}\omega_{X}\\ \omega_{Y}\\ \omega_{Z}\\ a_{X}\\ a_{Y}\\ a_{Z}\end{bmatrix}\,.$ (3.47)

If the quasars are reasonably distributed over the celestial sphere, the correlation between the least-squares estimates of $\boldsymbol{\omega}$ and $\boldsymbol{a}$ will be small (Bachchan et al. 2016). The additional unknowns will therefore not weaken the solution for $\boldsymbol{\omega}$, but are essential to eliminate possible biases caused by the acceleration terms.

The orientation, spin, and acceleration parameters represent the lowest-order terms of a general expansion of position and proper motion differences in terms of vector spherical harmonics (Mignard and Klioner 2012). Other terms are expected to be negligible and can be used to check the internal consistency of the Gaia reference frame.

For Gaia DR1, Equation 3.44 was used to align the Gaia reference frame to ICRF at the reference epoch J2015.0, using the optical counterparts of about 2000 sources in ICRF2. However, Equation 3.47 was not used because the short duration of the data ($\sim$14 months) was insufficient to determine the proper motions of quasars. Instead, it was assumed that the Hipparcos Catalogue was aligned with ICRF at the epoch of the Hipparcos observations (J1991.25). The reference frame of Gaia DR1 was thus aligned with the ICRF at the two epochs J1991.25 and J2015.0, and the proper motion system of Gaia DR1 should then be non-rotating with respect to ICRF to a precision set (mainly) by the alignment of the Hipparcos Catalogue to the ICRF at J1991.25. In this process the effect of the galactocentric acceleration was neglected. Details are given in Section 4.3 of Lindegren et al. (2016).

## 3.3.3 Astrometric source model

Author(s): Lennart Lindegren

The astrometric model is a recipe for calculating the proper direction $\boldsymbol{u}_{i}(t)$ to a source ($i$) at an arbitrary time of observation ($t$) in terms of its astrometric parameters $\boldsymbol{s}_{i}$ and various auxiliary data, assumed to be known with sufficient accuracy. The auxiliary data include an accurate barycentric ephemeris of the Gaia satellite, $\boldsymbol{b}_{\text{G}}(t)$, including its coordinate velocity $\text{d}\boldsymbol{b}_{\text{G}}/\text{d}t$, and ephemerides of all other relevant solar-system bodies. The details of the model have been outlined in Section 3.2 of Lindegren et al. (2012) or Section 3.1.4 and only a short introduction is given here.

As explained in Section 3.1.3, the astrometric parameters refer to the ICRS and the time coordinate used is TCB. The reference epoch $t_{\text{ep}}$ is preferably chosen to be near the mid-time of the mission in order to minimize statistical correlations between the position and proper motion parameters.

The transformation between the kinematic and the astrometric parameters is non-trivial (Klioner 2003), mainly as a consequence of the practical need to neglect most of the light-propagation time $t-t_{*}$ between the emission of the light at the source ($t_{*}$) and its reception at Gaia ($t$). This interval is typically many years and its value, and rate of change (which depends on the radial velocity of the source), will in general not be known with sufficient accuracy to allow modelling of the motion of the source directly in terms of its kinematic parameters according to Equation 3.1. The proper motion components $\mu_{\alpha*i}$, $\mu_{\delta i}$ and radial velocity $v_{ri}$ correspond to the ‘apparent’ quantities discussed by in Sect. 8 of Klioner (2003).

The coordinate direction to the source at time $t$ is calculated with the same ‘standard model’ as was used for the reduction of the Hipparcos observations (ESA (1997), Vol. 1,  Sect. 1.2.8), namely

 $\boldsymbol{\bar{u}}_{i}(t)=\bigl\langle\boldsymbol{r}_{i}+(t_{\text{B}}-t_{% \text{ep}})(\boldsymbol{p}_{i}\mu_{\alpha*i}+\boldsymbol{q}_{i}\mu_{\delta i}+% \boldsymbol{r}_{i}\mu_{ri})-\varpi_{i}\boldsymbol{b}_{\text{G}}(t)/A_{\text{u}% }\bigr\rangle$ (3.48)

where the angular brackets signify vector length normalization, and $[\boldsymbol{p}_{i}~{}\boldsymbol{q}_{i}~{}\boldsymbol{r}_{i}]$ is the ‘normal triad’ of the source with respect to the ICRS (Murray 1983). In this triad, $\boldsymbol{r}_{i}$ is the barycentric coordinate direction to the source at time $t_{\text{ep}}$, $\boldsymbol{p}_{i}=\langle\boldsymbol{Z}\times\boldsymbol{r}_{i}\rangle$, and $\boldsymbol{q}_{i}=\boldsymbol{r}_{i}\times\boldsymbol{p}_{i}$. The components of these unit vectors in the ICRS are given by the columns of the matrix

 $\mathsf{C}^{\prime}[\boldsymbol{p}_{i}~{}\boldsymbol{q}_{i}~{}\boldsymbol{r}_{% i}]=\begin{bmatrix}-\sin\alpha_{i}&~{}-\sin\delta_{i}\cos\alpha_{i}&~{}\cos% \delta_{i}\cos\alpha_{i}\\ \phantom{-}\cos\alpha_{i}&~{}-\sin\delta_{i}\sin\alpha_{i}&~{}\cos\delta_{i}% \sin\alpha_{i}\\ 0&~{}\cos\delta_{i}&~{}\sin\delta_{i}\end{bmatrix}.$ (3.49)

$\boldsymbol{b}_{\text{G}}(t)$ is the barycentric position of Gaia at the time of observation, and $A_{\text{u}}$ the astronomical unit. $t_{\text{B}}$ is the barycentric time obtained by correcting the time of observation for the Römer delay; to sufficient accuracy it is given by

 $t_{\text{B}}=t+\boldsymbol{r}_{i}^{\prime}\boldsymbol{b}_{\text{G}}(t)/c\,,$ (3.50)

where $c$ is the speed of light. See Section 3.2 of Lindegren et al. (2012) for further details.

The iterative updating of the sources is described in Section 5.1 of Lindegren et al. (2012). There the method of identifying potential outliers and estimating the excess source noise $\epsilon_{i}$ is also outlined together with the calculation of the partial derivatives of the coordinate direction with respect to the source parameters.

## 3.3.4 Attitude model

Author(s): David Hobbs

The Gaia mission allows for astrometric accuracies at the microarcsecond ($\mu$as) level in stellar positions. To achieve this an accurate determination of the satellite’s attitude is required, including the effects of instantaneous disturbances. The attitude specifies the instantaneous orientation of Gaia in the celestial reference frame, that is, the transformation from the CoMRS defined by $\boldsymbol{C}=[\boldsymbol{X}~{}\boldsymbol{Y}~{}\boldsymbol{Z}]$ to the SRS defined by $\boldsymbol{S}=[\boldsymbol{x}~{}\boldsymbol{y}~{}\boldsymbol{z}]$ (see Section 3.1.3) as a function of time. The spacecraft is controlled to follow a scanning law, which for most of the mission will be the Nominal Scanning Law (NSL; de Bruijne et al. (2010)), designed to provide good coverage of the whole celestial sphere while satisfying a number of other requirements. The model used to accurately determine the attitude of Gaia is briefly described in (Section 3.3;Lindegren et al. (2012)).

The attitude model used relies on quaternions (Appendix A; Lindegren et al. (2012)) to model the instrument’s axes expressed directly in the celestial reference frame. Quaternions are an extension of complex numbers (Hamilton 1843) and are commonly used today for spacecraft attitude control, representing spatial rotations in a compact and convenient manner. The instantaneous attitude is represented by a unit quaternion $\mathbf{q}$, which has only four components, $\bigl\{q_{x},q_{y},q_{z},q_{w}\bigr\}$, satisfying the single constraint $q_{x}^{2}+q_{y}^{2}+q_{z}^{2}+q_{w}^{2}=1$. This is easily enforced by normalization.

The attitude quaternion $\mathbf{q}(t)$ gives the rotation from the CoMRS ($\mathsf{C}$) to the SRS ($\mathsf{S}$). Using quaternions, their relation is defined by the transformation of the coordinates of the arbitrary vector $\boldsymbol{v}$ in the two reference systems,

 $\left\{\mathsf{S}^{\prime}\boldsymbol{v},0\right\}=\mathbf{q}^{-1}\left\{% \mathsf{C}^{\prime}\boldsymbol{v},0\right\}\mathbf{q}\,.$ (3.51)

In the terminology of (Appendix A.3; Lindegren et al. (2012)) this is a frame rotation of the vector from $\mathsf{C}$ to $\mathsf{S}$. The inverse operation is $\left\{\mathsf{C}^{\prime}\boldsymbol{v},0\right\}=\mathbf{q}\left\{\mathsf{S}% ^{\prime}\boldsymbol{v},0\right\}\mathbf{q}^{-1}$.

In the attitude processing the components of the quaternions are modelled using B-splines (Appendix B; Lindegren et al. (2012)) giving

 $\mathbf{q}(t)=\left\langle\,\textstyle\sum_{n=\ell-M+1}^{\ell}\mathbf{a}_{n}B_% {n}(t)\,\right\rangle\,,$ (3.52)

where $\mathbf{a}_{n}$ ($n=0\dots N-1$) are the coefficients of the B-splines $B_{n}(t)$ of order $M$ defined on the knot sequence $\left\{\tau_{k}\right\}_{k=0}^{N+M-1}$. The function $B_{n}(t)$ is non-zero only for $\tau_{n}, which is why the sum in (Equation 3.52) only extends over the $M$ terms ending with $n=\ell$. Here, $\ell$ is the so-called left index of $t$ satisfying $\tau_{\ell}\leq t<\tau_{\ell+1}$. The normalization operator $\langle\,\rangle$ guarantees that $\mathbf{q}(t)$ is a unit quaternion for any $t$ in the interval $[\tau_{M-1},\tau_{N}]$ over which the spline is completely defined. Although the coefficients $\mathbf{a}_{n}$ are formally quaternions, they are not of unit length. The attitude parameter vector $\boldsymbol{a}$ consists of the components of the whole set of quaternions $\mathbf{a}_{n}$.

Cubic splines ($M=4$) are currently used in this attitude model. Each component of the quaternion (before the normalization in Equation 3.52) is then a piecewise cubic polynomial with continuous value, first, and second derivative for any $t$; the third derivative is discontinuous at the knots. When initializing the solution, it is possible to select any desired order of the spline. Using a higher order, such as $M=5$ (quartic) or $6$ (quintic), provides improved smoothness but also makes the spline fitting less local, and therefore more prone to undesirable oscillatory behaviour. The flexibility of the spline is in principle only governed by the number of degrees of freedom (that is, in practise, the knot frequency), and is therefore independent of the order. One should therefore not choose a higher order than is warranted by the smoothness of the actual effective attitude, which is difficult to model a priori see Appendix D.4 of Lindegren et al. (2012). Determining the optimal order and knot frequency may in the end only be possible through analysis of the real mission data (see Section 3.3 of Lindegren et al. (2012) for further details).

The iterative updating of the attitude is described in Section 5.2 of Lindegren et al. (2012). There the calculation of the partial derivatives of the coordinate direction with respect to the attitude quaternion are outlined together with the normalization constrains added during the update process to guarantee that the calculated quaternion is always of unit length. This is the final step in the attitude processing and is know as On-Ground Attitude-3 (OGA3).

The excess attitude noise $\epsilon_{a}(t)$ is outlined in Lindegren et al. (2012, Section 5.2.5) and accounts for modelling errors in the attitude representation and could be caused for example by (unmodelled) micro meteoroid impacts, ”clanks” due to sudden redistributions of satellite inertia, propellant sloshing, thruster noise, or mechanical vibrations (Lindegren et al. 2012, Appendix D.4).

### Detecting and handling micro-clanks

During early data processing examination of the Gaia attitude rate reconstructed from AF observations reveals numerous irregularities that can be attributed to discrete jumps in the physical AL attitude angle. These jumps are in the following called ‘micro-clanks’. They are seen in both the AL and AC attitude components, but most easily recognised AL thanks to the higher frequency and accuracy of the AL observations.

Micro meteoroid hits cause discontinuities in the physical attitude rate (angular velocity), rather than in the attitude angles, and therefore produce signatures in the data that are distinctly different from those produced by the micro-clanks. Nevertheless, the detection and handling of both effects are sufficiently similar that they can be treated together. In the following we refer to the micrometeoroid hits as ‘micro-hits’, and use ‘micro-events’ as a collective term for micro-clanks and micro-hits.

The existence of micro-clanks was not unexpected — similar effects had been seen in Hipparcos data, and they were explicitly included in the pre-launch Gaia dynamical attitude model (DAM; Risquez et al. (2012)), although their sizes and frequencies were at that time essentially unknown. Based on fairly arbitrary assumptions, the standard 5-year DAM simulation contains some 3100 micro-clanks exceeding 4.4 mas in the AL direction. As will be seen, this is an order of magnitude less frequent than actually observed.

Micro-events can be detected either from the residuals of an astrometric solution, or from rate estimates calculated by numerical differentiation of the field angles derived from successive CCD observations of a source transiting the AF. The latter approach has significant advantages: it does not require a converged astrometric solution (providing an accurate attitude), it is not restricted to the primary sources of the astrometric solution, and it is less sensitive to LSF/PSF calibration errors. It does require a reasonable geometric calibration, which could however be taken from a previous astrometric solution or even from ODAS. It is therefore natural to do the micro-event detection as part of the AGIS pre-processing.

The discontinuities in angle or rate caused by micro-events can in principle be handled by introducing multiple knots at the appropriate times in the spline representation of the attitude quaternion (Appendix A;Lindegren et al. (2012)). A problem that has been left unsolved is how to handle the gated observations in this context. As described by Bastian and Biermann (2005), the effective attitude to be used in the astrometric solution is slightly different depending on which gate is used, and the differences are particularly important when there are rapid variations in rates or angles, as there will be in connection with micro-events. However, once a micro-event has been detected for a certain time, and its amplitude around each axis estimated from the rate data, the specific effects of that event for any gate can be accurately computed from simple models, and does not require any further fitting to the observations. This realisation, and the discovery that micro-clanks are much more numerous than expected, and more easily detected in the rate data than in the residuals of an AGIS solution, motivates a radical change in the strategy for the overall handling of micro-events.

Attitude dead time handling is implement in AGIS to avoid periods where large disturbances occur that would render the attitude processing useless. Additionally, if the attitude is completely incorrect in some period this disturbance can cause adjacent regions to also be affected during the iterative updating of the attitude which can in extreme cases cause the entire solution to diverge. The cause of such regions can be many, a few obvious examples are, telescope refocusing, mirror heating to remove contamination, orbital manoeuvres, micro-meteoroid hits, missing or poor observational data, etc. The time span for the dead time can vary significantly from some days, e.g. mirror heating, to a few minutes, e.g. micro-meteoroid hits. As part of the processing, such unstable intervals have been identified manually and the time and duration of the event has been entered into a dead time file which is read when constructing the attitude knot sequence. Dead times are also introduced when no observations are available or when the observations are unmatched in IDT. Gaps are then introduced in the attitude and multiple knots are added at the edge of the gaps with multiplicity equal to the spline order. This allows the spine to be gracefully terminated at the beginning of the gap and restarted after the end of the gap.

## 3.3.5 Geometric instrument model

Author(s): Lennart Lindegren

The geometric instrument model (or geometric calibration model) is an accurate description of the CCD layout in the Scanning Reference System (SRS; Section 3.1.3) $\mathsf{S}=[\boldsymbol{x}~{}\boldsymbol{y}~{}\boldsymbol{z}]$, or equivalently in instrument angles $(\varphi,\zeta)$ or field angles $(f,\eta,\zeta)$. The three systems are equivalent because a given direction $\boldsymbol{u}$ can be represented in either system by means of the relations

 $\mathsf{S}^{\prime}\boldsymbol{u}=\begin{bmatrix}u_{x}\\ u_{y}\\ u_{z}\end{bmatrix}=\begin{bmatrix}\cos\zeta\cos\varphi\\ \cos\zeta\sin\varphi\\ \sin\zeta\end{bmatrix}=\begin{bmatrix}\cos\zeta\cos(\eta+f\Gamma_{\text{c}})\\ \cos\zeta\sin(\eta+f\Gamma_{\text{c}})\\ \sin\zeta\end{bmatrix}$ (3.53)

where $f=\text{sign}(u_{y})$ is the field index ($f=+1$ in preceding field of view, $-1$ in following field of view) and $\Gamma_{\text{c}}=106.5^{\circ}$ is the conventional basic angle. Conversely, the $xy$ plane of the SRS and the origin of the along-scan (AL) instrument angle $\varphi$ are implicitly defined by the geometric instrument model, or more precisely by certain constraints imposed on the model. The geometrical instrument model is based on the calibration model described in Section 3.4 of Lindegren et al. (2012).

A central concept for the geometric instrument calibration is the observation line, which is an imaginary curve extending over the full width of the CCD image area in the across-scan (AC) direction (Figure 3.8). For ungated observations (gate index $g=0$), where all $\simeq\,$4500 AL pixels are used to integrate the image, the observation line is nominally situated $\simeq\,$2250 TDI lines prior to the serial register (see Table 3.2 for exact numbers). For gated observations using the first gate ($g=12$), only the last $\simeq\,$2900 TDI lines are used for the integration, and the observation line is consequently situated $\simeq\,$1450 TDI lines prior to the serial register. The AC pixel coordinate $\mu$ is a continuous variable in the range $[13.5,\,1979.5]$, with $\mu=14.0$ when the image is centrally located in the AC direction of the first pixel column (with the smallest AC field angle $\zeta$), and $\mu=1979.0$ when the image is centrally located in the AC direction of the last (1966th) pixel column.

The elementary astrometric measurement, obtained from the transit of a given source over a single (SM or AF) CCD, is the observation time $t_{\text{obs}}$ and AC pixel coordinate of the image, $\mu_{\text{obs}}$. The observation time is calculated, in on-board time, as the read-out time of the reference pixel of the observation window, corrected for the AL offset of the image centroid from the reference pixel, minus the exposure mid-time offset for the relevant $g$ (Table 3.2). ($t_{\text{obs}}$ is subsequently converted to TCB using the time ephemeris; see Section 3.1.3.) The observed AC pixel coordinate $\mu_{\text{obs}}$ is obtained by correcting the AC coordinate of the window reference pixel for the AC offset of the image centroid from the reference pixel, but is only available for observations using a two-dimensional window.

The optical design of the Gaia telescopes and the mechanical layout of the focal-plane assembly are such that the TDI lines of all the CCD are very nearly parallel to lines of constant AL field angle $\eta$ in the SRS. (This is a necessary condition for the TDI operation of all the CCDs using the same TDI period.) Thus, to a first approximation the observation lines are short segments of great-circle arcs with a fixed $\eta$ for a given CCD and gate. However, in reality the structure of an observation line is much more complex, as suggested by the ’magnifying glass’ in Figure 3.8. For a given CCD/gate combination, the observation lines are different in the two fields of view, due to the optical distortions being different, and they vary with time due to thermal-mechanical changes in the optics and focal-plane assembly. Additional dependencies (e.g, on window class $w$) are discussed below.

The observation line for a given combination of field index $f$, CCD index $n$ (e.g., in the range 1 through 62 in the AF), gate $g$, and window class $w$ is defined in parametric form as

 \left.\begin{aligned}\displaystyle\eta&\displaystyle=\eta_{fngw}(\mu,\,t)\\ \displaystyle\zeta&\displaystyle=\zeta_{fngw}(\mu,\,t)\end{aligned}\quad\right% \}\,,\quad 13.5\leq\mu\leq 1979.5\,. (3.54)

The time argument here should be understood as representing the slow variation of the observation lines with time, including for example the basic-angle variations — these are ’slow’ in comparison with the precisions involved in measuring $t_{\text{obs}}$, which are in the $\mu$s to ms range. The calibration requirements are much stricter AL than AC, and the models for $\eta_{fngw}(\mu,\,t)$ and $\zeta_{fngw}(\mu,\,t)$ are separately described hereafter. The particular models described here are the ones used for Gaia DR1 (see Appendix A.1 in Lindegren et al. 2016); more elaborate models will be used for subsequent releases.

The geometric instrument model makes use of shifted Legendre polynomials $L^{*}_{l}(x)=P_{l}(2x-1)$, where $P_{l}(x)$ are the usual (non-shifted) Legendre polynomials. The shifted Legendre polynomials are orthogonal on $[0,\,1]$ and reach $\pm 1$ at the end points. The first four polynomials are

 \left.\begin{aligned}\displaystyle L^{*}_{0}(x)&\displaystyle=1\\ \displaystyle L^{*}_{1}(x)&\displaystyle=2x-1\\ \displaystyle L^{*}_{2}(x)&\displaystyle=6x^{2}-6x+1\\ \displaystyle L^{*}_{3}(x)&\displaystyle=20x^{3}-30x^{2}+12x-1\end{aligned}% \quad\right\}\,. (3.55)

### AL geometric instrument model

The AL geometric instrument model consists of a nominal part, a constant part, and a time-dependent part:

 $\eta_{fngw}(\mu,\,t)=\eta^{0}_{ng}(\mu)+\Delta\eta_{fngw}(\mu)+\Delta\eta_{fn}% (\mu,\,t)\,,$ (3.56)

where $\eta^{0}_{ng}$ is the nominal geometry depending on the CCD index and gate, $\Delta\eta_{fngw}$ is the constant part depending additionally on the field index and window class, and $\Delta\eta_{fn}$ is the time-dependent part.

The constant part is further decomposed as

 $\Delta\eta_{fngw}(\mu)=\sum_{l=0}^{2}\Delta\eta^{G}_{lfng}L^{*}_{l}(\tilde{\mu% })+\sum_{l=0}^{2}\Delta\eta^{W}_{lfnw}L^{*}_{l}(\tilde{\mu})+\sum_{l=0}^{1}% \Delta\eta^{B}_{lfnb}L^{*}_{l}(\tilde{\mu}_{b})\,,$ (3.57)

where the superscripted constants are the calibration parameters and $\tilde{\mu}=(\mu-13.5)/1966$ is the normalised AC pixel coordinate. The dependence on CCD gate (’G’) depends on $f$ due to the slightly different effective focal lengths in the preceding and following fields of view. The effect of the window class (’W’) also depends on $f$, due to the different PSF/LSF calibration models used for the different window classes and fields. The third term (’B’) in Equation 3.57 represents the intermediate-scale irregularities of the CCD that cannot be modelled by a polynomial over the full AC extent of the CCD. In practise the medium-scale irregularities are largely associated with the discrete stitching blocks resulting from the CCD manufacturing process. The stitching blocks are 250 pixel columns wide, except for the two outermost blocks which are 108 columns wide; the exact block boundaries are therefore $\mu=13.5$, $121.5$, $371.5$, $\dots$, $1621.5$, $1871.5$, $1979.5$. The intermediate-scale errors are modelled by a separate linear polynomial for each stitching block, depending on the block index $b=\lfloor(\mu+128.5)/250\rfloor$ (where $\lfloor x\rfloor$ is the floor function, i.e. the largest integer $\leq x$) and the normalised intra-block pixel coordinate $\tilde{\mu}_{b}=(\mu-\mu_{b})/(\mu_{b+1}-\mu_{b})$. Here, $[\mu_{b},\,\mu_{b+1}]$ are the block boundaries given above for $b=0\dots 8$. Small-scale irregularities, which vary on a scale of one or a few CCD pixel columns, are not modelled in Gaia DR1 but will likely be included in the improved models used for subsequent releases.

The time-dependent part of the AL calibration needs to take into account the joint dependence on $\mu$ and $t$, which quite generally can be expanded in terms of the products of one-dimensional basis functions. With $\tilde{t}_{j}=(t-t_{j})/(t_{j+1}-t_{j})$ denoting the normalised time coordinate in calibration interval $j$, we have

 $\Delta\eta_{fn}(\mu,\,t)=\sum_{l=0}^{L}\sum_{m=0}^{M_{l}}\Delta\eta^{(m)}_{% lfnj}L^{*}_{l}(\tilde{\mu})L^{*}_{m}(\tilde{t}_{j})\,,$ (3.58)

where $L$ is the maximum degree of the polynomial in $\mu$ and $M_{l}$ is the maximum degree of the polynomial in $t$ that is combined with a polynomial in $\mu$ of degree $l$. The current model uses $L=2$, as for the constant part, and $M_{0}=1$, $M_{1}=M_{2}=0$; thus Equation 3.58 simplifies to

 $\Delta\eta_{fn}(\mu,\,t)=\sum_{l=0}^{2}\Delta\eta^{(0)}_{lfnj}L^{*}_{l}(\tilde% {\mu})+\Delta\eta^{(1)}_{0fnj}L^{*}_{1}(\tilde{t}_{j})\,.$ (3.59)

In analogy with Equation (19) in Lindegren et al. (2012), the basic-angle offset can be computed from the calibration parameters as

 $\Delta\Gamma(t)=\frac{1}{62}\sum_{f}\sum_{n\in\text{AF}}\left(\Delta\eta^{(0)}% _{0fnj}+\Delta\eta^{(1)}_{0fnj}L^{*}_{1}(\tilde{t}_{j})\right)f\,,$ (3.60)

where $f=\pm 1$ for the preceding and following field of view, respectively.

### AC geometric instrument model

For the AC calibration we have in analogy with Equation 3.56

 $\zeta_{fng}(\mu,\,t)=\zeta^{\,0}_{fn}(\mu)+\Delta\zeta_{fng}(\mu)+\Delta\zeta_% {fn}(\mu,\,t)\,.$ (3.61)

The AC model has fewer breakpoints for the time dependence, no dependence on window class, and no intermediate or small-scale irregularities. Thus,

 $\displaystyle\Delta\zeta_{fng}(\mu)$ $\displaystyle=\sum_{l=0}^{2}\Delta\zeta^{\,G}_{lfng}L^{*}_{l}(\tilde{\mu})$ (3.62) $\displaystyle\Delta\zeta_{fn}(\mu,\,t)$ $\displaystyle=\sum_{l=0}^{2}\Delta\zeta^{\,(0)}_{lfnk}L^{*}_{l}(\tilde{\mu})+% \Delta\zeta^{\,(1)}_{0fnk}L^{*}_{1}(\tilde{t}_{k})\,,$ (3.63)

where $\tilde{t}_{k}=(t-t_{k})/(t_{k+1}-t_{k})$ are normalised time coordinates relative to the breakpoints $t_{k}$ for the AC calibration time intervals.

### Summary of calibration parameters

The model as described applies to the 62 CCDs in the astrometric field (AF); for the sky mappers (SM) the nominal calibration $\eta^{0}_{ng}(\mu)$, $\eta^{0}_{ng}(\mu)$ is not updated in the current solution as the SM observations are not used in this release.

Table 3.3 summarises the number of parameters of the different kinds. The total number of calibration parameters is 76 632 for the AL model and 46 500 for the AC model.

### Constraints

In order to render all the geometric model parameters uniquely determinable, they must satisfy a number of constraints. Some of them effectively define the origins of the field angles $(\eta\,\zeta)$, and hence the SRS, while others are needed to define a unique division between the different components of the model.

The origin of $\eta$ is defined through the constraint

 $\sum_{f}\sum_{n\in\text{AF}}\Delta\eta^{(0)}_{0fnj}=0\quad\text{for each AL % calibration time interval j.}$ (3.64)

Roughly speaking, the mean displacement of the observation lines in the AL direction from their nominal locations should be zero at all times, when averaged over both fields of view and the 62 CCDs of the AF.

The origin of $\zeta$ is defined through the constraints

 $\sum_{n\in\text{AF}}\Delta\zeta^{(0)}_{0fnk}=0\quad\text{for each AC % calibration time interval k and each field direction f.}$ (3.65)

Thus, the mean displacement in the AC direction from the nominal locations should be zero at all times, when averaged over the 62 CCDs of the AF. Here, the condition must be separately satisfied in each field of view, in order to define the $xy$ plane of the SRS.

Additional constraints are needed to ensure a unique division between the the different components of the AL and AC models. These are TBD.

### COMA terms

The geometric instrument model only contains terms that depend on the geometry of observation, such as the CCD index, gate, and window class. Formally, it cannot include terms that depend on source parameters such as colour and magnitude; indeed, for consistency the transformation from celestial coordinates to pixel coordinates must be purely geometric.

The image of a point source, as recorded by Gaia, is nevertheless slightly different depend on the colour of the object (e.g., due to wavelength-dependent diffraction effects in the optics) and its magnitude (e.g., due to charge transfer inefficiency in the CCD, which depends on the flux level). These differences will eventually be included in the (colour- and magnitude-dependent) PSF and LSF calibrations, so that the centroid positions obtained by fitting the appropriate PSF/LSFs to the images should be independent of colour and magnitude. Complete elimination of these effects by means of the PSF/LSF calibration will require several iterations of the cyclic processing loop.

In the first few cycles, and especially in the astrometric solution for Gaia DR1, residual colour- and magnitude-dependent effects are therefore expected. For diagnostic purposes it is then useful to add colour- and magnitude-dependent terms, hereafter called COMA terms, in the AL instrument calibration model. In the simplest model the effects are considered to be linear and independent, but different depending on the field of view ($f$), CCD ($n$), and time interval ($j$), in which case the augmented form of Equation 3.56 reads

 $\eta_{fngw}(\mu,\,t)=\cdots+(\nu_{\text{eff}}-\nu_{\text{eff}}^{\text{ref}})% \chi_{fnj}+(G-G^{\text{ref}})\psi_{fnj}\,.$ (3.66)

$\chi_{fnj}$ and $\psi_{fnj}$ are the COMA parameters, $\nu_{\text{eff}}$ is a effective wavenumber of source (or the observation of the source), and $G$ its magnitude. The reference wavenumber $\nu_{\text{eff}}^{\text{ref}}$ and reference magnitude $G^{\text{ref}}$ can in principle be chosen arbitrarily, but for computational reasons they should roughly represent the mean (or median, or mid-range) values of the corresponding quantities. The physical meaning of the reference quantities is that the calculated attitude and (non-COMA part of the) instrument model refer to a source with wavenumber $\nu_{\text{eff}}^{\text{ref}}$ and magnitude $G^{\text{ref}}$.

It is important that the COMA terms are only used internally in AGIS. They are not used, e.g., in the PSF/LSF calibration, where one needs the expected location of an image in the absence of COMA effects, precisely in order to include these effects in the PSF/LSF calibration.

Inclusion of COMA terms in the instrument model for the astrometric solution is however useful for detecting colour- and magnitude-dependent systematics that are not taken care of by the PSF/LSF calibration. They also help to eliminate COMA effects from the computed source, attitude, and geometric calibration parameters, so that they interfere less with the subsequent PSF/LSF calibration. Ideally, $\chi_{fnj}$ and $\psi_{fnj}$ should be insignificant in the final astrometric solution.

## 3.3.6 Global parameters

Author(s): Sergei Klioner

The astrometric software AGIS can be used to fit arbitrary global parameters, that is, parameters that depend on all or most of the data. A flexible software package in AGIS allows one to fit such parameters in different groups and modes. The primary use of this block is the determination of physical parameters, like the PPN parameter $\gamma$. This will be done in the future data releases. Another application of the global block is to determine instrumental (calibration) parameters characterizing Gaia astrometric instrument as a whole. Examples here are the Velocity and Basic Angle Calibration (VBAC) and Focal Length and Optical Distortion Calibration (FOC). VBAC is intended to determine, to the largest possible extent, Basic Angle variations directly from the astrometric data. FOC determines arbitrary differential distortions of the Gaia astrometric instrument. In Gaia DR1, VBAC was used in the verification of the solution (Lindegren et al. 2016, Appendix E.3). One of the simple variants of VBAC was used for the Gaia DR1 verification and allowed one to determine the coefficients of a Fourier polynomial for the basic angle. Although the fitted coefficients were not used for the released astrometric solution, VBAC has demonstrated that the difference between the fitted coefficients and those determined from the BAM data don’t differ by more than $\sim 50~{}\mu$as. This was important to demonstrate the correctness of the BAM measurements. In the future releases both VBAC and FOC will be used to verify and possibly improve the astrometric solutions.

## 3.3.7 Extended PSF/LSF model

Author(s): Michael Davidson

The cyclic reprocessing environment of IDU allows a more sophisticated LSF/PSF modelling to be performed, relative to the calibration determined in the real-time system (see Section 2.3.2). In this first data release the approach, however, mirrors that used in FL LODC with a few indirect improvements. For example, the bias and background models available in IDU are more complete and have fewer artifacts due to joins between OBMT intervals. The increased processing power available also enables a more flexible solution to be determined (extra spline knots and parameter dependencies).

In future data reductions the main difference with respect to the early LSF/PSF will be the use of a full 2D PSF model, rather than an ALxAC LSF approximation, which is known to limit the performance, particularly when the PSF is asymmetric. This full 2D PSF model will be conceptually similar to the LSF model but it shall use a linear combination of 2D basis functions instead of 1D basis functions. Knowledge of the scene will also permit the full 2D PSF mapping to be extended into the ‘far’ wings, well outside the region sampled by the stellar windows. These diffraction features cause a great number of false detections on-board and so form a valuable data set, along with data from Virtual Objects.

The ultimate aim for the extended LSF/PSF model is to incorporate a Charge Distortion Model (CDM) to handle the effects of Charge Transfer Inefficiency (CTI). The CDM is a very complex problem to solve, and it requires an accurate knowledge of the input signal, so this remains under development. Fortunately the level of radiation damage is significantly lower at this stage of the mission than feared before launch (by a factor of $\sim 10$).