5.3.5 External calibration
Author(s): Paolo Montegriffo
The two main tasks of the external calibration are: i) provide an instrument model to allow for a physical interpretation of the BP and RP mean spectra, and ii) transform the mean spectra to fully calibrated (flux and wavelength) spectral energy distributions.
The external calibration is based on the following main assumptions:
all differential effects across the focal plane (including variations with time) are completely accounted for by the internal calibration;
the internal reference system is uniquely defined for all sources, independently from source colours and magnitudes;
the internal reference system can be represented by an instrument model that is close to the physical instrument at an arbitrary position on the focal plane.
The dispersed image of a point-like source in the data space can be expressed as:
is the continuous coordinate in the along scan (AL) direction;
is the continuous coordinate in the across scan (AC) direction;
is the telescope pupil area in units of ;
is the source photon flux distribution (SPD) in units of ;
is the effective monochromatic point spread function (PSF) at wavelength ;
is the dispersion function;
is the overall instrument response function.
This model assumes that i) the dispersion of the prism is perfectly aligned with the AL direction, and ii) charge transfer inefficiency (CTI) effects can be neglected. The BP and RP spectra are originated by collapsing the 2D image along the AC direction, thus giving:
is the internally calibrated mean spectrum in units of ;
is the effective monochromatic LSF at wavelength .
The classical approach of deriving an instrument response by computing a ratio between the observational spectra and the corresponding SPD over a limited number of featureless calibrators cannot be applied here because of the large LSF width compared to the wavelength scale of response variations. As a consequence LSF and dispersion relation must be derived together with the instrument response. This can be achieved in principle using an arbitrarily large number and variety of standard stars (sources with known spectral energy distribution) by tuning the model till a match is obtained between the model prediction and the corresponding observational mean spectra.
However, the stringent criteria set to select reliable flux calibrators (see Section 5.6) led to a selection of stars spanning a limited range in astrophysical parameters, whose SED shapes are not sufficiently independent from each other, and hence leave a relevant number of instrument components undetermined. This limitation is mitigated in two ways: i) by building the instrument model upon a nominal model based on the pre-launch knowledge of the actual instrument, and ii) by employing in the calibration process a large set of additional sources that escape the severe selection criteria adopted for the standard flux calibrators, but exhibit narrow emission lines in their spectra (possibly) all along the covered wavelength range: candidate sources of this kind are QSOs, Be stars and other emitters, Wolf-Rayet stars etc. For the Gaia DR3 a total of 211 such sources have been selected, including 188 QSO from the SLOAN survey, 17 young stellar objects from the XSHOOTER database and 6 emission line sources from STELIB (Le Borgne et al. (2003)). Most of these objects have SEDs only partially covering the Gaia wavelength range: this limitation had consequences on the processing strategy, as described in Section 5.3.5.
The monochromatic LSF is obtained by integrating in the AC direction. The LSF model is based on a linear combination of the product of two sets of basis functions modelling respectively the AL and the wavelength dependency; these bases have been derived with the Generalised Principal Component Analysis (GPCA) on a large set of theoretical LSFs simulated for the nominal instrument. The framework for PSF/LSF simulations, as developed by Lindegren (Lindegren (2006)), was applied to the present case of the BP/RP spectra calibration, and is described in detail in the technical note Montegriffo (2017).
Theoretical PSFs were computed with an accuracy in the AC direction of up to pixel from the centre, and the final wavelength grid contains 100 points distributed between and nm. A total number of 5000 sets of PSF were randomly generated; for each set, a corresponding set of monochromatic LSFs was obtained by summing the PSFs along the AC direction and multiplying by an appropriate scaling factor. Each LSF is then considered twice by reversing the AL axis in order to preserve the intrinsic AL symmetry of the problem.
However, such LSFs only span a limited AL interval of pixel, therefore a proper model for the wings was derived and applied also in order to correctly normalise the LSFs. The wings for polychromatic and quasi-monochromatic LSF were modelled by a tail function which is strictly zero for , decreases as for thus reproducing the expected behaviour of a true LSF, and is a fourth-degree polynomial in the interval , where is the AL coordinate in pixel, and , are the values at wich the tail function joins the LSF. To make the analytical LSF representation continuous at a proper scaling factor is applied. As an example, the reconstruction of a LSF at nm and nm is shown in Figure 5.9.
GPCA algorithm is able to preserve the spatial locality of pixels into an image by projecting the images to a vector space that is the tensor product of two lower-dimensional vector spaces. Each simulated LSF is arranged in a matrix whose columns are the monochromatic LSFs sampled on a given grid; the mean LSF is computed and subtracted from each LSF, and the residuals are arranged in a stack of 10000 matrices which are decomposed through GPCA into a 2D set of bases designed by two matrices and modelling respectively the dependence of LSF in sample and wavelength spaces. The numerical LSF can be approximated as:
or, in extended notation:
The above relation represents the implemented model for the LSF representation: it is 2D-interpolated to continuous variables by 1D-interpolation of U and W bases separately. The interpolation function for the U bases must satisfy the ’shift invariant sum’ condition, i.e. preserve the underlying function normalisation independently from the sub-pixel position of the sampling grid: the most suitable interpolation method is given by the S-spline representation described in Section 3.3.5. The interpolation for the right basis functions W is achieved with a cubic spline. A complete description of the model can be found in Montegriffo (2017).
For each field of view, dispersion functions are provided for the centre of each CCD in the form of the coefficients of the expansion:
denotes the AL image position in mm (in the Y FPRS direction),
in denotes the inverse wavelength, and
for BP and for RP.
The AL position in pixel units can be obtained by dividing Equation 5.7 by the pixel size of a
CCD in the along-scan direction .
The mean instrument dispersion function model is defined by:
denotes the AL image position in pixel units;
the model parameters and represent respectively the wavelength zero-point and scale: the zero-point is by construction the AL position corresponding to the reference wavelength. The nominal values assumed are , and and zero for any higher order term for both XP instruments.
This dispersion model defines the origin of the monochromatic LSFs as a function of wavelengths (see Equation 5.4). However, the origin of the LSF does not necessarily coincide with its centroid, because in general the LSF is not symmetric, its shape can change with the wavelength and hence introduces some degeneracy between the chromaticity and the dispersion (Lindegren (2006)). This degeneracy does not affect the instrument model because the dispersion model and the LSF origin are consistently defined. However, for a physical interpretation of the data it is necessary that the dispersion relation provides the centroid of the monochromatic LSF as a function of wavelength. This dispersion relation is provided as a lookup table where at each wavelength the centroid is computed using the Tukey’s bi-weight function by solving the non-linear equation:
for , where:
The dispersion curve for BP and RP is available from the DR3 area in the Gaia Cosmos portal.
The response model is built as the combination of a model for the nominal response with a parametrised cut-off and a distortion model to account for the deviations with respect to the current response
The nominal photonic response for XP instruments is modelled as the product of the following quantities:
is the telescope (mirrors) reflectivity;
is the attenuation due to rugosity and molecular contamination of the mirrors;
is the CCD QE;
is the prism transmittance curve which includes filter coating on their surface.
Figure 5.10 shows the nominal curves with an highlight on the photometer transmittance contribution to the overall curves. All these quantities were measured by Airbus DS during on-ground laboratory test campaigns. However, the filter wavelength cut-ons and cut-offs in the prism transmittance curves change over the focal plane due to uneven thickness of the prism coating producing bandwidth non uniformity: the expected variation is of about 3 nm in the position of the BP cut-off and about 2 nm in the position of the RP cut-on (hereafter for convenience cut-off will be used for both instruments). Moreover, combining measurements taken all over the focal plane results in a mean instrument passband with smeared cut-offs. To account for these effects the prism transmittance curve is modelled as follows:
where is the tabulated transmittance curve provided by Airbus DS truncated right before and after the cut-off wavelength for BP and RP, respectively:
and is a parametric curve based on the Gauss error function and its complementary function:
where represents the nominal cut-off wavelength ( nm for BP, nm for RP), and represents the nominal cut-off slope ( for BP, for RP). The parameter controls the wavelength position of the cutoff while the parameter sets the steepness of the cutoff slope. When the two parameters are set to zero the nominal transmittance curve is reproduced to a relative accuracy better than .
Figure 5.11 gives a visual representation of the proposed photometer transmittance model: the nominal transmittance model from GPDB (dotted blue lines) is compared to the models of Equation 5.13, Equation 5.14 and to two different realisations of the model obtained by setting the quantity respectively to 700 for BP and 600 for RP and setting the quantity to 5 (black solid line) and to 20 (black dotted line). These parameters values have been chosen as mere examples to illustrate the model behaviour.
The distortion model in Equation 5.11 is implemented as the exponential of a linear combination of basis functions in the sample space:
where is the nominal dispersion relation. The basis functions used for Gaia DR3 are spline functions of the order.
The internally calibrated instrument model can be expressed as:
where the kernel is a combination of the LSF and dispersion models. We define the effective SPD as the product between the source SPD and the instrument response:
The previous integral equation can be used to estimate the effective SPD given the mean spectrum sampled on some pseudo-wavelength grid, then the effective SPD is represented in some parametric form, e.g. a linear combination of a proper set of basis functions :
that can be rewritten as:
This equation shows that, since the observables are available in sample space and not in wavelength space, the choice of an orthonormal set of basis for SED representation does not guarantee that the corresponding instrument-transformed set represents as well an orthonormal reference system.
Furthermore, Equation 5.17 is a Fredholm integral equation of the first kind, which is generally difficult to solve because the problem is intrinsically ill-conditioned: there are many solutions which satisfy an integral solution slightly perturbed from the original one (the observed spectrum is affected by noise).
However, the current representation of the internally calibrated mean spectra described in Section 5.3.4 is based on a linear combination of Hermite functions roughly centred on the observed spectra:
where is the centre of the spectra in the AL coordinate system, and is a scaling factor to match the width of the observed spectrum with the natural width of the Hermite functions. The similarity between Equation 5.23 and Equation 5.21 suggests the possibility to define a set of functions such that:
These functions are the instrument-deconvolved versions of the bases used for the mean spectra representation: hereafter they will be called inverse bases. With this set of functions it is then possible to reconstruct externally calibrated SEDs using the same set of coefficients provided by the internal calibration for the representation of mean spectra: mean spectra and externally calibrated SEDs can then be sampled to some pixel or wavelength grid by simply changing the set of bases used:
The advantages of this approach are:
minimum impact on archive storage requirements;
the inversion process is made just once within DPAC calibration activities;
the instrument inversion process is not applied to measured mean spectra, but instead to analytic functions that are noise-free;
a simple tool can be provided to end-users to reconstruct sampled mean spectra either in the internal system or in absolute quantities.
Equation 5.17 shows that the effective SPD represents the observational spectrum smeared by the LSF and transformed to the pseudo-wavelength space, preserving the basic features of the observational spectrum. Therefore, each right inverse basis of Equation 5.24 is best modelled by the use of the same family of functions that represent the left bases, i.e. a linear combination of Hermite functions with a proper scaling factor and the same centre:
The complete description of the method used to derive the inverse basis parameters can be found in Montegriffo et al. (2023).
Figure 5.12 show as an example the and order Hermite functions and the corresponding inverse bases for BP (Top panels) and RP (Bottom panels) as well as the relative numerical residuals in the reconstruction of the original Hermite function.
As explained in Section 5.3.4 the mean spectra representation has been optimised to concentrate most of the information in the first coefficients, so to provide an efficient way to cope with noisy spectra by truncating the latter terms. This optimisation is realised by a rotation of the Hermite bases by the use of a square orthogonal matrix. The inverse basis function method is not affected by this optimisation as long as the same rotation is applied to the inverse bases.
As can be seen in Equation 5.25 the reconstructed SED depends on the inverse of the response function, therefore to limit errors in the reconstruction the sampling grid can be safely limited to the range nm for BP and nm for RP. The two partially overlapping spectra are merged into a single distribution by computing a weighted mean in the overlapping region nm with the weight and defined as:
Optimisation of model parameters is achieved by minimising in a least squares sense a based cost function:
where r represents the vector of residuals between the observations and the model values, and is the matrix of weights which is equal to the inverse of the covariance matrix.
The optimisation process is carried out using an implementation of the Differential Evolution Algorithm (DEA) as described by Storn and Price (1997). Due to the complexity of the model, great care has been taken to avoid as much as possible local minima in the space, especially in the first steps of the processing. The adopted strategy has been to begin the first iterations with a simplified version of the model (lower number of parameters and symmetric LSF), and progressively increase complexity as far as convergence was achieved. The general processing scheme is outlined in Figure 5.13 where six different stages are identified: 1) initialisation of the dispersion relation and the large scale modifications of the nominal response; 2) modelling of the interactions between dispersion and LSF core (chromaticity effects); 3) modelling of chromaticity variation with wavelength; 4) modelling of LSF wings; 5) modelling of fine structures in the response model; 6) final overall optimisation and generation of the ensemble of models. Table 5.2 summarises the number of parameters for BP and RP instrument models at different stages of the processing.
|LSF||BP||(0, 0)||(4, 1)||(4, 2)||(7, 3)||(7, 3)||(7, 3)|
|RP||(0, 0)||(4, 1)||(4, 2)||(4, 2)||(4, 2)||(4, 2)|
Each optimisation cycle consists of thousand DEA iterations, each one involving 50 walkers (different realisations of model parameters, initially distributed randomly around the starting parameters set), that are eventually stopped when the individual costs from the walkers converge to a common value. The parameter set with the lowest cost is used to initialise the following optimisation cycle and the procedure is repeated till convergence in parameter values is achieved, thus switching to the following stage. In the first stage only a selection of featureless SPSS is used, the response cut-off parameters are kept fixed to the nominal values and the cost evaluation is done avoiding the cut-off regions, while the dispersion relation is fitted using emission line (EL) calibrators. Since EL sources are not reliable flux calibrators (they include variable objects), a grey calibration is applied at each iteration: the input SPD of each EL source is scaled by a parameter that is evaluated to minimise the squared residuals between the current model prediction and the corresponding observational mean spectrum. Stages 2 and 3 are used to initialise the modelling of LSF, while a full optimisation is performed in the fourth stage. In stage 5 the basis inversion process is performed. Inverse bases are used to reconstruct the effective SPD for SPSS; by dividing effective SPD by the source SPD it is possible to trace the overall instrument response as shown in Figure 5.14.
Two aspects are worth noticing:
the signature left in the data by Balmer lines (, , in BP, in RP) and some Paschen lines (in RP beyond ) are clearly visible: this is due to the error in LSF modelling causing systematic differences in the profile of the absorption lines;
a wavy pattern with a well defined spatial frequency is present in the residuals, clearly visible in the BP data while in the RP case the data look more noisy.
It is not fully understood whether the wavy pattern originates from wiggles in the mean spectra (De Angeli et al. (2023)). We have used this data to upgrade the instrument response distortion model by increasing the number of spline knots to model the wiggles especially in the range [500, 800] nm, taking care to exclude the signatures due to absorption lines (the response model must not be a function of any source astrophysical parameter). The model curves represented in the plot have been computed after this upgrade. The final location of spline knots for the BP and RP response models is given in Table 5.3.
The last stage is used to create the ensemble of models: the DEA solver is run on the last instrument model till parameter relaxation is achieved (typically after iterations), then all the 50 walkers are saved into the database. While the walker that achieves the lowest cost is selected to represent the instrument model, the ensemble is needed to estimate the uncertainties in the simulated spectra, which are especially useful for other CUs to derive source astrophysical parameters.
The final instrument models for BP and RP are shown in Figure 5.15 where the corresponding instrument matrices sampled on regular wavelength and pixel grids are visualised together with the instrument response (black curves) and dispersion (dashed green curves).
Wavelength calibration accuracy
The accuracy of the wavelength calibration has been estimated by comparing the BP and RP mean spectra of a dataset of 102 QSOs with simulated spectra obtained by the convolution of the ground based spectra of these objects with the instrument model. The comparison, based on the wavelength position of 263 emission lines, shows that in general the accuracy is within 1%, except for 400 nm, where it is worse due to the scarceness of QSOs with emission lines in that range, and the presence of a systematic difference of 1 nm in RP, caused by the inability of the LSF model (see Section 5.3.5) to reproduce the instrument chromaticity. For more details, see Section 6.3 of Montegriffo et al. (2023).
Resolution of calibrated spectra
The spectral resolution of the calibrated spectra is shown in Figure 5.16. An accurate description of the process to determine it can be found in Section 6.2 of Montegriffo et al. (2023), here we just report the findings: the variation of the resolution with the wavelength is more uniform in RP than in BP; below 650 nm there is a sudden decrease in the RP resolution of the externally calibrated spectra, while BP shows small oscillations around 600 nm, probably caused by ripples in the response profiles. The resolution of a typical Gaia externally calibrated spectrum falls from to in the wavelength range nm, then jumps up to to decrease smoothly to at longer wavelengths.
The validation of the instrument models and the calibrated spectra was carried out by comparing the observed Gaia mean BP and RP spectra with the spectra obtained by feeding known SED to the instrument models. This comparison can be performed on sampled internally calibrated spectra (see Section 5.3.5) or on externally calibrated spectra (see Section 5.3.5). The dataset used for this purpose contains the SPSS and PVL sources (see Section 5.6 and Section 5.6.1) and the Next Generation Spectral Library (NGSL, Heap and Lindler (2016)), consisting of about 350 sources within the range [1.97–12.0].
Sampled mean spectra residuals
Figure 5.17 is an example of comparison between a mean and a simulated spectrum. This was carried out on the whole dataset and the residual, defined as percentage difference between the observed spectrum and the model prediction, was calculated. Figure 5.18 shows the residuals for the whole dataset, where each curve is colour-coded for the colour of the source. The grey vertical line marks the boundary between BP and RP wavelength ranges. The red and yellow curves represent the median and the and percentiles distributions. The two plots show different magnitude ranges as labeled in the figure, corresponding to the different window classes in the BP and RP spectra. The fact that the results are slightly different for the window classes implies that the internal calibration of bright sources did not converge fully to the mean reference system (see Section 5.3.4), mostly dominated by faint sources observed as 1D windows. For more details see Section 7.0 of Montegriffo et al. (2023)
Sampled calibrated SEDs residuals
Figure 5.19 shows two example of comparison between a Gaia externally calibrated spectrum and the model SED obtained by applying the instrument model to the corresponding high resolution external spectrum of the same source: in both cases there is good agreement between the observed and simulated spectra, however the left spectrum exhibits wiggles. The presence of the wiggles is due to the basis inversion process (see Section 5.3.5) and their intensity can change significantly. Figure 5.20 shows the percentage residuals for the whole validation dataset, obtained comparing the externally calibrated spectra with the reference spectra, colour coded for the colour of the sources. While the wiggles are clearly visible, the median of the distribution (red line) behaves very similar to the corresponding median of the internally calibrated mean spectra (Figure 5.18).
Comparison with external Johnson-Kron-Cousins photometry
The convolution of the calibrated Gaia low resolution spectra with filters in the wavelength range covered by the Gaia spectra allows to obtain wide and medium band synthetic photometry. The method and its validation is fully described in Gaia Collaboration et al. (2023g). An example is shown in Figure 5.21 where the BVRI magnitudes in the Johnson-Kron-Cousins system for a set of 32781 Landolt standard stars is compared with their synthetic BVRI photometry computed on externally calibrated BP and RP spectra. In the left panels of this figure is evident the presence of a magnitude dependent term for sources with 16, probably due to a systematic in the background estimation (see also De Angeli et al. (2023)).