# 6.4 Processing steps

The detailed processing steps applied for this release are described in sections 4, 5 and 6 of Eyer et al. (2017) and are not repeated here. This section lists the main components, their dependencies, and their output in Gaia DR1.

## 6.4.1 Initial light curves pre-processing

Author(s): Lorenzo Rimoldini, Berry Holl, Jonathan Charnas

### Definition of observation time

Observation times are expressed in units of Barycentric JD (in TCB) in days $-2\,455\,197.5$, computed as follows. First, the observation time is converted from On-board Mission Time (OBMT) into Julian date in TCB (Temps Coordonnée Barycentrique). Next, a correction is applied for the light-travel time to the Solar System barycentre, resulting in Barycentric Julian Date (BJD). Finally, an offset of 2 455 197.5 days is subtracted (corresponding to a reference time $T_{0}$ at 2010-01-01T00:00:00) to obtain a conveniently small numerical value. Although the centroiding time accuracy of the individual CCD observations is well below 1 ms, the per-field-of-view observation times processed and published in Gaia DR1 are averaged over typically 9 CCD observations taken within a time range of about 44 s.
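The last step of this conversion (the subtraction of the reference epoch) can be illustrated with a minimal Python sketch. The function names are hypothetical and not part of any Gaia software; the OBMT-to-TCB conversion and the barycentric correction are not reproduced here, and the calendar date returned is simply labelled in TCB with no conversion to UTC.

```python
from datetime import datetime, timedelta

# Reference epoch T0 = 2010-01-01T00:00:00, i.e. JD 2455197.5 (value from the text).
T0_JD = 2455197.5
T0_CALENDAR = datetime(2010, 1, 1, 0, 0, 0)


def bjd_to_published_time(bjd_tcb):
    """Subtract the reference epoch from a full Barycentric JD (TCB), in days."""
    return bjd_tcb - T0_JD


def published_time_to_calendar(t_days):
    """Return the calendar date for an offset time in days since T0.

    Assumption: the date is kept on the TCB scale; no conversion to UTC is applied.
    """
    return T0_CALENDAR + timedelta(days=t_days)


# Example with an arbitrary, illustrative BJD value (not a real observation).
t = bjd_to_published_time(2456900.25)
print(t, published_time_to_calendar(t))
```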

### Conversion from flux to magnitude

In the variability pipeline, both fluxes and magnitudes are used, depending on the processing module. The calibrated photometry delivered by CU5 is expressed in units of flux. To convert to magnitude, we use the $G$-band zero-point magnitude of 25.525 in the Vega system (see Section 5.3.4).
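As an illustration, the conversion can be sketched with the standard relation $G = -2.5\log_{10}(\mathrm{flux}) + \mathrm{ZP}$. Only the zero-point value is taken from the text; the function names are hypothetical, and the first-order uncertainty propagation is a common approximation added for convenience, not necessarily what the pipeline does.

```python
import math

G_ZERO_POINT = 25.525  # G-band zero point in the Vega system (value from the text)


def flux_to_g_mag(flux):
    """Convert a calibrated G-band flux into a G magnitude."""
    if flux <= 0:
        raise ValueError("magnitude is undefined for non-positive flux")
    return -2.5 * math.log10(flux) + G_ZERO_POINT


def flux_error_to_mag_error(flux, flux_error):
    """First-order propagation of the flux uncertainty (an assumption of this sketch)."""
    return 2.5 / math.log(10.0) * flux_error / flux


print(flux_to_g_mag(100.0), flux_error_to_mag_error(100.0, 1.0))
```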

### Observation filtering

The observations related to spurious outliers and anomalous uncertainties were filtered out as explained in section 4.3 of Eyer et al. (2017).

## 6.4.2 Statistical parameter computation

Author(s): Leanne Guy

#### Input

The statistical parameter computation is the first step in the scientific processing chain. It follows the conversion from flux to magnitude and the observation filtering (both described in Section 6.4.1) of the time series of the selected sources (Section 6.2.1).

#### Method

The statistical parameter module encompasses the computation of a number of basic descriptive, inferential and correlation statistics of all light curves. These statistics provide a first general overview of the data and their distributions and are used to determine whether variability is present in a time series of Gaia observations. For more details see sections 2.2.1, 5.2, and 6.2 of Eyer et al. (2017).

#### Configuration parameters

• Variance, skewness, and kurtosis were computed with a sample-size bias correction.
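The text does not state which bias-corrected estimators were used. As a sketch under that caveat, the following Python function applies one common choice: the unbiased sample variance, the adjusted Fisher-Pearson skewness, and the corresponding sample-size-corrected excess kurtosis. The function name is hypothetical.

```python
import math


def bias_corrected_moments(values):
    """Return (variance, skewness, excess kurtosis) with sample-size corrections.

    Assumption: the estimators below are a common textbook choice; the source
    does not specify the exact formulae used in the pipeline.
    """
    n = len(values)
    if n < 4:
        raise ValueError("at least 4 values are needed for the kurtosis correction")
    mean = sum(values) / n
    # Biased central moments of order 2, 3 and 4.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("skewness and kurtosis are undefined for constant data")
    variance = m2 * n / (n - 1)
    g1 = m3 / m2 ** 1.5
    g2 = m4 / m2 ** 2 - 3.0
    skewness = math.sqrt(n * (n - 1)) / (n - 2) * g1
    kurtosis = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6.0)
    return variance, skewness, kurtosis


print(bias_corrected_moments([1.0, 2.0, 3.0, 4.0, 5.0]))
```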

#### Published output

See Gaia DR1 table: phot_variable_time_series_gfov_statistical_parameters.

## 6.4.3 Variability Detection

Author(s): Isabelle Lecoeur-Taïbi

#### Input

The statistical parameters computed on a preliminary dataset preceding the Gaia DR1 data. See sections 2.2.2 and 5.3 of Eyer et al. (2017) for more details.

#### Method

Variability analysis was performed using a Random Forest classifier trained on an equal number of OGLE-IV GSEP variable and constant objects (Soszyński et al. 2012).

#### Configuration parameters

• Classification attributes, described in section 5.3 of Eyer et al. (2017)

• Random Forest classifier with 500 trees.

#### Published output

No data from this processing step was published in Gaia DR1.

## 6.4.4 Period search and time series modelling

Author(s): Leanne Guy, Jan Cuypers, Joris De Ridder

#### Input

The time series and statistical parameters for sources identified as variable.

#### Method

The process of frequency (i.e. period) search and time series modelling, referred to collectively as Variability Characterisation, aims to characterise the variability behaviour of time series of Gaia data using a classical Fourier decomposition approach. The model to fit is given by Equation 6.1. The Characterisation process takes as input all time series identified as variable by the preceding Variability Detection module (see Section 6.4.3). The goal is to produce, in an automated manner, the simplest and statistically most significant model of the observed variability. See sections 2.2.3, 5.4, and 6.3 of Eyer et al. (2017) for more details.

#### Configuration parameters

• Frequency search

  • only started if more than 9 observations are available

  • no de-trending was done prior to the frequency search

  • frequency search method: least squares, i.e. the generalised Lomb-Scargle periodogram (Zechmeister and Kürster 2009)

  • minimum frequency: $2(\Delta T)^{-1}$, with $\Delta T$ the total time span of each time series

  • maximum frequency: $3.9\,\mathrm{d}^{-1}$ (to avoid aliases and parasitic frequencies)

  • frequency step: $(20\Delta T)^{-1}$, with $\Delta T$ the total time span of each time series

  • the frequency search was refined to the level of $10^{-6}$.

• Modelling

  • the polynomial part of Equation 6.1 was limited to degree zero

  • no weights were applied in the fitting

  • non-linear fitting (Levenberg-Marquardt) was applied.
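The frequency-search settings above translate into a simple trial-frequency grid. The following Python sketch builds that grid for one time series; it is illustrative only, the function and constant names are hypothetical, and the periodogram evaluation, the frequency refinement, and the model fit are not reproduced.

```python
import math

MIN_OBSERVATIONS = 9  # the search only starts with MORE than this many observations
MAX_FREQUENCY = 3.9   # upper frequency limit in d^-1 (value from the text)
STEP_FACTOR = 20      # frequency step = 1 / (STEP_FACTOR * time span)


def frequency_grid(times):
    """Return the trial frequencies (d^-1) for observation times given in days.

    An empty list is returned when the search would not be started.
    """
    if len(times) <= MIN_OBSERVATIONS:
        return []
    span = max(times) - min(times)  # total time span, Delta T
    if span <= 0:
        return []
    f_min = 2.0 / span
    step = 1.0 / (STEP_FACTOR * span)
    if f_min > MAX_FREQUENCY:
        return []
    # Small tolerance so that rounding errors do not drop the last grid point.
    n_steps = int(math.floor((MAX_FREQUENCY - f_min) / step + 1e-9))
    return [f_min + i * step for i in range(n_steps + 1)]


# Example: 11 evenly spaced, purely illustrative epochs covering 100 days.
grid = frequency_grid([10.0 * i for i in range(11)])
print(len(grid), grid[0], grid[-1])
```

For a time span of 100 days this gives a grid starting at $0.02\,\mathrm{d}^{-1}$ with a step of $0.0005\,\mathrm{d}^{-1}$, which shows how the grid density scales with the length of the time series.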

#### Published output

No data from this processing step was published in Gaia DR1.

## 6.4.5 Classification

Author(s): Berry Holl, Lorenzo Rimoldini

#### Input

The time series (to compute additional attributes), statistical parameters, and period and time series model for sources identified as variable.

#### Method

Supervised classification was used in the initial identification of candidate Cepheids and RR Lyrae stars. Specifically, Gaussian Mixture (GM), Bayesian Network (BN), and Random Forest (RF) supervised classifiers were constructed using the training sets (see Section 6.2.2) and applied to a preliminary dataset preceding the Gaia DR1 data. See sections 2.2.4 and 5.5 of Eyer et al. (2017) for more details.

#### Configuration parameters

• Classification training set, see section 5.5.1 of Eyer et al. (2017)

• Classification attributes, see section 5.5.2 of Eyer et al. (2017)

• Single stage Random Forest classifier with 150 trees, see section 5.5 of Eyer et al. (2017)

• Single stage Gaussian Mixtures classifier with 1 to 3 components per class, see section 5.5 of Eyer et al. (2017)

• Multi-stage Bayesian Networks classifier, see section 5.5 of Eyer et al. (2017).

#### Published output

No data from this processing step was published in Gaia DR1.

## 6.4.6 Specific Object Studies

Author(s): Nami Mowlavi, Gisella Clementini

#### Input

The time series for sources identified as Cepheids or RR Lyrae stars by classification and visual inspection, as described in Clementini et al. (2016).

#### Method

A detailed description of the SOS pipeline processing Cepheids and RR Lyrae stars is given in section 2 of Clementini et al. (2016).

#### Configuration parameters

The configuration parameters used in the SOS Cep&RRL processing are given, together with the method description, in sections 2 and 3 of Clementini et al. (2016).

#### Published output

All 3194 Gaia DR1 Cepheid and RR Lyrae stars have an entry in the following DR1 tables:

• gaia_source (phot_variable_flag set to VARIABLE, and to NOT_AVAILABLE for all other sources)

• variable_summary

• phot_variable_time_series_gfov

• phot_variable_time_series_gfov_statistical_parameters

• cepheid (for the 599 Cepheids)

• rrlyrae (for the 2595 RR Lyrae stars).