Flavour Tagging in LHCb, Including Calibration and Control

Many of the precise measurements expected from LHCb will be based on the observation of time-dependent CP asymmetries, which rely on flavour tagging. This article shows that the high precisions aimed for at LHCb impose stringent conditions on both flavour tagging performance and its calibration. The algorithms that have been developed for flavour tagging at LHCb are described, as well as the procedure for obtaining the required level of calibration by using control channels.


INTRODUCTION
Flavour tagging algorithms aim to determine the flavour of reconstructed B mesons at their production. They can be classified in two groups: same-side (SS) and opposite-side (OS) algorithms. SS algorithms exploit the correlation between the charge of mesons produced in the fragmentation chain and the original flavour of the signal B, i.e., the B whose decay is of interest. OS algorithms exploit the perfect anti-correlation between the flavours of the two B hadrons produced in the event, via the partial reconstruction of a flavour-specific final state of the other B hadron in the event (the tagging B).
The performance of tagging algorithms is characterized by their efficiency (ε_tag) and the probability that the tagging assignment is wrong (ω). The effective tagging efficiency (ε_eff) indicates the statistical degradation of the information due to tagging dilution, and is given by ε_eff = ε_tag (1 − 2ω)^2 [1].
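As an illustration of the definition above, the following minimal Python sketch evaluates the effective tagging efficiency from a tagging efficiency and a wrong tag probability. The function name and the numerical inputs are illustrative assumptions and are not LHCb results.

```python
def effective_efficiency(eps_tag: float, omega: float) -> float:
    """Return eps_eff = eps_tag * (1 - 2*omega)**2.

    eps_tag: fraction of events that receive a tag (0..1).
    omega:   probability that the assigned tag is wrong (0..1).
    """
    if not 0.0 <= eps_tag <= 1.0:
        raise ValueError("eps_tag must lie in [0, 1]")
    if not 0.0 <= omega <= 1.0:
        raise ValueError("omega must lie in [0, 1]")
    dilution = 1.0 - 2.0 * omega  # tagging dilution
    return eps_tag * dilution ** 2


if __name__ == "__main__":
    # Made-up inputs: half of the events tagged, 35% of the tags wrong.
    print(effective_efficiency(0.50, 0.35))
```

A tagger with ω = 0.5 carries no information: the dilution vanishes and so does the effective efficiency, whatever the value of ε_tag.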
This article is organized as follows. Section 2 reviews the flavour tagging algorithms developed in LHCb. Section 3 shows how the event-by-event ω is computed, and describes the LHCb tagging performance as expected from Monte Carlo (MC) simulation. Section 4 demonstrates that ω needs to be precisely calibrated in order to measure CP asymmetries at the precision that LHCb aims for, and shows how control channels will be used for that purpose. Section 5 illustrates how the value of ω measured using a control channel needs to be corrected to obtain the estimation of ω on a signal channel. Conclusions are given in Section 6.
* hugo.ruiz@cern.ch

FLAVOUR TAGGING ALGORITHMS IN LHCb
A brief description of the flavour tagging algorithms developed for LHCb follows. More details can be found elsewhere [2].

Opposite side tagging
Tagging information from the other B meson in the event is obtained from the charge of the lepton from a semileptonic decay, from the charge of a kaon from the b → c → s decay chain, or from the charge of the tracks from a secondary vertex reconstructed in the event.
The selection of tagging leptons and kaons is based on kinematical and topological cuts and on particle identification, all tuned to maximize ε_eff. Cuts on transverse momentum (p_T) and momentum (p) are applied at 1.2 (5), 1 (5) and 0.4 (3) GeV/c for muons, electrons and kaons, respectively. An extra cut on impact parameter significance (IPS) is applied in the case of kaons, in order to reduce the large background from kaons coming from the primary vertex. Particle identification algorithms provide an efficiency (purity) of 85% (75%), 75% (85%) and 80% (80%) for muons, electrons and kaons, respectively [3]. If no lepton or kaon tagging candidate is found in the event, an iterative secondary vertex-finding algorithm is executed. The algorithm finds a vertex 56% of the time. On average, 70% of the tracks in the vertex correspond to the decay of the tagging B. A "weighted charge" of the vertex is computed by assigning to each track a weight, the magnitude of which slightly increases with p_T. A flavour tag is assigned only if the summed weighted charge is significantly different from zero, by assuming that the sign of the vertex charge corresponds to that of the b quark.
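The weighted vertex charge described above can be sketched as follows. The text only states that the weight increases slightly with p_T, so the power-law weight p_T^kappa, the exponent kappa = 0.4 and the threshold min_abs_charge = 0.2 used here are hypothetical choices made for illustration, not the values used in LHCb.

```python
from typing import Optional, Sequence, Tuple


def vertex_charge_tag(
    tracks: Sequence[Tuple[int, float]],
    kappa: float = 0.4,            # hypothetical exponent of the pT weight
    min_abs_charge: float = 0.2,   # hypothetical "significantly non-zero" threshold
) -> Optional[int]:
    """Return the sign (+1 or -1) of the weighted vertex charge, or None.

    tracks: (charge, pT in GeV/c) for each track attached to the vertex.
    None means that no tag is assigned.
    """
    # Keep only tracks with a physical (positive) transverse momentum.
    good = [(q, pt) for q, pt in tracks if pt > 0.0]
    weights = [pt ** kappa for _, pt in good]
    total = sum(weights)
    if total == 0.0:
        return None  # empty vertex: nothing to tag with
    charge = sum(q * w for (q, _), w in zip(good, weights)) / total
    if abs(charge) < min_abs_charge:
        return None  # weighted charge compatible with zero
    return 1 if charge > 0 else -1
```

The function returns only the sign of the vertex charge. Translating that sign into a flavour for the signal B is left to the caller.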

Same side tagging
The charges of the hadrons which are close in momentum space to the signal B provide information about the B flavour at production. In the fragmentation cascade, the accompanying quark in the B meson makes available a quark with opposite flavour, which tends to form a pion (kaon) for B^0 (B^0_s) with a definite charge. In addition, around 10% of B^0 mesons are expected to be produced via the decay of a B^{**+}/B^{**-} state, accompanied by a pion from the decay which has the same charge as correlated pions from fragmentation.
The selection of pion and kaon candidates requires a p_T (p) greater than 0.2 (2) and 0.4 (4) GeV/c, respectively. The IPS of the particle is required to be lower than 3 and 2.5, respectively, to avoid candidates from secondary interactions. In addition, a set of kinematical cuts is applied in order to select pions or kaons which are produced close in momentum space to the signal B hadron (see Table 1).
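The momentum and IPS requirements quoted above can be written as a simple cut table. The sketch below is illustrative only: the class and function names are assumptions, and the kinematical cuts of Table 1 and the particle identification requirements are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SameSideCuts:
    min_pt: float   # GeV/c
    min_p: float    # GeV/c
    max_ips: float  # impact parameter significance


# Thresholds as quoted in the text for SS pion and SS kaon candidates.
SS_CUTS = {
    "pion": SameSideCuts(min_pt=0.2, min_p=2.0, max_ips=3.0),
    "kaon": SameSideCuts(min_pt=0.4, min_p=4.0, max_ips=2.5),
}


def passes_ss_cuts(species: str, pt: float, p: float, ips: float) -> bool:
    """Apply the pT, p and IPS requirements to one SS tagging candidate."""
    cuts = SS_CUTS[species]
    return pt > cuts.min_pt and p > cuts.min_p and ips < cuts.max_ips
```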

COMBINATION OF TAGGERS AND TAGGING PERFORMANCE IN LHCb
In the case that several taggers are present in a given event (either OS or SS), they need to be combined into a single flavour tag. In addition, the likelihood from the different taggers needs to be combined to obtain an event-by-event ω, as this improves the statistical power of the fits that extract CP parameters.
The procedure followed in LHCb for the combination of tags and for the calculation of the event-by-event ω is the following. For each event, a wrong tag probability ω_i is computed for each of the tagging candidates present. This is done by means of a neural network that takes as input several properties of the tagger (for example its p_T and IPS) and that has been trained on independent MC samples. Then, the global tag of the event is obtained by combining the taggers using their individual right-tag probabilities, given by (1 − ω_i). Finally, the ω of the event is computed by the combination of such probabilities. The overall ε_eff is about 25% higher than would be obtained if a single ω was assigned to the whole sample.
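One common way to combine taggers from their individual right-tag probabilities is to multiply them under the assumption that the taggers are statistically independent. The sketch below follows that assumption; it is not necessarily the exact combination implemented in LHCb, and all names are illustrative. The two helper functions at the end show why an event-by-event ω increases the statistical power: the mean of (1 − 2ω)^2 is never smaller than the squared dilution computed from the average ω.

```python
from typing import Sequence, Tuple


def combine_taggers(taggers: Sequence[Tuple[int, float]]) -> Tuple[int, float]:
    """Combine taggers given as (decision, omega_i), with decision = +1 or -1.

    Returns (combined decision, combined omega). The decision is 0 when the
    taggers carry no net information (the combined omega is then 0.5).
    Assumes independent taggers.
    """
    p_plus = 1.0   # relative probability that the true flavour is +1
    p_minus = 1.0  # relative probability that the true flavour is -1
    for decision, omega in taggers:
        p_right = 1.0 - omega
        if decision > 0:
            p_plus *= p_right
            p_minus *= omega
        else:
            p_plus *= omega
            p_minus *= p_right
    norm = p_plus + p_minus
    if norm == 0.0:
        return 0, 0.5  # two taggers claim opposite flavours with certainty
    prob_plus = p_plus / norm
    if prob_plus == 0.5:
        return 0, 0.5
    decision = 1 if prob_plus > 0.5 else -1
    return decision, min(prob_plus, 1.0 - prob_plus)


def mean_squared_dilution(omegas: Sequence[float]) -> float:
    """Average of (1 - 2*omega)**2: tagging power with per-event omega."""
    return sum((1.0 - 2.0 * w) ** 2 for w in omegas) / len(omegas)


def squared_mean_dilution(omegas: Sequence[float]) -> float:
    """(1 - 2*<omega>)**2: tagging power with a single average omega."""
    mean = sum(omegas) / len(omegas)
    return (1.0 - 2.0 * mean) ** 2
```

Two agreeing taggers reinforce each other (the combined ω is smaller than either ω_i), while two equally reliable taggers that disagree cancel exactly.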
The tagging performance of each individual tagger, as well as that of the combination, is shown in Table 2 for two typical hadronic decays of B^0 and B^0_s mesons. Note that the overall ε_eff is significantly higher for B^0_s channels, as a consequence of the fact that the SS kaon selection is purer than that of SS pions. On the other hand, there are significant differences between the performance of individual OS taggers for B^0 and B^0_s. The explanation is given in Section 5.

CALIBRATION AND CONTROL
The amplitude of CP asymmetries is diluted by experimental effects according to [1]:
A_meas = D_tag D_res A_true    (1)
where A_true is the asymmetry amplitude in the absence of experimental effects and A_meas is the measured one. The first dilution factor corresponds to the effect of wrong tags and is given by
D_tag = 1 − 2ω    (2)
The second dilution factor, D_res, corresponds to proper time resolution and is only relevant for the case of B^0_s oscillation, as for the case of B^0 the period of the oscillation is much larger than the expected proper time resolution of LHCb.
Table 1. Cuts relating the kinematical properties of the signal B and the SS tagging candidate. Here Δη is the rapidity difference and Δφ is the difference in azimuthal angle between the candidates. For the case of pions, MC studies show that the effective tagging efficiency is not improved by cutting on these two variables.
Table 2. Flavour tagging performance for two typical hadronic B decays.
From Equations 1 and 2, it follows that ω enters as a first-order correction in the measurement of CP asymmetries. As a consequence, not only must ω be kept as small as possible, but its actual value must also be well known to avoid introducing systematic uncertainties. As an example, requiring that the uncertainty caused by ω be smaller than half the statistical uncertainty on the measurement of sin2β in LHCb [4] for 2 fb^-1 of data yields a requirement of δω/ω ≤ 1%.
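To make the size of this effect concrete, assume that the measured asymmetry scales with the tagging dilution (1 − 2ω). An uncertainty δω on ω then propagates to a relative uncertainty 2δω/(1 − 2ω) on the extracted asymmetry. The sketch below evaluates this expression; the function name and the value ω = 0.35 are illustrative assumptions.

```python
def relative_asymmetry_uncertainty(omega: float, rel_omega_uncertainty: float) -> float:
    """Relative uncertainty on a CP asymmetry induced by the knowledge of omega.

    Assumes A_meas is proportional to (1 - 2*omega), so that
    dA/A = 2 * d_omega / (1 - 2*omega).
    rel_omega_uncertainty is d_omega / omega.
    """
    if not 0.0 <= omega < 0.5:
        raise ValueError("omega must lie in [0, 0.5)")
    delta_omega = omega * rel_omega_uncertainty
    return 2.0 * delta_omega / (1.0 - 2.0 * omega)


if __name__ == "__main__":
    # Made-up wrong tag probability of 35%, known to 1% (relative).
    print(relative_asymmetry_uncertainty(0.35, 0.01))
```

Because of the factor 1/(1 − 2ω), the same relative uncertainty on ω is more harmful for taggers with a large wrong tag probability.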

Control channels
MC simulation does not allow LHCb to reach the required level of precision in δω/ω, due to large uncertainties on parameters such as the relative contribution of the different bb̄ production mechanisms and the average event multiplicity, which strongly affect the tagging performance. The calibration of ω must therefore be performed on real data. Control channels with flavour-specific B decays and high yields will be used for this purpose. In the case of control channels with B^+/B^- decays, ω can be directly extracted by comparing the result of the tagging algorithm with the flavour indicated by the decay. For the case of B^0 and B^0_s decays, ω must be extracted by fitting the known oscillation pattern. Note that B^+/B^- decays can only be used for the calibration of OS tagging.
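For a B^+/B^- control channel, the direct extraction amounts to counting right and wrong tags. The following sketch computes ω and its statistical uncertainty under a binomial assumption and without background; the function name and the event counts are illustrative assumptions.

```python
import math
from typing import Tuple


def mistag_from_counts(n_right: int, n_wrong: int) -> Tuple[float, float]:
    """Return (omega, sigma_omega) from counts of right and wrong tags.

    Assumes binomially distributed tags and neglects background.
    """
    n_tagged = n_right + n_wrong
    if n_tagged <= 0:
        raise ValueError("at least one tagged event is required")
    omega = n_wrong / n_tagged
    sigma = math.sqrt(omega * (1.0 - omega) / n_tagged)
    return omega, sigma


if __name__ == "__main__":
    # Made-up counts for illustration only.
    omega, sigma = mistag_from_counts(6500, 3500)
    print(omega, sigma, sigma / omega)
```

The relative uncertainty σ/ω scales as the inverse square root of the number of tagged events, which is why high-yield control channels are needed to reach the percent level.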
The method of fitting the known oscillation pattern is expected to have important systematic uncertainties for the case of B^0_s, as the oscillation is fast and hence a very good knowledge of the proper time resolution is required. An alternative procedure for calibrating ω for B^0_s channels consists of assuming that the OS tagging performance is the same as in B^0 or B^+/B^- channels, and measuring the SS tagging performance by studying events in which both OS and SS tagging information is present (double-tagged events).
Table 3 shows the annual yield, the background-to-signal ratio (B/S), and the expected relative statistical uncertainty on ω (δω_stat/ω) for a selection of control channels that have been considered in LHCb. The value of δω_stat/ω is extracted from a realistic fit of the observed oscillation pattern including (uncorrelated) background [4,5]. For B^+ channels [6], only the δω_stat/ω from OS is shown. It is computed by assuming a binomial distribution of right and wrong tags and neglecting the effect of background.
Table 3. Performance expected from a selection of control channels in LHCb, for 2 fb^-1 of data. For B^0 → J/ψ(μμ)K^{*0} and B^0_s → D_s^+ π^-, a realistic fit including (uncorrelated) background has been performed to compute δω_stat/ω. For B^+ channels, the computation is restricted to OS, a binomial distribution is assumed and background is not considered. For B^0_s → D…
For the B^0_s → D_s^(*) μ^+ ν channel, the value of δω_stat/ω shown in Table 3 corresponds to SS tagging only, obtained by using double-tagged events, as explained above.
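The double-tagged method can be illustrated with a simple counting argument. If the OS and SS decisions are assumed to be independent, the probability that they agree is (1 − ω_OS)(1 − ω_SS) + ω_OS ω_SS, which can be solved for ω_SS once ω_OS is known from B^0 or B^+/B^- channels. The sketch below implements this relation. It rests on the independence assumption and neglects background, and it is not necessarily the procedure used to obtain the numbers of Table 3; the function name and counts are illustrative.

```python
def ss_mistag_from_double_tags(n_agree: int, n_disagree: int, omega_os: float) -> float:
    """Estimate omega_SS from double-tagged events.

    n_agree / n_disagree: events where the OS and SS tags agree / disagree.
    omega_os: OS wrong tag probability, taken from other control channels.
    Uses P(agree) = (1 - w_os)(1 - w_ss) + w_os * w_ss, assuming independence.
    """
    n_total = n_agree + n_disagree
    if n_total <= 0:
        raise ValueError("at least one double-tagged event is required")
    if omega_os == 0.5:
        raise ValueError("an OS tagger with omega = 0.5 carries no information")
    p_agree = n_agree / n_total
    return (1.0 - omega_os - p_agree) / (1.0 - 2.0 * omega_os)


if __name__ == "__main__":
    # Made-up counts: 56% agreement with omega_OS = 0.30.
    print(ss_mistag_from_double_tags(560, 440, 0.30))
```

Since both tags refer to the flavour at production, the agreement fraction does not depend on the oscillation of the signal B, which is what makes this approach attractive for B^0_s.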

TIS, TOS and TOB
The value of ω measured from control channels cannot be directly assumed for signal channels. MC simulation has shown that sizeable relative differences in ω of the order of 5% can appear between signal and control samples. Relative differences are still of the order of 2-3% when channels with similar topologies are considered. The reason for the discrepancies is that different trigger and offline selections bias the tagging information differently.
As an example of the possible biases, let us consider a channel which is difficult to trigger on, like a purely hadronic decay [7]. The fraction of triggered events in which particles from the decay of the tagging B contribute to the trigger decision is higher than for channels which trigger easily on the decay of the signal B. As a result, tagging particles are found in higher proportion in hadronic channels, and the tagging efficiency is enhanced. As an illustration, ε_eff is (4.46 ± 0.09)% for B^0 → J/ψ(μ^+ μ^-)K_S and (5.05 ± 0.22)% for B^0 → π^+ π^-.
A procedure has been developed to make possible the comparison of tagging performance between control and signal channels. The first step of the procedure consists of splitting the selected samples of both control and signal into different categories, according to which particles were used for the trigger decision. The three categories correspond to events that were triggered on signal (TOS), events that were triggered independently of signal (TIS), and events that were triggered on both (TOB), that is, where the trigger selection required particles from both the signal and the rest of the event.
In the sub-sample of TOS events for a given channel, the three-momentum distribution of the tagging B is determined by the offline and trigger selections on the signal B, via the kinematical correlations between the two B hadrons in the event. Hence, all tagging properties are expected to be the same on average for all channels at a given point in momentum space of the signal B.
In the sub-sample of TIS events of a given channel, the three-momentum distribution of the tagging B is determined by both the offline selection of the signal B and the fact that particles from the decay of the tagging B could have triggered the event. As in the case of TOS, all tagging properties are expected to be the same on average for all channels at a given point in momentum space of the signal B. Therefore, if all events within one of the TIS or TOS categories from a control channel are re-weighted in such a way that the three-momentum distribution of the B mesons reproduces that of the signal channel in the corresponding category, the ω obtained from the re-weighted control sample can be equated to that of the signal, within the given category. Indeed, the tests performed on MC data show that a simple p_T re-weighting equalizes ω for each of the categories within current MC statistical uncertainties [8,9].
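A p_T re-weighting of this kind can be sketched with a binned ratio of normalized distributions: each control event receives the weight signal fraction / control fraction of its p_T bin, and ω is then computed as a weighted fraction of wrong tags. The binning, the function names and the treatment of empty bins (weight zero) are assumptions made for illustration, not the LHCb implementation.

```python
from bisect import bisect_right
from typing import List, Sequence


def normalized_histogram(values: Sequence[float], edges: Sequence[float]) -> List[float]:
    """Fraction of values in each half-open bin [edges[i], edges[i+1])."""
    counts = [0.0] * (len(edges) - 1)
    for v in values:
        i = bisect_right(edges, v) - 1
        if 0 <= i < len(counts):
            counts[i] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total > 0 else counts


def pt_weights(control_pt: Sequence[float], signal_pt: Sequence[float],
               edges: Sequence[float]) -> List[float]:
    """Per-event weights that make the control pT spectrum match the signal one."""
    h_control = normalized_histogram(control_pt, edges)
    h_signal = normalized_histogram(signal_pt, edges)
    weights = []
    for v in control_pt:
        i = bisect_right(edges, v) - 1
        if 0 <= i < len(h_control) and h_control[i] > 0:
            weights.append(h_signal[i] / h_control[i])
        else:
            weights.append(0.0)  # outside the binning: event is dropped
    return weights


def weighted_mistag(is_wrong: Sequence[bool], weights: Sequence[float]) -> float:
    """Weighted fraction of wrongly tagged events in the control sample."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("sum of weights must be positive")
    return sum(w for wrong, w in zip(is_wrong, weights) if wrong) / total
```

In practice this would be applied separately to the TIS and TOS sub-samples, with the signal and control p_T spectra taken within the same category.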
In conclusion, this procedure is expected to allow the calibration of ω for the TIS and TOS sub-samples of signal, by using control channels. For the TOB sample, correlations between the signal and tagging B depend on the channel, and no solution has yet been defined for the comparison of ω between channels. Nevertheless, the fraction of such events is naturally low in LHCb (in the range 5-15%, depending on the channel).
Note that TIS and TOS categories are not exclusive, as a given event can be accepted by a number of different trigger lines. In order to avoid duplication of events, events which are TIS and TOS are placed in only one of the categories. As long as all events are consistently placed in the same category for control and signal channels, the calibration of ω remains unbiased.
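The assignment of each triggered event to exactly one category can be sketched as below. The text does not specify which category receives events that are both TIS and TOS, so the `prefer` argument (defaulting to TOS) is a hypothetical convention. The only requirement is that the same convention is used for control and signal channels. The sketch also assumes that every input event has passed the trigger, so that an event which is neither TIS nor TOS is TOB.

```python
from enum import Enum


class TriggerCategory(Enum):
    TOS = "TOS"  # triggered on signal
    TIS = "TIS"  # triggered independently of signal
    TOB = "TOB"  # trigger needed both signal and the rest of the event


def classify(is_tos: bool, is_tis: bool,
             prefer: TriggerCategory = TriggerCategory.TOS) -> TriggerCategory:
    """Assign a triggered event to a single, exclusive category.

    prefer: category given to events that are both TIS and TOS
            (hypothetical convention; must be TIS or TOS).
    """
    if prefer not in (TriggerCategory.TOS, TriggerCategory.TIS):
        raise ValueError("prefer must be TOS or TIS")
    if is_tos and is_tis:
        return prefer
    if is_tos:
        return TriggerCategory.TOS
    if is_tis:
        return TriggerCategory.TIS
    return TriggerCategory.TOB
```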

CONCLUSIONS
The algorithms developed for flavour tagging in LHCb will allow for an effective tagging efficiency of 9.5% and 5% for B^0_s and B^0 mesons, respectively. A strategy has been designed to allow a precise determination of the tagging performance without the use of Monte Carlo simulation, by using control channels. Comparison of tagging performance between channels will be possible via classification of signal and control events according to how each event was accepted by the trigger selection, followed by an event re-weighting that equalizes the momentum distributions of the B mesons between signal and control samples.