New Developments in Data-driven Background Determinations for SUSY Searches in ATLAS

High-energy jets, missing transverse energy ($E_T^{\mathrm{miss}}$) and possibly leptons are the typical signature of R-parity-conserving SUSY events at the LHC. The observation of an excess of events with these signatures with respect to the Standard Model prediction may indicate the presence of SUSY. Due to the poor knowledge of Standard Model cross sections, parton distribution functions, underlying event and parton showering at the LHC energy scale, as well as insufficient knowledge of the detector itself, a reliable prediction of the Standard Model backgrounds should be derived mainly from the experimental data, with reduced reliance on Monte Carlo simulations. In ATLAS the searches for a SUSY signal in events with a multijet and $E_T^{\mathrm{miss}}$ signature are classified according to the number of reconstructed leptons (electrons or muons). In this note we present an overview of data-driven methods recently developed by the ATLAS collaboration to estimate the Standard Model background in SUSY searches in the one-lepton mode. The detailed description of the cuts applied in the one-lepton mode can be found in [1]. The most important Standard Model processes contributing to the background in this mode are $t\bar{t}$ and W plus jets production.


INTRODUCTION
High-energy jets, missing transverse energy ($E_T^{\mathrm{miss}}$) and possibly leptons are the typical signature of R-parity-conserving SUSY events at the LHC. The observation of an excess of events with these signatures with respect to the Standard Model prediction may indicate the presence of SUSY. Due to the poor knowledge of Standard Model cross sections, parton distribution functions, underlying event and parton showering at the LHC energy scale, as well as insufficient knowledge of the detector itself, a reliable prediction of the Standard Model backgrounds should be derived mainly from the experimental data, with reduced reliance on Monte Carlo simulations. In ATLAS the searches for a SUSY signal in events with a multijet and $E_T^{\mathrm{miss}}$ signature are classified according to the number of reconstructed leptons (electrons or muons). In this note we present an overview of data-driven methods recently developed by the ATLAS collaboration to estimate the Standard Model background in SUSY searches in the one-lepton mode. The detailed description of the cuts applied in the one-lepton mode can be found in [1]. The most important Standard Model processes contributing to the background in this mode are $t\bar{t}$ and W plus jets production.
In SUSY events, the $E_T^{\mathrm{miss}}$ is produced mainly by the pair of non-interacting lightest supersymmetric particles, so the restriction that the transverse mass $m_T$ of the isolated lepton and $E_T^{\mathrm{miss}}$ stays below the W boson mass does not hold. Following this, a signal region is defined by $m_T > 100$ GeV, while the events with $m_T < 100$ GeV constitute the control sample. The number of Standard Model events in the signal region at high values of $E_T^{\mathrm{miss}}$ is derived from the control sample by applying the efficiency of the $m_T$ cut. This efficiency is estimated in the low-$E_T^{\mathrm{miss}}$ region (normalization region), where the contribution of SUSY events is expected to be low.
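The extrapolation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, argument names and event counts are hypothetical, and the "efficiency of the $m_T$ cut" is implemented here as the ratio of events passing to events failing the cut in the normalization region, which is one possible reading of the text.

```python
def mt_method_estimate(n_norm_low_mt, n_norm_high_mt, n_control):
    """Estimate the Standard Model yield in the signal region.

    n_norm_low_mt  : events with low E_T^miss and m_T < 100 GeV
    n_norm_high_mt : events with low E_T^miss and m_T > 100 GeV
    n_control      : events with high E_T^miss and m_T < 100 GeV (control sample)

    Assumption (not from the source): the m_T cut efficiency is applied
    as a pass/fail transfer ratio measured in the normalization region.
    """
    if n_norm_low_mt <= 0:
        raise ValueError("normalization region with m_T < 100 GeV is empty")
    transfer = n_norm_high_mt / n_norm_low_mt
    return n_control * transfer


# Hypothetical event counts, for illustration only.
estimate = mt_method_estimate(n_norm_low_mt=2000, n_norm_high_mt=200, n_control=500)
```

The estimate is only as good as the assumption that the transfer ratio measured at low $E_T^{\mathrm{miss}}$ also applies at high $E_T^{\mathrm{miss}}$, which is why the correlation between the two variables matters.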
The method is based on two assumptions. First, it requires a low contribution of SUSY events both in the control sample and in the normalization region. Second, it requires that $E_T^{\mathrm{miss}}$ and $m_T$ are uncorrelated. The first assumption might not hold, especially for low-mass SUSY. The second assumption is not strictly true, mainly because the fractions of semi-leptonic and di-leptonic top events depend on $m_T$. However, the correlation between $E_T^{\mathrm{miss}}$ and $m_T$ is small: the linear correlation factor is about 4%. Because of the SUSY contamination in the control sample and in the normalization region, the estimate delivered by this method lies above the true Standard Model background. The correlation between $E_T^{\mathrm{miss}}$ and $m_T$ leads to background underestimation for high-mass SUSY and to overestimation for low-mass SUSY.
One can correct for the signal contamination in the control sample in the following way. First, the background in the signal region is estimated, and the remaining events are taken as the signal estimate. Then the signal estimate is translated back to the control region ($m_T < 100$ GeV) and subtracted from the control sample before the background is re-estimated. Figure 1 compares the true Standard Model background to the estimated one in the presence of an mSUGRA SUSY signal with a mass scale of 600 GeV. The left (right) figure shows the comparison before (after) correcting the estimate for the SUSY contamination in the control sample. The correction procedure may not work for low-mass SUSY, as in this case the control sample contamination would be too large. Another disadvantage of the method is the need for an explicit assumption on the signal shape to translate it from the signal region to the control region.
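A minimal sketch of this correction, written as a fixed-point iteration, is given below. All names and numbers are hypothetical. The parameter `signal_ratio` stands for the assumed signal shape, expressed as the ratio of signal events in the control region to signal events in the signal region, and `transfer` is the extrapolation factor from the control sample to the signal region. Only the contamination of the control sample is corrected here, not that of the normalization region.

```python
def corrected_background(n_control, n_signal_region, transfer, signal_ratio, n_iter=50):
    """Iteratively remove the estimated signal from the control sample.

    transfer     : control-to-signal-region extrapolation factor (assumed known)
    signal_ratio : assumed (signal in control region) / (signal in signal region)
    """
    signal_in_control = 0.0
    background = n_control * transfer
    for _ in range(n_iter):
        # Background estimate from the control sample after signal subtraction.
        background = (n_control - signal_in_control) * transfer
        # Events left over in the signal region are taken as the signal estimate.
        signal = max(n_signal_region - background, 0.0)
        # Translate the signal estimate back to the control region.
        signal_in_control = signal_ratio * signal
    return background


# Toy numbers: true background 100 in the signal region, signal 50 there
# and 20 in the control region (signal_ratio = 0.4).
naive = 1020 * 0.1
corrected = corrected_background(n_control=1020, n_signal_region=150,
                                 transfer=0.1, signal_ratio=0.4)
```

In this toy the uncorrected estimate is 102 events while the iteration settles at the true value of 100. The iteration converges quickly when the product `transfer * signal_ratio` is small, and degrades when the control sample is heavily contaminated, in line with the limitation noted for low-mass SUSY.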

COMBINED FIT
This method [1] estimates separately the contributions of the various Standard Model processes. Each Standard Model process, semi-leptonic $t\bar{t}$, di-leptonic $t\bar{t}$ and W (+jets), is parametrized in terms of the distributions of three variables, $E_T^{\mathrm{miss}}$, $m_T$ and $m_{\mathrm{top}}$. The parametrization is based on the Monte Carlo prediction for the distribution of the variables, as well as on the analysis of dedicated control samples. The contribution of the SUSY signal is described with a "generic shape", derived from the shapes of the ATLAS benchmark SUSY points [2]. A simultaneous fit of the distributions of $E_T^{\mathrm{miss}}$, $m_T$ and $m_{\mathrm{top}}$ is performed for the sum of the background components and the "generic shape" of the SUSY contribution. The fit parameters include the normalizations of each background and of the SUSY contribution, as well as the parameters of the background parametrizations and of the SUSY "generic shape". Figure 2 shows the distributions of the variables $E_T^{\mathrm{miss}}$, $m_T$ and $m_{\mathrm{top}}$ compared to the chosen parametrization for the background components and the ATLAS mSUGRA benchmark point with a mass scale of 600 GeV. The advantage of this method with respect to the $m_T$-method is that the correlation between the variables is accounted for explicitly.
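The idea of fitting the normalizations of several components at once can be illustrated with a much simplified binned sketch. It is not the fit used in [1]: the shapes are held fixed here, whereas the combined fit also floats shape parameters, and the templates are invented numbers. Each bin is meant to stand for a joint bin in ($E_T^{\mathrm{miss}}$, $m_T$, $m_{\mathrm{top}}$), so that correlations are contained in the templates. The yields are obtained with an EM-style multiplicative update, a technique chosen for this sketch because its fixed point satisfies the maximum-likelihood condition of a Poisson template fit.

```python
def fit_yields(data, templates, n_iter=200):
    """Extended maximum-likelihood yields for fixed, unit-normalized templates.

    data      : list of observed counts per (joint) bin
    templates : dict name -> list of bin fractions summing to 1
    """
    n_bins = len(data)
    total = float(sum(data))
    # Start from equal yields; the update conserves the total number of events.
    yields = {name: total / len(templates) for name in templates}
    for _ in range(n_iter):
        expected = [sum(yields[k] * templates[k][i] for k in templates)
                    for i in range(n_bins)]
        yields = {
            k: yields[k] * sum(data[i] * templates[k][i] / expected[i]
                               for i in range(n_bins) if expected[i] > 0)
            for k in templates
        }
    return yields


# Hypothetical shapes and yields, for illustration only.
templates = {
    "background": [0.60, 0.30, 0.08, 0.02],
    "susy": [0.10, 0.20, 0.30, 0.40],
}
data = [1000 * b + 100 * s for b, s in zip(templates["background"], templates["susy"])]
fitted = fit_yields(data, templates)
```

With data built exactly from 1000 background and 100 signal events, the fit returns those yields. In a realistic application the templates would be replaced by the parametrized shapes and their parameters would be fitted together with the yields.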

TILES METHOD
In the Tiles Method [3] the $m_{\mathrm{eff}}$-$m_T$ plane is divided into quadrants, called tiles. Each tile has contributions from the SUSY signal and from the background. The fraction of Standard Model events entering each tile is estimated from Monte Carlo. The yields of SUSY and background events are estimated by solving the system of equations
\[
\begin{aligned}
n_A &= f_A\, n_{SM} + SUSY_A,\\
n_B &= f_B\, n_{SM} + SUSY_B,\\
n_C &= f_C\, n_{SM} + SUSY_C,\\
n_D &= f_D\, n_{SM} + SUSY_D,\\
SUSY_A \cdot SUSY_D &= SUSY_B \cdot SUSY_C.
\end{aligned}
\tag{1}
\]
Here A, B, C and D denote the four tiles, $n_X$ is the measured number of events in tile X, $f_X$ is the fraction of Standard Model events in tile X, $n_{SM}$ is the unknown number of Standard Model events in all tiles and $SUSY_X$ is the unknown number of SUSY events in tile X. The last equation, written here for tiles labelled so that A and D lie on one diagonal of the plane, comes from the assumption of no correlation between $m_{\mathrm{eff}}$ and $m_T$ for signal events. The correlation of the variables for the Standard Model contribution is fully taken into account by the values of $f_X$. The method can be generalized to an $N \times N$ tiles configuration. In such a configuration the system of equations (1) is over-constrained, and a solution can be found by minimizing the extended log-likelihood estimator constructed from the measured number of events in each tile and the expected number of events, calculated as the sum of the Standard Model and SUSY contributions.
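For the 2 × 2 case the system can be solved in closed form, as the following sketch shows. It assumes that the tiles are labelled so that the no-correlation condition for the signal reads $SUSY_A \cdot SUSY_D = SUSY_B \cdot SUSY_C$. Substituting $SUSY_X = n_X - f_X n_{SM}$ into that condition gives a quadratic equation in $n_{SM}$. The function name, the rule for picking the physical root and the toy numbers are illustrative choices, not taken from [3].

```python
import math


def solve_tiles(n, f):
    """Solve the 2x2 tiles system for n_SM and the SUSY yield per tile.

    n : dict tile -> measured number of events
    f : dict tile -> Standard Model fraction (from Monte Carlo), summing to 1
    """
    # (n_A - f_A x)(n_D - f_D x) = (n_B - f_B x)(n_C - f_C x), with x = n_SM
    a = f["A"] * f["D"] - f["B"] * f["C"]
    b = -(n["A"] * f["D"] + n["D"] * f["A"] - n["B"] * f["C"] - n["C"] * f["B"])
    c = n["A"] * n["D"] - n["B"] * n["C"]
    if abs(a) < 1e-12:
        # Uncorrelated Standard Model fractions: the equation becomes linear.
        candidates = [-c / b] if b != 0 else []
    else:
        disc = b * b - 4 * a * c
        root = math.sqrt(disc) if disc >= 0 else None
        candidates = [] if root is None else [(-b + root) / (2 * a), (-b - root) / (2 * a)]
    for n_sm in candidates:
        susy = {x: n[x] - f[x] * n_sm for x in "ABCD"}
        # Keep the root with non-negative yields everywhere.
        if n_sm >= 0 and all(s >= -1e-9 for s in susy.values()):
            return n_sm, susy
    raise ValueError("no physical solution")


# Toy input: 1000 Standard Model events plus a factorizing signal of 200 events.
fractions = {"A": 0.50, "B": 0.30, "C": 0.15, "D": 0.05}
counts = {"A": 510, "B": 330, "C": 190, "D": 170}
n_sm, susy = solve_tiles(counts, fractions)
```

In this toy the second root of the quadratic is negative and is rejected, leaving $n_{SM} = 1000$ and SUSY yields of 10, 30, 40 and 120 events in tiles A to D.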
The statistical error of the method depends on the tiles configuration and can be optimized for a given luminosity. With a small number of tiles the system has only a few degrees of freedom, and the statistical fluctuations of the number of measured events in the tiles have a large influence on the estimated yields of signal and Standard Model events. In a configuration with many tiles, the statistics in each tile are low, and thus the statistical error of the estimate is high. The optimal number of tiles for a luminosity of 1 fb$^{-1}$ is found to be $8 \times 8$. A configuration with many tiles has the further advantage that new parameters can be added to the fit, for example a linear correlation factor or a separation of the various background contributions.
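The extended log-likelihood used in the $N \times N$ configuration can be sketched as follows. The signal is parametrized here as a total yield times independent fractions along $m_{\mathrm{eff}}$ and $m_T$, which is one way to encode the no-correlation assumption and is a choice made for this sketch. All names and numbers are hypothetical, and the minimization itself is left to any standard minimizer.

```python
import math


def tiles_nll(observed, sm_fraction, n_sm, n_susy, p_meff, p_mt):
    """Negative extended log-likelihood, up to a constant, for N x N tiles.

    observed    : N x N measured counts
    sm_fraction : N x N Standard Model fractions (from Monte Carlo)
    p_meff, p_mt: signal fractions along each axis, each summing to 1
    """
    nll = 0.0
    for i, row in enumerate(observed):
        for j, n_obs in enumerate(row):
            # Expected events: Standard Model plus factorizing SUSY contribution.
            expected = sm_fraction[i][j] * n_sm + n_susy * p_meff[i] * p_mt[j]
            # Poisson term; the constant log(n_obs!) is dropped.
            nll += expected - n_obs * math.log(expected)
    return nll


# Toy 3 x 3 configuration with "observed" counts set to the true expectation.
sm_fraction = [[0.30, 0.15, 0.05], [0.20, 0.10, 0.05], [0.08, 0.05, 0.02]]
p_meff = [0.1, 0.3, 0.6]
p_mt = [0.2, 0.3, 0.5]
observed = [[sm_fraction[i][j] * 1000 + 200 * p_meff[i] * p_mt[j] for j in range(3)]
            for i in range(3)]
nll_true = tiles_nll(observed, sm_fraction, 1000, 200, p_meff, p_mt)
nll_shifted = tiles_nll(observed, sm_fraction, 900, 300, p_meff, p_mt)
```

With the counts set to their expectation, the function is smallest at the true yields, so `nll_true` lies below `nll_shifted`. Additional fit parameters, such as a correlation factor, would enter through the expression for `expected`.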

CONCLUSION
Several data-driven methods for estimating the Standard Model background in one-lepton SUSY searches have been presented. The conventional $m_T$-method has been studied and several developments have been proposed. The overestimate caused by the presence of SUSY events in the control sample can be corrected by an iterative procedure. The combined fit technique and the Tiles Method explicitly account for the correlation between the variables and for the presence of SUSY.
$m_T$-METHOD: REVERSING THE $m_T$ CUT
In Standard Model processes the main source of $E_T^{\mathrm{miss}}$ and isolated leptons is the decay of the W boson (produced directly in the pp collisions or in top quark decays). Consequently, the invariant mass constructed from the transverse component of the isolated lepton momentum and $E_T^{\mathrm{miss}}$ (called the transverse mass, $m_T$) should not exceed the mass of the W boson in a Standard Model event.
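As an illustration of the transverse mass defined above, the sketch below uses the standard expression for a massless lepton, $m_T = \sqrt{2\, p_T^{\ell}\, E_T^{\mathrm{miss}}\, (1 - \cos\Delta\phi)}$, where $\Delta\phi$ is the azimuthal angle between the lepton and the missing transverse energy. The function names are hypothetical. The 100 GeV threshold is the one used to separate the signal region from the control sample, and assigning events exactly at the threshold to the control sample is an arbitrary choice of this sketch.

```python
import math


def transverse_mass(lepton_pt, lepton_phi, met, met_phi):
    """Transverse mass in GeV of a (massless) lepton and E_T^miss."""
    dphi = lepton_phi - met_phi
    # Guard against tiny negative values from rounding before the square root.
    return math.sqrt(max(2.0 * lepton_pt * met * (1.0 - math.cos(dphi)), 0.0))


def region(m_t, threshold=100.0):
    """Classify an event by its transverse mass (GeV)."""
    return "signal" if m_t > threshold else "control"


# A lepton and E_T^miss of 40 GeV each, back to back in azimuth.
m_t_example = transverse_mass(40.0, 0.0, 40.0, math.pi)
```

For the back-to-back configuration shown, the transverse mass is 80 GeV, close to the W boson mass, so the event falls in the control sample.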
FIGURE 2. Distributions of $E_T^{\mathrm{miss}}$, $m_T$ and $m_{\mathrm{top}}$ in semi-leptonic and di-leptonic $t\bar{t}$ events, W (+jets) events and SUSY events for the mSUGRA point with a SUSY scale of 600 GeV. The superimposed lines are the parametrizations used in the combined fit method.