ATLAS pixel detector timing optimisation with the back of crate card of the optical pixel readout system

As with all detector systems at the Large Hadron Collider (LHC), the assignment of data to the correct bunch crossing, where bunch crossings will be separated in time by 25 ns, is one of the challenges for the ATLAS pixel detector. This document explains how the detector system will accomplish this by describing the general strategy, its implementation, the optimisation of the parameters, and the results obtained during a combined testbeam of all ATLAS subdetectors.


Context
The ATLAS experiment is one of the general purpose detectors at the LHC that will begin operating in 2007. After starting at a center-of-mass energy of 900 GeV, the energy will be increased to 14 TeV. One of the major challenges for the experiments at the LHC is the high bunch crossing rate which is needed to observe rare physics events. After a short ramp-up phase, the bunch crossing rate will reach 40 MHz. This imposes the requirement that the detector data is assigned to the right 25 ns timing window with high efficiency.
The ATLAS pixel detector will act as the vertex detector within the ATLAS experiment. Although not included in the first level trigger system, it will be the device used to tag b-jets by finding secondary vertices. This is a key technique used in many physics analyses being prepared for the coming years. For this purpose, the spatial resolution of the pixel detector is the most important feature, which is why over 90% of the approximately 90 million data acquisition channels of the ATLAS experiment are concentrated in the pixel subsystem, which occupies only 1/400000th of the volume.
To have any chance of exploiting sophisticated tracking algorithms, the hits detected by the sensor need to be assigned to the correct bunch crossing. As the pixel detector will not be read out every clock cycle, hits wrongly assigned are lost or, if the event to which they have been incorrectly assigned is read out, they will appear as ghost hits.

JINST 2 P04003

For the ATLAS pixel detector, as the innermost tracking device, there are several contributions to the timing to be taken into account. Because the sensor is a silicon detector, the collection time of the charge released by the passing particles is below 10 ns [8]. This will degrade with irradiation, but can be treated as a constant offset for all pixel cells. The processing of the signal within the first electronic stages is limited by the permitted power budget.
The timing of each individual ATLAS pixel detector module is important in the sense that the individual clock phases need to be adjusted with a precision of better than 1 ns, so as not to lose detection efficiency or produce ghost hits by assigning them to the wrong bunch crossing. To be able to separately adjust each of the 1744 modules within the pixel system, each is driven by its own clock. To cope with all contributions to the timing differences, this clock phase can be delayed in steps of 300 ps with respect to the clock delivered from the LHC, to which all internal pixel system clocks will refer. This report shows the results of a study made during the ATLAS combined test beam using a beam structured in bunches synchronized to a 40 MHz frequency, as will be the case with the LHC. A technique to determine the optimal setting for the clock phases is described.

The detector system
As a high energy physics general purpose detector, ATLAS provides muon spectrometry, high performance calorimetry, and a very high resolution tracking system. The innermost part of the tracking system is the pixel detector [1]. It is comprised of 1744 hybrid detector modules. Each module is an assemblage of: a 2 × 6 cm² silicon sensor, implementing 46080 sensor cells, most of 50 × 400 µm² area; 16 readout chips per module, designed in a radiation tolerant deep sub-micron process; and a Module Controller Chip (MCC) to steer the module, mounted on top of a flexible circuit providing the intra-module connectivity. As particles traverse the detector, a hit is detected in the sensor. The electrons collected are read out by front end chips, in which the signal is amplified and digitized. The energy information of the signal is translated into Time over Threshold (ToT) information in the digital signal. In the event of a trigger, the MCC collects and formats the data of the chips and sends it to the readout system.
The modules are connected to the readout via an optical link. The on-detector end of this link is formed by the optoboards, which are connected to the modules electrically. Each optoboard serves six or seven modules. The complete readout chain is described in the next section.

ATLAS pixel readout system
The off-detector readout electronics is connected to the on-detector electronics via 80 m long optical fibres. Clock, commands, and physics data are transmitted via these lines. The signals to the modules are generated at the Readout Driver (ROD) and converted into optical signals on the Back of Crate card (BOC). Clock and command data for each individual module are combined into a BiPhase Mark (BPM) signal and sent optically to an optoboard, where the decoding is performed. The optoboard forwards the data and the clock to the module.
In the reverse direction, the module data are transferred to the optoboard, converted into optical signals, and sent to the Back of Crate card where the data is received and prepared for further formatting and event building.
The Back of Crate cards and the Readout Driver are paired in VME crates. In total, 132 pairs connected to 272 optoboards are needed for the detector. For each module there is one fibre for the Timing, Trigger and Control data (TTC), and one or two fibres for the module data.
The data recorded by the pixel detector modules is buffered inside the front end chips for 2.5 µs. If after this time a level one (LV1) trigger appears, the data is read out by the MCC, formatted, and sent to the readout electronics. The trigger received by the MCC initiates the readout of the data which corresponds to the triggered bunch crossing. Therefore, in the front end chip a timestamp is stored for the data and the trigger. If these timestamps match, taking into account a constant latency between data and trigger, the data is read out.
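The timestamp matching described above can be illustrated with a minimal sketch. The 8-bit counter width, the dictionary layout of a hit, and the function name are assumptions made for illustration only; the latency of 100 clock cycles follows from 2.5 µs at 25 ns per cycle. This is not the actual front end logic.

```python
COUNTER_BITS = 8                  # assumed width of the timestamp counter
COUNTER_MOD = 1 << COUNTER_BITS
LATENCY = 100                     # clock cycles: 2.5 us buffer / 25 ns per cycle

def select_hits(buffered_hits, trigger_timestamp, latency=LATENCY):
    """Return the hits whose timestamp plus the latency equals the trigger timestamp."""
    # Subtract the constant latency and wrap around the finite counter.
    wanted = (trigger_timestamp - latency) % COUNTER_MOD
    return [hit for hit in buffered_hits if hit["timestamp"] == wanted]

# Hypothetical buffer content: three hits stamped with their bunch crossing counter.
hits = [
    {"row": 10, "col": 3, "timestamp": 200},
    {"row": 11, "col": 3, "timestamp": 201},
    {"row": 50, "col": 7, "timestamp": 250},
]

# A trigger stamped 44 (after counter wrap-around) points back to cycle 200,
# because (44 - 100) mod 256 = 200.
selected = select_hits(hits, trigger_timestamp=44)
```

Hits stamped one cycle early or late are not selected, which is why a wrongly assigned hit is either lost or shows up as a ghost hit in a neighbouring event.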
The data format contains a header, LV1 Trigger ID, bunch crossing ID, and chip identifier, followed by the row, column, and time over threshold (ToT) information of each hit pixel. This structure is repeated for each chip on the module and it is finalised by a trailer. The LV1 Trigger ID and bunch crossing ID are module internal counters which are controlled and reset by the Readout Driver. The ROD identifies the module data corresponding to a given trigger by these identifiers. It assembles and formats the data of all modules for the same trigger into one event, including the event identifiers required for the offline analysis [2].
The readout electronics of the ATLAS pixel detector consists of four major parts:
• The Timing Interface Module (TIM) [3], which receives the ATLAS-wide timing, trigger, and control signal and distributes it to the BOC cards as clock and command information.
• The ROD, which
  - prepares the control data sent to the modules,
  - formats the data sent by the modules (event fragment building), and
  - performs on-line histogramming and calibration.
  To fulfil this, the ROD makes use of a chain of FPGAs for treating the data streams, and independent DSPs for the on-line monitoring and calibration calculations.
• The BOC, which acts as the off-detector opto-electrical interface. This is the main component for the timing. The modules are served with individual clock signals, which are derived from the TTC clock as received from the TIM.
• The optoboard, acting as the on-detector opto-electrical interface.

Back of crate card
This card was developed under the leadership of the Cavendish Laboratory, Cambridge (UK), for use in the ATLAS Silicon Tracker and adapted to pixel detector usage by the authors. Its main function is the opto-electrical conversion of the signals sent to and received from the detector [6]. Conversion and alignment of received data for the first processing stages are done here.
For the steering of the detector, clock and command data for the modules are combined into a single biphase mark encoded signal, which is sent optically to the optoboard. In figure 2 the principle of the encoding is shown. The clock for all modules connected to one BOC is generated out of the received system clock on the BOC [4].
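As an illustration of how clock and command data can share one line, the following sketch implements generic biphase mark coding: the level toggles at every bit boundary (which carries the clock), and a second toggle in the middle of the bit period encodes a 1. The polarity convention and the representation as two half-period levels per bit are assumptions; the exact convention used on the BOC is the one shown in figure 2.

```python
def bpm_encode(bits, level=0):
    """Encode bits as biphase mark: two half-period levels per bit."""
    out = []
    for bit in bits:
        level ^= 1          # transition at every bit boundary (carries the clock)
        out.append(level)
        if bit:
            level ^= 1      # extra mid-bit transition encodes a 1
        out.append(level)
    return out

def bpm_decode(halves):
    """Recover the bits: a bit is 1 when its two half-period levels differ."""
    return [int(halves[i] != halves[i + 1]) for i in range(0, len(halves), 2)]

command = [1, 0, 1, 1, 0]       # hypothetical command bit stream
signal = bpm_encode(command)
recovered = bpm_decode(signal)
```

Because a transition occurs at every bit boundary regardless of the data, the receiver can recover the clock from the same signal that carries the commands.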
On the detector, the optoboard [7] decodes the BPM signal into clock and command data, which is sent electrically to the modules. As the decoding is done synchronously, this enables the adjustment of the phase of the clock applied to each individual module. For this, the output stage of the BOC is able to delay each individual output stream (one per module) in steps of 300 ps, up to a maximum of 40 ns. This will allow one to compensate for the differences in cable lengths used, the variations in time of flight caused by the relative positions of the modules, and other module to module runtime variations.
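The granularity of the adjustment can be made concrete with a short sketch that converts a desired clock delay into a number of 300 ps steps. The function name and the idea of an integer step count as the setting are assumptions for illustration; only the step size and the 40 ns range come from the text.

```python
STEP_NS = 0.3          # delay granularity quoted in the text (300 ps)
MAX_DELAY_NS = 40.0    # maximum delay quoted in the text

def delay_setting(target_ns):
    """Return (number of steps, realised delay in ns) closest to target_ns."""
    if not 0.0 <= target_ns <= MAX_DELAY_NS:
        raise ValueError("target delay outside the adjustable range")
    steps = round(target_ns / STEP_NS)
    # Never exceed the largest whole number of steps inside the range.
    steps = min(steps, int(MAX_DELAY_NS / STEP_NS))
    return steps, steps * STEP_NS

steps, realised = delay_setting(18.4)   # example target delay in ns
```

With this granularity the realised delay differs from the target by at most half a step (150 ps), well inside the required precision of better than 1 ns.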

Front end electronics
The analogue part of each pixel electronics cell is made of a current feedback amplifier, a pulse shaper, and a discriminator applying a threshold to eliminate noise.

Time over threshold
The current feedback of the amplifier inside the pixel electronic circuit allows for measuring the charge collected within the connected sensor cell by counting the number of clock cycles the signal stays above threshold. Therefore, two timestamps are stored for each hit: one for the time at which the amplifier signal rises above the threshold, and one for the time at which it falls below the threshold again. The time for which the signal is above threshold can then be calculated in units of clock cycles inside the front end logic.
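A minimal sketch of this calculation follows. The 8-bit counter width and the function name are assumptions for illustration; the point is only that the difference of the two timestamps has to be taken modulo the counter range.

```python
COUNTER_MOD = 256   # assumed 8-bit timestamp counter

def time_over_threshold(leading_edge, trailing_edge):
    """ToT in clock cycles from the two stored timestamps, allowing for wrap-around."""
    return (trailing_edge - leading_edge) % COUNTER_MOD

tot_simple = time_over_threshold(100, 130)   # no wrap-around
tot_wrapped = time_over_threshold(250, 24)   # counter wrapped between the two edges
```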
Nominally, the feedback current is adjusted to result in a ToT value of 30 clock cycles (of 25 ns) for a charge corresponding to the traversal of a minimum ionizing particle (20 ke⁻). It is important to realize that for most hits, the charge will be shared by up to four detector cells. Thus, this charge measurement improves the spatial resolution [5] by means of a centroid algorithm which will be applied off-line. The ratio of the ToT values of the hit pixel cells may vary between equal sharing and one cell with a high ToT plus up to three cells with rather low values. The most interesting case is that of four cells with equal amounts of charge and, therefore, equal ToT readings. If it is not possible to register this amount of charge, the hit would be lost for the off-line analysis.
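The idea of a centroid algorithm can be sketched as follows. Weighting the pixel centres linearly by their ToT is an assumption made for illustration; the off-line algorithm may use a calibrated charge or a different weighting. The pixel coordinates are hypothetical and based on the 50 × 400 µm cell size.

```python
def tot_weighted_centroid(cluster):
    """cluster: list of (x_um, y_um, tot) tuples for the pixels of one hit."""
    total = sum(tot for _, _, tot in cluster)
    if total == 0:
        raise ValueError("cluster carries no ToT information")
    x = sum(x_um * tot for x_um, _, tot in cluster) / total
    y = sum(y_um * tot for _, y_um, tot in cluster) / total
    return x, y

# Two neighbouring cells in the 50 um direction, sharing the charge 2:1.
two_pixel = tot_weighted_centroid([(25.0, 200.0, 20), (75.0, 200.0, 10)])

# Four cells with equal ToT: the centroid is the common corner of the cells.
four_pixel = tot_weighted_centroid(
    [(25.0, 200.0, 7), (75.0, 200.0, 7), (25.0, 600.0, 7), (75.0, 600.0, 7)]
)
```

In the two-pixel case the reconstructed position is pulled towards the cell with the larger ToT, which is how the charge measurement improves on the bare pixel pitch.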
The averaged translation of the measured ToT values into the amount of charge collected is given in figure 3. The ToT is measured in number of clock cycles; thus, in discrete numbers. The calibration of the translation into charge is sensitive to many parameters, such as supply voltages and temperature. Therefore, this translation is not applied in the following, but rather all analyses are done in ToT values.

Time walk
The rising edge of the pulses at the output of the amplifier and pulse shaper stage in each pixel circuit of the front end electronics has a slope that depends on the signal amplitude. Low amplitude signals are flatter than high amplitude pulses. This leads to a difference in the time needed to cross the discriminator threshold for signals of different amplitude. In the ATLAS pixel system this difference can be up to three clock cycles.
In figure 4, this is illustrated by superimposing a high and a low output signal of the amplifier/shaper stage of a pixel. The slope of the falling part of the signal can be adjusted by the feedback current of the amplifier. This enables the tuning of the ToT. This tuning is performed by changing DAC values inside the front end chips and can therefore be done on the installed modules. The tuning provides a detector in which each pixel cell reacts similarly to hits with the same energy deposition in the connected sensor cell. To first order, one can presume that the maximum output of the analogue stage is reached at the same time after particle traversal, independent of the amount of charge collected.
Due to the time-walk effect, meaning that low energy hits will be registered later in time, the assigned time stamp for crossing the threshold might belong to the next clock cycle and with this to the next bunch crossing. This will cause a shift of hits with a ToT below a certain value into following events/bunch crossings. This is unavoidable, but needs to be minimized.
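The effect can be made concrete with a toy model. It assumes a rising edge that is linear from zero to the pulse maximum and a maximum reached at a fixed time, as presumed to first order above. The peaking time of 30 ns, the threshold of 3 ke, and the function names are illustrative assumptions, not measured values.

```python
import math

CLOCK_NS = 25.0   # length of one clock cycle

def crossing_time(amplitude, threshold, t_peak_ns):
    """Time at which a linearly rising pulse crosses the threshold, or None."""
    if amplitude <= threshold:
        return None   # the pulse never fires the discriminator
    return t_peak_ns * threshold / amplitude

def assigned_crossing(t_cross_ns, clock_phase_ns):
    """Index of the 25 ns window the hit falls into, relative to the true crossing."""
    return math.floor((t_cross_ns - clock_phase_ns) / CLOCK_NS)

# Illustrative values: threshold 3 ke, pulse maximum reached after 30 ns.
high = crossing_time(20.0, 3.0, 30.0)   # large signal crosses early
low = crossing_time(3.5, 3.0, 30.0)     # small signal crosses late

bc_high = assigned_crossing(high, clock_phase_ns=0.0)
bc_low = assigned_crossing(low, clock_phase_ns=0.0)
```

In this toy example the large signal is stamped with the correct clock cycle, while the small one crosses the threshold more than 25 ns later and is stamped with the following cycle. Changing `clock_phase_ns` moves the window and with it the amplitude below which hits are lost to the next bunch crossing.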

Measurement
In 2004, a slice of the barrel section of the ATLAS experiment was built up at CERN. The different subdetector parts were situated similarly to their configuration in the ATLAS detector. In front of the innermost detector part, the pixel detector, a wire target was installed. Measurements with different kinds of particles (electrons, protons, pions) at energies between 1 GeV and 350 GeV have been performed. The measurements presented here were done using a pion beam with an energy of around 100 GeV. The pixel subsystem itself was comprised of six detector modules arranged in three layers. Each module had an incident angle consistent with what will exist in the barrel section of the final system. This results in a most likely cluster size of two to three pixels per traversing particle. A prototype of the trigger system was installed in the combined test beam setup. It enabled the recording of track data with the detector system.
The measurements shown here were obtained with the pixel subsystem only. For these measurements, an additional feature of the pixel detector front end readout electronics was used. For each trigger received by the module, it is possible to cause internally up to 16 consecutive clock cycles to be read out. (Because of bandwidth limits, it will not be possible to make use of this feature at the occupancy expected in the ATLAS environment.) As the trigger signal has a fixed latency of 100 clock cycles, this can be used to read out not only the one clock cycle (bunch crossing) that caused the trigger, but also a selectable number before and after this trigger cycle. For the presented measurements, 8 consecutive clock cycles were read out of the pixel detector modules per received LV1 trigger. In these data, all hits are spread over three clock cycles.

Efficiency
One may define a registering efficiency as the fraction of hits with a given ToT value assigned to the expected bunch crossing, relative to all hits with this ToT. For one ToT value, the dependence of this efficiency on the delay is shown in figure 7. Three values can be extracted from this. One is the height of the plateau, which is the efficiency one can reach for the ToT value inspected. The other two are the delay settings at which 50% efficiency is reached. For the left edge, this is called the "in-time delay", while for the right edge it is called the "out-of-time delay". The difference between these values is the assignment time window defined by the clock cycle, which is 25 ns.
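The extraction of the two edges can be sketched as follows. The linear interpolation between scan points, the function name, and the synthetic scan values are assumptions for illustration; the actual analysis may fit an edge function instead.

```python
def half_efficiency_edges(delays, efficiencies, level=0.5):
    """Return (in_time_delay, out_of_time_delay) by linear interpolation."""
    rising = falling = None
    points = list(zip(delays, efficiencies))
    for (d0, e0), (d1, e1) in zip(points, points[1:]):
        if rising is None and e0 < level <= e1:
            # Left edge: efficiency rises through the level.
            rising = d0 + (level - e0) * (d1 - d0) / (e1 - e0)
        elif rising is not None and falling is None and e0 >= level > e1:
            # Right edge: efficiency falls back through the level.
            falling = d0 + (level - e0) * (d1 - d0) / (e1 - e0)
    return rising, falling

# Synthetic delay scan for one ToT value (delays in ns).
delays = [0, 5, 10, 15, 20, 25, 30, 35, 40]
effs = [0.0, 0.2, 0.8, 0.9, 0.9, 0.9, 0.8, 0.2, 0.0]

in_time, out_of_time = half_efficiency_edges(delays, effs)
window = out_of_time - in_time
```

For this synthetic scan the two edges are 25 ns apart, as expected for a window defined by one clock cycle; the plateau height is simply the maximum of the efficiencies between the two edges.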
If we now inspect the in-time delay for different ToT values, we can fit a function describing the timewalk behaviour of the front end electronics. This function approximates the front end behaviour by presuming that the time at which the maximal output of the amplifier stage is reached is independent of the amplitude (see the signals sketched in figure 4). Geometrically, the in-time delay is then determined by t_0, the time of the maximum signal; a, the amplitude of the signal; s, the threshold applied; and c, the offset at which hits with an "infinite" ToT are registered in time. The minimal runtime is fitted as 18.4 ± 0.15 ns in the example shown. Adding the registering window width of 25 ns, one sees that low ToT hits will be registered too late. This is unavoidable.
To optimise the delay setting, a map of all registering efficiencies for all ToT-delay pairs is evaluated in figure 9. One can think of figure 7 as a horizontal slice of this map. The curves shown in the map are derived from the fit to the in-time delay (figure 8) and the registering window width. As one can see, they agree well with the borders of the high-efficiency region, except for very low ToT values, which drop below the in-time (lower) line. Very low ToT values mean that very little energy was deposited in the pixel sensor cell, and hence that little charge was released. For these small pulses inside the electronics, the assumption that the time of maximum output of the amplifier is independent of the amplitude does not hold, because the discharging of the capacitor (loaded by the amplifier signal) smears this out.
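The geometric relation for the in-time delay can be written out under the stated assumption. The form below is a reconstruction from the definitions of t_0, a, s and c given above, additionally assuming a rising edge that is linear from zero to the amplitude a at time t_0. It is a sketch and not necessarily the exact expression used for the fit in figure 8, which is performed against ToT and therefore also needs a relation between a and ToT.

```latex
% Similar triangles: a linear rise that reaches a at t_0 reaches the threshold s at t_0 * s / a.
t_{\text{in-time}}(a) \;=\; t_0\,\frac{s}{a} \;+\; c ,
\qquad
\lim_{a \to \infty} t_{\text{in-time}}(a) \;=\; c
```

The limit reproduces the role of c as the offset for hits with "infinite" ToT, and the first term grows as the amplitude approaches the threshold, which is the timewalk.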

The optimal delay setting may now be found at the maximum of figure 10. Here, the sum over all existing ToT values of the efficiencies for a given delay is plotted. The efficiency is the fraction of the registered hits in the correct trigger window over all registered hits. For a given delay, this fraction is ToT-dependent. The plot shows the normalised sum of the efficiencies for all ToT values between 0 and 60 against the delay. This is the integral of the vertical slices of figure 9. The fitted curve is parametrised piecewise in the delay, with a separate coefficient for each region (c_1 left of the first kink, c_2 between the kinks, and c_3 right of the second kink). In this parametrisation, ε_tot is the total efficiency, ToT_1(t) is the in-time ToT value for time t (as given by the left curve in figure 9), ToT_2(t) is the out-of-time ToT value for time t (as given by the right curve in figure 9), ToT_max is the maximal ToT which is included and therefore a normalisation factor, and the constant c is a normalisation coefficient taking into account that the efficiency cannot be 100%, because many hits are off-time for each ToT and low ToT hits are always registered in the next bunch crossing. This coefficient c also reflects the three different areas of the curve:
1. the area left of the first kink, where no data is recorded (c_1),
2. the area between the two kinks of the curve, where more and more data are registered in-time (c_2 is fitted), and
3. the area right of the second kink, where the data become out-of-time again (c_3 is fitted).
The function takes into account the values obtained by the fit to figure 8 and the fact that the ToT range is limited.
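The summation over ToT values can be sketched directly on a measured efficiency map, without the fitted parametrisation. The data layout (a dictionary from ToT value to a list of efficiencies per delay), the function name, and the numbers are assumptions for illustration only.

```python
def optimal_delay(delays, efficiency_map):
    """efficiency_map[tot][i] is the registering efficiency for ToT value tot at delays[i]."""
    n_tot = len(efficiency_map)
    # Normalised sum over all ToT values, one entry per delay setting.
    summed = [
        sum(row[i] for row in efficiency_map.values()) / n_tot
        for i in range(len(delays))
    ]
    best = max(range(len(delays)), key=summed.__getitem__)
    return delays[best], summed

# Synthetic map: low ToT hits become in-time only at larger delays,
# high ToT hits drop out of the window again at the largest delay.
delays = [10, 20, 30, 40]
efficiency_map = {
    5: [0.0, 0.0, 0.5, 0.9],
    20: [0.0, 0.6, 0.9, 0.9],
    40: [0.2, 0.9, 0.9, 0.3],
}

best_delay, summed = optimal_delay(delays, efficiency_map)
```

In this toy map the maximum lies at the delay where most low ToT hits are already in time while the high ToT hits have not yet left the window, which mirrors the two kinks of the curve described above.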

Results
As figure 11 demonstrates, with the delay setting obtained by the method described above it is possible for nearly all hits to be registered correctly, except for those of very low charge. This does not affect the detection efficiency, as these hits with a very low amount of charge are primarily caused by unequal charge sharing between neighbouring sensor cells. Figure 12 depicts the efficiency with which a hit can be assigned to the correct time window as a function of ToT value.

Conclusion
A method to optimise the clock phases of the ATLAS pixel detector modules with respect to the correct assignment of hits to bunch crossings has been implemented and tested. It measures the in-time and out-of-time delays and determines the delay setting of maximum efficiency. With this setting, nearly all hits except those of very low charge are assigned to the correct bunch crossing. Further modifications are needed to apply this method to the high occupancy environment of the LHC. While calculations of the signal delay can be used to set the trigger latency approximately to the correct region, an optimisation will need to be done with the data acquisition system. For this purpose, an inverted method minimizing the number of hits which are registered in LHC empty clock cycles (corresponding to the gaps of the LHC proton bunch structure) is envisaged. Triggering on these gaps and minimizing the recorded hits (only very low ToT hits will be in this trigger window) is a promising method to find the optimal clock settings.