Trigger Region Unit for the ALICE PHOS calorimeter

Hans Muller\textsuperscript{1}, Alexandra Oltean\textsuperscript{1}, Rui Pimenta\textsuperscript{1}, Dieter Rohrich\textsuperscript{2}, Bernhard Skaali\textsuperscript{3}, Xi Cao\textsuperscript{4}, Qingxia Li\textsuperscript{4}

\textsuperscript{1} CERN, Division PH, 1211 Geneva 23, Switzerland, \textsuperscript{2} University of Bergen, Department of Physics and Technology, Norway, \textsuperscript{3} University of Oslo, Department of Physics, Norway, \textsuperscript{4} Huazhong University of Science and Technology, HUST, Wuhan, People's Republic of China

Abstract

The Photon Spectrometer (PHOS) of ALICE measures electromagnetic showers of up to 100 GeV via a large matrix of PWO crystals, each read out by an APD. Trigger regions consist of 28*16 crystals, inter-connected via analogue signals generated on front-end cards and transmitted to Trigger Region Units (TRU) which digitize and process the analogue hit information. Eight TRU cards are embedded inside each PHOS module in water-cooled cassettes, each inserted within a block of 14 FEE readout cards. Analogue sums are generated by fast summing shapers, with their outputs connected to the TRU via equal-length differential cables. The TRU receives analogue sums on 112 inputs and digitizes these via 12 bit ADCs which are inter-connected with a central FPGA via serial LVDS links. The level-0 and level-1 trigger algorithms are based on pipelined charge summing over 4 consecutive samples and over 4*4 crystal windows. Low-latency level-0 decisions and more refined level-1 decisions are generated as a 40 MHz Yes/No sequence which is transmitted to the ALICE Central Trigger Processor. Reconfiguration logic allows detecting and correcting single event upsets even during trigger operation.

I. PHOS TRIGGER AND READOUT ELECTRONICS

The TRU trigger and FEE readout cards of PHOS have been designed for complementary functions and share the same readout bus and the same power connections. They are operated together within a closed, water-cooled volume inside the PHOS module.

A. PHOS Trigger Approach

The first ideas on PHOS trigger were described in [1]. The key parameters for implementation of the TRU are summarized in Table 1.

<table>
<thead>
<tr>
<th></th>
<th>Latency in FPGA</th>
<th>Method</th>
<th>NRZ</th>
<th>Outputs</th>
</tr>
</thead>
<tbody>
<tr>
<td>Level-0</td>
<td>300 ns</td>
<td>2D: 4*4 crystals, Time: 4 samples, Low threshold OR</td>
<td>1</td>
<td>n.a.</td>
</tr>
<tr>
<td>Level-1</td>
<td>5500 ns</td>
<td>3 high $p_T$ thresholds</td>
<td>3</td>
<td>4 consecutive samples of 8*14 region</td>
</tr>
</tbody>
</table>

Efficiency losses along the TRU region boundaries account for ca. 5%. Simulations have shown [2] that at a 50 MeV threshold, single channel counting rates are up to 120 Hz. Conditioned by the low-threshold level-0 decisions, the level-1 trigger channel rates are of the order of 1 Hz for thresholds above 0.5 GeV. With a total of 17920 channels, the expected aggregate level-1 trigger rate is of the order of 20 kHz.
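
As a rough cross-check of the aggregate rate, the per-channel level-1 rate can simply be multiplied by the channel count. The sketch below assumes exactly 1 Hz per channel, whereas the text only gives an order of magnitude, so the result is indicative only.

```python
# Order-of-magnitude cross-check of the aggregate level-1 rate.
# Assumption: every channel contributes exactly 1 Hz (the text says "order of 1 Hz").
N_CHANNELS = 17920
L1_RATE_PER_CHANNEL_HZ = 1.0

aggregate_l1_rate_hz = N_CHANNELS * L1_RATE_PER_CHANNEL_HZ
print(f"aggregate level-1 rate: {aggregate_l1_rate_hz / 1e3:.1f} kHz")
```

The result of about 18 kHz is consistent with the quoted order of 20 kHz.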

Figure 1: Timing sequence of the level-0 trigger

B. Trigger Region Unit (TRU)

The PHOS trigger consists of 40 TRU modules, each processing the analogue pulse information from 16*28 crystals in 40 distinct trigger regions. Each regional TRU trigger is generated within one single Xilinx FPGA [3].

C. Front End Electronics (FEE)

The pre-amplified APD signals are digitized by 32-channel FEE cards [4] of 14 bit dynamic range. The 32 channels map to 2 parallel rows of 16 PWO crystals. Charge summing shapers combine 2*2 channels into analogue output pulses which are connected via differential cables to the TRU.
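
The channel counts quoted in this section can be checked against each other. The following sketch uses only numbers stated in the text; the variable names are illustrative.

```python
# Channel bookkeeping for one trigger region, using numbers quoted in the text.
CRYSTALS_PER_REGION = 16 * 28   # crystals handled by one TRU
FEE_CHANNELS = 32               # channels per FEE card (2 rows of 16 crystals)
SUM_KERNEL = 2 * 2              # crystals combined into one analogue sum
N_TRU = 40                      # TRU modules in PHOS

fee_cards_per_region = CRYSTALS_PER_REGION // FEE_CHANNELS  # FEE cards per TRU
sums_per_fee_card = FEE_CHANNELS // SUM_KERNEL              # analogue sums per card
sums_per_region = CRYSTALS_PER_REGION // SUM_KERNEL         # analogue inputs per TRU
total_channels = N_TRU * CRYSTALS_PER_REGION                # crystals in PHOS

print(fee_cards_per_region, sums_per_fee_card, sums_per_region, total_channels)
```

This reproduces the 14 FEE cards per TRU, the 112 analogue TRU inputs and the total of 17920 channels.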

II. TRIGGER TASKS

A TRU generates two successive triggers (see Table 1) which are derived from the hit information of its trigger region, received on 112 analogue inputs. The input signals are digitized by 12 bit ADCs [5] at 40 MHz sampling rate. Level-0 is a low-latency trigger OR, using the 5 least significant bits for thresholds between 10 and 230 MeV. Level-1 processing starts after valid level-0 decisions, using only the upper 7 bits for 3 different programmable thresholds between 0.5 and 30 GeV. In total, a TRU has 4 trigger outputs: one level-0 minimum bias trigger and three level-1 triggers for low, medium and high $p_T$. The level-1 trigger patterns of 4 consecutive samples of all channels are stored in a TRU hit memory.
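
The split of a 12 bit ADC code into a 5 bit level-0 field and a 7 bit level-1 field can be sketched as follows. The energy calibration is an assumption made here for illustration: it supposes that the largest 5 bit value corresponds to the 230 MeV upper level-0 threshold. Under that assumption the full 12 bit scale comes out close to the 30 GeV upper level-1 threshold.

```python
ADC_BITS = 12
L0_BITS = 5                      # least significant bits used for level-0
L1_BITS = ADC_BITS - L0_BITS     # upper bits used for level-1

def split_code(code):
    """Return (low 5 bits, upper 7 bits) of a 12 bit ADC code."""
    if not 0 <= code < (1 << ADC_BITS):
        raise ValueError("code outside 12 bit range")
    return code & ((1 << L0_BITS) - 1), code >> L0_BITS

# Assumption (not stated in the text): 5 bit full scale (31) == 230 MeV.
LSB_MEV = 230.0 / ((1 << L0_BITS) - 1)
FULL_SCALE_GEV = LSB_MEV * ((1 << ADC_BITS) - 1) / 1000.0
L1_STEP_MEV = LSB_MEV * (1 << L0_BITS)   # weight of one level-1 bit step

low, high = split_code(0xABC)
print(low, high, round(LSB_MEV, 2), round(FULL_SCALE_GEV, 1), round(L1_STEP_MEV))
```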

Figure 2: Analogue sum generation on FEE cards

A. Trigger timing

The level-0 trigger has a time window of 800 ns between particle collisions and trigger signal arrival at the Central Trigger Processor (CTP). The timing sequence is depicted in more detail in Fig. 1. The level-0 decision latency in the FPGA is 300 ns; the level-1 latency is 5500 ns. With 40 m cable distance to the CTP, 200 ns are required for the transmission.
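
The level-0 timing budget can be tallied from these numbers. The sketch assumes a nominal 25 ns clock period; how the remaining time is divided between detector response, shaping, cabling and digitization is not itemized in the text.

```python
CLOCK_NS = 25.0          # nominal 40 MHz clock period (assumed for this tally)
L0_WINDOW_NS = 800.0     # collision to trigger arrival at the CTP
CABLE_LENGTH_M = 40.0
CABLE_DELAY_NS = 200.0   # transmission time to the CTP
L0_FPGA_NS = 300.0
L1_FPGA_NS = 5500.0

cable_delay_ns_per_m = CABLE_DELAY_NS / CABLE_LENGTH_M
l0_output_time_ns = L0_WINDOW_NS - CABLE_DELAY_NS        # latest TRU output time
upstream_budget_ns = l0_output_time_ns - L0_FPGA_NS      # left for everything before the FPGA logic
l0_fpga_cycles = L0_FPGA_NS / CLOCK_NS
l1_fpga_cycles = L1_FPGA_NS / CLOCK_NS

print(cable_delay_ns_per_m, l0_output_time_ns, upstream_budget_ns,
      l0_fpga_cycles, l1_fpga_cycles)
```

The level-0 output must therefore leave the TRU about 600 ns after the interaction, and the FPGA latencies correspond to 12 and 220 clock cycles respectively.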

B. Analogue Sums

The search algorithm with 2*2 kernel size, followed by a 4*4 digital sum, yields a high trigger efficiency of >95 % at tolerable fake trigger levels [6].

Figure 3: Energy signal (upper: CSP output; lower: analogue sum for TRU)

Charge Sensitive Preamplifiers (CSP) on the FEE cards produce energy-proportional voltage steps (Fig.3). The energy sum from 2*2 crystal channels is generated by a summing shaper which takes input from 4 CSPs (see Fig.2). Since 90% of the scintillation light is produced in less than 50 ns, an integration time of 50 ns for the summing shaper is sufficient to produce an energy-equivalent pulse for transmission to the TRU. The ~100 ns pulse envelope is digitized by the TRU over four samples.

III. TRIGGER IMPLEMENTATION

A trigger decision requires that the primary energy information, which is deposited by showers over neighbouring crystals and recorded by the TRU over ca. 4 ADC samples, is summed both across TRU channels and over sampling time. This is implemented in the FPGA as a 4*4 sliding window algorithm, running in 91 simultaneous instances. The time reference is the 40.079 MHz LHC clock, received by the TTC receiver of the PHOS readout system and distributed to the TRU via the common FEE/TRU readout bus.

A. Input bandwidth to FPGA

With 112 analogue inputs, digitized to 12 bit precision in 25 ns intervals, the continuous input bandwidth for the trigger FPGA is 53.76 Gbit/s. This vast amount of binary information is transmitted in parallel via 112 serial LVDS links which interface the ADCs to the FPGA at a bit rate of 480 Mbit/s each.
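
The bandwidth figures follow directly from the input count, the resolution and the sampling rate, as this short calculation shows (nominal 40 MHz sampling assumed):

```python
N_INPUTS = 112
BITS_PER_SAMPLE = 12
SAMPLE_RATE_HZ = 40_000_000   # nominal 40 MHz, one sample every 25 ns

link_rate_bps = BITS_PER_SAMPLE * SAMPLE_RATE_HZ   # per serial LVDS link
total_rate_bps = N_INPUTS * link_rate_bps          # into the FPGA

print(f"per link: {link_rate_bps / 1e6:.0f} Mbit/s, total: {total_rate_bps / 1e9:.2f} Gbit/s")
```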

B. 480 MHz De-Serializer

The 112-fold de-serializer implementation in a Xilinx Virtex-II Pro FPGA is based on overclocked channel input to two parallel registers at 240 MHz, with opposite clock phases. The result is obtained via multiplexers which align the odd and even bits into a 12 bit result register.
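
A behavioural model of one de-serializer channel may help to illustrate the principle. This is a software sketch, not the FPGA implementation: the bit order (MSB first) and the assignment of even bits to the first clock phase are assumptions, and `serialize` is a hypothetical helper used only to generate test input.

```python
WORD_BITS = 12

def serialize(word):
    """Hypothetical helper: 12 bit word to a serial bit list, MSB first (assumed order)."""
    return [(word >> (WORD_BITS - 1 - i)) & 1 for i in range(WORD_BITS)]

def ddr_deserialize(bits):
    """Rebuild a 12 bit word from a 480 Mbit/s stream captured on two 240 MHz phases."""
    if len(bits) != WORD_BITS:
        raise ValueError("expected exactly 12 serial bits")
    phase_a = bits[0::2]   # bits latched by the register on one clock phase
    phase_b = bits[1::2]   # bits latched by the register on the opposite phase
    word = 0
    for a, b in zip(phase_a, phase_b):
        # Multiplexer step: interleave one bit from each register into the result.
        word = (word << 2) | (a << 1) | b
    return word

print(hex(ddr_deserialize(serialize(0xABC))))
```

Each register only needs to run at half the serial bit rate, which is the point of the double data rate scheme.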

C. Pipelined 40MHz sample sums

All 12-bit samples from 112 inputs are updated in their result registers with the 40 MHz LHC clock. A 4-deep pipeline of these registers keeps a “snapshot” of 4 consecutive samples every 25 ns. Hence a 40 MHz adder over the 4 registers continuously measures the energy-equivalent integral over 4 samples of one TRU channel.
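
The running sum over 4 consecutive samples of one TRU channel can be modelled as a shift register with an adder across its stages. The class name and the sample values below are illustrative.

```python
from collections import deque

class SamplePipeline:
    """Software model of one channel: 4 registers shifted on every clock, summed continuously."""

    def __init__(self, depth=4):
        self.regs = deque([0] * depth, maxlen=depth)

    def clock(self, sample):
        """Shift in a new sample (oldest one drops out) and return the sum of all registers."""
        self.regs.append(sample)
        return sum(self.regs)

pipe = SamplePipeline()
pulse = [0, 10, 40, 30, 5, 0, 0]           # illustrative ADC samples of one pulse
sums = [pipe.clock(s) for s in pulse]
print(sums)   # [0, 10, 50, 80, 85, 75, 35]
```

The sum peaks when all 4 samples of the pulse envelope are inside the pipeline.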

D. Sliding window search

Due to the shower distribution over up to 4*4 crystals, the TRU channel sums need to be added together in 91 combinations, corresponding to a 4*4 kernel size within a 16*28 crystal region. Since results are to be compared with their thresholds every 25 ns, all summing combinations and comparisons are implemented in parallel logic, clocked at the 40 MHz LHC machine rate.
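
The number of 91 combinations can be reproduced with a simple model. The sketch assumes that the 112 analogue sums form an 8*14 grid and that each 4*4 crystal window corresponds to a 2*2 block of adjacent analogue sums, stepped by one analogue sum at a time; this layout is inferred from the quoted numbers rather than stated explicitly.

```python
ROWS, COLS = 8, 14   # assumed grid of 2*2 analogue sums covering 16*28 crystals

def window_sums(grid):
    """Sum every 2*2 block of adjacent analogue sums (equivalent to 4*4 crystals)."""
    return [
        [grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1]
         for c in range(COLS - 1)]
        for r in range(ROWS - 1)
    ]

grid = [[0] * COLS for _ in range(ROWS)]
grid[3][5] = 100                      # a single illustrative hit
sums = window_sums(grid)
n_windows = sum(len(row) for row in sums)
n_hit_windows = sum(value == 100 for row in sums for value in row)
print(n_windows, n_hit_windows)       # 91 4
```

A hit away from the region boundary is seen by 4 overlapping windows, whereas a hit on the boundary is seen by fewer, which relates to the efficiency loss along region boundaries mentioned earlier.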

E. Trigger thresholds

All 91 results of 4*4 sums are compared against individually programmable thresholds. The comparisons are in phase with the 40 MHz update rate of the register pipeline. The result of all comparisons is a simple OR: a single positive comparison counts as a trigger. The trigger output is therefore a 40 MHz-serial yes/no code (also called NRZ). The level-0 trigger has only a single NRZ output, whilst the level-1 trigger has 3 outputs of different thresholds.
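
The comparison and OR stage can be sketched as follows. The use of a strict "greater than" comparison and the example values are assumptions; the function names are illustrative.

```python
def trigger_bit(window_sums, thresholds):
    """OR over all comparisons: 1 if any window sum exceeds its own threshold."""
    if len(window_sums) != len(thresholds):
        raise ValueError("one threshold per window sum is required")
    return int(any(s > t for s, t in zip(window_sums, thresholds)))

def nrz_stream(frames, thresholds):
    """One yes/no bit per 25 ns clock tick, given the window sums of each tick."""
    return [trigger_bit(frame, thresholds) for frame in frames]

thresholds = [50] * 91                       # illustrative, individually programmable
quiet = [0] * 91
hit = [0] * 91
hit[17] = 80                                 # one window above threshold
print(nrz_stream([quiet, hit, hit, quiet], thresholds))   # [0, 1, 1, 0]
```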

F. Level-0 trigger

The level-zero output is a 40 MHz serial yes/no information stream. The fixed latency (ca 600 ns after interaction) is defined by a fixed number of 40 MHz LHC clock cycles after which the NRZ output is generated.

G. Peak detection

Trigger decisions are due at 40 MHz output rate, i.e. the trigger instant needs to be determined to a precision of ±12.5 ns relative to the 4-sample envelope of the digitized analogue signal. A simple and fast peak-finder method is envisaged, which simply compares the magnitudes of successive samples. This would, however, require that the 40 MHz phase is pre-adjusted relative to the analogue sum signal.
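
One possible form of such a magnitude comparison is shown below. It is a minimal sketch under the assumption that a peak is a sample larger than its predecessor and not smaller than its successor; the actual comparison rule is not specified in the text.

```python
def find_peaks(samples, threshold=0):
    """Return indices of samples that exceed both neighbours (local maxima above threshold)."""
    peaks = []
    for i in range(1, len(samples) - 1):
        rising = samples[i - 1] < samples[i]
        not_rising_after = samples[i] >= samples[i + 1]
        if samples[i] > threshold and rising and not_rising_after:
            peaks.append(i)
    return peaks

print(find_peaks([0, 2, 9, 14, 8, 1, 0]))   # [3]
```

The peak index fixes the trigger instant only to within one sample period, hence the need to pre-adjust the clock phase.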

An alternative method for peak finding [7] with a few samples is also considered for the level-1 trigger where more processing time is available. This method is based on knowledge of the invariant shape of the pulse and uses scalar products to determine the peak amplitude and time.
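
A generic way to use scalar products with a known pulse shape is a least-squares fit of the amplitude against a set of time-shifted reference shapes. The sketch below illustrates that idea only; it is not taken from [7], and the reference shapes and time offsets are invented for the example.

```python
def dot(a, b):
    """Scalar product of two equally long sample vectors."""
    return sum(x * y for x, y in zip(a, b))

def fit_pulse(samples, templates):
    """Return (amplitude, time offset) of the best-matching reference shape.

    templates maps a time offset (ns) to the unit-amplitude pulse shape
    sampled at that offset. For each shape t the least-squares amplitude
    is <s,t>/<t,t> and the fit residual is <s,s> - <s,t>^2/<t,t>.
    """
    best = None
    for t0, shape in templates.items():
        norm = dot(shape, shape)
        amplitude = dot(samples, shape) / norm
        residual = dot(samples, samples) - amplitude * amplitude * norm
        if best is None or residual < best[2]:
            best = (amplitude, t0, residual)
    return best[0], best[1]

# Hypothetical reference shapes for two clock phases (illustrative values only).
templates = {0.0: [0.2, 0.8, 1.0, 0.5], 12.5: [0.05, 0.5, 1.0, 0.8]}
amplitude, t0 = fit_pulse([5.0, 50.0, 100.0, 80.0], templates)
print(round(amplitude, 3), t0)
```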

H. Level-1 trigger

Whenever a level-0 comparison of a trigger channel is positive, the same TRU channel qualifies for level-1 processing. The level-1 trigger has a considerably longer FPGA latency (5500 ns) than the level-0 process (300 ns). There are three different programmable thresholds, one for each of the three level-1 outputs. If a level-1 “yes” trigger is registered, the 8 bit binary values of all those channels which participate in level-1 processing are written to an 8*14 hit memory.

IV. TRU STATUS

The TRU logic design was finalized at CERN in spring 2005 and the two-sided, 11 layer PCB layout (Fig. 4) was finalized in September 2005 at HUST in Wuhan. Particularly difficult was the differential, matched-impedance routing of 112 LVDS signals from 14 ADCs with 240 MHz double data rate serial outputs to the FPGA’s dense ball grid array of 1152 pins.

Figure 4: TRU board sub-section: central FPGA and satellite ADC’s with analogue sum connectors (one side)

A. SEU tolerant FPGA reconfiguration

The expected radiation dose in the ALICE detector environment is 0.5 Gy over 10 years with a neutron fluence of $8.6 \times 10^{10}$ neutrons/cm$^2$ [8]. Though the TRU is quasi-shielded by 18 cm of PWO (20 radiation lengths), the large die size of the FPGA with 50,000 logic elements necessitates precautions against Single Event Upsets (SEU). The TRU therefore implements the same auto-reconfiguration of the Xilinx Virtex-II Pro devices as developed for the ALICE RCU [9]. An Actel ProASIC FPGA is used to reload “frames” of the Virtex configuration at intervals from a largely radiation-immune Flash device.
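
The principle of periodic frame reloading can be illustrated with a toy model. It is not the RCU or TRU design: the frame contents, the frame count and the comparison step used to report corrected frames are assumptions made for the example.

```python
def scrub(config, golden):
    """Rewrite every configuration frame from the golden copy.

    Returns the indices of frames that differed before the rewrite,
    i.e. the upsets that were corrected in this pass.
    """
    corrected = []
    for i, frame in enumerate(golden):
        if config[i] != frame:
            corrected.append(i)
        config[i] = frame   # frames are reloaded whether or not they differ
    return corrected

golden = [0xA5A5, 0x0F0F, 0xFFFF, 0x0000]   # illustrative frames held in Flash
config = list(golden)                       # live FPGA configuration
config[2] ^= 1 << 7                         # emulate a single event upset
print(scrub(config, golden), config == golden)   # [2] True
```

Since the golden copy resides in the Flash device, an upset in the live configuration persists at most until the next reload pass.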

V. IMPLEMENTATION OPTIONS FOR EMCAL

The PHOS TRU and FEE modules are also planned to be used for the ALICE EmCal detector. For this purpose, options were added to the TRU board which allow EmCal to build TRU hierarchies and to receive 10 bit coded level-0 multiplicity input from the ALICE V0 detector. For a high-bandwidth interconnection between hierarchical TRUs, four optical “Rocket-I/O” ports of 2.4 Gbps capability are optionally available. The optional 10 bit coded multiplicity input requires a multimode 850 nm fibre connection to an RJ-45-style optical transceiver.

VI. ACKNOWLEDGEMENTS

The TRU project is financed by the Norwegian Research Council and INTAS project 03-5747. The Auto-reconfiguration logic for Xilinx FPGAs was originally designed by the University of Bergen, Department of Physics and Technology. Special thanks to Gerd Troeger of KIP Heidelberg for very valuable assistance with the 240 MHz DDR de-serializer.

VII. REFERENCES

[2] PHOS Rate Simulations, May 2004, D. Roehrich and Hondyan Yang, University of Bergen, Norway
[3] XC2VP50 device, Virtex-II Pro Family, see www.xilinx.com
[6] L0/L1 triggering with PHOS, D. Roehrich et al., presentation to ALICE technical board, 2003