DIRAC note 05-12

July 2005

DIRAC DAQ

current state, problems, solutions

V. Olchevskii
Joint Institute for Nuclear Research

A bit of history

The main feature of the DIRAC DAQ is that all subdetector data arriving during the accelerator burst are transferred to VME buffer memory modules or stored inside dedicated electronic modules without any software intervention. Relatively slow operations such as read-out and hardware checks are performed during the pause between bursts, so the DAQ electronics can operate at their maximal rate.

The first version of the DIRAC DAQ, described in [1], combined hardware reading and event building in one program running on the VME processor. In 2000, this first DAQ reached saturation. The DAQ team proposed a solution [2] and substantially reworked the DAQ software in 2001. The main idea behind this rework was to move event building to a separate layer running on the main DAQ host. The schematic layout of this DAQ version is shown in figure 1.

As a result, DAQ2001 gained several advantages:

1. Reduced requirements for CPU and memory resources for VME processor

2. Increased scalability (with separate event building one can use several VME processors and other data sources).

3. Simplified programming of hardware specific readout

Some bottlenecks remained, though:

- **Time for hardware readout**
  
  The PS supercycle structure is shown in figure 2.
  The hardware should be read out and initialised during the gap between bursts (1.8 s).

  Figure 2: *PS super-cycle as seen by DIRAC setup (year 2000).* Labels in the figure: PS supercycle 14.4 s duration; 0.4 s; 1.8 s.

• **Capacity of hardware buffers**
  The overall capacity of VME buffer memories is about 6 MB, but due to non-uniform filling the actual limit is about 3 MB.

  Fortunately, with the T4 track analyser we had about 1K events per cycle (standard runs), and without T4 about 2K events per cycle (special runs).
  With an average event size of about 1 KB, this gave us about 1 MB/cycle for standard runs and about 2 MB/cycle for special runs. Therefore, neither the VME speed nor the buffer capacity limited us.
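
As a quick check of the arithmetic above, the per-cycle data volume can be compared with the effective buffer limit. This is only a sketch: the function and constant names are made up for illustration, and 1K is taken as 1000 events and 1 MB as 1000 KB, which the note does not specify.

```python
# Per-cycle load in the 2000-era setup versus the effective VME buffer limit.
# Assumption: 1K = 1000 events, 1 MB = 1000 KB (the note does not say which).

EFFECTIVE_BUFFER_MB = 3.0   # usable capacity, reduced by non-uniform filling
EVENT_SIZE_KB = 1.0         # average event size


def load_per_cycle_mb(events_per_cycle, event_size_kb=EVENT_SIZE_KB):
    """Return the data volume collected in one cycle, in MB."""
    return events_per_cycle * event_size_kb / 1000.0


for label, events in (("standard (with T4)", 1000), ("special (without T4)", 2000)):
    load = load_per_cycle_mb(events)
    print(f"{label}: {load:.1f} MB/cycle, "
          f"{load / EFFECTIVE_BUFFER_MB:.0%} of the effective buffer limit")
```

Both run types stay below the effective limit of about 3 MB, which is the point made in the text.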

**How the situation changes with DIRAC extension**

Due to the higher intensity and a different set of detectors, we will have an increase in both the number of events per cycle and the event size.

• **Number of events**
  According to [3] the total efficiency will be increased by a factor of 4 (4K events/spill for standard runs).

• **Event size**
  A modest estimate of the event size gives an increase by a factor of 2 (from 1 KB to 2 KB). Some reserve is desirable, therefore a safer estimate is 4 KB/event.

  It is also expected that we will work with 2..3 cycles per supercycle and that the supercycle will have a duration of about 17 seconds. The range 2..3 means that the DAQ should be prepared to work with 3 cycles/supercycle.

So we will have:

• with modest estimation:
  8 MB/cycle  24 MB/supercycle  120 GB/day

• with safe estimation:
  16 MB/cycle  48 MB/supercycle  240 GB/day

It is also planned that some detectors will migrate to new electronics developed by V. Karpukhin. These electronics will use a high-speed USB bus for data transfer. The first detector to use USB read-out is the SciFi detector, and there are also plans to move the read-out of VH and HH to USB.
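
The data volumes quoted above for the modest and safe estimates can be reproduced with a few lines of arithmetic. The sketch below assumes decimal units (1K = 1000, 1 GB = 1000 MB) and data taking around the clock; neither is stated in the note, and the names are illustrative.

```python
# Data volume per cycle, per supercycle and per day for the extended setup.
# Assumptions (not stated in the note): 1K = 1000, 1 MB = 1000 KB,
# 1 GB = 1000 MB, and data taking around the clock.

EVENTS_PER_CYCLE = 4000        # 4K events/spill for standard runs
CYCLES_PER_SUPERCYCLE = 3      # worst case of the expected 2..3
SUPERCYCLE_S = 17.0            # expected supercycle duration, seconds
SECONDS_PER_DAY = 24 * 3600


def volumes(event_size_kb):
    """Return (MB per cycle, MB per supercycle, GB per day)."""
    mb_cycle = EVENTS_PER_CYCLE * event_size_kb / 1000.0
    mb_supercycle = mb_cycle * CYCLES_PER_SUPERCYCLE
    gb_day = mb_supercycle * (SECONDS_PER_DAY / SUPERCYCLE_S) / 1000.0
    return mb_cycle, mb_supercycle, gb_day


for label, size_kb in (("modest", 2.0), ("safe", 4.0)):
    mb_c, mb_s, gb_d = volumes(size_kb)
    print(f"{label}: {mb_c:.0f} MB/cycle, {mb_s:.0f} MB/supercycle, {gb_d:.0f} GB/day")
```

With these assumptions the daily volumes come out at about 122 and 244 GB, consistent with the rounded 120 and 240 GB/day.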

Bottlenecks in new setup

Limitations in DAQ software

There are (or may be) some limitations in the DAQ software (max. number of events, sizes of data blocks, etc.), but these will be eliminated by the DAQ team during the revision of the DAQ source code, so no discussion is needed here.

Limitations which require hardware changes

- **Capacity of VME buffers and max. number of branches**
  We have a total of about 6 MB of VME buffer memories (6x1 MB CES HSM memories and 5x128 KB LeCroy memories). Note also that CES memories can be cascaded, while LeCroy memories cannot.

  Even with the modest estimate in mind, the data from one burst for any detector will exceed 128 KB, so the LeCroy memories can no longer be used.

  The \( \mu \)DC data block will likely be larger than 1 MB, and the same is true for IH (SciFi is not counted because it will use Karpukhin's new electronics, not VME). That means that the memories for these two branches should be cascaded. The consequence is that we can organize 4 VME read-out branches at most (now we have 9). That is an obvious bottleneck, and the best solution is to move not only SciFi, VH and HH, but also the drift chambers (without changes in the front-end) and the ionisation hodoscope to the new USB read-out as soon as possible; otherwise we will be limited to about 1.5K events/cycle.

- **Network links bandwidth**
  Estimates say that the existing network links will be capable of handling the traffic even for the “safe case” (16 MB/spill, 3 spills/supercycle), but this is close to the limit, so any further increase of the data sample will require changes in the network hardware.

- **VME processor replacement**
  The VME processor which we currently use (PowerPC-based CES RIO) has limited resources (64 MB of RAM and a 180 MHz CPU). Both this kind of processor and the operating system (we use LynxOS 2.5.1) are no longer supported by CES and CERN ESS. We were advised to migrate to a Pentium-based board from CCT. By now we have evaluated this board and rented it from the CERN pool, and we are porting the DAQ software to it.

- **Hardware for monitoring and DAQ hosts**
  Both the DAQ host and the host for online monitoring should be upgraded. For online monitoring, any modern standard desktop PC will be sufficient. For the DAQ host, a custom solution (a special motherboard and possibly extended disk space for the local data pool) is required. For reference, the current hosts are 400 MHz PCs with 256 MB of RAM each and are 6 years old. At the time of writing, a standard desktop PC has a 3 GHz CPU and 512-1024 MB of RAM.

- **Local data pool capacity**
  In the current DIRAC setup, we have 70 GB of space on a dedicated disk of the DAQ host for local storage of data. Earlier it was sufficient for running several days without the central recorder. Now the capacity of the local pool should be increased up to 1 TB.
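
Several of the hardware limits in the list above follow from simple arithmetic, sketched below. The helper names are illustrative, not part of the DAQ software. The sketch assumes that each of the two large branches needs a cascade of exactly two 1 MB CES modules, that the 1.5K events/cycle figure corresponds to the effective 3 MB buffer limit divided by the modest 2 KB event size, and that units are decimal. The network figure is only an average over the supercycle; the actual peak rate depends on when the data are transferred.

```python
# Rough checks of the hardware limits discussed in the list above.
# Assumptions: 1 MB = 1000 KB, 1 TB = 1000 GB; helper names are illustrative.


def max_branches(n_modules, cascaded_branches, modules_per_cascade=2):
    """Branches available when some branches need cascaded memory modules."""
    used = cascaded_branches * modules_per_cascade
    if used > n_modules:
        raise ValueError("not enough memory modules for the cascaded branches")
    return cascaded_branches + (n_modules - used)


def events_limit(effective_buffer_mb, event_size_kb):
    """Events per cycle that fit into the effective buffer capacity."""
    return int(effective_buffer_mb * 1000 / event_size_kb)


def average_rate_mb_s(mb_per_cycle, cycles, supercycle_s):
    """Average data rate over one supercycle, in MB/s."""
    return mb_per_cycle * cycles / supercycle_s


def days_of_storage(pool_gb, gb_per_day):
    """How many days of data the local pool can hold."""
    return pool_gb / gb_per_day


print("VME branches:", max_branches(6, cascaded_branches=2))
print("events/cycle limit:", events_limit(3.0, 2.0))
print("average rate, safe case (MB/s):", round(average_rate_mb_s(16, 3, 17), 2))
print("days in a 1 TB pool, safe case:", round(days_of_storage(1000, 240), 1))
```

Under these assumptions, six CES modules give 4 branches, the buffers hold 1500 events per cycle, the safe case averages just under 3 MB/s, and a 1 TB pool holds roughly four days of data.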

**Limitations that are out of scope of DAQ team**

- **2 GB barrier**
  Many filesystems had a 2 GB file size limit when DIRAC started. The off-line processing adds some data blocks to the data files. In order to avoid possible problems with the 2 GB barrier during off-line processing, it was decided to fix the size of the original data files at 300 MB.

  With the increased data sample, we will write 300 MB in about two minutes, or 120 files for a standard 4-hour run. This is unlikely to be convenient for the off-line team and can also degrade the performance of the computer hosting the local data pool (due to filesystem limitations).

  The best solution is to revise the off-line programs to eliminate this barrier and to increase the size of the original data files to several gigabytes, but this problem must be addressed by the off-line team.

- **Central recorder and resources for batch**
  I do not know much about CASTOR and batch processing, but I suspect that a high-capacity dedicated data pool and a large number of simultaneous batch jobs might be needed for recording and off-line processing of the data.

- **Quality check**
  The quality check uses simplified off-line processing to check the collected data. The old way of performing the quality check was “wait until the data go to the central recorder, then download and process them with batch jobs on the central cluster”. With the increased data volume, recording and downloading will take a time comparable with the time of collecting the data (3 hours will be required even for reading the daily data from the hard disk).

  If we decide to check the whole data sample, this could cause a significant delay in the quality check of the collected data.

  One solution to save time is to perform the quality check on a dedicated computer which is combined with the local data pool. However, this will require some effort from the off-line team in order to keep the quality check software up to date and to perform the data processing in time.

  I asked V. Yazkov about the resources needed for such processing. He expects that on today’s LXPLUS CPU (Intel 3 GHz) the full off-line processing of one event will take about 4 ms, a reasonably full quality check about 2 ms, and a simplified check about 1 ms. With 4K events per spill and three spills, this amounts to 48 s/24 s/12 s of processing per 17 s supercycle on one CPU, so such a dedicated computer should be rather powerful and should contain 2 CPUs.
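
The file-count and CPU-time figures in the list above can be checked with the sketch below. It assumes the safe estimate (48 MB per 17 s supercycle), 1K = 1000 events, and that processing is spread evenly over CPUs; the helper names are illustrative.

```python
import math

# File count per run and CPU time per supercycle for the safe estimate.
# Assumptions: 1K = 1000 events; helper names are illustrative.

SUPERCYCLE_S = 17.0
MB_PER_SUPERCYCLE = 48.0          # safe estimate: 16 MB/cycle, 3 cycles
EVENTS_PER_SUPERCYCLE = 4000 * 3  # 4K events/spill, 3 spills


def seconds_per_file(file_mb):
    """Time needed to fill one data file at the average data rate."""
    return file_mb / (MB_PER_SUPERCYCLE / SUPERCYCLE_S)


def files_per_run(file_mb, run_hours):
    """Approximate number of files written during one run."""
    return run_hours * 3600 / seconds_per_file(file_mb)


def cpu_seconds_per_supercycle(ms_per_event):
    """CPU time needed to process all events of one supercycle."""
    return EVENTS_PER_SUPERCYCLE * ms_per_event / 1000.0


print(f"300 MB file every {seconds_per_file(300):.0f} s, "
      f"{files_per_run(300, 4):.0f} files per 4-hour run")

for label, ms in (("full", 4), ("reasonably full", 2), ("simplified", 1)):
    cpu_s = cpu_seconds_per_supercycle(ms)
    cpus = math.ceil(cpu_s / SUPERCYCLE_S)
    print(f"{label}: {cpu_s:.0f} s per supercycle, at least {cpus} CPU(s)")
```

Without rounding, a 300 MB file fills in about 106 s, which gives somewhat more files per run than the 120 obtained with a round two minutes per file. The CPU estimate shows that two CPUs are enough for the reasonably full check, while the full off-line processing would need three under these assumptions.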

**Some cost estimations**

1. **DAQ host**
   Absolutely required. Should contain a motherboard with an ISA bus for driving CAMAC, two network interfaces, at least 1 GB of memory and additional storage for the local data pool.
   About 4 KCHF (2 KCHF computer + 2 KCHF for additional storage).

2. **Monitoring host**
   Absolutely required. Standard desktop PC, about 1400 CHF.

3. **Quality check host**
   Only if the off-line team is willing to maintain its software. About 2 KCHF (local pool storage is already included in the price of the DAQ host).

References

