The CERN Data Centre readies for Run 2

While the world waits for Run 2 data with growing anticipation, the CERN Data Centre is battening down the hatches. Run 2 is set to see a significant increase in the amount of data produced by the LHC experiments, with more than one hundred additional petabytes expected over the next three years. How will CERN manage this flood of data? The Bulletin checks in with the IT Department to find out...


The CERN Data Centre: the heart of CERN's entire scientific, administrative, and computing infrastructure.

With every second of run-time, gigabytes of data will come pouring into the CERN Data Centre to be stored, sorted and shared with physicists worldwide. To cope with this massive influx of Run 2 data, the CERN Data and Storage Services group focused on three areas: speed, capacity and reliability.

First on the list, the group set out to increase the rate at which they could store data. "During Run 1, we were storing 1 gigabyte per second, with the occasional peak of 6 gigabytes per second," says Alberto Pace, who leads the Data and Storage Services group within the IT Department. "For Run 2, what was once our 'peak' will now be considered average, and we believe we could even go up to 10 gigabytes per second if needed."
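
As a rough, unofficial back-of-envelope check, the rates Pace quotes add up quickly. The small sketch below simply converts those figures into data volumes per day; the arithmetic is illustrative, not an official CERN projection.

    # Back-of-envelope: how quickly the quoted transfer rates add up.
    # Rates are the ones mentioned above; the conversion is illustrative only.
    SECONDS_PER_DAY = 24 * 60 * 60

    rates_gb_per_s = {
        "Run 1 average": 1,
        "Run 2 average (the old peak)": 6,
        "Run 2 possible peak": 10,
    }

    for label, rate in rates_gb_per_s.items():
        tb_per_day = rate * SECONDS_PER_DAY / 1000   # 1 TB = 1000 GB
        pb_per_day = tb_per_day / 1000               # 1 PB = 1000 TB
        print(f"{label}: {rate} GB/s -> {tb_per_day:.0f} TB/day (~{pb_per_day:.2f} PB/day)")

At the old peak of 6 gigabytes per second, roughly half a petabyte can arrive in a single day of sustained running, which is how the hundred-plus additional petabytes expected over the next three years accumulate.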

This increased rate of data storage is thanks, in part, to improvements to the CASTOR storage system. CASTOR prioritises storage on tapes, which are more robust and thus ideal for long-term preservation. Improvements to the CASTOR software allow CERN's tape drives and libraries to be used more efficiently, with no lags or delays, increasing the rate at which data can be moved to tape and read back.

Reducing the risk of data loss - and the massive storage burden that comes with it - was another challenge the Data and Storage Services team set out to address for Run 2. "We wanted to provide experiments with the ability to choose their storage solution based on the type of data they need to preserve," says Pace. "Thus, we've introduced a data 'chunking' option in our EOS system. This splits the data into segments and enables recently acquired data to be kept on disk for quick access."

"This has allowed our total online data capacity to be increased significantly," Pace continues. "We have 140 petabytes of raw disk space available for Run 2 data, divided between the CERN Data Centre in Meyrin and the Wigner Data Centre in Budapest, Hungary. This translates to about 60 petabytes of storage, including back-up files."

In addition to the regular "replication" approach - whereby a duplicate copy of all data is kept - experiments now have the option to scatter their data across multiple disks. This "chunking" approach breaks the data into pieces, and reconstruction algorithms ensure that content is not lost even if multiple disks fail. This not only decreases the probability of data loss, but also halves the space needed for back-up storage.
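
To make the idea concrete, here is a minimal sketch of chunking with a single parity piece, written purely for illustration. It is not the algorithm EOS actually uses: a single XOR parity chunk only survives the loss of one piece, whereas the layouts described above store more redundancy and tolerate multiple disk failures. The point is simply that a lost chunk can be recomputed from the surviving pieces instead of keeping a full duplicate copy.

    import functools
    import operator

    def split_with_parity(data: bytes, n_chunks: int):
        """Split data into n_chunks equal pieces plus one XOR parity piece."""
        size = -(-len(data) // n_chunks)                    # ceiling division
        padded = data.ljust(size * n_chunks, b"\0")
        chunks = [padded[i * size:(i + 1) * size] for i in range(n_chunks)]
        parity = bytes(functools.reduce(operator.xor, col) for col in zip(*chunks))
        return chunks, parity

    def rebuild(chunks, parity, lost: int) -> bytes:
        """Recompute the chunk at index `lost` from the surviving pieces."""
        survivors = [c for i, c in enumerate(chunks) if i != lost] + [parity]
        return bytes(functools.reduce(operator.xor, col) for col in zip(*survivors))

    # Lose one of four chunks and reconstruct it from the other pieces.
    chunks, parity = split_with_parity(b"Run 2 event data (illustrative payload)", 4)
    assert rebuild(chunks, parity, lost=2) == chunks[2]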

"Although this is not the optimal solution for data subject to heavy access, as reconstructing the data from chunks is more input/output intensive, it is an excellent option for data that is accessed less frequently," says Pace. "We now have a system that we can tune to favour performance or reliability, depending on the type of data."

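The trade-off Pace describes can be put into rough numbers. The chunk-and-parity layout below (4 data pieces plus 2 parity pieces) is an assumption chosen for illustration, not the actual EOS configuration; it simply shows why chunking can halve the extra space compared with keeping a second full copy, while requiring more pieces - and hence more input/output - to be touched on every read.

    # Illustrative comparison of the two approaches described above.
    # The 4 data + 2 parity layout is an assumed example, not the real EOS setup.
    layouts = {
        "replication (2 full copies)":  {"extra_space": 1 / 1, "pieces_per_read": 1},
        "chunking (4 data + 2 parity)": {"extra_space": 2 / 4, "pieces_per_read": 4},
    }

    for name, p in layouts.items():
        print(f"{name:31s} extra space: {p['extra_space']:.0%}   "
              f"pieces touched per read: {p['pieces_per_read']}")
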
Finally, the Data and Storage Services group is also aiming to further improve the availability of the EOS system. During the first run of the LHC, EOS was available to users for around 98.5% of the time. However, the group now has the ambitious goal of improving this to more than 99.5% for the duration of Run 2.
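
Translated into downtime, those percentages are easier to picture. The short calculation below simply converts the availability figures quoted above into hours per year; it is not an official service-level target.

    # Convert the quoted availability figures into maximum downtime per year.
    HOURS_PER_YEAR = 365 * 24

    for label, availability in [("Run 1 (~98.5%)", 0.985), ("Run 2 goal (>99.5%)", 0.995)]:
        downtime = (1 - availability) * HOURS_PER_YEAR
        print(f"{label}: up to {downtime:.0f} hours of downtime per year")

In other words, the goal is to shrink the potential downtime from roughly 130 hours a year to about 44.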

From higher storage speeds to new storage solutions, CERN is well prepared for the challenges of Run 2.

by Katarina Anthony