OFFLINE

Introduction

Since the last CMS Bulletin, the CMS Collaboration has successfully achieved many milestones and the Offline Group has played a central role in the fulfillment of all of them. While many moments have been of historical importance for the whole Collaboration (the first 7 TeV collisions, the huge media coverage on March 30th, the constant increase in luminosity), we consider the level of efficient and sustained operations the group was able to reach over this period to be our main achievement.

First collisions were delivered by the LHC on March 30th. This was an important day not only for the physicists finally seeing their detector recording events of unprecedented energy, but also for the close attention the whole world gave us, with tens of journalists physically present in the CMS Centre. The success of that day, when we were able to display “live” events starting just one minute after the first collisions actually happened, was the result of a huge effort by the Offline team in many areas:
  • the CMSSW_3_5_0 release was ready just a few days before the event, with a CMSSW_3_5_1 release being deployed in the very last hours to fix critical problems,
  • suitable global tags were created,
  • Express Processing was exercised the night before the collisions to provide fast feedback on the beam positions when non-colliding beams were present in the machine,
  • and, finally, the new visualization system for P5/CMS Centre was tested.

This last component had actually just been commissioned, and worked in a substantially different way compared to the 2009 Run.

A central processing server located in the CERN Computing Centre is used to reconstruct a selected stream of events coming directly from P5. The selection can be performed either upstream, in which case the stream contains only certain HLT paths, or downstream, in which case one can select or discard Beam Splashes, machine-induced events, etc. The workflow used is taken directly from the Express Reconstruction, via a DAS call. Computers located at P5 and the CMS Centre have been set up to serve quasi-online Event Displays (i.e. with less than one minute delay) running both FireWorks and iSpy. The system was set up to be ready for the first Beam Splashes, and subsequently collisions. Since then, it has been fully operational and has proven stable and met all requirements. As the events are processed before the Express Stream sees them, this system brings the clear advantage of offering a “first line” of processing where errors can be spotted very quickly.
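
As a concrete illustration of the upstream selection, the sketch below shows how a stream can be restricted to a few HLT paths in a CMSSW configuration, using the standard HLTHighLevel filter; it is only a minimal sketch, and the path and file names are purely indicative rather than those of the actual display workflow.

    # Minimal sketch of an upstream HLT-path selection for the display stream.
    # Path and file names are illustrative only.
    import FWCore.ParameterSet.Config as cms

    process = cms.Process("DISPLAYSELECT")

    process.source = cms.Source("PoolSource",
        fileNames = cms.untracked.vstring("file:events_from_P5.root")  # placeholder input
    )

    # Keep only events that fired one of the listed HLT paths
    process.hltSelection = cms.EDFilter("HLTHighLevel",
        TriggerResultsTag = cms.InputTag("TriggerResults", "", "HLT"),
        HLTPaths = cms.vstring("HLT_MinBiasBSC", "HLT_ZeroBias"),  # illustrative paths
        eventSetupPathsKey = cms.string(""),  # use the list above rather than the DB
        andOr = cms.bool(True),               # accept if any listed path fired
        throw = cms.bool(False)               # do not abort on unknown path names
    )

    process.displaySelection = cms.Path(process.hltSelection)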

Since the Media Event on March 30th, all Offline components have been fully operational. DQM shifts have been run with 24 hour coverage in periods with beam, with the usual sharing of efforts between CERN, DESY and FNAL, and the PVT has certified data and reprocessing on a weekly basis.

New features have been added to reconstruction, such as much improved noise cleaning and a more precise treatment of ECAL spikes, for which attempts are being made to simulate the underlying mechanism. Simulation has improved in many other areas: for example, the tracker gain and noise models are now vastly improved, and input from the Heavy Ions group has been taken into consideration.

The Analysis Tools group has stabilized the PAT for the whole 2010 Run for the benefit of those doing analysis; the PAT has also been demonstrated to be a viable solution for the fast turnaround of analyses, and was used to rapidly prepare and deliver results on 7 TeV data for March 30th. The Database Group is performing a major review of the code, whilst at the same time helping the DPG/POG/PAG groups to migrate their payloads quickly and effectively to the "dropbox mechanism". This mechanism offers an elegant solution for the uploading of Express, Prompt and Offline payloads while guaranteeing data reproducibility: intervals of validity are checked and confirmed automatically before payloads are pushed to the database.
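
To illustrate the kind of check involved, the sketch below expresses, in deliberately simplified plain Python, the rule that a new payload may only extend the sequence of intervals of validity forward; it is a schematic illustration of the concept only, not the actual dropbox code.

    # Schematic illustration (not the actual dropbox code) of the IOV rule checked
    # before a payload is pushed to the conditions database: a new payload may only
    # open an interval of validity starting after the last one already appended.

    def new_iov_is_consistent(existing_since_runs, new_since_run):
        """Return True if the new payload's 'since' run extends the tag forward."""
        if not existing_since_runs:
            return True
        return new_since_run > max(existing_since_runs)

    # Example: a tag already containing payloads valid from runs 1, 132440 and 133874
    existing = [1, 132440, 133874]
    print(new_iov_is_consistent(existing, 134721))  # True  -> appended to the tag
    print(new_iov_is_consistent(existing, 133000))  # False -> rejected, would break reproducibility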

The Fast Simulation group is continuing its efforts to match the data coming from the detector, with greater and greater success. The AlCaReco team is working on implementing the prompt calibration loop: prompt reconstruction will be delayed to make sure data are processed with optimal constants, which will be delivered within 48 hours. Whilst the new system is ready to be used, it was decided to delay its deployment until after ICHEP; the time between now and the end of July will be used to integrate the calibration loop to run at the Tier0, which will allow for a complete automation of the workflows. The concept of validation for fast processing has also been introduced: an AlCaReco workflow will run on data and produce DQM results, which will be automatically harvested, with the results going into a validation DB. A subsequent Tier0 workflow will use them, leaving the final word on the quality of the validation payloads to the DQM experts. In the case of positive feedback, Prompt Reconstruction will run directly using these payloads, and results that previously required a re-reconstruction with up-to-date constants will be available within the 48-hour time frame.
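
The logic of this validation gate can be sketched as follows; the snippet is a plain-Python illustration of the concept described above, not actual Tier0 code, and all names in it are hypothetical.

    # Plain-Python sketch (hypothetical names, not actual Tier0 code) of the prompt
    # calibration validation gate: a payload produced by the AlCaReco workflow is
    # used by Prompt Reconstruction only once the harvested DQM results have been
    # signed off by the DQM experts.

    def payload_for_prompt_reco(candidate_payload, dqm_verdict, last_validated_payload):
        """Choose the conditions payload Prompt Reconstruction should use for a run."""
        if dqm_verdict == "GOOD":          # expert sign-off on the harvested DQM results
            return candidate_payload       # fresh constants, delivered within ~48 hours
        return last_validated_payload      # otherwise fall back to the last validated set

    # Illustrative usage
    print(payload_for_prompt_reco("BeamSpot_run134721_v1", "GOOD", "BeamSpot_lastValidated"))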

These months have also been hectic from the release point of view. Release CMSSW_3_5_X was deployed in production just a few days before the start of the Run, eventually reaching version 3_5_8. At the same time CMSSW_3_6_0 was being prepared and was finalized on April 15th, with CMSSW_3_6_1 being deployed at the Tier0 as the current production release in the second half of May. The CMSSW_3_7_0 development cycle is now closed, with the first release made on schedule on May 27th, and at the same time the CMSSW_3_8_X cycle was opened. The agreement with Physics has been not to push any new features into the 3_6_X cycle, and to go straight for 3_7_X validation; 3_6_X will remain the production release at the Tier0 up to ICHEP, to minimize the risk of jeopardizing data taking.

At the start of May, a large fraction of the Computing and Offline management team spent two full days at FNAL for a Data Management and Workload Management Workshop. Discussions focused on the definition of a timeline and a series of milestones towards the deployment of the next generation of tools. CRAB2 development has been mostly frozen (apart from a few features considered vital for analysis), and all effort has been put into delivering a CRAB3 prototype by the end of the year. The same holds for ProdAgent, which will be replaced on a short time scale (end of summer) by the new WMAgent-based tool.

As already mentioned, it was decided at the workshop to move the Prompt Calibration Loop to the Tier0 infrastructure, with the BeamSpot calibration being used as the first test case. Other important discussions were on DAS, the RunRegistry and the possibility of producing run-dependent Monte Carlo samples; in these cases milestones have also been set. The general feeling was that the workshop had been very useful for planning the future DMWM work programme.

Finally, it has been decided that, following the success of last year’s Amsterdam experience, an extended Offline Management Meeting will take place at CERN in the week starting on July 12th.

Generators

In the last three months the main objectives of the generator group have been the management of the simulation reprocessing and the tuning activities.

About 500M Spring09 simulated events have been reprocessed with the latest available release, and more than 100M new events produced. In parallel, several "data-like" samples, using the most realistic beam spot conditions, have been prepared. Over the last two months these samples have been used for comparisons with the early data, mostly at 7 TeV. The very good quality of the detector simulation has allowed several differences to be identified that are due to the physics models for minimum bias and the underlying event used in the generators, mostly PYTHIA6.

New experimental tunes have been provided, starting an intense activity which is still ongoing. Work has started to improve this process through the adoption of a more robust tool to document the analyses and to scan the parameter space for the optimal combinations to describe the data.

In parallel, the routine activity of updating the libraries to the state-of-the-art versions of the generators has continued, and effort has been put into building a new validation procedure to be used both for testing code developments and for assessing the physics quality of new versions.

Full Simulation

Work has focused on improving Monte-Carlo/Data agreement and on the preparation of tools for future simulation use.  As the 7 TeV data roll in, detailed comparisons between the detector simulation and the actual detector performance can be made at the level of individual detector elements. This has led to improved models of the noise levels in both ECAL and HCAL, and more realistic modeling of hit efficiencies in the muon systems.

The Tracker Group has implemented a much more realistic simulation of charge deposition and saturation using the individual channel gains. This will dramatically improve the simulation of dE/dx loss in the tracker.

Progress has also been made in simulating some of the unanticipated "features" found in the real data. The ECAL simulation now includes the possibility to simulate highly ionizing particles crossing the APDs, which are thought to generate the "spikes" seen in data. The simulation of HF has been updated to include the energy deposition generated by particles striking the phototube faces. This effort is part of a larger project to optimise the simulation of the forward detectors using shower libraries and/or parametrised showers in order to dramatically improve the simulation performance in this region.

Work continues on the tuning of GFlash to collision data in the hope that it can be deployed at some future time with the effect of reducing the CPU time required to simulate events, and as an improved input for Fast Simulation.

Reconstruction

The reconstruction team continues to adapt the data reconstruction algorithms to detector conditions observed in the 2010 data taking. Examples include further tuning of the track reconstruction, which has been achieved by incorporating knowledge gained from studying collision data, mostly concentrating on the constraint to the origin of the tracks (collision, secondaries, beam-gas, etc.). Improvements have also been made to calorimeter data through the application of noise cleaning algorithms. These have been integrated primarily into the CMSSW_3_6_X cycle and partly back-ported to recent CMSSW_3_5_X releases for more rapid integration into the release used for re-reconstruction productions.

The CMSSW_3_6_X release has recently been put into production at the Tier0 for reconstruction processing. Improvements include CSC local reconstruction tuning, the addition of Jet+Track algorithms, a centralized track-extrapolation-to-calorimeter algorithm, new Tau algorithms, and updated electron identification code.

The CMSSW_3_7_X release cycle includes further improvements to particle flow, tcMET, beam halo ID algorithms, and new Hcal rechit flags. From a technical point of view, we continue to streamline the maintenance of the production reconstruction configurations and have a testing and validation procedure in place to help ensure smooth production operations. We continue to collaborate with the performance group on the technical performance of the reconstruction algorithms.

We continue to strive to consolidate the RECO team with new members in order to carry out smoothly all the tasks related to reconstruction.

Alignment and Calibration

A major focus has been to ensure a stable setup for the startup of high-energy collisions and the LHC media event. An important step for consolidation of the handling of database constants is the introduction of the offline dropbox, which also allows more thorough consistency checks of the intervals of validity for the constants payloads. The policy of conditions database tags for reprocessing has been further adjusted: while constants covering more recent validity ranges can continuously be appended, frozen copies can be generated at any time for re-processing, maintaining the full traceability of the global tag.

The alignment & calibration software framework has been carefully improved in many details for the 36X and 37X release series. The selection strategies for the AlCaReco skim producers have been adjusted to the development of the trigger menu with increasing luminosity, and the migration to the schema of eight primary datasets. The prompt calibration concept has seen significant further development, which will hopefully lead to its use in production soon after the ICHEP conference.

Database

The CMS online database has been improved in preparation for data taking. Most of the online projects for the DCS have been reviewed. Old historical data have been archived and the data rate has been optimized and reduced to a reasonable value, a compromise between physics needs and space availability. An on-call service has been organized.

Concerning the offline database, a tutorial has been given for the alignment and calibration contacts. The topics covered included a general overview of databases in CMS, the use of the offline DB in alignment and calibration, how to design a new offline DB record, how to inspect, modify and duplicate tags with the command line tools, and how to insert data into the offline DB via PopCon and the online and offline dropboxes. The training was very successful, with about twenty people following the course.

The introduction of the offline dropbox has made the insertion of new data simpler and more robust, and it has allowed automation of new data insertion from calibration jobs.

Data browsing is also being improved thanks to the new offline DB Browser, which supports browsing of the data in all accounts and displays the content of each tag.

An offline database and calibration on-call service has been organized to guarantee expert availability during data taking.

Fast Simulation

Fast Simulation has also finally entered the exciting era of real data taking. Years of careful design of fast algorithms and emulators, as well as tuning on full simulation and test beam data, resulted in a tool that could successfully be used for comparison with real data. It has the potential to be very helpful in getting first physics results, based on the Minimum Bias events collected in 2009 and at the beginning of 2010 LHC data taking.

In order to facilitate a comparison between the Fast Simulation and the real data collected by CMS, a filter emulating the effect of the technical triggers, which are not directly simulated in the Fast Simulation, was also developed and provided to the users. Comparisons of various distributions (such as track multiplicity, missing ET, etc.) between the Fast Simulation, the real data and the Full Simulation were shown during internal meetings and at the offline workshop in April, and were found to be in remarkable agreement in most cases. This is an outstanding and encouraging result, given that the Fast Simulation was never specifically tuned for the low-pT physics of Minimum Bias events.

The very good agreement between the Fast and Full Simulation in track multiplicities has led, in particular, to the first example of the Fast Simulation being used to obtain physics results. Recently the Fast Simulation has been used extensively for the rapid production of more than 100 million Minimum Bias events and several representative physics samples at 7 TeV, using several specially prepared PYTHIA tunes, to study what might be the best choice of parameters for the coming MC production. This is quite an impressive example of the usefulness of the Fast Simulation for physics searches in CMS. The Fast Simulation also has strong potential for use in other areas, for instance in quick systematic studies, scans of complex multi-parameter models (like SUSY), large-scale background estimates, etc.

In view of the presentation of CMS results at ICHEP and other physics conferences, the Fast Simulation team is now pursuing a campaign to encourage, and in fact to request, the approval and publication in the Physics Notes, as well as in the catalogue of approved CMS results, of plots comparing the distributions of physical observables in the data with the corresponding FastSim ones, whenever a similar comparison with the Full Simulation is provided. This would require little extra effort from the physics teams involved in the analyses, and these figures will serve as benchmark performance plots, demonstrating the status and the range of applicability of the Fast Simulation in CMS.

In the meantime, the Fast Simulation is still an evolving tool, which is being improved functionally to cope with the growing requirements and demands of the LHC physics searches. Some of the latest additions include: support for the Global Run (GRun) HLT menu, in addition to those already supported (1E31 and 8E29); an improved electromagnetic shower simulation in the ECAL endcaps behind the Preshower; and an emulation of the muon hit association inefficiency in reconstructed tracks caused by delta-ray emission in the DT and CSC chambers.

Analysis Tools

The Analysis Tools group has been very active in support of the many physics analyses for ICHEP, as well as giving guidance for the analysis of first data taken at 7 TeV. Specific activities include another successful PAT tutorial (including information about accessing data correctly) and the subsequent successful deployment of web-based tutorial activities; a "data processing tutorial" that is intended to instruct new users about the details of data analysis as well as serving as a liaison to the Physics Validation Team; the development of POG selection software that is distributed to users for proper object identification; and finally the maintenance of the Physics Analysis Toolkit which provides the configurations needed to run correctly over the various Monte Carlo and data samples. Furthermore, we have been active in evaluating the analysis interface to the computing model, keeping regular synergy with Analysis Operations and the Primary Dataset Working Group.

Data and Workflow management

The DMWM Project would first of all like to thank Rick Egeland for all his hard work on PhEDEx over the last few years. We would also like to welcome Nicolo Magini aboard as the new L3 Manager for PhEDEx. A constructive handover workshop was held at Bristol, where the development plans for the data transfer system were discussed in order to ensure a smooth changeover.

Development work has been moving on apace on the new Workload Management System, with the Tier 1 processing system nearing rollout for integration and operations.  As soon as it enters the testing phase, the development focus will shift to analysis processing and simulation.

A very successful analysis-centric design and coding sprint was held in Perugia in the spring, and will be followed up with sprints during August, September and the remainder of the year to push out new services. Anyone who can churn out Python code is welcome to attend any of these code sprints, although spaces are limited and preference is given to DMWM developers.

Data Quality Monitoring

The Data Quality Monitoring system (DQM) provides event-data based histograms as well as detailed data certification results from automated data quality checks.

The full system provides a comprehensive view of all sub-detectors and the trigger, in real time during online data taking and during the prompt and re-reconstruction at Tier-0 and Tier-1 centers, as well as histograms for Release and Monte Carlo validation. The results are published through a web-based application (DQM GUI), allowing CMS users worldwide to inspect the data of all recorded runs, including reference histograms and by-run trend and history plots of basic histogram quantities.

Central DQM shift persons inspect the data of each run. Two types of DQM shift are organized: for online data, three 8-hour shifts per day, and for offline data, four 6-hour shifts per day. The shifts, performed at P5 (online) and at CERN (CMS Centre), FNAL and DESY (offline), produce data certification results for each run, per detector subsystem and per physics object reconstruction. Since the beginning of 2010 about 100 different people have taken DQM shifts.

The Run Registry (RR) is a web application with a database backend that serves the bookkeeping of the shift results, the tracking of the shift and sign-off workflows, the browsing of the results by CMS users, and the generation of the final good-run list. It is capable of handling certification results both by run and by dataset. Good-run lists are published weekly after sign-off. The good-run list file format allows it to be used directly as input to analysis (CRAB) jobs, as illustrated below.
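
As an illustration of this format, the sketch below shows a good-run list mapping run numbers to ranges of certified luminosity sections, together with a small check of whether a given run and luminosity section is certified; the run and LS numbers are purely illustrative.

    # Sketch of the good-run list (JSON) format: each run number maps to a list of
    # certified luminosity-section ranges. Run and LS numbers here are illustrative.
    good_runs = {
        "132440": [[85, 138], [141, 401]],
        "132601": [[1, 520]],
    }

    # Simple check of whether a given (run, LS) pair is certified for analysis
    def is_certified(good_runs, run, ls):
        return any(first <= ls <= last for first, last in good_runs.get(str(run), []))

    print(is_certified(good_runs, 132440, 100))  # True
    print(is_certified(good_runs, 132440, 140))  # False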

Since the beginning of the year the DQM system has been upgraded for improved performance and user-friendliness, and also to incorporate by-luminosity-section (LS) tracking of the results. The RR retrieves by-LS conditions data (high voltage as well as beam status) from the database, and this information is taken into account in the creation of the good-run lists. By-LS histogram handling was also introduced in offline DQM, in view of the goal of moving to a higher level of certification automation, with finer resolution in time (by LS) and geometry (by sub-detector component).

Since the beginning of data taking, the DQM configuration, trigger selection and code have been continuously tuned to optimize the sensitivity to problems and to improve performance. In the area of online DQM the infrastructure was improved for better event selection capabilities and maintainability. Substantial improvements were introduced in the DQM run control code, as well as in the event server and event processor code. As one of the most prominent upgrades of online DQM, track-reconstruction-based beam spot monitoring was introduced, providing the coordinates and extent of the collision vertex in real time to both the CMS and LHC control rooms.

Further improvements towards automation and scalability are underway. Much of the RR code has been re-factored in view of integrating it with the web-based monitoring (WBM) system, thus making it accessible worldwide to CMS users. As new additions, the next version of the RR will contain results from algorithm-based "automatic" certification by LS, as well as improved versioning capabilities. An offline instance of the WBM is planned, both so that certification results are accessible to CMS users at all times, even when P5 is down, and for scaling reasons. Furthermore, extensions are underway for the quality monitoring of Monte Carlo production data. In this area the focus is on the monitoring of basic generator quantities as well as on the correct reconstruction of the physics objects.

In conclusion, with the beginning of the 2010 collisions data taking, the full end-to-end DQM chain was put into production and has proven to be a reliable and efficient means for problem detection (online) and the assessment of the data certification for physics analysis. Improvements are underway to further optimize performance while reducing maintenance and operations efforts to a level that is sustainable over long periods of time.

Physics Validation Team (PVT)

The Physics Validation Team (PVT) started its operations in Summer 2009 with the goal of providing the collaboration with fully validated datasets for physics analyses. It serves as a central forum for discussion and logging of the status of the validation; all practical information relevant to the use of the recorded data is collected there to help users with their analyses.

The regular PVT meetings serve two main purposes, namely the formal sign-off of Alignment and Calibration (AlCa) constants and of the run-by-run certification results. Specific details relevant to the determination of constants are discussed at the AlCa meeting. These changes, even if sometimes small (e.g. noise modeling, alignment, dead channels), can propagate with a large effect to high-level reconstruction objects. This is discussed and presented to the PVT, with the participation of the POGs, before new constants can be deployed for reprocessing. The data quality certification and bookkeeping rely on the DQM system. Every week the list of runs certified as good for analysis is updated and published in special files (JSON) that are used by the analysis teams. The new version of CRAB already supports the selection of good runs at the level of the CRAB job configuration, as sketched below.
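
The sketch below illustrates how such a JSON selection translates into the luminosity-section ranges applied to a job; the run and LS numbers are illustrative, and in recent CRAB versions the JSON file itself can be passed in the job configuration so that these ranges are filled in automatically.

    # Sketch of how a good-run selection is applied at the job level: in the CMSSW
    # configuration wrapped by CRAB, the certified luminosity sections appear as
    # explicit ranges on the source. Run and LS numbers are illustrative only.
    import FWCore.ParameterSet.Config as cms

    process = cms.Process("ANALYSIS")

    process.source = cms.Source("PoolSource",
        fileNames = cms.untracked.vstring("file:certified_data.root"),  # placeholder input
        lumisToProcess = cms.untracked.VLuminosityBlockRange(
            "132440:85-132440:138",
            "132440:141-132440:401",
            "132601:1-132601:520",
        )
    )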

The first task was the validation of the large-scale Monte Carlo production in Summer 2009, and there was a very good response, in both time and quality, from the validators. At the end of 2009, proton collisions at 900 GeV were the first road test in an operational environment. In the preparation for the Winter conferences, the PVT coordinated the validation of seven re-processings of the full datasets as well as the production and validation of the corresponding Monte Carlo samples.

Now the focus is on the preparation for the ICHEP conference. The goal is to provide the analysis teams with the largest possible amount of collected data processed with the best available release. The schedule is tight but, with the first pre-approvals approaching, the situation is well under control. The analyses producing results for ICHEP will use one of the two chosen releases, either the 36X or the 37X series. The releases are in the validation phase and feedback is expected soon. The set of calibrations and alignments approved for the ICHEP conference has already been applied to the latest round of reprocessing.

It should not be overlooked that the PVT is also committed to providing the validation of the Monte Carlo datasets that are needed for analyses before the full production is launched. This was already the case for the large “Summer 2009” production and is now ongoing in order to provide the reprocessing (re-digitization/re-reconstruction) of those samples with the latest software release for use in the context of the ICHEP analyses.

Visualization / event-display

This year began with a major push towards the consolidation of all visualization-related efforts (i.e. Fireworks, iSpy, and Iguana). Fireworks was chosen as the baseline event display, with the intention of extending its functionality to include the existing features of iSpy and Iguana, as well as to support operation from within the full software framework in a cmsRun job.

A new call for user inputs was made to help guide the development priorities (see Requirements document), and a developers' workshop, comprising personnel from all three visualization efforts, was held in February in San Diego to determine the work-plan for the year.

While keeping the existing programs functional, to avoid disruptions to data taking, commissioning and physics analysis, the new team engaged in transplanting additional iSpy features into Fireworks [see figure]. At the same time, the core object management system was generalized to allow for better code reuse and for various optimizations in memory and CPU consumption. An alpha version of the consolidated event display was released in early May, and the official release will be made in mid-June.

After that, the emphasis will be put on integration with the full framework and on the development of a detailed geometry browser.





by L. Silvestris, T. Boccali, L. Sexton-Kennedy. Edited by M-C Sawley and J. Harvey, with contributions from F. Cossutti, F. Stockli, S. Banerjee, M. Hildreth, D. Lange, J-R. Vlimant, G. Cerminara, R. Mankel, S. Beauceron, F. Cavallari, P. Paolucci, V. Innocente, S. Abdoulline, A. Perrotta, B. Hegner, S. Rappoccio, D. Evans, S. Metson, I. Segoni, A. Meyer, G. Della Ricca, P. Azzi, L. Malgeri, A. Yagil