Cloud and Grid: more connected than you might think?

You may perceive the grid and the cloud to be two separate technologies: the grid as physical hardware and the cloud as virtual hardware simulated by running software. So how are the grid and the cloud being integrated at CERN?

CERN Computer Centre.

The LHC generates a large amount of data that needs to be stored, distributed and analysed. Grid technology is used for the large-scale processing of LHC physics data, with many data centres around the world contributing as part of the Worldwide LHC Computing Grid. Beyond the technology itself, the grid represents a collaboration of all these centres working towards a common goal.

Cloud technology uses virtualisation techniques, which allow one physical machine to host many virtual machines. This technology is being used today to develop and deploy a range of IT services (such as ServiceNow, a cloud-hosted service), allowing for a great deal of operational flexibility. Such services are available at CERN through OpenStack.

“The physics community is looking at cloud solutions in order to be able to extend grid services across internal and external clouds,” says David Foster, Deputy Head of IT at CERN. “Layering grid services, for example Batch, on top of a cloud infrastructure is increasingly popular.”

So what does this really mean? Let's say you have a unit of work. The system finds a machine on which to execute that work. This machine could be located anywhere in the world – the grid is a collaboration of over 160 computer centres worldwide. Alternatively, you can allocate a virtual machine in the cloud and send it an image that includes the work to be done. In theory, you could have thousands of virtual machines in a cloud, treat them as though they were physical machines somewhere, and send them units of work as system images.
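The idea of treating physical and virtual machines alike can be illustrated with a minimal Python sketch. It is not how the grid's actual schedulers work: the `Machine` class, the `dispatch` function, the machine names and the round-robin policy are all hypothetical, chosen only to show that the dispatcher does not need to know whether a machine is physical or virtual.

```python
from dataclasses import dataclass, field
from itertools import cycle


@dataclass
class Machine:
    """A hypothetical worker; `virtual` is only a label in this sketch."""
    name: str
    virtual: bool
    completed: list = field(default_factory=list)

    def run(self, work_unit: str) -> None:
        # A real system would ship a job description or a system image here.
        self.completed.append(work_unit)


def dispatch(work_units, machines):
    """Assign each work unit to the next machine in round-robin order."""
    assignment = {}
    for unit, machine in zip(work_units, cycle(machines)):
        machine.run(unit)  # same call for physical and virtual machines
        assignment[unit] = machine.name
    return assignment


# Illustrative pool: one physical grid node and two cloud virtual machines.
pool = [
    Machine("grid-node-1", virtual=False),
    Machine("cloud-vm-1", virtual=True),
    Machine("cloud-vm-2", virtual=True),
]
result = dispatch(["job-a", "job-b", "job-c", "job-d"], pool)
print(result)
```

The dispatcher never inspects the `virtual` flag, which is the point: once a virtual machine is allocated, it can receive work in the same way as a physical one.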

What has made the cloud model so successful is this ability to virtualise computational resources. This could be at the level of a virtual machine – so it looks like a real PC to you – or at the level of a software application. “It gives the impression that you have more physical hardware than you actually have by creating these virtual instances of a software application or a hardware platform,” says David.

It’s this virtualisation technology that allows you to map software in a very flexible way onto a physical hardware infrastructure. “You can connect a set of cloud resources to the grid and use them as though they were physical machines,” says David. This is extremely useful in managing physical resources and repurposing applications as required across different physical hardware platforms.

CERN’s newest computing infrastructure extension is the data centre coming online in Budapest, which will almost double CERN’s computing capacity. CERN is using cloud technologies to manage the Budapest extension and the Meyrin data centre so that they appear as one massive, agile infrastructure.

The CERN cloud currently comprises some 1,500 machines: the aim for 2015 is to have around 15,000 machines representing 300,000 virtual machines. With more machines being added to the cloud all the time, 90% of the total computing resources of the two sites will be in the cloud!
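To put these figures in proportion, the short Python calculation below works out the ratios they imply. The three input numbers come from the text above; the resulting average of virtual machines per physical machine is simple division and says nothing about how virtual machines would actually be distributed across hosts.

```python
# Figures quoted in the article
physical_now = 1_500       # machines currently in the CERN cloud
physical_target = 15_000   # machines aimed for in 2015
virtual_target = 300_000   # virtual machines those would represent

# Implied ratios (averages only)
growth_factor = physical_target / physical_now
vms_per_machine = virtual_target / physical_target

print(f"Growth in physical machines: {growth_factor:.0f}x")
print(f"Average virtual machines per physical machine: {vms_per_machine:.0f}")
```

In other words, the target is a tenfold increase in physical machines, with each one hosting about 20 virtual machines on average.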

Like industry, CERN needs to embrace cloud-computing technologies in order to manage its increasing computational demands without extra operational burden. Grid and cloud activities will evolve together and complement each other. The integration and development of these seemingly separate technologies are essential for the physics community and for the evolution of the experiments' computing models.

by Stephanie McClellan