Dr. Rosie Bolton, University of Cambridge
OpenStack is going from the infinitesimal to the infinite. Researchers are using the cloud platform to look inside atoms today, while the builders of an innovative telescope array hope to use the software to peer across the galaxy and back 400 million years in time.
The two projects are very different from each other. What they have in common is that they're generating prodigious amounts of data, and looking to OpenStack to manage it all.
CERN, the European Organization for Nuclear Research, is using OpenStack to manage data for several of its key experiments. These include the Large Hadron Collider, which at 27 kilometres around is "the largest machine on Earth," said Tim Bell, compute and monitoring group leader at CERN, during a keynote at OpenStack Summit Tuesday. The collider fires beams of protons at each other, measuring the results to determine the properties of subatomic particles. One key component for detecting collisions is a machine called the Compact Muon Solenoid. "It's a very strange term, given that it weighs 14,000 tonnes, to call it compact," Bell said.
CERN's computing infrastructure has to be able to handle 1 billion collisions per second. That demand is driving the need for OpenStack, Bell said.
And that's not the only experiment CERN is running. "I have the honour of having an antimatter factory just down the road from my office," Bell said. The apparatus assembles positrons and antiprotons to make anti-hydrogen, which scientists experiment on to determine the properties of antimatter, such as whether it rises in a gravitational field.
All that science drives the need for a lot of compute. CERN stores 160 petabytes of data on tape, including 0.5 PB per day between June and August of this year. The organization anticipates a 60x compute increase by 2023, but the budget outlook for servers and people is flat, Bell said.
OpenStack helps CERN keep up with demand. CERN is using OpenStack on more than 190,000 cores in production, with more than 90% of CERN compute resources virtualized, 5,000 virtual machines migrated from old hardware in 2016, and more than 100,000 cores to be added in the next six months.
After Bell described how OpenStack is exploring the infinitesimal, the University of Cambridge's Dr. Rosie Bolton talked about OpenStack in the infinite. Or near-infinite, at any rate.
Bolton is part of a consortium building the Square Kilometre Array, a vast radio telescope due to go online in 2023. One part of the SKA will be located in the Western Australian desert, with 130,000 individual antennas in 512 clusters, over an 80-kilometre spread. The other part of the SKA will be in the Karoo desert in South Africa, with 197 antennas over 150 kilometres. Each site will send its data to a Science Data Processor about 500 kilometres away, in Perth, Australia and Cape Town, South Africa respectively, which will then distribute the information around the world.
The antennas will be used to pick up signals going 400 million years back in time, to observe the formation of the first stars. Separately, the SKA will observe several dozen pulsars spread around the galaxy. Pulsars send out pulses of radio activity with extremely precise regularity; by observing changes in the timing of those pulses, astronomers hope to be able to detect gravitational waves that span the galaxy.
The compute needs for the SKA will be enormous. Computers will need to ingest 400 gigabytes per second, generate and destroy 1.3 zettabytes of data and then preserve and ship 1 petabyte per day of science data, Bolton said.
The SKA consortium will build the compute facility toward the end of the first phase of construction of the telescope arrays, which is due in 2023. Bolton said she hopes to pique the OpenStack community's interest now, so OpenStack becomes a suitable platform for that kind of science when the SKA is ready to build its compute centre. "It's a long way off, but if we start now we hope to get the OpenStack community interested," Bolton said.
As part of its criteria, the SKA is looking to make the compute facility future-proof. It plans to have the telescope arrays online for 50 years, and needs a platform that can mature over that time without needing to be completely replaced.