The Worldwide LHC Computing Grid (WLCG), formerly (until 2006) the LHC Computing Grid (LCG), is an international collaborative project that consists of a grid-based computer network infrastructure incorporating over 170 computing centers in 42 countries. It was designed by CERN to handle the prodigious volume of data produced by Large Hadron Collider (LHC) experiments.
By 2012, data from over 300 trillion (3×10¹⁴) LHC proton-proton collisions had been analyzed, and LHC collision data was being produced at approximately 25 petabytes per year. The LHC Computing Grid is the world's largest computing grid, comprising over 170 computing facilities in a worldwide network across 42 countries.
Background
The Large Hadron Collider at CERN was designed to test the existence of the Higgs boson, an important but elusive piece of knowledge that had been sought by particle physicists for over 40 years. A very powerful particle accelerator was needed, because Higgs bosons might not be seen in lower-energy experiments and because vast numbers of collisions would need to be studied. Such a collider would also produce unprecedented quantities of collision data requiring analysis. Therefore, advanced computing facilities were needed to process the data.
Description
A design report was published in 2005, and the grid was announced to be ready for data on 3 October 2008. A popular 2008 press article predicted that "the internet could soon be made obsolete" by its technology, prompting CERN to publish its own articles to clear up the confusion.
It incorporates both private fiber optic cable links and existing high-speed portions of the public Internet. At the end of 2010, the Grid consisted of some 200,000 processing cores and 150 petabytes of disk space, distributed across 34 countries.
The data stream from the detectors provides approximately 300 GB/s of data, which after filtering for "interesting events" results in a data stream of about 300 MB/s. The CERN computer center, considered "Tier 0" of the LHC Computing Grid, has a dedicated 10 Gbit/s connection to the counting room.
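As a back-of-envelope check (an illustrative calculation, not a figure from the source), these rates imply that the online filter discards roughly 999 of every 1,000 bytes, and that the filtered stream occupies only about a quarter of a 10 Gbit/s link:

    # Sanity check of the WLCG data rates quoted above (illustrative only).
    raw_rate = 300e9           # detector output: ~300 GB/s, in bytes per second
    filtered_rate = 300e6      # after "interesting event" filtering: ~300 MB/s
    link_bits = 10e9           # dedicated Tier 0 link: 10 Gbit/s

    reduction = raw_rate / filtered_rate   # fraction of data the filter removes
    link_bytes = link_bits / 8             # link capacity in bytes per second

    print(f"Filter reduction: ~{reduction:,.0f}x")            # ~1,000x
    print(f"Link capacity: {link_bytes / 1e9:.2f} GB/s")      # 1.25 GB/s
    print(f"Filtered stream uses {filtered_rate / link_bytes:.0%} of the link")  # 24%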
The project was expected to generate 27 TB of raw data per day, plus 10 TB of "event summary data", which represents the output of calculations done by the CPU farm at the CERN data center. This data is sent out from CERN to thirteen Tier 1 academic institutions in Europe, Asia, and North America, via dedicated links with 10 Gbit/s or higher of bandwidth. This is called the LHC Optical Private Network.
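Distributing that daily volume is well within the stated link capacities. A rough estimate (assuming, simplistically, that transfers are spread evenly over 24 hours; the assumption is not from the source):

    # Average outbound bandwidth needed to ship one day's data to Tier 1 sites,
    # assuming transfers are spread uniformly over 24 hours (a simplification).
    SECONDS_PER_DAY = 24 * 60 * 60
    daily_bytes = (27 + 10) * 1e12      # 27 TB raw + 10 TB event summary data

    avg_rate_gbit = daily_bytes * 8 / SECONDS_PER_DAY / 1e9
    print(f"Average aggregate rate: ~{avg_rate_gbit:.1f} Gbit/s")
    # ~3.4 Gbit/s in total across all Tier 1 links, comfortably below the
    # 10 Gbit/s capacity of even a single dedicated link.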
More than 150 Tier 2 institutions are connected to the Tier 1 institutions by general-purpose national research and education networks.
The data produced by the LHC across its entire distributed computing grid is expected to add up to 10–15 PB each year. In total, the four main detectors at the LHC produced 13 petabytes of data in 2010.
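The daily and annual figures are mutually consistent, as a quick check shows (again an illustrative calculation, using the 37 TB/day quoted earlier):

    # Upper bound on annual volume if the LHC ran every day of the year.
    daily_tb = 27 + 10                        # raw + event summary data, TB/day
    upper_bound_pb = daily_tb * 365 / 1000    # convert TB to PB
    print(f"Year-round upper bound: ~{upper_bound_pb:.1f} PB")
    # ~13.5 PB, in line with the 10-15 PB/year estimate once accelerator
    # downtime is taken into account.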
The Tier 1 institutions receive specific subsets of the raw data, for which they serve as a backup repository for CERN. They also perform reprocessing when recalibration is necessary. The primary configuration for the computers used in the grid is based on CentOS.
Distributed computing resources for analysis by end-user physicists are provided by the Open Science Grid, Enabling Grids for E-sciencE, and LHC@home.