E-Science or eScience is computationally intensive science that is carried out in highly distributed network environments, or science that uses immense data sets that require grid computing; the term sometimes includes technologies that enable distributed collaboration, such as the Access Grid. The term was created by John Taylor, the Director General of the United Kingdom's
Office of Science and Technology in 1999 and was used to describe a large funding initiative starting in November 2000. E-science has been more broadly interpreted since then, as "the application of computer technology to the undertaking of modern scientific investigation, including the preparation, experimentation, data collection, results dissemination, and long-term storage and accessibility of all materials generated through the scientific process. These may include data modeling and analysis, electronic/digitized laboratory notebooks, raw and fitted data sets, manuscript production and draft versions, pre-prints, and print and/or electronic publications."
[Bohle, S. "What is E-science and How Should it Be Managed?" Nature.com, Spektrum der Wissenschaft (Scientific American), http://www.scilogs.com/scientific_and_medical_libraries/what-is-e-science-and-how-should-it-be-managed/.] In 2014, the IEEE eScience Conference Series condensed the definition to "eScience promotes innovation in collaborative, computationally- or data-intensive research across all disciplines, throughout the research lifecycle" in one of the working definitions used by the organizers. E-science encompasses "what is often referred to as big data [which] has revolutionized science... [such as] the Large Hadron Collider (LHC) at CERN... [that] generates around 780 terabytes per year... highly data intensive modern fields of science...that generate large amounts of E-science data include: computational biology, bioinformatics, genomics" and the human digital footprint for the social sciences. [DT&SC 7-2: Computational Social Science. https://www.youtube.com/watch?v=TEo0Au1brHs From the DT&SC online course at the University of California: https://canvas.instructure.com/courses/949415]
Turing Award winner Jim Gray imagined "data-intensive science" or "e-science" as a "fourth paradigm" of science (empirical, theoretical, computational and now data-driven) and asserted that "everything about science is changing because of the impact of information technology" and the data deluge.
E-Science revolutionizes both fundamental legs of the scientific method: empirical research, especially through digital big data; and scientific theory, especially through computer simulation model building. [Hilbert, M. (2015). e-Science for Digital Development: ICT4ICT4D. Centre for Development Informatics, SEED, University of Manchester.] These ideas were reflected by the White House's Office of Science and Technology Policy in February 2013, which slated many of the aforementioned e-Science output products for preservation and access requirements under the memorandum's directive. E-sciences include particle physics, earth sciences and social simulations.
Characteristics and examples
Most of the research activities into e-Science have focused on the development of new computational tools and infrastructures to support scientific discovery. Due to the complexity of the software and the backend infrastructural requirements, e-Science projects usually involve large teams managed and developed by research laboratories, large universities or governments. Currently there is a large focus in e-Science in the United Kingdom, where the UK e-Science programme provides significant funding. In Europe the development of computing capabilities to support the CERN Large Hadron Collider has led to the development of e-Science and Grid infrastructures which are also used by other disciplines.
Consortiums
Example e-Science infrastructures include the Worldwide LHC Computing Grid, a federation with various partners including the European Grid Infrastructure and the Open Science Grid.
To support e-Science applications, Open Science Grid combines interfaces to more than 100 nationwide clusters, 50 interfaces to geographically distributed storage caches, and 8 campus grids (Purdue, Wisconsin-Madison, Clemson, Nebraska-Lincoln, FermiGrid at FNAL, SUNY-Buffalo, and Oklahoma in the United States; and UNESP in Brazil). Areas of science benefiting from Open Science Grid include:
* astrophysics, gravitational physics, high-energy physics, neutrino physics, nuclear physics
* molecular dynamics, materials science, materials engineering, computer science, computer engineering, nanotechnology
* structural biology, computational biology, genomics, proteomics, medicine
UK programme
After his appointment as Director General of the Research Councils in 1999, John Taylor, with the support of the Science Minister David Sainsbury and the Chancellor of the Exchequer Gordon Brown, bid to HM Treasury to fund a programme of e-infrastructure development for science which would provide the foundation for UK science and industry to be a world leader in the knowledge economy which motivated the Lisbon Strategy for sustainable economic growth that the UK government committed to in March 2000.
In November 2000 John Taylor announced £98 million for a national UK e-Science programme. An additional £20 million contribution was planned from UK industry in matching funds to projects that they participated in. From this budget of £120 million over three years, £75 million was to be spent on grid application pilots in all areas of science, administered by the Research Council responsible for each area, while £35 million was to be administered by the EPSRC as a Core Programme to develop "industrial strength" Grid middleware. Phase 2 of the programme for 2004–2006 was supported by a further £96 million for application projects, and £27 million for the EPSRC core programme. Phase 3 of the programme for 2007–2009 was supported by a further £14 million for the EPSRC core programme and a further sum for applications. Additional funding for UK e-Science activities was provided from European Union funding, from university funding council SRIF funding for hardware, and from Jisc for networking and other infrastructure.
The UK e-Science programme comprised a wide range of resources, centres and people including the National e-Science Centre (NeSC) which is managed by the Universities of Glasgow and Edinburgh, with facilities in both cities. Tony Hey led the core programme from 2001 to 2005.
Within the UK, regional e-Science centres support their local universities and projects, including:
* White Rose Grid e-Science Centre (WRGeSC)
* Belfast e-Science Centre (BeSC)
* Centre for eResearch Bristol (CeRB)
* Cambridge e-Science Centre (CeSC)
* STFC e-Science Centre (STFCeSC)
* e-Science North West (eSNW)
* National Grid Service (NGS)
* OMII-UK
* Lancaster University Centre for e-Science
* London e-Science Centre (LeSC)
* North East Regional e-Science Centre (NEReSC)
* Oxford e-Science Centre (OeSC)
* Southampton e-Science Centre (SeSC)
* Welsh e-Science Centre (WeSC)
* (MeSC)
There are also various centres of excellence and research centres.
In addition to centres, the grid application pilot projects were funded by the Research Council responsible for each area of UK science funding.
The EPSRC funded 11 pilot e-Science projects in three phases (for about £3 million each in the first phase):
* First phase (2001–2005): CombEchem, DAME, Discovery Net, GEODISE, myGrid and RealityGrid
* Second phase (2004–2008): GOLD and Integrative Biology
* Third phase (2005–2010): PMSEG (MESSAGE), CARMEN and NanoCMOS
The PPARC/STFC funded two projects: GridPP (phase 1 for £17 million, phase 2 for £5.9 million, phase 3 for £30 million and a 4th phase running from 2011 to 2014) and Astrogrid (£14 million over 3 phases).
The remaining £23 million of phase one funding was divided between the application projects funded by BBSRC, MRC and NERC:
* BBSRC: Biomolecular Grid, Proteome Annotation Pipeline, High-Throughput Structural Biology, Global Biodiversity
* MRC: Biology of Ageing, Sequence and Structure Data, Molecular Genetics, Cancer Management, Clinical e-Science Framework, Neuroinformatics Modeling Tools
* NERC: Climateprediction.com, Oceanographic Grid, Molecular Environmental Grid, NERC DataGrid
The funded UK e-Science programme was reviewed on its completion in 2009 by an international panel led by Daniel E. Atkins, director of the Office of Cyberinfrastructure of the US NSF. The report concluded that the programme had developed a skilled pool of expertise, some services, and had led to cooperation between academia and industry, but that these achievements were at a project level rather than by generating infrastructure or transforming disciplines to adopt e-Science as a normal method of work, and that they were not self-sustainable without further investment.
United States
United States-based initiatives, where the term cyberinfrastructure is typically used to define e-Science projects, are primarily funded by the National Science Foundation office of cyberinfrastructure (NSF OCI) and Department of Energy (in particular the Office of Science).
The Netherlands
Dutch eScience research is coordinated by the Netherlands eScience Center in Amsterdam, an initiative founded by NWO and SURF.
Europe
Plan-Europe is a Platform of National e-Science/Data Research Centers in Europe, established during the constituting meeting on 29–30 October 2014 in Amsterdam, The Netherlands, and based on agreed Terms of Reference. PLAN-E has a kernel group of active members and convenes twice annually. More can be found on PLAN-E.
Sweden
Two academic research projects have been carried out in Sweden by two different groups of universities, to help researchers share and access scientific computing resources and knowledge:
* Swedish e-Science Research Center (SeRC): Kungliga Tekniska högskolan (KTH), Stockholm University (SU), Karolinska institutet (KI) and Linköping University (LiU)
* eSSENCE, The e-Science Collaboration (eSSENCE): Uppsala University, Lund University and Umeå University
Comparison with traditional science
Traditional science is representative of two distinct philosophical traditions within the history of science, but e-Science, it is being argued, requires a paradigm shift, and the addition of a third branch of the sciences. "The idea of open data is not a new one; indeed, when studying the history and philosophy of science, Robert Boyle is credited with stressing the concepts of skepticism, transparency, and reproducibility for independent verification in scholarly publishing in the 1660s. The scientific method later was divided into two major branches, deductive and empirical approaches. Today, a theoretical revision in the scientific method should include a new branch, Victoria Stodden advocate[s], that of the computational approach, where like the other two methods, all of the computational steps by which scientists draw conclusions are revealed. This is because within the last 20 years, people have been grappling with how to handle changes in high performance computing and simulation."
As such, e-science aims at combining both empirical and theoretical traditions, while computer simulations can create artificial data, and real-time big data can be used to calibrate theoretical simulation models.
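The interplay between simulation and data can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the exponential-decay toy model, the function names `simulate` and `calibrate`, and the "observed" series, which is generated here as a stand-in for real measurements. The sketch picks the model parameter whose simulated output best matches the observations, then uses the calibrated model to produce artificial data.

```python
import math


def simulate(rate, steps, start=100.0):
    """Toy model (hypothetical): a quantity decaying exponentially at `rate` per step."""
    return [start * math.exp(-rate * t) for t in range(steps)]


def squared_error(simulated, observed):
    """Sum of squared differences between a simulated and an observed series."""
    return sum((s - o) ** 2 for s, o in zip(simulated, observed))


def calibrate(observed, candidate_rates):
    """Return the candidate rate whose simulation best matches the observations."""
    return min(
        candidate_rates,
        key=lambda r: squared_error(simulate(r, len(observed)), observed),
    )


# Stand-in for empirical data (an assumption: generated with rate 0.30,
# not taken from any real experiment).
observed = simulate(0.30, steps=10)

# Grid search over candidate rates 0.01, 0.02, ..., 1.00.
candidates = [i / 100 for i in range(1, 101)]
best_rate = calibrate(observed, candidates)

# The calibrated model can now generate artificial data beyond the observed range.
artificial = simulate(best_rate, steps=20)
```

A grid search is used only because it is easy to read; a real calibration would more likely rely on a numerical optimiser and on noisy, continuously arriving data.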
Conceptually, e-Science revolves around developing new methods to support scientists in conducting scientific research with the aim of making new scientific discoveries by analyzing vast amounts of data accessible over the internet using vast amounts of computational resources. However, discoveries of value cannot be made simply by providing computational tools, a cyberinfrastructure or by performing a pre-defined set of steps to produce a result. Rather, there needs to be an original, creative aspect to the activity that by its nature cannot be automated. This has led to research that attempts to define the properties that e-Science platforms should provide in order to support a new paradigm of doing science, and new rules to fulfill the requirements of preserving and making computational data results available in a manner such that they are reproducible in traceable, logical steps, as an intrinsic requirement for the maintenance of modern scientific integrity that allows an extenuation of "Boyle's tradition in the computational age".
Modelling e-Science processes
One view argues that since a modern discovery process instance serves a similar purpose to a mathematical proof, it should have similar properties: namely, that it allows results to be deterministically reproduced when re-executed, and that intermediate results can be viewed to aid examination and comprehension. In this case, simply modelling the provenance of data is not sufficient. One also has to model the provenance of the hypotheses and results generated from analyzing the data, so as to provide evidence that supports new discoveries.
Scientific workflows have thus been proposed and developed to assist scientists to track the evolution of their data, intermediate results and final results as a means to document and track the evolution of discoveries within a piece of scientific research.
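The properties described above (deterministic re-execution, inspectable intermediate results, and provenance for derived results as well as data) can be sketched in a few lines. This is a minimal illustration, not the design of any particular scientific workflow system; the `Workflow` class, the `digest` helper, the step names and the sample data are all hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Any, Callable, List, Tuple


def digest(value: Any) -> str:
    """Stable SHA-256 fingerprint of a JSON-serialisable value."""
    payload = json.dumps(value, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


@dataclass
class Workflow:
    """Runs named steps in order and records provenance for each one."""
    steps: List[Tuple[str, Callable[[Any], Any]]] = field(default_factory=list)

    def add(self, name: str, func: Callable[[Any], Any]) -> "Workflow":
        self.steps.append((name, func))
        return self

    def run(self, data: Any) -> Tuple[Any, list]:
        log = []
        for name, func in self.steps:
            result = func(data)
            log.append({
                "step": name,
                "input_sha256": digest(data),
                "output_sha256": digest(result),
                "output": result,  # intermediate result kept for inspection
            })
            data = result
        return data, log


# Hypothetical three-step analysis: clean the data, summarise it, test a hypothesis.
workflow = (
    Workflow()
    .add("drop_negative", lambda xs: [x for x in xs if x >= 0])
    .add("mean", lambda xs: sum(xs) / len(xs))
    .add("hypothesis_mean_above_2", lambda m: {"mean": m, "supported": m > 2})
)

raw = [3.0, -1.0, 4.0, 2.0]
final, provenance = workflow.run(raw)

# Re-executing on the same input yields an identical provenance log.
rerun_final, rerun_provenance = workflow.run(raw)
```

Because every step is a pure function of its input, comparing `provenance` with `rerun_provenance` acts as a simple reproducibility check, and the chained hashes show which intermediate result each conclusion was derived from.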
Science 2.0
Other views include Science 2.0, where e-Science is considered to be a shift from the publication of final results by well-defined collaborative groups towards a more open approach, which includes the public sharing of raw data, preliminary experimental results, and related information. To facilitate this shift, the Science 2.0 view focuses on providing tools that simplify communication, cooperation and collaboration between interested parties. Such an approach has the potential to: speed up the process of scientific discovery; overcome problems associated with academic publishing and peer review; and remove time and cost barriers that limit the process of generating new knowledge.
See also
* Citizen science
* Cyberinfrastructure
* Distributed computing
* E-research
* e-Science librarianship
* e-Social Science
* Grid computing
* List of e-Science infrastructures
* Science 2.0
* Scientific workflow system
External links
* DOE and NSF Open Science Grid
* The eScience Institute at the University of Washington
* The Dutch Virtual Laboratory for e-science (VL-e) project
* UK Research Council's e-Science program
* e-science : personnalisation des résultats de recherches Google et sociologies du web
* UK National Centre for e-Social Science and their Wiki on e-Social Science
* NSF TeraGrid Project
* Arts and Humanities E-Science Support Centre (AHESSC)
* E-Science and Data Services Collaborative (EDSC)
* The European Commission's e-Infrastructures activity
* Swedish e-Science Research Centre
* eSSENCE the e-Science Collaboration
* Cyberinfrastructure