Irreproducible

Reproducibility, also known as replicability and repeatability, is a major principle underpinning the scientific method. For the findings of a study to be reproducible means that results obtained by an experiment or an observational study or in a statistical analysis of a data set should be achieved again with a high degree of reliability when the study is replicated. There are different kinds of replication, but typically replication studies involve different researchers using the same methodology. Only after one or several such successful replications should a result be recognized as scientific knowledge.

With a narrower scope, ''reproducibility'' has been introduced in computational sciences: any results should be documented by making all data and code available in such a way that the computations can be executed again with identical results. In recent decades, there has been a rising concern that many published scientific results fail the test of reproducibility, evoking a reproducibility or replication crisis.


History

The first to stress the importance of reproducibility in science was the Anglo-Irish chemist Robert Boyle, in England in the 17th century. Boyle's air pump was designed to generate and study vacuum, which at the time was a very controversial concept. Indeed, distinguished philosophers such as René Descartes and Thomas Hobbes denied the very possibility of a vacuum's existence.
Historians of science Steven Shapin and Simon Schaffer, in their 1985 book ''Leviathan and the Air-Pump'', describe the debate between Boyle and Hobbes, ostensibly over the nature of vacuum, as fundamentally an argument about how useful knowledge should be gained. Boyle, a pioneer of the experimental method, maintained that the foundations of knowledge should be constituted by experimentally produced facts, which can be made believable to a scientific community by their reproducibility. By repeating the same experiment over and over again, Boyle argued, the certainty of fact will emerge.

The air pump, which in the 17th century was a complicated and expensive apparatus to build, also led to one of the first documented disputes over the reproducibility of a particular scientific phenomenon. In the 1660s, the Dutch scientist Christiaan Huygens built his own air pump in Amsterdam, the first one outside the direct management of Boyle and his assistant at the time, Robert Hooke. Huygens reported an effect he termed "anomalous suspension", in which water appeared to levitate in a glass jar inside his air pump (in fact suspended over an air bubble), but Boyle and Hooke could not replicate this phenomenon in their own pumps. As Shapin and Schaffer describe, “it became clear that unless the phenomenon could be produced in England with one of the two pumps available, then no one in England would accept the claims Huygens had made, or his competence in working the pump”. Huygens was finally invited to England in 1663, and under his personal guidance Hooke was able to replicate anomalous suspension of water. Following this, Huygens was elected a Foreign Member of the Royal Society. However, Shapin and Schaffer also note that “the accomplishment of replication was dependent on contingent acts of judgment. One cannot write down a formula saying when replication was or was not achieved”.

The philosopher of science Karl Popper noted briefly in his famous 1934 book ''The Logic of Scientific Discovery'' that “non-reproducible single occurrences are of no significance to science”. The statistician Ronald Fisher wrote in his 1935 book ''The Design of Experiments'', which set the foundations for the modern scientific practice of hypothesis testing and statistical significance, that “we may say that a phenomenon is experimentally demonstrable when we know how to conduct an experiment which will rarely fail to give us statistically significant results”. Such assertions express a common dogma in modern science that reproducibility is a necessary condition (although not necessarily sufficient) for establishing a scientific fact, and in practice for establishing scientific authority in any field of knowledge. However, as Shapin and Schaffer note above, this dogma, unlike statistical significance for instance, is not well formulated quantitatively, and therefore it is not explicitly established how many times a fact must be replicated to be considered reproducible.


Terminology

''Replicability'' and ''repeatability'' are related terms, broadly or loosely synonymous with reproducibility (for example, among the general public), but they are often usefully differentiated in more precise senses, as follows.

Two major steps are naturally distinguished in connection with the reproducibility of experimental or observational studies. When new data are obtained in the attempt to achieve it, the term ''replicability'' is often used, and the new study is a ''replication'' or ''replicate'' of the original one. When the same results are obtained by analyzing the data set of the original study again with the same procedures, many authors use the term ''reproducibility'' in a narrow, technical sense coming from its use in computational research. ''Repeatability'' refers to the ''repetition'' of the experiment within the same study by the same researchers. Reproducibility in the original, wide sense is only acknowledged if a replication performed by an ''independent research team'' is successful.

Unfortunately, the terms reproducibility and replicability sometimes appear with reversed meanings, even in the scientific literature, when researchers fail to enforce the more precise usage.


Measures of reproducibility and repeatability

In chemistry, the terms reproducibility and repeatability are used with a specific quantitative meaning. In inter-laboratory experiments, a concentration or other quantity of a chemical substance is measured repeatedly in different laboratories to assess the variability of the measurements. The standard deviation of the difference between two values obtained within the same laboratory is then called ''repeatability''. The standard deviation of the difference between two measurements from different laboratories is called ''reproducibility''. These measures are related to the more general concept of variance components in metrology.
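
The two measures defined in this section can be estimated from a balanced inter-laboratory data set. The following is a minimal sketch, assuming a simple one-way random-effects model (one within-laboratory variance plus one between-laboratory variance) and the same number of measurements in every laboratory. The function name and the sample values are hypothetical and are not taken from any standard.

```python
import math
from statistics import mean, variance


def repeatability_reproducibility(labs):
    """Estimate the two standard deviations from balanced inter-laboratory data.

    labs: list of equal-length lists, one list of measurements per laboratory
    (hypothetical input layout chosen for this sketch).
    Returns (repeatability, reproducibility) as standard deviations of the
    difference between two measurements.
    """
    n = len(labs[0])
    if len(labs) < 2 or n < 2 or any(len(lab) != n for lab in labs):
        raise ValueError("need at least 2 labs with the same number (>= 2) of measurements")

    # Pooled within-laboratory variance (simple average is valid for balanced data).
    within_var = mean([variance(lab) for lab in labs])

    # The variance of the laboratory means estimates between_var + within_var / n,
    # so subtract the within-laboratory share; clamp at zero if it goes negative.
    lab_means = [mean(lab) for lab in labs]
    between_var = max(0.0, variance(lab_means) - within_var / n)

    # A difference of two independent measurements has twice the variance of one.
    repeatability = math.sqrt(2 * within_var)
    reproducibility = math.sqrt(2 * (within_var + between_var))
    return repeatability, reproducibility


# Hypothetical data: three laboratories, two measurements each.
example = [[10.0, 10.2], [10.4, 10.6], [9.8, 10.0]]
r, R = repeatability_reproducibility(example)
print(f"repeatability={r:.3f}, reproducibility={R:.3f}")
```

Under this model the reproducibility value can never be smaller than the repeatability value, because it adds the between-laboratory component to the within-laboratory one.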


Reproducible research


Reproducible research method

The term ''reproducible research'' refers to the idea that scientific results should be documented in such a way that their deduction is fully transparent. This requires a detailed description of the methods used to obtain the data, and making the full dataset and the code to calculate the results easily accessible. This is an essential part of open science.

To make any research project computationally reproducible, general practice involves all data and files being clearly separated, labelled, and documented. All operations should be fully documented and automated as much as practicable, avoiding manual intervention where feasible. The workflow should be designed as a sequence of smaller steps that are combined so that the intermediate outputs from one step feed directly as inputs into the next step. Version control should be used, as it lets the history of the project be easily reviewed and allows changes to be documented and tracked in a transparent manner.

A basic workflow for reproducible research involves data acquisition, data processing and data analysis. Data acquisition primarily consists of obtaining primary data from a primary source such as surveys, field observations or experimental research, or obtaining data from an existing source. Data processing involves the processing and review of the raw data collected in the first stage, and includes data entry, data manipulation and filtering; it may be done using software. The data should be digitized and prepared for data analysis. Data may be analysed with the use of software to interpret or visualise statistics or data, producing the desired results of the research, such as quantitative results including figures and tables. The use of software and automation enhances the reproducibility of research methods.

There are systems that facilitate such documentation, like the R Markdown language or the Jupyter notebook. The Open Science Framework provides a platform and useful tools to support reproducible research.
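
The acquisition, processing and analysis workflow described in this section can be illustrated with a small automated pipeline. This is only a sketch under stated assumptions: the "acquired" data are simulated with a fixed random seed as a stand-in for a real primary source, and all function names, thresholds and the use of a SHA-256 digest to compare runs are hypothetical choices made for illustration.

```python
import hashlib
import json
import random
from statistics import mean


def acquire(seed):
    """Step 1, data acquisition (simulated here; a real study would read raw data files)."""
    rng = random.Random(seed)  # fixed seed so the simulated data are identical on every run
    return [round(rng.gauss(50.0, 5.0), 3) for _ in range(20)]


def process(raw):
    """Step 2, data processing: filter out values outside a documented range."""
    return [x for x in raw if 35.0 <= x <= 65.0]


def analyse(clean):
    """Step 3, data analysis: compute the summary results that would be reported."""
    return {"n": len(clean), "mean": round(mean(clean), 3) if clean else None}


def run_pipeline(seed=42):
    """Chain the steps so that each output feeds directly into the next step."""
    result = analyse(process(acquire(seed)))
    # Serialise deterministically and hash, so two runs can be compared exactly.
    payload = json.dumps(result, sort_keys=True).encode("utf-8")
    return result, hashlib.sha256(payload).hexdigest()


first_result, first_digest = run_pipeline()
second_result, second_digest = run_pipeline()
print(first_result, first_digest == second_digest)
```

Because no step needs manual intervention, rerunning the whole chain and comparing digests gives a quick check that the computation still produces identical results.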


Reproducible research in practice

Psychology has seen a renewal of internal concerns about irreproducible results (see the entry on the replicability crisis for empirical results on success rates of replications). Researchers showed in a 2006 study that, of 141 authors of empirical articles published by the American Psychological Association (APA), 103 (73%) did not respond with their data over a six-month period. In a follow-up study published in 2015, it was found that 246 out of 394 contacted authors of papers in APA journals (62%) did not share their data upon request. In a 2012 paper, it was suggested that researchers should publish data along with their works, and a dataset was released alongside as a demonstration. In 2017, an article published in ''Scientific Data'' suggested that this may not be sufficient and that the whole analysis context should be disclosed.

In economics, concerns have been raised in relation to the credibility and reliability of published research. In other sciences, reproducibility is regarded as fundamental and is often a prerequisite to research being published; in economic sciences, however, it is not seen as a priority of the greatest importance. Most peer-reviewed economic journals do not take any substantive measures to ensure that published results are reproducible, although the top economics journals have been moving to adopt mandatory data and code archives. There are few or no incentives for researchers to share their data, and authors would have to bear the costs of compiling data into reusable forms. Economic research is often not reproducible, as only a portion of journals have adequate disclosure policies for datasets and program code, and even where they do, authors frequently do not comply with them or the publisher does not enforce them. A study of 599 articles published in 37 peer-reviewed journals revealed that while some journals have achieved significant compliance rates, a significant portion have only partially complied, or not complied at all. On an article level, the average compliance rate was 47.5%; on a journal level, the average compliance rate was 38%, ranging from 13% to 99%.

A 2018 study published in the journal ''PLOS ONE'' found that 14.4% of a sample of public health researchers had shared their data or code or both. There have been initiatives to improve reporting, and hence reproducibility, in the medical literature for many years, beginning with the CONSORT initiative, which is now part of a wider initiative, the EQUATOR Network. This group has recently turned its attention to how better reporting might reduce waste in research, especially biomedical research.

Reproducible research is key to new discoveries in pharmacology. A Phase I discovery will be followed by Phase II reproductions as a drug develops towards commercial production. In recent decades Phase II success has fallen from 28% to 18%. A 2011 study found that 65% of medical studies were inconsistent when re-tested, and only 6% were completely reproducible.


Noteworthy irreproducible results

Hideyo Noguchi became famous for correctly identifying the bacterial agent of syphilis, but also claimed that he could culture this agent in his laboratory. Nobody else has been able to produce this latter result.

In March 1989, University of Utah chemists Stanley Pons and Martin Fleischmann reported the production of excess heat that could only be explained by a nuclear process ("cold fusion"). The report was astounding given the simplicity of the equipment: it was essentially an electrolysis cell containing heavy water and a palladium cathode which rapidly absorbed the deuterium produced during electrolysis. The news media reported on the experiments widely, and it was a front-page item in many newspapers around the world (see science by press conference). Over the next several months others tried to replicate the experiment, but were unsuccessful.

Nikola Tesla claimed as early as 1899 to have used a high-frequency current to light gas-filled lamps from over 25 miles (40 km) away without using wires. In 1904 he built Wardenclyffe Tower on Long Island to demonstrate means to send and receive power without connecting wires. The facility was never fully operational and was not completed due to economic problems, so no attempt to reproduce his first result was ever carried out. (Cheney, Margaret (1999), ''Tesla, Master of Lightning'', New York: Barnes & Noble Books, p. 107: "Unable to overcome his financial burdens, he was forced to close the laboratory in 1905.")

Other examples in which contrary evidence has refuted the original claim:
* Stimulus-triggered acquisition of pluripotency, revealed to be the result of fraud
* GFAJ-1, a bacterium that could purportedly incorporate arsenic into its DNA in place of phosphorus
* MMR vaccine controversy – a study in ''The Lancet'' claiming the MMR vaccine caused autism was revealed to be fraudulent
* Schön scandal – semiconductor "breakthroughs" revealed to be fraudulent
* Power posing – a social psychology phenomenon that went viral after being the subject of a very popular TED talk, but could not be replicated in dozens of studies


See also

* Metascience
* Accuracy
* ANOVA gauge R&R
* Contingency
* Corroboration
* Reproducible builds
* Falsifiability
* Hypothesis
* Measurement uncertainty
* Pathological science
* Pseudoscience
* Replication (statistics)
* Replication crisis
* ''ReScience C'' (journal)
* Retraction in academic publishing
* Tautology
* Testability
* Verification and validation


Further reading

* "Science is not irrevocably broken, [epidemiologist John Ioannidis] asserts. It just needs some improvements. “Despite the fact that I’ve published papers with pretty depressive titles, I’m actually an optimist,” Ioannidis says. “I find no other investment of a society that is better placed than science.”"


External links


* Transparency and Openness Promotion Guidelines from the Center for Open Science
* Guidelines for Evaluating and Expressing the Uncertainty of NIST Measurement Results of the National Institute of Standards and Technology
* Reproducible papers with artifacts by the cTuning foundation
* ReproducibleResearch.net