arXiv (pronounced "archive"; the X represents the Greek letter chi ⟨χ⟩) is an open-access repository of electronic preprints and postprints (known as e-prints) approved for posting after moderation, but not peer review. It consists of scientific papers in the fields of mathematics, physics, astronomy, electrical engineering, computer science, quantitative biology, statistics, mathematical finance and economics, which can be accessed online. In many fields of mathematics and physics, almost all scientific papers are self-archived on the arXiv repository before publication in a peer-reviewed journal. Some publishers also grant permission for authors to archive the peer-reviewed postprint. Begun on August 14, 1991, arXiv.org passed the half-million-article milestone on October 3, 2008, and reached one million articles by the end of 2014. As of April 2021, the submission rate was about 16,000 articles per month.


History

arXiv was made possible by the compact TeX file format, which allowed scientific papers to be easily transmitted over the Internet and rendered client-side. Around 1990, Joanne Cohn began emailing physics preprints to colleagues as TeX files, but the number of papers being sent soon filled mailboxes to capacity. Paul Ginsparg recognized the need for central storage, and in August 1991 he created a central repository mailbox stored at the Los Alamos National Laboratory (LANL) which could be accessed from any computer. Additional modes of access were soon added: FTP in 1991, Gopher in 1992, and the World Wide Web in 1993. The term e-print was quickly adopted to describe the articles.

It began as a physics archive, called the LANL preprint archive, but soon expanded to include astronomy, mathematics, computer science, quantitative biology and, most recently, statistics. Its original domain name was xxx.lanl.gov. Due to LANL's lack of interest in the rapidly expanding technology, in 2001 Ginsparg moved to Cornell University and changed the name of the repository to arXiv.org. It is now hosted principally by Cornell, with five mirrors around the world.

arXiv was an early adopter and promoter of preprints. Its success in sharing preprints was one of the precipitating factors that led to the later movement in scientific publishing known as open access. Mathematicians and scientists regularly upload their papers to arXiv.org for worldwide access and sometimes for reviews before they are published in peer-reviewed journals. Ginsparg was awarded a MacArthur Fellowship in 2002 for his establishment of arXiv.

The annual budget for arXiv was approximately $826,000 for 2013 to 2017, funded jointly by Cornell University Library, the Simons Foundation (in both gift and challenge grant forms) and annual fee income from member institutions. This model arose in 2010, when Cornell sought to broaden the funding of the project by asking institutions to make annual voluntary contributions based on the amount of download usage by each institution. Each member institution pledges a five-year funding commitment to support arXiv. Based on institutional usage ranking, the annual fees are set in four tiers from $1,000 to $4,400. Cornell's goal is to raise at least $504,000 per year through membership fees generated by approximately 220 institutions.

In September 2011, Cornell University Library took overall administrative and financial responsibility for arXiv's operation and development. Ginsparg was quoted in the ''Chronicle of Higher Education'' as saying it "was supposed to be a three-hour tour, not a life sentence". However, Ginsparg remains on arXiv's Scientific Advisory Board and its Physics Advisory Committee.
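
The membership figures quoted above can be checked with a little arithmetic. The following is a minimal sketch that uses only the numbers given in the text; the function names are illustrative, and it assumes the membership target and the 2013 to 2017 budget figure refer to the same period.

```python
# Figures quoted in the text above (all in US dollars).
ANNUAL_BUDGET = 826_000            # approximate annual budget, 2013 to 2017
MEMBERSHIP_TARGET = 504_000        # minimum to raise per year from membership fees
MEMBER_COUNT = 220                 # approximate number of member institutions
TIER_MIN, TIER_MAX = 1_000, 4_400  # lowest and highest annual fee tiers


def average_fee(target: int, members: int) -> float:
    """Average annual fee per institution needed to reach the target."""
    return target / members


def membership_share(target: int, budget: int) -> float:
    """Fraction of the annual budget that the membership target would cover."""
    return target / budget


avg = average_fee(MEMBERSHIP_TARGET, MEMBER_COUNT)
share = membership_share(MEMBERSHIP_TARGET, ANNUAL_BUDGET)

print(f"average fee needed per institution: ${avg:,.0f}")
print(f"falls within the tier range: {TIER_MIN <= avg <= TIER_MAX}")
print(f"share of annual budget from membership: {share:.0%}")
```

Under these assumptions the required average fee is about $2,291 per institution, which sits inside the $1,000 to $4,400 tier range, and the membership target corresponds to roughly 61% of the annual budget.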


Moderation process and endorsement

Although arXiv is not peer-reviewed, moderators for each area review the submissions; they may recategorize any that are deemed off-topic, or reject submissions that are not scientific papers, or sometimes for undisclosed reasons. The lists of moderators for many sections of arXiv are publicly available, but moderators for most of the physics sections remain unlisted.

Additionally, an "endorsement" system was introduced in 2004 as part of an effort to ensure content is relevant and of interest to current research in the specified disciplines. Under the system, for categories that use it, an author must be endorsed by an established arXiv author before being allowed to submit papers to those categories. Endorsers are not asked to review the paper for errors, but to check whether the paper is appropriate for the intended subject area. New authors from recognized academic institutions generally receive automatic endorsement, which in practice means that they do not need to deal with the endorsement system at all. However, the endorsement system has attracted criticism for allegedly restricting scientific inquiry.

A majority of the e-prints are also submitted to journals for publication, but some work, including some very influential papers, remains purely as e-prints and is never published in a peer-reviewed journal. A well-known example of the latter is an outline of a proof of Thurston's geometrization conjecture, including the Poincaré conjecture as a particular case, uploaded by Grigori Perelman in November 2002. Perelman appears content to forgo the traditional peer-reviewed journal process, stating: "If anybody is interested in my way of solving the problem, it's all there – let them go and read about it". Despite this non-traditional method of publication, other mathematicians recognized this work by offering the Fields Medal and Clay Mathematics Millennium Prizes to Perelman, both of which he refused.

While arXiv does contain some dubious e-prints, such as those claiming to refute famous theorems or to prove famous conjectures such as Fermat's Last Theorem using only high-school mathematics, a 2002 article which appeared in ''Notices of the American Mathematical Society'' described those as "surprisingly rare". arXiv generally re-classifies these works, e.g. in "General mathematics", rather than deleting them; however, some authors have voiced concern over the lack of transparency in the arXiv screening process.


Submission formats

Papers can be submitted in any of several formats, including LaTeX, and PDF printed from a word processor other than TeX or LaTeX. The submission is rejected by the arXiv software if generating the final PDF file fails, if any image file is too large, or if the total size of the submission is too large. arXiv now allows one to store and modify an incomplete submission, and only finalize the submission when ready. The time stamp on the article is set when the submission is finalized.
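
The three rejection conditions described above can be sketched as a small check. This is an illustration only, not arXiv's actual software: the `SubmittedFile` type, the `rejection_reasons` function and both size limits are hypothetical, since the text does not state the real thresholds.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical limits for illustration; the real thresholds are not given in the text.
MAX_IMAGE_BYTES = 5 * 1024 * 1024
MAX_TOTAL_BYTES = 50 * 1024 * 1024


@dataclass
class SubmittedFile:
    name: str
    size_bytes: int
    is_image: bool = False


def rejection_reasons(
    files: List[SubmittedFile],
    pdf_generated: bool,
    max_image_bytes: int = MAX_IMAGE_BYTES,
    max_total_bytes: int = MAX_TOTAL_BYTES,
) -> List[str]:
    """Return the reasons a submission would be rejected (empty list if none)."""
    reasons = []
    # Condition 1: generating the final PDF failed.
    if not pdf_generated:
        reasons.append("final PDF could not be generated")
    # Condition 2: any single image file is too large.
    for f in files:
        if f.is_image and f.size_bytes > max_image_bytes:
            reasons.append(f"image file too large: {f.name}")
    # Condition 3: the submission as a whole is too large.
    if sum(f.size_bytes for f in files) > max_total_bytes:
        reasons.append("total submission size too large")
    return reasons


example = [
    SubmittedFile("paper.tex", 40_000),
    SubmittedFile("figure1.png", 300_000, is_image=True),
]
print(rejection_reasons(example, pdf_generated=True))
```

Returning a list of reasons rather than a single yes/no value lets an author see every problem with a stored, incomplete submission before finalizing it.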


Access

The standard access route is through the arXiv.org website or one of several mirrors. Other interfaces and access routes have also been created by unaffiliated organisations. Metadata for arXiv is made available through OAI-PMH, the standard for open access repositories. Content is therefore indexed in all major consumers of such data, such as BASE, CORE and Unpaywall. As of 2020, the Unpaywall dump links over 500,000 arXiv URLs as the open access version of a work found in CrossRef data from the publishers, making arXiv a top 10 global host of green open access. Finally, researchers can select sub-fields and receive daily e-mailings or RSS feeds of all submissions in them.
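
As a rough illustration of how a metadata consumer might use OAI-PMH, the sketch below builds a `ListRecords` request URL and extracts Dublin Core fields from a response. The base URL `https://example.org/oai`, the set name `physics` and the sample response are placeholders invented for this example, not arXiv's actual endpoint or data; only the verb, the query parameter names and the XML namespaces come from the OAI-PMH 2.0 protocol itself. The code performs no network access and ignores resumption tokens, which a real harvester would need to follow.

```python
from typing import Dict, List, Optional
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(
    base_url: str,
    metadata_prefix: str = "oai_dc",
    set_spec: Optional[str] = None,
    from_date: Optional[str] = None,
) -> str:
    """Build a ListRecords request URL for an OAI-PMH repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    if from_date:
        params["from"] = from_date  # date in YYYY-MM-DD form
    return f"{base_url}?{urlencode(params)}"


def parse_records(xml_text: str) -> List[Dict[str, object]]:
    """Extract identifier, title and creators from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": record.findtext(
                "oai:header/oai:identifier", default="", namespaces=NS),
            "title": record.findtext(".//dc:title", default="", namespaces=NS),
            "creators": [c.text or "" for c in record.iterfind(".//dc:creator", NS)],
        })
    return records


# Invented sample response, shaped like an oai_dc ListRecords reply.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:0001</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example e-print</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()

url = list_records_url("https://example.org/oai", set_spec="physics",
                       from_date="2021-04-01")
records = parse_records(SAMPLE_RESPONSE)
print(url)
print(records)
```

Keeping URL construction and response parsing in separate functions means the parser can be exercised against saved responses without contacting any repository.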


Copyright status of files

Files on arXiv can have a number of different copyright statuses:

1. Some are public domain, in which case they will have a statement saying so.
2. Some are available under either the Creative Commons 4.0 Attribution-ShareAlike license or the Creative Commons 4.0 Attribution-Noncommercial-ShareAlike license.
3. Some are copyright to the publisher, but the author has the right to distribute them and has given arXiv a non-exclusive irrevocable license to distribute them.
4. Most are copyright to the author, and arXiv has only a non-exclusive irrevocable license to distribute them.


See also

* List of preprint repositories
* List of academic databases and search engines
* List of academic journals by preprint policy

