R packages are extensions to the R statistical programming language. R packages contain code, data, and documentation in a standardised collection format that can be installed by users of R, typically via a centralised software repository such as CRAN (the Comprehensive R Archive Network). The large number of packages available for R, and the ease of installing and using them, have been cited as a major factor driving the widespread adoption of the language in data science.
Compared to libraries in other programming languages, R packages must conform to a relatively strict specification. The ''Writing R Extensions'' manual specifies a standard directory structure for R source code, data, documentation, and package metadata, which enables them to be installed and loaded using R's in-built package management tools. Packages distributed on CRAN must meet additional standards. According to John Chambers, whilst these requirements "impose considerable demands" on package developers, they improve the usability and long-term stability of packages for end users.
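The package metadata mentioned above is kept in a plain-text file named DESCRIPTION at the top of the source package, written as "Field: value" records in which indented lines continue the previous field. The following is a minimal sketch, in Python purely for illustration, of how such a file could be read. The package name `examplepkg`, its field values, and the helper functions `parse_description` and `parse_dependencies` are hypothetical and are not part of R's own tooling.

```python
def parse_description(text):
    """Parse 'Field: value' records; indented lines continue the previous field."""
    fields = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line[0] in " \t" and current is not None:
            # Continuation line: append to the field being read.
            fields[current] += " " + line.strip()
        else:
            name, _, value = line.partition(":")
            current = name.strip()
            fields[current] = value.strip()
    return fields


def parse_dependencies(value):
    """Split a comma-separated dependency field into (name, constraint) pairs."""
    deps = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, rest = item.partition("(")
        constraint = rest.rstrip(")").strip() or None
        deps.append((name.strip(), constraint))
    return deps


# Hypothetical metadata for an imaginary package.
EXAMPLE = """\
Package: examplepkg
Version: 0.1.0
Title: A Hypothetical Example Package
Description: Illustrates the metadata fields that a source package
    declares in its DESCRIPTION file.
License: GPL-3
Depends: R (>= 3.5.0)
Imports: stats, utils
"""

meta = parse_description(EXAMPLE)
print(meta["Package"], meta["Version"])
print(parse_dependencies(meta["Depends"]))
print(parse_dependencies(meta["Imports"]))
```

The sketch only covers the simple cases shown; the ''Writing R Extensions'' manual remains the authoritative description of the file format and of which fields are mandatory.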
Repositories
Comprehensive R Archive Network (CRAN)
The Comprehensive R Archive Network (CRAN) is R's central software repository, supported by the R Foundation. It contains an archive of the latest and previous versions of the R distribution, documentation, and contributed R packages. It includes both source packages and pre-compiled binaries for Windows and macOS. More than 16,000 packages are available. CRAN was created by Kurt Hornik and Friedrich Leisch in 1997, with the name paralleling other early packaging systems such as TeX's CTAN (released 1992) and Perl's CPAN (released 1995). It is still maintained by Hornik and a team of volunteers. The master site is located at the Vienna University of Economics and Business and is mirrored on servers around the world.
The "Task Views" page (subject list) on the CRAN website lists a wide range of tasks (in fields such as finance, genetics, high performance computing, machine learning, medical imaging, meta-analysis, social sciences and spatial statistics) for which R packages are available. Another way to browse CRAN packages is provided by Metacran, which also maintains lists of featured, most downloaded, trending or most depended-upon packages.
The number of CRAN packages has grown exponentially for many years, and an average of 21 submissions of new or updated packages were made every day. Since each submission is manually reviewed by a small team of CRAN maintainers, many of whom, according to R core developer Peter Dalgaard, are "approaching pensionable age", there is a concern that this system is not sustainable in the long term.
The growth of CRAN has exposed limitations of its dependency management infrastructure, particularly the fact that it assumes that dependencies always refer to the latest version of a package, meaning that new releases of CRAN packages must always be backwards compatible, and that CRAN packages cannot have dependencies that are not on CRAN. It has also led to concerns about declining quality of packages.
MRAN and RStudio Package Manager
The Microsoft R Application Network (MRAN) is a mirror of CRAN maintained by Microsoft which is based on the company's downstream distribution of R, Microsoft R Open (formerly Revolution R Open). It also includes an archive of daily CRAN snapshots, branded as the "CRAN Time Machine", which enables users of MRAN to bypass the dependency versioning limitations of CRAN by installing a fixed set of R package versions via the checkpoint package.
RStudio Package Manager is a similar tool produced by RStudio, which in addition to CRAN snapshots includes an archive of R packages from Bioconductor and Python packages from the Python Package Index. It also distributes pre-compiled binary packages for Linux (only Windows and macOS binaries are included on CRAN).
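The difference between resolving a dependency to the latest release and resolving it against a dated snapshot can be sketched as follows. This is an illustrative model in Python only, not how CRAN, MRAN, or RStudio Package Manager are implemented. The packages `pkgA` and `pkgB`, their release dates and versions, and the functions `latest` and `as_of` are all hypothetical.

```python
from datetime import date

# Hypothetical release history: package -> list of (release date, version).
RELEASES = {
    "pkgA": [(date(2019, 1, 10), "1.0.0"), (date(2020, 3, 2), "2.0.0")],
    "pkgB": [(date(2018, 6, 1), "0.9.1"), (date(2019, 11, 20), "0.9.2")],
}


def latest(package):
    """Resolve as a plain repository would: the newest release wins."""
    return max(RELEASES[package])[1]


def as_of(package, snapshot):
    """Resolve against a dated snapshot: newest release on or before that day."""
    candidates = [r for r in RELEASES[package] if r[0] <= snapshot]
    if not candidates:
        raise LookupError(f"{package} had no release by {snapshot}")
    return max(candidates)[1]


snapshot = date(2019, 12, 31)
print({p: latest(p) for p in RELEASES})           # newest versions
print({p: as_of(p, snapshot) for p in RELEASES})  # versions frozen at the snapshot
```

Under the first rule, code written against the hypothetical `pkgA` 1.0.0 is silently moved to 2.0.0, which is why new releases need to stay backwards compatible. Under the second, every user who picks the same snapshot date obtains the same set of versions.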
Other repositories
The Bioconductor project provides R packages for the analysis of genomic data. This includes object-oriented data-handling and analysis tools for data from Affymetrix, cDNA microarray, and next-generation high-throughput sequencing methods.
R-Forge is a central platform for the collaborative development of R packages, R-related software, and projects. R-Forge also hosts many unpublished beta packages, and development versions of CRAN packages.
Base and recommended packages
R is distributed with fifteen "base packages": base, compiler, datasets, grDevices, graphics, grid, methods, parallel, splines, stats, stats4, tcltk, tools, translations, and utils.
In addition, there are fifteen "recommended packages" from CRAN which are included with binary distributions of R: KernSmooth, MASS, Matrix, boot, class, cluster, codetools, foreign, lattice, mgcv, nlme, nnet, rpart, spatial, and survival.
Other packages
A group of packages called the Tidyverse, which can be considered a "dialect of the R language", is increasingly popular in the R ecosystem. As of 2020-06-13, Metacran listed 7 of the 8 core packages of the Tidyverse in the list of most downloaded R packages. The group of packages strives to provide a cohesive collection of functions to deal with common data science tasks, including data import, cleaning, transformation and visualisation (notably with the ggplot2 package).
The R Infrastructure packages support coding and the development of R packages. As of 2021-05-04, Metacran listed 16 of these packages among the 25 most downloaded packages.
See also
* Tidyverse
* ggplot2
* knitr
External links
* The Comprehensive R Archive Network (CRAN)
* METACRAN, a directory of R packages
* CRAN Task Views, a listing of CRAN packages by topic