The tidyverse is a collection of open source packages for the R programming language, introduced by Hadley Wickham and his team, that "share an underlying design philosophy, grammar, and data structures" of tidy data. Characteristic features of tidyverse packages include extensive use of non-standard evaluation and the encouragement of piping. As of November 2018, the tidyverse package and some of its individual packages made up 5 of the top 10 most downloaded R packages. The tidyverse is the subject of multiple books and papers. In 2019, a paper describing the ecosystem was published in the Journal of Open Source Software. Critics of the tidyverse have argued that it promotes tools that are harder to teach and learn than their base-R equivalents and that are too dissimilar to other programming languages. On the other hand, some have argued that the tidyverse is a very effective way to introduce complete beginners to programming, as it allows students to quickly begin doing powerful data processing tasks.
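
The piping and non-standard evaluation mentioned above can be illustrated with a short sketch. It is only an illustration, not an example taken from the tidyverse documentation: it assumes the dplyr package is installed, uses the `mtcars` data set that ships with R, and the names `result` and `mean_mpg` are chosen here for the example.

```r
library(dplyr)

# Piping: each step passes its result to the next function via %>%.
# Non-standard evaluation: cyl, gear and mpg are column names of mtcars,
# not variables defined in the calling environment.
result <- mtcars %>%
  filter(cyl == 4) %>%               # keep only four-cylinder cars
  group_by(gear) %>%                 # one group per number of gears
  summarise(mean_mpg = mean(mpg))    # average fuel economy per group

# A base-R version of the filtering step, for comparison:
# the column has to be qualified explicitly with mtcars$.
four_cyl <- mtcars[mtcars$cyl == 4, ]
```

In the piped version the data frame is named once and the columns are referred to by bare name, which is the style the tidyverse encourages. The base-R line shows the same filter written with explicit indexing.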


Packages

The core packages, which provide functionality to model, transform, and visualize data, include:
* ggplot2
* dplyr
* tidyr
* readr
* purrr
* tibble
* stringr
* forcats

Additional packages assist the core collection. There is also a growing range of packages based on tidy data principles, such as tidytext for text analysis, tidymodels for machine learning, and tidyquant for financial operations.

