
UpSet plots are a data visualization method for showing set data with more than three intersecting sets. UpSet shows intersections in a matrix, with the rows of the matrix corresponding to the sets and the columns to the intersections between these sets (or vice versa). The sizes of the sets and of the intersections are shown as bar charts.


History

UpSet plots were first proposed in 2014. The first prototype was implemented as an interactive, web-based application. UpSet plots are related to mosaic plots, although mosaic plots are designed for categorical data rather than set data. UpSet plots gained popularity once they became available as UpSetR, an R library based on ggplot2, and they were subsequently re-implemented in other programming languages, including Python. As of May 2022, UpSetR had been downloaded from CRAN more than 1 million times. UpSet plots are now frequently used instead of Venn diagrams, especially in the life sciences.


Usage

UpSet plots visualize intersections between sets in a matrix. In a vertical UpSet plot, the columns of the matrix correspond to the sets and the rows correspond to the intersections. In each row, the cells of the sets that take part in the intersection are filled in. If a row has multiple filled-in cells, they are connected with a line to emphasize the reading direction of the plot. Because sets vary in size, the size of each set is plotted as a bar chart on top of the columns. The sizes of the intersections are shown aligned with the rows, also as bar charts. This layout makes it easy to compare the sizes of individual intersections, because bar lengths are easy to compare. UpSet plots can be laid out horizontally or vertically, and they can be sorted in various ways. A common approach is to sort by cardinality (the size of an intersection), which places the biggest intersections on top. Alternatives are to sort by the degree of the intersection (the number of sets involved) or by sets. UpSet plots can also visualize attributes of the intersections by placing attribute visualizations next to the bar charts. Common choices are compact visualizations of distributions, such as box plots or violin plots. Advanced features of UpSet plots include querying, grouping and aggregating data. These features tend to be available only in interactive, web-based implementations of UpSet.
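The matrix layout and the sort orders described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code of any UpSet implementation: the example sets and the helper names (`exclusive_intersections`, `sort_by_cardinality`, `sort_by_degree`, `render`) are hypothetical, and it assumes that each row counts the elements belonging to exactly that combination of sets and no others.

```python
from collections import Counter

# Hypothetical example data: set name -> members.
sets = {
    "A": {1, 2, 3, 4, 5},
    "B": {4, 5, 6, 7},
    "C": {5, 7, 8},
}

def exclusive_intersections(sets):
    """Count the elements that belong to exactly each combination of sets.

    Assumption: intersections are exclusive, so every element is counted
    in one combination only.
    """
    names = sorted(sets)
    counts = Counter()
    for element in set().union(*sets.values()):
        membership = tuple(n for n in names if element in sets[n])
        counts[membership] += 1
    return counts

def sort_by_cardinality(counts):
    """Largest intersections first; ties broken by set names."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

def sort_by_degree(counts):
    """Intersections involving fewer sets first; ties broken by set names."""
    return sorted(counts.items(), key=lambda kv: (len(kv[0]), kv[0]))

def render(rows, names):
    """Text version of a vertical UpSet plot: one row per intersection."""
    lines = []
    for combo, size in rows:
        cells = " ".join("x" if n in combo else "." for n in names)
        lines.append(f"{cells}  {'#' * size} {size}")
    return "\n".join(lines)

counts = exclusive_intersections(sets)
names = sorted(sets)
print(" ".join(names))
print(render(sort_by_cardinality(counts), names))
```

Each printed row marks the participating sets with `x`, followed by a text bar for the intersection size. Passing `sort_by_degree(counts)` to `render` instead lists the single-set rows first and the three-way intersection last.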


Benefits and Limitations

UpSet plots tend to perform better than Venn diagrams for larger numbers of sets and when it is desirable to also show contextual information about the set intersections. For visualizing fewer than three sets, or when there are only a few intersections, Venn and Euler diagrams are generally preferred, because they tend to be more familiar and intuitive to read. UpSet plots are limited to displaying roughly 20 to 30 sets, though the exact limit depends on the data. An alternative approach for larger datasets is to show a co-occurrence heat map, though such heat maps cannot show higher-order intersections.
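The limitation of co-occurrence heat maps can be illustrated with a small sketch. It assumes that a heat map cell holds the number of elements shared by a pair of sets; the two example collections and the `cooccurrence` helper are hypothetical. Both collections yield the same pairwise matrix but differ in their three-way intersection, which is exactly the information a pairwise view does not carry.

```python
def cooccurrence(sets):
    """Pairwise overlap counts: the values a co-occurrence heat map would show."""
    names = sorted(sets)
    return {(a, b): len(sets[a] & sets[b]) for a in names for b in names}

# Two hypothetical collections with identical pairwise overlaps.
first = {"A": {1, 2}, "B": {1, 3}, "C": {1, 4}}
second = {"A": {1, 2}, "B": {1, 3}, "C": {2, 3}}

same_matrix = cooccurrence(first) == cooccurrence(second)

# Size of the intersection of all three sets in each collection.
triple_first = len(set.intersection(*first.values()))
triple_second = len(set.intersection(*second.values()))

print(same_matrix, triple_first, triple_second)
```

In the first collection one element is shared by all three sets, while in the second every pair shares a different element, so the heat maps match even though the higher-order intersections do not.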


See also

* Venn diagram
* Euler diagram

