A bagplot, or starburst plot, is a method in robust statistics for visualizing two- or three-dimensional statistical data, analogous to the one-dimensional box plot. Introduced in 1999 by Rousseeuw et al., the bagplot allows one to visualize the location, spread, skewness, and outliers of a data set.


Construction

The bagplot consists of three nested polygons, called the "bag", the "fence", and the "loop".
* The inner polygon, called the ''bag'', is constructed on the basis of Tukey depth. The Tukey depth of a point is the smallest number of observations contained in any half-plane that also contains that point. The bag contains at most 50% of the data points.
* The outermost of the three polygons, called the ''fence'', is not drawn as part of the bagplot, but is used to construct it. It is formed by inflating the bag by a certain factor (usually 3). Observations outside the fence are flagged as outliers.
* The observations that are not marked as outliers are surrounded by a ''loop'', the convex hull of the observations within the fence.
An asterisk symbol (*) near the center of the graph marks the depth median, the point with the highest possible Tukey depth. Each observation between the bag and the fence is joined to the bag by a line segment lying on the line from that observation to the depth median.
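The Tukey depth that underlies the bag can be sketched in a few lines of Python. This is a minimal brute-force illustration, not the algorithm from the original paper: the function names `tukey_depth` and `depth_median_among_points` are hypothetical, the approach costs quadratic time per query, and the depth median is searched only among the observations themselves, although in general it need not be a data point. Cross products are compared to zero exactly, which is reasonable only for data without rounding noise.

```python
import numpy as np

def tukey_depth(p, points):
    """Smallest number of observations in any closed half-plane containing p.

    Brute-force sketch: every candidate boundary line passes through p and
    one observation q; tilting that line slightly in either direction gives
    the half-planes whose counts are compared.
    """
    d = np.asarray(points, dtype=float) - np.asarray(p, dtype=float)
    moving = d[np.any(d != 0.0, axis=1)]   # observations not equal to p
    coincident = len(d) - len(moving)      # these lie in every half-plane
    if len(moving) == 0:
        return coincident
    best = len(moving)
    for v in moving:                       # boundary direction p -> q
        cross = v[0] * moving[:, 1] - v[1] * moving[:, 0]
        dot = moving @ v
        left = np.count_nonzero(cross > 0)
        right = np.count_nonzero(cross < 0)
        # Observations on the boundary line, on either side of p.
        ahead = np.count_nonzero((cross == 0) & (dot > 0))
        behind = np.count_nonzero((cross == 0) & (dot < 0))
        # A slight tilt keeps one side plus one of the two rays.
        best = min(best, min(left, right) + min(ahead, behind))
    return coincident + best

def depth_median_among_points(points):
    """Deepest observation (simplification: only data points are candidates)."""
    depths = [tukey_depth(p, points) for p in points]
    deepest = np.asarray(points, dtype=float)[int(np.argmax(depths))]
    return tuple(float(c) for c in deepest)

# Four corners of a unit square plus its center.
square = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5)]
print([tukey_depth(p, square) for p in square])  # [1, 1, 1, 1, 3]
```

The corners have depth 1 because a half-plane can isolate each of them, while any half-plane containing the center also contains at least two corners.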
The three-dimensional version consists of an inner and outer bag. The outer bag must be drawn in transparent colors so that the inner bag remains visible.
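For the two-dimensional case, the fence step described above (inflate the bag, then flag whatever falls outside) can be illustrated as follows. This is a sketch under stated assumptions: the bag is supplied as a convex polygon with vertices in counter-clockwise order, the inflation is taken relative to the depth median, and the helper names `inflate`, `inside_convex`, and `flag_outliers` are hypothetical. Computing the bag itself is not shown.

```python
import numpy as np

def inflate(polygon, center, factor=3.0):
    """Scale polygon vertices away from center (assumed: the depth median)."""
    poly = np.asarray(polygon, dtype=float)
    c = np.asarray(center, dtype=float)
    return c + factor * (poly - c)

def inside_convex(polygon, point):
    """True if point lies in a convex polygon given in counter-clockwise order."""
    poly = np.asarray(polygon, dtype=float)
    edges = np.roll(poly, -1, axis=0) - poly        # vertex i -> vertex i+1
    rel = np.asarray(point, dtype=float) - poly     # vertex i -> point
    cross = edges[:, 0] * rel[:, 1] - edges[:, 1] * rel[:, 0]
    return bool(np.all(cross >= 0))                 # on or left of every edge

def flag_outliers(points, bag, center, factor=3.0):
    """Mark observations that fall outside the fence (the inflated bag)."""
    fence = inflate(bag, center, factor)
    return [not inside_convex(fence, p) for p in points]

# Hypothetical bag: a square around a depth median at the origin.
bag = [(-1, -1), (1, -1), (1, 1), (-1, 1)]
print(flag_outliers([(2, 2), (4, 0), (0, 0)], bag, center=(0, 0)))
# [False, True, False]
```

With a factor of 3 the fence is the square with corners at (±3, ±3), so (4, 0) is the only flagged observation. The loop would then be the convex hull of the unflagged observations.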


Properties

The bagplot is invariant under affine transformations of the plane, and robust against outliers.
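The affine invariance can be traced back to the Tukey depth itself. The following short derivation is a sketch, with the symbols T, A, b, u, and c introduced here for illustration: an invertible affine map sends half-planes to half-planes, and does so bijectively, so the counts that define the depth are unchanged.

```latex
\[
T(x) = Ax + b, \qquad \det A \neq 0,
\]
\[
H = \{x : u^{\top} x \ge c\}
\;\Longrightarrow\;
T(H) = \{y : (A^{-\top}u)^{\top} y \ge c + (A^{-\top}u)^{\top} b\},
\]
\[
\#\{i : x_i \in H\} = \#\{i : T(x_i) \in T(H)\}
\;\Longrightarrow\;
\operatorname{depth}\bigl(T(p);\, T(x_1),\dots,T(x_n)\bigr)
= \operatorname{depth}\bigl(p;\, x_1,\dots,x_n\bigr).
\]
```

Since the bag is defined through depth, it transforms along with the data. Assuming the fence is obtained by scaling about the depth median m, the same holds for the fence, because T(m + 3(x - m)) = T(m) + 3(T(x) - T(m)). The loop follows as well, since convex hulls commute with affine maps.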

