HOME
*



picture info

Boxplot
In descriptive statistics, a box plot or boxplot is a method for graphically demonstrating the locality, spread and skewness groups of numerical data through their quartiles. In addition to the box on a box plot, there can be lines (which are called ''whiskers'') extending from the box indicating variability outside the upper and lower quartiles, thus, the plot is also termed as the box-and-whisker plot and the box-and-whisker diagram. Outliers that differ significantly from the rest of the dataset may be plotted as individual points beyond the whiskers on the box-plot. Box plots are non-parametric: they display variation in samples of a statistical population without making any assumptions of the underlying statistical distribution (though Tukey's boxplot assumes symmetry for the whiskers and normality for their length). The spacings in each subsection of the box-plot indicate the degree of dispersion (spread) and skewness of the data, which are usually described using the five-nu ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Quartile
In statistics, a quartile is a type of quantile which divides the number of data points into four parts, or ''quarters'', of more-or-less equal size. The data must be ordered from smallest to largest to compute quartiles; as such, quartiles are a form of order statistic. The three main quartiles are as follows: * The first quartile (''Q''1) is defined as the middle number between the smallest number (minimum) and the median of the data set. It is also known as the ''lower'' or ''25th empirical'' quartile, as 25% of the data is below this point. * The second quartile (''Q''2) is the median of a data set; thus 50% of the data lies below this point. * The third quartile (''Q''3) is the middle value between the median and the highest value (maximum) of the data set. It is known as the ''upper'' or ''75th empirical'' quartile, as 75% of the data lies below this point. Along with the minimum and maximum of the data (which are also quartiles), the three quartiles described above provid ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

First Quartile
In statistics, a quartile is a type of quantile which divides the number of data points into four parts, or ''quarters'', of more-or-less equal size. The data must be ordered from smallest to largest to compute quartiles; as such, quartiles are a form of order statistic. The three main quartiles are as follows: * The first quartile (''Q''1) is defined as the middle number between the smallest number (minimum) and the median of the data set. It is also known as the ''lower'' or ''25th empirical'' quartile, as 25% of the data is below this point. * The second quartile (''Q''2) is the median of a data set; thus 50% of the data lies below this point. * The third quartile (''Q''3) is the middle value between the median and the highest value (maximum) of the data set. It is known as the ''upper'' or ''75th empirical'' quartile, as 75% of the data lies below this point. Along with the minimum and maximum of the data (which are also quartiles), the three quartiles described above prov ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Third Quartile
In statistics, a quartile is a type of quantile which divides the number of data points into four parts, or ''quarters'', of more-or-less equal size. The data must be ordered from smallest to largest to compute quartiles; as such, quartiles are a form of order statistic. The three main quartiles are as follows: * The first quartile (''Q''1) is defined as the middle number between the smallest number (minimum) and the median of the data set. It is also known as the ''lower'' or ''25th empirical'' quartile, as 25% of the data is below this point. * The second quartile (''Q''2) is the median of a data set; thus 50% of the data lies below this point. * The third quartile (''Q''3) is the middle value between the median and the highest value (maximum) of the data set. It is known as the ''upper'' or ''75th empirical'' quartile, as 75% of the data lies below this point. Along with the minimum and maximum of the data (which are also quartiles), the three quartiles described above prov ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Mary Eleanor Spear
Mary Eleanor Hunt Spear (March 4, 1897 – January 22, 1986) was an American data visualization specialist, graphic analyst and author, who pioneered development of the bar chart and box plot. Early life and education Mary Eleanor Hunt was born in Jonesboro, Indiana, the daughter of Amos Zophar Hunt and Mabel Elizabeth Ewry Hunt. She attended Peabody Elementary School, Washington D.C., followed by Eastern High School. She then studied at Strayer's Business College and George Washington University. She married Albert Austin Spear in September 1921. Career Spear worked as a graphic analyst for many United States Federal Government agencies between the 1920s and the 1960s, including for the Internal Revenue Service and the Bureau of Labor Statistics. She also operated her own studio for 22 years, and taught Graphic Representation of Statistics at the American University. She wrote two books, ''Charting Statistics'' in 1952 and ''Practical Charting Techniques'' in 1969, which d ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  




L-estimator
In statistics, an L-estimator is an estimator which is a linear combination of order statistics of the measurements (which is also called an L-statistic). This can be as little as a single point, as in the median (of an odd number of values), or as many as all points, as in the mean. The main benefits of L-estimators are that they are often extremely simple, and often robust statistics: assuming sorted data, they are very easy to calculate and interpret, and are often resistant to outliers. They thus are useful in robust statistics, as descriptive statistics, in statistics education, and when computation is difficult. However, they are inefficient, and in modern times robust statistics M-estimators are preferred, though these are much more difficult computationally. In many circumstances L-estimators are reasonably efficient, and thus adequate for initial estimation. Examples A basic example is the median. Given ''n'' values x_1, \ldots, x_n, if n=2k+1 is odd, the median equals x ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Sample Maximum
In statistics, the sample maximum and sample minimum, also called the largest observation and smallest observation, are the values of the greatest and least elements of a sample. They are basic summary statistics, used in descriptive statistics such as the five-number summary and Bowley's seven-figure summary and the associated box plot. The minimum and the maximum value are the first and last order statistics (often denoted ''X''(1) and ''X''(''n'') respectively, for a sample size of ''n''). If the sample has outliers, they necessarily include the sample maximum or sample minimum, or both, depending on whether they are extremely high or low. However, the sample maximum and minimum need not be outliers, if they are not unusually far from other observations. Robustness The sample maximum and minimum are the ''least'' robust statistics: they are maximally sensitive to outliers. This can either be an advantage or a drawback: if extreme values are real (not measurement errors), a ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Interquartile Range
In descriptive statistics, the interquartile range (IQR) is a measure of statistical dispersion, which is the spread of the data. The IQR may also be called the midspread, middle 50%, fourth spread, or H‑spread. It is defined as the difference between the 75th and 25th percentiles of the data. To calculate the IQR, the data set is divided into quartiles, or four rank-ordered even parts via linear interpolation. These quartiles are denoted by Q1 (also called the lower quartile), ''Q''2 (the median), and ''Q''3 (also called the upper quartile). The lower quartile corresponds with the 25th percentile and the upper quartile corresponds with the 75th percentile, so IQR = ''Q''3 −  ''Q''1. The IQR is an example of a trimmed estimator, defined as the 25% trimmed range, which enhances the accuracy of dataset statistics by dropping lower contribution, outlying points. It is also used as a robust measure of scale It can be clearly visualized by the box on a Box plot. Use Unlike tota ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Five-number Summary
The five-number summary is a set of descriptive statistics that provides information about a dataset. It consists of the five most important sample percentiles: # the sample minimum ''(smallest observation)'' # the lower quartile or ''first quartile'' # the median (the middle value) # the upper quartile or ''third quartile'' # the sample maximum (largest observation) In addition to the median of a single set of data there are two related statistics called the upper and lower quartiles. If data are placed in order, then the lower quartile is central to the lower half of the data and the upper quartile is central to the upper half of the data. These quartiles are used to calculate the interquartile range, which helps to describe the spread of the data, and determine whether or not any data points are outliers. In order for these statistics to exist the observations must be from a univariate variable that can be measured on an ordinal, interval or ratio scale. Use and representa ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Interquartile Range
In descriptive statistics, the interquartile range (IQR) is a measure of statistical dispersion, which is the spread of the data. The IQR may also be called the midspread, middle 50%, fourth spread, or H‑spread. It is defined as the difference between the 75th and 25th percentiles of the data. To calculate the IQR, the data set is divided into quartiles, or four rank-ordered even parts via linear interpolation. These quartiles are denoted by Q1 (also called the lower quartile), ''Q''2 (the median), and ''Q''3 (also called the upper quartile). The lower quartile corresponds with the 25th percentile and the upper quartile corresponds with the 75th percentile, so IQR = ''Q''3 −  ''Q''1. The IQR is an example of a trimmed estimator, defined as the 25% trimmed range, which enhances the accuracy of dataset statistics by dropping lower contribution, outlying points. It is also used as a robust measure of scale It can be clearly visualized by the box on a Box plot. Use Unlike tota ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Outlier
In statistics, an outlier is a data point that differs significantly from other observations. An outlier may be due to a variability in the measurement, an indication of novel data, or it may be the result of experimental error; the latter are sometimes excluded from the data set. An outlier can be an indication of exciting possibility, but can also cause serious problems in statistical analyses. Outliers can occur by chance in any distribution, but they can indicate novel behaviour or structures in the data-set, measurement error, or that the population has a heavy-tailed distribution. In the case of measurement error, one wishes to discard them or use statistics that are robust to outliers, while in the case of heavy-tailed distributions, they indicate that the distribution has high skewness and that one should be very cautious in using tools or intuitions that assume a normal distribution. A frequent cause of outliers is a mixture of two distributions, which may be two dist ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Sample Minimum
In statistics, the sample maximum and sample minimum, also called the largest observation and smallest observation, are the values of the greatest and least elements of a sample. They are basic summary statistics, used in descriptive statistics such as the five-number summary and Bowley's seven-figure summary and the associated box plot. The minimum and the maximum value are the first and last order statistics (often denoted ''X''(1) and ''X''(''n'') respectively, for a sample size of ''n''). If the sample has outliers, they necessarily include the sample maximum or sample minimum, or both, depending on whether they are extremely high or low. However, the sample maximum and minimum need not be outliers, if they are not unusually far from other observations. Robustness The sample maximum and minimum are the ''least'' robust statistics: they are maximally sensitive to outliers. This can either be an advantage or a drawback: if extreme values are real (not measurement errors), a ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]