Box Plot

	Box Plot In descriptive statistics, a box plot or boxplot is a method for graphically demonstrating the locality, spread and skewness of groups of numerical data through their quartiles. In addition to the box on a box plot, there can be lines (called ''whiskers'') extending from the box to indicate variability outside the upper and lower quartiles; the plot is therefore also called a box-and-whisker plot or box-and-whisker diagram. Outliers that differ significantly from the rest of the dataset may be plotted as individual points beyond the whiskers. Box plots are non-parametric: they display variation in samples of a statistical population without making any assumptions about the underlying statistical distribution (though Tukey's boxplot assumes symmetry for the whiskers and normality for their length). The spacings in each subsection of the box plot indicate the degree of dispersion (spread) and skewness of the da ...
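As a rough illustration of the box, whiskers and outlier points described above, the sketch below computes the values a box plot would draw. The 1.5 × IQR fence multiplier and the "inclusive" quartile method are assumptions (a common convention, not something stated in the entry), and `box_plot_stats` is a hypothetical helper name.

```python
from statistics import quantiles


def box_plot_stats(data, k=1.5):
    """Return the values a Tukey-style box plot would draw.

    k is an assumed fence multiplier (1.5 is a common convention).
    """
    xs = sorted(data)
    # Quartiles by linear interpolation between data points.
    q1, q2, q3 = quantiles(xs, n=4, method="inclusive")
    iqr = q3 - q1
    lo_fence, hi_fence = q1 - k * iqr, q3 + k * iqr
    # Whiskers end at the most extreme data points still inside the fences.
    inside = [x for x in xs if lo_fence <= x <= hi_fence]
    return {
        "q1": q1,
        "median": q2,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [x for x in xs if x < lo_fence or x > hi_fence],
    }


stats = box_plot_stats([2, 4, 4, 5, 6, 7, 8, 9, 30])
print(stats)
```

With this sample, the value 30 lies beyond the upper fence and would be drawn as an individual point, while the upper whisker stops at 9.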
	John Tukey John Wilder Tukey (June 16, 1915 – July 26, 2000) was an American mathematician and statistician, best known for the development of the fast Fourier transform (FFT) algorithm and the box plot. The Tukey range test, the Tukey lambda distribution, the Tukey test of additivity, and the Teichmüller–Tukey lemma all bear his name. He is also credited with coining the term ''bit'' and with the first published use of the word ''software''. Biography Tukey was born in New Bedford, Massachusetts, in 1915, to a father who taught Latin and a mother who worked as a private tutor. He was mainly taught by his mother and attended regular classes only for certain subjects, such as French. Tukey obtained a B.A. in 1936 and an M.S. in 1937, both in chemistry, from Brown University, before moving to Princeton University, where in 1939 he received a PhD in mathematics after completing a doctoral dissertation titled "On denumerability in topology". During World War II, Tukey worked at the Fire Control Research Office and coll ...
	Seven-number Summary In descriptive statistics, the seven-number summary is a collection of seven summary statistics, and is an extension of the five-number summary. There are three similar, common forms. As with the five-number summary, it can be represented by a modified box plot, adding hatch-marks on the "whiskers" for two of the additional numbers. Seven-number summary The percentiles used are (approximately) evenly spaced under a normally distributed variable. The middle three values – the lower quartile, median, and upper quartile – are the usual statistics from the five-number summary and are the standard values for the box in a box plot. The two unusual percentiles at either end are used because the locations of all seven values will be approximately equally spaced if the data are normally distributed. Some statistical tests require normally distributed data, so the plotted values provide a convenient visual check for the validity of later tests, simply by scanning to see if ...
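The "approximately evenly spaced" claim can be checked numerically. The sketch below assumes one common form of the summary, the 2nd, 9th, 25th, 50th, 75th, 91st and 98th percentiles (the list itself is not preserved in the entry above, so treat it as an assumption), and looks at where those percentiles fall on a standard normal distribution.

```python
from statistics import NormalDist

# Assumed percentile set for one common form of the seven-number summary.
PERCENTILES = [2, 9, 25, 50, 75, 91, 98]

# Position of each percentile on a standard normal distribution (z-scores).
z = [NormalDist().inv_cdf(p / 100) for p in PERCENTILES]

# Distance between neighbouring percentiles, in standard deviations.
gaps = [b - a for a, b in zip(z, z[1:])]

for p, value in zip(PERCENTILES, z):
    print(f"{p:>2}th percentile: z = {value:+.3f}")
print("gaps:", [round(g, 3) for g in gaps])
```

If the gaps come out roughly equal, seven marks drawn at these percentiles would look evenly spaced for normally distributed data, which is the visual check the entry describes.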
	Standard Deviation In statistics, the standard deviation is a measure of the amount of variation of the values of a variable about its mean. A low standard deviation indicates that the values tend to be close to the mean (also called the expected value) of the set, while a high standard deviation indicates that the values are spread out over a wider range. The standard deviation is commonly used in determining what constitutes an outlier and what does not. Standard deviation may be abbreviated SD or std dev, and is most commonly represented in mathematical texts and equations by the lowercase Greek letter σ (sigma), for the population standard deviation, or the Latin letter ''s'', for the sample standard deviation. The standard deviation of a random variable, sample, statistical population, data set, or probability distribution is the square root of its variance. (For a finite population, v ...
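A small check of the statement that the standard deviation is the square root of the variance, using Python's standard `statistics` module. The data values are made up for illustration.

```python
import math
from statistics import pstdev, pvariance, stdev, variance

values = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative data, mean = 5

# Population versions (σ): divide the squared deviations by n.
pop_var = pvariance(values)
pop_sd = pstdev(values)

# Sample versions (s): divide the squared deviations by n - 1.
samp_var = variance(values)
samp_sd = stdev(values)

print(pop_var, pop_sd)
print(samp_var, samp_sd)
print(math.isclose(pop_sd, math.sqrt(pop_var)))
print(math.isclose(samp_sd, math.sqrt(samp_var)))
```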
	Box Plot Picture A box (plural: boxes) is a container with rigid sides used for the storage or transportation of its contents. Most boxes have flat, parallel, rectangular sides (typically rectangular prisms). Boxes can be very small (like a matchbox) or very large (like a shipping box for furniture) and can be used for a variety of purposes, from functional to decorative. Boxes may be made of a variety of materials, both durable (such as wood and metal) and non-durable (such as corrugated fiberboard and paperboard). Corrugated metal boxes are commonly used as shipping containers. Boxes may be closed and shut with flaps, doors, or a separate lid. They can be secured shut with adhesives, tapes, string, or more decorative or elaborately functional mechanisms, such as catches, clasps or locks. Packaging Several types of boxes are used in packaging and storage. * A corrugated box is a shipping container made from corrugated fiberboard, most commonly used to transport products fro ...
	The American Statistician ''The American Statistician'' is a quarterly peer-reviewed scientific journal covering statistics, published by Taylor & Francis on behalf of the American Statistical Association. It was established in 1947. The editor-in-chief is Daniel R. Jeske, a professor at the University of California, Riverside.
	Interquartile Range In descriptive statistics, the interquartile range (IQR) is a measure of statistical dispersion, which is the spread of the data. The IQR may also be called the midspread, middle 50%, fourth spread, or H‑spread. It is defined as the difference between the 75th and 25th percentiles of the data. To calculate the IQR, the data set is divided into quartiles, or four rank-ordered even parts, via linear interpolation. These quartiles are denoted by ''Q''1 (also called the lower quartile), ''Q''2 (the median), and ''Q''3 (also called the upper quartile). The lower quartile corresponds to the 25th percentile and the upper quartile corresponds to the 75th percentile, so IQR = ''Q''3 − ''Q''1. The IQR is an example of a trimmed estimator, defined as the 25% trimmed range, which enhances the accuracy of dataset statistics by dropping lower-contribution, outlying points. It is also used as a robust measure of scale. It can be clearly visualized by the box on a box plot. Use ...
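A minimal sketch of IQR = ''Q''3 − ''Q''1 and of why it is called robust. It assumes the "inclusive" linear-interpolation method of Python's `statistics.quantiles`; other quartile conventions give slightly different numbers. The helper name `iqr` and the data are illustrative.

```python
from statistics import quantiles


def iqr(data):
    """Interquartile range Q3 - Q1, with quartiles by linear interpolation."""
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    return q3 - q1


clean = [1, 2, 3, 4, 5, 6, 7, 8]
spiked = [1, 2, 3, 4, 5, 6, 7, 800]  # same data with one extreme value

print(iqr(clean), max(clean) - min(clean))
print(iqr(spiked), max(spiked) - min(spiked))
```

Replacing the largest value with an extreme one changes the full range drastically but leaves the IQR untouched, because the outer quarters of the data are not used.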
	Third Quartile In statistics, quartiles are a type of quantile that divides the data points into four parts, or ''quarters'', of more-or-less equal size. The data must be ordered from smallest to largest to compute quartiles; as such, quartiles are a form of order statistic. The three quartiles, resulting in four data divisions, are as follows: * The first quartile (''Q''1) is defined as the 25th percentile, the point below which the lowest 25% of the data lies. It is also known as the ''lower'' quartile. * The second quartile (''Q''2) is the median of a data set; thus 50% of the data lies below this point. * The third quartile (''Q''3) is the 75th percentile, the point below which the lowest 75% of the data lies. It is known as the ''upper'' quartile. Along with the minimum and maximum of the data (which are also quartiles), the three quartiles described above provide a five-number summary of the data. This summary is important in statistics because it provides in ...
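The five-number summary mentioned above (minimum, three quartiles, maximum) can be sketched as follows. The quartile convention (the "inclusive" method of `statistics.quantiles`) is an assumption, and `five_number_summary` is a hypothetical helper name.

```python
from statistics import quantiles


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    xs = sorted(data)  # quartiles are order statistics, so sort first
    q1, q2, q3 = quantiles(xs, n=4, method="inclusive")
    return xs[0], q1, q2, q3, xs[-1]


summary = five_number_summary([41, 7, 39, 15, 40, 36])
print(summary)
```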
	Median The median of a set of numbers is the value separating the higher half from the lower half of a data sample, a population, or a probability distribution. For a data set, it may be thought of as the "middle" value. The basic feature of the median in describing data, compared to the mean (often simply described as the "average"), is that it is not skewed by a small proportion of extremely large or small values, and therefore provides a better representation of the center. Median income, for example, may be a better way to describe the center of the income distribution because increases in the largest incomes alone have no effect on the median. For this reason, the median is of central importance in robust statistics. The median is a 2-quantile; it is the value that partitions a set into two equal parts. Finite set of numbers The median of a finite list of numbers is the "middle" number, when those numbers are liste ...
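The income example can be made concrete with a few made-up figures: raising only the largest income moves the mean but not the median.

```python
from statistics import mean, median

incomes = [28_000, 35_000, 41_000, 52_000, 90_000]  # illustrative values
boosted = incomes[:-1] + [9_000_000]                # only the top income grows

print(median(incomes), mean(incomes))
print(median(boosted), mean(boosted))
```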
	Sample Maximum In statistics, the sample maximum and sample minimum, also called the largest observation and smallest observation, are the values of the greatest and least elements of a sample. They are basic summary statistics, used in descriptive statistics such as the five-number summary and Bowley's seven-figure summary and the associated box plot. The minimum and the maximum value are the first and last order statistics (often denoted ''X''(1) and ''X''(''n'') respectively, for a sample size of ''n''). If the sample has outliers, they necessarily include the sample maximum or sample minimum, or both, depending on whether they are extremely high or low. However, the sample maximum and minimum need not be outliers, if they are not unusually far from other observations. Robustness The sample maximum and minimum are the ''least'' robust statistics: they are maximally sensitive to outliers. This can either be an advantage or a drawback: if extreme values are real (not measurement errors), ...
	Percentile In statistics, a ''k''-th percentile, also known as percentile score or centile, is a score (e.g., a data point) below which a given percentage ''k'' of all scores in its frequency distribution falls (the "exclusive" definition), or a score at or below which a given percentage of all scores falls (the "inclusive" definition); i.e., a score in the ''k''-th percentile would be above approximately ''k''% of all scores in its set. For example, the 97th percentile of data is a data point below which 97% of all data points exist (by the exclusive definition). Percentiles depend on how scores are arranged. Percentiles are a type of quantile, obtained by adopting a subdivision into 100 groups. The 25th percentile is also known as the first ''quartile'' (''Q''1), the 50th percentile as the ''median'' or second quartile (''Q''2), and the 75th percentile as the third quartile (''Q''3). For example, the 50th percentile (median) is the score below (or at or below, depending on the definition) which 50% of the scores in the distribution are ...
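The difference between the "exclusive" and "inclusive" definitions shows up as soon as a score is tied with others. The sketch below counts the percentage of scores strictly below a value versus at or below it; `percent_below` is a hypothetical helper and the scores are made up.

```python
def percent_below(scores, x, inclusive=False):
    """Percentage of scores below x (exclusive) or at or below x (inclusive)."""
    if inclusive:
        count = sum(s <= x for s in scores)
    else:
        count = sum(s < x for s in scores)
    return 100 * count / len(scores)


scores = [50, 60, 60, 70, 80, 90, 90, 90, 95, 100]

exclusive_rank = percent_below(scores, 90)
inclusive_rank = percent_below(scores, 90, inclusive=True)
print(exclusive_rank, inclusive_rank)
```

Here a score of 90 sits above 50% of the scores under the exclusive definition but covers 80% under the inclusive one, because three scores are tied at 90.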