Sturges's rule is a method to choose the number of bins for a histogram. Given <math>n</math> observations, Sturges's rule suggests using
:<math>\hat{k} = 1 + \log_2(n)</math>
bins in the histogram. This rule is widely available in data analysis software, including Python libraries and R; in R it is the default bin selection method.
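As a minimal sketch of the rule, assuming its usual form of 1 + log2(n) rounded up to the next integer, the bin count can be computed with the Python standard library alone. The function name `sturges_bins` is a hypothetical helper chosen for illustration, not part of any library.

```python
import math


def sturges_bins(n: int) -> int:
    """Number of histogram bins suggested by Sturges's rule for n observations."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # 1 + log2(n), rounded up because the rule rarely yields an integer.
    return math.ceil(1 + math.log2(n))


# The suggested bin count grows only logarithmically with the sample size.
for n in (10, 100, 1000, 10000):
    print(n, sturges_bins(n))
```

The bin count rises slowly: multiplying the sample size by 10 adds only three or four bins.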
Sturges's rule comes from the binomial distribution, which is used as a discrete approximation to the normal distribution. If the function to be approximated, <math>f</math>, is binomially distributed, then
:<math>f(y) = \binom{m}{y} p^y (1-p)^{m-y}</math>
where <math>m</math> is the number of trials, <math>p</math> is the probability of success, and <math>y = 0, 1, \ldots, m</math>. Choosing <math>p = 1/2</math> gives
:<math>f(y) = \binom{m}{y} 2^{-m}</math>
In this form we can consider <math>2^{-m}</math> as the normalisation factor, and Sturges's rule says that the sample should result in a histogram with bin counts given by the binomial coefficients. Since the total sample size is fixed to <math>n</math>, we must have
:<math>n = \sum_{y=0}^{m} \binom{m}{y} = 2^m</math>
using the well-known formula for sums of the binomial coefficients. Taking base-2 logarithms of both sides gives <math>m = \log_2(n)</math>, and finally using <math>k = m + 1</math> (because the outcome <math>y = 0</math> is also counted) gives Sturges's rule. In general Sturges's rule does not give an integer answer, so the result is rounded up.
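The derivation can be checked numerically for a small case. The sketch below assumes the idealised histogram has one bin per binomial outcome y = 0, ..., m, with bin counts equal to the binomial coefficients; `binomial_bin_counts` is a hypothetical helper name used only for this illustration.

```python
import math


def binomial_bin_counts(m: int) -> list[int]:
    """Idealised bin counts C(m, 0), ..., C(m, m) used in the derivation."""
    return [math.comb(m, y) for y in range(m + 1)]


m = 6
counts = binomial_bin_counts(m)
n = sum(counts)   # total sample size implied by these bin counts
k = len(counts)   # number of bins, one per outcome y = 0..m

print(counts)                  # [1, 6, 15, 20, 15, 6, 1]
print(n == 2 ** m)             # True: the coefficients sum to 2^m
print(k == 1 + math.log2(n))   # True: k = m + 1 = 1 + log2(n)
```

For sample sizes that are not powers of two the identity cannot hold exactly, which is why the rule is rounded up in practice.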
Doane's formula
Doane [Doane DP (1976) Aesthetic frequency classification. American Statistician, 30: 181–183] proposed modifying Sturges's formula to add extra bins when the data is skewed. Using the method of moments estimator
:<math>g_1 = \frac{m_3}{m_2^{3/2}} = \frac{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^3}{\left[\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2\right]^{3/2}}</math>
along with its variance
:<math>\sigma_{g_1}^2 = \frac{6(n-2)}{(n+1)(n+3)}</math>
Doane proposed adding <math>\log_2\left(1 + \frac{|g_1|}{\sigma_{g_1}}\right)</math> extra bins, giving ''Doane's formula''
:<math>k = 1 + \log_2(n) + \log_2\left(1 + \frac{|g_1|}{\sigma_{g_1}}\right)</math>
For symmetric distributions, where <math>|g_1| \simeq 0</math>, this is equivalent to Sturges's rule. For asymmetric distributions a number of additional bins will be used.
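A minimal sketch of Doane's formula follows, assuming the moment-based skewness estimator g1 = m3 / m2^(3/2), its variance 6(n - 2) / ((n + 1)(n + 3)), and rounding up as for Sturges's rule. The function name `doane_bins` and the treatment of constant data as symmetric are assumptions made for this illustration.

```python
import math


def doane_bins(data: list[float]) -> int:
    """Bin count from Doane's formula: Sturges's rule plus a skewness term."""
    n = len(data)
    if n < 3:
        raise ValueError("need at least 3 observations")
    mean = sum(data) / n
    # Second and third central sample moments (method of moments).
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    # Assumption: constant data has no measurable skewness, so use g1 = 0.
    g1 = 0.0 if m2 == 0 else m3 / m2 ** 1.5
    sigma_g1 = math.sqrt(6 * (n - 2) / ((n + 1) * (n + 3)))
    k = 1 + math.log2(n) + math.log2(1 + abs(g1) / sigma_g1)
    return math.ceil(k)


symmetric = [float(x) for x in range(1, 65)]   # 64 evenly spaced values
skewed = [1.0] * 60 + [100.0] * 4              # 64 values with a long right tail

print(doane_bins(symmetric))  # same as Sturges's rule for n = 64
print(doane_bins(skewed))     # more bins because of the skewness term
```

Both samples have 64 observations, so any difference in the result comes from the skewness term alone.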
Criticisms

Unlike the Freedman–Diaconis rule or Scott's rule, Sturges's rule is not based on any sort of optimisation procedure. It is simply posited based on the approximation of a normal curve by a binomial distribution.
Hyndman has pointed out [Hyndman RJ. The problem with Sturges' rule for constructing histograms. Monash University. 1995 Jul 5:1-2.] that any multiple of the binomial coefficients would also converge to a normal distribution, so any number of bins could be obtained following the derivation above. Scott shows that Sturges's rule in general produces oversmoothed histograms, i.e. too few bins, and advises against its use in favour of other rules such as the Freedman–Diaconis rule or Scott's rule.
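The oversmoothing can be illustrated by comparing bin counts on a simulated normal sample. The sketch below is not taken from the cited sources. It assumes the commonly quoted bin widths 2·IQR·n^(-1/3) for the Freedman–Diaconis rule and 3.49·s·n^(-1/3) for Scott's rule, converts each width to a count by dividing the data range by it, and uses a hypothetical helper name `bin_counts`.

```python
import math
import random
import statistics


def bin_counts(data: list[float]) -> dict[str, int]:
    """Bin counts suggested by three rules for the same sample."""
    n = len(data)
    data_range = max(data) - min(data)
    q1, _, q3 = statistics.quantiles(data, n=4)
    # Assumed bin widths (commonly quoted forms, not given in the text above).
    fd_width = 2 * (q3 - q1) / n ** (1 / 3)
    scott_width = 3.49 * statistics.stdev(data) / n ** (1 / 3)
    return {
        "sturges": math.ceil(1 + math.log2(n)),
        "scott": math.ceil(data_range / scott_width),
        "freedman_diaconis": math.ceil(data_range / fd_width),
    }


rng = random.Random(0)  # fixed seed so the run is repeatable
sample = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
counts = bin_counts(sample)
print(counts)
```

For a sample of this size Sturges's rule suggests 15 bins, while the two width-based rules suggest several times as many, and the gap widens as the sample grows.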