The Gini coefficient, sometimes called the Gini index or Gini ratio, is a measure of statistical dispersion intended to represent the income inequality or wealth inequality within a nation or any other group of people. It was developed by the Italian statistician and sociologist Corrado Gini.
The Gini coefficient measures the inequality among values of a frequency distribution (for example, levels of income). A Gini coefficient of zero expresses perfect equality, where all values are the same (for example, where everyone has the same income). A Gini coefficient of one (or 100%) expresses maximal inequality among values (e.g., for a large number of people where only one person has all the income or consumption and all others have none, the Gini coefficient will be nearly one).
For larger groups, values close to one are unlikely. Because both the cumulative population and the cumulative share of income are normalized when calculating the Gini coefficient, the measure is not overly sensitive to the specifics of the income distribution; it depends only on how incomes vary relative to the other members of the population. The exception is a redistribution of income that results in a minimum income for all people. When the population is sorted, if its income distribution approximates a well-known function, then some representative values can be calculated.
The Gini coefficient was proposed by Gini as a measure of inequality. For OECD countries in the late 20th century, considering the effect of taxes and transfer payments, the income Gini coefficient ranged between 0.24 and 0.49, with Slovenia being the lowest and Mexico the highest.
African countries had the highest pre-tax Gini coefficients in 2008–2009, with South Africa the world's highest, variously estimated to be 0.63 to 0.7, although this figure drops to 0.52 after social assistance is taken into account, and drops again to 0.47 after taxation.
The global income Gini coefficient in 2005 has been estimated to be between 0.61 and 0.68 by various sources.
There are some issues in interpreting a Gini coefficient. The same value may result from many different distribution curves. The demographic structure should be taken into account. Countries with an aging population, or with a baby boom, experience an increasing pre-tax Gini coefficient even if real income distribution for working adults remains constant. Scholars have devised over a dozen variants of the Gini coefficient.
The Gini coefficient was developed by the Italian statistician Corrado Gini and published in his 1912 paper ''Variability and Mutability'' (Italian: ''Variabilità e mutabilità''). Building on the work of American economist Max Lorenz, Gini proposed that the difference between the hypothetical straight line depicting perfect equality, and the actual line depicting people's incomes, be used as a measure of inequality.
The Gini coefficient is a single number aimed at measuring the degree of inequality in a distribution. It is most often used in economics to measure how far a country's wealth or income distribution deviates from a totally equal distribution.
In terms of income-ordered population percentiles, the Gini coefficient is the cumulative shortfall from equal share of the total income up to each percentile. That summed shortfall is then divided by the value it would have in the case of complete inequality, where the shortfall is as large as possible.
The Gini coefficient is usually defined mathematically based on the Lorenz curve, which plots the proportion of the total income of the population (''y'' axis) that is cumulatively earned by the bottom ''x'' of the population (see diagram). The line at 45 degrees thus represents perfect equality of incomes. The Gini coefficient can then be thought of as the ratio of the area that lies between the line of equality and the Lorenz curve (marked ''A'' in the diagram) over the total area under the line of equality (marked ''A'' and ''B'' in the diagram); i.e., ''G'' = ''A''/(''A'' + ''B''). It is also equal to 2''A'' and to 1 − 2''B'', because ''A'' + ''B'' = 0.5 (since the axes scale from 0 to 1).
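The area-based definition can be illustrated numerically. The following is a minimal Python sketch, assuming a finite list of non-negative incomes with a positive total; the function names `lorenz_points` and `gini_from_lorenz` are illustrative, and the area ''B'' under the Lorenz curve is approximated with the trapezoidal rule, which is exact for the piecewise-linear curve of a finite population.

```python
def lorenz_points(incomes):
    """Return (population share, cumulative income share) points, starting at (0, 0)."""
    ordered = sorted(incomes)
    total = sum(ordered)
    n = len(ordered)
    points = [(0.0, 0.0)]
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        points.append((k / n, running / total))
    return points


def gini_from_lorenz(incomes):
    """Gini coefficient computed as 1 - 2B, where B is the area under the Lorenz curve."""
    points = lorenz_points(incomes)
    area_b = 0.0
    # Trapezoidal rule over consecutive Lorenz points.
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area_b += (x1 - x0) * (y0 + y1) / 2
    return 1 - 2 * area_b


print(gini_from_lorenz([1, 1, 1, 1]))  # equal incomes: approximately 0
print(gini_from_lorenz([1, 2, 3, 4]))  # approximately 0.25
```

With four equal incomes the Lorenz curve coincides with the line of equality, so ''B'' = 0.5 and the result is zero up to floating-point rounding.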
If all people have non-negative income (or wealth, as the case may be), the Gini coefficient can theoretically range from 0 (complete equality) to 1 (complete inequality); it is sometimes expressed as a percentage ranging between 0 and 100. In practice, neither extreme value is quite reached. If negative values are possible (such as the negative wealth of people with debts), then the Gini coefficient could theoretically be more than 1. Normally the mean (or total) is assumed to be positive, which rules out a Gini coefficient less than zero.
An alternative approach is to define the Gini coefficient as half of the relative mean absolute difference, which is mathematically equivalent to the definition based on the Lorenz curve. The mean absolute difference is the average absolute difference of all pairs of items of the population, and the relative mean absolute difference is the mean absolute difference divided by the average, to normalize for scale. If ''x''<sub>''i''</sub> is the wealth or income of person ''i'', and there are ''n'' persons, then the Gini coefficient ''G'' is given by:
:<math>G = \frac{\sum_{i=1}^n \sum_{j=1}^n |x_i - x_j|}{2 \sum_{i=1}^n \sum_{j=1}^n x_j} = \frac{\sum_{i=1}^n \sum_{j=1}^n |x_i - x_j|}{2 n^2 \bar{x}}</math>
where <math>\bar{x}</math> is the average of the ''x''<sub>''i''</sub>.
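When the income (or wealth) distribution is given as a continuous probability density function ''p''(''x''), the Gini coefficient is again half of the relative mean absolute difference:
:<math>G = \frac{1}{2\mu} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} p(x)\, p(y)\, |x - y| \, dx \, dy</math>
where <math>\mu = \int_{-\infty}^{\infty} x\, p(x) \, dx</math> is the mean of the distribution, and the lower limits of integration may be replaced by zero when all incomes are positive.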
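For a finite population, half the relative mean absolute difference can be computed directly from its definition. The following is a minimal Python sketch, assuming a non-empty list of values with a positive mean; the name `gini_mean_abs_diff` is illustrative, and the quadratic double loop is chosen for clarity rather than speed.

```python
def gini_mean_abs_diff(values):
    """Gini coefficient as half of the relative mean absolute difference."""
    n = len(values)
    mean = sum(values) / n
    # Sum |a - b| over all ordered pairs, including pairs of an item with itself.
    total_abs_diff = sum(abs(a - b) for a in values for b in values)
    mean_abs_diff = total_abs_diff / (n * n)
    relative_mean_abs_diff = mean_abs_diff / mean
    return relative_mean_abs_diff / 2


print(gini_mean_abs_diff([1, 2, 3, 4]))  # 0.25
```

For the values 1, 2, 3, 4 the absolute differences over all ordered pairs sum to 20, so the mean absolute difference is 20/16 = 1.25, the mean is 2.5, and the coefficient is 1.25/(2 × 2.5) = 0.25.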
While the income distribution of any particular country won't always follow theoretical models in reality, these functions give a qualitative understanding of the income distribution in a nation given the Gini coefficient.
Example: two levels of income
Consider two extreme cases. In a perfectly equal society every person receives the same income (''G'' = 0); in a maximally unequal society a single person receives 100% of the total income and the remaining ''N'' − 1 individuals receive none (''G'' = 1 − 1/''N'').
Now suppose there are just two levels of income, low and high. If the high income group is a proportion ''u'' of the population and earns a proportion ''f'' of all income, then the Gini coefficient is ''f'' − ''u''. A more realistic graded distribution with the same values ''u'' and ''f'' will always have a higher Gini coefficient than ''f'' − ''u''.
For example, in the scenario where the richest 20% have 80% of all income (see Pareto principle), the income Gini coefficient is at least 60%.
The often-cited case in which 1% of the world's population owns 50% of all wealth implies a wealth Gini coefficient of at least 49%.
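The two-level result ''f'' − ''u'' can be checked against a concrete population. The following is a minimal Python sketch; the function names and the 100-person population are hypothetical, chosen so that the top 20% hold 80% of total income, and the cross-check uses the half relative mean absolute difference.

```python
def two_level_gini(u, f):
    """Gini coefficient when a share u of people earns a share f of all income."""
    return f - u


def pairwise_gini(values):
    """Half the relative mean absolute difference, used here only as a cross-check."""
    n = len(values)
    mean = sum(values) / n
    total_abs_diff = sum(abs(a - b) for a in values for b in values)
    return total_abs_diff / (2 * n * n * mean)


# Hypothetical population: 20 people earn 4 units each and 80 earn 0.25 each,
# so the top 20% hold 80 of the 100 units of total income.
population = [4.0] * 20 + [0.25] * 80

print(two_level_gini(0.20, 0.80))  # approximately 0.6
print(pairwise_gini(population))   # approximately 0.6
print(two_level_gini(0.01, 0.50))  # approximately 0.49
```

Both computations agree on roughly 0.6 for the 80/20 case, which is the lower bound quoted above; any graded distribution with the same ''u'' and ''f'' would give a larger value.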
In some cases, this equation can be applied to calculate the Gini coefficient without direct reference to the Lorenz curve. For example (taking ''y'' to mean the income or wealth of a person or household):
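* For a population uniform on the values ''y''<sub>''i''</sub>, ''i'' = 1 to ''n'', indexed in non-decreasing order (''y''<sub>''i''</sub> ≤ ''y''<sub>''i''+1</sub>):
::<math>G = \frac{1}{n} \left( n + 1 - 2 \left( \frac{\sum_{i=1}^n (n + 1 - i)\, y_i}{\sum_{i=1}^n y_i} \right) \right)</math>
:This may be simplified to:
::<math>G = \frac{2 \sum_{i=1}^n i\, y_i}{n \sum_{i=1}^n y_i} - \frac{n + 1}{n}</math>
:This formula actually applies to any real population, since each person can be assigned his or her own ''y''<sub>''i''</sub>.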
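The rank-weighted form lends itself to a short implementation, since it needs only one sort and one pass over the data. The following is a minimal Python sketch, assuming the simplified form ''G'' = 2 Σ ''i'' ''y''<sub>''i''</sub> / (''n'' Σ ''y''<sub>''i''</sub>) − (''n'' + 1)/''n'' with ranks ''i'' starting at 1 and a positive total; the name `gini_sorted` is illustrative.

```python
def gini_sorted(values):
    """Gini coefficient from values ranked in non-decreasing order."""
    ordered = sorted(values)
    n = len(ordered)
    total = sum(ordered)
    # Rank-weighted sum: the i-th smallest value is multiplied by its rank i.
    weighted = sum(i * y for i, y in enumerate(ordered, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


print(gini_sorted([4, 1, 3, 2]))  # 0.25
```

For the values 1, 2, 3, 4 the rank-weighted sum is 30 and the total is 10, giving 60/40 − 5/4 = 0.25. Sorting inside the function means callers do not need to pre-order their data.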
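Since the Gini coefficient is half the relative mean absolute difference, it can also be calculated using formulas for the relative mean absolute difference. For a random sample ''S'' consisting of values ''y''<sub>''i''</sub>, ''i'' = 1 to ''n'', that are indexed in non-decreasing order (''y''<sub>''i''</sub> ≤ ''y''<sub>''i''+1</sub>), the statistic:
:<math>G(S) = \frac{1}{n - 1} \left( n + 1 - 2 \left( \frac{\sum_{i=1}^n (n + 1 - i)\, y_i}{\sum_{i=1}^n y_i} \right) \right)</math>
is a consistent estimator of the population Gini coefficient, but is not, in general, unbiased. Like ''G'', ''G''(''S'') has a simpler form:
:<math>G(S) = 1 - \frac{2}{n - 1} \left( n - \frac{\sum_{i=1}^n i\, y_i}{\sum_{i=1}^n y_i} \right)</math>
As with the relative mean absolute difference, there does not exist a sample statistic that is in general an unbiased estimator of the population Gini coefficient.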
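The sample statistic differs from the population formula only in its normalization. The following is a minimal Python sketch, assuming the form ''G''(''S'') = 1 − (2/(''n'' − 1)) (''n'' − Σ ''i'' ''y''<sub>''i''</sub> / Σ ''y''<sub>''i''</sub>) with ranks starting at 1; the name `gini_sample` is illustrative, and the guard against samples with fewer than two values is an added assumption.

```python
def gini_sample(values):
    """Consistent (but generally biased) sample estimate of the Gini coefficient."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("at least two observations are required")
    total = sum(ordered)
    # Rank-weighted sum with ranks starting at 1 for the smallest value.
    weighted = sum(i * y for i, y in enumerate(ordered, start=1))
    return 1 - (2 / (n - 1)) * (n - weighted / total)


print(gini_sample([1, 2, 3, 4]))  # approximately 0.3333
```

For the sample 1, 2, 3, 4 this gives 1 − (2/3)(4 − 3) = 1/3, which equals the population-style value 0.25 multiplied by ''n''/(''n'' − 1) = 4/3.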
Discrete probability distribution
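For a discrete probability distribution with probability mass function ''f''(''y''<sub>''i''</sub>), ''i'' = 1 to ''n'', where ''f''(''y''<sub>''i''</sub>) is the fraction of the population with income or wealth ''y''<sub>''i''</sub> > 0, the Gini coefficient is:
:<math>G = \frac{1}{2\mu} \sum_{i=1}^n \sum_{j=1}^n f(y_i)\, f(y_j)\, |y_i - y_j|</math>
where
:<math>\mu = \sum_{i=1}^n y_i\, f(y_i)</math>
:If the points with nonzero probabilities are indexed in increasing order (''y''<sub>''i''</sub> < ''y''<sub>''i''+1</sub>), then:
:<math>G = 1 - \frac{\sum_{i=1}^n f(y_i) (S_{i-1} + S_i)}{S_n}</math>
where <math>S_i = \sum_{j=1}^i f(y_j)\, y_j</math> and <math>S_0 = 0</math>. These formulae are also applicable in the limit as ''n'' → ∞.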
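The cumulative-sum form for a discrete distribution can be sketched as follows. This is a minimal Python illustration, assuming the formula ''G'' = 1 − Σ ''f''(''y''<sub>''i''</sub>)(''S''<sub>''i''−1</sub> + ''S''<sub>''i''</sub>) / ''S''<sub>''n''</sub>, where ''S''<sub>''i''</sub> is the running sum of ''f''(''y''<sub>''j''</sub>) ''y''<sub>''j''</sub> and ''S''<sub>0</sub> = 0; the name `gini_discrete` and the (value, probability) pair representation are illustrative choices.

```python
def gini_discrete(points):
    """Gini coefficient of a discrete distribution.

    points: iterable of (y, f) pairs, where f is the fraction of the
    population with income y; fractions are assumed to sum to 1.
    """
    ordered = sorted(points)  # increasing order of y
    s_prev = 0.0              # S_{i-1}, starting from S_0 = 0
    accumulated = 0.0
    for y, f in ordered:
        s_curr = s_prev + f * y
        accumulated += f * (s_prev + s_curr)
        s_prev = s_curr
    # After the loop, s_prev holds S_n, which equals the mean.
    return 1 - accumulated / s_prev


# Hypothetical two-point distribution: 80% of people at 0.25, 20% at 4.
print(gini_discrete([(0.25, 0.8), (4.0, 0.2)]))  # approximately 0.6
```

The two-point example reproduces the two-level result: the top 20% hold 80% of the total, so the coefficient is about 0.8 − 0.2 = 0.6.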
Continuous probability distribution
When the population is large, the income distribution may be represented by a continuous probability density function ''f''(''x''), where ''f''(''x'') ''dx'' is the fraction of the population with wealth or income in the interval ''dx'' about ''x''. If ''F''(''x'') is the cumulative distribution function for ''f''(''x''), then the Lorenz curve ''L''(''F'') may be represented as a function parametric in ''L''(''x'') and ''F''(''x''), and the value of ''B'' can be found by integration:
:<math>B = \int_0^1 L(F) \, dF</math>
The Gini coefficient can also be calculated directly from the cumulative distribution function of the distribution ''F''(''y''). Defining ''μ'' as the mean of the distribution, and specifying that ''F''(''y'') is zero for all negative values, the Gini coefficient is given by:
:<math>G = 1 - \frac{1}{\mu} \int_0^\infty (1 - F(y))^2 \, dy = \frac{1}{\mu} \int_0^\infty F(y) (1 - F(y)) \, dy</math>
The latter result comes from integration by parts. (Note that this formula can be applied when there are negative values if the integration is taken from minus infinity to plus infinity.)
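The cumulative distribution function approach can be tried numerically. The following is a minimal Python sketch, assuming the form ''G'' = 1 − (1/''μ'') ∫ (1 − ''F''(''y''))<sup>2</sup> ''dy'' over non-negative ''y''; the function names, the midpoint rule, the finite upper cutoff, and the choice of an exponential distribution as the test case are all illustrative assumptions.

```python
import math


def gini_from_cdf(cdf, mean, upper, steps=100_000):
    """Approximate G = 1 - (1/mean) * integral from 0 to upper of (1 - F(y))^2 dy.

    Uses the midpoint rule; `upper` must be large enough that the tail of
    (1 - F(y))^2 beyond it is negligible.
    """
    width = upper / steps
    integral = 0.0
    for k in range(steps):
        y = (k + 0.5) * width
        integral += (1 - cdf(y)) ** 2
    return 1 - integral * width / mean


def exponential_cdf(y, rate=1.0):
    """CDF of an exponential distribution with the given rate (mean 1/rate)."""
    return 1 - math.exp(-rate * y)


print(gini_from_cdf(exponential_cdf, mean=1.0, upper=40.0))  # approximately 0.5
```

For the exponential distribution with rate 1, the integrand is e<sup>−2''y''</sup>, whose integral is 1/2, so the exact value is 1 − 1/2 = 0.5 and the numerical result should be close to it.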
The Gini coefficient may be expressed in terms of the quantile function ''Q''(''F'') (the inverse of the cumulative distribution function: ''Q''(''F''(''x'')) = ''x''):
:<math>G = \frac{1}{2\mu} \int_0^1 \int_0^1 |Q(F_1) - Q(F_2)| \, dF_1 \, dF_2</math>
For some functional forms, the Gini index can be calculated explicitly. For example, if ''y'' follows a lognormal distribution with the standard deviation of logs equal to ''σ'', then ''G'' = erf(''σ''/2), where erf is the error function (since ''G'' = 2Φ(''σ''/√2) − 1, where Φ is the cumulative standard normal distribution). In the table below, some examples for probability density functions with support on