
In economics, the Gini coefficient, sometimes called the Gini index or Gini ratio, is a measure of statistical dispersion intended to represent the income inequality or wealth inequality within a nation or any other group of people. It was developed by the Italian statistician and sociologist Corrado Gini.

The Gini coefficient measures the inequality among values of a frequency distribution (for example, levels of income). A Gini coefficient of zero expresses perfect equality, where all values are the same (for example, where everyone has the same income). A Gini coefficient of one (or 100%) expresses maximal inequality among values (e.g., for a large number of people where only one person has all the income or consumption and all others have none, the Gini coefficient will be nearly one). For larger groups, values close to one are unlikely. Given the normalization of both the cumulative population and the cumulative share of income used to calculate the Gini coefficient, the measure is not overly sensitive to the specifics of the income distribution, but depends only on how incomes vary relative to the other members of a population. The exception to this is in the redistribution of income resulting in a minimum income for all people. When the population is sorted, if their income distribution were to approximate a well-known function, then some representative values could be calculated.

The Gini coefficient was proposed by Gini as a measure of inequality of income or wealth. For OECD countries, in the late 20th century, considering the effect of taxes and transfer payments, the income Gini coefficient ranged between 0.24 and 0.49, with Slovenia being the lowest and Mexico the highest. African countries had the highest pre-tax Gini coefficients in 2008–2009, with South Africa the world's highest, variously estimated to be 0.63 to 0.7, although this figure drops to 0.52 after social assistance is taken into account, and drops again to 0.47 after taxation. The global income Gini coefficient in 2005 has been estimated to be between 0.61 and 0.68 by various sources.

There are some issues in interpreting a Gini coefficient. The same value may result from many different distribution curves. The demographic structure should be taken into account. Countries with an aging population, or with a baby boom, experience an increasing pre-tax Gini coefficient even if real income distribution for working adults remains constant. Scholars have devised over a dozen variants of the Gini coefficient.

# History

The Gini coefficient was developed by the Italian statistician Corrado Gini and published in his 1912 paper ''Variability and Mutability'' (Italian: ''Variabilità e mutabilità''). Building on the work of American economist Max Lorenz, Gini proposed that the difference between the hypothetical straight line depicting perfect equality, and the actual line depicting people's incomes, be used as a measure of inequality.

# Definition

The Gini coefficient is a single number aimed at measuring the degree of inequality in a distribution. It is most often used in economics to measure how far a country's wealth or income distribution deviates from a totally equal distribution. In terms of income-ordered population percentiles, the Gini coefficient is the cumulative shortfall from equal share of the total income up to each percentile. That summed shortfall is then divided by the value it would have in the case of complete inequality.

The Gini coefficient is usually defined mathematically based on the Lorenz curve, which plots the proportion of the total income of the population (''y'' axis) that is cumulatively earned by the bottom ''x'' of the population (see diagram). The line at 45 degrees thus represents perfect equality of incomes. The Gini coefficient can then be thought of as the ratio of the area that lies between the line of equality and the Lorenz curve (marked ''A'' in the diagram) over the total area under the line of equality (marked ''A'' and ''B'' in the diagram); i.e., $G = A/(A+B)$. It is also equal to $2A$ and to $1-2B$ due to the fact that $A+B=0.5$ (since the axes scale from 0 to 1).

If all people have non-negative income (or wealth, as the case may be), the Gini coefficient can theoretically range from 0 (complete equality) to 1 (complete inequality); it is sometimes expressed as a percentage ranging between 0 and 100. In practice, neither extreme value is quite reached. If negative values are possible (such as the negative wealth of people with debts), then the Gini coefficient could theoretically be more than 1. Normally the mean (or total) is assumed positive, which rules out a Gini coefficient less than zero.

An alternative approach is to define the Gini coefficient as half of the relative mean absolute difference, which is mathematically equivalent to the definition based on the Lorenz curve. The mean absolute difference is the average absolute difference of all pairs of items of the population, and the relative mean absolute difference is the mean absolute difference divided by the average, $\bar{x}$, to normalize for scale. If $x_i$ is the wealth or income of person ''i'', and there are ''n'' persons, then the Gini coefficient ''G'' is given by:
:$G = \frac{\sum_{i=1}^n \sum_{j=1}^n \left|x_i - x_j\right|}{2 \sum_{i=1}^n \sum_{j=1}^n x_j} = \frac{\sum_{i=1}^n \sum_{j=1}^n \left|x_i - x_j\right|}{2 n \sum_{j=1}^n x_j} = \frac{\sum_{i=1}^n \sum_{j=1}^n \left|x_i - x_j\right|}{2 n^2 \bar{x}}$

When the income (or wealth) distribution is given as a continuous probability distribution function ''p''(''x''), the Gini coefficient is again half of the relative mean absolute difference:
:$G = \frac{1}{2\mu}\int_{-\infty}^\infty\int_{-\infty}^\infty p\left(x\right)p\left(y\right)\,\left|x-y\right|\,dx\,dy$
where $\textstyle\mu=\int_{-\infty}^\infty x p\left(x\right) \,dx$ is the mean of the distribution, and the lower limits of integration may be replaced by zero when all incomes are positive.
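
As an illustration of the pairwise definition, the following is a minimal Python sketch that computes the Gini coefficient as half of the relative mean absolute difference. The function name `gini_mean_abs_diff` and the sample incomes are assumptions made for this example; the double loop is quadratic in the number of persons and is meant for clarity rather than speed.

```python
from itertools import product


def gini_mean_abs_diff(values):
    """Gini coefficient as half of the relative mean absolute difference."""
    n = len(values)
    total = sum(values)
    if n == 0 or total <= 0:
        raise ValueError("need a non-empty sample with a positive total")
    # Sum |x_i - x_j| over all ordered pairs (i, j), including i == j.
    abs_diff_sum = sum(abs(a - b) for a, b in product(values, repeat=2))
    # The denominator 2 * n * sum(x) is the same as 2 * n^2 * mean(x).
    return abs_diff_sum / (2 * n * total)


print(gini_mean_abs_diff([1, 1, 1, 1]))   # everyone equal: 0.0
print(gini_mean_abs_diff([0, 0, 0, 10]))  # one of four holds everything: 0.75
```

With four persons the most unequal case gives 0.75 rather than 1, which shows why the upper bound is only approached for large populations.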

# Calculation

While the income distribution of any particular country won't always follow theoretical models in reality, these functions give a qualitative understanding of the income distribution in a nation given the Gini coefficient.

## Example: two levels of income

The extreme cases are the most equal society in which every person receives the same income ($G = 0$) and the most unequal society where a single person receives 100% of the total income and the remaining $N - 1$ people receive none ($G = 1 - 1/N$). A more general simplified case also just distinguishes two levels of income, low and high. If the high income group is a proportion ''u'' of the population and earns a proportion ''f'' of all income, then the Gini coefficient is $f - u$. An actual more graded distribution with these same values ''u'' and ''f'' will always have a higher Gini coefficient than $f - u$. The proverbial case where the richest 20% have 80% of all income (see Pareto principle) would lead to an income Gini coefficient of at least 60%. An often cited case that 1% of all the world's population owns 50% of all wealth, means a wealth Gini coefficient of at least 49%.
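
The two-level case can be checked from the Lorenz-curve definition. The sketch below assumes a piecewise-linear Lorenz curve through (0, 0), (1 − ''u'', 1 − ''f'') and (1, 1), computes the area ''B'' under it as two trapezoids, and returns 1 − 2''B''. The function name `two_level_gini` is hypothetical.

```python
def two_level_gini(u, f):
    """Gini coefficient when a share u of people earns a share f of income.

    Assumes only two income levels, so the Lorenz curve is two straight
    segments: (0, 0) -> (1 - u, 1 - f) -> (1, 1).
    """
    if not (0 < u < 1) or not (u <= f <= 1):
        raise ValueError("expected 0 < u < 1 and u <= f <= 1")
    # Area under the first segment (a triangle) plus the second (a trapezoid).
    area_b = 0.5 * (1 - u) * (1 - f) + 0.5 * u * ((1 - f) + 1)
    return 1 - 2 * area_b


print(two_level_gini(0.20, 0.80))  # richest 20% earn 80%: about 0.6
print(two_level_gini(0.01, 0.50))  # richest 1% own 50%: about 0.49
```

Expanding the area expression gives 1 − 2''B'' = ''f'' − ''u'', which matches the 60% and 49% figures quoted for the two examples.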

## Alternative expressions

In some cases, this equation can be applied to calculate the Gini coefficient without direct reference to the Lorenz curve. For example (taking ''y'' to mean the income or wealth of a person or household):
* For a population uniform on the values $y_i$, ''i'' = 1 to ''n'', indexed in non-decreasing order ($y_i \le y_{i+1}$):
::$G = \frac{1}{n}\left( n+1 - 2 \left( \frac{\sum_{i=1}^n (n+1-i)\,y_i}{\sum_{i=1}^n y_i} \right) \right).$
:This may be simplified to:
::$G = \frac{2 \sum_{i=1}^n i\,y_i}{n \sum_{i=1}^n y_i} - \frac{n+1}{n}.$
:This formula actually applies to any real population, since each person can be assigned his or her own $y_i$.

Since the Gini coefficient is half the relative mean absolute difference, it can also be calculated using formulas for the relative mean absolute difference. For a random sample ''S'' consisting of values $y_i$, ''i'' = 1 to ''n'', that are indexed in non-decreasing order ($y_i \le y_{i+1}$), the statistic:
:$G\left(S\right) = \frac{1}{n-1}\left(n+1 - 2 \left( \frac{\sum_{i=1}^n (n+1-i)\,y_i}{\sum_{i=1}^n y_i}\right) \right)$
is a consistent estimator of the population Gini coefficient, but is not, in general, unbiased. Like ''G'', ''G''(''S'') has a simpler form:
:$G\left(S\right) = 1 - \frac{2}{n-1}\left( n - \frac{\sum_{i=1}^n i\,y_i}{\sum_{i=1}^n y_i}\right).$

As with the relative mean absolute difference, there does not exist a sample statistic that is in general an unbiased estimator of the population Gini coefficient.
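
The rank-weighted forms avoid the double sum over pairs. The following Python sketch implements the simplified population formula and the simplified sample statistic after sorting the values; the function names `gini_sorted` and `gini_sample` are assumptions made for this example.

```python
def _rank_weighted_ratio(values):
    """Return (n, sum(i * y_i) / sum(y_i)) with y sorted in non-decreasing order."""
    y = sorted(values)
    n = len(y)
    total = sum(y)
    if n < 2 or total <= 0:
        raise ValueError("need at least two values with a positive total")
    weighted = sum(i * yi for i, yi in enumerate(y, start=1))
    return n, weighted / total


def gini_sorted(values):
    """Population form: G = 2 * sum(i * y_i) / (n * sum(y_i)) - (n + 1) / n."""
    n, ratio = _rank_weighted_ratio(values)
    return 2 * ratio / n - (n + 1) / n


def gini_sample(values):
    """Sample statistic: G(S) = 1 - 2 / (n - 1) * (n - sum(i * y_i) / sum(y_i))."""
    n, ratio = _rank_weighted_ratio(values)
    return 1 - 2 / (n - 1) * (n - ratio)


incomes = [4, 1, 3, 2]        # order does not matter; values are sorted inside
print(gini_sorted(incomes))   # population form
print(gini_sample(incomes))   # sample form, larger by a factor n / (n - 1)
```

Sorting costs O(''n'' log ''n''), so this form scales to samples where the pairwise sum would be impractical.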

## Discrete probability distribution

For a discrete probability distribution with probability mass function $f \left( y_i \right),$ $i = 1,\ldots, n$, where $f \left( y_i \right)$ is the fraction of the population with income or wealth $y_i >0$, the Gini coefficient is:
:$G = \frac{1}{2\mu} \sum\limits_{i=1}^n \sum\limits_{j=1}^n \, f\left(y_i\right) f\left(y_j\right) \left|y_i-y_j\right|$
where
:$\mu=\sum\limits_{i=1}^n y_i f\left(y_i\right).$
If the points with nonzero probabilities are indexed in increasing order $\left(y_i < y_{i+1}\right)$ then:
:$G = 1 - \frac{\sum_{i=1}^n f\left(y_i\right)\left(S_{i-1}+S_i\right)}{S_n}$
where
:$S_i = \sum_{j=1}^i f\left(y_j\right)\,y_j\,$ and $S_0 = 0.$
These formulae are also applicable in the limit as $n\rightarrow\infty.$
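
A short Python sketch of both discrete forms follows: the pairwise double sum and the cumulative form based on the partial sums $S_i$. The function names and the two-point example distribution are assumptions chosen for illustration.

```python
def gini_discrete_pairwise(ys, probs):
    """G = (1 / (2 * mu)) * sum_i sum_j f(y_i) f(y_j) |y_i - y_j|."""
    mu = sum(y * p for y, p in zip(ys, probs))
    pair_sum = sum(
        pi * pj * abs(yi - yj)
        for yi, pi in zip(ys, probs)
        for yj, pj in zip(ys, probs)
    )
    return pair_sum / (2 * mu)


def gini_discrete_cumulative(ys, probs):
    """G = 1 - sum_i f(y_i) (S_{i-1} + S_i) / S_n, with points in increasing order."""
    s_prev = 0.0   # S_{i-1}, starting from S_0 = 0
    acc = 0.0
    for y, p in sorted(zip(ys, probs)):
        s_curr = s_prev + p * y        # S_i = S_{i-1} + f(y_i) * y_i
        acc += p * (s_prev + s_curr)
        s_prev = s_curr
    return 1 - acc / s_prev            # s_prev now holds S_n, the mean


# Hypothetical example: 80% of people have income 1, 20% have income 16,
# so the top 20% receive 80% of all income.
ys, probs = [1, 16], [0.8, 0.2]
print(gini_discrete_pairwise(ys, probs))    # about 0.6
print(gini_discrete_cumulative(ys, probs))  # about 0.6
```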

## Continuous probability distribution

When the population is large, the income distribution may be represented by a continuous probability density function ''f''(''x'') where ''f''(''x'') ''dx'' is the fraction of the population with wealth or income in the interval ''dx'' about ''x''. If ''F''(''x'') is the cumulative distribution function for ''f''(''x''), then the Lorenz curve ''L''(''F'') may then be represented as a function parametric in ''L''(''x'') and ''F''(''x'') and the value of ''B'' can be found by integration:
:$B = \int_0^1 L\left(F\right) \,dF.$

The Gini coefficient can also be calculated directly from the cumulative distribution function of the distribution ''F''(''y''). Defining μ as the mean of the distribution, and specifying that ''F''(''y'') is zero for all negative values, the Gini coefficient is given by:
:$G = 1 - \frac{1}{\mu}\int_0^\infty \left(1-F\left(y\right)\right)^2 \,dy = \frac{1}{\mu}\int_0^\infty F\left(y\right)\left(1-F\left(y\right)\right) \,dy$
The latter result comes from integration by parts. (Note that this formula can be applied when there are negative values if the integration is taken from minus infinity to plus infinity.)

The Gini coefficient may be expressed in terms of the quantile function ''Q''(''F'') (inverse of the cumulative distribution function: ''Q''(''F''(''x'')) = ''x''):
:$G=\frac{1}{2\mu}\int_0^1 \int_0^1 \left|Q\left(F_1\right)-Q\left(F_2\right)\right| \,dF_1\,dF_2 .$

For some functional forms, the Gini index can be calculated explicitly. For example, if ''y'' follows a lognormal distribution with the standard deviation of logs equal to $\sigma$, then $G = \operatorname{erf}\left(\frac{\sigma}{2}\right)$ where $\operatorname{erf}$ is the error function (since $G=2 \phi \left(\frac{\sigma}{\sqrt{2}}\right)-1$, where $\phi$ is the cumulative standard normal distribution). In the table below, some examples for probability density functions with support on $[0,\infty)$ are shown. The Dirac delta distribution represents the case where everyone has the same wealth (or income); it implies that there are no variations at all between incomes.
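
The cumulative-distribution formula lends itself to a numerical check. The sketch below approximates $G = \frac{1}{\mu}\int_0^\infty F(y)(1-F(y))\,dy$ with a midpoint rule on a truncated range, and compares the two closed forms quoted for the lognormal case. The exponential distribution with unit mean is an assumption picked for this example (its Gini coefficient is 1/2); the function names and the truncation point are likewise illustrative.

```python
import math
from statistics import NormalDist


def gini_from_cdf(cdf, mean, upper, steps=200_000):
    """Midpoint-rule approximation of (1 / mean) * integral_0^upper F (1 - F) dy.

    `upper` truncates the infinite range and must be large enough that
    1 - F(upper) is negligible.
    """
    h = upper / steps
    total = 0.0
    for k in range(steps):
        fy = cdf((k + 0.5) * h)
        total += fy * (1.0 - fy)
    return total * h / mean


def exponential_cdf(y):
    """CDF of an exponential distribution with mean 1 (illustrative choice)."""
    return 1.0 - math.exp(-y)


def gini_lognormal(sigma):
    """Closed form G = erf(sigma / 2) for a lognormal distribution."""
    return math.erf(sigma / 2.0)


def gini_lognormal_via_normal_cdf(sigma):
    """Equivalent form G = 2 * Phi(sigma / sqrt(2)) - 1."""
    return 2.0 * NormalDist().cdf(sigma / math.sqrt(2.0)) - 1.0


print(gini_from_cdf(exponential_cdf, mean=1.0, upper=50.0))  # close to 0.5
print(gini_lognormal(1.0), gini_lognormal_via_normal_cdf(1.0))
```

The two lognormal expressions agree up to floating-point rounding because the standard normal cumulative distribution is itself defined through the error function.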

## Other approaches

Sometimes the entire Lorenz curve is not known, and only values at certain intervals are given. In that case, the Gini coefficient can be approximated by using various techniques for interpolating the missing values of the Lorenz curve. If $(X_k, Y_k)$ are the known points on the Lorenz curve, with the $X_k$ indexed in increasing order ($X_{k-1} < X_k$), so that:
* $X_k$ is the cumulated proportion of the population variable, for ''k'' = 0,...,''n'', with $X_0 = 0$, $X_n = 1$.
* $Y_k$ is the cumulated proportion of the income variable, for ''k'' = 0,...,''n'', with $Y_0 = 0$, $Y_n = 1$.
* $Y_k$ should be indexed in non-decreasing order ($Y_k \ge Y_{k-1}$)
If the Lorenz curve is approximated on each interval as a line between consecutive points, then the area ''B'' can be approximated with trapezoids and:
:$G_1 = 1 - \sum_{k=1}^{n} \left(X_k - X_{k-1}\right) \left(Y_k + Y_{k-1}\right)$
is the resulting approximation for ''G''. More accurate results can be obtained using other methods to approximate the area ''B'', such as approximating the Lorenz curve with a quadratic function across pairs of intervals, or building an appropriately smooth approximation to the underlying distribution function that matches the known data. If the population mean and boundary values for each interval are also known, these can also often be used to improve the accuracy of the approximation.

The Gini coefficient calculated from a sample is a statistic and its standard error, or confidence intervals for the population Gini coefficient, should be reported. These can be calculated using bootstrap techniques but those proposed have been mathematically complicated and computationally onerous even in an era of fast computers. Economist Tomson Ogwang made the process more efficient by setting up a "trick regression model" in which respective income variables in the sample are ranked with the lowest income being allocated rank 1. The model then expresses the rank (dependent variable) as the sum of a constant ''A'' and a normal error term whose variance is inversely proportional to $y_k$:
:$k = A + N\left(0, s^2/y_k\right)$
Thus, ''G'' can be expressed as a function of the weighted least squares estimate of the constant ''A'', and this can be used to speed up the calculation of the jackknife estimate for the standard error. Economist David Giles argued that the standard error of the estimate of ''A'' can be used to derive that of the estimate of ''G'' directly without using a jackknife at all. This method only requires the use of ordinary least squares regression after ordering the sample data. The results compare favorably with the estimates from the jackknife with agreement improving with increasing sample size. However, it has since been argued that this is dependent on the model's assumptions about the error distributions and the independence of error terms, assumptions that are often not valid for real data sets. There is still ongoing debate surrounding this topic.

Guillermina Jasso and Angus Deaton independently proposed the following formula for the Gini coefficient:
:$G = \frac{N+1}{N-1}-\frac{2}{N(N-1)\mu}\left(\sum_{i=1}^N P_iX_i\right)$
where $\mu$ is mean income of the population, $P_i$ is the income rank ''P'' of person ''i'', with income ''X'', such that the richest person receives a rank of 1 and the poorest a rank of ''N''. This effectively gives higher weight to poorer people in the income distribution, which allows the Gini to meet the transfer principle. Note that the Jasso-Deaton formula rescales the coefficient so that its value is 1 if all the $X_i$ are zero except one. Note however Allison's reply on the need to divide by ''N''² instead. FAO explains another version of the formula.
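
Two of the approaches in this section are short enough to sketch in Python. The first function applies the trapezoid approximation to known points of a Lorenz curve; the second evaluates a rank-based formula of the Jasso-Deaton kind, written here in its commonly cited form with coefficients (''N'' + 1)/(''N'' − 1) and 2/(''N''(''N'' − 1)μ), which is an assumption of this example. The function names and the quintile shares are hypothetical illustration data, not measurements.

```python
def gini_trapezoid(xs, ys):
    """Approximate G from Lorenz points by G_1 = 1 - sum (X_k - X_{k-1})(Y_k + Y_{k-1}).

    xs: cumulative population shares, increasing, from 0 to 1.
    ys: cumulative income shares, non-decreasing, from 0 to 1.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("xs and ys must have the same length of at least 2")
    return 1 - sum(
        (xs[k] - xs[k - 1]) * (ys[k] + ys[k - 1]) for k in range(1, len(xs))
    )


def gini_rank_based(incomes):
    """Rank-based form: richest person has rank 1, poorest has rank N."""
    n = len(incomes)
    mu = sum(incomes) / n
    ranked = sorted(incomes, reverse=True)          # rank 1 = richest
    weighted = sum(rank * x for rank, x in enumerate(ranked, start=1))
    return (n + 1) / (n - 1) - 2 * weighted / (n * (n - 1) * mu)


# Hypothetical quintile data: cumulative income shares at 20% steps.
xs = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
ys = [0.0, 0.05, 0.15, 0.30, 0.55, 1.0]
print(gini_trapezoid(xs, ys))         # about 0.38
print(gini_rank_based([0, 0, 0, 10])) # about 1: all income held by one person
```

Because straight segments lie on or above a convex Lorenz curve, the trapezoid value tends to understate the true coefficient when only a few points are known.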

# Generalized inequality indices

The Gini coefficient and other standard inequality indices reduce to a common form. Perfect equality (the absence of inequality) exists when and only when the inequality ratio, $r_j = x_j / \overline{x}$, equals 1 for all ''j'' units in some population (for example, there is perfect income equality when everyone's income $x_j$ equals the mean income $\overline{x}$, so that $r_j=1$ for everyone). Measures of inequality, then, are measures of the average deviations of the $r_j$ from 1; the greater the average deviation, the greater the inequality. Based on these observations the inequality indices have this common form:
:$\text{Inequality} = \sum_j p_j \, f\left(r_j\right),$
where $p_j$ weights the units by their population share, and $f(r_j)$ is a function of the deviation of each unit's $r_j$ from 1, the point of equality. The insight of this generalised inequality index is that inequality indices differ because they employ different functions of the distance of the inequality ratios (the $r_j$) from 1.
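
To make the common form concrete, the following Python sketch evaluates $\sum_j p_j f(r_j)$ for a pluggable function ''f''. The two choices shown, $f(r) = |r - 1|/2$ (the Hoover index) and $f(r) = r \ln r$ (the Theil T index), are not named in the text above and are brought in here only as illustrations; the function names are hypothetical.

```python
import math


def generalized_index(values, f, shares=None):
    """Evaluate sum_j p_j * f(r_j), where r_j = x_j / mean.

    shares: population shares p_j summing to 1; equal shares if omitted.
    """
    n = len(values)
    if shares is None:
        shares = [1.0 / n] * n
    mean = sum(p * x for p, x in zip(shares, values))
    return sum(p * f(x / mean) for p, x in zip(shares, values))


def hoover_term(r):
    """f(r) = |r - 1| / 2, which yields the Hoover index."""
    return abs(r - 1.0) / 2.0


def theil_term(r):
    """f(r) = r * ln(r), which yields the Theil T index (0 * ln 0 taken as 0)."""
    return r * math.log(r) if r > 0 else 0.0


equal = [5, 5, 5, 5]
unequal = [0, 0, 0, 10]
print(generalized_index(equal, hoover_term), generalized_index(equal, theil_term))
print(generalized_index(unequal, hoover_term), generalized_index(unequal, theil_term))
```

Both functions vanish at ''r'' = 1, so every index built this way is zero under perfect equality and differs only in how it penalises distance from 1.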

# Of income distributions

Gini coefficients of income are calculated on a market income basis as well as a disposable income basis. The Gini coefficient on market income, sometimes referred to as a pre-tax Gini coefficient, is calculated on income before taxes and transfers, and it measures inequality in income without considering the effect of taxes and social spending already in place in a country. The Gini coefficient on disposable income, sometimes referred to as an after-tax Gini coefficient, is calculated on income after taxes and transfers, and it measures inequality in income after considering the effect of taxes and social spending already in place in a country.

For OECD countries over the 2008–2009 period, the Gini coefficient (pre-taxes and transfers) for a total population ranged between 0.34 and 0.53, with South Korea the lowest and Italy the highest. The Gini coefficient (after-taxes and transfers) for a total population ranged between 0.25 and 0.48, with Denmark the lowest and Mexico the highest. For the United States, the country with the largest population of the OECD countries, the pre-tax Gini index was 0.49, and the after-tax Gini index was 0.38, in 2008–2009. The OECD average for total populations in OECD countries was 0.46 for the pre-tax income Gini index and 0.31 for the after-tax income Gini index. Taxes and social spending that were in place in the 2008–2009 period in OECD countries significantly lowered effective income inequality, and in general, "European countries, especially Nordic and Continental welfare states, achieve lower levels of income inequality than other countries."

Using the Gini can help quantify differences in welfare and compensation policies and philosophies. However, it should be borne in mind that the Gini coefficient can be misleading when used to make political comparisons between large and small countries or those with different immigration policies (see the Limitations section). The Gini coefficient for the entire world has been estimated by various parties to be between 0.61 and 0.68. The graph shows the values expressed as a percentage in their historical development for a number of countries.
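The difference between a market income Gini and a disposable income Gini can be sketched in Python. The incomes, the 30% flat tax and the equal lump-sum transfer are hypothetical and chosen only for illustration; real tax and transfer systems are far more complex. The helper `gini` uses the standard sorted-rank formula.

```python
def gini(values):
    """Gini coefficient via the sorted-rank formula (non-negative values)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # G = 2 * sum_i(i * x_i) / (n * sum(x)) - (n + 1) / n, ranks i = 1..n
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

def apply_flat_tax_and_transfer(market_incomes, tax_rate):
    """Hypothetical policy: tax all income at tax_rate, pay revenue back as equal lump sums."""
    revenue = sum(tax_rate * x for x in market_incomes)
    transfer = revenue / len(market_incomes)
    return [(1.0 - tax_rate) * x + transfer for x in market_incomes]

market = [10.0, 20.0, 30.0, 40.0, 100.0]                  # made-up market incomes
disposable = apply_flat_tax_and_transfer(market, tax_rate=0.3)

pre_tax_gini = gini(market)        # about 0.40
after_tax_gini = gini(disposable)  # about 0.28
print(pre_tax_gini, after_tax_gini)
```

In this toy case total income is unchanged, yet redistribution lowers the Gini coefficient, which is the pattern the OECD figures above describe.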

## Regional income Gini indices

According to UNICEF, the Latin America and Caribbean region had the highest net income Gini index in the world at 48.3, on an unweighted average basis, in 2008. The remaining regional averages were: sub-Saharan Africa (44.2), Asia (40.4), Middle East and North Africa (39.2), Eastern Europe and Central Asia (35.4), and high-income countries (30.9). Using the same method, the United States is claimed to have a Gini index of 36, while South Africa had the highest income Gini index score of 67.8.

## World income Gini index since 1800s

Taking the income distribution of all human beings, worldwide income inequality has been constantly increasing since the early 19th century. There was a steady increase in the global income inequality Gini score from 1820 to 2002, with a significant increase between 1980 and 2002. This trend appears to have peaked and begun a reversal with rapid economic growth in emerging economies, particularly in the large populations of BRIC countries. The table below presents the estimated world income Gini coefficients over the last 200 years, as calculated by Milanovic.

More detailed data from similar sources plots a continuous decline since 1988. This is attributed to globalization increasing incomes for billions of poor people, mostly in countries like China and India. Developing countries like Brazil have also improved basic services like health care, education, and sanitation; others like Chile and Mexico have enacted more progressive tax policies.

# Of social development

The Gini coefficient is widely used in fields as diverse as sociology, economics, health science, ecology, engineering and agriculture. For example, in social sciences and economics, in addition to income Gini coefficients, scholars have published education Gini coefficients and opportunity Gini coefficients.

## Education

The education Gini index estimates the inequality in education for a given population. It is used to discern trends in social development through educational attainment over time. In a study of 85 countries, three World Bank economists, Vinod Thomas, Yan Wang and Xibo Fan, estimated that Mali had the highest education Gini index, 0.92 in 1990 (implying very high inequality in educational attainment across the population), while the United States had the lowest education inequality Gini index, 0.14. Between 1960 and 1990, China, India and South Korea had the fastest drop in the education inequality Gini index. They also claim that the education Gini index for the United States slightly increased over the 1980–1990 period.

## Opportunity

Similar in concept to the income Gini coefficient, the opportunity Gini coefficient measures inequality of opportunity. The concept builds on Amartya Sen's suggestion that inequality coefficients of social development should be premised on the process of enlarging people's choices and enhancing their capabilities, rather than on the process of reducing income inequality. Kovacevic, in a review of the opportunity Gini coefficient, explains that the coefficient estimates how well a society enables its citizens to achieve success in life where the success is based on a person's choices, efforts and talents, not their background defined by a set of predetermined circumstances at birth, such as gender, race, place of birth, parents' income and other circumstances beyond the control of that individual. In 2003, Roemer reported that Italy and Spain exhibited the largest opportunity inequality Gini index amongst advanced economies.

## Income mobility

In 1978, Anthony Shorrocks introduced a measure based on income Gini coefficients to estimate income mobility. This measure, generalized by Maasoumi and Zandvakili, is now generally referred to as the Shorrocks index, sometimes as the Shorrocks mobility index or Shorrocks rigidity index. It attempts to estimate whether the income inequality Gini coefficient is permanent or temporary, and to what extent a country or region enables economic mobility to its people so that they can move from one (e.g., bottom 20%) income quantile to another (e.g., middle 20%) over time. In other words, the Shorrocks index compares inequality of short-term earnings, such as annual income of households, to inequality of long-term earnings, such as 5-year or 10-year total income for the same households. The Shorrocks index is calculated in a number of different ways, a common approach being the ratio of income Gini coefficients between short-term and long-term for the same region or country.

A 2010 study using social security income data for the United States since 1937 and Gini-based Shorrocks indices concludes that income mobility in the United States has had a complicated history, primarily due to the mass influx of women into the American labor force after World War II. Income inequality and income mobility trends have been different for men and women workers between 1937 and the 2000s. When men and women are considered together, the Gini coefficient-based Shorrocks index trends imply that long-term income inequality has been substantially reduced among all workers in recent decades for the United States. Other scholars, using just 1990s data or other short periods, have come to different conclusions. For example, Sastre and Ayala conclude from their study of income Gini coefficient data between 1993 and 1998 for six developed economies that France had the least income mobility, Italy the highest, and the United States and Germany intermediate levels of income mobility over those 5 years.
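The ratio described above can be sketched in Python. The formulation used here is an assumption for illustration: the Gini coefficient of each household's total income over all periods, divided by the average of the single-period Gini coefficients weighted by each period's share of total income. Other variants exist, and the function name `shorrocks_rigidity` and the two-household data are hypothetical.

```python
def gini(values):
    """Gini coefficient via the sorted-rank formula (non-negative values)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

def shorrocks_rigidity(income_by_period):
    """income_by_period: one list per period, same households in the same order.

    Returns long-term Gini divided by the income-weighted mean of short-term Ginis.
    A value near 1 means inequality is persistent; a value near 0 means high mobility.
    """
    long_term = [sum(household) for household in zip(*income_by_period)]
    grand_total = sum(long_term)
    weighted_short = sum(
        (sum(period) / grand_total) * gini(period) for period in income_by_period
    )
    if weighted_short == 0:
        raise ValueError("undefined when every period is perfectly equal")
    return gini(long_term) / weighted_short

# Two households swap places between periods: long-term incomes are equal.
swapping = [[10, 30], [30, 10]]
# Two households keep the same incomes: short-term inequality is permanent.
frozen = [[10, 30], [10, 30]]

print(shorrocks_rigidity(swapping))  # 0.0, full mobility
print(shorrocks_rigidity(frozen))    # 1.0, no mobility
```

A mobility index is then often reported as one minus the rigidity value.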

# Features

The Gini coefficient has features that make it useful as a measure of dispersion in a population, and inequalities in particular.

# Limitations

The Gini coefficient is a relative measure. It is possible for the Gini coefficient of a developing country to rise (due to increasing inequality of income) while the number of people in absolute poverty decreases. This is because the Gini coefficient measures relative, not absolute, wealth. Changing income inequality, measured by Gini coefficients, can be due to structural changes in a society such as growing population (baby booms, aging populations, increased divorce rates, extended family households splitting into nuclear families, emigration, immigration) and income mobility.

Gini coefficients are simple, and this simplicity can lead to oversights and can confuse the comparison of different populations. For example, while both Bangladesh (per capita income of \$1,693) and the Netherlands (per capita income of \$42,183) had an income Gini coefficient of 0.31 in 2010, the quality of life, economic opportunity and absolute income in these countries are very different; that is, countries may have identical Gini coefficients but differ greatly in wealth. Basic necessities may be available to all in a developed economy, while in an undeveloped economy with the same Gini coefficient, basic necessities may be unavailable to most or unequally available, due to lower absolute wealth.

## Different income distributions with the same Gini coefficient

Even when the total income of a population is the same, in certain situations two countries with different income distributions can have the same Gini index (e.g. cases when income Lorenz curves cross). Table A illustrates one such situation. Both countries have a Gini coefficient of 0.2, but the average income distributions for household groups are different. As another example, in a population where the lowest 50% of individuals have no income and the other 50% have equal income, the Gini coefficient is 0.5; whereas for another population where the lowest 75% of people have 25% of income and the top 25% have 75% of the income, the Gini index is also 0.5. Economies with similar incomes and Gini coefficients can have very different income distributions. Bellù and Liberati claim that ranking income inequality between two different populations based on their Gini indices is sometimes not possible, or misleading.

## Extreme wealth inequality, yet low income Gini coefficient

A Gini index does not contain information about absolute national or personal incomes. Populations can have very low income Gini indices, yet simultaneously very high wealth Gini indices. By measuring inequality in income, the Gini ignores the differential efficiency of use of household income. By ignoring wealth (except as it contributes to income) the Gini can create the appearance of inequality when the people compared are at different stages in their life. Wealthy countries such as Sweden can show a low Gini coefficient for disposable income of 0.31, thereby appearing equal, yet have a very high Gini coefficient for wealth of 0.79 to 0.86, thereby suggesting an extremely unequal wealth distribution in their societies. These factors are not assessed in income-based Gini.

## Small sample bias: sparsely populated regions more likely to have low Gini coefficient

The Gini index has a downward bias for small populations. Counties, states or countries with small populations and less diverse economies will tend to report small Gini coefficients. For economically diverse large population groups, a much higher coefficient is expected than for each of their regions. Taking the world economy as one, and the income distribution for all human beings, for example, different scholars estimate the global Gini index to range between 0.61 and 0.68.

As with other inequality coefficients, the Gini coefficient is influenced by the granularity of the measurements. For example, five 20% quantiles (low granularity) will usually yield a lower Gini coefficient than twenty 5% quantiles (high granularity) for the same distribution. Philippe Monfort has shown that using inconsistent or unspecified granularity limits the usefulness of Gini coefficient measurements.

The Gini coefficient measure gives different results when applied to individuals instead of households, for the same economy and same income distributions. If household data is used, the measured value of income Gini depends on how the household is defined. When different populations are not measured with consistent definitions, comparison is not meaningful. Deininger and Squire (1996) show that income Gini coefficients based on individual income differ from those based on household income. For example, for the United States, they find that the individual income-based Gini index was 0.35, while for France it was 0.43. According to their individual-focused method, in the 108 countries they studied, South Africa had the world's highest Gini coefficient at 0.62, Malaysia had Asia's highest Gini coefficient at 0.5, Brazil the highest at 0.57 in the Latin America and Caribbean region, and Turkey the highest at 0.5 in OECD countries.

## Gini coefficient is unable to discern the effects of structural changes in populations

Expanding on the importance of life-span measures, the Gini coefficient, as a point estimate of equality at a certain time, ignores life-span changes in income. Typically, increases in the proportion of young or old members of a society will drive apparent changes in equality, simply because people generally have lower incomes and wealth when they are young than when they are old. Because of this, factors such as age distribution within a population and mobility within income classes can create the appearance of inequality when none exists once demographic effects are taken into account. Thus a given economy may have a higher Gini coefficient at any one point in time compared to another, while the Gini coefficient calculated over individuals' lifetime income is actually lower than that of the apparently more equal (at a given point in time) economy. Essentially, what matters is not just inequality in any particular year, but the composition of the distribution over time.

Kwok claims that the income Gini coefficient for Hong Kong has been high (0.434 in 2010), in part because of structural changes in its population. Over recent decades, Hong Kong has witnessed increasing numbers of small households, elderly households and elderly people living alone. The combined income is now split into more households. Many old people are living separately from their children in Hong Kong. These social changes have caused substantial changes in household income distribution. The income Gini coefficient, claims Kwok, does not discern these structural changes in its society. Household money income distribution for the United States, summarized in Table C of this section, confirms that this issue is not limited to just Hong Kong. According to the US Census Bureau, between 1979 and 2010, the population of the United States experienced structural changes in overall households, the income for all income brackets increased in inflation-adjusted terms, and household income distributions shifted into higher income brackets over time, while the income Gini coefficient increased. (Congressional Budget Office: Trends in the Distribution of Household Income Between 1979 and 2007, October 2011; see pp. i–x, with definitions on pp. ii–iii.)
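The life-span effect described above can be illustrated with a minimal Python sketch. The society below is entirely hypothetical: every person earns 1 unit when young and 3 units when old, so lifetime incomes are identical, and only the age mix of the population at the moment of measurement varies.

```python
def gini(values):
    """Gini coefficient via the sorted-rank formula (non-negative values)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

YOUNG_INCOME, OLD_INCOME = 1.0, 3.0   # assumed earnings at each life stage

def snapshot_gini(n_young, n_old):
    """Point-in-time Gini for a population with the given age mix."""
    return gini([YOUNG_INCOME] * n_young + [OLD_INCOME] * n_old)

# Everyone has the same lifetime income, so lifetime inequality is zero.
lifetime_gini = gini([YOUNG_INCOME + OLD_INCOME] * 100)

balanced = snapshot_gini(50, 50)    # about 0.25
younger = snapshot_gini(80, 20)     # about 0.229

print(lifetime_gini, balanced, younger)
```

The point-in-time Gini is positive and shifts with the age structure alone, even though no individual is better off than any other over a lifetime.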
Another limitation of the Gini coefficient is that it is not a proper measure of egalitarianism, as it measures only income dispersion. For example, if two equally egalitarian countries pursue different immigration policies, the country accepting a higher proportion of low-income or impoverished migrants will report a higher Gini coefficient and therefore may appear to exhibit more income inequality.

## Inability to value benefits and income from informal economy affects Gini coefficient accuracy

Some countries distribute benefits that are difficult to value. Subsidized housing, medical care, education and other such services are difficult to value objectively, as their value depends on the quality and extent of the benefit. In the absence of free markets, valuing these income transfers as household income is subjective. The theoretical model of the Gini coefficient is limited to accepting correct or incorrect subjective assumptions.

In subsistence-driven and informal economies, people may have significant income in forms other than money, for example through subsistence farming or bartering. This income tends to accrue to the segment of the population that is below the poverty line or very poor, in emerging and transitional economy countries such as those in sub-Saharan Africa, Latin America, Asia and Eastern Europe. The informal economy accounts for over half of global employment and as much as 90 per cent of employment in some of the poorer sub-Saharan countries with high official Gini inequality coefficients. Schneider et al., in their 2010 study of 162 countries, report that about 31.2%, or about \$20 trillion, of the world's GDP is informal. In developing countries, the informal economy predominates for all income brackets except for the richer, urban upper income bracket populations. Even in developed economies, between 8% (United States) and 27% (Italy) of each nation's GDP is informal, and the resulting informal income predominates as a livelihood activity for those in the lowest income brackets. The value and distribution of the incomes from the informal or underground economy are difficult to quantify, making true income Gini coefficient estimates difficult. Different assumptions and quantifications of these incomes will yield different Gini coefficients.

The Gini coefficient has some mathematical limitations as well. It is not additive, and the Gini coefficients of different sets of people cannot be averaged to obtain the Gini coefficient of all the people in the sets.
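Three of the limitations in this section can be checked numerically with a short Python sketch. The 50%/50% and 75%/25% populations follow the example in the text; the incomes used for the granularity and non-additivity checks are made up for illustration, and the helper `group_means` is a hypothetical name.

```python
def gini(values):
    """Gini coefficient via the sorted-rank formula (non-negative values)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

def group_means(values, groups):
    """Mean income of each equal-sized quantile group (len(values) must divide evenly)."""
    xs = sorted(values)
    size = len(xs) // groups
    return [sum(xs[g * size:(g + 1) * size]) / size for g in range(groups)]

# 1. Different distributions, same Gini coefficient.
half_have_nothing = [0, 0, 1, 1]    # lowest 50% have no income, the rest are equal
quarter_has_most = [1, 1, 1, 9]     # lowest 75% hold 25% of income, top 25% hold 75%
print(gini(half_have_nothing), gini(quarter_has_most))   # 0.5 and 0.5

# 2. Granularity: coarser quantiles hide within-group inequality.
incomes = list(range(1, 21))                      # 20 hypothetical individuals
fine = gini(incomes)                              # each person separately
coarse = gini(group_means(incomes, groups=5))     # five 20% quantiles
print(fine, coarse)                               # coarse is smaller than fine

# 3. Non-additivity: subgroup Ginis do not average to the overall Gini.
region_a, region_b = [1, 1], [9, 9]               # each region perfectly equal
average_of_parts = (gini(region_a) + gini(region_b)) / 2
combined = gini(region_a + region_b)
print(average_of_parts, combined)                 # 0.0 versus about 0.4
```

The third check also illustrates the small-population effect noted above: each region reports a Gini of zero while the combined population does not.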

# Alternatives

Given the limitations of the Gini coefficient, other statistical methods are used in combination with it or as an alternative measure of population dispersion. For example, entropy measures are frequently used (e.g. the Atkinson index, or the Theil index and mean log deviation as special cases of the generalized entropy index). These measures attempt to compare the distribution of resources by intelligent agents in the market with a maximum entropy random distribution, which would occur if these agents acted like non-interacting particles in a closed system following the laws of statistical physics.
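As an illustration of one of the alternatives named above, the following Python sketch computes the Atkinson index. The formula used is the standard textbook definition, $A_\varepsilon = 1 - \frac{1}{\mu}\left(\frac{1}{n}\sum_i x_i^{1-\varepsilon}\right)^{1/(1-\varepsilon)}$ for $\varepsilon \neq 1$ and $A_1 = 1 - \frac{1}{\mu}\left(\prod_i x_i\right)^{1/n}$, which is not given in the text; the parameter name `epsilon` (inequality aversion) and the sample incomes are assumptions.

```python
import math

def atkinson(incomes, epsilon):
    """Atkinson index for strictly positive incomes and epsilon >= 0."""
    n = len(incomes)
    mean = sum(incomes) / n
    if epsilon == 1:
        # Limit case: compare the geometric mean with the arithmetic mean.
        geometric_mean = math.exp(sum(math.log(x) for x in incomes) / n)
        return 1.0 - geometric_mean / mean
    exponent = 1.0 - epsilon
    power_mean = (sum(x ** exponent for x in incomes) / n) ** (1.0 / exponent)
    return 1.0 - power_mean / mean

incomes = [1.0, 4.0]   # hypothetical two-person population
for eps in (0.5, 1, 2):
    # Approximately 0.1, 0.2 and 0.36: the index rises with inequality aversion.
    print(eps, atkinson(incomes, eps))
```

Unlike the Gini coefficient, the Atkinson index lets the analyst choose how heavily to weight the lower end of the distribution through `epsilon`.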

# Relation to other statistical measures

There is a summary measure of the diagnostic ability of a binary classifier system that is also called the Gini coefficient, which is defined as twice the area between the receiver operating characteristic (ROC) curve and its diagonal. It is related to the AUC (Area Under the ROC Curve) measure of performance by $AUC = \left(G+1\right)/2$, and to the Mann–Whitney U statistic. Although both Gini coefficients are defined as areas between certain curves and share certain properties, there is no direct simple relation between the Gini coefficient of statistical dispersion and the Gini coefficient of a classifier.

The Gini index is also related to the Pietra index; both are measures of statistical heterogeneity and are derived from the Lorenz curve and the diagonal line.

In certain fields such as ecology, the inverse Simpson's index $1/\lambda$ is used to quantify diversity, and this should not be confused with the Simpson index $\lambda$. These indicators are related to Gini. The inverse Simpson index increases with diversity, unlike the Simpson index and the Gini coefficient, which decrease with diversity. The Simpson index is in the range $[0, 1]$, where 0 means maximum and 1 means minimum diversity (or heterogeneity). Since diversity indices typically increase with increasing heterogeneity, the Simpson index is often transformed into the inverse Simpson index, or into the complement $1 - \lambda$, known as the Gini-Simpson index.
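The three diversity indicators above can be compared in a small Python sketch. It assumes the usual definition $\lambda = \sum_i p_i^2$, where $p_i$ is the proportion of individuals belonging to type $i$; the definition and the species counts are not from the text and are used only for illustration.

```python
def simpson_index(counts):
    """Simpson index: sum of squared proportions of each type."""
    total = sum(counts)
    return sum((c / total) ** 2 for c in counts)

def inverse_simpson(counts):
    return 1.0 / simpson_index(counts)

def gini_simpson(counts):
    return 1.0 - simpson_index(counts)

even_community = [25, 25, 25, 25]     # four equally common species
dominated_community = [97, 1, 1, 1]   # one species dominates

for name, counts in [("even", even_community), ("dominated", dominated_community)]:
    print(name, simpson_index(counts), inverse_simpson(counts), gini_simpson(counts))
# even:      lambda = 0.25, inverse = 4.0, Gini-Simpson = 0.75
# dominated: lambda is about 0.941, so both transformed indices are much lower
```

The more diverse community has the lower Simpson index and the higher inverse Simpson and Gini-Simpson values, matching the directions described above.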

# Other uses

Although the Gini coefficient is most popular in economics, it can in theory be applied in any field of science that studies a distribution. For example, in ecology the Gini coefficient has been used as a measure of biodiversity, where the cumulative proportion of species is plotted against the cumulative proportion of individuals. In health, it has been used as a measure of the inequality of health-related quality of life in a population. In education, it has been used as a measure of the inequality of universities. In chemistry it has been used to express the selectivity of protein kinase inhibitors against a panel of kinases. In engineering, it has been used to evaluate the fairness achieved by Internet routers in scheduling packet transmissions from different flows of traffic.

A 2005 study accessed US census data to measure home computer ownership, and used the Gini coefficient to measure inequalities amongst whites and African Americans. Results indicated that although decreasing overall, home computer ownership inequality is substantially smaller among white households. A 2016 peer-reviewed study titled "Employing the Gini coefficient to measure participation inequality in treatment-focused Digital Health Social Networks" illustrated that the Gini coefficient was helpful and accurate in measuring shifts in inequality; however, as a standalone metric it failed to incorporate overall network size.

The Gini coefficient is sometimes used for the measurement of the discriminatory power of rating systems in credit risk management. The discriminatory power refers to a credit risk model's ability to differentiate between defaulting and non-defaulting clients. The formula $G_1$, in the calculation section above, may be used for the final model and also at the individual model factor level, to quantify the discriminatory power of individual factors. It is related to the accuracy ratio in population assessment models.
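The discriminatory power of a credit risk score can be sketched in Python using the classifier Gini coefficient, which the article relates to the AUC by $AUC = (G+1)/2$, i.e. $G = 2 \cdot AUC - 1$. This sketch does not implement the $G_1$ formula mentioned above; it computes the AUC by pairwise comparison (the Mann–Whitney approach), and the scores and default labels are hypothetical.

```python
def auc(scores, labels):
    """Share of (defaulter, non-defaulter) pairs where the defaulter scores higher.

    Ties count as one half. labels: 1 = defaulted, 0 = did not default.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

def classifier_gini(scores, labels):
    """Classifier Gini coefficient from the relation AUC = (G + 1) / 2."""
    return 2.0 * auc(scores, labels) - 1.0

labels = [1, 1, 0, 0]                 # two defaulters, two non-defaulters
perfect_model = [0.9, 0.8, 0.3, 0.2]  # every defaulter outranks every non-defaulter
weaker_model = [0.9, 0.4, 0.6, 0.2]   # one pair is ranked the wrong way round

print(classifier_gini(perfect_model, labels))  # 1.0
print(classifier_gini(weaker_model, labels))   # 0.5
```

The same function can be applied to a single model factor instead of the final score to gauge that factor's individual discriminatory power.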

# References


* Deutsche Bundesbank: Do banks diversify loan portfolios?, 2005 (on using e.g. the Gini coefficient for risk evaluation of loan portfolios)
* Measuring Software Project Risk With The Gini Coefficient, an application of the Gini coefficient to software
* Travis Hale, University of Texas Inequality Project: The Theoretical Basics of Popular Inequality Measures, online computation of examples (1A, 1B)
* Article from The Guardian analysing inequality in the UK 1974–2006
* World Income Inequality Database
* Income Distribution and Poverty in OECD Countries
* U.S. Income Distribution: Just How Unequal?