The effective population size (''N''<sub>''e''</sub>) is the size of an idealised population that would experience the same rate of genetic drift as the real population. Idealised populations are those following simple one-locus models that comply with the assumptions of the neutral theory of molecular evolution. The effective population size is normally smaller than the census population size ''N'', partly because chance events prevent some individuals from breeding, and partly due to background selection and genetic hitchhiking.
The same real population could have a different effective population size for different properties of interest, such as genetic drift (or more precisely, the speed of coalescence) over one generation vs. over many generations. Within a species, areas of the genome that have more genes and/or less genetic recombination tend to have lower effective population sizes, because of the effects of selection at linked sites. In a population with selection at many loci and abundant linkage disequilibrium, the coalescent effective population size may not reflect the census population size at all, or may reflect its logarithm.
The concept of effective population size was introduced in the field of population genetics in 1931 by the American geneticist Sewall Wright. Some versions of the effective population size are used in wildlife conservation.
Empirical measurements
In a rare experiment that directly measured genetic drift one generation at a time, in ''Drosophila'' populations of census size 16, the effective population size was 11.5. This measurement was achieved through studying changes in the frequency of a neutral allele from one generation to another in over 100 replicate populations.
More commonly, effective population size is estimated indirectly by comparing data on current within-species genetic diversity to theoretical expectations. According to the neutral theory of molecular evolution, an idealised diploid population will have a pairwise nucleotide diversity equal to 4''μN''<sub>''e''</sub>, where ''μ'' is the mutation rate. The effective population size can therefore be estimated empirically by dividing the nucleotide diversity by 4''μ''.
This captures the cumulative effects of genetic drift, genetic hitchhiking, and background selection over longer timescales. More advanced methods, permitting a changing effective population size over time, have also been developed.
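As a minimal sketch of the estimate described above, the following Python function divides a pairwise nucleotide diversity by 4''μ''. The function name `ne_from_diversity` and the input values are illustrative assumptions, not measurements from any study.

```python
def ne_from_diversity(pi, mu):
    """Estimate N_e as pi / (4 * mu) for an idealised diploid population.

    pi: pairwise nucleotide diversity (per site)
    mu: mutation rate (per site per generation)
    """
    if mu <= 0:
        raise ValueError("mutation rate must be positive")
    return pi / (4 * mu)


# Hypothetical inputs, chosen only to illustrate the arithmetic.
print(ne_from_diversity(pi=0.001, mu=1.25e-8))
```

With these made-up inputs the estimate is about 20,000. Because the estimate scales inversely with ''μ'', any uncertainty in the mutation rate carries over directly to ''N''<sub>''e''</sub>.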
The effective size measured to reflect these longer timescales may have little relationship to the number of individuals physically present in a population. Measured effective population sizes vary between genes in the same population, being low in genome areas of low recombination and high in genome areas of high recombination. Sojourn times are proportional to N in neutral theory, but for alleles under selection, sojourn times are proportional to log(N).
Genetic hitchhiking can cause neutral mutations to have sojourn times proportional to log(N): this may explain the relationship between measured effective population size and the local recombination rate.
If the recombination map of recombination frequencies along chromosomes is known, ''N''<sub>''e''</sub> can be inferred from ''r''<sub>P</sub><sup>2</sup> = 1 / (1 + 4''N''<sub>''e''</sub>''r''), where ''r''<sub>P</sub> is the Pearson correlation coefficient between loci and ''r'' is the recombination frequency between them. This expression can be interpreted as the probability that two lineages coalesce before one allele on either lineage recombines onto some third lineage.
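The relation ''r''<sub>P</sub><sup>2</sup> = 1 / (1 + 4''N''<sub>''e''</sub>''r'') can be rearranged to ''N''<sub>''e''</sub> = (1/''r''<sub>P</sub><sup>2</sup> − 1) / (4''r''). The sketch below applies that rearrangement. It assumes that ''r'' is the recombination frequency between the two loci, and the function names and numbers are hypothetical.

```python
def expected_r2(n_e, r):
    """Expected squared correlation between loci: 1 / (1 + 4 * N_e * r)."""
    return 1.0 / (1.0 + 4.0 * n_e * r)


def ne_from_ld(r2, r):
    """Invert the relation above to get N_e from an observed r_P^2."""
    if not 0 < r2 <= 1:
        raise ValueError("r2 must lie in (0, 1]")
    if r <= 0:
        raise ValueError("recombination frequency must be positive")
    return (1.0 / r2 - 1.0) / (4.0 * r)


# Hypothetical values: r_P^2 = 0.2 at recombination frequency 0.001.
print(ne_from_ld(r2=0.2, r=0.001))
```

In practice ''r''<sub>P</sub><sup>2</sup> would be averaged over many pairs of loci, and a sample-size correction is usually needed. This sketch leaves out both steps.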
A survey of publications on 102 mostly wildlife animal and plant species yielded 192 ''N''<sub>''e''</sub>/''N'' ratios. Seven different estimation methods were used in the surveyed studies, and the ratios ranged widely, from 10<sup>−6</sup> for Pacific oysters to 0.994 for humans, with an average of 0.34 across the examined species. Based on these data, the authors subsequently estimated more comprehensive ratios, accounting for fluctuations in population size, variance in family size and unequal sex ratio. These ratios average only 0.10–0.11.
A genealogical analysis of human hunter-gatherers (Eskimos) determined the effective-to-census population size ratio for haploid (mitochondrial DNA, Y-chromosomal DNA) and diploid (autosomal DNA) loci separately: the ratio was estimated as 0.6–0.7 for autosomal and X-chromosomal DNA, 0.7–0.9 for mitochondrial DNA and 0.5 for Y-chromosomal DNA.
Selection effective size
In an idealised Wright-Fisher model, the fate of an allele, beginning at an intermediate frequency, is largely determined by selection if the selection coefficient ''s'' ≫ 1/''N'', and largely determined by neutral genetic drift if ''s'' ≪ 1/''N''. In real populations, the cutoff value of ''s'' may depend instead on local recombination rates.
This limit to selection in a real population may be captured in a toy Wright-Fisher simulation through the appropriate choice of ''N''<sub>''e''</sub>. Populations with different selection effective population sizes are predicted to evolve profoundly different genome architectures.
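The following is one possible toy Wright-Fisher simulation of the kind mentioned above, offered as a sketch rather than a reference implementation. It assumes 2''N''<sub>''e''</sub> gene copies, selection acting on gene copies with relative fitness 1 + ''s'', and binomial resampling each generation. All function names and parameter values are hypothetical.

```python
import random


def select(p, s):
    """Allele frequency after selection with relative fitness 1 + s."""
    return p * (1 + s) / (1 + p * s)


def fixes(p0, n_e, s, rng):
    """Run one replicate until the allele is lost or fixed; True if fixed."""
    copies = 2 * n_e  # assumption: diploid population, 2 * N_e gene copies
    count = round(p0 * copies)
    while 0 < count < copies:
        p = select(count / copies, s)
        # Binomial resampling of the next generation (random genetic drift).
        count = sum(rng.random() < p for _ in range(copies))
    return count == copies


def fixation_fraction(p0, n_e, s, replicates=200, seed=1):
    """Fraction of replicates in which the allele reaches fixation."""
    rng = random.Random(seed)
    return sum(fixes(p0, n_e, s, rng) for _ in range(replicates)) / replicates


# s far above 1/N_e versus the neutral case, both starting at frequency 0.5.
print(fixation_fraction(0.5, 20, 0.5), fixation_fraction(0.5, 20, 0.0))
```

When ''s'' is far above 1/''N''<sub>''e''</sub> the allele fixes in almost every replicate, whereas with ''s'' = 0 the outcome is left to drift and roughly half of the replicates fix.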
History of theory
Ronald Fisher and Sewall Wright originally defined effective population size as "the number of breeding individuals in an idealised population that would show the same amount of dispersion of allele frequencies under random genetic drift or the same amount of inbreeding as the population under consideration". This implied two potentially different effective population sizes, based either on the one-generation increase in variance across replicate populations (variance effective population size), or on the one-generation change in the inbreeding coefficient (inbreeding effective population size). These two are closely linked, and derived from ''F''-statistics, but they are not identical.
Today, the effective population size is usually estimated empirically with respect to the amount of within-species genetic diversity divided by the mutation rate, yielding a coalescent effective population size that reflects the cumulative effects of genetic drift, background selection, and genetic hitchhiking over longer time periods.
Another important effective population size is the selection effective population size 1/''s''<sub>critical</sub>, where ''s''<sub>critical</sub> is the critical value of the selection coefficient at which selection becomes more important than genetic drift.
Variance effective size
In the Wright-Fisher idealized population model, the conditional variance of the allele frequency <math>p'</math>, given the allele frequency <math>p</math> in the previous generation, is
:<math>\operatorname{var}(p' \mid p) = \frac{p(1-p)}{2N}.</math>
Let <math>\widehat{\operatorname{var}}(p' \mid p)</math> denote the same, typically larger, variance in the actual population under consideration. The variance effective population size <math>N_e^{(v)}</math> is defined as the size of an idealized population with the same variance. This is found by substituting <math>\widehat{\operatorname{var}}(p' \mid p)</math> for <math>\operatorname{var}(p' \mid p)</math> and solving for <math>N</math>, which gives
:<math>N_e^{(v)} = \frac{p(1-p)}{2\,\widehat{\operatorname{var}}(p' \mid p)}.</math>
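As an illustration of this definition, the sketch below assumes the standard Wright-Fisher sampling variance ''p''(1 − ''p'')/(2''N'') and solves it for ''N'' using a variance observed across replicate populations. The function name and the replicate frequencies are made up for the example.

```python
def variance_effective_size(p, next_freqs):
    """Variance effective size from replicate next-generation frequencies.

    p: allele frequency shared by all replicates in the parent generation
    next_freqs: allele frequencies observed one generation later
    """
    # Observed dispersion of the new frequencies around the starting value p.
    observed_var = sum((q - p) ** 2 for q in next_freqs) / len(next_freqs)
    if observed_var == 0:
        raise ValueError("no variance observed; N_e is undefined")
    # Solve p(1 - p) / (2 N) = observed_var for N.
    return p * (1 - p) / (2 * observed_var)


# Two hypothetical replicates that started at p = 0.5.
print(variance_effective_size(0.5, [0.45, 0.55]))
```

For these made-up frequencies the result is about 50. A real estimate would need many replicates, as in the ''Drosophila'' experiment described earlier.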
In the following examples, one or more of the assumptions of a strictly idealised population are relaxed, while other assumptions are retained. The variance effective population size of the more relaxed population model is then calculated with respect to the strict model.
Variations in population size
Population size varies over time. Suppose there are ''t'' non-overlapping generations; the effective population size is then given by the harmonic mean of the population sizes:
:<math>\frac{1}{N_e} = \frac{1}{t} \sum_{i=1}^{t} \frac{1}{N_i}</math>
For example, say the population size was ''N'' = 10, 100, 50, 80, 20, 500 for six generations (''t'' = 6). Then the effective population size is the harmonic mean of these, giving:
:<math>\frac{1}{N_e} = \frac{\tfrac{1}{10} + \tfrac{1}{100} + \tfrac{1}{50} + \tfrac{1}{80} + \tfrac{1}{20} + \tfrac{1}{500}}{6} = \frac{0.1945}{6} \approx 0.0324, \qquad N_e \approx 30.8</math>
Note this is less than the arithmetic mean of the population size, which in this example is 126.7. The harmonic mean tends to be dominated by the smallest bottleneck that the population goes through.
Dioeciousness
If a population is dioecious, i.e. there is no self-fertilisation, then
:<math>N_e = N + \tfrac{1}{2}</math>
or more generally,
:<math>N_e = N + \tfrac{D}{2}</math>
where ''D'' represents dioeciousness and may take the value 0 (for not dioecious) or 1 (for dioecious).
When ''N'' is large, ''N''<sub>''e''</sub> approximately equals ''N'', so this is usually trivial and often ignored:
:<math>N_e = N + \tfrac{1}{2} \approx N</math>
Variance in reproductive success
If population size is to remain constant, each individual must contribute on average two gametes to the next generation. An idealized population assumes that this follows a Poisson distribution, so that the variance of the number of gametes contributed, ''k'', is equal to the mean number contributed, i.e. 2:
:<math>\operatorname{var}(k) = \bar{k} = 2.</math>
However, in natural populations the variance is often larger than this. The vast majority of individuals may have no offspring, and the next generation stems only from a small number of individuals, so
:<math>\operatorname{var}(k) > 2.</math>
The effective population size is then smaller, and given by:
:<math>N_e^{(v)} = \frac{4N - 2D}{2 + \operatorname{var}(k)}</math>
Note that if the variance of ''k'' is less than 2, ''N''<sub>''e''</sub> is greater than ''N''. In the extreme case of a population experiencing no variation in family size, in a laboratory population in which the number of offspring is artificially controlled, ''V''<sub>''k''</sub> = 0 and ''N''<sub>''e''</sub> = 2''N''.
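The effect of family-size variance can be sketched numerically. The snippet assumes the commonly quoted form ''N''<sub>''e''</sub> = 4''N'' / (2 + ''V''<sub>''k''</sub>), which ignores the small dioecy correction. The function name and the census size of 100 are illustrative.

```python
def ne_from_family_variance(n, var_k):
    """N_e = 4N / (2 + V_k), for mean gamete contribution k = 2.

    Assumption: the dioecy correction is ignored, so this is the D = 0 case.
    """
    if var_k < 0:
        raise ValueError("variance cannot be negative")
    return 4.0 * n / (2.0 + var_k)


for var_k in (0, 2, 6):
    # V_k = 0: controlled family size; V_k = 2: Poisson; V_k = 6: skewed.
    print(var_k, ne_from_family_variance(100, var_k))
```

With ''N'' = 100 this gives 200, 100 and 50, matching the statements that ''N''<sub>''e''</sub> = 2''N'' when ''V''<sub>''k''</sub> = 0 and that ''N''<sub>''e''</sub> falls below ''N'' once the variance exceeds 2.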
Non-Fisherian sex-ratios
When the sex ratio of a population varies from the Fisherian 1:1 ratio, effective population size is given by:
:<math>N_e^{(v)} = N_e^{(F)} = \frac{4 N_m N_f}{N_m + N_f}</math>
where ''N''<sub>''m''</sub> is the number of males and ''N''<sub>''f''</sub> the number of females. For example, with 80 males and 20 females (an absolute population size of 100):
:<math>N_e = \frac{4 \times 80 \times 20}{80 + 20} = 64</math>
Again, this results in ''N''<sub>''e''</sub> being less than ''N''.
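The example with 80 males and 20 females can be reproduced as follows, assuming the standard unequal-sex-ratio formula ''N''<sub>''e''</sub> = 4''N''<sub>''m''</sub>''N''<sub>''f''</sub> / (''N''<sub>''m''</sub> + ''N''<sub>''f''</sub>). The function name is hypothetical.

```python
def ne_sex_ratio(n_males, n_females):
    """N_e = 4 * N_m * N_f / (N_m + N_f) for unequal numbers of each sex."""
    total = n_males + n_females
    if total == 0:
        raise ValueError("population is empty")
    return 4.0 * n_males * n_females / total


print(ne_sex_ratio(80, 20))  # the 80:20 example from the text
print(ne_sex_ratio(50, 50))  # an even sex ratio recovers N_e = N
```

The formula is symmetric in the two sexes, so 20 males and 80 females give the same value.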
Inbreeding effective size
Alternatively, the effective population size may be defined by noting how the average inbreeding coefficient changes from one generation to the next, and then defining ''N''<sub>''e''</sub> as the size of the idealized population that has the same change in average inbreeding coefficient as the population under consideration. The presentation follows Kempthorne (1957).
For the idealized population, the inbreeding coefficients follow the recurrence equation
:<math>F_t = \frac{1}{N}\left(\frac{1 + F_{t-2}}{2}\right) + \left(1 - \frac{1}{N}\right) F_{t-1}.</math>
Using the panmictic index (1 − ''F'') instead of the inbreeding coefficient, we get the approximate recurrence equation
:<math>1 - F_t = P_t = P_0 \left(1 - \frac{1}{2N}\right)^t.</math>
The difference per generation is
:<math>\frac{P_{t+1}}{P_t} = 1 - \frac{1}{2N}.</math>
The inbreeding effective size can be found by solving
:<math>\frac{P_{t+1}}{P_t} = 1 - \frac{1}{2N_e^{(F)}}.</math>
This is
:<math>N_e^{(F)} = \frac{1}{2\left(1 - \frac{P_{t+1}}{P_t}\right)}.</math>
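A short sketch of this calculation: assuming the panmictic index ''P'' = 1 − ''F'' shrinks by a factor 1 − 1/(2''N''<sub>''e''</sub>) each generation, ''N''<sub>''e''</sub> follows from two consecutive average inbreeding coefficients. The function name and the inbreeding values are illustrative assumptions.

```python
def inbreeding_effective_size(f_t, f_next):
    """N_e from the change in average inbreeding coefficient.

    Uses P = 1 - F and solves P_next / P_t = 1 - 1 / (2 * N_e) for N_e.
    """
    p_t, p_next = 1.0 - f_t, 1.0 - f_next
    if p_t <= 0:
        raise ValueError("population is already completely inbred")
    ratio = p_next / p_t
    if ratio >= 1:
        raise ValueError("inbreeding did not increase; N_e is undefined")
    return 1.0 / (2.0 * (1.0 - ratio))


# Hypothetical values: F rises from 0 to 0.01 in one generation.
print(inbreeding_effective_size(0.0, 0.01))
```

For these inputs the result is about 50, since 1 − 1/(2 × 50) = 0.99.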
Theory of overlapping generations and age-structured populations
When organisms live longer than one breeding season, effective population sizes have to take into account the life tables for the species.
Haploid
Assume a haploid population with discrete age structure. An example might be an organism that can survive several discrete breeding seasons. Further, define the following age structure characteristics:
:<math>v_i</math> = Fisher's reproductive value for age <math>i</math>,
:<math>\ell_i</math> = the chance an individual will survive to age <math>i</math>, and
:<math>N_0</math> = the number of newborn individuals per breeding season.
The generation time is calculated as
:<math>T = \sum_{i=0}^\infty \ell_i v_i</math> = average age of a reproducing individual
Then, the inbreeding effective population size is
:<math>N_e^{(F)} = \frac{N_0 T}{1 + \sum_i \ell_{i+1}^2 v_{i+1}^2 \left(\frac{1}{\ell_{i+1}} - \frac{1}{\ell_i}\right)}.</math>
Diploid
Similarly, the inbreeding effective number can be calculated for a diploid population with discrete age structure. This was first given by Johnson, but the notation more closely resembles Emigh and Pollak.
Assume the same basic parameters for the life table as given for the haploid case, but distinguishing between male and female, such as ''N''<sub>0</sub><sup>''ƒ''</sup> and ''N''<sub>0</sub><sup>''m''</sup> for the number of newborn females and males, respectively (notice lower case ''ƒ'' for females, compared to upper case ''F'' for inbreeding).
The inbreeding effective number is
:
See also
* Minimum viable population
* Small population size
External links
* https://web.archive.org/web/20050524144622/http://www.kursus.kvl.dk/shares/vetgen/_Popgen/genetics/3/6.htm — on Københavns Universitet.