
In population genetics, Ewens's sampling formula describes the probabilities associated with counts of how many different alleles are observed a given number of times in the sample.


Definition

Ewens's sampling formula, introduced by Warren Ewens, states that under certain conditions (specified below), if a random sample of ''n'' gametes is taken from a population and classified according to the gene at a particular locus, then the probability that there are ''a''<sub>1</sub> alleles represented once in the sample, ''a''<sub>2</sub> alleles represented twice, and so on, is
:\operatorname{Pr}(a_1,\dots,a_n; \theta)=\frac{n!}{\theta(\theta+1)\cdots(\theta+n-1)}\prod_{j=1}^n\frac{\theta^{a_j}}{j^{a_j}a_j!},
for some positive number ''θ'' representing the population mutation rate, whenever a_1, \ldots, a_n is a sequence of nonnegative integers such that
:a_1+2a_2+3a_3+\cdots+na_n=\sum_{i=1}^{n} i a_i = n.

The phrase "under certain conditions" used above is made precise by the following assumptions:
* The sample size ''n'' is small by comparison to the size of the whole population; and
* The population is in statistical equilibrium under mutation and genetic drift, and the role of selection at the locus in question is negligible; and
* Every mutant allele is novel.

This is a probability distribution on the set of all partitions of the integer ''n''. Among probabilists and statisticians it is often called the multivariate Ewens distribution.
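As an illustration, the formula can be evaluated exactly with rational arithmetic. The following is a minimal sketch, assuming the standard form of the formula with normalising factor n!/(θ(θ+1)⋯(θ+n−1)). The names `ewens_probability`, `partitions` and `to_counts` are hypothetical helpers introduced here, not part of any library.

```python
from collections import Counter
from fractions import Fraction
from math import factorial


def ewens_probability(counts, theta):
    """Probability of the allele counts (a_1, ..., a_n) for a positive theta."""
    n = sum(j * a for j, a in enumerate(counts, start=1))
    if n != len(counts):
        raise ValueError("counts must satisfy a_1 + 2*a_2 + ... + n*a_n = n")
    theta = Fraction(theta)
    # Rising factorial theta * (theta + 1) * ... * (theta + n - 1).
    rising = Fraction(1)
    for k in range(n):
        rising *= theta + k
    prob = Fraction(factorial(n)) / rising
    # Product over j of theta^{a_j} / (j^{a_j} * a_j!).
    for j, a in enumerate(counts, start=1):
        prob *= theta ** a / (j ** a * factorial(a))
    return prob


def partitions(n, max_part=None):
    """Yield every partition of n as a non-increasing tuple of parts."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def to_counts(parts, n):
    """Convert a partition of n into the vector (a_1, ..., a_n)."""
    tally = Counter(parts)
    return tuple(tally.get(j, 0) for j in range(1, n + 1))


if __name__ == "__main__":
    n, theta = 3, 1
    for parts in partitions(n):
        print(parts, ewens_probability(to_counts(parts, n), theta))
```

With ''n'' = 3 and ''θ'' = 1 the three partitions (3), (2, 1) and (1, 1, 1) receive probabilities 1/3, 1/2 and 1/6, which sum to 1, as expected of a probability distribution on the partitions of ''n''.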


Mathematical properties

In the limit ''θ'' → 0, the probability that all ''n'' genes are the same approaches 1. When ''θ'' = 1, the distribution is precisely that of the integer partition induced by the cycle lengths of a uniformly distributed random permutation. As ''θ'' → ∞, the probability that no two of the ''n'' genes are the same approaches 1.

This family of probability distributions enjoys the property that if, after the sample of ''n'' is taken, ''m'' of the ''n'' gametes are chosen without replacement, then the resulting probability distribution on the set of all partitions of the smaller integer ''m'' is just what the formula above would give if ''m'' were put in place of ''n''.

The Ewens distribution arises naturally from the Chinese restaurant process.
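The link to the Chinese restaurant process can be checked numerically for small ''n''. The sketch below assumes the usual one-parameter seating rule: with ''k'' customers already seated, the next customer joins an existing table of size ''s'' with probability ''s''/(''k'' + ''θ'') and opens a new table with probability ''θ''/(''k'' + ''θ''). It computes the exact distribution of table sizes and compares it with the sampling formula. The function names are hypothetical and chosen only for this illustration.

```python
from collections import Counter, defaultdict
from fractions import Fraction
from math import factorial


def crp_partition_distribution(n, theta):
    """Exact law of the sorted table sizes after n customers are seated."""
    theta = Fraction(theta)
    dist = {(): Fraction(1)}
    for seated in range(n):
        denom = seated + theta
        nxt = defaultdict(Fraction)
        for sizes, prob in dist.items():
            # The next customer opens a new table.
            opened = tuple(sorted(sizes + (1,), reverse=True))
            nxt[opened] += prob * theta / denom
            # The next customer joins an existing table of size s.
            for idx, s in enumerate(sizes):
                grown = sizes[:idx] + (s + 1,) + sizes[idx + 1:]
                nxt[tuple(sorted(grown, reverse=True))] += prob * s / denom
        dist = dict(nxt)
    return dist


def ewens_probability_of_parts(parts, theta):
    """Ewens's formula evaluated on a partition given as a tuple of parts."""
    theta = Fraction(theta)
    n = sum(parts)
    rising = Fraction(1)
    for k in range(n):
        rising *= theta + k
    prob = Fraction(factorial(n)) / rising
    # Sizes that do not occur have a_j = 0 and contribute a factor of 1.
    for size, a in Counter(parts).items():
        prob *= theta ** a / (size ** a * factorial(a))
    return prob


if __name__ == "__main__":
    theta = Fraction(3, 2)
    dist = crp_partition_distribution(5, theta)
    for parts, prob in sorted(dist.items(), reverse=True):
        print(parts, prob, prob == ewens_probability_of_parts(parts, theta))
```

Exact fractions are used rather than floating point so that the two distributions can be compared for equality. The enumeration grows with the number of partitions of ''n'', so this check is practical only for small samples.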


See also

* Chinese restaurant table distribution
* Coalescent theory
* Unified neutral theory of biodiversity
* Biomathematics


Notes

* Warren Ewens, "The sampling theory of selectively neutral alleles", ''Theoretical Population Biology'', volume 3, pages 87–112, 1972.
* H. Crane, "The Ubiquitous Ewens Sampling Formula", ''Statistical Science'', 31:1 (Feb 2016). This article introduces a series of seven articles about Ewens Sampling in a special issue of the journal.
* J.F.C. Kingman, "Random partitions in population genetics", ''Proceedings of the Royal Society of London, Series A, Mathematical and Physical Sciences'', volume 361, number 1704, 1978.
* S. Tavaré and W. J. Ewens, "The Multivariate Ewens distribution" (1997, Chapter 41 from the reference below).
* N.L. Johnson, S. Kotz, and N. Balakrishnan (1997) ''Discrete Multivariate Distributions'', Wiley.