Watterson Estimator

	Watterson Estimator In population genetics, the Watterson estimator is a method for describing the genetic diversity in a population. It was developed by Margaret Wu and G. A. Watterson in the 1970s. It is computed by counting the number of polymorphic (segregating) sites in a sample. It estimates the "population mutation rate" (the product of the effective population size and the neutral mutation rate), \theta = 4N_e\mu, where N_e is the effective population size and \mu is the per-generation mutation rate of the population of interest. The assumptions made are that there is a sample of n haploid individuals from the population of interest with effective size N_e, that n \ll N_e, and that there are infinitely many sites capable of varying (so that mutations never overlay or reverse one another). Because the number of segregating sites counted will increase with the number of sequences looked at, the correction factor a_n is used. The estimate of \theta, often d ...
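As a rough illustration of the correction described in this entry, the sketch below assumes the standard form of the estimator, \hat{\theta}_W = S / a_n with a_n = \sum_{i=1}^{n-1} 1/i, where S is the number of segregating sites among n aligned sequences. The function names and the toy alignment are hypothetical and not taken from the source.

```python
def correction_factor(n):
    """a_n = 1 + 1/2 + ... + 1/(n-1), the (n-1)-th harmonic number."""
    if n < 2:
        raise ValueError("need at least two sequences")
    return sum(1.0 / i for i in range(1, n))


def count_segregating_sites(seqs):
    """Count alignment columns in which more than one state is observed."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def watterson_theta(seqs):
    """Estimate theta for the whole region as S / a_n."""
    return count_segregating_sites(seqs) / correction_factor(len(seqs))


# Hypothetical toy alignment: 4 sequences, 2 segregating sites.
toy_alignment = ["AAAA", "AATA", "CATA", "CATA"]
print(watterson_theta(toy_alignment))
```

Dividing the result by the alignment length would give a per-site estimate instead of a per-region one.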
	Population Genetics Population genetics is a subfield of genetics that deals with genetic differences within and among populations, and is a part of evolutionary biology. Studies in this branch of biology examine such phenomena as adaptation, speciation, and population structure. Population genetics was a vital ingredient in the emergence of the modern evolutionary synthesis. Its primary founders were Sewall Wright, J. B. S. Haldane and Ronald Fisher, who also laid the foundations for the related discipline of quantitative genetics. Traditionally a highly mathematical discipline, modern population genetics encompasses theoretical, laboratory, and field work. Population genetic models are used both for statistical inference from DNA sequence data and for proof/disproof of concept. What sets population genetics apart from newer, more phenotypic approaches to modelling evolution, such as evolutionary game theory and evolu ...
	Bias Of An Estimator In statistics, the bias of an estimator (or bias function) is the difference between this estimator's expected value and the true value of the parameter being estimated. An estimator or decision rule with zero bias is called ''unbiased''. In statistics, "bias" is a property of an estimator. Bias is a distinct concept from consistency: consistent estimators converge in probability to the true value of the parameter, but may be biased or unbiased (see bias versus consistency for more). All else being equal, an unbiased estimator is preferable to a biased estimator, although in practice, biased estimators (with generally small bias) are frequently used. When a biased estimator is used, bounds on the bias are calculated. A biased estimator may be used for various reasons: because an unbiased estimator does not exist without further assumptions about a population; because an estimator is difficult to compute (as in unbiased estimation of standard deviation); because a biased esti ...
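The definition of bias (expected value of the estimator minus the true parameter) can be checked exactly on a tiny case. The sketch below is an illustrative assumption, not part of the source: it enumerates every sample of size 2 drawn with replacement from the two-value population {0, 1} and compares the divide-by-n variance estimator with the divide-by-(n-1) one.

```python
from itertools import product
from statistics import pvariance, variance

# Hypothetical population; its true variance is 0.25.
population = [0, 1]
true_variance = pvariance(population)


def expected_value(estimator, values, sample_size):
    """Average an estimator over all equally likely samples drawn with replacement."""
    samples = list(product(values, repeat=sample_size))
    return sum(estimator(sample) for sample in samples) / len(samples)


# pvariance divides by n; variance divides by n - 1.
bias_divide_by_n = expected_value(pvariance, population, 2) - true_variance
bias_divide_by_n_minus_1 = expected_value(variance, population, 2) - true_variance

print(bias_divide_by_n, bias_divide_by_n_minus_1)
```

In this small case the divide-by-n estimator underestimates the variance on average, while the divide-by-(n-1) estimator has zero bias.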
	Ewens Sampling Formula In population genetics, Ewens's sampling formula describes the probabilities associated with counts of how many different alleles are observed a given number of times in the sample. Definition Ewens's sampling formula, introduced by Warren Ewens, states that under certain conditions (specified below), if a random sample of ''n'' gametes is taken from a population and classified according to the gene at a particular locus then the probability that there are ''a''1 alleles represented once in the sample, and ''a''2 alleles represented twice, and so on, is :\Pr(a_1,\dots,a_n; \theta)=\frac{n!}{\theta(\theta+1)\cdots(\theta+n-1)}\prod_{j=1}^n \frac{\theta^{a_j}}{j^{a_j} a_j!}, for some positive number ''θ'' representing the population mutation rate, whenever a_1, \ldots, a_n is a sequence of nonnegative integers such that :a_1+2a_2+3a_3+\cdots+na_n=\sum_{i=1}^{n} i a_i = n.\, The phrase "under certain conditions" used above is made precise by the following assumptions: * The sample size ''n'' is small by comparison to the size of the whole population; and * ...
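A minimal sketch of evaluating Ewens's sampling formula numerically, assuming the standard form \Pr(a_1,\dots,a_n;\theta) = \frac{n!}{\theta(\theta+1)\cdots(\theta+n-1)} \prod_{j=1}^n \frac{\theta^{a_j}}{j^{a_j} a_j!}. The function name and the list-based encoding of the counts are hypothetical choices made for illustration.

```python
from math import factorial, prod


def ewens_probability(counts, theta):
    """Probability of an allele-count configuration under Ewens's formula.

    counts[j - 1] is a_j, the number of alleles seen exactly j times;
    the sample size n is recovered from sum(j * a_j).
    """
    if theta <= 0:
        raise ValueError("theta must be positive")
    if any(a < 0 for a in counts):
        raise ValueError("counts must be nonnegative")
    n = sum(j * a for j, a in enumerate(counts, start=1))
    rising_factorial = prod(theta + k for k in range(n))
    configuration_term = prod(
        theta**a / (j**a * factorial(a)) for j, a in enumerate(counts, start=1)
    )
    return factorial(n) / rising_factorial * configuration_term


# For n = 3 the three possible configurations should have total probability 1.
configs = [[3, 0, 0], [1, 1, 0], [0, 0, 1]]
print(sum(ewens_probability(c, theta=2.0) for c in configs))
```

For n = 2 the formula reduces to \theta/(\theta+1) for two distinct alleles and 1/(\theta+1) for a single allele seen twice, which is a convenient sanity check.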
	Coupon Collector's Problem In probability theory, the coupon collector's problem refers to the mathematical analysis of "collect all coupons and win" contests. It asks the following question: if each box of a given product (e.g., breakfast cereals) contains a coupon, and there are ''n'' different types of coupons, what is the probability that more than ''t'' boxes need to be bought to collect all ''n'' coupons? An alternative statement is: given ''n'' coupons, how many coupons do you expect you need to draw with replacement before having drawn each coupon at least once? The mathematical analysis of the problem reveals that the expected number of trials needed grows as \Theta(n\log(n)). For example, when ''n'' = 50 it takes about 225 (E(50) = 50(1 + 1/2 + 1/3 + ... + 1/50) = 224.9603, the expected number of trials to collect all 50 coupons; the approximation n\log n+\gamma n+1/2 for this expected number gives in this case 50\log 50+50\gamma+1/2 \approx 195.6011+28.8608+0.5\approx 224.9619) trials on ave ...
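The two figures quoted for ''n'' = 50 can be reproduced with a few lines. This is a small sketch using the exact expectation n(1 + 1/2 + ... + 1/n) and the approximation n\log n+\gamma n+1/2 from the entry; the function names are hypothetical.

```python
import math

# Euler-Mascheroni constant (gamma), hard-coded to double precision.
EULER_GAMMA = 0.5772156649015329


def expected_draws(n):
    """Exact expected number of draws to see all n coupons: n * H_n."""
    return n * sum(1.0 / k for k in range(1, n + 1))


def approximate_expected_draws(n):
    """Asymptotic approximation n*ln(n) + gamma*n + 1/2."""
    return n * math.log(n) + EULER_GAMMA * n + 0.5


print(round(expected_draws(50), 4))
print(round(approximate_expected_draws(50), 4))
```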
	Tajima's D Tajima's D is a population genetic test statistic created by and named after the Japanese researcher Fumio Tajima. Tajima's D is computed as the difference between two measures of genetic diversity: the mean number of pairwise differences and the number of segregating sites, each scaled so that they are expected to be the same in a neutrally evolving population of constant size. The purpose of Tajima's D test is to distinguish between a DNA sequence evolving randomly ("neutrally") and one evolving under a non-random process, including directional selection or balancing selection, demographic expansion or contraction, genetic hitchhiking, or introgression. A randomly evolving DNA sequence contains mutations with no effect on the fitness and survival of an organism. The randomly evolving mutations are called "neutral", while mutations under selection are "non-neutral". For example, a mutation that causes prenatal death or severe disease would be expected to be under selection. In the ...
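To make the "difference between two scaled diversity measures" concrete, here is a hedged sketch of the statistic. It assumes the commonly cited constants from Tajima's 1989 formulation (a_1, a_2, b_1, b_2, c_1, c_2, e_1, e_2), which the truncated entry above does not spell out; the function name and toy alignment are hypothetical.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for aligned sequences; returns None if nothing segregates."""
    n = len(seqs)
    if n < 4:
        # For n < 4 the variance term used below is zero.
        raise ValueError("need at least four sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")

    seg_sites = sum(1 for col in zip(*seqs) if len(set(col)) > 1)
    if seg_sites == 0:
        return None

    # Mean number of pairwise differences across all pairs of sequences.
    pairs = list(combinations(seqs, 2))
    mean_pairwise = sum(
        sum(x != y for x, y in zip(a, b)) for a, b in pairs
    ) / len(pairs)

    # Constants assumed from the standard formulation.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (mean_pairwise - seg_sites / a1) / sqrt(variance)


# Hypothetical toy alignment: 4 sequences, 2 segregating sites.
print(tajimas_d(["AAAA", "AATA", "CATA", "CATA"]))
```

A sample this small only shows the mechanics of the calculation; a value from four short sequences says nothing meaningful about selection or demography.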
	Nucleotide Diversity Nucleotide diversity is a concept in molecular genetics which is used to measure the degree of polymorphism within a population. One commonly used measure of nucleotide diversity was first introduced by Masatoshi Nei and Wen-Hsiung Li in 1979. This measure is defined as the average number of nucleotide differences per site between two DNA sequences in all possible pairs in the sample population, and is denoted by \pi. An estimator for \pi is given by: : \hat{\pi} = \frac{n}{n-1} \sum_{ij} x_i x_j \pi_{ij} = \frac{n}{n-1} \sum_{i=2}^n \sum_{j=1}^{i-1} 2 x_i x_j \pi_{ij} where x_i and x_j are the respective frequencies of the i th and j th sequences, \pi_{ij} is the number of nucleotide differences per nucleotide site between the i th and j th sequences, and n is the number of sequences in the sample. The term in front of the sums guarantees an unbiased estimator, which does not depend on how many sequences you sample. Nucleotide diversity is a measure of genetic variation. It is usually associat ...
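The estimator \hat{\pi} = \frac{n}{n-1} \sum_{i=2}^n \sum_{j=1}^{i-1} 2 x_i x_j \pi_{ij} can be sketched directly. The example below treats every sampled sequence as its own entry with frequency x_i = 1/n, which is an assumption made here for simplicity; under it the estimator reduces to the mean per-site difference over all pairs. Function names and the toy alignment are hypothetical.

```python
from itertools import combinations


def per_site_differences(seq_a, seq_b):
    """pi_ij: proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(seq_a, seq_b)) / len(seq_a)


def nucleotide_diversity(seqs):
    """pi-hat with each sampled sequence given frequency 1/n."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    freq = 1.0 / n
    total = sum(
        2 * freq * freq * per_site_differences(a, b)
        for a, b in combinations(seqs, 2)
    )
    return n / (n - 1) * total


# Hypothetical toy alignment: 7 pairwise differences over 6 pairs and 4 sites.
print(nucleotide_diversity(["AAAA", "AATA", "CATA", "CATA"]))
```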
	Exponential Growth Exponential growth occurs when a quantity grows as an exponential function of time. The quantity grows at a rate directly proportional to its present size. For example, when it is 3 times as big as it is now, it will be growing 3 times as fast as it is now. In more technical language, the instantaneous rate of change (that is, the derivative) of the quantity with respect to an independent variable is proportional to the quantity itself. Often the independent variable is time. Described as a function, a quantity undergoing exponential growth is an exponential function of time, that is, the variable representing time is the exponent (in contrast to other types of growth, such as quadratic growth). Exponential growth is the inverse of logarithmic growth. Not all cases of growth at an always increasing rate are instances of exponential growth. For example, the function f(x) = x^3 grows at an ever increasing rate, but much more slowly than an exponential function. For example, w ...
	Variance In probability theory and statistics, variance is the expected value of the squared deviation from the mean of a random variable. The standard deviation (SD) is obtained as the square root of the variance. Variance is a measure of dispersion, meaning it is a measure of how far a set of numbers is spread out from their average value. It is the second central moment of a distribution, and the covariance of the random variable with itself, and it is often represented by \sigma^2, s^2, \operatorname{Var}(X), V(X), or \mathbb{V}(X). An advantage of variance as a measure of dispersion is that it is more amenable to algebraic manipulation than other measures of dispersion such as the expected absolute deviation; for example, the variance of a sum of uncorrelated random variables is equal to the sum of their variances. A disadvantage of the variance for practical applications is that, unlike the standard deviation, its units differ from the random variable, which is why the standard devi ...
	Coalescent Theory Coalescent theory is a model of how alleles sampled from a population may have originated from a common ancestor. In the simplest case, coalescent theory assumes no recombination, no natural selection, and no gene flow or population structure, meaning that each variant is equally likely to have been passed from one generation to the next. The model looks backward in time, merging alleles into a single ancestral copy according to a random process in coalescence events. Under this model, the expected time between successive coalescence events increases almost exponentially back in time (with wide variance). Variance in the model comes from both the random passing of alleles from one generation to the next, and the random occurrence of mutations in these alleles. The mathematical theory of the coalescent was developed independently by several groups in the earl ...
	Genetic Diversity Genetic diversity is the total number of genetic characteristics in the genetic makeup of a species. It ranges widely, from the number of species to differences within species, and can be correlated to the span of survival for a species. It is distinguished from ''genetic variability'', which describes the tendency of genetic characteristics to vary. Genetic diversity serves as a way for populations to adapt to changing environments. With more variation, it is more likely that some individuals in a population will possess variations of alleles that are suited for the environment. Those individuals are more likely to survive to produce offspring bearing that allele. The population will continue for more generations because of the success of these individuals. The academic field of population genetics includes several hypotheses and theories regarding genetic diversity. The neutral theory of evolution proposes that diversity is the result of the accumulation of neutral substitu ...
	Harmonic Number In mathematics, the n-th harmonic number is the sum of the reciprocals of the first n natural numbers: H_n= 1+\frac{1}{2}+\frac{1}{3}+\cdots+\frac{1}{n} =\sum_{k=1}^n \frac{1}{k}. Starting from n = 1, the sequence of harmonic numbers begins: 1, \frac{3}{2}, \frac{11}{6}, \frac{25}{12}, \frac{137}{60}, \dots Harmonic numbers are related to the harmonic mean in that the n-th harmonic number is also n times the reciprocal of the harmonic mean of the first n positive integers. Harmonic numbers have been studied since antiquity and are important in various branches of number theory. They are sometimes loosely termed harmonic series, are closely related to the Riemann zeta function, and appear in the expressions of various special functions. The harmonic numbers roughly approximate the natural logarithm function and thus the associated harmonic series grows without limit, albeit slowly. In 1737, Leonhard Euler used the divergence of the harmonic series to provide a new proof of the infinity of prime numbers. His work was extended into the ...
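The first few harmonic numbers, H_n = \sum_{k=1}^n \frac{1}{k}, can be computed exactly with rational arithmetic. The short sketch below uses Python's `fractions.Fraction`; the function name is a hypothetical choice.

```python
from fractions import Fraction


def harmonic_number(n):
    """Exact n-th harmonic number H_n = 1 + 1/2 + ... + 1/n (H_0 = 0)."""
    if n < 0:
        raise ValueError("n must be nonnegative")
    return sum((Fraction(1, k) for k in range(1, n + 1)), Fraction(0))


# First five values as exact fractions.
print([str(harmonic_number(n)) for n in range(1, 6)])
```

Exact fractions avoid rounding error for small n; for large n the denominators grow quickly, so a floating-point sum is usually more practical.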
	Single-nucleotide Polymorphism In genetics and bioinformatics, a single-nucleotide polymorphism (SNP; plural SNPs) is a germline substitution of a single nucleotide at a specific position in the genome. Although certain definitions require the substitution to be present in a sufficiently large fraction of the population (e.g. 1% or more), many publications do not apply such a frequency threshold. For example, a G nucleotide present at a specific location in a reference genome may be replaced by an A in a minority of individuals. The two possible nucleotide variations of this SNP (G or A) are called alleles. SNPs can help explain differences in susceptibility to a wide range of diseases across a population. For example, a common SNP in the CFH gene is associated with increased risk of age-related macular degeneration. Differences in the severity of an illness or response to treatments may also be manifestations of genetic variations caused by SNPs. For example, two ...