
Mark and recapture is a method commonly used in ecology to estimate the size of an animal population where it is impractical to count every individual. A portion of the population is captured, marked, and released. Later, another portion is captured and the number of marked individuals within that sample is counted. Since the proportion of marked individuals in the second sample should equal the proportion of marked individuals in the whole population, an estimate of the total population size can be obtained by dividing the number of individuals originally marked by the proportion of marked individuals in the second sample. Other names for this method, or closely related methods, include capture-recapture, capture-mark-recapture, mark-recapture, sight-resight, mark-release-recapture, multiple systems estimation, band recovery, the Petersen method, and the Lincoln method.

Another major application for these methods is in epidemiology, where they are used to estimate the completeness of ascertainment of disease registers. Typical applications include estimating the number of people needing particular services (e.g. services for children with learning disabilities, services for medically frail elderly living in the community), or with particular conditions (e.g. illegal drug addicts, people infected with HIV).


Field work related to mark-recapture

Typically a researcher visits a study area and uses traps to capture a group of individuals alive. Each of these individuals is marked with a unique identifier (e.g., a numbered tag or band), and then is released unharmed back into the environment. A mark-recapture method was first used for ecological study in 1896 by C.G. Johannes Petersen to estimate plaice (''Pleuronectes platessa'') populations.

Sufficient time is allowed to pass for the marked individuals to redistribute themselves among the unmarked population. Next, the researcher returns and captures another sample of individuals. Some individuals in this second sample will have been marked during the initial visit and are now known as recaptures. Other organisms captured during the second visit will not have been captured during the first visit to the study area. These unmarked animals are usually given a tag or band during the second visit and then are released.

Population size can be estimated from as few as two visits to the study area. Commonly, more than two visits are made, particularly if estimates of survival or movement are desired. Regardless of the total number of visits, the researcher simply records the date of each capture of each individual. The "capture histories" generated are analyzed mathematically to estimate population size, survival, or movement.

When capturing and marking organisms, ecologists need to consider the welfare of the organisms. If the chosen identifier harms the organism, then its behavior might become irregular.
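As an illustration of how capture records turn into capture histories, the following minimal sketch encodes each individual as a string of 1s and 0s, one character per visit. The tags, the visit numbers (used here in place of dates), and the function name `capture_histories` are hypothetical and chosen only for the example.

```python
from collections import defaultdict

# Hypothetical capture log: (individual tag, visit number). All values are made up.
captures = [
    ("T01", 1), ("T02", 1), ("T03", 1),
    ("T02", 2), ("T03", 2), ("T04", 2), ("T05", 2),
]

def capture_histories(records, n_visits):
    """Map each tag to a string whose i-th character is '1' if the
    animal was caught on visit i+1 and '0' otherwise."""
    seen = defaultdict(set)
    for tag, visit in records:
        seen[tag].add(visit)
    return {
        tag: "".join("1" if v in visits else "0" for v in range(1, n_visits + 1))
        for tag, visits in seen.items()
    }

histories = capture_histories(captures, n_visits=2)

# Counts needed by a two-visit analysis.
marked_first = sum(h[0] == "1" for h in histories.values())   # marked on visit 1
caught_second = sum(h[1] == "1" for h in histories.values())  # caught on visit 2
recaptured = sum(h == "11" for h in histories.values())       # caught on both visits

print(histories)
print(marked_first, caught_second, recaptured)
```

With more than two visits the same encoding simply yields longer strings, which is the form most capture-recapture software expects as input.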


Notation

Let
: ''N'' = Number of animals in the population
: ''n'' = Number of animals marked on the first visit
: ''K'' = Number of animals captured on the second visit
: ''k'' = Number of recaptured animals that were marked

A biologist wants to estimate the size of a population of turtles in a lake. She captures 10 turtles on her first visit to the lake, and marks their backs with paint. A week later she returns to the lake and captures 15 turtles. Five of these 15 turtles have paint on their backs, indicating that they are recaptured animals. This example is (''n'', ''K'', ''k'') = (10, 15, 5). The problem is to estimate ''N''.


Lincoln–Petersen estimator

The Lincoln–Petersen method (also known as the Petersen–Lincoln index or Lincoln index) can be used to estimate population size if only two visits are made to the study area. This method assumes that the study population is "closed". In other words, the two visits to the study area are close enough in time so that no individuals die, are born, or move into or out of the study area between visits. The model also assumes that no marks fall off animals between visits to the field site by the researcher, and that the researcher correctly records all marks. Given those conditions, estimated population size is:
:\hat{N} = \frac{nK}{k}.


Derivation

It is assumed that all individuals have the same probability of being captured in the second sample, regardless of whether they were previously captured in the first sample (with only two samples, this assumption cannot be tested directly). This implies that the proportion of the second sample that is marked (''k''/''K'') should equal the proportion of the total population that is marked (''n''/''N''). Equivalently, if half of the marked individuals were recaptured, it would be assumed that half of the total population was included in the second sample. In symbols,
:\frac{k}{K} = \frac{n}{N}.
A rearrangement of this gives
:\hat{N} = \frac{nK}{k},
the formula used for the Lincoln–Petersen method.


Sample calculation

In the example (''n'', ''K'', ''k'') = (10, 15, 5) the Lincoln–Petersen method estimates that there are 30 turtles in the lake.
:\hat{N} = \frac{nK}{k} = \frac{10 \times 15}{5} = 30
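The calculation can be sketched in a few lines of Python. The function name `lincoln_petersen` and the error raised when there are no recaptures are illustrative choices, not part of the method's definition.

```python
def lincoln_petersen(n: int, K: int, k: int) -> float:
    """Lincoln-Petersen estimate of population size.

    n: animals marked on the first visit
    K: animals captured on the second visit
    k: animals in the second sample that carry a mark
    """
    if k <= 0:
        # With no recaptures the ratio nK/k is undefined.
        raise ValueError("at least one recapture is needed")
    return n * K / k

print(lincoln_petersen(10, 15, 5))  # 30.0
```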


Chapman estimator

The Lincoln–Petersen estimator is asymptotically unbiased as sample size approaches infinity, but is biased at small sample sizes. An alternative, less biased estimator of population size is given by the Chapman estimator:
:\hat{N}_C = \frac{(n+1)(K+1)}{k+1} - 1


Sample calculation

The example (''n'', ''K'', ''k'') = (10, 15, 5) gives
:\hat{N}_C = \frac{(n+1)(K+1)}{k+1} - 1 = \frac{11 \times 16}{6} - 1 \approx 28.3
Note that the answer provided by this equation must be truncated, not rounded. Thus, the Chapman method estimates 28 turtles in the lake.

Surprisingly, Chapman's estimate was one conjecture from a range of possible estimators: "In practice, the whole number immediately less than (''K''+1)(''n''+1)/(''k''+1) or even ''Kn''/(''k''+1) will be the estimate. The above form is more convenient for mathematical purposes." (see footnote, page 144). Chapman also found the estimator could have considerable negative bias for small ''Kn''/''N'' (page 146), but was unconcerned because the estimated standard deviations were large for these cases.
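A minimal sketch of the same calculation, assuming the usual form of the Chapman estimator, (''n''+1)(''K''+1)/(''k''+1) − 1, and truncating the result as described above. The function name `chapman` is an illustrative choice.

```python
import math

def chapman(n: int, K: int, k: int) -> float:
    """Chapman estimate of population size (before truncation)."""
    return (n + 1) * (K + 1) / (k + 1) - 1

estimate = chapman(10, 15, 5)
print(round(estimate, 1))    # 28.3
print(math.floor(estimate))  # 28, truncated rather than rounded
```

Unlike the Lincoln–Petersen ratio, this expression stays finite when there are no recaptures (''k'' = 0), because the denominator is ''k'' + 1.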


Confidence interval

An approximate 100(1-\alpha)\% confidence interval for the population size ''N'' can be obtained as:
:K + n - k - 0.5 + \frac{(K-k+0.5)(n-k+0.5)}{k+0.5} \exp(\pm z_{\alpha/2}\hat{\sigma}_{0.5}),
where z_{\alpha/2} corresponds to the 1-\alpha/2 quantile of a standard normal random variable, and
:\hat{\sigma}_{0.5} = \sqrt{\frac{1}{k+0.5}+\frac{1}{K-k+0.5}+\frac{1}{n-k+0.5}+\frac{k+0.5}{(n-k+0.5)(K-k+0.5)}}.
The example (''n'', ''K'', ''k'') = (10, 15, 5) gives the estimate ''N'' ≈ 30 with a 95% confidence interval of 22 to 65.

It has been shown that this confidence interval has actual coverage probabilities that are close to the nominal 100(1-\alpha)\% level even for small populations and extreme capture probabilities (near to 0 or 1), in which cases other confidence intervals fail to achieve the nominal coverage levels.
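The interval can be checked numerically. The sketch below assumes the interval has the form ''K'' + ''n'' − ''k'' − 0.5 + [(''K''−''k''+0.5)(''n''−''k''+0.5)/(''k''+0.5)]·exp(±''z''·σ), with σ² equal to 1/(''k''+0.5) + 1/(''K''−''k''+0.5) + 1/(''n''−''k''+0.5) + (''k''+0.5)/[(''n''−''k''+0.5)(''K''−''k''+0.5)]; this form reproduces the 22 to 65 interval quoted for the turtle example. The function name `transformed_logit_ci` is hypothetical.

```python
import math
from statistics import NormalDist

def transformed_logit_ci(n: int, K: int, k: int, alpha: float = 0.05):
    """Approximate 100(1 - alpha)% confidence interval for N."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # standard normal quantile
    a = K - k + 0.5   # unmarked animals in the second sample (shifted by 0.5)
    b = n - k + 0.5   # marked animals not recaptured (shifted by 0.5)
    c = k + 0.5       # recaptures (shifted by 0.5)
    sigma = math.sqrt(1 / c + 1 / a + 1 / b + c / (a * b))
    base = K + n - k - 0.5
    scale = a * b / c
    lower = base + scale * math.exp(-z * sigma)
    upper = base + scale * math.exp(z * sigma)
    return lower, upper

low, high = transformed_logit_ci(10, 15, 5)
print(round(low), round(high))  # 22 65
```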


Bayesian estimate

The mean value ± standard deviation is
:N \approx \mu \pm \sqrt{\mu \epsilon}
where
:\mu = \frac{(n-1)(K-1)}{k-2} for ''k'' > 2
:\epsilon = \frac{(n-k+1)(K-k+1)}{(k-2)(k-3)} for ''k'' > 3
A derivation is found here: Talk:Mark and recapture#Statistical treatment.

The example (''n'', ''K'', ''k'') = (10, 15, 5) gives the estimate ''N'' ≈ 42 ± 21.5
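A short numerical check of the turtle example, assuming μ = (''n''−1)(''K''−1)/(''k''−2) and ε = (''n''−''k''+1)(''K''−''k''+1)/[(''k''−2)(''k''−3)], which together reproduce the quoted 42 ± 21.5. The function name `bayesian_estimate` is an illustrative choice.

```python
import math

def bayesian_estimate(n: int, K: int, k: int):
    """Return (mean, standard deviation) of the Bayesian estimate of N."""
    if k <= 3:
        # The mean needs k > 2 and the standard deviation needs k > 3.
        raise ValueError("more than three recaptures are needed")
    mu = (n - 1) * (K - 1) / (k - 2)
    eps = (n - k + 1) * (K - k + 1) / ((k - 2) * (k - 3))
    return mu, math.sqrt(mu * eps)

mean, sd = bayesian_estimate(10, 15, 5)
print(mean, round(sd, 1))  # 42.0 21.5
```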


Capture probability

The capture probability refers to the probability of detecting an individual animal or person of interest, and has been used in both ecology and epidemiology for detecting animal or human diseases, respectively.

The capture probability is often defined as a two-variable model, in which ''f'' is defined as the fraction of a finite resource devoted to detecting the animal or person of interest from a high-risk sector of an animal or human population, and ''q'' is the frequency with which the problem (e.g., an animal disease) occurs in the high-risk versus the low-risk sector. For example, an application of the model in the 1920s was to detect typhoid carriers in London, who were either arriving from zones with high rates of tuberculosis (probability ''q'' that a passenger with the disease came from such an area, where ''q''>0.5), or low rates (probability 1−''q''). It was posited that only 5 out of every 100 of the travelers could be detected, and 10 out of every 100 were from the high-risk area. Then the capture probability ''P'' was defined as:
:P = \frac{5}{10}fq+\frac{5}{90}(1-f)(1-q),
where the first term refers to the probability of detection (capture probability) in a high-risk zone, and the latter term refers to the probability of detection in a low-risk zone. Importantly, the formula can be rewritten as a linear equation in terms of ''f'':
:P = \left(\frac{5}{10}q-\frac{5}{90}(1-q)\right)f + \frac{5}{90}(1-q).
Because this is a linear function of ''f'', it follows that for values of ''q'' for which the slope of this line (the coefficient of ''f'') is positive, all of the detection resource should be devoted to the high-risk population (''f'' should be set to 1 to maximize the capture probability), whereas for other values of ''q'', for which the slope of the line is negative, all of the detection resource should be devoted to the low-risk population (''f'' should be set to 0). We can solve the above equation for the values of ''q'' for which the slope will be positive to determine the values for which ''f'' should be set to 1 to maximize the capture probability:
:\left(\frac{5}{10}q - \frac{5}{90}(1-q)\right) > 0,
which simplifies to:
:q > \frac{1}{10}.
This is an example of linear optimization. In more complex cases, where more than one resource ''f'' is devoted to more than two areas, multivariate optimization is often used, through the simplex algorithm or its derivatives.
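The all-or-nothing allocation can be illustrated with a small sketch. It assumes detection rates of 5/10 in the high-risk sector and 5/90 in the low-risk sector, read off from the example's 5 detectable travelers per 100 and its 10/90 split between sectors; these rates, and the function names, are assumptions made for illustration.

```python
HIGH_RATE = 5 / 10  # assumed detection rate in the high-risk sector
LOW_RATE = 5 / 90   # assumed detection rate in the low-risk sector

def capture_probability(f: float, q: float) -> float:
    """P = HIGH_RATE*f*q + LOW_RATE*(1-f)*(1-q)."""
    return HIGH_RATE * f * q + LOW_RATE * (1 - f) * (1 - q)

def slope_in_f(q: float) -> float:
    """Coefficient of f when P is written as a linear function of f."""
    return HIGH_RATE * q - LOW_RATE * (1 - q)

def best_allocation(q: float) -> float:
    """Because P is linear in f, the maximum sits at an endpoint of [0, 1]."""
    return 1.0 if slope_in_f(q) > 0 else 0.0

def threshold_q() -> float:
    """Value of q at which the slope changes sign."""
    return LOW_RATE / (HIGH_RATE + LOW_RATE)

print(best_allocation(0.6))   # 1.0: devote everything to the high-risk sector
print(best_allocation(0.05))  # 0.0: devote everything to the low-risk sector
```

Under these assumed rates, `threshold_q()` evaluates to approximately 0.1, so any ''q'' above one tenth favors the high-risk sector.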


More than two visits

The literature on the analysis of capture-recapture studies has blossomed since the early 1990s. There are very elaborate statistical models available for the analysis of these experiments (McCrea, R.S. and Morgan, B.J.T., 2014). A simple model which easily accommodates the three-source, or three-visit, study is to fit a Poisson regression model. Sophisticated mark-recapture models can be fit with several packages for the open-source R programming language. These include "Spatially Explicit Capture-Recapture (secr)", "Loglinear Models for Capture-Recapture Experiments (Rcapture)", and "Mark-Recapture Distance Sampling (mrds)". Such models can also be fit with specialized programs such as MARK or E-SURGE.

Other related methods which are often used include the Jolly–Seber model (used in open populations and for multiple census estimates) and Schnabel estimators (an expansion to the Lincoln–Petersen method for closed populations). These are described in detail by Sutherland.
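As a rough illustration of how the two-visit idea extends to several visits, the sketch below implements the basic Schnabel estimate, which sums (animals caught × marked animals already at large) over the visits and divides by the total number of recaptures. The formula is not given in the text above, the function name `schnabel` is hypothetical, the three-visit counts are made up, and the sketch assumes every unmarked animal caught is marked and released.

```python
def schnabel(caught, recaptured):
    """Basic Schnabel estimate for a closed population.

    caught[t]     : animals captured on visit t
    recaptured[t] : animals captured on visit t that already carried a mark
    """
    marked_at_large = 0   # marked animals in the population before the visit
    numerator = 0
    total_recaptures = 0
    for c, r in zip(caught, recaptured):
        numerator += c * marked_at_large
        total_recaptures += r
        marked_at_large += c - r  # newly marked animals are released
    if total_recaptures == 0:
        raise ValueError("at least one recapture is needed")
    return numerator / total_recaptures

# With two visits this reduces to the Lincoln-Petersen estimate.
print(schnabel([10, 15], [0, 5]))                    # 30.0
# Hypothetical three-visit data.
print(round(schnabel([10, 15, 12], [0, 5, 6]), 1))   # 35.5
```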


Integrated approaches

Modelling mark-recapture data is trending towards a more integrative approach (Maunder, 2003), which combines mark-recapture data with population dynamics models and other types of data. The integrated approach is more computationally demanding, but extracts more information from the data, improving parameter and uncertainty estimates (Maunder, 2001).

* Maunder, M.N. (2003). "Paradigm shifts in fisheries stock assessment: from integrated analysis to Bayesian analysis and back again", ''Natural Resource Modeling'', 16, 465–475.
* Maunder, M.N. (2001). "Integrated Tagging and Catch-at-Age Analysis (ITCAAN)". In ''Spatial Processes and Management of Fish Populations'', edited by G.H. Kruse, N. Bez, A. Booth, M.W. Dorn, S. Hills, R.N. Lipcius, D. Pelletier, C. Roy, S.J. Smith, and D. Witherell, Alaska Sea Grant College Program Report No. AK-SG-01-02, University of Alaska Fairbanks, pp. 123–146.


See also

* German tank problem, for estimation of population size when the elements are numbered.
* Tag and release
* Abundance estimation
* GPS wildlife tracking




Further reading

* Lincoln, F. C. (1930). "Calculating Waterfowl Abundance on the Basis of Banding Returns", ''United States Department of Agriculture Circular'', 118, 1–4.
* Petersen, C. G. J. (1896). "The Yearly Immigration of Young Plaice Into the Limfjord From the German Sea", ''Report of the Danish Biological Station (1895)'', 6, 5–84.
* Schofield, J. R. (2007). "Beyond Defect Removal: Latent Defect Estimation With Capture-Recapture Method", ''Crosstalk'', August 2007, 27–29.


External links


A historical introduction to capture-recapture methods

Analysis of capture-recapture data