In population genetics, an ancestry-informative marker (AIM) is a single-nucleotide polymorphism that exhibits substantially different frequencies between different populations. A set of many AIMs can be used to estimate the proportion of ancestry of an individual derived from each population.
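As a minimal sketch of how a set of AIMs might yield an ancestry proportion, the example below assumes a simple two-population model in which an individual's expected allele frequency at each marker is a weighted mix of the two source frequencies, and it picks the weight that maximizes a binomial likelihood by grid search. The model, the function names, and all frequencies are illustrative assumptions, not a method taken from a specific study.

```python
import math

def log_likelihood(m, genotypes, freqs_a, freqs_b):
    """Log-likelihood of admixture proportion m (share of ancestry from population A).

    genotypes: copies (0, 1, or 2) of the reference allele at each AIM.
    freqs_a, freqs_b: reference-allele frequencies in populations A and B.
    """
    total = 0.0
    for g, pa, pb in zip(genotypes, freqs_a, freqs_b):
        # Expected allele frequency for an individual with ancestry share m from A.
        q = m * pa + (1.0 - m) * pb
        # Clamp to avoid log(0) when a frequency is exactly 0 or 1.
        q = min(max(q, 1e-9), 1.0 - 1e-9)
        # Binomial likelihood for g copies out of 2 (constant term omitted).
        total += g * math.log(q) + (2 - g) * math.log(1.0 - q)
    return total

def estimate_ancestry(genotypes, freqs_a, freqs_b, steps=100):
    """Grid-search the value of m in [0, 1] that maximizes the likelihood."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda m: log_likelihood(m, genotypes, freqs_a, freqs_b))

# Hypothetical frequencies for five AIMs (illustrative values, not real data).
freqs_a = [0.95, 0.90, 0.05, 0.85, 0.10]
freqs_b = [0.05, 0.10, 0.95, 0.15, 0.90]
genotypes = [2, 2, 0, 2, 0]  # carries the allele typical of population A at every marker
print(estimate_ancestry(genotypes, freqs_a, freqs_b))
```

Markers whose frequencies differ little between the two populations barely change the likelihood as m varies, which is why panels favor SNPs with large frequency differences.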
A single-nucleotide polymorphism (SNP) is a modification of a single nucleotide base within a DNA sequence. There are an estimated 15 million SNP sites (out of roughly 3 billion base pairs, or about 0.5%) from among which AIMs may potentially be selected. The SNPs that relate to ancestry are often traced to the Y chromosome and mitochondrial DNA because both are inherited from one parent, which eliminates the complexities that come with parental gene recombination. SNP mutations are rare, so sequences with SNPs tend to be passed down through generations rather than altered each generation. However, because any given SNP is relatively common in a population, analysts must examine groups of SNPs (otherwise known as AIMs) to determine someone's ancestry. Using statistical methods such as apparent error rate and Improved Bayesian Estimate, the set of SNPs with the highest accuracy for predicting a specific ancestry can be found.
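To illustrate how an apparent error rate could be used to compare candidate SNP sets, the sketch below fits a simple maximum-likelihood population classifier on a toy training set and reports the fraction of those same training samples it misclassifies. The classifier, the pseudo-count smoothing, and the genotype data are assumptions chosen for illustration; the Improved Bayesian Estimate is not shown here.

```python
import math

def allele_frequencies(samples):
    """Per-marker reference-allele frequency from genotypes coded 0, 1, or 2."""
    n_markers = len(samples[0])
    # Add one pseudo-count allele of each type so no frequency is exactly 0 or 1.
    return [
        (sum(s[i] for s in samples) + 1) / (2 * len(samples) + 2)
        for i in range(n_markers)
    ]

def log_likelihood(genotype, freqs, markers):
    """Binomial log-likelihood of a genotype over the chosen marker subset."""
    return sum(
        genotype[i] * math.log(freqs[i]) + (2 - genotype[i]) * math.log(1 - freqs[i])
        for i in markers
    )

def apparent_error_rate(training, markers):
    """Fraction of training samples misclassified by a classifier fit on them.

    training: dict mapping population label -> list of genotype vectors.
    markers: indices of the SNPs included in the candidate panel.
    """
    freqs = {pop: allele_frequencies(samples) for pop, samples in training.items()}
    errors = total = 0
    for true_pop, samples in training.items():
        for genotype in samples:
            # Assign the sample to the population under which it is most likely.
            predicted = max(
                freqs, key=lambda pop: log_likelihood(genotype, freqs[pop], markers)
            )
            errors += predicted != true_pop
            total += 1
    return errors / total

# Toy genotypes (hypothetical): marker 0 separates the groups, marker 1 does not.
training = {
    "A": [[2, 1], [2, 0], [2, 2], [2, 1]],
    "B": [[0, 1], [0, 2], [0, 0], [0, 1]],
}
for panel in ([0], [1], [0, 1]):
    print(panel, apparent_error_rate(training, panel))
```

Because the error is measured on the data used to fit the classifier, it tends to be optimistic; it is useful mainly for ranking candidate panels against each other.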
Examining a suite of these markers more or less evenly spaced across the genome is also a cost-effective way to discover novel genes underlying complex diseases in a technique called admixture mapping or mapping by admixture linkage disequilibrium.
As one example, the Duffy Null allele (FY*0) has a frequency of almost 100% among Sub-Saharan Africans, but occurs very infrequently in populations outside of this region. A person having this allele is thus more likely to have Sub-Saharan African ancestors. North and South Han Chinese ancestry can be distinguished unambiguously using a set of 140 AIMs.
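The Duffy Null reasoning can be made concrete with Bayes' rule. The sketch below is a hedged illustration: the frequencies 0.99 and 0.01 stand in for "almost 100%" and "very infrequently" and are assumed values, as are the function name and the equal prior.

```python
def posterior_given_allele(freq_in_pop, freq_elsewhere, prior=0.5):
    """Posterior probability that an observed allele copy derives from the population."""
    numerator = freq_in_pop * prior
    return numerator / (numerator + freq_elsewhere * (1.0 - prior))

# Illustrative values only; the real frequencies are not given in the text.
print(posterior_given_allele(0.99, 0.01))
# A lower prior weakens, but does not erase, the evidence from a strong marker.
print(posterior_given_allele(0.99, 0.01, prior=0.1))
```

A single highly differentiated marker shifts the probability strongly, but a reliable estimate still depends on combining many AIMs.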
Collections of AIMs have been developed that can estimate the geographical origins of ancestors from within Europe.
Following the development of ancient DNA databases, ancient ancestry-informative markers (aAIMs) were similarly defined as single-nucleotide polymorphisms that exhibit substantially different frequencies between different ancient populations. A set of aAIMs can be used to identify the ancestry of ancient populations and eventually quantify their genetic similarity to modern-day individuals.
Discovery and development
The discovery of ancestry-informative markers was made possible by the development of next-generation sequencing (NGS). NGS enables the study of genetic markers by isolating specific gene sequences. One method for sequence extraction is the use of restriction enzymes, specifically endonucleases, which cleave DNA at or near specific recognition sites. These enzymes can be used together with DNA ligase, which joins two DNA fragments, to modify DNA by inserting DNA from another organism. Another method, cDNA sequencing, or RNA-seq, can also help to acquire information about the transcriptomes of a broad range of organisms and to find SNPs within a DNA sequence.
Applications
Ancestry-informative markers have a number of applications in genetic research, forensics, and private industry. AIMs that indicate a predisposition for diseases such as type 2 diabetes mellitus and renal disease have been shown to reduce the effects of genetic admixture in ancestral mapping when using admixture mapping software. The differential ability of ancestry-informative markers allows scientists and researchers to narrow down geographical populations of concern; for example, illegal organ trafficking can be traced to certain areas by comparing samples taken from organ recipients and identifying the foreign markers in their bodies. An array of private companies, such as 23andMe and AncestryDNA, provide cost-effective direct-to-consumer (DTC) genetic testing by analyzing ancestry-informative markers to determine geographic origins. These private companies collect massive quantities of data, such as biological samples and self-reported information from consumers, a practice known as biobanking, which enables their researchers to gain further insight into AIMs.
Though AIM panels can be useful for disease screening, the Genetic Information Nondiscrimination Act (GINA) prevents the use of genetic information for insurance and workplace discrimination.
Medical research
Associations between ancestral traits and diseases can help scientists determine appropriate treatment approaches for a specific population.
Medical researchers have revealed the link between ancestry traits and some common diseases; for example, individuals of African descent have been found to be at higher risk of asthma than those of European ancestry.
AIM panels can be used for detecting disease risk factors. One such panel was created for African American ancestry based on subsets of commercially available SNP arrays. These types of arrays can help reduce the cost of identifying risk factors, since they allow researchers to screen for ancestry markers instead of the entire genome: they narrow the scope of the necessary screening from hundreds of thousands of SNP markers to a panel of a few thousand AIMs.
While some believe that structured populations should be used in studies to better ascertain genetic associations with diseases, the social implications of the potential racial stigma that may result from such studies are a major concern. However, the study by Yang et al. (2005) suggests that the technology to conduct deeper research into, and identify, ancestry-associated variations in human disease already exists.
See also
* SLC24A5
* Race and genetics