In genetics, a single-nucleotide polymorphism (SNP; plural SNPs) is a germline substitution of a single nucleotide at a specific position in the genome. Although certain definitions require the substitution to be present in a sufficiently large fraction of the population (e.g. 1% or more), many publications do not apply such a frequency threshold.
For example, at a specific base position in the human genome, the G nucleotide may appear in most individuals, but in a minority of individuals the position is occupied by an A. This means that there is a SNP at this specific position, and the two possible nucleotide variations, G or A, are said to be the alleles for this specific position.
SNPs help pinpoint differences in our susceptibility to a wide range of diseases. For example, a common SNP in the CFH gene is associated with increased risk of age-related macular degeneration, and a SNP in the PNPLA3 gene is associated with increased risk of nonalcoholic fatty liver disease. The severity of illness and the way the body responds to treatments are also manifestations of genetic variations caused by SNPs. For example, the APOE E4 allele, which is determined by two common SNPs in the APOE gene (rs429358 and rs7412), is associated not only with increased risk for Alzheimer’s disease but also with a younger age at onset of the disease.
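The way two SNPs jointly determine an APOE allele can be sketched as a lookup from a phased haplotype to an allele name. This is a minimal illustration, not a clinical genotyping method. The table below (rs429358 T/C and rs7412 C/T, reported on the forward strand) is the commonly cited definition and is an assumption added here, since the text above does not spell it out. The function name `apoe_allele` is hypothetical.

```python
# Commonly cited definition of the three frequent APOE alleles.
# Keys are (rs429358 allele, rs7412 allele) on one chromosome copy.
# Assumption: alleles are reported on the forward strand.
APOE_HAPLOTYPES = {
    ("T", "T"): "E2",
    ("T", "C"): "E3",
    ("C", "C"): "E4",
}


def apoe_allele(rs429358, rs7412):
    """Return the APOE allele name for one phased haplotype.

    Returns None for any combination that is not one of the
    three common alleles listed in APOE_HAPLOTYPES.
    """
    return APOE_HAPLOTYPES.get((rs429358.upper(), rs7412.upper()))


if __name__ == "__main__":
    print(apoe_allele("C", "C"))  # E4
    print(apoe_allele("T", "C"))  # E3
```

The lookup needs phased data, that is, the two SNP alleles carried together on the same chromosome copy. Unphased genotypes that are heterozygous at both positions cannot be resolved by this table alone.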
A single-nucleotide variant (SNV) is a general term for a single-nucleotide change in a DNA sequence. An SNV can therefore be a common SNP or a rare mutation, and can be germline or somatic; somatic SNVs can arise in cancer, whereas a SNP has to segregate in a species' population of organisms. SNVs also commonly arise in molecular diagnostics, such as when designing PCR primers to detect viruses, in which the viral RNA or DNA sample may contain SNVs.
Types
Single-nucleotide polymorphisms may fall within coding sequences of genes, non-coding regions of genes, or in the intergenic regions (regions between genes). SNPs within a coding sequence do not necessarily change the amino acid sequence of the protein that is produced, due to degeneracy of the genetic code.
SNPs in the coding region are of two types: synonymous SNPs and nonsynonymous SNPs. Synonymous SNPs do not affect the protein sequence, while nonsynonymous SNPs change the amino acid sequence of the protein.
* SNPs in non-coding regions can manifest in a higher risk of cancer, and may affect mRNA structure and disease susceptibility. Non-coding SNPs can also alter the level of expression of a gene, as an eQTL (expression quantitative trait locus).
* SNPs in coding regions:
** synonymous substitutions by definition do not result in a change of amino acid in the protein, but can still affect its function in other ways. For example, a seemingly silent mutation in the multidrug resistance gene 1 (MDR1), which codes for a cellular membrane pump that expels drugs from the cell, can slow down translation and allow the peptide chain to fold into an unusual conformation, causing the mutant pump to be less functional. In the MDR1 protein, for instance, the C1236T polymorphism changes a GGC codon to GGT at amino acid position 412 of the polypeptide (both encode glycine), and the C3435T polymorphism changes ATC to ATT at position 1145 (both encode isoleucine).
** nonsynonymous substitutions:
*** missense: a single base change results in a change of amino acid in the protein, which can cause it to malfunction and lead to disease. For example, the c.1580G>T SNP in the LMNA gene replaces guanine with thymine at nucleotide position 1580 of the DNA sequence, changing the CGT codon to CTT. At the protein level this replaces arginine with leucine at position 527, and at the phenotype level it manifests in overlapping mandibuloacral dysplasia and progeria syndrome.
*** nonsense: a point mutation in a sequence of DNA that results in a premature stop codon, or a ''nonsense codon'', in the transcribed mRNA, and in a truncated, incomplete, and usually nonfunctional protein product (e.g. cystic fibrosis caused by the G542X mutation in the cystic fibrosis transmembrane conductance regulator gene).
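The distinction between synonymous, missense and nonsense substitutions in the list above can be sketched by translating the reference and altered codons with the standard genetic code. This is a minimal sketch: the function name `classify_substitution` and the "stop-lost" label are illustrative choices, and the TGG to TGA example is a hypothetical codon change rather than one taken from the text.

```python
# Standard genetic code, written in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def classify_substitution(ref_codon, alt_codon):
    """Classify a single-codon change by its effect on the protein."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"   # same amino acid (or stop) before and after
    if alt_aa == "*":
        return "nonsense"     # premature stop codon introduced
    if ref_aa == "*":
        return "stop-lost"    # original stop codon replaced by an amino acid
    return "missense"         # one amino acid replaced by another


if __name__ == "__main__":
    print(classify_substitution("GGC", "GGT"))  # synonymous (glycine)
    print(classify_substitution("ATC", "ATT"))  # synonymous (isoleucine)
    print(classify_substitution("CGT", "CTT"))  # missense (arginine to leucine)
    print(classify_substitution("TGG", "TGA"))  # nonsense (hypothetical example)
```

The first three calls use the codon changes quoted in the list for MDR1 and LMNA. As the MDR1 example shows, a "synonymous" label says nothing about effects on translation speed or folding.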
SNPs that are not in protein-coding regions may still affect gene splicing, transcription factor binding, messenger RNA degradation, or the sequence of noncoding RNA. A SNP that affects gene expression in this way is referred to as an eSNP (expression SNP) and may be upstream or downstream from the gene.
Frequency
More than 335 million SNPs have been found across humans from multiple populations. A typical genome differs from the reference human genome at 4 to 5 million sites, most of which (more than 99.9%) consist of SNPs and short indels.
Within a genome
The genomic distribution of SNPs is not homogeneous; SNPs occur in non-coding regions more frequently than in coding regions or, in general, where natural selection is acting and "fixing" the allele (eliminating other variants) of the SNP that constitutes the most favorable genetic adaptation. Other factors, like genetic recombination and mutation rate, can also determine SNP density.
SNP density can be predicted by the presence of microsatellites: AT microsatellites in particular are potent predictors of SNP density, with long (AT)(n) repeat tracts tending to be found in regions of significantly reduced SNP density and low GC content.
Within a population
There are variations between human populations, so a SNP allele that is common in one geographical or ethnic group may be much rarer in another. However, this pattern of variation is relatively rare; in a global sample of 67.3 million SNPs, the Human Genome Diversity Project "found no such private variants that are fixed in a given continent or major region. The highest frequencies are reached by a few tens of variants present at >70% (and a few thousands at >50%) in Africa, the Americas, and Oceania. By contrast, the highest frequency variants private to Europe, East Asia, the Middle East, or Central and South Asia reach just 10 to 30%."
Within a population, SNPs can be assigned a minor allele frequency (MAF), the lowest allele frequency at a locus that is observed in a particular population. This is simply the lesser of the two allele frequencies for single-nucleotide polymorphisms.
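As a rough illustration of the definition above, the minor allele frequency at one locus can be computed by counting alleles across diploid genotypes. The function name `minor_allele_frequency` and the two-letter genotype encoding (for example "GA") are assumptions made for this sketch.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Return the minor allele frequency at one locus.

    `genotypes` is a list of two-letter strings, one per diploid
    individual, e.g. ["GG", "GA", "AA"]. For a biallelic SNP the
    result is the lesser of the two allele frequencies.
    """
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    if len(counts) < 2:
        return 0.0  # only one allele observed: the site is monomorphic
    total = sum(counts.values())
    # Frequency of the second most common allele.
    second_count = counts.most_common(2)[1][1]
    return second_count / total


if __name__ == "__main__":
    sample = ["GG", "GA", "GG", "AA", "GG"]  # 7 G alleles, 3 A alleles
    print(minor_allele_frequency(sample))  # 0.3
```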
With this knowledge, scientists have developed new methods for analyzing population structure in less-studied species. Pooling techniques significantly lower the cost of the analysis: they sequence a population as a pooled sample instead of sequencing every individual within the population separately. With new bioinformatics tools, population structure, gene flow and gene migration can be investigated by observing the allele frequencies within the entire population. These protocols make it possible to combine the advantages of SNPs with those of microsatellite markers. However, some information is lost in the process, such as linkage disequilibrium and zygosity information.
Applications
* Association studies can determine whether a genetic variant is associated with a disease or trait.
* A tag SNP is a representative single-nucleotide polymorphism in a region of the genome with high linkage disequilibrium (the non-random association of alleles at two or more loci). Tag SNPs are useful in whole-genome SNP association studies, in which hundreds of thousands of SNPs across the entire genome are genotyped.
* Haplotype mapping: sets of alleles or DNA sequences can be clustered so that a single SNP can identify many linked SNPs.
* Linkage disequilibrium (LD), a term used in population genetics, indicates non-random association of alleles at two or more loci, not necessarily on the same chromosome. It refers to the phenomenon that SNP alleles or DNA sequences that are close together in the genome tend to be inherited together. LD can be affected by two parameters (among other factors, such as population stratification): 1) the distance between the SNPs (the larger the distance, the lower the LD); 2) the recombination rate (the lower the recombination rate, the higher the LD).
* In genetic epidemiology, SNPs are used to estimate transmission clusters.
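The non-random association described under linkage disequilibrium is often quantified with the coefficient D = p_AB - p_A * p_B and the squared correlation r^2 = D^2 / (p_A (1 - p_A) p_B (1 - p_B)). These two measures are standard in population genetics but are not given in the text above, so they are an assumption of this sketch, as are the function name `linkage_disequilibrium` and the encoding of each phased haplotype as a two-letter string.

```python
def linkage_disequilibrium(haplotypes, allele_a, allele_b):
    """Return (D, r_squared) for two biallelic SNPs.

    `haplotypes` is a list of two-letter strings; the first letter is
    the allele at SNP 1 and the second the allele at SNP 2 on the same
    chromosome copy. `allele_a` and `allele_b` are the alleles whose
    co-occurrence is measured.
    """
    n = len(haplotypes)
    p_a = sum(1 for h in haplotypes if h[0] == allele_a) / n
    p_b = sum(1 for h in haplotypes if h[1] == allele_b) / n
    p_ab = sum(1 for h in haplotypes if h[0] == allele_a and h[1] == allele_b) / n

    d = p_ab - p_a * p_b
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    # r^2 is undefined when either SNP is monomorphic; report 0.0 then.
    r_squared = d * d / denominator if denominator > 0 else 0.0
    return d, r_squared


if __name__ == "__main__":
    # Alleles always travel together: complete LD.
    print(linkage_disequilibrium(["AG"] * 4 + ["CT"] * 4, "A", "G"))  # (0.25, 1.0)
    # All four combinations equally common: no LD.
    print(linkage_disequilibrium(["AG", "AT", "CG", "CT"], "A", "G"))  # (0.0, 0.0)
```

An r^2 close to 1 between two SNPs is what allows one of them to act as a tag SNP for the other.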
Importance
Variations in the DNA sequences of humans can affect how humans develop diseases and respond to pathogens, chemicals, drugs, vaccines, and other agents. SNPs are also critical for personalized medicine. Examples include biomedical research, forensics, pharmacogenetics, and disease causation, as outlined below.
Clinical research
Genome-wide association study (GWAS)
One of the main contributions of SNPs to clinical research is the genome-wide association study (GWAS).
Genome-wide genetic data can be generated by multiple technologies, including SNP arrays and whole-genome sequencing. GWAS has been commonly used to identify SNPs associated with diseases or clinical phenotypes or traits. Since GWAS is a genome-wide assessment, a large sample size is required to obtain sufficient statistical power to detect all possible associations. Some SNPs have a relatively small effect on diseases or clinical phenotypes or traits. To estimate study power, the genetic model for the disease needs to be considered, such as dominant, recessive, or additive effects. Due to genetic heterogeneity, GWAS analysis must be adjusted for race.
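The dominant, recessive and additive models mentioned above differ in how a genotype is turned into a numeric predictor before testing for association. The sketch below shows one conventional encoding, assuming the genotype has already been reduced to the number of risk alleles carried (0, 1 or 2). The function name `encode_genotype` is hypothetical.

```python
def encode_genotype(risk_allele_count, model):
    """Encode a genotype as a numeric predictor under a genetic model.

    `risk_allele_count` is the number of copies of the risk allele
    (0, 1 or 2). `model` is "additive", "dominant" or "recessive".
    """
    if risk_allele_count not in (0, 1, 2):
        raise ValueError("risk_allele_count must be 0, 1 or 2")
    if model == "additive":
        # Each extra copy of the risk allele adds the same increment.
        return risk_allele_count
    if model == "dominant":
        # One copy is enough to show the effect.
        return 1 if risk_allele_count >= 1 else 0
    if model == "recessive":
        # Two copies are required to show the effect.
        return 1 if risk_allele_count == 2 else 0
    raise ValueError(f"unknown model: {model}")


if __name__ == "__main__":
    for model in ("additive", "dominant", "recessive"):
        print(model, [encode_genotype(count, model) for count in (0, 1, 2)])
```

The heterozygote is the only genotype whose code changes between models, which is why the assumed model matters for estimating study power.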
Candidate gene association study
Candidate gene association studies were commonly used in genetic research before the invention of high-throughput genotyping or sequencing technologies. A candidate gene association study investigates a limited number of pre-specified SNPs for association with diseases or clinical phenotypes or traits, so it is a hypothesis-driven approach. Since only a limited number of SNPs are tested, a relatively small sample size is sufficient to detect the association. The candidate gene association approach is also commonly used to confirm findings from GWAS in independent samples.
Homozygosity mapping in disease
Genome-wide SNP data can be used for homozygosity mapping. Homozygosity mapping is a method used to identify homozygous autosomal recessive loci, which can be a powerful tool to map genomic regions or genes that are involved in disease pathogenesis.
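A core step in homozygosity mapping is finding stretches of consecutive homozygous SNPs along a chromosome. The sketch below is a simplified illustration of that step only. It assumes the genotypes are already ordered by position and encoded as two-letter strings, and the function name `homozygous_runs` and the `min_length` threshold are hypothetical. Practical tools also account for physical distance, genotyping error and marker density.

```python
def homozygous_runs(genotypes, min_length=3):
    """Find runs of consecutive homozygous genotypes.

    `genotypes` is a list of two-letter strings ordered by genomic
    position, e.g. ["AA", "AG", "TT"]. Returns a list of (start, end)
    index pairs, with `end` exclusive, for every run that spans at
    least `min_length` SNPs.
    """
    runs = []
    start = None
    for index, genotype in enumerate(genotypes):
        if genotype[0] == genotype[1]:
            if start is None:
                start = index  # a new homozygous run begins here
        else:
            if start is not None and index - start >= min_length:
                runs.append((start, index))
            start = None  # a heterozygous call ends the current run
    # Close a run that extends to the end of the list.
    if start is not None and len(genotypes) - start >= min_length:
        runs.append((start, len(genotypes)))
    return runs


if __name__ == "__main__":
    calls = ["AA", "GG", "CT", "TT", "CC", "AA", "GG", "AG", "TT"]
    print(homozygous_runs(calls))  # [(3, 7)]
```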
Forensic sciences
SNPs have historically been used to match a forensic DNA sample to a suspect, but this use has been made obsolete by advances in STR-based DNA fingerprinting techniques. However, the development of next-generation sequencing (NGS) technology may allow more opportunities for the use of SNPs in inferring phenotypic clues such as ethnicity, hair color, and eye color with a good probability of a match. This can additionally be applied to increase the accuracy of facial reconstructions by providing information that may otherwise be unknown, and this information can be used to help identify suspects even without an STR DNA profile match.
Some drawbacks of using SNPs rather than STRs are that SNPs yield less information than STRs, so more SNPs are needed for analysis before a profile of a suspect can be created. Additionally, SNPs rely heavily on the presence of a database for comparative analysis of samples. However, for degraded or small-volume samples, SNP techniques are an excellent alternative to STR methods. SNPs (as opposed to STRs) offer an abundance of potential markers, can be fully automated, and may reduce the required fragment length to less than 100 bp.
Pharmacogenetics
Pharmacogenetics focuses on identifying genetic variations, including SNPs, that are associated with differential responses to treatment. Many drug-metabolizing enzymes, drug targets, or target pathways can be influenced by SNPs. SNPs that affect the activity of drug-metabolizing enzymes can change drug pharmacokinetics, while SNPs in a drug target or its pathway can change drug pharmacodynamics. Therefore, SNPs are potential genetic markers that can be used to predict drug exposure or the effectiveness of treatment. A genome-wide pharmacogenetic study is called pharmacogenomics. Pharmacogenetics and pharmacogenomics are important in the development of precision medicine, especially for life-threatening diseases such as cancers.
Disease
Only a small number of SNPs in the human genome may have an impact on human diseases. Large-scale GWAS have been done for the most important human diseases, including heart diseases, metabolic diseases, autoimmune diseases, and neurodegenerative and psychiatric disorders.
Most of the SNPs with relatively large effects on these diseases have been identified. These findings have significantly improved understanding of disease pathogenesis and molecular pathways, and facilitated development of better treatment. Further GWAS with larger sample sizes will reveal the SNPs with relatively small effects on diseases. For common and complex diseases, such as type-2 diabetes, rheumatoid arthritis, and Alzheimer’s disease, multiple genetic factors are involved in disease etiology. In addition, gene-gene interaction and gene-environment interaction also play an important role in disease initiation and progression.
Examples
* rs6311 and rs6313 are SNPs in the serotonin 5-HT2A receptor gene on human chromosome 13.
* The SNP −3279C/A (rs3761548) is among the SNPs located in the promoter region of the Foxp3 gene and might be involved in cancer progression.
* A SNP in the ''F5'' gene causes Factor V Leiden thrombophilia.
* rs3091244 is an example of a triallelic SNP in the CRP gene on human chromosome 1.
* TAS2R38 codes for PTC tasting ability, and contains 6 annotated SNPs.
* rs148649884 and rs138055828 in the ''FCN1'' gene encoding M-ficolin crippled the ligand-binding capability of the recombinant M-ficolin.
* An intronic SNP in DNA mismatch repair gene ''PMS2'' (rs1059060, Ser775Asn) is associated with increased sperm DNA damage and risk of male infertility.
Databases
As there are for genes, bioinformatics databases exist for SNPs.
* ''dbSNP'' is a SNP database from the National Center for Biotechnology Information (NCBI). dbSNP listed 149,735,377 SNPs in humans.
* ''Kaviar'' is a compendium of SNPs from multiple data sources including dbSNP.
* ''SNPedia'' is a wiki-style database supporting personal genome annotation, interpretation and analysis.
* The ''OMIM'' database describes the association between polymorphisms and diseases (e.g., gives diseases in text form).
* dbSAP – single amino-acid polymorphism database for protein variation detection
* The Human Gene Mutation Database provides gene mutations causing or associated with human inherited diseases and functional SNPs.
* The International HapMap Project, where researchers are identifying tag SNPs to be able to determine the collection of haplotypes present in each subject.
* GWAS Central allows users to visually interrogate the actual summary-level association data in one or more genome-wide association studies.
The International SNP Map working group mapped the sequence flanking each SNP by alignment to the genomic sequence of large-insert clones in GenBank. These alignments were converted to chromosomal coordinates, which are shown in Table 1.
This list has greatly increased since, with, for instance, the Kaviar database now listing 162 million single nucleotide variants (SNVs).
Nomenclature
The nomenclature for SNPs includes several variations for an individual SNP and lacks a common consensus.
The rs### standard is the one adopted by dbSNP; it uses the prefix "rs", for "reference SNP", followed by a unique and arbitrary number. SNPs are frequently referred to by their dbSNP rs number, as in the examples above.
The Human Genome Variation Society (HGVS) uses a standard which conveys more information about the SNP. Examples are:
* c.76A>T: "c." for coding region, followed by a number for the position of the nucleotide, followed by a one-letter abbreviation for the nucleotide (A, C, G, T or U), followed by a greater-than sign (">") to indicate substitution, followed by the abbreviation of the nucleotide which replaces the former.
* p.Ser123Arg: "p." for protein, followed by a three-letter abbreviation for the amino acid, followed by a number for the position of the amino acid, followed by the abbreviation of the amino acid which replaces the former.
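The two notations described above are regular enough to be split into their parts with a pattern match. The sketch below handles only the simple substitution forms shown in the examples, not the full HGVS recommendations, and the function name `parse_hgvs` and the dictionary keys it returns are hypothetical.

```python
import re

# "c." + position + reference nucleotide + ">" + replacement nucleotide
_CODING = re.compile(r"^c\.(\d+)([ACGTU])>([ACGTU])$")
# "p." + three-letter amino acid + position + three-letter amino acid
_PROTEIN = re.compile(r"^p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")


def parse_hgvs(description):
    """Split a simple HGVS substitution into its components."""
    match = _CODING.match(description)
    if match:
        return {
            "level": "coding",
            "position": int(match.group(1)),
            "ref": match.group(2),
            "alt": match.group(3),
        }
    match = _PROTEIN.match(description)
    if match:
        return {
            "level": "protein",
            "position": int(match.group(2)),
            "ref": match.group(1),
            "alt": match.group(3),
        }
    raise ValueError(f"unsupported description: {description!r}")


if __name__ == "__main__":
    print(parse_hgvs("c.76A>T"))
    print(parse_hgvs("p.Ser123Arg"))
```

The protein pattern checks only the shape of the three-letter codes, not whether they name real amino acids.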
SNP analysis
SNPs can be easily assayed because they typically contain only two possible alleles and three possible genotypes involving the two alleles: homozygous A, homozygous B and heterozygous AB, leading to many possible techniques for analysis. Some include: DNA sequencing; capillary electrophoresis; mass spectrometry; single-strand conformation polymorphism (SSCP); single base extension; electrochemical analysis; denaturing HPLC and gel electrophoresis; restriction fragment length polymorphism; and hybridization analysis.
Programs for prediction of SNP effects
An important group of SNPs are those that correspond to missense mutations causing an amino acid change at the protein level. A point mutation of a particular residue can have different effects on protein function (from no effect to complete disruption of its function). Usually, a change between amino acids with similar size and physico-chemical properties (e.g. substitution from leucine to valine) has a mild effect, and a change between dissimilar amino acids has a stronger one. Similarly, if a SNP disrupts secondary structure elements (e.g. substitution to proline in an alpha helix region), the mutation may affect the whole protein structure and function. Using these simple rules and many other machine-learning-derived rules, a group of programs for the prediction of SNP effects was developed:
* SIFT: provides insight into how a laboratory-induced missense or nonsynonymous mutation will affect protein function, based on physical properties of the amino acid and sequence homology.
* LIST (Local Identity and Shared Taxa): estimates the potential deleteriousness of mutations resulting from altering their protein functions. It is based on the assumption that variations observed in closely related species are more significant when assessing conservation compared to those in distantly related species.
* SNAP2
* PolyPhen-2
* PredictSNP
* MutationTaster
* A tool from the Ensembl project
* SNPViz: provides a 3D representation of the protein affected, highlighting the amino acid change so doctors can determine pathogenicity of the mutant protein.
* PROVEAN
* PhyreRisk: a database which maps variants to experimental and predicted protein structures.
* Missense3D: a tool which provides a stereochemical report on the effect of missense variants on protein structure.
See also
* Affymetrix
* HapMap
* Illumina
* International HapMap Project
* Short tandem repeat (STR)
* Single-base extension
* SNP array
* SNP genotyping
* SNPedia
* Snpstr
* SNV calling from NGS data
* Suspension array technology
* Tag SNP
* TaqMan
* Variome
Further reading
* Human Genome Project Information: SNP Fact Sheet
External links
* Introduction to SNPs from NCBI
* The SNP Consortium LTD – SNP search
* NCBI dbSNP database – "a central repository for both single base nucleotide substitutions and short deletion and insertion polymorphisms"
* HGMD – the Human Gene Mutation Database, includes rare mutations and functional SNPs
* GWAS Central – a central database of summary-level genetic association findings
* 1000 Genomes Project – A Deep Catalog of Human Genetic Variation
* WatCut – an online tool for the design of SNP-RFLP assays
* SNPStats – a web tool for analysis of genetic association studies
* Restriction HomePage – a set of tools for DNA restriction and SNP detection, including design of mutagenic primers
* American Association for Cancer Research Cancer Concepts Factsheet on SNPs
* PharmGKB – The Pharmacogenetics and Pharmacogenomics Knowledge Base, a resource for SNPs associated with drug response and disease outcomes.
* GEN-SNiP – Online tool that identifies polymorphisms in test DNA sequences.
* Rules for Nomenclature of Genes, Genetic Markers, Alleles, and Mutations in Mouse and Rat
* SNP effect predictor with galaxy integration
* Open SNP – a portal for sharing own SNP test results
* SNP database for protein variation detection