
Gene mapping or genome mapping describes the methods used to identify the location of a gene on a chromosome and the distances between genes. Gene mapping can also describe the distances between different sites within a gene.
The essence of all genome mapping is to place a collection of molecular markers onto their respective positions on the genome. Molecular markers come in all forms. Genes can be viewed as one special type of genetic marker in the construction of genome maps, and are mapped the same way as any other marker. In some areas of study, gene mapping contributes to the creation of new recombinants within an organism.
Gene maps help describe the spatial arrangement of genes on a chromosome. Genes are assigned to a specific location on a chromosome, known as the locus, and can be used as molecular markers to find the distance between other genes on a chromosome. Maps give researchers the opportunity to predict the inheritance patterns of specific traits, which can eventually lead to a better understanding of disease-linked traits.
The purpose of a gene map is to provide an outline that can help researchers carry out DNA sequencing. A gene map points out the relative positions of genes and allows researchers to locate regions of interest in the genome. Genes can then be identified and sequenced quickly.
Two approaches to generating gene maps are physical mapping and genetic mapping. Physical mapping uses molecular biology techniques to inspect chromosomes directly, so that a map can be constructed with relative gene positions. Genetic mapping, on the other hand, uses genetic techniques to find associations between genes indirectly. Techniques can include cross-breeding (hybrid) experiments and examining pedigrees. These techniques allow maps to be constructed so that the relative positions of genes and other important sequences can be analyzed.
Mapping approaches
There are two distinctive mapping approaches used in the field of genome mapping: genetic maps (also known as linkage maps) and physical maps. While both maps are a collection of genetic markers and gene loci, distances in genetic maps are based on genetic linkage information, while physical maps use actual physical distances, usually measured in number of base pairs. While the physical map could be a more accurate representation of the genome, genetic maps often offer insights into the nature of different regions of the chromosome. For example, the ratio of genetic distance to physical distance varies greatly between genomic regions, which reflects different recombination rates, and that rate is often indicative of euchromatic (usually gene-rich) versus heterochromatic (usually gene-poor) regions of the genome.
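The ratio of genetic to physical distance can be illustrated with a minimal sketch. The marker names, positions, and the helper `recombination_rates` below are all hypothetical, chosen only to show how the same pair of maps yields a local recombination rate in cM per megabase.

```python
from typing import List, Tuple

# Hypothetical marker record: (name, physical position in bp, genetic position in cM).
Marker = Tuple[str, int, float]


def recombination_rates(markers: List[Marker]) -> List[Tuple[str, str, float]]:
    """Return (left marker, right marker, cM per Mb) for each adjacent marker pair."""
    ordered = sorted(markers, key=lambda m: m[1])  # order by physical position
    rates = []
    for (name_a, bp_a, cm_a), (name_b, bp_b, cm_b) in zip(ordered, ordered[1:]):
        megabases = (bp_b - bp_a) / 1_000_000
        if megabases <= 0:
            raise ValueError(f"{name_a} and {name_b} share a physical position")
        rates.append((name_a, name_b, (cm_b - cm_a) / megabases))
    return rates


# Made-up example data, not taken from any real genome.
example = [
    ("m1", 1_000_000, 0.0),
    ("m2", 3_000_000, 4.0),   # 4 cM over 2 Mb: a region with frequent recombination
    ("m3", 13_000_000, 5.0),  # 1 cM over 10 Mb: a region with rare recombination
]

for left, right, rate in recombination_rates(example):
    print(f"{left}-{right}: {rate:.2f} cM/Mb")
```

In this toy data the first interval recombines twenty times more often per megabase than the second, which is the kind of contrast the text associates with euchromatic versus heterochromatic regions.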
Genetic mapping
Researchers begin a genetic map by collecting samples of blood, saliva, or tissue from family members who carry a prominent disease or trait and from family members who do not. The most common sample used in gene mapping, especially in personal genomic tests, is saliva. Scientists then isolate DNA from the samples and examine it closely, looking for unique patterns in the DNA of the family members who carry the disease that the DNA of those who do not carry the disease lacks. These unique molecular patterns in the DNA are referred to as polymorphisms, or markers.
The first steps of building a genetic map are the development of genetic markers and a mapping population. The closer two markers are on the chromosome, the more likely they are to be passed on to the next generation together. Therefore, the "co-segregation" patterns of all markers can be used to reconstruct their order. With this in mind, the genotypes of each genetic marker are recorded for both parents and for each individual in the following generations. The quality of a genetic map depends largely on two factors: the number of genetic markers on the map and the size of the mapping population. The two factors are interlinked, as a larger mapping population could increase the "resolution" of the map and prevent the map from being "saturated".
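How co-segregation patterns can reconstruct marker order is sketched below under simplifying assumptions: a backcross-like design in which each offspring is scored 'A' or 'B' per marker, no missing data, and only a handful of markers, so that every order can be tried by brute force. The genotype strings and function names are hypothetical.

```python
from itertools import permutations
from typing import Dict, Tuple

# Hypothetical data: for each marker, one character per offspring ('A' or 'B'),
# recording which parental allele that offspring inherited.
genotypes: Dict[str, str] = {
    "m1": "AAAABBBBAB",
    "m2": "AAAABBBBBA",
    "m3": "AAABBBBABA",
}


def recombination_fraction(a: str, b: str) -> float:
    """Fraction of offspring in which two markers did not co-segregate."""
    if len(a) != len(b):
        raise ValueError("markers must be scored on the same offspring")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def best_order(data: Dict[str, str]) -> Tuple[str, ...]:
    """Try every marker order and keep the one with the smallest summed
    recombination fraction between neighbours (feasible only for few markers)."""
    def cost(order: Tuple[str, ...]) -> float:
        return sum(
            recombination_fraction(data[p], data[q])
            for p, q in zip(order, order[1:])
        )
    return min(permutations(sorted(data)), key=cost)


print(best_order(genotypes))
```

Here m1 and m3 co-segregate least often, so they end up as the outer markers with m2 between them. Real mapping software uses far more efficient ordering heuristics and handles missing or erroneous genotypes; this sketch only shows the principle.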
In gene mapping, any sequence feature that can be faithfully distinguished between the two parents can be used as a genetic marker. Genes, in this regard, are represented by "traits" that can be faithfully distinguished between two parents. Their linkage with other genetic markers is calculated in the same way as for common markers, and the actual gene loci are then bracketed in a region between the two nearest neighboring markers. The entire process is then repeated with more markers that target that region, mapping the gene neighborhood to a higher resolution until a specific causative locus can be identified. This process is often referred to as "positional cloning", and it is used extensively in the study of plant species. One plant species in which positional cloning is used is maize. The great advantage of genetic mapping is that it can identify the relative position of genes based solely on their phenotypic effect.
Genetic mapping is a way to identify which chromosome carries which gene and to pinpoint where that gene lies on that chromosome. Mapping also serves as a method for determining which genes are most likely to recombine, based on the distance between them. The distance between two genes is measured in units known as centimorgans or map units; these terms are interchangeable. A centimorgan is the distance between genes for which one product of meiosis in one hundred is recombinant.
The farther two genes are from each other, the more likely they are to recombine. The closer they are, the less likely they are to recombine.
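The centimorgan definition above translates directly into arithmetic, sketched here with hypothetical counts. The Haldane mapping function in the last helper is not mentioned in the text; it is included as one common correction for unobserved double crossovers and assumes no crossover interference.

```python
import math


def recombination_frequency(recombinant: int, total: int) -> float:
    """Fraction of meiotic products that are recombinant."""
    if total <= 0 or not 0 <= recombinant <= total:
        raise ValueError("need 0 <= recombinant <= total and total > 0")
    return recombinant / total


def map_distance_cm(r: float) -> float:
    """Naive map distance: 1% recombinant products equals 1 cM.
    This is a reasonable approximation only for closely linked loci."""
    return 100.0 * r


def haldane_cm(r: float) -> float:
    """Haldane mapping function (assumption: no crossover interference)."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


# Hypothetical cross: 170 recombinant offspring out of 1000 scored.
r = recombination_frequency(170, 1000)
print(f"naive distance:   {map_distance_cm(r):.1f} cM")
print(f"Haldane distance: {haldane_cm(r):.1f} cM")
```

The corrected distance is larger than the naive one because some double crossovers between the two genes restore the parental combination and go uncounted.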
Linkage analysis
The basis of linkage analysis is understanding chromosomal location and identifying disease genes. Certain genes that are genetically linked or associated with each other reside close to each other on the same chromosome. During meiosis, these genes are capable of being inherited together and can be used as genetic markers to help identify the phenotype of diseases. Because linkage analysis can identify inheritance patterns, these studies are usually family based.
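Evidence for linkage in family data is commonly summarized with a LOD score, which the text does not name; the sketch below is therefore an illustrative addition. It assumes the simplest case of phase-known, fully informative meioses, and the counts are hypothetical.

```python
import math


def lod_score(recombinants: int, total: int, theta: float) -> float:
    """log10 of the likelihood ratio: linkage at recombination fraction theta
    versus free recombination (theta = 0.5).
    Assumes phase-known, fully informative meioses."""
    if not 0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    if total <= 0 or not 0 <= recombinants <= total:
        raise ValueError("need 0 <= recombinants <= total and total > 0")
    non_recombinants = total - recombinants
    log_l_linked = (recombinants * math.log10(theta)
                    + non_recombinants * math.log10(1.0 - theta))
    log_l_unlinked = total * math.log10(0.5)
    return log_l_linked - log_l_unlinked


# Hypothetical family data: 2 recombinants observed in 20 meioses.
thetas = [i / 100 for i in range(1, 51)]
best_theta = max(thetas, key=lambda t: lod_score(2, 20, t))
print(f"best theta = {best_theta:.2f}, LOD = {lod_score(2, 20, best_theta):.2f}")
```

A LOD score of 3 or more is the conventional threshold for declaring linkage. Real pedigree analysis has to handle unknown phase, missing genotypes, and incomplete penetrance, which this sketch ignores.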
The earliest gene maps were produced by linkage analysis of fruit flies in the research group around Thomas Hunt Morgan. The first was published in 1913.
Gene association analysis
Gene association analysis is population based; it is not focused on inheritance patterns, but rather is based on the entire history of a population. Gene association analysis looks at a particular population and tries to identify whether the frequency of an allele in affected individuals is different from that in a control set of unaffected individuals of the same population. This method is particularly useful for studying complex diseases that do not have a Mendelian inheritance pattern.
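The comparison of allele frequencies between affected and unaffected individuals can be sketched as a chi-square test on a 2x2 table of allele counts. The counts below are hypothetical, and the choice of an uncorrected Pearson chi-square is an assumption made for illustration; real studies use more elaborate models and correct for multiple testing and population structure.

```python
def allele_chi_square(case_alt: int, case_ref: int,
                      ctrl_alt: int, ctrl_ref: int) -> float:
    """Pearson chi-square statistic (1 degree of freedom, no continuity
    correction) for a 2x2 table of allele counts in cases and controls."""
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    row_totals = [sum(row) for row in table]
    col_totals = [case_alt + ctrl_alt, case_ref + ctrl_ref]
    n = sum(row_totals)
    if n == 0 or 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs a non-zero total")
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2


# Hypothetical counts: 200 case alleles and 200 control alleles.
chi2 = allele_chi_square(case_alt=60, case_ref=140, ctrl_alt=40, ctrl_ref=160)
CRITICAL_VALUE_5_PERCENT = 3.841  # chi-square, 1 degree of freedom
print(f"chi-square = {chi2:.2f}, exceeds threshold: {chi2 > CRITICAL_VALUE_5_PERCENT}")
```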
Physical mapping
Since actual base-pair distances are generally hard or impossible to measure directly, physical maps are constructed by first shattering the genome into hierarchically smaller pieces. By characterizing each piece and assembling the pieces back together, the overlapping path or "tiling path" of these small fragments allows researchers to infer physical distances between genomic features.
When cloned DNA is digested with restriction enzymes and run on a gel, the resulting pattern of DNA migration, its genetic fingerprint, is used to identify what stretch of DNA is in the clone. By analyzing the fingerprints, contigs are assembled by automated (FPC) or manual (pathfinders) means into overlapping DNA stretches. A good choice of clones can then be made to sequence them efficiently and determine the DNA sequence of the organism under study.
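The idea of detecting clone overlap from fingerprints can be sketched by counting fragment sizes that two clones share within a measurement tolerance. This is a deliberate simplification and not the scoring that FPC itself uses; the clone names, band sizes, and tolerance are hypothetical.

```python
from itertools import combinations
from typing import Dict, List

# Hypothetical fingerprints: fragment sizes in base pairs read off a gel.
fingerprints: Dict[str, List[int]] = {
    "cloneA": [500, 1200, 2300, 4100],
    "cloneB": [1205, 2295, 4100, 6000],
    "cloneC": [6005, 7500, 9000],
}


def shared_bands(fp_a: List[int], fp_b: List[int], tolerance: int = 10) -> int:
    """Count bands of fp_a that match a not-yet-used band of fp_b
    to within `tolerance` base pairs (greedy matching)."""
    remaining = sorted(fp_b)
    shared = 0
    for size in sorted(fp_a):
        for i, other in enumerate(remaining):
            if abs(size - other) <= tolerance:
                shared += 1
                del remaining[i]  # each band can be matched only once
                break
    return shared


for a, b in combinations(sorted(fingerprints), 2):
    print(a, b, shared_bands(fingerprints[a], fingerprints[b]))
```

In this toy data cloneA and cloneB share three bands and probably overlap, cloneB and cloneC share one, and cloneA and cloneC share none, which suggests the order A, B, C along a contig. Production tools weigh such counts statistically, because a few bands can match by chance.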
In physical mapping, there is no direct way of marking a specific gene, since the mapping does not include any information about traits and functions. Genetic markers can be linked to a physical map by processes like in situ hybridization. By this approach, physical map contigs can be "anchored" onto a genetic map. The clones used in the physical map contigs can then be sequenced on a local scale to help design new genetic markers and identify the causative loci.
Macrorestriction is a type of physical mapping wherein the high molecular weight DNA is digested with a restriction enzyme having a low number of restriction sites.
There are alternative ways to determine how DNA in a group of clones overlaps without completely sequencing the clones. Once the map is determined, the clones can be used as a resource to efficiently contain large stretches of the genome. This type of mapping is more accurate than genetic mapping.
Restriction mapping
Restriction mapping is a method in which structural information regarding a segment of DNA is obtained using restriction enzymes. Restriction enzymes are enzymes that cut segments of DNA at specific recognition sequences. Restriction mapping involves digesting (or cutting) DNA with restriction enzymes. The digested DNA fragments are then run on an agarose gel using electrophoresis, which provides information regarding the size of these digested fragments. The sizes of these fragments indicate the distance between restriction enzyme sites on the DNA analyzed, and provide researchers with information regarding the structure of the DNA analyzed.
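The relationship between restriction sites and fragment sizes can be sketched as a simulated digest of a linear molecule. The sequence is made up; the recognition site GAATTC, cut after the first G, corresponds to the enzyme EcoRI. Only one strand is modeled, which is a simplification.

```python
from typing import List


def digest(sequence: str, site: str, cut_offset: int) -> List[int]:
    """Fragment lengths produced by cutting a linear DNA molecule at every
    occurrence of `site`, `cut_offset` bases into the site."""
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Made-up 26-base sequence containing two GAATTC sites.
dna = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TT"
fragments = digest(dna, site="GAATTC", cut_offset=1)

# A gel separates fragments by size, so the map-maker sees sizes, not order.
print(sorted(fragments, reverse=True))
```

The gel reports only the sorted sizes. Recovering the order of the fragments, and therefore the positions of the sites, typically requires comparing digests with different enzymes or partial digests, which is the inference step of restriction mapping.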
Fluorescence in situ hybridization
Fluorescence in situ hybridization (FISH) is a method used to detect the presence (or absence) of a DNA sequence within a cell. DNA probes that are specific for chromosomal regions or genes of interest are labeled with fluorochromes. By attaching fluorochromes to probes, researchers are able to visualize multiple DNA sequences simultaneously. When a probe comes into contact with DNA on a specific chromosome, hybridization occurs, which reveals the location of that DNA sequence. FISH requires single-stranded DNA (ssDNA): once the DNA is in its single-stranded state, it can bind to its specific probe.
Sequence-tagged site (STS) mapping
A sequence-tagged site (STS) is a short sequence of DNA (about 100 to 500 base pairs in length) that is easily recognizable and occurs only once in the DNA being analyzed. These sites usually contain genetic polymorphisms, making them sources of viable genetic markers (as they differ from other sequences). Sequence-tagged sites can be mapped within the genome, and doing so requires a group of overlapping DNA fragments. PCR is generally used to produce the collection of DNA fragments. After overlapping fragments are created, the map distance between STSs can be analyzed. To calculate the map distance between STSs, researchers determine the frequency at which breaks between the two markers occur (see shotgun sequencing).
Mapping mutational sites
In the early 1950s the prevailing view was that the genes in a chromosome are discrete entities, indivisible by genetic recombination and arranged like beads on a string. From 1955 to 1959, Benzer performed genetic recombination experiments using rII mutants of bacteriophage T4. He found that, on the basis of recombination tests, the sites of mutation could be mapped in a linear order. This result provided evidence for the key idea that the gene has a linear structure equivalent to a length of DNA with many sites that can independently mutate.
In 1961, Francis Crick, Leslie Barnett, Sydney Brenner and Richard Watts-Tobin performed genetic experiments that demonstrated the basic nature of the genetic code for proteins. These experiments, involving mapping of mutational sites within the rIIB gene of bacteriophage T4, demonstrated that three sequential nucleobases of the gene's DNA specify each successive amino acid of its encoded protein. Thus the genetic code was shown to be a triplet code, where each triplet (called a codon) specifies a particular amino acid. They also obtained evidence that the codons do not overlap with each other in the DNA sequence encoding a protein, and that such a sequence is read from a fixed starting point.
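The logic of a non-overlapping triplet code read from a fixed starting point can be illustrated with a toy sequence. The sequence below is made up and is not the rIIB gene, and the sketch only mimics the reasoning: a single inserted base shifts every downstream triplet, while three insertions restore the original reading frame.

```python
from typing import List


def codons(seq: str, start: int = 0) -> List[str]:
    """Split seq into consecutive, non-overlapping triplets from a fixed
    starting point, dropping any incomplete triplet at the end."""
    usable = (len(seq) - start) // 3 * 3
    return [seq[i:i + 3] for i in range(start, start + usable, 3)]


def insert_base(seq: str, position: int, base: str = "A") -> str:
    """Return seq with one base inserted before `position`."""
    return seq[:position] + base + seq[position:]


wild_type = "ATGGCCTTTAAACGGGAT"  # hypothetical 18-base coding sequence

one_insertion = insert_base(wild_type, 3)
three_insertions = insert_base(insert_base(one_insertion, 3), 3)

print("wild type:       ", codons(wild_type))
print("one insertion:   ", codons(one_insertion))
print("three insertions:", codons(three_insertions))
```

After one insertion every triplet downstream of the change differs from the wild type. After three insertions the message gains one extra triplet, but everything downstream is read exactly as before, which is the behavior expected of a triplet code.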
Edgar et al. performed mapping experiments with r mutants of bacteriophage T4, showing that recombination frequencies between rII mutants are not strictly additive. The recombination frequency from a cross of two rII mutants (a x d) is usually less than the sum of the recombination frequencies for the adjacent internal sub-intervals (a x b) + (b x c) + (c x d). Although the frequencies are not strictly additive, a systematic relationship was demonstrated that likely reflects the underlying molecular mechanism of genetic recombination.
Genome sequencing
Genome sequencing is sometimes mistakenly referred to as "genome mapping" by non-biologists. The process of shotgun sequencing resembles the process of physical mapping: it shatters the genome into small fragments, characterizes each fragment, then puts them back together (more recent sequencing technologies are drastically different). While the scope, purpose and process are totally different, a genome assembly can be viewed as the "ultimate" form of physical map, in that it provides, in a much better way, all the information that a traditional physical map can offer.
Use
Identification of genes is usually the first step in understanding a genome of a species; mapping of the gene is usually the first step of identification of the gene. Gene mapping is usually the starting point of many important downstream studies.
Disease association
The process of identifying a genetic element that is responsible for a disease is also referred to as "mapping". If the locus in which the search is performed is already considerably constrained, the search is called the ''fine mapping'' of a gene. This information is derived from the investigation of disease manifestations in large families (genetic linkage) or from population-based genetic association studies.
Using the methods mentioned above, researchers are capable of mapping disease genes. Generating a gene map is the critical first step towards identifying disease genes. Gene maps allow variant alleles to be identified and allow researchers to make predictions about the genes they think are causing the mutant phenotype. An example of a disorder whose gene was located by linkage analysis is cystic fibrosis (CF). DNA samples from fifty families affected by CF were analyzed using linkage analysis. Hundreds of markers throughout the genome were analyzed until the CF gene was localized to the long arm of chromosome 7. Researchers then completed linkage analysis on additional DNA markers within chromosome 7 to identify an even more precise location of the CF gene. They found that the CF gene resides around 7q31-q32 (see chromosomal nomenclature).
See also
* Eukaryotic chromosome fine structure
* Fate mapping
* G banding
* Genetic fingerprinting
* Genome project
* Human Genome Project
* Optical mapping
* Quantitative trait locus
* Sulston score