Exon

An exon is any part of a gene that will form a part of the final mature RNA produced by that gene after introns have been removed by RNA splicing. The term ''exon'' refers to both the DNA sequence within a gene and to the corresponding sequence in RNA transcripts. In RNA splicing, introns are removed and exons are covalently joined to one another as part of generating the mature RNA. Just as the entire set of genes for a species constitutes the genome, the entire set of exons constitutes the exome.
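The joining of exons described above can be sketched in a few lines of Python. This is a toy illustration only: the sequence, the exon coordinates, and the function name `splice` are made up for the example, and real splicing is carried out by the spliceosome at recognized splice sites rather than from a supplied coordinate list.

```python
def splice(pre_rna: str, exons: list[tuple[int, int]]) -> str:
    """Join exon segments in transcript order; everything else is treated as intron.

    Coordinates are 0-based and end-exclusive (a convention chosen for this sketch).
    """
    ordered = sorted(exons)
    # Exons of a single transcript must not overlap one another.
    for (_, end1), (start2, _) in zip(ordered, ordered[1:]):
        if start2 < end1:
            raise ValueError("exons overlap")
    return "".join(pre_rna[start:end] for start, end in ordered)


# Hypothetical pre-RNA: exon 1, one intron, exon 2.
pre_rna = "AUGGCC" + "GUAAGUUUUCAG" + "GAAUAA"
exons = [(0, 6), (18, 24)]

mature = splice(pre_rna, exons)
print(mature)  # AUGGCCGAAUAA
```

The same coordinates apply equally to the DNA sequence of the gene and to the transcript, which mirrors the dual use of the term ''exon'' noted above.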


History

The term ''exon'' derives from the expressed region and was coined by American biochemist Walter Gilbert in 1978: "The notion of the cistron … must be replaced by that of a transcription unit containing regions which will be lost from the mature messenger – which I suggest we call introns (for intragenic regions) – alternating with regions which will be expressed – exons." This definition was originally made for protein-coding transcripts that are spliced before being translated. The term later came to include sequences removed from rRNA, tRNA, and other ncRNA, and it was also used later for RNA molecules originating from different parts of the genome that are then ligated by trans-splicing.


Contribution to genomes and size distribution

Although unicellular eukaryotes such as yeast have either no introns or very few, metazoan and especially vertebrate genomes have a large fraction of non-coding DNA. For instance, in the human genome only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. The small exonic fraction can provide a practical advantage in omics-aided health care (such as precision medicine) because it makes commercialized whole exome sequencing a smaller and less expensive challenge than commercialized whole genome sequencing. The large variation in genome size and C-value across life forms has posed an interesting challenge called the C-value enigma.

Across all eukaryotic genes in GenBank, there were (in 2002), on average, 5.48 exons per protein-coding gene. The average exon encoded 30-36 amino acids. While the longest exon in the human genome is 11555 bp long, several exons have been found to be only 2 bp long. A single-nucleotide exon has been reported from the ''Arabidopsis'' genome. In humans, like protein-coding mRNA, most non-coding RNA also contain multiple exons.
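The figures quoted above lend themselves to a quick back-of-the-envelope check. The sketch below uses the percentages and amino-acid counts from the text; the genome size of about 3.1 billion base pairs is an assumption added for illustration and does not come from this article.

```python
# Assumed approximate haploid human genome size in base pairs (not from the text).
GENOME_BP = 3.1e9

# Fractions of the human genome quoted in the text.
fractions = {"exons": 0.011, "introns": 0.24, "intergenic": 0.75}

# Approximate span of each class in base pairs under the assumed genome size.
spans_bp = {name: frac * GENOME_BP for name, frac in fractions.items()}

# An average exon encoding 30-36 amino acids corresponds to 90-108 coding
# nucleotides, since each amino acid is specified by a 3-nucleotide codon.
exon_nt_range = (30 * 3, 36 * 3)

# Rough factor by which sequencing only exons shrinks the target region.
reduction = 1 / fractions["exons"]

for name, bp in spans_bp.items():
    print(f"{name}: ~{bp / 1e6:.0f} Mb")
print("average coding exon length (nt):", exon_nt_range)
print(f"exome target is ~{reduction:.0f}x smaller than the genome")
```

The three quoted percentages sum to 100.1%, which reflects rounding in the source figures rather than an error in the arithmetic.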


Structure and function

In protein-coding genes, the exons include both the protein-coding sequence and the 5′- and 3′-untranslated regions (UTRs). Often the first exon includes both the 5′-UTR and the first part of the coding sequence, but exons containing only regions of 5′-UTR or (more rarely) 3′-UTR occur in some genes, i.e. the UTRs may contain introns. Some non-coding RNA transcripts also have exons and introns. Mature mRNAs originating from the same gene need not include the same exons, since different introns in the pre-mRNA can be removed by the process of alternative splicing. Exonization is the creation of a new exon as a result of mutations in introns.
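To illustrate how one gene can yield mature mRNAs with different exon content, the following sketch enumerates transcripts produced by skipping optional exons. It is a simplified model under stated assumptions: the exon names are hypothetical, only exon skipping is modeled, and the set of skippable ("cassette") exons is supplied by hand rather than derived from any biological rule.

```python
from itertools import combinations


def isoforms(exons: list[str], cassette: list[str]) -> list[list[str]]:
    """Return every exon combination obtained by skipping any subset of cassette exons.

    Exon order is preserved, because splicing joins exons in transcript order.
    """
    variants = []
    for n_skipped in range(len(cassette) + 1):
        for skipped in combinations(cassette, n_skipped):
            variants.append([e for e in exons if e not in skipped])
    return variants


# Hypothetical four-exon gene in which the two internal exons are optional.
exons = ["E1", "E2", "E3", "E4"]
variants = isoforms(exons, cassette=["E2", "E3"])

for v in variants:
    print("-".join(v))
```

With two optional exons the sketch yields four transcripts, from the full-length E1-E2-E3-E4 down to E1-E4.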


Experimental approaches using exons

Exon trapping or 'gene trapping' is a molecular biology technique that exploits intron-exon splicing to find new genes. The first exon of a 'trapped' gene splices into the exon that is contained in the insertional DNA. This new exon contains the ORF for a reporter gene that can now be expressed using the enhancers that control the target gene. A scientist knows that a new gene has been trapped when the reporter gene is expressed.

Splicing can be experimentally modified so that targeted exons are excluded from mature mRNA transcripts by blocking the access of splice-directing small nuclear ribonucleoprotein particles (snRNPs) to pre-mRNA using Morpholino antisense oligos. This has become a standard technique in developmental biology. Morpholino oligos can also be targeted to prevent molecules that regulate splicing (e.g. splice enhancers, splice suppressors) from binding to pre-mRNA, altering patterns of splicing.


Common misuse of the term

Common incorrect uses of the term ''exon'' are that 'exons code for protein', 'exons code for amino acids', or 'exons are translated'. As indicated in this article, exons may become part of a non-coding RNA or of the untranslated region of an mRNA. As of February 2022, these incorrect definitions were found in otherwise reputable secondary sources, including NHGRI and Nature.
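The distinction between exonic and coding sequence can be made concrete by intersecting exon coordinates with the coding sequence (CDS). The coordinates below are invented for illustration, the helper name `coding_bases` is hypothetical, and the gene is assumed to lie on the forward strand.

```python
def coding_bases(exon: tuple[int, int], cds: tuple[int, int]) -> int:
    """Count the bases of an exon that fall inside the CDS interval.

    Both intervals are genomic, 0-based, and end-exclusive (sketch convention).
    """
    start = max(exon[0], cds[0])
    end = min(exon[1], cds[1])
    return max(0, end - start)


# Hypothetical three-exon gene; the CDS starts in exon 2 and ends in exon 3.
exons = [(0, 100), (300, 450), (700, 900)]
cds = (360, 790)

summary = []
for exon in exons:
    coding = coding_bases(exon, cds)
    untranslated = (exon[1] - exon[0]) - coding
    summary.append((exon, coding, untranslated))
    print(f"exon {exon}: {coding} coding nt, {untranslated} untranslated nt")
```

In this example the first exon is entirely 5′-UTR, and the other two exons are each part coding and part untranslated, so none of the three is purely protein-coding.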


See also

* DBASS3/5
* Exitron
* Exon-intron database
* Exon shuffling
* Interrupted gene
* Outron
* Twintron
* Untranslated region (UTR)


External links


Exon-intron graphic maker