In genomics and related disciplines, noncoding DNA sequences are
components of an organism's DNA that do not encode protein sequences.
Some noncoding DNA is transcribed into functional non-coding RNA
molecules (e.g. transfer RNA, ribosomal RNA, and regulatory RNAs).
Other functions of noncoding DNA include the transcriptional and
translational regulation of protein-coding sequences, scaffold
attachment regions, origins of DNA replication, centromeres and
telomeres.
The amount of noncoding DNA varies greatly among species. Often, only
a small percentage of the genome is responsible for coding proteins,
but a rising percentage is being shown to have regulatory functions.
When there is much non-coding DNA, a large proportion appears to have
no biological function, as predicted in the 1960s. Since that time,
this non-functional portion has controversially been called JUNK DNA.
The international Encyclopedia of DNA Elements (ENCODE) project
uncovered, by direct biochemical approaches, that at least 80% of
human genomic DNA has biochemical activity. Though this was not
necessarily unexpected due to previous decades of research discovering
many functional noncoding regions, some scientists criticized the
conclusion for conflating biochemical activity with biological
function. Estimates for the biologically functional fraction of
the human genome based on comparative genomics range between 8 and 15%.
However, others have argued against relying solely on estimates from
comparative genomics due to its limited scope. Non-coding DNA has been
found to be involved in epigenetic activity and complex networks of
genetic interactions, and is being explored in evolutionary
developmental biology.
FRACTION OF NONCODING GENOMIC DNA
Utricularia gibba has only 3% noncoding DNA.
The amount of total genomic DNA varies widely between organisms, and
the proportion of coding and noncoding DNA within these genomes varies
greatly as well. For example, it was originally suggested that over
98% of the human genome does not encode protein sequences, including
most sequences within introns and most intergenic DNA, whilst 20% of
a typical prokaryote genome is noncoding.
While overall genome size, and by extension the amount of noncoding
DNA, is correlated with organism complexity, there are many exceptions.
For example, the genome of the unicellular Polychaos dubium (formerly
known as Amoeba dubia) has been reported to contain more than 200
times the amount of DNA in humans. The pufferfish Takifugu rubripes
genome is only about one eighth the size of the human genome, yet
seems to have a comparable number of genes; approximately 90% of the
Takifugu genome is noncoding DNA. The extensive variation in nuclear
genome size among eukaryotic species is known as the C-value enigma or
C-value paradox. Most of the genome size difference appears to lie in
the noncoding DNA.
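The proportions quoted above can be turned into a rough comparison. The sketch below is illustrative only: it expresses genome sizes relative to the human genome (human = 1.0), uses the one-eighth size ratio and the noncoding percentages quoted in the text, and the helper name `relative_noncoding` is hypothetical rather than part of any library.

```python
def relative_noncoding(relative_size, noncoding_fraction):
    """Amount of noncoding DNA, in units of one human genome."""
    if not 0.0 <= noncoding_fraction <= 1.0:
        raise ValueError("noncoding_fraction must be between 0 and 1")
    return relative_size * noncoding_fraction

# Figures taken from the text; sizes are relative to human = 1.0.
human = relative_noncoding(1.0, 0.98)          # "over 98%" noncoding
pufferfish = relative_noncoding(1 / 8, 0.90)   # one eighth the size, ~90% noncoding

# The coding amount is whatever remains after the noncoding part.
human_coding = 1.0 - human
pufferfish_coding = 1 / 8 - pufferfish

print(f"noncoding ratio (human / pufferfish): {human / pufferfish:.1f}")
print(f"coding ratio (human / pufferfish): {human_coding / pufferfish_coding:.1f}")
```

With these inputs the noncoding amounts differ by a factor of roughly 8.7 while the coding amounts differ by a factor of about 1.6, which is the sense in which most of the size difference lies in the noncoding DNA.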
In 2013, a new "record" for the most efficient eukaryotic genome was
reported for Utricularia gibba, a bladderwort plant that has only
3% noncoding DNA and 97% coding DNA. Parts of the noncoding DNA
were being deleted by the plant, which suggested that noncoding DNA
may not be as critical for plants, even though noncoding DNA is useful
for humans. Other studies on plants have discovered crucial functions
in portions of noncoding DNA that were previously thought to be
negligible and have added a new layer to the understanding of gene
regulation.
TYPES OF NONCODING DNA SEQUENCES
See also: Conserved non-coding sequence
NONCODING FUNCTIONAL RNA
Transfer RNA and ribosomal RNA are not translated into protein,
but they are functional: both take part in protein synthesis during
translation.
Noncoding RNAs are functional RNA molecules that are not translated
into protein. Examples of noncoding RNA include ribosomal RNA,
Piwi-interacting RNA and microRNA.
MicroRNAs are predicted to control the translational activity of
approximately 30% of all protein-coding genes in mammals and may be
vital components in the progression or treatment of various diseases
including cancer, cardiovascular disease, and the immune system
response to infection.
CIS- AND TRANS-REGULATORY ELEMENTS
Cis-regulatory elements are sequences that control the transcription
of a nearby gene. Many such elements are involved in the evolution and
control of development. Cis-elements may be located in 5' or 3'
untranslated regions or within introns. Trans-regulatory elements
control the transcription of a distant gene.
Promoters facilitate the transcription of a particular gene and are
typically upstream of the coding region. Enhancer sequences may also
exert very distant effects on the transcription levels of genes.
INTRONS
Simple illustration of an unspliced mRNA precursor, with two
introns and three exons (top). After the introns have been removed via
splicing, the mature mRNA sequence is ready for translation (bottom).
Introns are non-coding sections of a gene, transcribed into the
precursor mRNA sequence, but ultimately removed by RNA splicing during
the processing to mature messenger RNA. Many introns appear to be
mobile genetic elements.
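The removal of introns described above can be sketched in a few lines of Python. This is a toy model, not a splicing predictor: the sequence, the intron coordinates and the function name `splice` are all hypothetical, and real splicing depends on splice-site signals and the spliceosome rather than on a supplied coordinate list.

```python
def splice(pre_mrna, introns):
    """Return the mature mRNA left after removing introns.

    pre_mrna: precursor sequence as a string.
    introns:  list of (start, end) pairs, 0-based, end-exclusive,
              assumed non-overlapping.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        if start < position or end > len(pre_mrna) or start >= end:
            raise ValueError(f"invalid intron interval: ({start}, {end})")
        exons.append(pre_mrna[position:start])  # exon before this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # final exon
    return "".join(exons)

# Hypothetical precursor: three exons (upper case), two introns (lower case).
precursor = "AUGGCC" + "guaaguuuag" + "CUGAAA" + "guccgacag" + "UGGUAA"
mature = splice(precursor, [(6, 16), (22, 31)])
print(mature)
```

Here the two lower-case stretches are dropped and the three upper-case exons are joined in order, mirroring the top and bottom halves of the illustration.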
Studies of group I introns from Tetrahymena protozoans indicate that
some introns appear to be selfish genetic elements, neutral to the
host because they remove themselves from flanking exons during RNA
processing and do not produce an expression bias between alleles with
and without the intron. Some introns appear to have significant
biological function, possibly through ribozyme functionality that may
regulate tRNA and rRNA activity as well as protein-coding gene
expression, evident in hosts that have become dependent on such
introns over long periods of time; for example, the trnL-intron is
found in all green plants and appears to have been vertically
inherited for several billions of years, including more than a billion
years within chloroplasts and an additional 2–3 billion years prior
in the cyanobacterial ancestors of chloroplasts.
PSEUDOGENES
Pseudogenes are DNA sequences, related to known genes, that have
lost their protein-coding ability or are otherwise no longer expressed
in the cell. Pseudogenes arise from retrotransposition or genomic
duplication of functional genes, and become "genomic fossils" that are
nonfunctional due to mutations that prevent the transcription of the
gene, such as within the gene promoter region, or fatally alter the
translation of the gene, such as premature stop codons or frameshifts.
Pseudogenes resulting from the retrotransposition of an RNA
intermediate are known as processed pseudogenes; pseudogenes that
arise from the genomic remains of duplicated genes or residues of
inactivated genes are nonprocessed pseudogenes.
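Two of the disabling features named above, premature stop codons and frameshifts, are easy to illustrate. The sketch below is a simplification under stated assumptions: it takes a coding sequence that starts in its reading frame, uses the three stop codons of the standard genetic code, and treats a length that is not a multiple of three as a crude frameshift signal. The function name `disabling_features` and the example sequences are hypothetical; real pseudogene annotation relies on alignment to the parent gene.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code

def disabling_features(cds):
    """List simple reasons why a coding sequence could not be translated intact."""
    cds = cds.upper()
    problems = []
    if len(cds) % 3 != 0:
        problems.append("length not a multiple of 3 (possible frameshift)")
    # Split into complete codons, ignoring any trailing partial codon.
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    # A stop is expected as the last codon only when the frame is intact.
    internal = codons[:-1] if len(cds) % 3 == 0 else codons
    for index, codon in enumerate(internal):
        if codon in STOP_CODONS:
            problems.append(f"premature stop codon {codon} at codon {index + 1}")
            break
    return problems

intact = "ATGGCCCTGAAATGGTAA"      # hypothetical intact gene
with_stop = "ATGGCCTAGAAATGGTAA"   # point mutation CTG -> TAG
shifted = "ATGGCCTGAAATGGTAA"      # one base deleted from the third codon
for name, seq in [("intact", intact), ("with_stop", with_stop), ("shifted", shifted)]:
    print(name, disabling_features(seq))
```

The single-base deletion in the last example both breaks the triplet frame and, as a side effect of the shift, creates an in-frame stop codon, which is how a frameshift typically truncates translation.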
While Dollo's Law suggests that the loss of function in pseudogenes
is likely permanent, silenced genes may actually retain function for
several million years and can be "reactivated" into protein-coding
sequences, and a substantial number of pseudogenes are actively
transcribed. Because pseudogenes are presumed to change without
evolutionary constraint, they can serve as a useful model of the type
and frequencies of various spontaneous genetic mutations.
REPEAT SEQUENCES, TRANSPOSONS AND VIRAL ELEMENTS
Figure: Mobile genetic elements in the cell.
Transposons and retrotransposons are mobile genetic elements.
Retrotransposon repeated sequences, which include long interspersed
nuclear elements (LINEs) and short interspersed nuclear elements
(SINEs), account for a large proportion of the genomic sequences in
many species. Alu sequences, classified as a short interspersed
nuclear element, are the most abundant mobile elements in the human
genome. Some examples have been found of SINEs exerting
transcriptional control of some protein-encoding genes.
Endogenous retrovirus sequences are the product of reverse
transcription of retrovirus genomes into the genomes of germ cells.
Mutation within these retro-transcribed sequences can inactivate the
viral genome.
Over 8% of the human genome is made up of (mostly decayed) endogenous
retrovirus sequences, as part of the over 42% fraction that is
recognizably derived from retrotransposons, while another 3% can be
identified as the remains of DNA transposons. Much of the
remaining half of the genome that is currently without an explained
origin is expected to have originated in transposable elements
that were active so long ago (> 200 million years) that random
mutations have rendered them unrecognizable.
Genome size variation in at least two kinds of plants is mostly the
result of retrotransposon sequences.
Telomeres are regions of repetitive DNA at the end of a chromosome,
which provide protection from chromosomal deterioration during DNA
replication.
JUNK DNA
The term "junk DNA" became popular in the 1960s. According to T.
Ryan Gregory, a genomic biologist, David Comings was the first to
discuss the nature of junk DNA explicitly in 1972, and he applied the
term to all noncoding DNA. The term was formalized in 1972 by Susumu
Ohno, who noted that the mutational load from deleterious mutations
placed an upper limit on the number of functional loci that could be
expected given a typical mutation rate. Ohno hypothesized that mammal
genomes could not have more than 30,000 loci under selection before
the "cost" from the mutational load would cause an inescapable decline
in fitness, and eventually extinction. This prediction remains robust,
with the human genome containing approximately 20,000 genes. Another
source for Ohno's theory was the observation that even closely related
species can have widely (orders-of-magnitude) different genome sizes,
which had been dubbed the C-value paradox in 1971.
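Ohno's argument is at heart arithmetic: the expected number of new deleterious mutations per genome per generation grows with the number of loci under selection. The sketch below is an illustration under loudly stated assumptions, not Ohno's own calculation: the per-locus rate of 1e-5 deleterious mutations per generation is an assumed round figure, and the mean-fitness expression exp(-U) is a textbook approximation for independent, multiplicative effects at mutation-selection balance. The function names are hypothetical.

```python
import math

def genomic_deleterious_rate(functional_loci, rate_per_locus):
    """Expected new deleterious mutations per genome per generation (U)."""
    return functional_loci * rate_per_locus

def mean_relative_fitness(u):
    """Approximate equilibrium mean fitness, exp(-U), under multiplicative effects."""
    return math.exp(-u)

RATE_PER_LOCUS = 1e-5  # assumed for illustration only, not taken from the text

for loci in (20_000, 30_000, 300_000, 3_000_000):
    u = genomic_deleterious_rate(loci, RATE_PER_LOCUS)
    print(f"{loci:>9} loci: U = {u:5.2f}, mean fitness ~ {mean_relative_fitness(u):.3g}")
```

Under these assumptions the load is modest at tens of thousands of loci and becomes severe as the number of selected loci approaches the millions, which is the qualitative shape of the upper-limit argument.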
Though the fruitfulness of the term "junk DNA" has been questioned on
the grounds that it provokes a strong a priori assumption of total
non-functionality, and though some have recommended using more neutral
terminology such as "noncoding DNA" instead, "junk DNA" remains a
label for the portions of a genome sequence for which no discernible
function has been identified and that through comparative genomics
analysis appear under no functional constraint, suggesting that the
sequence itself has provided no adaptive advantage. Since the late
1970s it has become apparent that the majority of non-coding DNA in
large genomes finds its origin in the selfish amplification of
transposable elements, of which W. Ford Doolittle and Carmen Sapienza
in 1980 wrote in the journal Nature: "When a given DNA, or class of
DNAs, of unproven phenotypic function can be shown to have evolved a
strategy (such as transposition) which ensures its genomic survival,
then no other explanation for its existence is necessary." The amount
of junk DNA can be expected to depend on the rate of amplification of
these elements and the rate at which non-functional DNA is lost. In
the same issue of Nature, Leslie Orgel and Francis Crick wrote that
junk DNA has "little specificity and conveys little or no selective
advantage to the organism". The term occurs mainly in popular science
and in a colloquial way in scientific publications, and it has been
suggested that its connotations may have delayed interest in the
biological functions of noncoding DNA.
Several lines of evidence indicate that some "junk DNA" sequences are
likely to have unidentified functional activity and that the process
of exaptation of fragments of originally selfish or non-functional DNA
has been commonplace throughout evolution. In 2012, the ENCODE
project, a research program supported by the National Human Genome
Research Institute, reported that 76% of the human genome's noncoding
DNA sequences were transcribed and that nearly half of the genome was
in some way accessible to genetic regulatory proteins such as
transcription factors.
However, the suggestion by ENCODE that over 80% of the human genome
is biochemically functional has been criticized by other scientists,
who argue that neither accessibility of segments of the genome to
transcription factors nor their transcription guarantees that those
segments have biochemical function and that their transcription is
selectively advantageous. Furthermore, the much lower estimates of
functionality prior to ENCODE were based on genomic conservation
estimates across mammalian lineages.
In response to such views, other scientists argue that the
widespread transcription and splicing observed in the human genome
directly by biochemical testing is a more accurate indicator of
genetic function than genomic conservation, because conservation
estimates are relative due to the large variations in genome size of
even closely related species, are partially tautological, and are not
based on direct testing for functionality on the genome. Conservation
estimates may be used to provide clues to identify possible functional
elements in the genome, but they do not limit or cap the total amount
of functional elements that could possibly exist in the genome, since
elements that do things at the molecular level can be missed by
comparative genomics. Furthermore, much of the apparent junk DNA is
involved in epigenetic regulation and appears to be necessary for the
development of complex organisms.
In a 2014 paper, ENCODE researchers tried to address "the question of
whether nonconserved but biochemically active regions are truly
functional". They noted that in the literature, functional parts of
the genome have been identified differently in previous studies
depending on the approaches used. There have been three general
approaches used to identify functional parts of the human genome:
genetic approaches (which rely on changes in phenotype), evolutionary
approaches (which rely on conservation) and biochemical approaches
(which rely on biochemical testing and were used by ENCODE). All three
have limitations: genetic approaches may miss functional elements that
do not manifest physically on the organism, evolutionary approaches
have difficulties using accurate multispecies sequence alignments
since genomes of even closely related species vary considerably, and
with biochemical approaches, though having high reproducibility, the
biochemical signatures do not always automatically signify a function.
They noted that 70% of the transcription coverage was less than 1
transcript per cell. They noted that this "larger proportion of genome
with reproducible but low biochemical signal strength and less
evolutionary conservation is challenging to parse between specific
functions and biological noise". Furthermore, assay resolution often
is much broader than the underlying functional sites so some of the
reproducibly “biochemically active but selectively neutral”
sequences are unlikely to serve critical functions, especially those
with lower-level biochemical signal. To this they added, "However, we
also acknowledge substantial limitations in our current detection of
constraint, given that some human-specific functions are essential but
not conserved and that disease-relevant regions need not be
selectively constrained to be functional." On the other hand, they
argued that the 12–15% fraction of human DNA under functional
constraint, as estimated by a variety of extrapolative evolutionary
methods, may still be an underestimate. They concluded that in
contrast to evolutionary and genetic evidence, biochemical data offer
clues about both the molecular function served by underlying DNA
elements and the cell types in which they act. Ultimately genetic,
evolutionary, and biochemical approaches can all be used in a
complementary way to identify regions that may be functional in human
biology and disease.
Some critics have argued that functionality can only be assessed in
reference to an appropriate null hypothesis. In this case, the null
hypothesis would be that these parts of the genome are non-functional
and have properties, be it on the basis of conservation or biochemical
activity, that would be expected of such regions based on our general
understanding of molecular evolution and biochemistry. According to
these critics, until a region in question has been shown to have
additional features, beyond what is expected of the null hypothesis,
it should provisionally be labelled as non-functional.
EVIDENCE OF FUNCTIONALITY
Some noncoding DNA sequences must have an important biological
function. This is indicated by comparative genomics studies that
report highly conserved regions of noncoding DNA, sometimes on
time-scales of hundreds of millions of years. This implies that these
noncoding regions are under strong evolutionary pressure and purifying
selection. For example, in the genomes of humans and mice, which
diverged from a common ancestor 65–75 million years ago,
protein-coding DNA sequences account for only about 20% of conserved
DNA, with the remaining 80% of conserved DNA represented in noncoding
regions. Linkage mapping often identifies chromosomal regions
associated with a disease with no evidence of functional coding
variants of genes within the region, suggesting that disease-causing
genetic variants lie in the noncoding DNA. The significance of
noncoding DNA mutations in cancer was explored in April 2013.
Noncoding genetic polymorphisms play a role in infectious disease
susceptibility, such as hepatitis C. Moreover, noncoding genetic
polymorphisms contribute to susceptibility to Ewing sarcoma, an
aggressive pediatric bone cancer.
Some specific sequences of noncoding DNA may be features essential to
chromosome structure, centromere function and homolog recognition in
meiosis.
According to a comparative study of over 300 prokaryotic and over 30
eukaryotic genomes, eukaryotes appear to require a minimum amount of
non-coding DNA. The amount can be predicted using a growth model for
regulatory genetic networks, implying that it is required for
regulatory purposes. In humans the predicted minimum is about 5% of
the total genome.
Over 10% of 32 mammalian genomes may function through the formation
of specific RNA secondary structures. The study used comparative
genomics to identify compensatory DNA mutations that maintain RNA
base-pairings, a distinctive feature of RNA molecules. Over 80% of the
genomic regions presenting evolutionary evidence of RNA structure
conservation do not present strong DNA sequence conservation.
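The idea of a compensatory mutation can be made concrete with a toy check. In the sketch below, which is illustrative and not the method of the study, two aligned RNA sequences share a list of paired positions; a pair counts as compensatory when both positions differ between the sequences yet each sequence can still form a Watson-Crick or G-U wobble pair there. The sequences, the pair list and the function names are hypothetical.

```python
# Canonical Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def can_pair(x, y):
    """True if bases x and y can form a base pair."""
    return (x, y) in PAIRS

def compensatory_pairs(seq_a, seq_b, paired_positions):
    """Return paired positions where both bases changed but pairing is kept."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    kept = []
    for i, j in paired_positions:
        both_changed = seq_a[i] != seq_b[i] and seq_a[j] != seq_b[j]
        both_pair = can_pair(seq_a[i], seq_a[j]) and can_pair(seq_b[i], seq_b[j])
        if both_changed and both_pair:
            kept.append((i, j))
    return kept

# Hypothetical hairpin: positions 0-3 pair with positions 11-8.
stem = [(0, 11), (1, 10), (2, 9), (3, 8)]
species_1 = "GCAUGAAAAUGC"
species_2 = "GGAUGAAAAUCC"  # C-G pair at (1, 10) replaced by G-C
print(compensatory_pairs(species_1, species_2, stem))
```

The two sequences differ at two positions, so plain sequence comparison sees divergence, yet the stem can still form in both. That is the pattern that lets structure be conserved without strong sequence conservation.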
Non-coding DNA separates genes from each other with long gaps, so
a mutation in one gene or part of a chromosome, for example a deletion
or insertion, does not have a frameshift effect on the whole
chromosome. When genome complexity is relatively high, as in the case
of the human genome, there are gaps not only between different genes
but also inside many genes, in the form of introns, which protect the
entire coding segment and minimise the changes caused by mutation.
Non-coding DNA may also serve to decrease the probability of gene
disruption during chromosomal crossover.
REGULATING GENE EXPRESSION
Further information: Regulation of gene expression
Some noncoding DNA sequences determine the expression levels of
various genes, both those that are expressed as proteins and those
that themselves are involved in gene regulation.
TRANSCRIPTION FACTORS
Some noncoding DNA sequences determine where transcription factors
attach. A transcription factor is a protein that binds to specific
DNA sequences, thereby controlling the flow (or transcription) of
genetic information from DNA to mRNA.
OPERATORS
An operator is a segment of DNA to which a repressor binds. A
repressor is a DNA-binding protein that regulates the expression of
one or more genes by binding to the operator and blocking the
attachment of RNA polymerase to the promoter, thus preventing
transcription of the genes. This blocking of expression is called
repression.
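The operator logic described above reduces to a simple rule: transcription proceeds only when RNA polymerase can reach the promoter, which a repressor bound at the operator prevents. The sketch below encodes that rule as a toy Boolean model; the class and field names are hypothetical, and real regulation is quantitative rather than on/off.

```python
from dataclasses import dataclass

@dataclass
class Operon:
    """Toy model of a gene cluster controlled by one operator."""
    repressor_bound: bool             # is a repressor sitting on the operator?
    polymerase_available: bool = True

    def is_transcribed(self):
        # A bound repressor blocks RNA polymerase from the promoter.
        return self.polymerase_available and not self.repressor_bound

print(Operon(repressor_bound=True).is_transcribed())   # repressed
print(Operon(repressor_bound=False).is_transcribed())  # expressed
```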
ENHANCERS
An enhancer is a short region of DNA that can be bound by proteins
(trans-acting factors), much like a set of transcription factors, to
enhance transcription levels of genes in a gene cluster.
SILENCERS
A silencer is a region of DNA that inactivates gene expression when
bound by a regulatory protein. It functions in much the same way as
an enhancer, differing only in that it inactivates genes.
PROMOTERS
A promoter is a region of DNA that facilitates transcription of a
particular gene when a transcription factor binds to it. Promoters are
typically located near the genes they regulate and upstream of them.
INSULATORS
A genetic insulator is a boundary element that plays two distinct
roles in gene expression, either as an enhancer-blocking code or,
rarely, as a barrier against condensed chromatin. An insulator in a DNA
sequence is comparable to a linguistic word divider such as a comma in
a sentence, because the insulator indicates where an enhanced or
repressed sequence ends.
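The enhancer-blocking role can be pictured with positions along a chromosome. In the toy sketch below, an enhancer is taken to act on a promoter unless an insulator lies between them; the coordinates and the function name `enhancer_reaches` are hypothetical, and the model ignores chromatin looping and every other real-world subtlety.

```python
def enhancer_reaches(enhancer, promoter, insulators):
    """True if no insulator lies strictly between enhancer and promoter."""
    low, high = sorted((enhancer, promoter))
    return not any(low < position < high for position in insulators)

insulators = [5_000]  # hypothetical coordinate, in base pairs
print(enhancer_reaches(1_000, 4_000, insulators))  # same side of the insulator
print(enhancer_reaches(1_000, 9_000, insulators))  # insulator in between
```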
USES
EVOLUTION
Shared sequences of apparently non-functional DNA are a major line of
evidence of common descent.
Pseudogene sequences appear to accumulate mutations more rapidly than
coding sequences due to a loss of selective pressure. This allows for
the creation of mutant alleles that incorporate new functions that may
be favored by natural selection; thus, pseudogenes can serve as raw
material for evolution and can be considered "protogenes".
LONG RANGE CORRELATIONS
A statistical distinction between coding and noncoding DNA sequences
has been found. It has been observed that nucleotides in non-coding
DNA sequences display long range power law correlations while coding
sequences do not.
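One common way to look for such correlations, given here as an assumption since the text does not name a method, is the "DNA walk": each purine is a step up, each pyrimidine a step down, and the spread of displacements over windows of length l is examined as a function of l. For an uncorrelated sequence that spread grows roughly as l^0.5; a steeper power law points to long-range correlation. The function names below are hypothetical, and the example string is far too short for a meaningful estimate.

```python
import math

def dna_walk(sequence):
    """Cumulative walk: +1 for purines (A, G), -1 for pyrimidines (C, T)."""
    position, walk = 0, [0]
    for base in sequence.upper():
        if base in "AG":
            position += 1
        elif base in "CT":
            position -= 1
        else:
            raise ValueError(f"unexpected base: {base!r}")
        walk.append(position)
    return walk

def fluctuation(walk, window):
    """Standard deviation of displacements over all windows of a given length."""
    steps = [walk[i + window] - walk[i] for i in range(len(walk) - window)]
    if not steps:
        raise ValueError("window is longer than the sequence")
    mean = sum(steps) / len(steps)
    return math.sqrt(sum((s - mean) ** 2 for s in steps) / len(steps))

walk = dna_walk("AGCTTAGGCA")
print(walk)
print([round(fluctuation(walk, w), 3) for w in (1, 2, 4)])
```

In practice the fluctuation would be computed over many window lengths on sequences of thousands of bases or more, and the exponent read off as the slope of a log-log plot.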
FORENSIC ANTHROPOLOGY
Police sometimes gather DNA as evidence for purposes of forensic
identification. As described in Maryland v. King, a 2013 U.S.
Supreme Court decision:
The current standard for forensic
DNA testing relies on an analysis
of the chromosomes located within the nucleus of all human cells. 'The
DNA material in chromosomes is composed of "coding" and "noncoding"
regions. The coding regions are known as genes and contain the
information necessary for a cell to make proteins. . . . Non-protein
coding regions . . . are not related directly to making proteins,
have been referred to as "junk" DNA.' The adjective "junk" may mislead
the lay person, for in fact this is the
DNA region used with near
certainty to identify a person.
SEE ALSO
Conserved non-coding sequence
Eukaryotic chromosome fine structure
Gene-centered view of evolution
Gene regulatory network
* ^ A B Pennisi, E. (6 September 2012). "
ENCODE Project Writes
Eulogy for Junk DNA". Science. 337 (6099): 1159–1161. PMID 22955811
. doi :10.1126/science.337.6099.1159 .
* ^ The
ENCODE Project Consortium (2012). "An integrated
DNA elements in the human genome" . Nature. 489
Bibcode :2012Natur.489...57T. PMC 3439153 . PMID
22955616 . doi :10.1038/nature11247 . .
* ^ A B Costa, Fabrico (2012). "7 Non-coding RNAs, Epigenomics, and
Human Cells". In Morris, Kevin V. Non-coding RNAs and
Epigenetic Regulation of
Gene Expression: Drivers of Natural
Caister Academic Press . ISBN 1904455948 .
* ^ A B C Carey, Nessa (2015). Junk DNA: A Journey Through the Dark
Matter of the Genome. Columbia University Press. ISBN 9780231170840 .
* ^ A B Robin McKie (24 February 2013). "Scientists attacked over
claim that \'junk DNA\' is vital to life". The Observer.
* ^ A B C Eddy, Sean (2012). "The
C-value paradox, junk DNA, and
ENCODE". Curr Biol. 22 (21): R898–R899. doi
* ^ A B Doolittle, W. Ford (2013). "Is junk
DNA bunk? A critique of
Proc Natl Acad Sci USA . 110 (14): 5294–5300. Bibcode
:2013PNAS..110.5294D. PMC 3619371 . PMID 23479647 . doi
* ^ A B Palazzo, Alexander F.; Gregory, T. Ryan (2014). "The Case
for Junk DNA" .
PLoS Genetics . 10 (5): e1004351. PMC 4014423 .
PMID 24809441 . doi :10.1371/journal.pgen.1004351 .
* ^ A B
Dan Graur , Yichen Zheng, Nicholas Price, Ricardo B. R.
Azevedo1, Rebecca A. Zufall and Eran Elhaik (2013). "On the
immortality of television sets: "function" in the human genome
according to the evolution-free gospel of ENCODE" (PDF). Genome
Evolution . 5 (3): 578–90. PMC 3622293 . PMID
23431001 . doi :10.1093/gbe/evt028 . CS1 maint: Multiple names:
authors list (link )
* ^ Ponting, CP; Hardison, RC (2011). "What fraction of the human
genome is functional?".
Genome Research . 21 (11): 1769–1776. PMC
3205562 . PMID 21875934 . doi :10.1101/gr.116814.110 .
* ^ A B C D E F Kellis, M.; et al. (2014). "Defining functional DNA
elements in the human genome".
PNAS . 111 (17): 6131–6138. Bibcode
:2014PNAS..111.6131K. PMC 4035993 . PMID 24753594 . doi
* ^ Chris M. Rands,
Stephen Meader ,
Chris P. Ponting and Gerton
Lunter (2014). "8.2% of the
Genome Is Constrained: Variation in
Rates of Turnover across Functional Element Classes in the Human
Lineage" . PLoS Genet. 10 (7): e1004525. PMC 4109858 . PMID
25057982 . doi :10.1371/journal.pgen.1004525 . CS1 maint: Multiple
names: authors list (link )
* ^ A B C Mattick JS, Dinger ME (2013). "The extent of
functionality in the human genome" . The HUGO Journal. 7 (1): 2. PMC
4685169 . doi :10.1186/1877-6566-7-2 .
* ^ A B Morris, Kevin, ed. (2012). Non-Coding RNAs and Epigenetic
Gene Expression: Drivers of Natural Selection. Norfolk,
UK: Caister Academic Press. ISBN 1904455948 .
* ^ A B "Worlds Record Breaking Plant: Deletes its Noncoding "Junk"
DNA". Design & Trend. May 12, 2013. Retrieved 2013-06-04.
* ^ A B Elgar G, Vavouri T; Vavouri (July 2008). "Tuning in to the
signals: noncoding sequence conservation in vertebrate genomes".
Trends Genet. 24 (7): 344–52. PMID 18514361 . doi
* ^ Gregory TR, Hebert PD; Hebert (April 1999). "The modulation of
DNA content: proximate causes and ultimate consequences".
9 (4): 317–24. PMID 10207154 . doi :10.1101/gr.9.4.317 (inactive
* ^ Wahls, W.P.; et al. (1990). "Hypervariable minisatellite
a hotspot for homologous recombination in human cells". Cell. 60 (1):
95–103. PMID 2295091 . doi :10.1016/0092-8674(90)90719-U .
* ^ Waterhouse, Peter M.; Hellens, Roger P. (25 March 2015). "Plant
biology: Coding in non-coding RNAs". Nature. 520 (7545): 41–42.
Bibcode :2015Natur.520...41W. doi :10.1038/nature14378 .
* ^ Li M, Marin-Muller C, Bharadwaj U, Chow KH, Yao Q, Chen C;
Marin-Muller; Bharadwaj; Chow; Yao; Chen (April 2009). "MicroRNAs:
Control and Loss of Control in
Human Physiology and Disease" . World J
Surg. 33 (4): 667–84. PMC 2933043 . PMID 19030926 . doi
:10.1007/s00268-008-9836-x . CS1 maint: Multiple names: authors list
* ^ Carroll, Sean B. (2008). "Evo-Devo and an Expanding
Evolutionary Synthesis: A Genetic Theory of Morphological Evolution".
Cell. 134 (1): 25–36. PMID 18614008 . doi
* ^ Visel A, Rubin EM, Pennacchio LA (September 2009). "Genomic
Views of Distant-Acting Enhancers" . Nature. 461 (7261): 199–205.
Bibcode :2009Natur.461..199V. PMC 2923221 . PMID 19741700 . doi
* ^ A B C Nielsen H, Johansen SD; Johansen (2009). "Group I
introns: Moving in new directions".
RNA Biol. 6 (4): 375–83. PMID
19667762 . doi :10.4161/rna.6.4.9334 .
* ^ A B C Zheng D, Frankish A, Baertsch R, et al. (June 2007).
"Pseudogenes in the
ENCODE regions: Consensus annotation, analysis of
transcription, and evolution" .
Genome Res. 17 (6): 839–51. PMC
1891343 . PMID 17568002 . doi :10.1101/gr.5586307 .
* ^ Marshall CR, Raff EC, Raff RA; Raff; Raff (December 1994).
"Dollo\'s law and the death and resurrection of genes" . Proc. Natl.
Acad. Sci. U.S.A. 91 (25): 12283–7.
PMC 45421 . PMID 7991619 . doi :10.1073/pnas.91.25.12283 . CS1
maint: Multiple names: authors list (link )
* ^ Tutar, Y. (2012). "Pseudogenes" . Comp Funct Genomics. 2012:
424526. PMC 3352212 . PMID 22611337 . doi :10.1155/2012/424526 .
* ^ A B Petrov DA, Hartl DL; Hartl (2000). "
and natural selection for a compact genome". J. Hered. 91 (3):
221–7. PMID 10833048 . doi :10.1093/jhered/91.3.221 .
* ^ Ponicsan SL, Kugel JF, Goodrich JA; Kugel; Goodrich (February
2010). "Genomic gems: SINE RNAs regulate m
RNA production" . Current
Opinion in Genetics & Development. 20 (2): 149–55. PMC 2859989 .
PMID 20176473 . doi :10.1016/j.gde.2010.01.004 . CS1 maint: Multiple
names: authors list (link )
* ^ Häsler J, Samuelsson T, Strub K; Samuelsson; Strub (July
2007). "Useful 'junk': Alu RNAs in the human transcriptome". Cell.
Mol. Life Sci. 64 (14): 1793–800. PMID 17514354 . doi
:10.1007/s00018-007-7084-0 . CS1 maint: Multiple names: authors list
* ^ Walters RD, Kugel JF, Goodrich JA; Kugel; Goodrich (Aug 2009).
"InvAluable junk: the cellular impact and function of Alu and B2 RNAs"
. IUBMB Life. 61 (8): 831–7. PMC 4049031 . PMID 19621349 . doi
:10.1002/iub.227 . CS1 maint: Multiple names: authors list (link )
* ^ Nelson, PN.; Hooley, P.; Roden, D.; Davari Ejtehadi, H.;
Rylance, P.; Warren, P.; Martin, J.; Murray, PG. (Oct 2004). "Human
endogenous retroviruses: transposable elements with potential?" . Clin
Exp Immunol. 138 (1): 1–9. PMC 1809191 . PMID 15373898 . doi
* ^ International
Genome Sequencing Consortium (February
2001). "Initial sequencing and analysis of the human genome". Nature.
409 (6822): 879–888.
Bibcode :2001Natur.409..860L. PMID 11237011 .
doi :10.1038/35057062 .
* ^ Piegu, B.; Guyot, R.; Picault, N.; Roulin, A.; Sanyal, A.;
Saniyal, A.; Kim, H.; Collura, K.; et al. (Oct 2006). "Doubling genome
size without polyploidization: dynamics of retrotransposition-driven
genomic expansions in Oryza australiensis, a wild relative of rice" .
Genome Res. 16 (10): 1262–9. PMC 1581435 . PMID 16963705 . doi
* ^ Hawkins, JS.; Kim, H.; Nason, JD.; Wing, RA.; Wendel, JF. (Oct
2006). "Differential lineage-specific amplification of transposable
elements is responsible for genome size variation in Gossypium" .
Genome Res. 16 (10): 1252–61. PMC 1581434 . PMID 16954538 . doi
* ^ Ehret CF, De Haller G; De Haller (1963). "Origin, development,
and maturation of organelles and organelle systems of the cell surface
in Paramecium". Journal of Ultrastructure Research. 9 Supplement 1: 1,
3–42. PMID 14073743 . doi :10.1016/S0022-5320(63)80088-X .
* ^ Dan Graur, The Origin of Junk DNA: A Historical Whodunnit
* ^ A B Gregory, T. Ryan, ed. (2005). The
Evolution of the Genome.
Elsevier. pp. 29–31. ISBN 0123014638 . Comings (1972), on the other
hand, gave what must be considered the first explicit discussion of
the nature of "junk DNA," and was the first to apply the term to all
noncoding DNA."; "For this reason, it is unlikely that any one
function for noncoding
DNA can account for either its sheer mass or
its unequal distribution among taxa. However, dismissing it as no more
than "junk" in the pejorative sense of "useless" or "wasteful" does
little to advance the understanding of genome evolution. For this
reason, the far less loaded term "noncoding DNA" is used throughout
this chapter and is recommended in preference to "junk DNA" for future
treatments of the subject."
* ^ Ohno, Susumu (1972). H. H. Smith, ed. So Much "junk"
DNA in Our
Genome. Gordon and Breach, New York. pp. 366–370. Retrieved
* ^ Doolittle WF, Sapienza C; Sapienza (1980). "Selfish genes, the
phenotype paradigm and genome evolution". Nature. 284 (5757):
Bibcode :1980Natur.284..601D. PMID 6245369 . doi
* ^ Another source is genome duplication followed by a loss of
function due to redundancy.
* ^ Orgel LE, Crick FH; Crick (April 1980). "Selfish DNA: the
ultimate parasite". Nature. 284 (5757): 604–7. Bibcode
:1980Natur.284..604O. PMID 7366731 . doi :10.1038/284604a0 .
* ^ Khajavinia A, Makalowski W; Makalowski (May 2007). "What is
"junk" DNA, and what is it worth?".
Scientific American . 296 (5):
104. PMID 17503549 . doi :10.1038/scientificamerican0307-104 . The
term "junk DNA" repelled mainstream researchers from studying
noncoding genetic material for many years
* ^ Biémont, Christian; Vieira, C (2006). "Genetics: Junk
an evolutionary force". Nature. 443 (7111): 521–4. Bibcode
:2006Natur.443..521B. PMID 17024082 . doi :10.1038/443521a .
* ^ Palazzo, Alexander F.; Lee, Eliza S. (2015). "Non-coding RNA:
what is functional and what is junk?" . Frontiers in Genetics. 6: 2.
ISSN 1664-8021 . PMC 4306305 . PMID 25674102 . doi
* ^ Ludwig MZ (December 2002). "Functional evolution of noncoding
DNA". Current Opinion in Genetics & Development. 12 (6): 634–639.
PMID 12433575. doi:10.1016/S0959-437X(02)00355-6.
* ^ A B Cobb J, Büsst C, Petrou S, Harrap S, Ellis J (April 2008).
"Searching for functional genetic
variants in non-coding DNA". Clin. Exp. Pharmacol. Physiol. 35 (4):
372–5. PMID 18307723. doi:10.1111/j.1440-1681.2008.04880.x.
* ^ Khurana E; et al. (April 2013). "Integrative annotation of
variants from 1092 humans: application to cancer genomics". Science.
342 (6154): 372–5. PMC 3947637. PMID 24092746. doi
* ^ Lu, Yi-Fan; Mauger, David M.; Goldstein, David B.; Urban,
Thomas J.; Weeks, Kevin M.; Bradrick, Shelton S. (4 November 2015).
"IFNL3 mRNA structure is remodeled by a functional non-coding
polymorphism associated with hepatitis C virus clearance". Scientific
Reports. 5: 16037. Bibcode:2015NatSR...516037L. PMC 4631997.
PMID 26531896. doi:10.1038/srep16037.
* ^ Grünewald, Thomas G. P.; Bernard, Virginie;
Gilardi-Hebenstreit, Pascale; Raynal, Virginie; Surdez, Didier;
Aynaud, Marie-Ming; Mirabeau, Olivier; Cidre-Aranaz, Florencia;
Tirode, Franck (2015). "Chimeric EWSR1-FLI1 regulates the Ewing
sarcoma susceptibility gene EGR2 via a GGAA microsatellite". Nature
Genetics. 47 (9): 1073–1078. PMC 4591073. PMID 26214589. doi
* ^ Subirana, J.A.; Messeguer, X. (March 2010). "The
most frequent short sequences in non-coding DNA". Nucleic Acids Res.
38 (4): 1172–81. PMC 2831315. PMID 19966278.
doi:10.1093/nar/gkp1094.
* ^ S. E. Ahnert; T. M. A. Fink (2008). "How much non-coding DNA do
eukaryotes require?" (PDF). J. Theor. Biol. 252 (4): 587–592.
PMID 18384817. doi:10.1016/j.jtbi.2008.02.005.
* ^ Smith MA; et al. (June 2013). "Widespread purifying selection
on RNA structure in mammals". Nucleic Acids Research. 41 (17):
8220–8236. PMC 3783177. PMID 23847102. doi:10.1093/nar/gkt596
* ^ Dileep, V. (2009). "The place and function of non-coding DNA in
the evolution of variability". Hypothesis. 7 (1): e7. doi
* ^ A B Callaway, Ewen (March 2010). "Junk DNA gets credit for
making us who we are". New Scientist.
* ^ Carroll, Sean B.; et al. (May 2008). "Regulating Evolution".
Scientific American. 298 (5): 60–67. PMID 18444326. doi
* ^ Stojic, L. "Transcriptional silencing of long noncoding RNA
GNG12-AS1 uncouples its transcriptional and product-related
functions". nature.com. Nature. Retrieved 21 Feb 2016.
* ^ Latchman, D.S. (December 1997). "Transcription factors: an
overview". The International Journal of Biochemistry & Cell Biology.
29 (12): 1305–12. PMC 2002184. PMID 9570129. doi
* ^ Karin, M. (February 1990). "Too many transcription factors:
positive and negative interactions". The New Biologist. 2 (2):
126–31. PMID 2128034.
* ^ Lewin, Benjamin (1990). Genes IV (4th ed.). Oxford: Oxford
University Press. pp. 243–58. ISBN 0-19-854267-4.
* ^ Blackwood, E. M.; Kadonaga, J. T. (1998). "Going the Distance:
A Current View of Enhancer Action". Science. 281 (5373): 60–3.
PMID 9679020. doi:10.1126/science.281.5373.60.
* ^ Maston, Glenn; Sarah Evans; Michael Green (23 May 2006).
"Transcriptional regulatory elements in the Human Genome" (PDF).
Annual Reviews. Retrieved 2 April 2013.
* ^ "Analysis of Biological Networks: Transcriptional Networks –
Promoter Sequence Analysis" (PDF). Tel Aviv University. Retrieved 30
* ^ Burgess-Beusse B, Farrell C, Gaszner M, Litt M, Mutskov V,
Recillas-Targa F, Simpson M, West A, Felsenfeld G (December 2002).
"The insulation of genes from external enhancers and silencing
chromatin". Proc. Natl. Acad. Sci. U.S.A. 99 Suppl 4: 16433–7.
PMC 139905. PMID 12154228. doi:10.1073/pnas.162342499.
* ^ "Plagiarized Errors and Molecular Genetics", talkorigins, by
Edward E. Max, M.D., Ph.D.
* ^ Balakirev ES, Ayala FJ (2003). "Pseudogenes: are they
"junk" or functional DNA?". Annu. Rev. Genet. 37: 123–51.
PMID 14616058. doi:10.1146/annurev.genet.37.040103.103949.
* ^ C.-K. Peng, S. V. Buldyrev, A. L. Goldberger, S. Havlin, F.
Sciortino, M. Simons, H. E. Stanley (1992). "Long-range
correlations in nucleotide sequences". Nature. 356 (6365): 168–70.
Bibcode:1992Natur.356..168P. PMID 1301010. doi:10.1038/356168a0.
* ^ W. Li, K. Kaneko (1992). "Long-Range Correlation
and Partial 1/f^α Spectrum in a Non-Coding DNA Sequence" (PDF).
Europhys. Lett. 17 (7): 655–660.
Bibcode:1992EL.....17..655L. doi
* ^ S. V. Buldyrev, A. L. Goldberger, S. Havlin, R. N. Mantegna,
M. Matsa, C.-K. Peng, M. Simons, H. E. Stanley
(1995). "Long-range correlation properties of coding and noncoding
DNA sequences: GenBank analysis". Phys. Rev. E. 51 (5): 5084–5091.
Bibcode:1995PhRvE..51.5084B. doi:10.1103/PhysRevE.51.5084.
* ^ A B Slip opinion for Maryland v. King from the U.S. Supreme
Court. Retrieved 2013-06-04.
* Bennett, Michael D.; Leitch, Ilia J. (2005). "Genome size evolution
in plants". In Gregory, T. Ryan. The Evolution of the Genome. San
Diego: Elsevier. pp. 89–162. ISBN 978-0-08-047052-8.
* Gregory, T.R. "Genome size evolution in animals". In T.R. Gregory.
The Evolution of the Genome. San Diego: Elsevier. ISBN 0-12-301463-8.
* Shabalina SA, Spiridonov NA (2004). "The mammalian
transcriptome and the function of non-coding DNA sequences". Genome
Biol. 5 (4): 105. PMC 395773. PMID 15059247.
doi:10.1186/gb-2004-5-4-105.
* Castillo-Davis CI (October 2005). "The
evolution of noncoding DNA: how much junk, how much func?". Trends
Genet. 21 (10): 533–6. PMID 16098630. doi
* DNA C-values Database at Royal Botanic Gardens, Kew
* Genome Size Database at Estonian Institute of Zoology and