Retrotransposons (also called Class I transposable elements or transposons via RNA intermediates) are a type of genetic component that copy and paste themselves into different genomic locations (transposon) by converting RNA back into DNA through reverse transcription, using an RNA transposition intermediate. Through reverse transcription, retrotransposons amplify themselves quickly and become abundant in eukaryotic genomes such as maize (49–78%) and humans (42%). They are only present in eukaryotes but share features with retroviruses such as HIV, for example discontinuous reverse transcriptase-mediated extrachromosomal recombination.

Retrotransposons are regulated by a family of short non-coding RNAs termed PIWI (P-element induced wimpy testis)-interacting RNAs (piRNAs). piRNAs are a recently discovered class of ncRNAs roughly 24–32 nucleotides long. They were initially described as repeat-associated siRNAs (rasiRNAs) because they originate from repetitive elements such as the transposable sequences of the genome; it was later found that they act via PIWI proteins. In addition to suppressing genomic transposons, piRNAs have more recently been reported to regulate the 3′ UTR of protein-coding genes via RNAi, to mediate transgenerational epigenetic inheritance that conveys a memory of past transposon activity, and to take part in RNA-induced epigenetic silencing.

There are two main types of retrotransposons: long terminal repeat (LTR) retrotransposons and non-long terminal repeat (non-LTR) retrotransposons. Retrotransposons are classified based on sequence and method of transposition. Most retrotransposons in the maize genome are of the LTR type, whereas in humans they are mostly non-LTR. Retrotransposons (mostly of the LTR type) can be passed on to the next generation of a host species through the germline.

The other type of transposon is the DNA transposon. DNA transposons encode a transposase which, when expressed, catalyses the excision of the transposase gene and its flanking region and their insertion into a different genomic location: a 'jumping' DNA element. Hence retrotransposons can be thought of as replicative, whereas DNA transposons are non-replicative. Due to their replicative nature, retrotransposons can increase eukaryotic genome size quickly and survive in eukaryotic genomes permanently. It is thought that persisting in eukaryotic genomes for such long periods gave rise to special insertion methods that do not affect eukaryotic gene function drastically.
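The difference between replicative (copy-and-paste) and non-replicative (cut-and-paste) transposition can be illustrated with a toy model. The sketch below is only an illustration: the genome is a list of labels, insertion sites are chosen at random, and the function name `transpose` and the labels are hypothetical rather than taken from any library or from the text.

```python
import random


def transpose(genome, element, replicative, rng):
    """Return a new genome after one toy transposition event.

    genome      -- list of labels standing in for genomic features
    element     -- label of the transposable element to mobilise
    replicative -- True for copy-and-paste, False for cut-and-paste
    rng         -- random.Random instance used to pick the insertion site
    """
    genome = list(genome)
    if not replicative:
        # Cut-and-paste: the element is excised from its original site.
        genome.pop(genome.index(element))
    # Paste: the element (or a copy of it) lands at a random position.
    genome.insert(rng.randrange(len(genome) + 1), element)
    return genome


rng = random.Random(0)
start = ["geneA", "TE", "geneB", "geneC"]

retro = start  # behaves like a retrotransposon
dna = start    # behaves like a DNA transposon
for _ in range(5):
    retro = transpose(retro, "TE", replicative=True, rng=rng)
    dna = transpose(dna, "TE", replicative=False, rng=rng)

print(retro.count("TE"))  # 6: the original plus five new copies
print(dna.count("TE"))    # 1: the element has only changed position
```

The copy number grows with every replicative event and stays constant under cut-and-paste, which is why only the replicative mode inflates the toy genome.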


LTR retrotransposons

Long strands of repetitive DNA are found at each end of an LTR retrotransposon. These long terminal repeats (LTRs) are each a few hundred base pairs long, which gives LTR retrotransposons their name. LTR retrotransposons are over 5 kilobases long. Between the long terminal repeats are genes equivalent to the retrovirus genes gag and pol. These genes overlap, and they encode a protease that processes the resulting polyprotein into functional gene products. Gag gene products associate with other retrotransposon transcripts to form virus-like particles. Pol gene products include the enzymes reverse transcriptase and integrase, together with ribonuclease H domains. Reverse transcriptase copies retrotransposon RNA into DNA. Integrase integrates retrotransposon DNA into eukaryotic genome DNA. Ribonuclease H cleaves phosphodiester bonds between RNA nucleotides.

LTR retrotransposons encode transcripts with tRNA binding sites so that they can undergo reverse transcription. A tRNA bound to this site on the retrotransposon transcript acts as the primer from which the first strand of retrotransposon DNA is synthesised. Ribonuclease H domains then degrade the RNA strand of the resulting RNA–DNA hybrid, leaving an adenine- and guanine-rich sequence that flags where the complementary strand has to be synthesised. Integrase then integrates the retrotransposon into eukaryotic DNA using the hydroxyl groups at the ends of the retrotransposon DNA. This results in a retrotransposon flanked by long terminal repeats at its ends. Because the integrated retrotransposon is now part of the eukaryotic genome, it can be transcribed again and insert copies of itself into other genomic locations within the same eukaryotic cell.
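The defining structural feature described above, an identical repeat at each end of the element, can be checked with a few lines of code. This is a minimal sketch on an invented sequence: the function name `terminal_repeat_length`, the `min_len` threshold and the example bases are all hypothetical, and real LTRs are hundreds of base pairs long and often diverge slightly between the two ends, which this exact-match check does not handle.

```python
def terminal_repeat_length(seq, min_len=4):
    """Length of the longest non-overlapping prefix of seq that is also its suffix.

    Returns 0 when no shared terminal repeat of at least min_len bases exists.
    """
    # Try the longest candidate first so the first hit is the answer.
    for n in range(len(seq) // 2, min_len - 1, -1):
        if seq[:n] == seq[-n:]:
            return n
    return 0


ltr = "TGTTGGAATA"              # invented stand-in for a long terminal repeat
internal = "CCCGAGGAGCCCTTT"    # invented stand-in for the gag and pol region
element = ltr + internal + ltr

print(terminal_repeat_length(element))   # 10
print(terminal_repeat_length(internal))  # 0
```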


Endogenous retrovirus

An endogenous retrovirus is a retrovirus without pathogenic effects that has integrated into the host genome by inserting its heritable genetic information into cells that can be passed on to the next generation, like a retrotransposon. Endogenous retroviruses therefore share features with both retroviruses and retrotransposons. Once retroviral DNA is integrated into the host genome, it evolves into an endogenous retrovirus that influences the eukaryotic genome. So many endogenous retroviruses have inserted themselves into eukaryotic genomes that they give insight into the biology of virus-host interactions and the role of retrotransposons in evolution and disease.

Many retrotransposons share features with endogenous retroviruses, such as the ability to recognise and fuse with the host genome. However, there is a key difference between retroviruses and retrotransposons, which is indicated by the env gene. Although it resembles the gene that carries out the same function in retroviruses, the env gene is used to determine whether an element is retroviral or a retrotransposon; an element with a retroviral env gene can evolve from a retrotransposon into a retrovirus. The two also differ in the order of sequences in their pol genes. Env genes are found in the LTR retrotransposon types Ty1-copia (Pseudoviridae), Ty3-gypsy (Metaviridae) and BEL/Pao. They encode the glycoproteins on the retrovirus envelope that are needed for entry into the host cell. Retroviruses can move between cells, whereas LTR retrotransposons can only move into the genome of the same cell.

Many vertebrate genes were formed from retroviruses and LTR retrotransposons. A given endogenous retrovirus or LTR retrotransposon can have the same function and genomic location in different species, suggesting a role in evolution.


Non-LTR retrotransposons

Like LTR retrotransposons, non-LTR retrotransposons contain genes for reverse transcriptase, an RNA-binding protein, a nuclease and sometimes a ribonuclease H domain, but they lack the long terminal repeats. RNA-binding proteins bind the RNA transposition intermediate, and nucleases are enzymes that break phosphodiester bonds between nucleotides in nucleic acids. Instead of LTRs, non-LTR retrotransposons have short repeats, which can be inverted relative to each other, in addition to direct repeats of the kind found in LTR retrotransposons, where one sequence of bases is simply repeated. Although they are retrotransposons, they cannot carry out reverse transcription using an RNA transposition intermediate in the same way as LTR retrotransposons. Reverse transcription and the RNA intermediate are both still necessary, but the way they are incorporated into the chemical reactions is different. This is because, unlike LTR retrotransposons, non-LTR retrotransposons do not contain sequences that bind tRNA.

Non-LTR retrotransposons mostly fall into two types: LINEs and SINEs. SVA elements are the exception, as they share similarities with both LINEs and SINEs, containing Alu elements and different numbers of the same repeat. SVAs are shorter than LINEs but longer than SINEs. Although non-LTR retrotransposons were historically viewed as "junk DNA", research suggests that in some cases both LINEs and SINEs have been incorporated into novel genes to form new functions.
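The distinction between direct and inverted repeats mentioned above can be made concrete with a short sketch. The helper names `reverse_complement` and `classify_repeat` and the example sequences are hypothetical; the code assumes plain uppercase DNA strings and exact matches, and it reports a sequence pair that is both identical and self-complementary as "direct" because that case is checked first.

```python
# Translation table mapping each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def classify_repeat(left, right):
    """Classify two flanking sequences as 'direct', 'inverted' or 'none'."""
    if left == right:
        return "direct"        # the same bases in the same order
    if reverse_complement(left) == right:
        return "inverted"      # the same bases read from the opposite strand
    return "none"


print(classify_repeat("GATTC", "GATTC"))  # direct
print(classify_repeat("GATTC", "GAATC"))  # inverted
print(classify_repeat("GATTC", "CCCCC"))  # none
```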


LINEs

When a LINE is transcribed, the transcript contains an RNA polymerase II promoter, which ensures that the LINE can be copied from whichever location it inserts itself into. RNA polymerase II is the enzyme that transcribes genes into mRNA transcripts. The ends of LINE transcripts are rich in adenines, the bases that are added at the end of transcription so that LINE transcripts are not degraded. This transcript is the RNA transposition intermediate.

The RNA transposition intermediate moves from the nucleus into the cytoplasm for translation. Translation yields the products of the two coding regions of a LINE, which in turn bind back to the RNA they were translated from. The LINE RNA then moves back into the nucleus to insert into the eukaryotic genome.

LINEs insert themselves into regions of the eukaryotic genome that are rich in the bases A and T. At AT-rich regions, the LINE uses its nuclease to cut one strand of the eukaryotic double-stranded DNA. The adenine-rich sequence in the LINE transcript base pairs with the cut strand, and the hydroxyl group exposed by the cut flags where the LINE will be inserted. Reverse transcriptase recognises this hydroxyl group and synthesises the LINE retrotransposon where the DNA is cut. As with LTR retrotransposons, the newly inserted LINE contains eukaryotic genome information, so it can be copied and pasted into other genomic regions easily. The information sequences are longer and more variable than those in LTR retrotransposons.

Most LINE copies have a variable length at the start because reverse transcription usually stops before DNA synthesis is complete. In some cases this causes the RNA polymerase II promoter to be lost, so the LINE cannot transpose further.
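The effect of incomplete reverse transcription can be sketched as string slicing. The example below is a toy model under stated assumptions: the transcript is built from placeholder labels rather than real bases, the function name `truncated_copy` is hypothetical, and copying is assumed to start at the adenine-rich end and stop after a given number of characters.

```python
def truncated_copy(transcript, bases_copied):
    """Return the part of the transcript that ends up in the new insertion.

    Copying starts at the adenine-rich end of the transcript, so stopping
    early removes material from the start of the element.
    """
    if not 0 < bases_copied <= len(transcript):
        raise ValueError("bases_copied must be between 1 and len(transcript)")
    return transcript[len(transcript) - bases_copied:]


PROMOTER = "PPPP"                       # placeholder for the promoter region
transcript = PROMOTER + "ORF1ORF2" + "AAAA"

full = truncated_copy(transcript, len(transcript))
short = truncated_copy(transcript, 8)

print(full, full.startswith(PROMOTER))    # PPPPORF1ORF2AAAA True
print(short, short.startswith(PROMOTER))  # ORF2AAAA False
```

Only the full-length copy keeps the placeholder promoter; the truncated copy keeps its adenine-rich end but could not be transcribed again in this model.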


Human L1

LINE-1 (L1) retrotransposons make up a significant portion of the human genome, with an estimated 500,000 copies per genome. Transcription of human LINE1 is usually inhibited by methyl groups attached to its DNA, a modification carried out by PIWI proteins and DNA methyltransferase enzymes. L1 retrotransposition can disrupt genes by pasting copies inside or near them, which can in turn lead to human disease. Only some LINE1s can retrotranspose, and in doing so they can form different chromosome structures that contribute to genetic differences between individuals. There are an estimated 80–100 active L1s in the reference genome of the Human Genome Project, and an even smaller number of those active L1s retrotranspose often. L1 insertions have been associated with tumorigenesis through the activation of cancer-related genes (oncogenes) and the impairment of tumor suppressor genes.

Each human LINE1 contains two regions from which gene products can be encoded. The first coding region encodes a leucine zipper protein involved in protein-protein interactions and a protein that binds to the terminus of nucleic acids. The second coding region encodes a purine/pyrimidine nuclease, a reverse transcriptase and a protein rich in the amino acids cysteine and histidine. The end of the human LINE1, as with other retrotransposons, is adenine-rich.


SINEs

SINEs are much shorter (300 bp) than LINEs. They share similarity with genes transcribed by RNA polymerase II, the enzyme that transcribes genes into mRNA transcripts, and with the initiation sequence of RNA polymerase III, the enzyme that transcribes genes into ribosomal RNA, tRNA and other small RNA molecules. SINEs such as mammalian MIR elements have a tRNA gene at the start and an adenine-rich end, as in LINEs. SINEs do not encode a functional reverse transcriptase protein and rely on other mobile transposons, especially LINEs. SINEs exploit LINE transposition components even though LINE-binding proteins prefer to bind LINE RNA. SINEs cannot transpose by themselves because they do not encode the proteins needed for transposition. They usually consist of parts derived from tRNA and LINEs. The tRNA portion contains an RNA polymerase III promoter; RNA polymerase III is the same kind of enzyme as RNA polymerase II. The promoter ensures that new copies can be transcribed into RNA for further transposition. The LINE component remains so that LINE-binding proteins can recognise the LINE part of the SINE.


Alu elements

Alus are the most common SINE in primates. They are approximately 350 base pairs long, do not encode proteins and can be recognized by the restriction enzyme AluI (hence the name). Their distribution may be important in some genetic diseases and cancers. Copying and pasting Alu RNA requires the Alu's adenine-rich end and the rest of the sequence bound to a signal. The signal-bound Alu can then associate with ribosomes. LINE RNA associates with the same ribosomes as the Alu, and binding to the same ribosome allows the Alu to interact with the LINE. This simultaneous translation of Alu element and LINE allows the SINE to be copied and pasted.
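Recognition by a restriction enzyme amounts to finding a short fixed sequence in the DNA. The sketch below assumes AluI's recognition sequence AGCT, cut in the middle to leave blunt ends; that sequence is general knowledge about the enzyme and is not stated in the text above. The function names and the example sequence are hypothetical, and only one strand is scanned, which suffices here because AGCT is its own reverse complement.

```python
ALUI_SITE = "AGCT"   # assumed recognition sequence, cut between G and C


def alui_sites(seq):
    """Return the 0-based start positions of every AGCT site in seq."""
    seq = seq.upper()
    width = len(ALUI_SITE)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == ALUI_SITE]


def alui_digest(seq):
    """Split seq at the midpoint of every AGCT site and return the fragments."""
    seq = seq.upper()
    fragments = []
    start = 0
    for site in alui_sites(seq):
        cut = site + 2           # cut falls between the G and the C
        fragments.append(seq[start:cut])
        start = cut
    fragments.append(seq[start:])
    return fragments


example = "GGAGCTTTAGCTCC"       # invented sequence containing two sites
print(alui_sites(example))       # [2, 8]
print(alui_digest(example))      # ['GGAG', 'CTTTAG', 'CTCC']
```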


SVA elements

SVA elements are present at lower levels than SINEs and LINEs in humans. The start of an SVA element is similar to that of an Alu element and is followed by repeats and an end similar to an endogenous retrovirus. LINEs bind to sites flanking SVA elements to transpose them. SVAs are among the youngest transposons in great ape genomes and among the most active and polymorphic in the human population.


Role in human disease

Retrotransposons avoid being lost by chance by occurring only in genetic material that is passed from one generation to the next through parental gametes. However, LINEs can transpose in the human embryo cells that eventually develop into the nervous system, raising the question of whether this LINE retrotransposition affects brain function. LINE retrotransposition is also a feature of several cancers, but it is unclear whether retrotransposition itself causes cancer or is just a symptom. Uncontrolled retrotransposition is harmful to both the host organism and the retrotransposons themselves, so it has to be regulated. Retrotransposons are regulated by RNA interference, which is carried out by a group of short non-coding RNAs. The short non-coding RNA interacts with the protein Argonaute to degrade retrotransposon transcripts and to alter the histone structure of their DNA, reducing their transcription.


Role in evolution

LTR retrotransposons came about later than non-LTR retrotransposons, possibly from an ancestral non-LTR retrotransposon acquiring an integrase from a DNA transposon. Retroviruses gained additional properties for their virus envelopes by taking the relevant genes from other viruses, using the machinery of the LTR retrotransposon.

Due to their retrotransposition mechanism, retrotransposons amplify in number quickly, composing about 40% of the human genome. The insertion rates for LINE1, Alu and SVA elements are 1/200 – 1/20, 1/20 and 1/900 respectively. The LINE1 insertion rates have varied a lot over the past 35 million years, so they indicate points in genome evolution.

Notably, a large number of 100-kilobase regions in the maize genome show variety due to the presence or absence of retrotransposons. However, since maize is genetically unusual compared to other plants, it cannot be used to predict retrotransposition in other plants.

Mutations caused by retrotransposons include:
* Gene inactivation
* Changing gene regulation
* Changing gene products
* Acting as DNA repair sites
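The insertion rates quoted in this section can be turned into expected counts with simple arithmetic. The text does not state the unit of these rates; the sketch below assumes, purely for illustration, that each rate means new insertions per birth, and the cohort size of 1,800 births, the dictionary `RATES` and the function `expected_insertions` are likewise hypothetical.

```python
from fractions import Fraction

# Rates as quoted in the text; "per birth" is an assumption, not a stated unit.
RATES = {
    "LINE1 (low estimate)": Fraction(1, 200),
    "LINE1 (high estimate)": Fraction(1, 20),
    "Alu": Fraction(1, 20),
    "SVA": Fraction(1, 900),
}


def expected_insertions(rate, births):
    """Expected number of new insertions among the given number of births."""
    return rate * births


BIRTHS = 1800  # arbitrary cohort size chosen so the results are whole numbers
for name, rate in RATES.items():
    print(name, expected_insertions(rate, BIRTHS))
# LINE1 (low estimate) 9
# LINE1 (high estimate) 90
# Alu 90
# SVA 2
```

Under that assumption, SVA insertions would be expected roughly 45 times less often than Alu insertions in the same cohort.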


Role in biotechnology


See also

* Copy-number variation
* Genomic organization
* Insertion sequences
* Interspersed repeat
* Paleogenetics
* Paleovirology
* RetrOryza
* Retrotransposon markers, a powerful method of reconstructing phylogenies
* Tn3 transposon
* Transposon
* Retron

