A long terminal repeat (LTR) is a pair of identical sequences of DNA, several hundred base pairs long, which occur in eukaryotic genomes on either end of a series of genes or pseudogenes that form a retrotransposon, an endogenous retrovirus, or a retroviral provirus. All retroviral genomes are flanked by LTRs, whereas some retrotransposons lack LTRs. Typically, an element flanked by a pair of LTRs will encode a reverse transcriptase and an integrase, allowing the element to be copied and inserted at a different location of the genome. Copies of such an LTR-flanked element can often be found hundreds or thousands of times in a genome.
LTR retrotransposons comprise about 8% of the human genome.
The first LTR sequences were found by A.P. Czernilofsky and
J. Shine in 1977 and 1980.
Transcription
The LTR-flanked sequences are partially transcribed into an RNA intermediate, followed by reverse transcription into complementary DNA (cDNA) and ultimately double-stranded DNA (dsDNA) with full LTRs. The LTRs then mediate integration of the DNA, via an LTR-specific integrase, into another region of the host chromosome.
Retroviruses such as human immunodeficiency virus (HIV) use this basic mechanism.
Dating retroviral insertions
As the 5' and 3' LTRs are identical upon insertion, the sequence divergence between paired LTRs can be used to estimate the age of ancient retroviral insertions. This method of dating is used by paleovirologists, though it fails to take into account confounding factors such as gene conversion and homologous recombination.
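The dating idea can be illustrated with a short calculation. The sketch below is not a published pipeline. It assumes the two LTRs have already been aligned to equal length, that substitutions accumulate independently in each LTR at a constant neutral rate, and that the caller supplies that rate; the function names and the Jukes-Cantor correction are illustrative choices. Under those assumptions the age is roughly T = K / (2r), where K is the corrected divergence and r is the substitution rate per site per year. The factor of 2 reflects that both LTRs diverge from the same ancestral sequence.

```python
import math


def p_distance(ltr5: str, ltr3: str) -> float:
    """Proportion of differing sites between two pre-aligned LTRs.

    Alignment columns containing a gap ('-') in either sequence are skipped.
    """
    if len(ltr5) != len(ltr3):
        raise ValueError("LTR sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(ltr5.upper(), ltr3.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites in the alignment")
    return mismatches / compared


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected divergence K for an observed proportion p."""
    if p == 0:
        return 0.0
    if p >= 0.75:
        raise ValueError("divergence too high for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


def insertion_age_years(ltr5: str, ltr3: str, rate_per_site_per_year: float) -> float:
    """Estimate insertion age as T = K / (2r).

    rate_per_site_per_year is an assumption supplied by the caller; this
    sketch does not prescribe a value for any particular host lineage.
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("substitution rate must be positive")
    k = jukes_cantor(p_distance(ltr5, ltr3))
    return k / (2.0 * rate_per_site_per_year)


if __name__ == "__main__":
    # Toy aligned sequences and a placeholder rate, for illustration only.
    five_prime = "TGTAGTCTTATGCAATACTC"
    three_prime = "TGTAGTCTTACGCAATACTC"
    print(insertion_age_years(five_prime, three_prime, rate_per_site_per_year=1e-9))
```

Because gene conversion or recombination between the two LTRs can erase or add differences, an estimate produced this way is best read as approximate.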
HIV-1
The HIV-1 LTR is 634 bp in length and, like other retroviral LTRs, is segmented into the U3, R, and U5 regions. U3 and U5 have been further subdivided according to transcription factor sites and their impact on LTR activity and viral gene expression. The multi-step process of reverse transcription results in the placement of two identical LTRs, each consisting of a U3, R, and U5 region, at either end of the proviral DNA. The ends of the LTRs subsequently participate in integration of the provirus into the host genome. Once the provirus has been integrated, the LTR on the 5′ end serves as the promoter for the entire retroviral genome, while the LTR at the 3′ end provides for nascent viral RNA polyadenylation and, in HIV-1, HIV-2, and SIV, encodes the accessory protein Nef.
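The way reverse transcription yields two identical LTRs can be pictured with a small model. The sketch below is a simplification. It assumes the standard textbook layout of retroviral genomic RNA (R-U5-internal region-U3-R), which this article does not spell out, and it collapses the strand-transfer steps into a single rearrangement. The class and function names and the toy sequences are hypothetical placeholders, not real HIV-1 sequence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenomicRNA:
    """Toy model of a retroviral genomic RNA, written in a DNA alphabet for simplicity."""
    r: str     # repeat present at both ends of the RNA
    u5: str    # unique region near the 5' end
    body: str  # internal region (genes)
    u3: str    # unique region near the 3' end

    def sequence(self) -> str:
        # Assumed RNA layout: R-U5-body-U3-R
        return self.r + self.u5 + self.body + self.u3 + self.r


def provirus_from_rna(rna: GenomicRNA) -> str:
    """Return the proviral layout U3-R-U5-body-U3-R-U5.

    The real process involves primer binding and two strand transfers;
    only its end result is modeled here.
    """
    ltr = rna.u3 + rna.r + rna.u5
    return ltr + rna.body + ltr


if __name__ == "__main__":
    # Placeholder sequences chosen only to make the segments easy to spot.
    rna = GenomicRNA(r="GGG", u5="TT", body="ACGTACGT", u3="CCCC")
    print(rna.sequence())
    print(provirus_from_rna(rna))
```

In this model the provirus is longer than the RNA by the combined length of U3 and U5, because each unique region is duplicated to complete the LTR at the opposite end.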
All of the signals required for gene expression are found in the LTRs: the enhancer, the promoter (which can carry both transcriptional enhancers and other regulatory elements), transcription initiation (such as capping), the transcription terminator, and the polyadenylation signal.
In HIV-1, the 5′ UTR has been divided, according to functional and structural differences, into several sub-regions:
*TAR, or trans-activation response element, plays a critical role in transcriptional activation via its interaction with viral proteins. It forms a highly stable stem–loop structure consisting of 26 base pairs with a bulge in its secondary structure that interfaces with the viral transcription activator protein Tat.
*Poly A plays roles in both dimerization and genome packaging, since it is necessary for cleavage and polyadenylation. It has been reported that sequences upstream (U3 region) and downstream (U5 region) are needed to make the cleavage process efficient.
*PBS, or primer binding site, is 18 nucleotides long and has a specific sequence that binds to the tRNA(Lys) primer required for initiation of reverse transcription.
*Psi (Ψ), or the Psi packaging element, is a unique motif involved in regulating the packaging of the viral genome into the capsid. It is composed of four stem-loop (SL) structures with a major splicing donor site embedded in the second SL.
*DIS, or dimer initiation site, is a highly conserved RNA–RNA interacting sequence constituting the SL1 stem–loop in the Psi packaging element of many retroviruses. DIS is characterized by a conserved stem and palindromic loop that forms a
kissing-loop complex between HIV-1 RNA genomes to dimerize them for