T-REX (Webserver)

T-REX (Tree and Reticulogram Reconstruction) is a freely available web server, developed at the Department of Computer Science of the Université du Québec à Montréal (UQAM), dedicated to the inference, validation and visualization of phylogenetic trees and phylogenetic networks. It lets users run several popular methods of phylogenetic analysis, as well as some newer phylogenetic applications, for inferring, drawing and validating phylogenetic trees and networks.


Phylogenetic inference

The following distance-based methods for inferring and validating phylogenetic trees are available: Neighbor joining (NJ), NINJA (large-scale Neighbor Joining), BioNJ, UNJ, ADDTREE, MW, FITCH and Circular order reconstruction. For maximum parsimony, DNAPARS, PROTPARS, PARS and DOLLOP are available, all of them from the PHYLIP package. For maximum likelihood, PhyML, RAxML, DNAML, DNAMLK, PROML and PROMLK are available; the latter four are from the PHYLIP package.


Tree drawing

Hierarchical vertical, horizontal, radial and axial types of tree drawing are available. Input data can be in any of the following three formats: Newick, PHYLIP and FASTA. All graphical results provided by the T-REX server can be saved in the SVG (Scalable Vector Graphics) format and then opened and modified (e.g. prepared for a publication or presentation) in the user's preferred graphics editor.


Tree building

This application lets the user draw phylogenetic trees and save them in the Newick format.


Tree inference from incomplete matrices

The following methods for reconstructing phylogenetic trees from a distance matrix containing missing values, i.e. incomplete matrices, are available: Triangles method by Guénoche and Leclerc (2001), Ultrametric procedure for the estimation of missing values by Landry, Lapointe and Kirsch (1996) followed by NJ, Additive procedure for the estimation of missing values by Landry, Lapointe and Kirsch (1996) followed by NJ, and the Modified Weighted least-squares method (MW*) by Makarenkov and Lapointe (2004). The MW* method assigns the weight of 1 to the existing entries, the weight of 0.5 to the estimated entries and the weight of 0 when the entry estimation was impossible. The simulations described in (Makarenkov and Lapointe 2004) showed that the MW* method clearly outperforms the Triangles, Ultrametric and Additive procedures.
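The MW* weighting rule described above can be sketched as follows. This is a minimal illustration, not the T-REX implementation: the function names, the use of `None` to mark a missing entry, and the standard weighted least-squares objective (sum of weight times squared difference between the input and tree distances) are assumptions made for the example.

```python
def mw_star_weights(original, estimated):
    """Weight matrix following the rule in the text:
    1 for an existing entry, 0.5 for an estimated entry,
    0 when the entry could not be estimated.
    Missing values are represented by None (an assumption of this sketch)."""
    n = len(original)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if original[i][j] is not None:
                weights[i][j] = 1.0
            elif estimated[i][j] is not None:
                weights[i][j] = 0.5
    return weights


def weighted_least_squares(dist, tree_dist, weights):
    """Sum of w_ij * (d_ij - delta_ij)^2 over pairs i < j with nonzero weight."""
    n = len(dist)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            if weights[i][j] > 0:
                total += weights[i][j] * (dist[i][j] - tree_dist[i][j]) ** 2
    return total


# Hypothetical 3-taxon example: d(0,2) was estimated, d(1,2) could not be.
original = [[0, 2, None], [2, 0, None], [None, None, 0]]
estimated = [[0, 2, 3], [2, 0, None], [3, None, 0]]
weights = mw_star_weights(original, estimated)
fit = weighted_least_squares(estimated, [[0, 2, 4], [2, 0, 5], [4, 5, 0]], weights)
```

Entries with weight 0 are skipped entirely, so a pair whose distance could not be estimated has no influence on the fit.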


Horizontal gene transfer detection

Methods for detecting and validating complete and partial horizontal gene transfer (HGT) are included in the T-REX server. The HGT-Detection program aims to determine an optimal, i.e. minimum-cost, scenario of horizontal gene transfers by gradually reconciling the given species and gene trees.


Reticulogram inference

The reticulogram (i.e. reticulated network) reconstruction program first builds a supporting phylogenetic tree using one of the existing tree inference methods. Then, at each step of the algorithm, a reticulation branch that minimizes the least-squares or the weighted least-squares objective function is added to the tree (or, from Step 2 onwards, to the network). Two statistical criteria, Q1 and Q2, have been proposed to measure the gain in fit provided by each reticulation branch. The web server version of T-REX can also infer the supporting tree from one distance matrix and then add reticulation branches using another distance matrix. Such an algorithm can be useful for depicting morphological or genetic similarities among given species, or for identifying HGT events by using the first distance matrix to infer the species tree and the second matrix (containing the gene-related distances) to infer the reticulation branches representing putative horizontal gene transfers.
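One step of this greedy procedure can be sketched as below. The sketch rests on several assumptions that are not taken from the T-REX code: network distances are computed as shortest-path lengths, the fit is the unweighted least-squares sum over leaf pairs, and the length of a candidate branch is picked from a small user-supplied grid (`trial_lengths`) instead of being optimized analytically. All function names are hypothetical.

```python
import itertools


def shortest_paths(n, edges):
    """All-pairs shortest path lengths (Floyd-Warshall) in an undirected graph.
    `edges` maps a node pair (u, v) to a branch length."""
    inf = float("inf")
    d = [[0.0 if i == j else inf for j in range(n)] for i in range(n)]
    for (u, v), w in edges.items():
        d[u][v] = min(d[u][v], w)
        d[v][u] = min(d[v][u], w)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


def least_squares(observed, n_leaves, network):
    """Sum of squared differences over all leaf pairs (leaves are nodes 0..n_leaves-1)."""
    return sum((observed[i][j] - network[i][j]) ** 2
               for i in range(n_leaves) for j in range(i + 1, n_leaves))


def add_best_reticulation(n, n_leaves, edges, observed, trial_lengths):
    """Try every missing node pair with every trial length and return
    (best objective value, best pair or None, best length or None)."""
    best = (least_squares(observed, n_leaves, shortest_paths(n, edges)), None, None)
    for u, v in itertools.combinations(range(n), 2):
        if (u, v) in edges or (v, u) in edges:
            continue
        for length in trial_lengths:
            trial = dict(edges)
            trial[(u, v)] = length
            q = least_squares(observed, n_leaves, shortest_paths(n, trial))
            if q < best[0]:
                best = (q, (u, v), length)
    return best


# Hypothetical example: leaves 0-3, internal nodes 4 and 5, unit branch lengths.
tree = {(0, 4): 1.0, (1, 4): 1.0, (4, 5): 1.0, (2, 5): 1.0, (3, 5): 1.0}
observed = [[0, 2, 3, 3],
            [2, 0, 2, 3],   # taxa 1 and 2 are closer than the tree allows
            [3, 2, 0, 2],
            [3, 3, 2, 0]]
best = add_best_reticulation(6, 4, tree, observed, [1.0, 2.0, 3.0])
```

In this toy case the tree alone misfits only the pair (1, 2), and adding a branch of length 2 between those two leaves removes the misfit. Repeating the step on the enlarged network gives the iterative scheme described in the text.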


Sequence alignment

MAFFT, MUSCLE and ClustalW, which are among the most widely used multiple sequence alignment tools, are available with slow and fast pairwise alignment options.


Substitution models (sequence to distance transformation)

The following popular substitution models of DNA and amino acid evolution, which allow evolutionary distances to be estimated from sequence data, are included in T-REX: Uncorrected distance, Jukes-Cantor (Jukes and Cantor 1969), K80 – 2 parameters (Kimura 1980), T92 (Tamura 1992), Tajima-Nei (Tajima and Nei 1984), Jin-Nei gamma (Jin and Nei 1990), Kimura protein (Kimura 1983), LogDet (Lockhart et al. 1994), F84 (Felsenstein 1981), WAG (Whelan and Goldman 2001), JTT (Jones et al. 1992) and LG (Le and Gascuel 2008).
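To illustrate what a sequence-to-distance transformation does, the sketch below computes the uncorrected, Jukes-Cantor and K80 distances for a pair of aligned DNA sequences using the textbook formulas. The function names are hypothetical, and the handling of gaps and ambiguous characters (such sites are simply skipped) is an assumption of this example, not a description of how T-REX treats them.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_differences(seq_a, seq_b):
    """Return (transitions, transversions, compared sites) for two aligned sequences.
    Sites where either sequence has a character other than A, C, G, T are skipped."""
    transitions = transversions = sites = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue
        sites += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    return transitions, transversions, sites


def uncorrected_distance(seq_a, seq_b):
    ts, tv, sites = count_differences(seq_a, seq_b)
    return (ts + tv) / sites


def jukes_cantor_distance(seq_a, seq_b):
    """d = -3/4 * ln(1 - 4p/3); undefined (math.log raises) when p >= 0.75."""
    p = uncorrected_distance(seq_a, seq_b)
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def k80_distance(seq_a, seq_b):
    """d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q)),
    with P and Q the transition and transversion proportions."""
    ts, tv, sites = count_differences(seq_a, seq_b)
    p, q = ts / sites, tv / sites
    return -0.5 * math.log((1.0 - 2.0 * p - q) * math.sqrt(1.0 - 2.0 * q))


a = "ACGTACGTAC"
b = "ACGTACGTAT"   # one transition (C -> T) in ten sites
p_dist = uncorrected_distance(a, b)
jc_dist = jukes_cantor_distance(a, b)
k80_dist = k80_distance(a, b)
```

Both corrected distances exceed the uncorrected proportion of differing sites, because the models account for multiple substitutions at the same site.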


Robinson and Foulds topological distance

This program computes the Robinson–Foulds (RF) topological distance (Robinson and Foulds 1981), a popular measure of tree similarity, between the first tree and each of the following trees specified by the user. The trees can be supplied in the Newick or distance matrix formats. An optimal algorithm described in Makarenkov and Leclerc (2000) is used to compute the RF metric.
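The RF distance counts the bipartitions (splits of the leaf set induced by internal branches) that occur in one tree but not in the other. The sketch below is a straightforward set-based version, not the optimal algorithm of Makarenkov and Leclerc (2000); representing trees as nested tuples of leaf names is an assumption made to keep the example short.

```python
def collect_clades(tree, out):
    """Return the leaf set below `tree` and record the leaf set of every internal node."""
    if isinstance(tree, str):
        return frozenset([tree])
    leaves = frozenset().union(*(collect_clades(child, out) for child in tree))
    out.add(leaves)
    return leaves


def bipartitions(tree):
    """Nontrivial bipartitions of an unrooted tree.
    Each split is stored as the side that does not contain a fixed reference leaf."""
    clades = set()
    all_leaves = collect_clades(tree, clades)
    reference = min(all_leaves)
    splits = set()
    for clade in clades:
        side = clade if reference not in clade else all_leaves - clade
        # Keep only splits with at least two leaves on each side.
        if 1 < len(side) < len(all_leaves) - 1:
            splits.add(side)
    return splits


def robinson_foulds(tree_a, tree_b):
    """Number of bipartitions present in exactly one of the two trees."""
    return len(bipartitions(tree_a) ^ bipartitions(tree_b))


t1 = (("A", "B"), ("C", "D"))
t2 = (("A", "C"), ("B", "D"))
t3 = (("A", "B"), "C", "D")   # same unrooted topology as t1, written differently
rf_12 = robinson_foulds(t1, t2)
rf_13 = robinson_foulds(t1, t3)
```

Normalizing each split to the side without the reference leaf makes the result independent of where the nested-tuple representation happens to be rooted.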


Newick to Matrix conversion

An in-house application converts a phylogenetic tree from the Newick format to the distance matrix format and vice versa.
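The Newick-to-matrix direction amounts to summing branch lengths along the path between every pair of leaves. The following sketch shows one way to do this for simple Newick strings; it is not the T-REX converter, the function name is hypothetical, and it assumes unquoted labels without comments and treats a missing branch length as 0.

```python
import itertools
import re

PUNCTUATION = {"(", ")", ",", ";"}


def newick_to_matrix(newick):
    """Return (sorted leaf names, matrix of path-length distances between leaves)."""
    tokens = re.findall(r"[(),;]|[^(),;\s]+", newick)
    pos = 0
    dist = {}

    def parse_node():
        nonlocal pos
        below = {}                      # leaf name -> distance from this node
        is_leaf = tokens[pos] != "("
        if not is_leaf:
            pos += 1                    # consume "("
            while True:
                child = parse_node()
                # Leaves from different children are joined through this node.
                for (a, da), (b, db) in itertools.product(below.items(), child.items()):
                    dist[frozenset((a, b))] = da + db
                below.update(child)
                separator = tokens[pos]
                pos += 1                # consume "," or ")"
                if separator == ")":
                    break
        label = ""
        if pos < len(tokens) and tokens[pos] not in PUNCTUATION:
            label = tokens[pos]         # "name:length", ":length" or "name"
            pos += 1
        name, _, length_text = label.partition(":")
        length = float(length_text) if length_text else 0.0
        if is_leaf:
            below = {name: 0.0}
        # Add the branch above this node to every leaf below it.
        return {leaf: d + length for leaf, d in below.items()}

    names = sorted(parse_node())
    matrix = [[0.0 if a == b else dist[frozenset((a, b))] for b in names]
              for a in names]
    return names, matrix


names, matrix = newick_to_matrix("((A:1,B:2):1,(C:1,D:1):2);")
```

The reverse direction (matrix to Newick) requires a tree reconstruction method such as neighbor joining and is not covered by this sketch.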


Random tree generator

This application generates ''k'' random phylogenetic trees with ''n'' leaves, i.e. species or taxa, and an average branch length ''l'' using the random tree generation procedure described by Kuhner and Felsenstein (1994), where the variables ''k'', ''n'' and ''l'' are defined by the user. The branch lengths of the trees follow an exponential distribution. The branch lengths are then multiplied by 1+''ax'', where the variable ''x'' is drawn from an exponential distribution (P(''x''>''k'') = exp(-''k'')) and the constant ''a'' is a tuning factor accounting for the deviation intensity (as described in Guindon and Gascuel (2002), the value of ''a'' was set to 0.8). The random trees generated by this procedure have a depth of O(log(''n'')).
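The branch-length part of this procedure can be sketched as follows. The sketch covers only the lengths, not the generation of the tree topology, and it assumes that the base length of each branch is drawn from an exponential distribution with mean ''l''; the function name and the `seed` parameter are hypothetical.

```python
import random


def perturbed_branch_lengths(num_branches, mean_length, a=0.8, seed=None):
    """Draw exponential branch lengths with the given mean and multiply each
    by 1 + a*x, where x is exponential with rate 1, i.e. P(x > t) = exp(-t)."""
    rng = random.Random(seed)
    lengths = []
    for _ in range(num_branches):
        base = rng.expovariate(1.0 / mean_length)   # exponential, mean = mean_length
        x = rng.expovariate(1.0)                    # exponential, mean = 1
        lengths.append(base * (1.0 + a * x))
    return lengths


n_leaves = 10
# An unrooted binary tree with n leaves has 2n - 3 branches.
lengths = perturbed_branch_lengths(2 * n_leaves - 3, mean_length=0.1, seed=42)
```

Since ''x'' has mean 1, the multiplier 1+''ax'' has mean 1+''a'', so with ''a'' = 0.8 the perturbed lengths are on average 1.8 times the base lengths.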




External links


Official T-REX Web server page