Sequence Alignment Software
This list of sequence alignment software is a compilation of software tools and web portals used in pairwise sequence alignment and multiple sequence alignment. See structural alignment software for structural alignment of proteins.
Database search only
*Sequence type: protein or nucleotide
Pairwise alignment
*Sequence type: protein or nucleotide
**Alignment type: local or global
Multiple sequence alignment
*Sequence type: protein or nucleotide
**Alignment type: local or global
Genomics analysis
*Sequence type: protein or nucleotide
Motif finding
*Sequence type: protein or nucleotide
Benchmarking
Alignment viewers, editors
*See List of alignment visualization software.
Short-read sequence alignment
See also
* List of open source bioinformatics software
Sequence Alignment
In bioinformatics, a sequence alignment is a way of arranging the sequences of DNA, RNA, or protein to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences. Aligned sequences of nucleotide or amino acid residues are typically represented as rows within a matrix. Gaps are inserted between the residues so that identical or similar characters are aligned in successive columns. Sequence alignments are also used for non-biological sequences, such as calculating the distance cost between strings in a natural language or in financial data.
Interpretation
If two sequences in an alignment share a common ancestor, mismatches can be interpreted as point mutations and gaps as indels (that is, insertion or deletion mutations) introduced in one or both lineages in the time since they diverged from one another. In sequence alignments of proteins, the degree of similarity between amino acids occupying a parti ...
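The interpretation above can be illustrated with a minimal Python sketch that walks the columns of an already-aligned pair of rows and tallies matches, mismatches, and gaps. The function name `classify_columns`, the use of `-` as the gap character, and the example sequences are assumptions made for illustration; they do not come from any particular alignment tool.

```python
from collections import Counter

GAP = "-"  # assumed gap symbol; real tools may use other conventions


def classify_columns(row_a: str, row_b: str) -> Counter:
    """Count matches, mismatches, and gap columns in a pairwise alignment.

    Both rows must already be aligned, i.e. have equal length with gaps inserted.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")

    counts = Counter(match=0, mismatch=0, gap=0)
    for a, b in zip(row_a, row_b):
        if a == GAP or b == GAP:
            counts["gap"] += 1        # candidate indel
        elif a == b:
            counts["match"] += 1      # identical residues
        else:
            counts["mismatch"] += 1   # candidate point mutation
    return counts


# Hypothetical aligned pair, chosen only for illustration.
summary = classify_columns("GATTACA-", "GCTTA-AT")
print(summary)
```

Under the common-ancestor reading, the mismatch count approximates the number of point mutations and the gap count the number of inserted or deleted positions. The sketch does not compute an alignment; it only summarizes one that is given.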
FASTA
FASTA is a DNA and protein sequence alignment software package first described by David J. Lipman and William R. Pearson in 1985. Its legacy is the FASTA format, which is now ubiquitous in bioinformatics.
History
The original FASTA program was designed for protein sequence similarity searching. Because of the exponential growth of genetic information and the limited speed and memory of computers in the 1980s, heuristic methods were introduced for aligning a query sequence against entire databases. FASTA, published in 1987, added the ability to do DNA:DNA searches and translated protein:DNA searches, and also provided a more sophisticated shuffling program for evaluating statistical significance. The package contains several programs for aligning protein sequences and DNA sequences. Nowadays, increased computer performance makes it possible to search a database for local alignments using the Smith–Waterman algorithm. FASTA is pronounced "fas ...
BioPerl
BioPerl is a collection of Perl modules that facilitate the development of Perl scripts for bioinformatics applications. It has played an integral role in the Human Genome Project.
Background
BioPerl is an active open source software project supported by the Open Bioinformatics Foundation. The first set of Perl code for BioPerl was created by Tim Hubbard and Jong Bhak at MRC Centre Cambridge, where the first genome sequencing was carried out by Fred Sanger. MRC Centre was one of the hubs and birthplaces of modern bioinformatics, as it held a large quantity of DNA sequences and 3D protein structures. Hubbard was using the th_lib.pl Perl library, which contained many useful Perl subroutines for bioinformatics. Bhak, Hubbard's first PhD student, created jong_lib.pl and then merged the two Perl subroutine libraries into Bio.pl. The name BioPerl was coined jointly by Bhak and Steven Brenner at the Centre for Protein Engineering (CPE). In 1995, Brenner organized a BioPerl session at the In ...
Bioconductor
Bioconductor is a free, open source and open development software project for the analysis and comprehension of genomic data generated by wet lab experiments in molecular biology. Bioconductor is based primarily on the statistical R programming language, but does contain contributions in other programming languages. It has two releases each year that follow the semiannual releases of R. At any one time there is a release version, which corresponds to the released version of R, and a development version, which corresponds to the development version of R. Most users will find the release version appropriate for their needs. In addition there are many genome annotation packages available that are mainly, but not solely, oriented towards different types of microarrays. While computati ...
Kevin Karplus
Kevin Karplus is a professor emeritus at the University of California, Santa Cruz, currently in the Biomolecular Engineering Department. He is probably best known for work he did as a computer science graduate student at Stanford University on the Karplus-Strong string synthesis algorithm. He taught VLSI design and computer engineering for several years, helping create the Computer Engineering Department at the University of California, Santa Cruz. He made some contributions to VLSI CAD, particularly to logic minimization, where he invented the if-then-else DAG (a generalization of the binary decision diagram) and a canonical form for it, before switching to protein structure prediction and bioinformatics in 1995. He has participated in CASP (Critical Assessment of Techniques for Protein Structure Prediction) since CASP2 in 1996, and has been invited to present papers at CASP2, 3, 4, 5, 6, 7, and 8. He served on the Board of Directors for the International Society for Computational Biolog ...
Position-specific Scoring Matrix
A position weight matrix (PWM), also known as a position-specific weight matrix (PSWM) or position-specific scoring matrix (PSSM), is a commonly used representation of motifs (patterns) in biological sequences. PWMs are often derived from a set of aligned sequences that are thought to be functionally related and have become an important part of many software tools for computational motif discovery.
Background
Creation
Conversion of sequence to position probability matrix
A PWM has one row for each symbol of the alphabet (4 rows for nucleotides in DNA sequences or 20 rows for amino acids in protein sequences) and one column for each position in the pattern. In the first step in constructing a PWM, a basic position frequency matrix (PFM) is created by counting the occurrences of each nucleotide at each position. From the PFM, a position probability matrix (PPM) can then be created by dividing each nucleotide count at each position by the number of sequences, thereb ...
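The two construction steps described above (counting into a PFM, then dividing by the number of sequences to obtain a PPM) can be sketched in Python as follows. The function names, the dictionary-of-lists layout, and the four example sequences are assumptions chosen for illustration; the sketch stops at the PPM and does not cover any later conversion to weights.

```python
ALPHABET = "ACGT"  # one row per DNA nucleotide


def position_frequency_matrix(sequences):
    """Count occurrences of each nucleotide at each position (the PFM)."""
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all aligned sequences must have the same length")

    pfm = {symbol: [0] * length for symbol in ALPHABET}
    for seq in sequences:
        for position, symbol in enumerate(seq):
            pfm[symbol][position] += 1  # KeyError on symbols outside ALPHABET
    return pfm


def position_probability_matrix(pfm, n_sequences):
    """Divide every count by the number of sequences (the PPM)."""
    return {
        symbol: [count / n_sequences for count in counts]
        for symbol, counts in pfm.items()
    }


# Hypothetical aligned motif instances, for illustration only.
aligned = ["ACGT", "ACGA", "ATGT", "CCGT"]
pfm = position_frequency_matrix(aligned)
ppm = position_probability_matrix(pfm, len(aligned))

for symbol in ALPHABET:
    print(symbol, pfm[symbol], ppm[symbol])
```

Each column of the resulting PPM sums to 1, because every sequence contributes exactly one symbol per position.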
PSI-BLAST
In bioinformatics, BLAST (basic local alignment search tool) is an algorithm and program for comparing primary biological sequence information, such as the amino-acid sequences of proteins or the nucleotides of DNA and/or RNA sequences. A BLAST search enables a researcher to compare a subject protein or nucleotide sequence (called a query) with a library or database of sequences, and identify database sequences that resemble the query sequence above a certain threshold. For example, following the discovery of a previously unknown gene in the mouse, a scientist will typically perform a BLAST search of the human genome to see if humans carry a similar gene; BLAST will identify sequences in the human genome that resemble the mouse gene based on similarity of sequence.
Background
BLAST, which ''The New York Times'' called ''the Google of biological research'', is one of the most widely used bioinformatics programs for sequence searching. It addresses a fundamental problem in bioinformatics ...
Stochastic Context-free Grammar
Grammar theory to model symbol strings originated from work in computational linguistics aiming to understand the structure of natural languages. Probabilistic context-free grammars (PCFGs) have been applied in probabilistic modeling of RNA structures almost 40 years after they were introduced in computational linguistics. PCFGs extend context-free grammars in much the same way that hidden Markov models extend regular grammars. Each production is assigned a probability. The probability of a derivation (parse) is the product of the probabilities of the productions used in that derivation. These probabilities can be viewed as parameters of the model, and for large problems it is convenient to learn these parameters via machine learning. A probabilistic grammar's validity is constrained by the context of its training dataset. PCFGs have applications in areas as diverse as natural language processing, the study of the structure of RNA molecules, and the design of programming languages. Designing efficient PC ...
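The rule that a derivation's probability is the product of its production probabilities can be shown with a small Python sketch. The toy grammar below (a single nonterminal `S` that emits nested base pairs or stops), its probabilities, and the function names are hypothetical and exist only to illustrate the calculation.

```python
from math import isclose, prod

# Hypothetical toy PCFG: (left-hand side, right-hand side) -> probability.
# An empty right-hand side stands for the empty string.
RULES = {
    ("S", ("a", "S", "u")): 0.4,
    ("S", ("c", "S", "g")): 0.4,
    ("S", ()): 0.2,
}


def check_normalized(rules):
    """Verify that the probabilities for each left-hand side sum to 1."""
    totals = {}
    for (lhs, _rhs), p in rules.items():
        totals[lhs] = totals.get(lhs, 0.0) + p
    return all(isclose(total, 1.0) for total in totals.values())


def derivation_probability(derivation, rules=RULES):
    """Multiply the probabilities of the productions used in a derivation."""
    return prod(rules[production] for production in derivation)


# Derivation of the string "acgu": S -> a S u, S -> c S g, S -> (empty).
derivation = [
    ("S", ("a", "S", "u")),
    ("S", ("c", "S", "g")),
    ("S", ()),
]
probability = derivation_probability(derivation)
print(check_normalized(RULES), probability)  # probability is about 0.4 * 0.4 * 0.2
```

In practice the rule probabilities would be learned from training data rather than written by hand, as the passage above notes.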
HH-suite
The HH-suite is an open-source software package for sensitive protein sequence searching. It contains programs that can search for similar protein sequences in protein sequence databases. Sequence searches are a standard tool in modern biology with which the function of unknown proteins can be inferred from the functions of proteins with similar sequences. HHsearch and HHblits are the two main programs in the package and the entry points to its search function, the latter being a faster iteration. HHpred is an online server for protein structure prediction that uses homology information from HH-suite. The HH-suite searches for sequences using hidden Markov models (HMMs). The name comes from the fact that it performs HMM-HMM alignments. The programs are among the most popular methods for protein sequence matching and have been cited more than 5000 times in total according to Google Scholar.
Background
Proteins are central players in all of life's processes. Understanding them is central to un ...
Anders Krogh
Anders Krogh is a bioinformatician at the University of Copenhagen, where he leads the university's bioinformatics center. He is known for his pioneering work on the use of hidden Markov models in bioinformatics (together with David Haussler), and is co-author of a widely used textbook in bioinformatics. In addition, he co-authored one of the early textbooks on neural networks (John A. Hertz, Richard G. Palmer, Anders Krogh, Introduction to the Theory of Neural Computation, Santa Fe Institute Studies in the Sciences of Complexity, Westview Press, 1991). His current research interests include promoter analysis, non-coding RNA, gene prediction and protein structure prediction. In 2017, Krogh was elected a Fellow of the International Society for Computational Biology (ISCB).
See also
*ELIXIR (the European life-sciences Infrastructure for biological Information) is an initiative that will allow life science laboratories across Europe to share and store their res ...