Pfam
   HOME
*



picture info

Pfam
Pfam is a database of protein families that includes their annotations and multiple sequence alignments generated using hidden Markov models. The most recent version, Pfam 35.0, was released in November 2021 and contains 19,632 families. Uses The general purpose of the Pfam database is to provide a complete and accurate classification of protein families and domains. Originally, the rationale behind creating the database was to have a semi-automated method of curating information on known protein families to improve the efficiency of annotating genomes. The Pfam classification of protein families has been widely adopted by biologists because of its wide coverage of proteins and sensible naming conventions. It is used by experimental biologists researching specific proteins, by structural biologists to identify new targets for structure determination, by computational biologists to organise sequences and by evolutionary biologists tracing the origins of proteins. Early genome ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  




Pfam Logo
Pfam is a database of protein families that includes their annotations and multiple sequence alignments generated using hidden Markov models. The most recent version, Pfam 35.0, was released in November 2021 and contains 19,632 families. Uses The general purpose of the Pfam database is to provide a complete and accurate classification of protein families and domains. Originally, the rationale behind creating the database was to have a semi-automated method of curating information on known protein families to improve the efficiency of annotating genomes. The Pfam classification of protein families has been widely adopted by biologists because of its wide coverage of proteins and sensible naming conventions. It is used by experimental biologists researching specific proteins, by structural biologists to identify new targets for structure determination, by computational biologists to organise sequences and by evolutionary biologists tracing the origins of proteins. Early genome ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Stockholm Format
Stockholm format is a multiple sequence alignment format used by Pfam, Rfam anDfam to disseminate protein, RNA and DNA sequence alignments. The alignment editorRaleehtml" ;"title=",;()[">,;()[aBb.-_--supports pseudoknot and further structure markup (see WUSS documentation) For protein [HGIEBTSCX] SA Surface Accessibility [0-9X] (0=0%-10%; ...; 9=90%-100%) TM TransMembrane [Mio] PP Posterior Probability [0-9*] (0=0.00-0.05; 1=0.05-0.15; *=0.95-1.00) LI LIgand binding AS Active Site pAS AS - Pfam predicted sAS AS - from SwissProt IN INtron (in or after) -2 For RNA tertiary interactions: ------------------------------ tWW WC/WC in trans For basepairs: />AaBb...Zz.html" ;"title=">AaBb...Zz">>AaBb...Zz For unpaired: cWH W ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Zinc Finger
A zinc finger is a small protein structural motif that is characterized by the coordination of one or more zinc ions (Zn2+) in order to stabilize the fold. It was originally coined to describe the finger-like appearance of a hypothesized structure from the African clawed frog (''Xenopus laevis'') transcription factor IIIA. However, it has been found to encompass a wide variety of differing protein structures in eukaryotic cells. ''Xenopus laevis'' TFIIIA was originally demonstrated to contain zinc and require the metal for function in 1983, the first such reported zinc requirement for a gene regulatory protein followed soon thereafter by the Krüppel factor in ''Drosophila''. It often appears as a metal-binding domain in multi-domain proteins. Proteins that contain zinc fingers (zinc finger proteins) are classified into several different structural families. Unlike many other clearly defined supersecondary structures such as Greek keys or β hairpins, there are a number of t ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Domains Of Unknown Function
A domain of unknown function (DUF) is a protein domain that has no characterised function. These families have been collected together in the Pfam database using the prefix DUF followed by a number, with examples being DUF2992 and DUF1220. As of 2019, there are almost 4,000 DUF families within the Pfam database representing over 22% of known families. Some DUFs are not named using the nomenclature due to popular usage but are nevertheless DUFs. The DUF designation is tentative, and such families tend to be renamed to a more specific name (or merged to an existing domain) after a function is identified. History The DUF naming scheme was introduced by Chris Ponting, through the addition of DUF1 and DUF2 to the SMART database. These two domains were found to be widely distributed in bacterial signaling proteins. Subsequently, the functions of these domains were identified and they have since been renamed as the GGDEF domain and EAL domain respectively. Characterisation Structural gen ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


InterPro
InterPro is a database of protein families, protein domains and functional sites in which identifiable features found in known proteins can be applied to new protein sequences in order to functionally characterise them. The contents of InterPro consist of diagnostic signatures and the proteins that they significantly match. The signatures consist of models (simple types, such as regular expressions or more complex ones, such as Hidden Markov models) which describe protein families, domains or sites. Models are built from the amino acid sequences of known families or domains and they are subsequently used to search unknown sequences (such as those arising from novel genome sequencing) in order to classify them. Each of the member databases of InterPro contributes towards a different niche, from very high-level, structure-based classifications ( SUPERFAMILY and CATH-Gene3D) through to quite specific sub-family classifications ( PRINTS and PANTHER). InterPro's intention is to pro ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

HMMER
HMMER is a free and commonly used software package for sequence analysis written by Sean Eddy. Its general usage is to identify homologous protein or nucleotide sequences, and to perform sequence alignments. It detects homology by comparing a ''profile-HMM'' (a Hidden Markov model constructed explicitly for a particular search) to either a single sequence or a database of sequences. Sequences that score significantly better to the profile-HMM compared to a null model are considered to be homologous to the sequences that were used to construct the profile-HMM. Profile-HMMs are constructed from a multiple sequence alignment in the HMMER package using the ''hmmbuild'' program. The profile-HMM implementation used in the HMMER software was based on the work of Krogh and colleagues. HMMER is a console utility ported to every major operating system, including different versions of Linux, Windows, and Mac OS. HMMER is the core utility that protein family databases such as Pfam and ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Rfam
Rfam is a database containing information about non-coding RNA (ncRNA) families and other structured RNA elements. It is an annotated, open access database originally developed at the Wellcome Trust Sanger Institute in collaboration with Janelia Farm, and currently hosted at the European Bioinformatics Institute. Rfam is designed to be similar to the Pfam database for annotating protein families. Unlike proteins, ncRNAs often have similar secondary structure without sharing much similarity in the primary sequence. Rfam divides ncRNAs into families based on evolution from a common ancestor. Producing multiple sequence alignments (MSA) of these families can provide insight into their structure and function, similar to the case of protein families. These MSAs become more useful with the addition of secondary structure information. Rfam researchers also contribute to Wikipedia's RNA WikiProject. Uses The Rfam database can be used for a variety of functions. For each ncRNA fa ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Protein Family
A protein family is a group of evolutionarily related proteins. In many cases, a protein family has a corresponding gene family, in which each gene encodes a corresponding protein with a 1:1 relationship. The term "protein family" should not be confused with Family (biology), family as it is used in taxonomy. Proteins in a family descend from a common ancestor and typically have similar protein structure, three-dimensional structures, functions, and significant Sequence homology, sequence similarity. The most important of these is sequence similarity (usually amino-acid sequence), since it is the strictest indicator of homology and therefore the clearest indicator of common ancestry. A fairly well developed framework exists for evaluating the significance of similarity between a group of sequences using sequence alignment methods. Proteins that do not share a common ancestor are very unlikely to show statistically significant sequence similarity, making sequence alignment a powerf ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Conserved Sequence
In evolutionary biology, conserved sequences are identical or similar sequences in nucleic acids ( DNA and RNA) or proteins across species ( orthologous sequences), or within a genome ( paralogous sequences), or between donor and receptor taxa ( xenologous sequences). Conservation indicates that a sequence has been maintained by natural selection. A highly conserved sequence is one that has remained relatively unchanged far back up the phylogenetic tree, and hence far back in geological time. Examples of highly conserved sequences include the RNA components of ribosomes present in all domains of life, the homeobox sequences widespread amongst Eukaryotes, and the tmRNA in Bacteria. The study of sequence conservation overlaps with the fields of genomics, proteomics, evolutionary biology, phylogenetics, bioinformatics and mathematics. History The discovery of the role of DNA in heredity, and observations by Frederick Sanger of variation between animal insulins in 1949, prompt ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  




PANDIT (database)
PANDIT is a database of multiple sequence alignments and phylogenetic trees covering many common protein domains. See also * Pfam: database of protein domains * Phylogeny * Sequence alignment In bioinformatics, a sequence alignment is a way of arranging the sequences of DNA, RNA, or protein to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences. Alig ... References External links * http://www.ebi.ac.uk/goldman-srv/pandit Biological databases Computational phylogenetics Genetics in the United Kingdom Protein domains Science and technology in Cambridgeshire South Cambridgeshire District {{Biodatabase-stub ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


List Of Biological Databases
Biological databases are stores of biological information. The journal ''Nucleic Acids Research'' regularly publishes special issues on biological databases and has a list of such databases. The 2018 issue has a list of about 180 such databases and updates to previously described databasesOmics Discovery Indexcan be used to browse and search several biological databases. Meta databases Meta databases are databases of databases that collect data about data to generate new data. They are capable of merging information from different sources and making it available in a new and more convenient form, or with an emphasis on a particular disease or organism.[metadatabase is a database model for metadata management, global query of independent database, and distributed data processing. The word metadatabase is an addition to the dictionary]. originally ,metadata was only common term referring simply to ''data about data '' such a tags ,keywords, and markup headers. * ConsensusPathDB: a ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

South Cambridgeshire District
South is one of the cardinal directions or Points of the compass, compass points. The direction is the opposite of north and is perpendicular to both east and west. Etymology The word ''south'' comes from Old English ''sūþ'', from earlier Proto-Germanic language, Proto-Germanic ''*sunþaz'' ("south"), possibly related to the same Proto-Indo-European language, Proto-Indo-European root that the word ''sun'' derived from. Some languages describe south in the same way, from the fact that it is the direction of the sun at noon (in the Northern Hemisphere), like Latin meridies 'noon, south' (from medius 'middle' + dies 'day', cf English meridional), while others describe south as the right-hand side of the rising sun, like Biblical Hebrew תֵּימָן teiman 'south' from יָמִין yamin 'right', Aramaic תַּימנַא taymna from יָמִין yamin 'right' and Syriac ܬܰܝܡܢܳܐ taymna from ܝܰܡܝܺܢܳܐ yamina (hence the name of Yemen, the land to the south/right of the ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]