SonicParanoid

	SonicParanoid SonicParanoid is an algorithm for the de-novo prediction of orthologous genes among multiple species. It borrows the main idea from InParanoid with substantial changes to the algorithm that drastically reduce the time required for the analysis. Additionally, SonicParanoid generates groups of orthologous genes shared among the input proteomes using single-linkage hierarchical clustering In data mining and statistics, hierarchical clustering (also called hierarchical cluster analysis or HCA) is a method of cluster analysis that seeks to build a hierarchy of clusters. Strategies for hierarchical clustering generally fall into two ... or markov clustering. The latest iteration of SonicParanoid uses machine learning to substantially reduce execution times, and language models to infer orthologs at the domain level. References External links Source code on GitLab [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Sequence Homology Sequence homology is the homology (biology), biological homology between DNA sequence, DNA, RNA sequence, RNA, or Protein primary structure, protein sequences, defined in terms of shared ancestry in the evolutionary history of life. Two segments of DNA can have shared ancestry because of three phenomena: either a speciation event (orthologs), or a Gene duplication, duplication event (paralogs), or else a Horizontal gene transfer, horizontal (or lateral) gene transfer event (xenologs). Homology among DNA, RNA, or proteins is typically inferred from their nucleotide or amino acid sequence similarity. Significant similarity is strong evidence that two sequences are related by evolutionary changes from a common ancestral sequence. Sequence alignment, Alignments of multiple sequences are used to indicate which regions of each sequence are homologous. Identity, similarity, and conservation The term "percent homology" is often used to mean "sequence similarity”, that is the percen ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Single-linkage Clustering In statistics, single-linkage clustering is one of several methods of hierarchical clustering. It is based on grouping clusters in bottom-up fashion (agglomerative clustering), at each step combining two clusters that contain the closest pair of elements not yet belonging to the same cluster as each other. This method tends to produce long thin clusters in which nearby elements of the same cluster have small distances, but elements at opposite ends of a cluster may be much farther from each other than two elements of other clusters. For some classes of data, this may lead to difficulties in defining classes that could usefully subdivide the data. However, it is popular in astronomy for analyzing galaxy clusters, which may often involve long strings of matter; in this application, it is also known as the friends-of-friends algorithm. Overview of agglomerative clustering methods In the beginning of the agglomerative clustering process, each element is in a cluster of its own. The ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Hierarchical Clustering In data mining and statistics, hierarchical clustering (also called hierarchical cluster analysis or HCA) is a method of cluster analysis that seeks to build a hierarchy of clusters. Strategies for hierarchical clustering generally fall into two categories: * Agglomerative: Agglomerative: Agglomerative clustering, often referred to as a "bottom-up" approach, begins with each data point as an individual cluster. At each step, the algorithm merges the two most similar clusters based on a chosen distance metric (e.g., Euclidean distance) and linkage criterion (e.g., single-linkage, complete-linkage). This process continues until all data points are combined into a single cluster or a stopping criterion is met. Agglomerative methods are more commonly used due to their simplicity and computational efficiency for small to medium-sized datasets . * Divisive: Divisive clustering, known as a "top-down" approach, starts with all data points in a single cluster and recursively splits the clu ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]