Frequent Subtree Mining
   HOME





Frequent Subtree Mining
In computer science, frequent subtree mining is the problem of finding all patterns in a given database whose support (a metric related to its number of occurrences in other subtrees) is over a given threshold. It is a more general form of the maximum agreement subtree problem. Definition Frequent subtree mining is the problem of trying to find all of the patterns whose "support" is over a certain user-specified level, where "support" is calculated as the number of trees in a database which have at least one subtree isomorphic to a given pattern. Formal definition The problem of frequent subtree mining has been formally defined as: :Given a threshold ''minfreq'', a class of trees \mathcal, a transitive subtree relation P\preceq T between trees P, T \in\mathcal, a finite set of trees \mathcal\subseteq\mathcal, the frequent subtree mining problem is the problem of finding all trees \mathcal\subset\mathcal such that no two trees in \mathcal are isomorphic and ::\forall P\in\mathc ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Computer Science
Computer science is the study of computation, information, and automation. Computer science spans Theoretical computer science, theoretical disciplines (such as algorithms, theory of computation, and information theory) to Applied science, applied disciplines (including the design and implementation of Computer architecture, hardware and Software engineering, software). Algorithms and data structures are central to computer science. The theory of computation concerns abstract models of computation and general classes of computational problem, problems that can be solved using them. The fields of cryptography and computer security involve studying the means for secure communication and preventing security vulnerabilities. Computer graphics (computer science), Computer graphics and computational geometry address the generation of images. Programming language theory considers different ways to describe computational processes, and database theory concerns the management of re ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Maximum Agreement Subtree Problem
The maximum agreement subtree problem is any of several closely related problems in graph theory and computer science. In all of these problems one is given a collection of trees T_1,\ldots, T_m each containing n leaves. The leaves of these trees are given labels from some set L with , L, =n so that no pair of leaves in the same tree sharing the same label, within the same tree the labelling for each leaf is distinct. In this problem one would like to find the largest subset L'\subset L such that the minimal spanning subtrees containing the leaves in L', of T_1\mid S,\ldots, T_m\mid S are the "same" while preserving the labelling. Formulations Maximum homeomorphic agreement subtree Source: This version requires that the subtrees T_1\mid S,\ldots, T_m\mid S are homeomorphic to one another. Rooted maximum homeomorphic agreement subtree This version is the same as the ''maximum homeomorphic agreement subtree'', but we further assume that T_1,\ldots,T_m are rooted and that the sub ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Graph Isomorphism
In graph theory, an isomorphism of graphs ''G'' and ''H'' is a bijection between the vertex sets of ''G'' and ''H'' : f \colon V(G) \to V(H) such that any two vertices ''u'' and ''v'' of ''G'' are adjacent in ''G'' if and only if f(u) and f(v) are adjacent in ''H''. This kind of bijection is commonly described as "edge-preserving bijection", in accordance with the general notion of isomorphism being a structure-preserving bijection. If an isomorphism exists between two graphs, then the graphs are called isomorphic, often denoted by G\simeq H. In the case when the isomorphism is a mapping of a graph onto itself, i.e., when ''G'' and ''H'' are one and the same graph, the isomorphism is called an automorphism of ''G''. Graph isomorphism is an equivalence relation on graphs and as such it partitions the class of all graphs into equivalence classes. A set of graphs isomorphic to each other is called an isomorphism class of graphs. The question of whether graph isomorphism can be dete ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Computational Biology
Computational biology refers to the use of techniques in computer science, data analysis, mathematical modeling and Computer simulation, computational simulations to understand biological systems and relationships. An intersection of computer science, biology, and data science, the field also has foundations in applied mathematics, molecular biology, cell biology, chemistry, and genetics. History Bioinformatics, the analysis of informatics processes in biological systems, began in the early 1970s. At this time, research in artificial intelligence was using network models of the human brain in order to generate new algorithms. This use of biological data pushed biological researchers to use computers to evaluate and compare large data sets in their own field. By 1982, researchers shared information via Punched card, punch cards. The amount of data grew exponentially by the end of the 1980s, requiring new computational methods for quickly interpreting relevant information. Per ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


KEGG
KEGG (Kyoto Encyclopedia of Genes and Genomes) is a collection of databases dealing with genomes, biological pathways, diseases, drugs, and chemical substances. KEGG is utilized for bioinformatics research and education, including data analysis in genomics, metagenomics, metabolomics and other omics studies, modeling and simulation in systems biology, and translational research in drug development. The KEGG database project was initiated in 1995 by Minoru Kanehisa, professor at the Institute for Chemical Research, Kyoto University, under the then ongoing Japanese Human Genome Program. Foreseeing the need for a computerized resource that can be used for biological interpretation of genome sequence data, he started developing the KEGG PATHWAY database. It is a collection of manually drawn KEGG pathway maps representing experimental knowledge on metabolism and various other functions of the cell and the organism. Each pathway map contains a network of molecular interactions and ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

NP-complete
In computational complexity theory, NP-complete problems are the hardest of the problems to which ''solutions'' can be verified ''quickly''. Somewhat more precisely, a problem is NP-complete when: # It is a decision problem, meaning that for any input to the problem, the output is either "yes" or "no". # When the answer is "yes", this can be demonstrated through the existence of a short (polynomial length) ''solution''. # The correctness of each solution can be verified quickly (namely, in polynomial time) and a brute-force search algorithm can find a solution by trying all possible solutions. # The problem can be used to simulate every other problem for which we can verify quickly that a solution is correct. Hence, if we could find solutions of some NP-complete problem quickly, we could quickly find the solutions of every other problem to which a given solution can be easily verified. The name "NP-complete" is short for "nondeterministic polynomial-time complete". In this name, ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Subgraph Isomorphism Problem
In theoretical computer science, the subgraph isomorphism problem is a computational task in which two graphs G and H are given as input, and one must determine whether G contains a subgraph that is isomorphic to H. Subgraph isomorphism is a generalization of both the maximum clique problem and the problem of testing whether a graph contains a Hamiltonian cycle, and is therefore NP-complete. However certain other cases of subgraph isomorphism may be solved in polynomial time. Sometimes the name subgraph matching is also used for the same problem. This name puts emphasis on finding such a subgraph as opposed to the bare decision problem. Decision problem and computational complexity To prove subgraph isomorphism is NP-complete, it must be formulated as a decision problem. The input to the decision problem is a pair of graphs G and ''H''. The answer to the problem is positive if ''H'' is isomorphic to a subgraph of ''G'', and negative otherwise. Formal question: Let G=(V,E), ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Combinatorial Explosion
In mathematics, a combinatorial explosion is the rapid growth of the complexity of a problem due to the way its combinatorics depends on input, constraints and bounds. Combinatorial explosion is sometimes used to justify the intractability of certain problems.http://intelligence.worldofcomputing/combinatorial-explosion
Combinatorial Explosion.
Examples of such problems include certain mathematical functions, the analysis of some puzzles and games, and some pathological examples which can be modelled as the Ackermann function.


...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]