HOME

TheInfoList



OR:

Frequent pattern discovery (or FP discovery, FP mining, or Frequent itemset mining) is part of knowledge discovery in databases,
Massive Online Analysis Massive Online Analysis (MOA) is a free open-source software project specific for data stream mining with concept drift. It is written in Java and developed at the University of Waikato, New Zealand. Description MOA is an open-source framework ...
, and data mining; it describes the task of finding the most frequent and relevant
pattern A pattern is a regularity in the world, in human-made design, or in abstract ideas. As such, the elements of a pattern repeat in a predictable manner. A geometric pattern is a kind of pattern formed of geometric shapes and typically repeated li ...
s in large datasets. The concept was first introduced for mining transaction databases. Frequent patterns are defined as subsets (itemsets, subsequences, or substructures) that appear in a data set with frequency no less than a user-specified or auto-determined threshold.


Techniques

Techniques for FP mining include: *
market basket analysis Affinity analysis falls under the umbrella term of data mining which uncovers meaningful correlations between different entities according to their co-occurrence in a data set. In almost all systems and processes, the application of affinity anal ...
* cross-marketing * catalog design * clustering * classification * recommendation systems For the most part, FP discovery can be done using
association rule learning Association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large databases. It is intended to identify strong rules discovered in databases using some measures of interestingness.Pi ...
with particular algorithms Eclat, FP-growth and the
Apriori algorithm AprioriRakesh Agrawal and Ramakrishnan SrikanFast algorithms for mining association rules Proceedings of the 20th International Conference on Very Large Data Bases, VLDB, pages 487-499, Santiago, Chile, September 1994. is an algorithm for frequent ...
. Other strategies include: *
Frequent subtree mining In computer science, frequent subtree mining is the problem of finding all patterns in a given database whose support (a metric related to its number of occurrences in other subtrees) is over a given threshold. It is a more general form of the maxim ...
* Structure mining *
Sequential pattern mining Sequential pattern mining is a topic of data mining concerned with finding statistically relevant patterns between data examples where the values are delivered in a sequence. It is usually presumed that the values are discrete, and thus time serie ...
and respective specific techniques. Implementations exist for various
machine learning Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence. Machine ...
systems or modules like MLlib for Apache Spark.


References

{{reflist Data mining Cluster analysis