ELKI
ELKI (''Environment for Developing KDD-Applications Supported by Index-Structures'') is a data mining (KDD, knowledge discovery in databases) software framework developed for use in research and teaching. It was originally created by the database systems research unit at the Ludwig Maximilian University of Munich, Germany, led by Professor Hans-Peter Kriegel. The project has continued at the Technical University of Dortmund, Germany. It aims to allow the development and evaluation of advanced data mining algorithms and their interaction with database index structures.

Description

The ELKI framework is written in Java and built around a modular architecture. Most of the currently included algorithms address clustering, outlier detection, and database indexing. The object-oriented architecture allows the combination of arbitrary algorithms, data types, distance functions, indexes, and evaluation measures. The Java just-in-time compiler optimizes all combinations to a similar extent …
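To illustrate the pluggable architecture described above, here is a minimal Java sketch of the same pattern. The interfaces and names below are invented for illustration and are not ELKI's actual API; they only show how arbitrary algorithms and distance functions over a shared data type can be combined through common interfaces.

```java
import java.util.List;

/** Illustrative sketch of a plug-in architecture (hypothetical names,
 *  not ELKI's actual API): algorithms, data types, and distance
 *  functions are combined freely through shared interfaces. */
public class PluggableSketch {
    /** Any distance function over objects of type O. */
    interface Distance<O> { double distance(O a, O b); }

    /** Any algorithm parameterized by a data type and a distance function. */
    interface Algorithm<O, R> { R run(List<O> data, Distance<O> dist); }

    /** One interchangeable component: Euclidean distance on double vectors. */
    static final Distance<double[]> EUCLIDEAN = (a, b) -> {
        double sum = 0;
        for (int i = 0; i < a.length; i++) { double d = a[i] - b[i]; sum += d * d; }
        return Math.sqrt(sum);
    };

    /** Another component: a toy "algorithm" returning the largest pairwise distance. */
    static final Algorithm<double[], Double> DIAMETER = (data, dist) -> {
        double max = 0;
        for (double[] a : data)
            for (double[] b : data)
                max = Math.max(max, dist.distance(a, b));
        return max;
    };

    public static void main(String[] args) {
        List<double[]> data = List.of(new double[]{0, 0}, new double[]{3, 4});
        // Any Distance<double[]> could be swapped in here without touching DIAMETER.
        System.out.println(DIAMETER.run(data, EUCLIDEAN)); // prints 5.0
    }
}
```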
Arthur Zimek
Arthur Zimek is a professor in data mining, data science and machine learning at the University of Southern Denmark in Odense, Denmark. He graduated from the Ludwig Maximilian University of Munich in Munich, Germany, where he worked with Prof. Hans-Peter Kriegel. His dissertation on "Correlation Clustering" was awarded the "SIGKDD Doctoral Dissertation Award 2009 Runner-up" by the Association for Computing Machinery. He is well known for his work on outlier detection, density-based clustering, correlation clustering, and the curse of dimensionality. He is one of the founders and core developers of the open-source ELKI data mining framework.
OPTICS Algorithm
Ordering points to identify the clustering structure (OPTICS) is an algorithm for finding density-based clusters in spatial data. It was presented in 1999 by Mihael Ankerst, Markus M. Breunig, Hans-Peter Kriegel and Jörg Sander. Its basic idea is similar to DBSCAN, but it addresses one of DBSCAN's major weaknesses: the problem of detecting meaningful clusters in data of varying density. To do so, the points of the database are (linearly) ordered such that spatially closest points become neighbors in the ordering. Additionally, a special distance is stored for each point that represents the density that must be accepted for a cluster so that both points belong to the same cluster. This is represented as a dendrogram.

Basic idea

Like DBSCAN, OPTICS requires two parameters: ε, which describes the maximum distance (radius) to consider, and MinPts, describing the number of points required to form a cluster. A point is a ''core point'' if at least MinPts points are found within its ε-neighborhood …
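The ordering procedure described above can be made concrete with a short sketch. The following is a minimal, self-contained Java illustration of the idea, not the authors' reference implementation: the parameter values, the sample points, and all identifiers are invented for the example, and reachability updates use a priority queue with re-insertion in place of a decrease-key operation.

```java
import java.util.*;

/** Toy OPTICS sketch on 2-D points (illustrative only). */
public class OpticsSketch {
    static final double EPS = 1.0;  // maximum radius to consider (assumed value)
    static final int MIN_PTS = 3;   // points required to form a cluster (assumed value)

    record Point(double x, double y) {}
    record Seed(int p, double r) {}  // queue entry: point index and reachability at insert time

    static double dist(Point a, Point b) {
        return Math.hypot(a.x() - b.x(), a.y() - b.y());
    }

    /** Indexes of all points within EPS of point p (including p itself). */
    static List<Integer> neighbors(List<Point> db, int p) {
        List<Integer> n = new ArrayList<>();
        for (int q = 0; q < db.size(); q++)
            if (dist(db.get(p), db.get(q)) <= EPS) n.add(q);
        return n;
    }

    /** Distance to the MIN_PTS-th nearest neighbor, or NaN if p is not a core point. */
    static double coreDistance(List<Point> db, int p, List<Integer> nbrs) {
        if (nbrs.size() < MIN_PTS) return Double.NaN;
        double[] d = nbrs.stream().mapToDouble(q -> dist(db.get(p), db.get(q))).sorted().toArray();
        return d[MIN_PTS - 1];
    }

    /** If p is a core point, lower the reachability of its unprocessed neighbors. */
    static void expand(List<Point> db, int p, double[] reach,
                       boolean[] done, PriorityQueue<Seed> seeds) {
        List<Integer> nbrs = neighbors(db, p);
        double cd = coreDistance(db, p, nbrs);
        if (Double.isNaN(cd)) return;  // not a core point: cannot expand
        for (int q : nbrs) {
            if (done[q]) continue;
            double newReach = Math.max(cd, dist(db.get(p), db.get(q)));
            if (newReach < reach[q]) {
                reach[q] = newReach;
                seeds.add(new Seed(q, newReach));  // re-insert instead of decrease-key
            }
        }
    }

    public static void main(String[] args) {
        List<Point> db = List.of(
            new Point(0, 0), new Point(0.2, 0), new Point(0.1, 0.1),
            new Point(5, 5), new Point(5.1, 5), new Point(5, 5.2),
            new Point(9, 9));
        int n = db.size();
        double[] reach = new double[n];
        Arrays.fill(reach, Double.POSITIVE_INFINITY);
        boolean[] done = new boolean[n];
        List<Integer> order = new ArrayList<>();

        for (int start = 0; start < n; start++) {
            if (done[start]) continue;
            done[start] = true;
            order.add(start);
            // Seeds are ordered by reachability; stale entries are skipped below.
            PriorityQueue<Seed> seeds = new PriorityQueue<>(Comparator.comparingDouble(Seed::r));
            expand(db, start, reach, done, seeds);
            while (!seeds.isEmpty()) {
                Seed s = seeds.poll();
                if (done[s.p()]) continue;  // stale duplicate entry
                done[s.p()] = true;
                order.add(s.p());
                expand(db, s.p(), reach, done, seeds);
            }
        }
        // The ordering plus reachability values is what a reachability plot visualizes.
        for (int p : order)
            System.out.printf("point %d  reachability %.3f%n", p, reach[p]);
    }
}
```

Plotting the reachability values in this output order yields the valleys-and-peaks reachability plot from which clusters of varying density can be read off.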
Data Mining
Data mining is the process of extracting and discovering patterns in large data sets, using methods at the intersection of machine learning, statistics, and database systems. Data mining is an interdisciplinary subfield of computer science and statistics with the overall goal of extracting information (with intelligent methods) from a data set and transforming the information into a comprehensible structure for further use. Data mining is the analysis step of the "knowledge discovery in databases" process, or KDD. Aside from the raw analysis step, it also involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating. The term "data mining" is a misnomer because the goal is the extraction of patterns and knowledge from large amounts of data, not the extraction (''mining'') of data itself. It is also a buzzword …
Hans-Peter Kriegel
Hans-Peter Kriegel (born 1 October 1948) is a German computer scientist and professor at the Ludwig Maximilian University of Munich, where he leads the Database Systems Group in the Department of Computer Science. He was previously a professor at the University of Würzburg and the University of Bremen, after his habilitation at the Technical University of Dortmund and a doctorate from the Karlsruhe Institute of Technology.

Research

His most important contributions are the database index structures R*-tree, X-tree and IQ-Tree, the cluster analysis algorithms DBSCAN, OPTICS and SUBCLU, and the anomaly detection method Local Outlier Factor (LOF). His research focuses on correlation clustering, high-dimensional data indexing and analysis, spatial data mining and spatial data management, as well as multimedia databases. His research group developed a software framework titled ELKI that is designed for the parallel research of index structures, data mining algorithms and their interaction …
Technical University of Dortmund
TU Dortmund University (German: ''Technische Universität Dortmund'') is a technical university in Dortmund, North Rhine-Westphalia, Germany, with over 35,000 students and over 6,000 staff, including 300 professors, offering around 80 bachelor's and master's degree programs. It is situated in the Ruhr area, the fourth-largest urban area in Europe. The university pioneered the Internet in Germany and contributed to machine learning (in particular, to support-vector machines and RapidMiner).

History

The University of Dortmund (German: ''Universität Dortmund'') was founded in 1968, during the decline of the coal and steel industry in the Ruhr region. Its establishment was seen as an important move in the economic change (''Strukturwandel'') from heavy industry to technology. The university's main areas of research are the natural sciences, engineering, pedagogy/teacher training in a wide spectrum of subjects, special education, and journalism. The University of Dortmund was originally designed to be a technical university …
Just-in-time Compilation
In computing, just-in-time (JIT) compilation (also dynamic translation or run-time compilation) is compilation (of computer code) during execution of a program (at run time) rather than before execution. This may consist of source code translation but is more commonly bytecode translation to machine code, which is then executed directly. A system implementing a JIT compiler typically continuously analyses the code being executed and identifies the parts of the code where the speedup gained from compilation or recompilation would outweigh the overhead of compiling that code. JIT compilation is a combination of the two traditional approaches to translation to machine code, ahead-of-time compilation (AOT) and interpretation, and combines some advantages and drawbacks of both. Roughly, JIT compilation combines the speed of compiled code with the flexibility of interpretation, with the overhead of an interpreter and the additional overhead of compiling and linking (not just interpreting) …
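As a rough illustration of the profiling heuristic described above, the following toy Java model interprets a function until an invocation counter crosses a threshold, then switches to a cached "compiled" form. The threshold, the class names, and the stand-in for code generation are all assumptions made for the example; real JIT compilers emit machine code and use far more elaborate heuristics.

```java
import java.util.function.IntUnaryOperator;

/** Toy model of a JIT tiering decision: interpret a function until an
 *  invocation counter crosses a threshold, then switch to a cached
 *  "compiled" form. All names and the threshold are illustrative. */
public class JitSketch {
    // Assumption for the example: call count at which compilation pays off.
    static final int HOT_THRESHOLD = 1000;

    interface Executable { int run(int x); }

    static class ProfiledFunction {
        final IntUnaryOperator interpreted; // slow path, conceptually re-decoded each call
        Executable compiled;                // fast path, produced lazily
        long calls;

        ProfiledFunction(IntUnaryOperator interpreted) { this.interpreted = interpreted; }

        int invoke(int x) {
            if (compiled != null) return compiled.run(x);       // hot: use compiled code
            if (++calls >= HOT_THRESHOLD) compiled = compile(); // pay compile cost once
            return interpreted.applyAsInt(x);                   // cold: keep interpreting
        }

        Executable compile() {
            // A real JIT would emit machine code here; a direct call stands in
            // for it so the sketch stays self-contained.
            return interpreted::applyAsInt;
        }
    }

    public static void main(String[] args) {
        ProfiledFunction square = new ProfiledFunction(x -> x * x);
        for (int i = 0; i < 2000; i++) square.invoke(i);
        System.out.println("calls before compilation: " + square.calls); // 1000
    }
}
```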
Affero General Public License
The GNU Affero General Public License (GNU AGPL) is a free, copyleft license published by the Free Software Foundation in November 2007, based on the GNU GPL version 3 and the ''Affero General Public License'' (non-GNU). It is intended for software designed to be run over a network, adding a provision requiring that the corresponding source code of modified versions of the software be prominently offered to all users who interact with the software over a network. The Open Source Initiative approved the GNU AGPLv3 as an open source license in March 2008 after the company Funambol submitted it for consideration through its CEO Fabrizio Capobianco.

History

In 2000, while developing an e-learning and e-service business model at Mandriva, Henry Poole met with Richard Stallman in Amsterdam and discussed the issue of the GPLv2 license not requiring web application providers to share source code with the users interacting with their software over a network. Over the following …
Copyleft
Copyleft is the legal technique of granting certain freedoms over copies of copyrighted works with the requirement that the same rights be preserved in derivative works. In this sense, ''freedoms'' refers to the use of the work for any purpose, and the ability to modify, copy, share, and redistribute the work, with or without a fee. Licenses which implement copyleft can be used to maintain copyright conditions for works ranging from computer software to documents, art, and scientific discoveries, and similar approaches have even been applied to certain patents (''patentleft''). Copyleft software licenses are considered ''protective'' or ''reciprocal'' (in contrast with permissive free software licenses): they require that information necessary for reproducing and modifying the work be made available to recipients of the software program. This information is most commonly in the form of source code files, which usually contain a copy of the license terms …
Database Management System
In computing, a database is an organized collection of data or a type of data store based on the use of a database management system (DBMS), the software that interacts with end users, applications, and the database itself to capture and analyze the data. The DBMS additionally encompasses the core facilities provided to administer the database. The sum total of the database, the DBMS and the associated applications can be referred to as a database system. Often the term "database" is also used loosely to refer to any of the DBMS, the database system or an application associated with the database. Before digital storage and retrieval of data became widespread, index cards were used for data storage in a wide range of applications and environments: in the home to record and store recipes, shopping lists, contact information and other organizational data; in business to record presentation notes, project research and notes, and contact information; in schools as flash cards …
Business Intelligence
Business intelligence (BI) consists of strategies, methodologies, and technologies used by enterprises for data analysis and management of business information. Common functions of BI technologies include reporting, online analytical processing, analytics, dashboard development, data mining, process mining, complex event processing, business performance management, benchmarking, text mining, predictive analytics, and prescriptive analytics. BI tools can handle large amounts of structured and sometimes unstructured data to help organizations identify, develop, and otherwise create new strategic business opportunities. They aim to allow for the easy interpretation of these big data. Identifying new opportunities and implementing an effective strategy based on insights is assumed to potentially provide businesses with a competitive market advantage and long-term stability, and to help them make strategic decisions. Business …
Evaluation
In common usage, evaluation is a systematic determination and assessment of a subject's merit, worth and significance, using criteria governed by a set of standards. It can assist an organization, program, design, project or any other intervention or initiative to assess any aim, realizable concept/proposal, or any alternative, to help in decision-making; or to ascertain the degree of achievement or value in regard to the aim, objectives, and results of any such action that has been completed. The primary purpose of evaluation, in addition to gaining insight into prior or existing initiatives, is to enable reflection and assist in the identification of future change. Evaluation is often used to characterize and appraise subjects of interest in a wide range of human enterprises, including the arts, criminal justice, foundations, non-profit organizations, government, health care, and other human services. It is long …