
KNIME, the Konstanz Information Miner, is a free and open-source data analytics, reporting and integration platform. KNIME integrates various components for machine learning and data mining through its modular data pipelining "Building Blocks of Analytics" concept. A graphical user interface and the use of JDBC allow the assembly of nodes that blend different data sources, including preprocessing (ETL: extraction, transformation, loading), for modeling, data analysis and visualization with little or no programming. KNIME has been used in pharmaceutical research since 2006, and it is also used in other areas such as CRM customer data analysis, business intelligence, text mining and financial data analysis. Attempts have also been made to use KNIME as a robotic process automation (RPA) tool. KNIME's headquarters are in Zurich, with additional offices in Konstanz, Berlin, and Austin (USA).
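The extract, transform and load steps mentioned above can be illustrated outside KNIME with a short script. The following is only a minimal sketch, not KNIME code: it uses Python's standard `sqlite3` module as a stand-in for a JDBC database connection, and the `sales` table, the `run_etl` function and the sample rows are hypothetical names chosen for illustration.

```python
import sqlite3

# Hypothetical raw input; in a workflow this would come from a reader node.
SAMPLE_ROWS = [
    ("Ada ", "eu", 120.0),
    ("Bob", "us", 80.5),
    ("", "eu", 10.0),      # missing customer name, dropped during transform
    ("Cy", "EU ", 30.0),
    ("Dee", "us", None),   # missing amount, dropped during transform
]


def run_etl(rows):
    """Extract raw rows, clean them, load them into SQLite, and aggregate."""
    # Extract: materialise the incoming records.
    extracted = list(rows)

    # Transform: drop incomplete records and normalise the text columns.
    cleaned = [
        (name.strip(), region.strip().upper(), float(amount))
        for name, region, amount in extracted
        if name and region and amount is not None
    ]

    # Load: write the cleaned rows into an in-memory database table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (customer TEXT, region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", cleaned)

    # Analyse: total amount per region.
    query = "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    totals = dict(conn.execute(query))
    conn.close()
    return totals


if __name__ == "__main__":
    print(run_etl(SAMPLE_ROWS))
```

In KNIME each of these stages would be a separate node configured through the graphical interface instead of written by hand.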


History

Development of KNIME began in January 2004 with a team of software engineers at the University of Konstanz, as a proprietary product. The original developer team, headed by Michael Berthold, came from a company in Silicon Valley that provided software for the pharmaceutical industry. The initial goal was to create a modular, highly scalable and open data processing platform that allowed easy integration of different data loading, processing, transformation, analysis and visual exploration modules without a focus on any particular application area. The platform was intended to be a collaboration and research platform and also to serve as an integration platform for various other data analysis projects.

The first version of KNIME was released in 2006. Several pharmaceutical companies started using it, and a number of life science software vendors began integrating their tools into KNIME. Later that year, after an article in the German magazine ''c't'', users from a number of other areas adopted it. As of 2012, KNIME was in use by over 15,000 actual users (counting not downloads but users who regularly retrieve updates when they become available), not only in the life sciences but also at banks, publishers, car manufacturers, telcos, consulting firms and various other industries, as well as at a large number of research groups worldwide. Updates to KNIME Server and KNIME Big Data Extensions provide support for Apache Spark 2.3, Parquet and HDFS-type storage. KNIME has been placed as a leader for Data Science and Machine Learning Platforms in Gartner's Magic Quadrant for six consecutive years.


Internals

KNIME allows users to visually create data flows (or pipelines), selectively execute some or all analysis steps, and later inspect the results and models using interactive widgets and views. KNIME is written in Java and based on Eclipse. It makes use of its extension mechanism to add plugins that provide additional functionality. The core version already includes hundreds of modules for data integration (file I/O and database nodes supporting all common database management systems through JDBC or native connectors: SQLite, MS-Access, SQL Server, MySQL, Oracle, PostgreSQL, Vertica and H2), data transformation (filter, converter, splitter, combiner, joiner), as well as the commonly used methods of statistics, data mining, analysis and text analytics. Visualization is supported with the free Report Designer extension. KNIME workflows can be used as data sets to create report templates that can be exported to document formats such as doc, ppt, xls, pdf and others.

Other capabilities of KNIME are:

* KNIME's core architecture allows processing of large data volumes that are limited only by the available hard disk space, not by the available RAM. For example, KNIME allows analysis of 300 million customer addresses, 20 million cell images and 10 million molecular structures.
* Additional plugins allow the integration of methods for text mining, image mining, time series analysis and network analysis.
* KNIME integrates various other open-source projects, e.g. machine learning algorithms from Weka, H2O.ai, Keras, Spark, the R project and LIBSVM, as well as plotly, JFreeChart, ImageJ and the Chemistry Development Kit.

KNIME is implemented in Java; nevertheless, it allows for wrappers that call other code, in addition to providing nodes that run Java, Python, R, Ruby and other code fragments.
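The idea of a data flow whose steps can be executed selectively and inspected afterwards can be sketched in a few lines. The example below is an illustrative model only and does not use or reproduce KNIME's node API; the `Node` class, the `build_workflow` function and the node names are hypothetical. Executing one node runs only the upstream nodes it depends on, and each node keeps its result so that it can be inspected later.

```python
class Node:
    """One processing step; `func` receives the outputs of `inputs` in order."""

    def __init__(self, name, func, inputs=()):
        self.name = name
        self.func = func
        self.inputs = list(inputs)
        self.result = None
        self.executed = False

    def execute(self):
        # Run upstream nodes first, reusing results that are already cached.
        if not self.executed:
            args = [node.execute() for node in self.inputs]
            self.result = self.func(*args)
            self.executed = True
        return self.result


def build_workflow():
    """Build a small workflow: one reader feeding two independent branches."""
    reader = Node("reader", lambda: [3, 1, 4, 1, 5, 9, 2, 6])
    row_filter = Node("filter", lambda rows: [r for r in rows if r > 2], [reader])
    stats = Node(
        "stats",
        lambda rows: {"count": len(rows), "mean": sum(rows) / len(rows)},
        [row_filter],
    )
    sorter = Node("sorter", lambda rows: sorted(rows), [reader])
    return {node.name: node for node in (reader, row_filter, stats, sorter)}


if __name__ == "__main__":
    workflow = build_workflow()
    # Executing "stats" runs reader -> filter -> stats, but not "sorter".
    print(workflow["stats"].execute())
    print({name: node.executed for name, node in workflow.items()})
```

Keeping the result on each node is what makes later inspection possible in this sketch: after a partial run, the intermediate output of the filter step is still available without re-executing anything.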


License

As of version 2.1, KNIME is released under GPLv3 with an exception that allows others to use the well-defined node API to add proprietary extensions. This also allows commercial software vendors to add wrappers that call their tools from KNIME.


KNIME Courses

KNIME provides two lines of online courses, one on data wrangling and one on data science.


See also

* Weka - machine-learning algorithms that can be integrated in KNIME
* ELKI - data mining framework with many clustering algorithms
* Keras - neural network library
* Orange - an open-source data visualization, machine learning and data mining toolkit with a similar visual programming front-end
* List of free and open-source software packages


External links


* KNIME Homepage
* KNIME Hub - official community platform to search and find nodes, components and workflows, and to collaborate on new solutions
* Nodepit - KNIME node collection supporting versioning and node installation