Pentaho Data Integration

Pentaho is business intelligence (BI) software that provides data integration, OLAP services, reporting, information dashboards, data mining and extract, transform, load (ETL) capabilities. Its headquarters are in Orlando, Florida. Pentaho was acquired by Hitachi Data Systems in 2015 and in 2017 became part of Hitachi Vantara.


Overview

Pentaho is a Java framework for building business intelligence solutions. Although best known for its Business Analysis Server (formerly the Business Intelligence Server), the Pentaho software is essentially a collection of Java classes with specific functionality, on top of which any BI solution can be built. The one exception to this model is the ETL tool Pentaho Data Integration (PDI, formerly known as Kettle). PDI is a set of tools used to design data flows that can be run either on a server or as standalone processes. It includes Kitchen, a job and transformation runner, and Spoon, a graphical user interface for designing such jobs and transformations. Features such as reporting and OLAP are provided by subprojects integrated into the Pentaho framework, such as the Mondrian OLAP engine and JFreeReport. These projects have since been brought under Pentaho's stewardship. Some of them also have standalone clients, such as Pentaho Report Designer, a front end for JFreeReport, and Pentaho Schema Workbench, a GUI for writing the XML schemas that Mondrian uses to serve OLAP cubes. Pentaho offers enterprise and community editions of this software. The enterprise edition is obtained through an annual subscription and includes extra features and support not found in the community edition. Pentaho's core offering is frequently extended by add-on products, usually in the form of plug-ins, from the company and the broader community of users.


Products


Server applications

Pentaho Enterprise Edition (EE) and Pentaho Community Edition (CE).


Desktop/client applications


Community driven, open-source Pentaho server plug-ins

All of these plug-ins function with Pentaho Enterprise Edition (EE) and Pentaho Community Edition (CE).


Licensing

Pentaho follows an open core business model. It provides two different editions of Pentaho Business Analytics: a community edition and an enterprise edition. The enterprise edition must be purchased on a subscription model, which includes support, services, and product enhancements via annual subscription (Torben Pedersen and Mukesh Mohania, "Data Warehousing and Knowledge Discovery", Heidelberg, Germany: Springer Science and Business Media, 2009, pp. 296-298; retrieved April 6, 2012). The enterprise edition is available under a commercial license, which comes in three levels: Enterprise, Premium and Standard. The community edition is a free open source product licensed under the GNU General Public License version 2.0 (GPLv2), GNU Lesser General Public License version 2.0 (LGPLv2), and Mozilla Public License 1.1 (MPL 1.1).


Recognition

* InfoWorld Bossie Award 2008, 2009, 2010, 2011, 2012
* Ventana Research Leadership Award 2010 for StoneGate Senior Care
* CRN Emerging Technology Vendor 201
* ROI Awards 2012 - Nucleus Research


See also

* Nutch - an effort to build an open source search engine based on Lucene and Hadoop, also created by Doug Cutting
* Apache Accumulo - Secure Big Table
* HBase - Bigtable-model database
* Hypertable - HBase alternative
* MapReduce - Google's fundamental data filtering algorithm
* Apache Mahout - machine learning algorithms implemented on Hadoop
* Apache Cassandra - a column-oriented database that supports access from Hadoop
* HPCC - LexisNexis Risk Solutions High Performance Computing Cluster
* Sector/Sphere - open-source distributed storage and processing
* Cloud computing
* Big data
* Data-intensive computing


References


External links
