Vertica Systems is an analytic database management software company. Vertica was founded in 2005 by the database researcher Michael Stonebraker, with Andrew Palmer as the founding CEO. Ralph Breslauer and Christopher P. Lynch served as later CEOs.
Lynch joined as chairman and CEO in 2010 and was responsible for Vertica's acquisition by Hewlett Packard in March 2011. The acquisition expanded the HP Software portfolio for enterprise companies and the public sector group. As part of the merger of Micro Focus and the Software division of Hewlett Packard Enterprise, Vertica joined Micro Focus in September 2017.
Products
The column-oriented Vertica Analytics Platform was designed to manage large, fast-growing volumes of data and to provide fast query performance for data warehouses and other query-intensive applications. The product claims to greatly improve query performance over traditional relational database systems, and to provide high availability and exabyte scalability on commodity enterprise servers. Vertica runs on multiple cloud computing systems as well as on Hadoop nodes. Vertica's Eon Mode separates compute from storage, using S3 object storage and dynamic allocation of compute nodes.
Vertica's design features include:
* Column-oriented storage organization, which increases performance of sequential record access at the expense of common transactional operations such as single record retrieval, updates, and deletes.
* Massively parallel processing (MPP) architecture to distribute queries on independent nodes and scale performance linearly.
* Standard SQL interface with many analytics capabilities built in, such as time series gap filling/interpolation, event-based windowing and sessionization, pattern matching, event series joins, statistical computation (e.g., regression analysis), and geospatial analysis.
* In-database machine learning including categorization, fitting and prediction without down-sampling and data movement. Vertica offers a variety of in-database algorithms, including linear regression, logistic regression, k-means clustering, Naive Bayes classification, random forest decision trees, XGBoost, and support vector machine regression and classification. It also allows deployment of ML models to multiple clusters.
* High compression, possible because columns of homogeneous datatype are stored together and because updates to the main store are batched.
* Automated workload management, data replication, server recovery, query optimization, and storage optimization.
* Native integration with open source big data technologies like Apache Kafka and Apache Spark.
* Support for standard programming interfaces, including ODBC, JDBC, ADO.NET, and OLEDB.
* High-performance and parallel data transfer to statistical tools and built-in machine learning algorithms.
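The column-storage and compression points in the list above can be illustrated with a minimal Python sketch. It is not Vertica's storage engine: the sample table, the `to_columns` and `run_length_encode` helpers, and the use of run-length encoding as the compression scheme are all assumptions made for illustration. The sketch shows that an aggregate over one column only needs that column's values, and that a column of homogeneous, sorted values collapses into a few runs.

```python
from itertools import groupby

# Hypothetical sample rows: (region, product, units_sold).
ROWS = [
    ("east", "widget", 3),
    ("east", "gadget", 5),
    ("east", "widget", 2),
    ("west", "widget", 7),
    ("west", "gadget", 1),
]

COLUMN_NAMES = ("region", "product", "units_sold")


def to_columns(rows, names):
    """Pivot row tuples into one list per column (a toy column store)."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def run_length_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


columns = to_columns(ROWS, COLUMN_NAMES)

# An aggregate over one column reads only that column's values.
total_units = sum(columns["units_sold"])

# Homogeneous, sorted column values form long runs that encode compactly.
encoded_region = run_length_encode(columns["region"])

print(total_units)
print(encoded_region)
```

The same layout explains the trade-off named in the first bullet: fetching or updating one complete record means touching every column list, which is why single-record operations cost more in a column store.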
Vertica's specialized approach aims to significantly increase query performance in data warehouses, while reducing hardware costs.
Since 2011, Vertica has offered a limited-capacity community edition for free.
In July 2021, Vertica announced a SaaS offering, Vertica Accelerator, running on Amazon AWS.
Optimizations
Vertica originated as the C-Store column-oriented database, an open source research project at MIT and other universities, published in 2005.
Vertica runs on clusters of commodity servers or on commercial clouds. It integrates with Hadoop, using HDFS.
In 2018, Vertica introduced Vertica in Eon Mode, an architecture that separates compute from storage. The Eon architecture allows compute capacity to be increased and decreased elastically as workloads change. It also allows instantiation of multiple isolated sub-clusters dedicated to different workloads while maintaining a single shared data repository. It operates on shared object storage in the cloud, and also runs on object-storage-compatible hardware on-premises for private cloud implementations.
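The separation of compute and storage described here can be pictured with a small Python model. This is a conceptual sketch only, not Vertica code or its API: the `SharedStorage` and `Subcluster` classes, the `resize` and `total` methods, and the key name `sales.units_sold` are all hypothetical. It shows two isolated subclusters reading one shared copy of the data, and one of them being resized without moving that data.

```python
from dataclasses import dataclass, field


@dataclass
class SharedStorage:
    """Stand-in for a shared object store holding the single copy of the data."""
    objects: dict = field(default_factory=dict)

    def put(self, key, values):
        self.objects[key] = list(values)

    def get(self, key):
        return self.objects[key]


@dataclass
class Subcluster:
    """A group of compute nodes dedicated to one workload (hypothetical model)."""
    name: str
    storage: SharedStorage
    node_count: int = 1

    def resize(self, node_count):
        # Compute scales without copying or moving the stored data.
        if node_count < 1:
            raise ValueError("a subcluster needs at least one node")
        self.node_count = node_count

    def total(self, key):
        # Split the scan across nodes, then combine the partial sums.
        values = self.storage.get(key)
        partials = [sum(values[i::self.node_count]) for i in range(self.node_count)]
        return sum(partials)


storage = SharedStorage()
storage.put("sales.units_sold", [3, 5, 2, 7, 1])

# Two isolated subclusters, each serving a different workload.
dashboards = Subcluster("dashboards", storage, node_count=2)
etl = Subcluster("etl", storage, node_count=1)

etl.resize(4)  # scale one workload up; the other is unaffected
```

Because both subclusters hold a reference to the same `storage` object, resizing `etl` changes only how the scan is divided, not where the data lives or what `dashboards` sees.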
Version 10.1.1 of Vertica introduced Docker and Kubernetes support.
Many BI, data visualization, and ETL tools work with the Vertica Analytics Platform. Vertica supports Kafka for streaming data ingestion.
In 2021, Vertica released a connector for Spark.
Vertica also integrates with Grafana, Helm, Go, and Distributed R.
Company events
In January 2008, Sybase filed a patent-infringement lawsuit against Vertica. In January 2010, Vertica prevailed in a preliminary hearing, and in June 2010, Sybase and Vertica resolved the suit, with the court dismissing all infringement claims.
[Vertica Press Release, "Vertica Resolves Sybase Patent Lawsuits" http://www.vertica.com/news/press/vertica-resolves-sybase-patent-lawsuits/]
Since 2013, Vertica has held an annual user conference, now called Vertica Unify.
External links
* Official website
* Unofficial Vertica User Google Group
* Vertica GitHub
* Vertica on DockerHub