Prometheus is a free software application used for event monitoring and alerting. It records real-time metrics in a time series database (allowing for high dimensionality) built using an HTTP pull model, with flexible queries and real-time alerting. The project is written in Go and licensed under the Apache License 2.0, with source code available on GitHub, and is a graduated project of the Cloud Native Computing Foundation, along with Kubernetes and Envoy.


History

Prometheus was developed at SoundCloud starting in 2012, when the company discovered that its existing metrics and monitoring solutions (using StatsD and Graphite) were not sufficient for its needs. Specifically, the company identified needs that Prometheus was built to meet, including a multi-dimensional data model, operational simplicity, scalable data collection, and a powerful query language, all in a single tool. The project was open source from the beginning and began to be used by Boxever and Docker users as well, despite not being explicitly announced. Prometheus was inspired by Borgmon, the monitoring tool used at Google. By 2013, Prometheus was introduced for production monitoring at SoundCloud. The official public announcement was made in January 2015. In May 2016, the Cloud Native Computing Foundation accepted Prometheus as its second incubated project, after Kubernetes. The blog post announcing this stated that the tool was in use at many companies including DigitalOcean, Ericsson, CoreOS, Weaveworks, Red Hat, and Google. Prometheus 1.0 was released in July 2016. Subsequent versions were released through 2016 and 2017, leading to Prometheus 2.0 in November 2017. In August 2018, the Cloud Native Computing Foundation announced that the Prometheus project had graduated.


Architecture

A typical monitoring platform with Prometheus is composed of multiple tools:
* Multiple exporters, which typically run on the monitored host to export local metrics.
* Prometheus, to centralize and store the metrics.
* Alertmanager, to trigger alerts based on those metrics.
* Grafana, to produce dashboards.
* PromQL, the query language used to create dashboards and alerts.


Data storage format

Prometheus data is stored in the form of metrics, with each metric having a name that is used for referencing and querying it. Each metric can be broken down by an arbitrary number of key=value pairs (labels). Labels can include information on the data source (which server the data is coming from) and other application-specific breakdown information such as the HTTP status code (for metrics related to HTTP responses), query method (GET versus POST), endpoint, etc. The ability to specify an arbitrary list of labels and to query based on these in real time is why Prometheus' data model is called multi-dimensional. Prometheus stores data locally on disk, which allows fast storage and fast querying. Metrics can also be written to remote storage.
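The label-based model described above can be illustrated with a small sketch. This is only a toy in-memory model, not how Prometheus stores data on disk; the `Series` class, the `select` helper, and the sample metric and label values are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Series:
    """One time series: a metric name plus a set of key=value labels."""
    name: str
    labels: dict
    samples: list = field(default_factory=list)  # (timestamp, value) pairs


def select(series_list, name, **matchers):
    """Return the series with the given metric name whose labels equal every matcher."""
    return [
        s for s in series_list
        if s.name == name
        and all(s.labels.get(k) == v for k, v in matchers.items())
    ]


# Hypothetical data: one metric name, broken down by three labels.
store = [
    Series("http_requests_total", {"instance": "web-1", "method": "GET", "code": "200"}, [(0, 10.0)]),
    Series("http_requests_total", {"instance": "web-1", "method": "POST", "code": "500"}, [(0, 2.0)]),
    Series("http_requests_total", {"instance": "web-2", "method": "GET", "code": "200"}, [(0, 7.0)]),
]

get_ok = select(store, "http_requests_total", method="GET", code="200")
print([s.labels["instance"] for s in get_ok])  # ['web-1', 'web-2']
```

Each distinct combination of label values is its own series, so adding a label adds a dimension along which the same metric can be filtered.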


Data collection

Prometheus collects data in the form of time series. The time series are built through a pull model: the Prometheus server queries a list of data sources (sometimes called exporters) at a specific polling frequency. Each data source serves the current values of its metrics at the endpoint queried by Prometheus. The Prometheus server then aggregates data across the data sources. Prometheus has a number of mechanisms to automatically discover resources that should be used as data sources.
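A single pass of the pull model might be sketched as follows. This is a simplified illustration, not the Prometheus server's implementation: `parse_exposition`, `http_fetch`, and `scrape_once` are hypothetical helper names, the parser assumes one sample per line with no timestamps and no escaped quotes in label values, and attaching the target URL as an `instance` label is an assumption made for the example.

```python
import re
import time
import urllib.request

LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_exposition(text):
    """Parse a simplified subset of a text-based metrics format.

    Assumes one sample per line, no timestamps, no escaped quotes in labels.
    """
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        head, value = line.rsplit(" ", 1)
        name, _, label_part = head.partition("{")
        labels = dict(LABEL_RE.findall(label_part))
        samples.append((name, labels, float(value)))
    return samples


def http_fetch(url):
    """Fetch the current metric values from one target over HTTP."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


def scrape_once(targets, fetch=http_fetch, now=time.time):
    """Pull every target once and return timestamped samples."""
    collected = []
    for url in targets:
        timestamp = now()
        for name, labels, value in parse_exposition(fetch(url)):
            # Record which target the sample came from, as a label.
            collected.append((timestamp, name, {**labels, "instance": url}, value))
    return collected
```

Calling `scrape_once` repeatedly at a fixed interval yields one new point per series per pass, which is how the time series accumulate. The `fetch` and `now` parameters are injectable so the function can be exercised without a network.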


PromQL

Prometheus provides its own query language, PromQL (Prometheus Query Language), that lets users select and aggregate data. PromQL is designed to work with a time series database and therefore provides time-related query functionality. Examples include the rate() function, the instant vector, and the range vector, which can provide many samples for each queried time series. Prometheus has four clearly defined metric types around which the PromQL components revolve. The four types are:
* Gauge
* Counter
* Histogram
* Summary
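To give a feel for what a rate over a range vector of counter samples means, here is a rough sketch. It is not the actual rate() implementation: `simple_rate` is a hypothetical function that only sums the increases between consecutive samples and treats any drop in value as a counter reset, whereas the real function also extrapolates toward the edges of the query window.

```python
def simple_rate(samples):
    """Per-second average increase of a counter over a range of samples.

    `samples` is a list of (timestamp_seconds, value) pairs in time order,
    i.e. one series of a range vector. A drop in value is treated as a
    counter reset, so the post-reset value counts as new increase.
    """
    if len(samples) < 2:
        return None  # not enough points to compute a rate
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        increase += cur - prev if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None


print(simple_rate([(0, 0.0), (60, 120.0)]))  # 2.0
```

Reset handling matters because a counter restarts from zero when the monitored process restarts; without it, a restart would show up as a large negative rate.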


Alerts and monitoring

Alerts are configured in Prometheus as rules, each specifying a condition that must hold for a specific duration before the alert triggers. When alerts trigger, they are forwarded to the Alertmanager service. Alertmanager can include logic to silence alerts and also to forward them to email, Slack, or notification services such as PagerDuty. Some other messaging systems, such as Microsoft Teams, can be configured using the Alertmanager Webhook Receiver as a mechanism for external integrations. Prometheus Alerts can also be used to receive alerts directly on Android devices, without requiring any target configuration in Alertmanager.
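The "condition held for a duration" behavior can be sketched as a small state machine. This is an illustrative model only: `alert_states` is a hypothetical function, and the state names "inactive", "pending", and "firing" are used here as labels for the three phases rather than as a claim about internal implementation.

```python
def alert_states(evaluations, for_seconds):
    """Return the alert state after each evaluation.

    `evaluations` is a list of (timestamp_seconds, condition_is_true) pairs
    in time order. The alert is "pending" once the condition becomes true
    and "firing" once it has stayed true for at least `for_seconds`.
    """
    states = []
    active_since = None
    for timestamp, condition in evaluations:
        if not condition:
            active_since = None  # any false evaluation resets the timer
            states.append("inactive")
            continue
        if active_since is None:
            active_since = timestamp
        held = timestamp - active_since
        states.append("firing" if held >= for_seconds else "pending")
    return states


evaluations = [(0, True), (60, True), (120, True), (180, False)]
print(alert_states(evaluations, for_seconds=120))
# ['pending', 'pending', 'firing', 'inactive']
```

Requiring the condition to hold across several evaluations filters out brief spikes, so only sustained problems would be forwarded to a notification service.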


Dashboards

Prometheus is not intended as a dashboarding solution. Although it can be used to graph specific queries, it is not a full-fledged dashboarding solution and needs to be connected to Grafana to generate dashboards; this has been cited as a disadvantage due to the additional setup complexity.


Interoperability

Prometheus favors white-box monitoring. Applications are encouraged to publish (export) internal metrics to be collected periodically by Prometheus. Exporters and agents are available to provide metrics for various applications. To ease migration from other systems, Prometheus interoperates with several monitoring and administration protocols: Graphite, StatsD, SNMP, JMX, and CollectD. Prometheus focuses on the availability of the platform and basic operations. The metrics are typically stored for a few weeks. For long-term storage, the metrics can be streamed to remote storage solutions.


Standardization into OpenMetrics

There is an effort to promote the Prometheus exposition format into a standard known as OpenMetrics. Some products have adopted the format: InfluxData's TICK suite, InfluxDB, Google Cloud Platform, and DataDog.


Usage

Prometheus was first used in-house at SoundCloud, where it was developed, for monitoring the company's systems. The Cloud Native Computing Foundation has a number of case studies of other companies using Prometheus. These include digital hosting service DigitalOcean, digital festival DreamHack, and email and contact migration service ShuttleCloud. Separately, Pandora Radio has mentioned using Prometheus to monitor its data pipeline. GitLab provides a Prometheus integration guide for exporting GitLab metrics to Prometheus; the integration has been activated by default since version 9.0.


Conferences

A variety of conferences and co-located events focused on Prometheus and its ecosystem have been held:
* PromCon 2016, August 25 & 26, Berlin, 80 attendees (sold out)
* PrometheusDay 2016, Seattle
* PromCon 2017, August 17 & 18, Munich, 220 attendees (sold out)
* PromCon 2018, August 09 & 10, Munich, 220 attendees (sold out)
* PromCon 2019, November 07 & 08, Munich, 220 attendees (sold out)
* PromCon 2020, July 14 - 16, Online
* PromCon 2021, May 3, Online
* PromCon North America 2021, October 11, Los Angeles
* PrometheusDay Europe 2022, May 17, Valencia, ~400 attendees
* PrometheusDay North America 2022, October 25, Detroit
* PromCon 2022, November 8 & 9, Munich


See also

* Check MK
* Ganglia (software)
* Zabbix
* Comparison of network monitoring systems
* List of systems management systems


Further reading

* Kaewkasi, Chanwit (2016). Native Docker Clustering with Swarm. ISBN 978-1786469755.