Apache IoTDB is a column-oriented, open-source time-series database (TSDB) management system written in Java.
It has both edge and cloud versions, provides an optimized columnar file format for efficient time-series data storage, and offers a TSDB with a high ingestion rate, low-latency queries and data analysis support. It is specially optimized for time-series operations such as aggregation queries, downsampling and sub-sequence similarity search. The name IoTDB comes from Internet of Things (IoT) Database: it was designed as an IoT-native TSDB that addresses the pain points of typical IoT scenarios, including massive data generation, high-frequency sampling, out-of-order data, specific analytics requirements, high storage and operation-and-maintenance costs, and the low computational power of IoT devices.
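To make the downsampling and aggregation operations named above concrete, the following is a minimal plain-Java sketch that averages raw points into fixed time windows. It is an illustration only and does not use or reproduce IoTDB's query engine; the class name `Downsampler` and the method `averageByWindow` are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only: windowed averaging in plain Java, not IoTDB code.
class Downsampler {
    /**
     * Groups points into windows of windowMs milliseconds and averages each window.
     * Keys of the result are window start times; input order does not matter.
     */
    static TreeMap<Long, Double> averageByWindow(long[] timestamps, double[] values, long windowMs) {
        if (timestamps.length != values.length || windowMs <= 0) {
            throw new IllegalArgumentException("arrays must match and windowMs must be positive");
        }
        // window start -> {sum, count}
        TreeMap<Long, double[]> acc = new TreeMap<>();
        for (int i = 0; i < timestamps.length; i++) {
            long start = Math.floorDiv(timestamps[i], windowMs) * windowMs;
            double[] sumCount = acc.computeIfAbsent(start, k -> new double[2]);
            sumCount[0] += values[i];
            sumCount[1] += 1;
        }
        TreeMap<Long, Double> out = new TreeMap<>();
        for (Map.Entry<Long, double[]> e : acc.entrySet()) {
            out.put(e.getKey(), e.getValue()[0] / e.getValue()[1]);
        }
        return out;
    }
}
```

Calling `Downsampler.averageByWindow(timestamps, values, 1000)` reduces millisecond-stamped samples to one average per second. Because the accumulator is keyed by window start, points that arrive out of order still land in the correct window.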
History
Apache IoTDB was initiated by Prof. Jianmin Wang's team in the School of Software at Tsinghua University.
In 2011, the team chose open-source NoSQL technology instead of Oracle for a project that managed large volumes of machine data, and found NoSQL insufficient for industrial internet of things (IIoT) scenarios. The team started to develop a data management system and, in March 2016, formally proposed TsFile, an optimized, compact columnar file storage format for time-series data. The source code was then published on GitHub.
In June 2016, building on TsFile, the team began to develop IoTDB, an IIoT database that supports real-time reads, writes and analysis.
In November 2018, IoTDB entered the incubator at the Apache Software Foundation (ASF).
On September 16, 2020, the ASF officially issued a resolution promoting Apache IoTDB to a Top-Level Project (TLP), following a public discussion and vote by the community and a show-of-hands vote by the board.
Architecture

The complete storage system of Apache IoTDB follows a client-server architecture, consisting of the IoTDB engine (server) and several components that make up the IoTDB suite (client). The IoTDB suite provides functions such as data collection, data writing, data storage, data query, data visualization and data analysis. Data collected by sensors is continuously persisted on the server, where it can be queried natively or shipped to other open-source platforms for analysis. In particular, IoTDB provides a mode called "Edge-Cloud Cooperation", in which the Sync Tool synchronizes collected data from one IoTDB instance to another at a user-configured interval.
Users can use JDBC to write time-series data to a local or remote IoTDB instance. This data may represent system state (such as server load, CPU and memory), message queue data, time-series data from applications, or other time-series data in the database. The data can be written directly to TsFile, either locally or on the Hadoop Distributed File System (HDFS).
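As a rough sketch of the JDBC write path described above, the class below builds an insert statement and sends it through the standard `java.sql` API. Several details are assumptions rather than facts taken from this article: the URL form `jdbc:iotdb://127.0.0.1:6667/`, the `INSERT INTO <device>(timestamp, <measurement>) VALUES (...)` syntax, and the series path `root.demo.device1` are illustrative and should be checked against the IoTDB documentation for the version in use. The IoTDB JDBC driver must also be on the classpath for `write` to connect.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Locale;

// Sketch under stated assumptions; not an official IoTDB client example.
class JdbcWriteSketch {
    // Assumed address of a local IoTDB instance; adjust for a real deployment.
    static final String URL = "jdbc:iotdb://127.0.0.1:6667/";

    /** Builds an insert statement for a single data point (assumed SQL-like syntax). */
    static String insertSql(String device, String measurement, long timestamp, double value) {
        return String.format(Locale.ROOT, "INSERT INTO %s(timestamp, %s) VALUES (%d, %s)",
                device, measurement, timestamp, Double.toString(value));
    }

    /** Opens a connection, executes one statement, and closes all resources. */
    static void write(String user, String password, String sql) throws SQLException {
        try (Connection conn = DriverManager.getConnection(URL, user, password);
             Statement stmt = conn.createStatement()) {
            stmt.execute(sql);
        }
    }
}
```

A caller would pass the result of `JdbcWriteSketch.insertSql("root.demo.device1", "temperature", 1700000000000L, 21.5)` to `write` together with valid credentials. Building SQL by string formatting is acceptable only for trusted, program-generated identifiers and values.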
TsFile is a columnar storage file format developed for accessing, compressing and storing time-series data in Apache IoTDB. Its structure is based on the LSM-tree, which reduces computational resource usage and improves the performance of Apache IoTDB.
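The general LSM-tree idea can be sketched in a few lines: writes go to a sorted in-memory buffer, which is periodically frozen into an immutable sorted run, and reads merge the runs with the buffer. The class below is a generic teaching sketch of that pattern, assuming one numeric series and an arbitrary flush threshold; it is not the TsFile layout or IoTDB's storage engine, and the name `TinyLsmSeries` is hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Generic LSM-style sketch for one series; not the TsFile format.
class TinyLsmSeries {
    private final int flushThreshold;
    // Mutable, time-sorted write buffer.
    private TreeMap<Long, Double> memTable = new TreeMap<>();
    // Immutable sorted runs, oldest first (stand-ins for on-disk files).
    private final List<NavigableMap<Long, Double>> runs = new ArrayList<>();

    TinyLsmSeries(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    /** Buffers a point; freezes the buffer into a new run once it is full. */
    void put(long timestamp, double value) {
        memTable.put(timestamp, value);
        if (memTable.size() >= flushThreshold) {
            runs.add(Collections.unmodifiableNavigableMap(memTable));
            memTable = new TreeMap<>();
        }
    }

    /** Merges all runs and the buffer in time order; newer writes overwrite older ones. */
    TreeMap<Long, Double> scan() {
        TreeMap<Long, Double> merged = new TreeMap<>();
        for (NavigableMap<Long, Double> run : runs) {
            merged.putAll(run);
        }
        merged.putAll(memTable);
        return merged;
    }

    int runCount() {
        return runs.size();
    }
}
```

Because every run is already sorted by timestamp, out-of-order arrivals only cost a sorted insert into the buffer, and a read is a merge of sorted inputs rather than a full re-sort.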
TsFile can be written to HDFS, enabling data processing tasks such as anomaly detection and machine learning on the Hadoop or Spark data processing platforms.
For data written to HDFS or a local TsFile, users can use TsFile-Hadoop-Connector or TsFile-Spark-Connector to let Hadoop or Spark process the data. The analysis results can be written back to TsFile in the same way. IoTDB and TsFile also provide client tools that let users write and view data in SQL form, script form and graphical form.
Features
Flexible and cross-platform deployment
IoTDB is designed to fit three deployment scenarios: 1) file-based storage or an embedded time-series database on an edge appliance such as a Raspberry Pi, 2) a standalone TSDB on an industrial PC, and 3) a distributed TSDB or a Hadoop cluster with TsFile. IoTDB provides a one-click installation tool for the cloud, a terminal tool that can be used as soon as it is decompressed, and a bridging tool between cloud platforms and terminal tools (the Data Synchronization Tool).
Low storage cost
IoTDB achieves a high compression ratio on disk, so it can store the same amount of data at a lower hardware storage cost.
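One reason time-series data compresses well is that sensors often sample at a fixed interval, so consecutive timestamps differ by the same amount. The sketch below shows a generic delta plus run-length encoding that exploits this. It is an assumption-laden illustration of the principle, not the encoding that IoTDB or TsFile actually uses; the class name `DeltaRle` is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Generic delta + run-length sketch; not IoTDB's actual encoder.
class DeltaRle {
    /** Encodes timestamps as [first, delta1, count1, delta2, count2, ...]. */
    static List<Long> encode(long[] ts) {
        List<Long> out = new ArrayList<>();
        if (ts.length == 0) {
            return out;
        }
        out.add(ts[0]);
        int i = 1;
        while (i < ts.length) {
            long delta = ts[i] - ts[i - 1];
            long count = 1;
            // Extend the run while the next gap equals the current delta.
            while (i + 1 < ts.length && ts[i + 1] - ts[i] == delta) {
                i++;
                count++;
            }
            out.add(delta);
            out.add(count);
            i++;
        }
        return out;
    }

    /** Reverses encode by replaying each (delta, count) pair. */
    static long[] decode(List<Long> enc) {
        if (enc.isEmpty()) {
            return new long[0];
        }
        List<Long> out = new ArrayList<>();
        long cur = enc.get(0);
        out.add(cur);
        for (int k = 1; k + 1 < enc.size(); k += 2) {
            long delta = enc.get(k);
            long count = enc.get(k + 1);
            for (long c = 0; c < count; c++) {
                cur += delta;
                out.add(cur);
            }
        }
        return out.stream().mapToLong(Long::longValue).toArray();
    }
}
```

With perfectly regular sampling, any number of timestamps collapses to three numbers: the first timestamp, the interval, and the repeat count.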
Efficient directory structure
IoTDB supports efficient organization of complex time-series data structures from intelligent networked devices, organization of time-series data from devices of the same type, and a fuzzy search strategy over massive, complex directories of time-series data.
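A directory-style organization with fuzzy search can be pictured as dot-separated series paths matched against a pattern with wildcards. The sketch below assumes, purely for illustration, that `*` matches exactly one path level and that paths look like `root.factory.pump1.temperature`; these conventions and the class name `PathPattern` are assumptions, not a description of IoTDB's actual path grammar.

```java
// Illustrative wildcard matcher for dot-separated series paths; assumptions noted above.
class PathPattern {
    /** Returns true if every level of path equals the pattern level or the pattern level is "*". */
    static boolean matches(String pattern, String path) {
        String[] p = pattern.split("\\.");
        String[] s = path.split("\\.");
        if (p.length != s.length) {
            return false; // in this sketch "*" never spans more than one level
        }
        for (int i = 0; i < p.length; i++) {
            if (!p[i].equals("*") && !p[i].equals(s[i])) {
                return false;
            }
        }
        return true;
    }
}
```

Under these assumptions, the pattern `root.factory.*.temperature` selects the temperature series of every device directly under `root.factory`, which is how one query can address all devices of the same type.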
High-throughput read and write
IoTDB supports strong-connection data access for millions of low-power devices, and high-speed data reads and writes for the intelligent networked devices and mixed devices mentioned above. IoTDB currently supports an ingestion rate of up to 30 million data points per second on a single node.
Rich query semantics
IoTDB supports time alignment of time-series data across devices and sensors, computation on time-series fields (such as frequency-domain transformation), and a rich set of aggregation functions over the time dimension.
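Time alignment means presenting several series as rows keyed by a shared timestamp, with a gap wherever a sensor has no sample at that instant. The following plain-Java sketch aligns two series this way. It illustrates the concept only and is not IoTDB's implementation; the class name `TimeAligner` is hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Conceptual sketch of aligning two series by timestamp; not IoTDB code.
class TimeAligner {
    /**
     * Returns rows ordered by timestamp. Each row is {valueFromA, valueFromB},
     * with null where a series has no sample at that timestamp.
     */
    static TreeMap<Long, Double[]> align(Map<Long, Double> a, Map<Long, Double> b) {
        TreeMap<Long, Double[]> rows = new TreeMap<>();
        for (Map.Entry<Long, Double> e : a.entrySet()) {
            rows.computeIfAbsent(e.getKey(), k -> new Double[2])[0] = e.getValue();
        }
        for (Map.Entry<Long, Double> e : b.entrySet()) {
            rows.computeIfAbsent(e.getKey(), k -> new Double[2])[1] = e.getValue();
        }
        return rows;
    }
}
```

The result is the union of both timestamp sets, so downstream aggregation can decide whether to skip, fill or interpolate the null cells.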
Easy to get started
IoTDB supports an SQL-like language, the standard JDBC API, and easy-to-use import/export tools.
Close integration with the open-source ecosystem
IoTDB supports analysis ecosystems such as Hadoop and Spark, and the Grafana visualization tool.
Licensing
The Apache 2.0 License is a permissive free software license written by the Apache Software Foundation. It allows end users to modify parts of the original code, provided that the redistributed code contains the documentation that the license requires.