Data Build Tool

Data build tool (dbt) is an open-source command-line tool that helps analysts and engineers transform data in their warehouse more effectively.


History

dbt started at RJMetrics in 2016 as a way to add basic transformation capabilities to Stitch (acquired by Talend in 2018). The earliest versions of dbt allowed analysts to contribute to the data transformation process while following software engineering best practices. dbt has been open source from the beginning. In 2018, the dbt Labs team (then called Fishtown Analytics) released a commercial product built on top of dbt Core.


Funding

In April 2020, dbt Labs announced its Series A, led by Andreessen Horowitz. In November 2020, it announced its Series B, led by Andreessen Horowitz and Sequoia. In June 2021, dbt Labs raised its Series C, led by Altimeter, Sequoia, and Andreessen Horowitz. In February 2022, the company raised $222 million in its Series D at a $4.2 billion valuation.


Overview

dbt enables analytics engineers to transform data in their warehouses by writing select statements, which dbt turns into tables and views. dbt handles the transformation (T) step of extract, load, transform (ELT) processes: it does not extract or load data, but is designed to be performant at transforming data already inside a warehouse. dbt aims to let analysts work more like software engineers, in line with the dbt viewpoint. dbt uses YAML files to declare properties. A seed is a type of reference table used in dbt for static or infrequently changed data (for example, country codes or lookup tables); seeds are CSV-based and typically stored in a seeds folder.
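
The two ideas above, a seed loaded from CSV and a select statement turned into a view or table, can be sketched in a few lines of Python. This is only an illustration of the concept using the standard-library sqlite3 module as a stand-in warehouse; it is not how dbt itself is implemented. The names load_seed, materialize, country_codes, and countries_upper are hypothetical and were chosen for this example.

```python
import csv
import io
import sqlite3

# Hypothetical seed contents; in a dbt project this would be a CSV file
# kept in the seeds folder.
SEED_CSV = """code,name
DE,Germany
FR,France
JP,Japan
"""

# Hypothetical model: only a select statement, with no DDL of its own.
MODEL_SQL = "select code, upper(name) as name_upper from country_codes"


def load_seed(conn, table, csv_text):
    """Create a table from CSV text, treating every column as text."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    columns = ", ".join(f'"{col}" text' for col in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'create table "{table}" ({columns})')
    conn.executemany(f'insert into "{table}" values ({placeholders})', data)


def materialize(conn, name, select_sql, kind="view"):
    """Wrap a select statement in DDL so it becomes a view or a table."""
    if kind not in ("view", "table"):
        raise ValueError(f"unsupported materialization: {kind}")
    conn.execute(f'create {kind} "{name}" as {select_sql}')


# An in-memory SQLite database stands in for the warehouse (an assumption
# made so the sketch runs without external services).
conn = sqlite3.connect(":memory:")
load_seed(conn, "country_codes", SEED_CSV)
materialize(conn, "countries_upper", MODEL_SQL, kind="view")
result = conn.execute("select * from countries_upper order by code").fetchall()
print(result)
```

Passing kind="table" instead of kind="view" stores the query result rather than the query definition, which mirrors the choice between a table and a view described above.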

