Extract, load, transform

Extract, load, transform (ELT) is an alternative to extract, transform, load (ETL) used with data lake implementations. In contrast to ETL, in ELT models the data is not transformed on entry to the data lake but is stored in its original raw format, which enables faster loading times. However, ELT requires sufficient processing power within the data processing engine to carry out the transformation on demand and return the results in a timely manner. Since the data is not processed on entry to the data lake, the query and schema do not need to be defined a priori (although the schema is often available during load, since many data sources are extracts from databases or similar structured data systems and hence have an associated schema). ELT is a data pipeline model.
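
The load-then-transform order can be illustrated with a minimal Python sketch. It is not tied to any particular data lake product: an in-memory SQLite table stands in for the raw store, and the sample records, the table name `raw_events`, and the functions `load_raw` and `total_per_user` are hypothetical names chosen for illustration only.

```python
import json
import sqlite3
from decimal import Decimal

# Hypothetical raw records, kept exactly as a source system might emit them.
RAW_EVENTS = [
    '{"user": "ana", "amount": "19.90", "currency": "EUR"}',
    '{"user": "ben", "amount": "5.00", "currency": "EUR"}',
    '{"user": "ana", "amount": "0.10", "currency": "EUR", "note": "tip"}',
]


def load_raw(conn, lines):
    """Load step: store every record unchanged as a text payload."""
    conn.execute("CREATE TABLE IF NOT EXISTS raw_events (payload TEXT)")
    conn.executemany(
        "INSERT INTO raw_events (payload) VALUES (?)",
        [(line,) for line in lines],
    )


def total_per_user(conn):
    """Transform step, run on demand: parse and aggregate at query time."""
    totals = {}
    for (payload,) in conn.execute("SELECT payload FROM raw_events"):
        record = json.loads(payload)
        user = record["user"]
        totals[user] = totals.get(user, Decimal("0")) + Decimal(record["amount"])
    return totals


conn = sqlite3.connect(":memory:")  # stand-in for the raw data store
load_raw(conn, RAW_EVENTS)          # no parsing or schema needed at load time
totals = total_per_user(conn)       # structure is imposed only when queried
```

Note that the third record carries an extra `note` field that the load step never has to know about; only the query-time transformation decides which fields matter, at the cost of parsing the raw data on every run.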


Benefits

Some of the benefits of an ELT process include faster loading, since data is stored without prior transformation, and the ability to handle both structured and unstructured data.


Cloud data lake components


Common storage options

*AWS
**Simple Storage Service (S3)
**Amazon RDS
*Azure
**Azure Blob Storage
*GCP
**Google Cloud Storage (GCS)


Querying

*AWS
**Redshift Spectrum
**Athena
**EMR (Presto)
*Azure
**Azure Data Lake
*GCP
**BigQuery


External links

*Dull, Tamara. "The Data Lake Debate: Pro is Up First". ''smartdatacollective.com'', March 20, 2015.
*"ELT: Extract, Load, and Transform: A Complete Guide". Astera Software.