Spatial extract, transform, load (spatial ETL), also known as geospatial transformation and load (GTL), is a process for managing and manipulating geospatial data, for example map data. It is a type of extract, transform, load (ETL) process, with software tools and libraries specialised for geographical information.
A common use of spatial ETL is to convert geographical information from a data source into another format that can be more easily used, for example by importing it into GIS software. A tool may translate data directly from one format to another, or via an intermediate format. Intermediate formats are often used when data transformation must be carried out.
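The idea of translating through an intermediate format can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not how any particular spatial ETL tool works: the sample CSV, the column names `lon` and `lat`, and the function names `extract`, `transform` and `load` are all assumptions made for the example.

```python
import csv
import io
import json

# Hypothetical input: a tiny CSV source with longitude/latitude columns.
SOURCE_CSV = """name,lon,lat
Vancouver,-123.1207,49.2827
Surrey,-122.8490,49.1913
"""

def extract(text):
    """Extract: read rows from the CSV source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: build neutral in-memory features (the intermediate format)."""
    features = []
    for row in rows:
        features.append({
            "geometry": (float(row["lon"]), float(row["lat"])),
            "attributes": {"name": row["name"].strip()},
        })
    return features

def load(features):
    """Load: serialise the features as a GeoJSON FeatureCollection string."""
    return json.dumps({
        "type": "FeatureCollection",
        "features": [
            {
                "type": "Feature",
                "geometry": {"type": "Point", "coordinates": list(f["geometry"])},
                "properties": f["attributes"],
            }
            for f in features
        ],
    })

geojson_text = load(transform(extract(SOURCE_CSV)))
```

Because the intermediate features are independent of both the source and the target format, a different reader or writer could be swapped in without touching the transformation step.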
Origins and history
Although ETL tools for processing non-spatial data have existed for some time, ETL tools that can manage the unique characteristics of spatial data only emerged in the early 1990s.
Spatial ETL tools emerged in the GIS industry to enable interoperability (the exchange of information) between the industry's diverse array of mapping applications and their associated proprietary formats. However, spatial ETL tools are also becoming increasingly important in the realm of management information systems, where they help organizations integrate spatial data with their existing non-spatial databases and leverage their spatial data assets to develop more competitive business strategies.
Traditionally, GIS applications could read or import only a limited number of spatial data formats and offered few specialist transformation tools; the expectation was that users would import the data and then carry out step-by-step transformation or analysis within the GIS application itself. By contrast, spatial ETL does not require the user to import or view the data, and generally carries out its tasks in a single predefined process.
With the push to achieve greater interoperability within the GIS industry, many existing GIS applications are now incorporating spatial ETL tools within their products; the ArcGIS Data Interoperability Extension is an example of this.
Transformation
The transformation phase of a spatial ETL process allows a variety of functions; some of these are similar to standard ETL, but some are unique to spatial data. Spatial data commonly consists of a geographic element and related attribute data; therefore spatial ETL transformations are often described as being either ''geometric transformations'' – transformation of the geographic element – or ''attribute transformations'' – transformations of the related attribute data.
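The distinction between the two kinds of transformation can be illustrated with a small Python sketch. The feature layout (a `geometry` list plus an `attributes` dictionary), the field names and both function names are hypothetical, chosen only to show that one transformation touches the geographic element while the other touches the attribute data.

```python
import copy

# Hypothetical feature: a geographic element plus related attribute data.
feature = {
    "geometry": [(0.0, 0.0), (10.0, 0.0), (10.0, 5.0)],  # a simple line
    "attributes": {"road_name": "main st", "length_ft": 100.0},
}

def geometric_transform(feat, dx, dy):
    """Shift every vertex by (dx, dy); attributes are left untouched."""
    out = copy.deepcopy(feat)
    out["geometry"] = [(x + dx, y + dy) for x, y in feat["geometry"]]
    return out

def attribute_transform(feat):
    """Tidy the name and convert feet to metres; geometry is left untouched."""
    out = copy.deepcopy(feat)
    attrs = out["attributes"]
    attrs["road_name"] = attrs["road_name"].title()
    attrs["length_m"] = attrs.pop("length_ft") * 0.3048
    return out

moved = geometric_transform(feature, dx=100.0, dy=200.0)
cleaned = attribute_transform(feature)
```

Both functions return a modified copy, so the original feature stays available for comparison or for chaining further transformations.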
Common geospatial transformations
* Reprojection: the ability to convert spatial data between one coordinate system and another.
*Spatial transformations: the ability to model spatial interactions and calculate spatial predicates
*Topological transformations: the ability to create topological relationships between disparate datasets
*Resymbolisation: the ability to change the cartographic characteristics of a feature, such as colour or line-style
*Geocoding: the ability to convert attributes of tabular data into spatial data
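Two of the transformations listed above, reprojection and the calculation of a spatial predicate, can be sketched in plain Python. The example assumes a conversion from WGS 84 longitude/latitude to spherical Web Mercator, picked only because its formula is short, and a ray-casting point-in-polygon test as the predicate. The function names are hypothetical, and real tools would normally rely on a dedicated projection and geometry library rather than hand-written formulas.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS 84 semi-major axis, used by Web Mercator

def to_web_mercator(lon_deg, lat_deg):
    """Reproject WGS 84 lon/lat in degrees to spherical Web Mercator metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y

def point_in_polygon(point, ring):
    """Spatial predicate: ray-casting test of a point against a polygon ring."""
    px, py = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        # Count edges that a horizontal ray from the point would cross.
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside

square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
origin_xy = to_web_mercator(0.0, 0.0)
edge_xy = to_web_mercator(180.0, 0.0)
```

The predicate treats the ring as implicitly closed and makes no promise about points lying exactly on an edge, which a production implementation would need to define.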
Additional features
Desirable features of a spatial ETL application are:
*Data comparison: Ability to carry out change detection and perform incremental updates
*Conflict management: Ability to manage conflicts between multiple users of the same data
*Data dissemination: Ability to publish data via the internet or deliver by email regardless of source format
*Semantic processing: Ability to understand the rules of different data formats to minimize user input whilst preserving meaning
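The data comparison feature, change detection followed by an incremental update, can be sketched as follows. This assumes each dataset is a dictionary keyed by a stable feature identifier, which is an assumption of the example rather than a requirement stated above; the function names and sample snapshots are likewise hypothetical.

```python
def detect_changes(old, new):
    """Compare two {feature_id: feature} snapshots and classify differences."""
    added = sorted(k for k in new if k not in old)
    removed = sorted(k for k in old if k not in new)
    changed = sorted(k for k in new if k in old and new[k] != old[k])
    return {"added": added, "removed": removed, "changed": changed}

def apply_incremental_update(target, new, changes):
    """Apply only the detected differences to a copy of the target dataset."""
    updated = dict(target)
    for key in changes["removed"]:
        updated.pop(key, None)
    for key in changes["added"] + changes["changed"]:
        updated[key] = new[key]
    return updated

# Hypothetical snapshots keyed by feature id.
old_snapshot = {
    1: {"geometry": (0.0, 0.0), "name": "A"},
    2: {"geometry": (1.0, 1.0), "name": "B"},
    3: {"geometry": (2.0, 2.0), "name": "C"},
}
new_snapshot = {
    1: {"geometry": (0.0, 0.0), "name": "A"},
    2: {"geometry": (1.5, 1.0), "name": "B"},   # geometry moved
    4: {"geometry": (3.0, 3.0), "name": "D"},   # new feature
}

changes = detect_changes(old_snapshot, new_snapshot)
synced = apply_incremental_update(old_snapshot, new_snapshot, changes)
```

Only features 2, 3 and 4 are touched during the update; feature 1 is carried over unchanged, which is the point of an incremental rather than a full reload.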
Uses
Spatial ETL has a number of distinct uses:
*Data cleansing: The removal of errors within a dataset
*Data merging: The bringing together of multiple datasets into a common framework; conflation is a good example of this
*Data verification: The comparison of multiple datasets for verification and quality assurance purposes
*Data conversion: Conversion between different data formats
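As an illustration of the data cleansing use, the sketch below removes two simple kinds of error from a set of point records: coordinates outside the valid longitude/latitude range and exact duplicates. The record layout, the rules chosen and the function names are assumptions made for the example; real cleansing rules depend on the dataset.

```python
def is_valid_lonlat(lon, lat):
    """Return True when the coordinates fall inside the valid lon/lat range."""
    return -180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0

def cleanse(records):
    """Drop records with out-of-range coordinates or exact duplicates."""
    seen = set()
    kept, rejected = [], []
    for rec in records:
        key = (rec["name"], rec["lon"], rec["lat"])
        if not is_valid_lonlat(rec["lon"], rec["lat"]) or key in seen:
            rejected.append(rec)
            continue
        seen.add(key)
        kept.append(rec)
    return kept, rejected

raw = [
    {"name": "Depot", "lon": -0.1276, "lat": 51.5072},
    {"name": "Depot", "lon": -0.1276, "lat": 51.5072},   # exact duplicate
    {"name": "Swapped", "lon": 51.5072, "lat": -100.0},  # latitude out of range
    {"name": "Yard", "lon": 2.3522, "lat": 48.8566},
]
kept, rejected = cleanse(raw)
```

Returning the rejected records alongside the kept ones lets the same routine support verification, since the rejects can be reviewed rather than silently discarded.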
Examples of spatial ETL tools
* FME (Feature Manipulation Engine)
* GDAL (Geospatial Data Abstraction Library)
See also
* Business intelligence
* Object–relational database
* Spatial database