HOME

TheInfoList



OR:

A data mart is a structure/access pattern specific to ''
data warehouse In computing, a data warehouse (DW or DWH), also known as an enterprise data warehouse (EDW), is a system used for Business reporting, reporting and data analysis and is considered a core component of business intelligence. DWs are central Repos ...
'' environments, used to retrieve client-facing data. The data mart is a subset of the data warehouse and is usually oriented to a specific business line or team. Whereas data warehouses have an enterprise-wide depth, the information in data marts pertains to a single department. In some deployments, each department or business unit is considered the ''owner'' of its data mart including all the ''hardware'', ''software'' and ''data''. This enables each department to isolate the use, manipulation and development of their data. In other deployments where conformed dimensions are used, this business unit owner will not hold true for shared dimensions like customer, product, etc. Warehouses and data marts are built because the information in the database is not organized in a way that makes it readily accessible. This organization requires queries that are too complicated, difficult to access or resource intensive. While transactional databases are designed to be updated, data warehouses or marts are read only. Data warehouses are designed to access large groups of related records. Data marts improve end-user response time by allowing users to have access to the specific type of data they need to view most often, by providing the data in a way that supports the collective view of a group of users. A data mart is basically a condensed and more focused version of a data warehouse that reflects the regulations and process specifications of each business unit within an organization. Each data mart is dedicated to a specific business function or region. This subset of data may span across many or all of an enterprise's functional subject areas. It is common for multiple data marts to be used in order to serve the needs of each individual business unit (different data marts can be used to obtain specific information for various enterprise departments, such as accounting, marketing, sales, etc.). The related term spreadmart is a pejorative describing the situation that occurs when one or more business analysts develop a system of linked
spreadsheets A spreadsheet is a computer application for computation, organization, analysis and storage of data in tabular form. Spreadsheets were developed as computerized analogs of paper accounting worksheets. The program operates on data entered in ce ...
to perform a business analysis, then grow it to a size and degree of complexity that makes it nearly impossible to maintain. The term for this condition is "Excel Hell".


Data mart vs data warehouse

Data warehouse: * Holds multiple subject areas * Holds very detailed information * Works to integrate all data sources * Does not necessarily use a dimensional model but feeds dimensional models. Data mart: * Often holds only one subject area- for example, Finance, or Sales * May hold more summarized data (although it may hold full detail) * Concentrates on integrating information from a given subject area or set of source systems * Is built focused on a dimensional model using a star schema.


Design schemas

*
Star schema In computing, the star schema is the simplest style of data mart schema and is the approach most widely used to develop data warehouses and dimensional data marts. The star schema consists of one or more fact tables referencing any number of dim ...
- fairly popular design choice; enables a
relational database A relational database is a (most commonly digital) database based on the relational model of data, as proposed by E. F. Codd in 1970. A system used to maintain relational databases is a relational database management system (RDBMS). Many relatio ...
to emulate the analytical functionality of a
multidimensional database Online analytical processing, or OLAP (), is an approach to answer multi-dimensional analytical (MDA) queries swiftly in computing. OLAP is part of the broader category of business intelligence, which also encompasses relational databases, repor ...
*
Snowflake schema In computing, a snowflake schema is a logical arrangement of tables in a multidimensional database such that the entity relationship diagram resembles a snowflake shape. The snowflake schema is represented by centralized fact tables which ...
*
Activity schema Activity may refer to: * Action (philosophy), in general * Human activity: human behavior, in sociology behavior may refer to all basic human actions, economics may study human economic activities and along with cybernetics and psychology may st ...
- a
time-series In mathematics, a time series is a series of data points indexed (or listed or graphed) in time order. Most commonly, a time series is a sequence taken at successive equally spaced points in time. Thus it is a sequence of discrete-time data. Exa ...
based schema


Reasons for creating a data mart

* Easy access to frequently needed data * Creates a collective view by a group of users * Improves end-user
response time Response time may refer to: *The time lag between an electronic input and the output signal which depends upon the value of passive components used. * Responsiveness, how quickly an interactive system responds to user input * Response time (biolog ...
* Ease of creation * Lower cost than implementing a full data warehouse * Potential users are more clearly defined than in a full data warehouse * Contains only business essential data and is less cluttered. * It has key data information


Dependent data mart

According to the Inmon school of data warehousing, a dependent data mart is a logical subset ( view) or a physical subset (extract) of a larger
data warehouse In computing, a data warehouse (DW or DWH), also known as an enterprise data warehouse (EDW), is a system used for Business reporting, reporting and data analysis and is considered a core component of business intelligence. DWs are central Repos ...
, isolated for one of the following reasons: *A need refreshment for a special
data model A data model is an abstract model that organizes elements of data and standardizes how they relate to one another and to the properties of real-world entities. For instance, a data model may specify that the data element representing a car be co ...
or
schema The word schema comes from the Greek word ('), which means ''shape'', or more generally, ''plan''. The plural is ('). In English, both ''schemas'' and ''schemata'' are used as plural forms. Schema may refer to: Science and technology * SCHEMA ...
: e.g., to restructure for
OLAP Online analytical processing, or OLAP (), is an approach to answer multi-dimensional analytical (MDA) queries swiftly in computing. OLAP is part of the broader category of business intelligence, which also encompasses relational databases, repor ...
*Performance: to offload the data mart to a separate
computer A computer is a machine that can be programmed to Execution (computing), carry out sequences of arithmetic or logical operations (computation) automatically. Modern digital electronic computers can perform generic sets of operations known as C ...
for greater efficiency or to eliminate the need to manage that workload on the centralized data warehouse. *Security: to separate an authorized data subset selectively *Expediency: to bypass the data governance and authorizations required to incorporate a new application on the Enterprise Data Warehouse *Proving Ground: to demonstrate the viability and ROI (return on investment) potential of an application prior to migrating it to the Enterprise Data Warehouse *Politics: a coping strategy for IT (Information Technology) in situations where a user group has more influence than funding or is not a good citizen on the centralized data warehouse. *Politics: a coping strategy for consumers of data in situations where a data warehouse team is unable to create a usable data warehouse. According to the Inmon school of data warehousing, tradeoffs inherent with data marts include limited
scalability Scalability is the property of a system to handle a growing amount of work by adding resources to the system. In an economic context, a scalable business model implies that a company can increase sales given increased resources. For example, a ...
, duplication of data,
data inconsistency In database systems, consistency (or correctness) refers to the requirement that any given database transaction must change affected data only in allowed ways. Any data written to the database must be valid according to all defined rules, includin ...
with other silos of information, and inability to leverage enterprise sources of data. The alternative school of data warehousing is that of
Ralph Kimball Ralph Kimball (born July 18, 1944) is an author on the subject of data warehousing and business intelligence. He is one of the original architects of data warehousing and is known for long-term convictions that data warehouses must be designed to b ...
. In his view, a data warehouse is nothing more than the union of all the data marts. This view helps to reduce costs and provides fast development, but can create an inconsistent data warehouse, especially in large organizations. Therefore, Kimball's approach is more suitable for small-to-medium corporations.


See also

*
Data warehouse In computing, a data warehouse (DW or DWH), also known as an enterprise data warehouse (EDW), is a system used for Business reporting, reporting and data analysis and is considered a core component of business intelligence. DWs are central Repos ...
* Enterprise architecture *
OLAP cube An OLAP cube is a multi-dimensional array of data. Online analytical processing (OLAP) is a computer-based technique of analyzing data to look for insights. The term ''cube'' here refers to a multi-dimensional dataset, which is also sometimes c ...


References


External links

{{DEFAULTSORT:Data Mart Data warehousing