OLAP cube
   HOME

TheInfoList



OR:

An OLAP cube is a multi-dimensional array of data.
Online analytical processing Online analytical processing, or OLAP (), is an approach to answer multi-dimensional analytical (MDA) queries swiftly in computing. OLAP is part of the broader category of business intelligence, which also encompasses relational databases, repor ...
(OLAP) is a computer-based technique of analyzing data to look for insights. The term ''cube'' here refers to a multi-dimensional dataset, which is also sometimes called a hypercube if the number of dimensions is greater than three.


Terminology

A cube can be considered a multi-dimensional generalization of a two- or three-dimensional
spreadsheet A spreadsheet is a computer application for computation, organization, analysis and storage of data in tabular form. Spreadsheets were developed as computerized analogs of paper accounting worksheets. The program operates on data entered in c ...
. For example, a company might wish to summarize financial data by product, by time-period, and by city to compare actual and budget expenses. Product, time, city and scenario (actual and budget) are the data's dimensions. ''Cube'' is a shorthand for ''multidimensional dataset'', given that data can have an arbitrary number of ''
dimensions In physics and mathematics, the dimension of a mathematical space (or object) is informally defined as the minimum number of coordinates needed to specify any point within it. Thus, a line has a dimension of one (1D) because only one coordin ...
''. The term hypercube is sometimes used, especially for data with more than three dimensions. A cube is not a "cube" in the strict mathematical sense, as the sides are not all necessarily equal. But this term is used widely. A ''Slice'' is a term for a subset of the data, generated by picking a value for one dimension and only showing the data for that value (for instance only the data at one point in time). Spreadsheets are only 2-dimensional, so by (continued) slicing or other techniques, it becomes possible to visualise multidimensional data in them. Each cell of the cube holds a number that represents some ''measure'' of the business, such as sales, profits, expenses, budget and forecast. OLAP data is typically stored in a
star schema In computing, the star schema is the simplest style of data mart schema and is the approach most widely used to develop data warehouses and dimensional data marts. The star schema consists of one or more fact tables referencing any number of dim ...
or
snowflake schema In computing, a snowflake schema is a logical arrangement of tables in a multidimensional database such that the entity relationship diagram resembles a snowflake shape. The snowflake schema is represented by centralized fact tables which ...
in a relational
data warehouse In computing, a data warehouse (DW or DWH), also known as an enterprise data warehouse (EDW), is a system used for reporting and data analysis and is considered a core component of business intelligence. DWs are central repositories of integra ...
or in a special-purpose data management system. Measures are derived from the records in the
fact table In data warehousing, a fact table consists of the measurements, metrics or facts of a business process. It is located at the center of a star schema or a snowflake schema surrounded by dimension tables. Where multiple fact tables are used, thes ...
and dimensions are derived from the
dimension table A dimension is a structure that categorizes facts and measures in order to enable users to answer business questions. Commonly used dimensions are people, products, place and time. (Note: People and time sometimes are not modeled as dimensions.) ...
s.


Hierarchy

The elements of a dimension can be organized as a hierarchy, a set of parent-child relationships, typically where a parent member summarizes its children. Parent elements can further be aggregated as the children of another parent. For example, May 2005's parent is Second Quarter 2005 which is in turn the child of Year 2005. Similarly cities are the children of regions; products roll into product groups and individual expense items into types of expenditure.


Operations

Conceiving data as a cube with hierarchical dimensions leads to conceptually straightforward operations to facilitate analysis. Aligning the data content with a familiar visualization enhances analyst learning and productivity. The user-initiated process of navigating by calling for page displays interactively, through the specification of slices via rotations and drill down/up is sometimes called "slice and dice". Common operations include slice and dice, drill down, roll up, and pivot. ''Slice'' is the act of picking a rectangular subset of a cube by choosing a single value for one of its dimensions, creating a new cube with one fewer dimension. The picture shows a slicing operation: The sales figures of all sales regions and all product categories of the company in the year 2005 and 2006 are "sliced" out of the data cube. ''Dice'': The dice operation produces a subcube by allowing the analyst to pick specific values of multiple dimensions. The picture shows a dicing operation: The new cube shows the sales figures of a limited number of product categories, the time and region dimensions cover the same range as before. ''Drill Down/Up'' allows the user to navigate among levels of data ranging from the most summarized (up) to the most detailed (down). The picture shows a drill-down operation: The analyst moves from the summary category "Outdoor protective equipment" to see the sales figures for the individual products. ''Roll-up'': A roll-up involves summarizing the data along a dimension. The summarization rule might be an
aggregate function In database management, an aggregate function or aggregation function is a function where the values of multiple rows are grouped together to form a single summary value. Common aggregate functions include: * Average (i.e., arithmetic mean) * ...
, such as computing totals along a hierarchy or applying a set of formulas such as "profit = sales - expenses". General aggregation functions may be costly to compute when rolling up: if they cannot be determined from the cells of the cube, they must be computed from the base data, either computing them online (slow) or precomputing them for possible rollouts (large space). Aggregation functions that can be determined from the cells are known as decomposable aggregation functions, and allow efficient computation. For example, it is easy to support COUNT, MAX, MIN, and SUM in OLAP, since these can be computed for each cell of the OLAP cube and then rolled up, since on overall sum (or count etc.) is the sum of sub-sums, but it is difficult to support MEDIAN, as that must be computed for every view separately: the median of a set is not the median of medians of subsets. ''
Pivot Pivot may refer to: *Pivot, the point of rotation in a lever system *More generally, the center point of any rotational system *Pivot joint, a kind of joint between bones in the body *Pivot turn, a dance move Companies *Incitec Pivot, an Austra ...
'' allows an analyst to rotate the cube in space to see its various faces. For example, cities could be arranged vertically and products horizontally while viewing data for a particular quarter. Pivoting could replace products with time periods to see data across time for a single product. The picture shows a pivoting operation: The whole cube is rotated, giving another perspective on the data.


Mathematical definition

In
database theory Database theory encapsulates a broad range of topics related to the study and research of the theoretical realm of databases and database management systems. Theoretical aspects of data management include, among other areas, the foundations of q ...
, an OLAP cube is an abstract representation of a projection of an RDBMS relation. Given a relation of order ''N'', consider a projection that subtends ''X'', ''Y'', and ''Z'' as the key and ''W'' as the residual
attribute Attribute may refer to: * Attribute (philosophy), an extrinsic property of an object * Attribute (research), a characteristic of an object * Grammatical modifier, in natural languages * Attribute (computing), a specification that defines a prope ...
. Characterizing this as a
function Function or functionality may refer to: Computing * Function key, a type of key on computer keyboards * Function model, a structured representation of processes in a system * Function object or functor or functionoid, a concept of object-oriente ...
, :''f'' : (''X'',''Y'',''Z'') → ''W'', the attributes ''X'', ''Y'', and ''Z'' correspond to the axes of the cube, while the ''W'' value corresponds to the data element that populates each cell of the cube. Insofar as two-dimensional output devices cannot readily characterize three dimensions, it is more practical to project "slices" of the data cube (we say ''project'' in the classic vector analytic sense of dimensional reduction, not in the SQL sense, although the two are conceptually similar), :''g'' : (''X'',''Y'') → ''W'' which may suppress a primary key, but still have some semantic significance, perhaps a slice of the triadic functional representation for a given ''Z'' value of interest. The motivation behind
OLAP Online analytical processing, or OLAP (), is an approach to answer multi-dimensional analytical (MDA) queries swiftly in computing. OLAP is part of the broader category of business intelligence, which also encompasses relational databases, repor ...
displays harks back to the ''cross-tabbed report'' paradigm of 1980s DBMS, and to earlier contingency tables from 1904. The result is a spreadsheet-style display, where values of ''X'' populate row $1; values of ''Y'' populate column $A; and values of ''g'' : ( ''X'', ''Y'' ) → ''W'' populate the individual cells at intersections of ''X''-labeled columns and ''Y''-labeled rows, "southeast", so to speak, of $B$2, with $B$2 itself included.


See also


References

*


External links

*
The RDF Data Cube Vocabulary

Microsoft Azure: Online analytical processing (OLAP)
{{DEFAULTSORT:Olap Cube Online analytical processing Data warehousing fr:Hypercube OLAP