Mondrian Data Analysis

Mondrian is a general-purpose statistical data-visualization system for interactive data visualization. All plots in Mondrian are fully linked and offer various interactions and queries: any case selected in one plot is highlighted in all other plots. Currently implemented plots comprise mosaic plots, scatterplots and scatterplot matrices (SPLOM), maps, barcharts, histograms, missing value plots, parallel coordinates/boxplots, and boxplots y by x. Mondrian works with data in standard tab-delimited or comma-separated ASCII files and can load data from R workspaces. There is basic support for working directly on data in databases. Mondrian links to R and offers statistical procedures such as interactive density estimation, scatterplot smoothers, multidimensional scaling (MDS), and principal component analysis (PCA).
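The linking behaviour described above can be illustrated with a small observer-style sketch. It is not Mondrian's actual implementation; the class names `SelectionModel` and `Plot` and all method names are hypothetical and chosen only for illustration. The idea shown is that every plot subscribes to one shared selection, so selecting cases anywhere updates the highlighting everywhere.

```python
from typing import Callable, Iterable, List, Set


class SelectionModel:
    """Hypothetical shared selection state observed by all plots."""

    def __init__(self) -> None:
        self._selected: Set[int] = set()
        self._listeners: List[Callable[[Set[int]], None]] = []

    def subscribe(self, listener: Callable[[Set[int]], None]) -> None:
        # Register a callback that is invoked whenever the selection changes.
        self._listeners.append(listener)

    def select(self, cases: Iterable[int]) -> None:
        # Replace the current selection and notify every subscribed plot.
        self._selected = set(cases)
        for listener in self._listeners:
            listener(set(self._selected))


class Plot:
    """Hypothetical plot that highlights whatever the shared model selects."""

    def __init__(self, name: str, model: SelectionModel) -> None:
        self.name = name
        self.highlighted: Set[int] = set()
        model.subscribe(self._on_selection)

    def _on_selection(self, cases: Set[int]) -> None:
        # A real plot would redraw here; the sketch only records the cases.
        self.highlighted = cases


model = SelectionModel()
barchart = Plot("barchart", model)
scatterplot = Plot("scatterplot", model)

# Selecting cases once (for example by brushing in one plot) updates both plots.
model.select([2, 5, 7])
```

Keeping the selection in one shared object, rather than in each plot, is what makes the highlighting consistent across all open plots in this sketch.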


Overview

Development of Mondrian began in 1997, initially with a focus on visualization techniques for categorical data and on enhanced selection techniques. Over the years, a complete suite of visualizations for univariate and multivariate data measured on any scale was added. The link to R offers well-tested statistical procedures that integrate seamlessly into the interactive graphics. Today, even geographical data is supported with highly interactive maps.


Mondrian details

The latest stable and beta versions, help, and documentation are available on the website of the developer, Martin Theus.


Supported data sources

Mondrian works on plain text files with tab-separated columns and a header row of variable names, as exported from Microsoft Excel as ".txt". If the Rserve link and R are present, Mondrian also reads data directly from R workspace files (.RData files).
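As a minimal sketch of the file layout described above, the following Python snippet writes a tab-separated text file whose first row holds the variable names. The function name `write_tab_delimited`, the column names, and the values are made up for illustration; any further requirements Mondrian may place on its input (for example how missing values are encoded) are not covered here.

```python
import csv
import os
import tempfile


def write_tab_delimited(path, header, rows):
    """Write a header row of variable names followed by tab-separated data rows."""
    with open(path, "w", newline="", encoding="ascii") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(header)
        writer.writerows(rows)


# Illustrative data only: one categorical and one numeric variable per case.
header = ["name", "group", "height"]
rows = [["a", "x", 1.5], ["b", "y", 2.0]]

path = os.path.join(tempfile.mkdtemp(), "example.txt")
write_tab_delimited(path, header, rows)

# Read the file back to inspect the resulting lines.
with open(path, encoding="ascii") as handle:
    lines = handle.read().splitlines()
```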


Visualizations

* 1-d: Barchart, Spineplot, Histogram, Spinogram, Boxplot
* 2-d: Scatterplot, Boxplot y by x
* High-D:
** Multivariate continuous: Scatterplot matrix, Parallel coordinates
** Multivariate categorical: Mosaic plot (see also Treemapping)
* Geographical: Map
* Special: missing value plot


Interaction techniques

Mondrian supports Query, Select, and Modify.


See also

* Data visualization
* GGobi


Further reading

* Theus, M. (2002). ''Interactive Data Visualization using Mondrian'', in Journal of Statistical Software 7 (11): 1–9.
* Theus, M. and Urbanek, S. (2008). ''Interactive Graphics for Data Analysis: Principles and Examples'' (Computer Science and Data Analysis), Chapman & Hall / CRC.


External links


* Mondrian: Graphical Data Analysis Software
* Homepage for the book “Interactive Graphics for Data Analysis – Principles and Examples”; the book is heavily based on Mondrian
* theusrus, the homepage of Martin Theus