oneAPI Data Analytics Library (oneDAL; formerly Intel Data Analytics Acceleration Library or Intel DAAL) is a library of optimized algorithmic building blocks for the data analysis stages most commonly associated with solving big data problems.
The library supports Intel processors and is available for Windows, Linux, and macOS operating systems.[Intel® Data Analytics Acceleration Library (Intel® DAAL), Intel® Software] The library is designed for use with popular data platforms including Hadoop, Spark, R, and MATLAB.
History
Intel launched the Data Analytics Acceleration Library on August 25, 2015, under the name Intel Data Analytics Acceleration Library 2016 (Intel DAAL 2016). It launched the oneAPI Data Analytics Library (oneDAL) on December 8, 2020. oneDAL is bundled with the Intel oneAPI Base Toolkit as a commercial product. A standalone version is available either commercially or free of charge; the two differ only in support and maintenance.
Details
Functional categories
Intel DAAL has the following algorithms:[Developer Guide for Intel(R) Data Analytics Acceleration Library 2020]
*Analysis
**Low Order Moments: Includes computing min, max, mean, standard deviation, variance, etc. for a dataset.
**Quantiles: Splitting observations into equal-sized groups defined by quantile orders.
**Correlation matrix and variance-covariance matrix: A basic tool for understanding statistical dependence among variables. The degree of correlation indicates how strongly a change in one variable tends to accompany a change in another.
**Cosine distance matrix: Measuring pairwise distance using cosine distance.
**Correlation distance matrix: Measuring pairwise distance between items using correlation distance.
**Clustering: Grouping data into unlabeled groups. This is a typical technique used in “unsupervised learning”, where there is no established model to rely on. Intel DAAL provides two algorithms for clustering: K-Means and “EM for GMM.”
**Principal Component Analysis (PCA): A widely used algorithm for dimensionality reduction.
**Association rules mining: Detecting co-occurrence patterns. Commonly known as “shopping basket mining.”
**Data transformation through matrix decomposition: DAAL provides Cholesky, QR, and SVD decomposition algorithms.
**Outlier detection: Identifying observations that are abnormally distant from the typical distribution of other observations.
*Training and Prediction
**Regression
***Linear regression: The simplest regression method. Fitting a linear equation to model the relationship between dependent variables (things to be predicted) and explanatory variables (things known).
**Classification: Building a model to assign items to different labeled groups. DAAL provides multiple algorithms in this area, including Naïve Bayes classifier, Support Vector Machine, and multi-class classifiers.
**Recommendation systems
**Neural networks
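A few of the analysis building blocks listed above can be illustrated in plain Python. This is a minimal sketch using only the standard library; it does not use oneDAL, and every function name here (`low_order_moments`, `pearson_correlation`, `cosine_distance`, `correlation_distance`) is hypothetical and chosen for illustration only.

```python
import math
import statistics


def low_order_moments(values):
    """Return min, max, mean, sample variance and sample standard deviation."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.fmean(values),
        "variance": statistics.variance(values),
        "stdev": statistics.stdev(values),
    }


def pearson_correlation(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cross / (spread_x * spread_y)


def cosine_distance(u, v):
    """One minus the cosine of the angle between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def correlation_distance(u, v):
    """One minus the Pearson correlation between u and v."""
    return 1.0 - pearson_correlation(u, v)


# Illustrative data (made up for this example): y is exactly 2 * x.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]

moments = low_order_moments(x)
corr_xy = pearson_correlation(x, y)
cos_orthogonal = cosine_distance([1.0, 0.0], [0.0, 1.0])
corr_dist_xy = correlation_distance(x, y)
```

Because `y` is a positive multiple of `x`, their correlation is 1 and their correlation distance is 0 (up to floating-point rounding), while the two orthogonal unit vectors are at cosine distance 1.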
Intel DAAL supports three processing modes:
*Batch processing: When all data fits in memory, a function is called to process the data all at once.
*Online processing (also called streaming): When the data does not all fit in memory, Intel DAAL can process data chunks individually and combine all partial results at the finalizing stage.
*Distributed processing: DAAL supports a model similar to MapReduce. Consumers in a cluster process local data (map stage), and then the Producer process collects and combines partial results from Consumers (reduce stage). Intel DAAL offers flexibility in this mode by leaving the communication functions completely to the developer. Developers can choose to use the data movement in a framework such as Hadoop or Spark, or code the communication explicitly, most likely with MPI.
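The three modes can be sketched with a mean and variance computation built on mergeable partial results. This is an illustrative sketch in plain Python, not the oneDAL API: the `PartialMoments` class and the functions `compute_partial`, `merge`, `finalize`, `batch`, `online`, and `distributed` are hypothetical names, and the pairwise merge formula is one common way to combine partial moments, assumed here for illustration.

```python
import math
from dataclasses import dataclass
from functools import reduce


@dataclass
class PartialMoments:
    """Partial result for one chunk: count, mean, sum of squared deviations."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0


def compute_partial(chunk):
    """Process one chunk that fits in memory."""
    n = len(chunk)
    if n == 0:
        return PartialMoments()
    mean = sum(chunk) / n
    m2 = sum((x - mean) ** 2 for x in chunk)
    return PartialMoments(n, mean, m2)


def merge(a, b):
    """Combine two partial results without revisiting the raw data."""
    n = a.n + b.n
    if n == 0:
        return PartialMoments()
    delta = b.mean - a.mean
    mean = a.mean + delta * b.n / n
    m2 = a.m2 + b.m2 + delta * delta * a.n * b.n / n
    return PartialMoments(n, mean, m2)


def finalize(partial):
    """Finalizing stage: turn a partial result into (mean, sample variance)."""
    variance = partial.m2 / (partial.n - 1) if partial.n > 1 else 0.0
    return partial.mean, variance


def batch(data):
    """Batch mode: all data is processed in one call."""
    return finalize(compute_partial(data))


def online(chunks):
    """Online mode: chunks arrive one at a time and are folded into a running result."""
    running = PartialMoments()
    for chunk in chunks:
        running = merge(running, compute_partial(chunk))
    return finalize(running)


def distributed(local_datasets):
    """Distributed mode: map over local data, then reduce the partial results."""
    partials = [compute_partial(local) for local in local_datasets]  # map stage
    return finalize(reduce(merge, partials, PartialMoments()))       # reduce stage


# Illustrative data (made up for this example), split into three chunks.
data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
chunks = [data[:3], data[3:5], data[5:]]

batch_result = batch(data)
online_result = online(chunks)
distributed_result = distributed(chunks)
```

All three calls yield the same mean and variance up to floating-point rounding. In a real cluster the list comprehension in `distributed` would run on separate nodes, and moving the partial results between them is left to the developer, as described above.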
References
External links
*OneAPI (compute acceleration)
*oneAPI oneDAL Specification
*DAAL Support
*DAAL User Forum
*DAAL Support Channel