Perl Data Language (abbreviated PDL) is a set of free software array programming extensions to the Perl programming language. PDL extends the data structures built into Perl to include large multidimensional arrays, and adds functionality to manipulate those arrays as vector objects. It also provides tools for image processing, machine learning, computer modeling of physical systems, and graphical plotting and presentation. Simple operations are automatically vectorized across complete arrays, and higher-dimensional operations (such as matrix multiplication) are supported.
Language design
PDL is a vectorized array programming language: the expression syntax is a variation on standard mathematical vector notation, so that the user can combine and operate on large arrays with simple expressions. In this respect, PDL follows in the footsteps of the APL programming language, and it has been compared to commercial languages such as MATLAB and Interactive Data Language (IDL), and to other free tools such as NumPy and GNU Octave. Unlike MATLAB and IDL, PDL allows great flexibility in indexing and vectorization: for example, if a subroutine normally operates on a 2-D matrix array, passing it a 3-D data cube will generally cause the same operation to happen to each 2-D layer of the cube.
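The layer-by-layer behaviour described above can be sketched with matrix multiplication. This is a minimal illustration, not taken from the PDL documentation: the variable names and the column-swapping matrix are made up for the example, and it assumes a PDL installation in which `use PDL` exports `pdl`, `cat`, and the overloaded `x` operator.

```perl
use strict;
use warnings;
use PDL;

# A 2x2 matrix and a second matrix that swaps columns when multiplied
# on the right (both are illustrative values).
my $m    = pdl [[1, 2], [3, 4]];
my $flip = pdl [[0, 1], [1, 0]];

# Ordinary 2-D use: swap the columns of one matrix.
my $swapped = $m x $flip;

# Stack two matrices into a 3-D cube with dims (2, 2, 2).
my $cube = cat($m, $m * 10);

# The same expression applied to the cube: the multiplication is
# carried out on each 2-D layer, with no explicit loop.
my $swapped_cube = $cube x $flip;

print "cube dims: ", join(',', $cube->dims), "\n";
print "single matrix: $swapped\n";
print "every layer:   $swapped_cube\n";
```

The only difference between the two calls is the number of dimensions of the left operand; the code that performs the multiplication is unchanged.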
PDL borrows at least three basic types of program structure from Perl: imperative programming, functional programming, and pipeline programming forms, which may be combined. Subroutines may be loaded either via a built-in autoload mechanism or via the usual Perl module mechanism. In 2000, it was proposed that PDL-like functionality be included in the development of what is now Raku.
Graphics
True to the glue language roots of Perl, PDL borrows from several different modules for graphics and plotting support. NetPBM provides image file I/O (though FITS is supported natively). Gnuplot, PLplot, PGPLOT, and Prima modules are supported for 2-D graphics and plotting applications, and Gnuplot and OpenGL are supported for 3-D plotting and rendering.
I/O
PDL provides facilities to read and write many open data formats, including JPEG, PNG, GIF, PPM, MPEG, FITS, NetCDF, GRIB, raw binary files, and delimited ASCII tables. PDL programmers can use the CPAN Perl I/O libraries to read and write data in hundreds of standard and niche file formats.
Machine learning
PDL can be used for machine learning. It includes modules that are used to perform classic k-means clustering or general and generalized linear modeling methods such as ANOVA, linear regression, PCA, and logistic regression. Examples of PDL usage for regression modelling tasks include evaluating the association between educational attainment and ancestry differences of parents, comparing RNA-protein interaction profiles that need regression-based normalization, and analyzing the spectra of galaxies.
perldl
An installation of PDL usually comes with an interactive shell known as perldl, which can be used to perform simple calculations without requiring the user to create a Perl program file. A typical session of perldl would look something like the following:
perldl> $x = pdl [[1, 2], [3, 4]];
perldl> $y = pdl [[5, 6, 7], [8, 9, 0]];
perldl> $z = $x x $y;
perldl> p $z;
[
 [21 24  7]
 [47 54 21]
]
The commands used in the shell are Perl statements that can be used in a program with the PDL module included. x is an overloaded operator for matrix multiplication, and p in the last command is a shortcut for print.
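As a sketch of how the same statements might look in a standalone program, the script below multiplies a 2x2 matrix by a 2x3 matrix. It assumes that `use PDL` is all that is needed to bring `pdl` and the overloaded `x` operator into scope, and it uses plain `print` on the assumption that the `p` shortcut is a convenience of the shell rather than of the module.

```perl
use strict;
use warnings;
use PDL;    # provides pdl() and the overloaded x operator

# Nested array references become the rows of a 2-D ndarray.
my $x = pdl [[1, 2], [3, 4]];
my $y = pdl [[5, 6, 7], [8, 9, 0]];

# x is overloaded to perform matrix multiplication on ndarrays.
my $z = $x x $y;

# print stringifies the ndarray through its overloaded "" operator.
print $z;
```

The script can be saved to a file and run with the ordinary perl interpreter.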
Implementation
The core of PDL is written in C. Most of the functionality is written in PP, a PDL-specific metalanguage that handles the vectorization of simple C snippets and interfaces them with the Perl host language via Perl's XS compiler. Some modules are written in Fortran, with a C/PP interface layer. Many of the supplied functions are written in PDL itself. PP is available to the user to write C-language extensions to PDL. There is also an Inline module (Inline::Pdlpp) that allows PP function definitions to be inserted directly into a Perl script; the relevant code is compiled and made available as a Perl subroutine.
The PDL API uses the basic Perl 5 object-oriented functionality: PDL defines a new type of Perl scalar object (eponymously called a "PDL", or "ndarray") that acts as a Perl scalar, but that contains a conventional typed array of numeric or character values. All of the standard Perl operators are overloaded so that they can be used on PDL objects transparently, and PDLs can be mixed-and-matched with normal Perl scalars. Several hundred object methods for operating on PDLs are supplied by the core modules.
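A brief sketch of what this mixing might look like in practice follows. The values and variable names are illustrative only; `sum` and `avg` are two of the reduction methods that the core modules are assumed to provide.

```perl
use strict;
use warnings;
use PDL;

my $v     = pdl [1, 2, 3, 4];   # a 1-D ndarray
my $scale = 2.5;                # an ordinary Perl scalar

# Overloaded arithmetic: the Perl scalar is combined with every element.
my $w = $v * $scale + 1;

# Comparison operators are overloaded as well and yield an ndarray of flags.
my $mask = $w > 5;

# Object methods operate on the whole ndarray.
my $total = $v->sum;
my $mean  = $v->avg;

print "w     = $w\n";
print "mask  = $mask\n";
print "total = $total, mean = $mean\n";
```

Because an ndarray is held in an ordinary Perl scalar variable, it can be passed to subroutines, stored in hashes, or interpolated into strings like any other scalar.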
Raku version
In Raku, PDL is specified as a trait in Synopsis 9. As of January 2013, this feature was not yet implemented in Rakudo.
See also
* Comparison of numerical-analysis software
* List of numerical-analysis software
External links
* Official website: http://pdl.perl.org/
* PDL Quick Reference
* PDL Intro & resources
* Tutorial lecture on PDL
* Draft release of the PDL Book for PDL-2.006
* Example of PDL usage in the scientific literature