Perl Data Language (abbreviated PDL) is a set of free software array programming extensions to the Perl programming language. PDL extends the data structures built into Perl to include large multidimensional arrays, and adds functionality to manipulate those arrays as vector objects. It also provides tools for image processing, machine learning, computer modeling of physical systems, and graphical plotting and presentation. Simple operations are automatically vectorized across complete arrays, and higher-dimensional operations (such as matrix multiplication) are supported.


Language design

PDL is a vectorized array programming language: the expression syntax is a variation on standard mathematical vector notation, so that the user can combine and operate on large arrays with simple expressions. In this respect, PDL follows in the footsteps of the APL programming language, and it has been compared to commercial languages such as MATLAB and Interactive Data Language (IDL), and to other free languages such as NumPy and Octave. Unlike MATLAB and IDL, PDL allows great flexibility in indexing and vectorization: for example, if a subroutine normally operates on a 2-D matrix array, passing it a 3-D data cube will generally cause the same operation to happen to each 2-D layer of the cube. PDL borrows from Perl at least three basic types of program structure: imperative, functional, and pipeline programming forms, which may be combined. Subroutines may be loaded either via a built-in autoload mechanism or via the usual Perl module mechanism.
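As a minimal sketch of the layer-by-layer behaviour described above, the following script multiplies a 2-D matrix by a 3-D cube with the overloaded x operator. The variable names and sample values are assumptions chosen for illustration; the sketch assumes a standard PDL installation.

```perl
use strict;
use warnings;
use PDL;

# Illustrative data (assumed for this sketch): one 2x2 matrix and a
# 2x2x3 cube, i.e. three stacked 2x2 layers filled with 0..11.
my $m    = pdl [[1, 2], [3, 4]];
my $cube = sequence(2, 2, 3);

# Simple arithmetic is vectorized across the whole cube.
my $scaled = $cube * 2 + 1;

# The matrix product is defined on 2-D operands; with a 3-D right-hand
# operand it is carried out once per 2-D layer of the cube.
my $prod = $m x $cube;

print "dims of product: ", join(' x ', $prod->dims), "\n";
print "first layer of product: ", $prod->slice(':,:,(0)'), "\n";
```

Each layer of the result should match what the same product gives when applied to that layer alone, which is one way to check the behaviour on real data.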


Graphics

True to the glue language roots of Perl, PDL borrows from several different modules for graphics and plotting support. NetPBM provides image file I/O (though FITS is supported natively). Gnuplot, PLplot, PGPLOT, and Prima modules are supported for 2-D graphics and plotting applications, and Gnuplot and OpenGL are supported for 3-D plotting and rendering.


I/O

PDL provides facilities to read and write many open data formats, including JPEG, PNG, GIF, PPM, MPEG, FITS, NetCDF, GRIB, raw binary files, and delimited ASCII tables. PDL programmers can use the CPAN Perl I/O libraries to read and write data in hundreds of standard and niche file formats.
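A short sketch of a file round trip, using the natively supported FITS format, might look like the following. It assumes the wfits and rfits functions from PDL::IO::FITS; the array contents and the temporary file are illustrative only.

```perl
use strict;
use warnings;
use PDL;
use PDL::IO::FITS;
use File::Temp qw(tempfile);

# Illustrative 4x3 array of doubles (values 0..11), assumed for this sketch.
my $image = sequence(4, 3);

# Create a temporary file name that is removed when the program exits.
my ($fh, $file) = tempfile(SUFFIX => '.fits', UNLINK => 1);
close $fh;

# Write the array to disk, then read it back in scalar context.
wfits($image, $file);
my $back = rfits($file);

print "dims read back: ", join(' x ', $back->dims), "\n";
```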


Machine learning

PDL can be used for machine learning. It includes modules that are used to perform classic k-means clustering or general and generalized linear modeling methods such as ANOVA, linear regression, PCA, and logistic regression. Examples of PDL usage for regression modelling tasks include evaluating the association between educational attainment and ancestry differences of parents, comparing RNA-protein interaction profiles that need regression-based normalization, and analyzing the spectra of galaxies.


perldl

An installation of PDL usually comes with an interactive shell known as perldl, which can be used to perform simple calculations without requiring the user to create a Perl program file. A typical session of perldl would look something like the following:

    perldl> $x = pdl [[1, 2], [3, 4]];
    perldl> $y = pdl [[5, 6, 7], [8, 9, 0]];
    perldl> $z = $x x $y;
    perldl> p $z;

    [
     [21 24  7]
     [47 54 21]
    ]

The commands used in the shell are Perl statements that can be used in a program with the PDL module included. x is an overloaded operator for matrix multiplication, and p in the last command is a shortcut for print.
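For comparison, a standalone program containing the same kind of statements could be sketched as follows. The two matrices are sample values chosen for illustration, and the script uses plain print, on the assumption that the p shortcut belongs to the shell.

```perl
use strict;
use warnings;
use PDL;

# A 2x2 matrix and a 2x3 matrix (sample values for this sketch).
my $x = pdl [[1, 2], [3, 4]];
my $y = pdl [[5, 6, 7], [8, 9, 0]];

# x is overloaded as matrix multiplication for PDL objects.
my $z = $x x $y;

# Printing a PDL object shows its contents as nested bracketed rows.
print $z;
```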


Implementation

The core of PDL is written in C. Most of the functionality is written in PP, a PDL-specific metalanguage that handles the vectorization of simple C snippets and interfaces them with the Perl host language via Perl's XS compiler. Some modules are written in Fortran, with a C/PP interface layer. Many of the supplied functions are written in PDL itself. PP is available to the user to write C-language extensions to PDL. There is also an Inline module (Inline::Pdlpp) that allows PP function definitions to be inserted directly into a Perl script; the relevant code is compiled and made available as a Perl subroutine. The PDL API uses the basic Perl 5 object-oriented functionality: PDL defines a new type of Perl scalar object (eponymously called a "PDL", or "ndarray") that acts as a Perl scalar, but that contains a conventional typed array of numeric or character values. All of the standard Perl operators are overloaded so that they can be used on PDL objects transparently, and PDLs can be mixed and matched with normal Perl scalars. Several hundred object methods for operating on PDLs are supplied by the core modules.
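The operator overloading and method interface can be illustrated with a small sketch. The values and variable names here are assumptions for illustration; sum, nelem, and at are a few of the methods provided by the core modules.

```perl
use strict;
use warnings;
use PDL;

# A PDL object is held in an ordinary Perl scalar variable.
my $v = pdl(1, 4, 9);

# Overloaded operators mix PDLs with plain Perl numbers, elementwise.
my $w    = 2 * $v + 1;   # arithmetic on every element
my $r    = sqrt($v);     # overloaded built-in function
my $mask = $v > 3;       # elementwise comparison giving 0/1 values

# Object methods operate on the whole array.
my $total = $v->sum;     # sum of all elements
my $n     = $v->nelem;   # number of elements
my $one   = $v->at(1);   # single element extracted as a Perl number

print "w = $w, r = $r, mask = $mask\n";
print "total = $total, n = $n, element 1 = $one\n";
```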


See also

* Comparison of numerical-analysis software
* List of numerical-analysis software


External links

* Official website: http://pdl.perl.org/
* PDL Quick Reference
* PDL Intro & resources
* Tutorial lecture on PDL
* Draft release of the PDL Book for PDL-2.006
* Example of PDL usage in the scientific literature