Pipeline Pilot

Pipeline Pilot is a desktop software program sold by Dassault Systèmes for processing and analyzing data. The product was originally used in the natural sciences, and its basic extract, transform, load (ETL) and analytics capabilities have broadened over time. It is now used for data science, ETL, reporting, prediction and analytics in a number of sectors. The main feature of the program is the ability to design data workflows using a graphical user interface. It is an example of visual and dataflow programming and is used in a variety of settings, such as cheminformatics and QSAR, next-generation sequencing, image analysis, and text analytics. It is not an object-oriented programming language.


History

The product was created by SciTegic. Accelrys, the company later renamed BIOVIA, acquired SciTegic and Pipeline Pilot in 2004. Accelrys was itself purchased by Dassault Systèmes in 2014. The product expanded from an initial focus on chemistry to include general extract, transform and load (ETL) capabilities. Beyond the base product, Dassault has added analytical and data processing collections for report generation, data visualization and a number of scientific and engineering sectors. Currently, the product is used for ETL, analytics and machine learning in the chemical, energy, consumer packaged goods, aerospace, automotive and electronics manufacturing industries.


Overview

Pipeline Pilot is part of a class of software products that provide user interfaces for manipulating and analyzing data. According to the vendor, Pipeline Pilot and similar products allow users with limited or no coding ability to transform and manipulate datasets. This manipulation is usually a precursor to analysis of the data. Like other graphical ETL products, it enables users to pull from different data sources, such as CSV files, text files and databases.
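For readers unfamiliar with ETL, the extract, transform and load steps that such tools wrap in a graphical interface can be sketched in a few lines of plain Python. This is only an illustration using the standard library; the CSV content, the `compounds` table and the function names are assumptions made for the example and have nothing to do with Pipeline Pilot's own components.

```python
import csv
import io
import sqlite3

# Hypothetical input data standing in for a CSV file on disk.
CSV_TEXT = "compound,mw\naspirin,180.16\ncaffeine,194.19\n"


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize names and convert weights to floats."""
    return [(row["compound"].upper(), float(row["mw"])) for row in rows]


def load(rows, conn):
    """Load: write the cleaned rows into a database table."""
    conn.execute("CREATE TABLE compounds (name TEXT, mw REAL)")
    conn.executemany("INSERT INTO compounds VALUES (?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database, nothing is written to disk
load(transform(extract(CSV_TEXT)), conn)
loaded = conn.execute("SELECT name, mw FROM compounds ORDER BY name").fetchall()
print(loaded)
```

A graphical ETL tool replaces each of these functions with a configurable building block, so the same three stages can be assembled without writing the code by hand.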


Components, pipelines, protocols and data records

The graphical user interface, called the Pipeline Pilot Professional Client, allows users to drag and drop discrete data processing units called "components". Components can load, filter, join or manipulate data. They can also perform much more advanced operations, such as building regression models, training neural networks or processing datasets into PDF reports.

Components are represented as nodes in a workflow. In a mathematical sense, components are modeled as nodes in a directed graph: "pipes" (graph edges) connect components and move data from node to node, where operations are performed on the data. In the software, a pipe is drawn as a line joining two components, and data flows from left to right along the pipes.

Users can choose from components that come pre-installed or create their own, and can mix the components that BIOVIA provides with their own custom components. To help in industry-specific applications, such as next-generation sequencing, BIOVIA has developed components that greatly reduce the time users need for common industry-specific tasks. Components are combined in workflows called "protocols", which are sets of linked components. Protocols can be saved, reused and shared. End users design a protocol and then execute it by running it.

Modern data analysis and processing can involve a very large number of manipulations and transformations. Pipeline Pilot can visually condense a lengthy series of data manipulations involving many components: a workflow of any length can be collapsed into a single component that is used in a higher-level workflow. In other words, a protocol can be saved and used as a component in another protocol. In Pipeline Pilot's terminology, protocols used as components in other protocols are called "subprotocols". This allows users to add layers of complexity to their data processing workflows and then hide that complexity, so that they can design the workflow at a higher level of abstraction.
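The relationship between components, pipes, protocols and subprotocols can be illustrated with a minimal Python sketch. It is not Pipeline Pilot code: the names `reader`, `filter_records`, `manipulate` and `protocol`, as well as the sample records, are hypothetical, and the sketch covers only a linear chain, which is the simplest case of a directed graph. Each component consumes a stream of records and produces a new stream, and a protocol is itself a component, so it can be nested inside another protocol.

```python
# Hypothetical sample data: each record is a dict of named properties.
ROWS = [
    {"name": "aspirin", "mw": 180.16},
    {"name": "unknown", "mw": None},
    {"name": "paclitaxel", "mw": 853.9},
]


def reader(rows):
    """Source component: ignores its input and emits a copy of each row."""
    def component(_upstream):
        for row in rows:
            yield dict(row)
    return component


def filter_records(predicate):
    """Filter component: passes on only the records that satisfy predicate."""
    def component(upstream):
        return (record for record in upstream if predicate(record))
    return component


def manipulate(change):
    """Manipulator component: applies change to a copy of each record."""
    def component(upstream):
        for record in upstream:
            updated = dict(record)
            change(updated)
            yield updated
    return component


def protocol(*components):
    """Link components left to right; the result is itself a component."""
    def component(upstream):
        stream = upstream
        for step in components:
            stream = step(stream)  # each assignment plays the role of a pipe
        return stream
    return component


# A two-step protocol that is reused below as a subprotocol.
clean = protocol(
    filter_records(lambda r: r["mw"] is not None),
    manipulate(lambda r: r.update(heavy=r["mw"] > 300)),
)

# The top-level protocol treats "clean" as a single component.
top_level = protocol(reader(ROWS), clean)
results = list(top_level(iter(())))
print(results)
```

Because `protocol` returns something with the same shape as any other component, the top-level workflow does not need to know how many steps `clean` contains, which is the kind of abstraction subprotocols provide.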


Component collections

Pipeline Pilot features a number of add-ons called "collections". Collections are groups of specialized functions, such as processing genetic information or analyzing polymers, offered to end users for an additional licensing fee. Given the number of different add-ons now offered by BIOVIA, Pipeline Pilot's use cases are very broad and difficult to summarize succinctly. The product has been used in:

* Predictive maintenance
* Image analysis, for example determining the inhibitory action of a substance on biological processes (IC50) by calculating the dose–response relationship directly from information extracted from high-content screening assay images, combined with the dilution in the plate layout and chemistry information about the tested compounds (Imaging, Chemistry, Plate Data Analytics)
* A recommender system for scientific literature based on a Bayesian model built using fingerprints and users' reading lists or paper rankings
* Access to experiment methods and results from an electronic laboratory notebook or laboratory information management system, with resulting reports for resource capacity planning
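To make the IC50 use case above concrete, the following Python sketch estimates an IC50 from a dose–response series by interpolating on a logarithmic concentration scale between the two measurements that bracket 50% inhibition. It is a simplified illustration with made-up numbers, not the method used by any Pipeline Pilot collection; real analyses usually fit a sigmoidal curve such as a four-parameter logistic model. The function name and the data are assumptions for the example.

```python
import math


def ic50_by_interpolation(concentrations, inhibition_percent):
    """Estimate IC50 by log-linear interpolation around 50% inhibition."""
    points = sorted(zip(concentrations, inhibition_percent))
    for (c_lo, y_lo), (c_hi, y_hi) in zip(points, points[1:]):
        if y_lo <= 50.0 <= y_hi:
            if y_hi == y_lo:
                return c_lo  # both points sit exactly at 50%
            fraction = (50.0 - y_lo) / (y_hi - y_lo)
            log_lo, log_hi = math.log10(c_lo), math.log10(c_hi)
            return 10 ** (log_lo + fraction * (log_hi - log_lo))
    raise ValueError("50% inhibition is not bracketed by the data")


# Hypothetical dilution series (micromolar) and measured inhibition (%).
concentrations = [0.01, 0.1, 1.0, 10.0, 100.0]
inhibition = [5.0, 20.0, 40.0, 60.0, 90.0]

ic50 = ic50_by_interpolation(concentrations, inhibition)
print(f"Estimated IC50: {ic50:.2f} uM")
```

In this series, 50% inhibition falls halfway between the 1 and 10 micromolar measurements, so the estimate is the geometric mean of those two concentrations.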


PilotScript and custom scripts

As with other ETL and analytics solutions, Pipeline Pilot is often used when one or more large (1 TB+) or complex datasets are processed. In these situations, end users may want to use programming scripts that they have written. Early in the product's development, its makers created a scripting language called PilotScript that enabled end users to write basic scripts that could be incorporated into a Pipeline Pilot protocol. Later releases extended support to a variety of programming languages, including Python, .NET, MATLAB, Perl, SQL, Java, VBScript and R.

The syntax of PilotScript is based on PL/SQL. It can be used in components such as the ''Custom Manipulator (PilotScript)'' or the ''Custom Filter (PilotScript)''. As an example, the following script adds a property named "Hello" to each record passing through a custom scripting component in a Pipeline Pilot protocol. The value of the property is the string "Hello World!".

    Hello := "Hello World!";

Currently, the product supports a number of APIs for different programming languages, which allow protocols to be executed without the program's graphical user interface.
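For comparison, the effect of the PilotScript one-liner can be mimicked in plain Python. The sketch below is only an analogy: it does not use Pipeline Pilot's Python integration, and the function name `add_hello` and the sample records are hypothetical. It shows what "adding a property to each record passing through a component" means in terms of ordinary data structures.

```python
def add_hello(records):
    """Yield a copy of each record with a "Hello" property added."""
    for record in records:
        updated = dict(record)  # copy so the incoming record is left unchanged
        updated["Hello"] = "Hello World!"
        yield updated


# Hypothetical records flowing into the scripting step.
incoming = [{"id": 1}, {"id": 2}]
outgoing = list(add_hello(incoming))
print(outgoing)
```

In PilotScript the loop over records is implicit, because the component runs the script once for every record that reaches it; in the Python sketch the loop has to be written out.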

