Filter (software)
   HOME

TheInfoList



OR:

A filter is a
computer program A computer program is a sequence or set of instructions in a programming language for a computer to execute. Computer programs are one component of software, which also includes documentation and other intangible components. A computer program ...
or
subroutine In computer programming, a function or subroutine is a sequence of program instructions that performs a specific task, packaged as a unit. This unit can then be used in programs wherever that particular task should be performed. Functions may ...
to process a
stream A stream is a continuous body of water, body of surface water Current (stream), flowing within the stream bed, bed and bank (geography), banks of a channel (geography), channel. Depending on its location or certain characteristics, a stream ...
, producing another stream. While a single filter can be used individually, they are frequently strung together to form a
pipeline Pipeline may refer to: Electronics, computers and computing * Pipeline (computing), a chain of data-processing stages or a CPU optimization found on ** Instruction pipelining, a technique for implementing instruction-level parallelism within a s ...
. Some
operating system An operating system (OS) is system software that manages computer hardware, software resources, and provides common services for computer programs. Time-sharing operating systems schedule tasks for efficient use of the system and may also in ...
s such as
Unix Unix (; trademarked as UNIX) is a family of multitasking, multiuser computer operating systems that derive from the original AT&T Unix, whose development started in 1969 at the Bell Labs research center by Ken Thompson, Dennis Ritchie, and ot ...
are rich with filter programs.
Windows 7 Windows 7 is a major release of the Windows NT operating system developed by Microsoft. It was released to manufacturing on July 22, 2009, and became generally available on October 22, 2009. It is the successor to Windows Vista, released nearly ...
and later are also rich with filters, as they include
Windows PowerShell PowerShell is a task automation and configuration management program from Microsoft, consisting of a command-line shell and the associated scripting language. Initially a Windows component only, known as Windows PowerShell, it was made open-so ...
. In comparison, however, few filters are built into
cmd.exe Command Prompt, also known as cmd.exe or cmd, is the default command-line interpreter for the OS/2, eComStation, ArcaOS, Microsoft Windows (Windows NT family and Windows CE family), and ReactOS operating systems. On Windows CE .NET 4.2, Windo ...
(the original command-line interface of Windows), most of which have significant enhancements relative to the similar filter commands that were available in
MS-DOS MS-DOS ( ; acronym for Microsoft Disk Operating System, also known as Microsoft DOS) is an operating system for x86-based personal computers mostly developed by Microsoft. Collectively, MS-DOS, its rebranding as IBM PC DOS, and a few ope ...
. OS X includes filters from its underlying Unix base but also has Automator, which allows filters (known as "Actions") to be strung together to form a pipeline.


Unix

In
Unix Unix (; trademarked as UNIX) is a family of multitasking, multiuser computer operating systems that derive from the original AT&T Unix, whose development started in 1969 at the Bell Labs research center by Ken Thompson, Dennis Ritchie, and ot ...
and
Unix-like A Unix-like (sometimes referred to as UN*X or *nix) operating system is one that behaves in a manner similar to a Unix system, although not necessarily conforming to or being certified to any version of the Single UNIX Specification. A Unix-li ...
operating systems, a filter is a program that gets most of its data from its
standard input In computer programming, standard streams are interconnected input and output communication channels between a computer program and its environment when it begins execution. The three input/output (I/O) connections are called standard input (stdin ...
(the main input stream) and writes its main results to its
standard output In computer programming, standard streams are interconnected input and output communication channels between a computer program and its environment when it begins execution. The three input/output (I/O) connections are called standard input (stdin ...
(the main output stream). Auxiliary input may come from command line flags or configuration files, while auxiliary output may go to
standard error The standard error (SE) of a statistic (usually an estimate of a parameter) is the standard deviation of its sampling distribution or an estimate of that standard deviation. If the statistic is the sample mean, it is called the standard error ...
. The command syntax for getting data from a device or file other than standard input is the input operator (<). Similarly, to send data to a device or file other than standard output is the output operator (>). To append data lines to an existing output file, one can use the append operator (>>). Filters may be strung together into a
pipeline Pipeline may refer to: Electronics, computers and computing * Pipeline (computing), a chain of data-processing stages or a CPU optimization found on ** Instruction pipelining, a technique for implementing instruction-level parallelism within a s ...
with the pipe operator (", "). This operator signifies that the main output of the command to the left is passed as main input to the command on the right. The
Unix philosophy The Unix philosophy, originated by Ken Thompson, is a set of cultural norms and philosophical approaches to minimalist, modular software development. It is based on the experience of leading developers of the Unix operating system. Early Unix de ...
encourages combining small, discrete tools to accomplish larger tasks. The classic filter in Unix is
Ken Thompson Kenneth Lane Thompson (born February 4, 1943) is an American pioneer of computer science. Thompson worked at Bell Labs for most of his career where he designed and implemented the original Unix operating system. He also invented the B programmi ...
's , which
Doug McIlroy Malcolm Douglas McIlroy (born 1932) is a mathematician, engineer, and programmer. As of 2019 he is an Adjunct Professor of Computer Science at Dartmouth College. McIlroy is best known for having originally proposed Unix pipelines and developed s ...
cites as what "ingrained the tools outlook irrevocably" in the operating system, with later tools imitating it. at its simplest prints any lines containing a character string to its output. The following is an example: cut -d : -f 1 /etc/passwd , grep foo This finds all registered users that have "
foo The terms foobar (), foo, bar, baz, and others are used as metasyntactic variables and placeholder names in computer programming or computer-related documentation. - Etymology of "Foo" They have been used to name entities such as variables, f ...
" as part of their username by using the
cut Cut may refer to: Common uses * The act of cutting, the separation of an object into two through acutely-directed force ** A type of wound ** Cut (archaeology), a hole dug in the past ** Cut (clothing), the style or shape of a garment ** Cut (ea ...
command to take the first field (username) of each line of the Unix system password file and passing them all as input to grep, which searches its input for lines containing the character string "foo" and prints them on its output. Common Unix filter programs are:
cat The cat (''Felis catus'') is a domestic species of small carnivorous mammal. It is the only domesticated species in the family Felidae and is commonly referred to as the domestic cat or house cat to distinguish it from the wild members of ...
,
cut Cut may refer to: Common uses * The act of cutting, the separation of an object into two through acutely-directed force ** A type of wound ** Cut (archaeology), a hole dug in the past ** Cut (clothing), the style or shape of a garment ** Cut (ea ...
,
grep grep is a command-line utility for searching plain-text data sets for lines that match a regular expression. Its name comes from the ed command ''g/re/p'' (''globally search for a regular expression and print matching lines''), which has the sa ...
, head,
sort Sort may refer to: * Sorting, any process of arranging items in sequence or in sets ** Sorting algorithm, any algorithm for arranging elements in lists ** Sort (Unix), a Unix utility which sorts the lines of a file ** Sort (C++), a function in the ...
,
tail The tail is the section at the rear end of certain kinds of animals’ bodies; in general, the term refers to a distinct, flexible appendage to the torso. It is the part of the body that corresponds roughly to the sacrum and coccyx in mammal ...
, and
uniq uniq is a utility command on Unix, Plan 9, Inferno, and Unix-like operating systems which, when fed a text file or standard input, outputs the text with adjacent identical lines collapsed to one, unique line of text. Overview The command is ...
. Programs like
awk AWK (''awk'') is a domain-specific language designed for text processing and typically used as a data extraction and reporting tool. Like sed and grep, it is a filter, and is a standard feature of most Unix-like operating systems. The AWK lang ...
and
sed sed ("stream editor") is a Unix utility that parses and transforms text, using a simple, compact programming language. It was developed from 1973 to 1974 by Lee E. McMahon of Bell Labs, and is available today for most operating systems. sed w ...
can be used to build quite complex filters because they are fully programmable. Unix filters can also be used by
Data scientists Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract or extrapolate knowledge and insights from noisy, structured and unstructured data, and apply knowledge from data across a ...
to get a quick overview about a file based dataset.


List of Unix filter programs

*
awk AWK (''awk'') is a domain-specific language designed for text processing and typically used as a data extraction and reporting tool. Like sed and grep, it is a filter, and is a standard feature of most Unix-like operating systems. The AWK lang ...
*
cat The cat (''Felis catus'') is a domestic species of small carnivorous mammal. It is the only domesticated species in the family Felidae and is commonly referred to as the domestic cat or house cat to distinguish it from the wild members of ...
*
comm The command in the Unix family of computer operating systems is a utility that is used to compare two files for common and distinct lines. is specified in the POSIX standard. It has been widely available on Unix-like operating systems since ...
*
compress compress is a Unix shell compression program based on the LZW compression algorithm. Compared to more modern compression utilities such as gzip and bzip2, compress performs faster and with less memory usage, at the cost of a significantly lo ...
*
cut Cut may refer to: Common uses * The act of cutting, the separation of an object into two through acutely-directed force ** A type of wound ** Cut (archaeology), a hole dug in the past ** Cut (clothing), the style or shape of a garment ** Cut (ea ...
* expand * fold *
grep grep is a command-line utility for searching plain-text data sets for lines that match a regular expression. Its name comes from the ed command ''g/re/p'' (''globally search for a regular expression and print matching lines''), which has the sa ...
* head * nl * paste *
perl Perl is a family of two high-level, general-purpose, interpreted, dynamic programming languages. "Perl" refers to Perl 5, but from 2000 to 2019 it also referred to its redesigned "sister language", Perl 6, before the latter's name was offic ...
* pr *
sed sed ("stream editor") is a Unix utility that parses and transforms text, using a simple, compact programming language. It was developed from 1973 to 1974 by Lee E. McMahon of Bell Labs, and is available today for most operating systems. sed w ...
* sh *
sort Sort may refer to: * Sorting, any process of arranging items in sequence or in sets ** Sorting algorithm, any algorithm for arranging elements in lists ** Sort (Unix), a Unix utility which sorts the lines of a file ** Sort (C++), a function in the ...
*
split Split(s) or The Split may refer to: Places * Split, Croatia, the largest coastal city in Croatia * Split Island, Canada, an island in the Hudson Bay * Split Island, Falkland Islands * Split Island, Fiji, better known as Hạfliua Arts, entertai ...
*
strings String or strings may refer to: *String (structure), a long flexible structure made from threads twisted together, which is used to tie, bind, or hang other objects Arts, entertainment, and media Films * ''Strings'' (1991 film), a Canadian anim ...
* tac *
tail The tail is the section at the rear end of certain kinds of animals’ bodies; in general, the term refers to a distinct, flexible appendage to the torso. It is the part of the body that corresponds roughly to the sacrum and coccyx in mammal ...
*
tee A tee is a stand used in sport to support and elevate a stationary ball prior to striking with a foot, club or bat. Tees are used extensively in golf, tee-ball, baseball, American football, and rugby. Etymology The word tee is derived from the ...
* tr *
uniq uniq is a utility command on Unix, Plan 9, Inferno, and Unix-like operating systems which, when fed a text file or standard input, outputs the text with adjacent identical lines collapsed to one, unique line of text. Overview The command is ...
* wc * zcat


DOS

Two standard filters from the early days of DOS-based computers are
find Find, FIND or Finding may refer to: Computing * find (Unix), a command on UNIX platforms * find (Windows), a command on DOS/Windows platforms Books * ''The Find'' (2010), by Kathy Page * ''The Find'' (2014), by William Hope Hodgson Film and t ...
and
sort Sort may refer to: * Sorting, any process of arranging items in sequence or in sets ** Sorting algorithm, any algorithm for arranging elements in lists ** Sort (Unix), a Unix utility which sorts the lines of a file ** Sort (C++), a function in the ...
. Examples: find "keyword" < ''inputfilename'' > ''outputfilename'' sort "keyword" < ''inputfilename'' > ''outputfilename'' find /v "keyword" < ''inputfilename'' , sort > ''outputfilename'' Such filters may be used in
batch files Batch may refer to: Food and drink * Batch (alcohol), an alcoholic fruit beverage * Batch loaf, a type of bread popular in Ireland * A dialect term for a bread roll used in North Warwickshire, Nuneaton and Coventry, as well as on the Wirral ...
(*.bat, *.cmd etc.). For use in the same
command shell In computing, a shell is a computer program that exposes an operating system's services to a human user or other programs. In general, operating system shells use either a command-line interface (CLI) or graphical user interface (GUI), depending ...
environment, there are many more filters available than those built into Windows. Some of these are
freeware Freeware is software, most often proprietary, that is distributed at no monetary cost to the end user. There is no agreed-upon set of rights, license, or EULA that defines ''freeware'' unambiguously; every publisher defines its own rules for t ...
, some shareware and some are commercial programs. A number of these mimic the function and features of the filters in Unix. Some filtering programs have a
graphical user interface The GUI ( "UI" by itself is still usually pronounced . or ), graphical user interface, is a form of user interface that allows users to interact with electronic devices through graphical icons and audio indicator such as primary notation, inste ...
(GUI) to enable users to design a customized filter to suit their special data processing and/or data mining requirements.


Windows

Windows Command Prompt Command Prompt, also known as cmd.exe or cmd, is the default command-line interpreter for the OS/2, eComStation, ArcaOS, Microsoft Windows (Windows NT family and Windows CE family), and ReactOS operating systems. On Windows CE .NET 4.2, W ...
inherited MS-DOS commands, improved some and added a few. For example,
Windows Server 2003 Windows Server 2003 is the sixth version of Windows Server operating system produced by Microsoft. It is part of the Windows NT family of operating systems and was released to manufacturing on March 28, 2003 and generally available on April 24, 2 ...
features six command-line filters for modifying
Active Directory Active Directory (AD) is a directory service developed by Microsoft for Windows domain networks. It is included in most Windows Server operating systems as a set of Process (computing), processes and Windows service, services. Initially, Active D ...
that can be chained by piping: DSAdd, DSGet, DSMod, DSMove, DSRm and DSQuery.
Windows PowerShell PowerShell is a task automation and configuration management program from Microsoft, consisting of a command-line shell and the associated scripting language. Initially a Windows component only, known as Windows PowerShell, it was made open-so ...
adds an entire host of filters known as "cmdlets" which can be chained together with a pipe, except a few simple ones, e.g. Clear-Screen. The following example gets a list of files in the C:\Windows folder, gets the size of each and sorts the size in ascending order. It shows how three filters (Get-ChildItem, ForEach-Object and Sort-Object) are chained with pipes. Get-ChildItem C:\Windows , ForEach-Object , Sort-Object -Ascending


References

{{Reflist


External links

* http://www.webopedia.com/TERM/f/filter.html Software design patterns Programming paradigms Operating system technology