Getopt is a C library function used to parse command-line options of the Unix/POSIX style. It is a part of the POSIX specification, and is universal to Unix-like systems.
It is also the name of a Unix program for parsing command line arguments in shell scripts.
History
A long-standing issue with command line programs was how to specify options; early programs used many ways of doing so, including single character options (-a), multiple options specified together (-abc is equivalent to -a -b -c), multicharacter options (-inum), options with arguments (-a arg, -inum 3, -a=arg), and different prefix characters (-a, +b, /c).
The getopt function was written to be a standard mechanism that all programs could use to parse command-line options so that there would be a common interface on which everyone could depend. To that end, the original authors selected three of these variations: single character options, multiple options specified together, and options with arguments (-a arg or -aarg), all controllable by an option string.
getopt dates back to at least 1980 and was first published by AT&T at the 1985 UNIFORUM conference in Dallas, Texas, with the intent for it to be available in the public domain. Versions of it were subsequently picked up by other flavors of Unix (4.3BSD, Linux, etc.). It is specified in the POSIX.2 standard as part of the unistd.h header file. Derivatives of getopt have been created for many programming languages to parse command-line options.
Extensions
getopt is a system dependent function, and its behavior depends on the implementation in the C library. Some custom implementations like gnulib are available, however.
The conventional (POSIX and BSD) handling is that the options end when the first non-option argument is encountered, and that getopt would return -1 to signal that. In the glibc extension, however, options are allowed ''anywhere'' for ease of use; getopt implicitly permutes the argument vector so that the non-options still end up at the end. Since POSIX already has the convention of returning -1 on -- and skipping it, one can always portably use it as an end-of-options signifier.
A GNU extension, getopt_long, allows parsing of more readable, multicharacter options, which are introduced by two dashes instead of one. The choice of two dashes allows multicharacter options (--inum) to be differentiated from single character options specified together (-abc). The GNU extension also allows an alternative format for options with arguments: --name=arg.
This interface proved popular, and has been taken up (sans the permutation) by many BSD distributions including FreeBSD as well as Solaris. An alternative way to support long options is seen in Solaris and Korn Shell (extending ''optstring''), but it was not as popular.
Another common advanced extension of getopt is resetting the state of argument parsing; this is useful as a replacement of the options-anywhere GNU extension, or as a way to "layer" a set of command-line interfaces with different options at different levels. This is achieved in BSD systems using an optreset variable, and on GNU systems by setting optind to 0.
A common companion function to getopt is getsubopt. It parses a string of comma-separated sub-options.
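As a rough sketch of how getsubopt is typically used, the following parses a string such as ro,size=10. The helper name parse_subopts and the sub-option names ro and size are assumptions for the example. Note that getsubopt modifies the string it scans, so the buffer must be writable.

```c
#define _XOPEN_SOURCE 700   /* getsubopt is an XSI function in <stdlib.h> */
#include <stdlib.h>

/* Hypothetical sub-option set for something like "-o ro,size=10". */
enum { SUB_RO, SUB_SIZE };

/* Parses a writable, comma-separated sub-option string.
 * Returns 0 on success, -1 on an unknown name or a missing value. */
int parse_subopts(char *subopts, int *read_only, long *size)
{
    char *const tokens[] = { [SUB_RO] = "ro", [SUB_SIZE] = "size", NULL };
    char *value;

    *read_only = 0;
    *size = 0;
    while (*subopts != '\0') {
        switch (getsubopt(&subopts, tokens, &value)) {
        case SUB_RO:
            *read_only = 1;
            break;
        case SUB_SIZE:
            if (value == NULL)
                return -1;      /* "size" needs "=N" */
            *size = strtol(value, NULL, 10);
            break;
        default:
            return -1;          /* name not listed in tokens */
        }
    }
    return 0;
}
```

In a getopt loop, the string handed to such a helper would usually be optarg for the option that carries the sub-options.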
Usage
For users
The command-line syntax for getopt-based programs is the POSIX-recommended Utility Argument Syntax. In short:
* Options are single-character alphanumerics preceded by a - (hyphen-minus) character.
* Options can take an argument, mandatory or optional, or none.
* In order to specify that an option takes an argument, include : after the option name (only in the program's option specification, not on the command line).
* When an option takes an argument, this can be in the same token or in the next one. In other words, if o takes an argument, -ofoo is the same as -o foo.
* Multiple options can be chained together, as long as the non-last ones are not argument taking. If a and b take no arguments while e takes an optional argument, -abe is the same as -a -b -e, but -bea is not the same as -b -e a due to the preceding rule.
* All options precede non-option arguments (except for in the GNU extension). -- always marks the end of options.
Extensions on the syntax include the GNU convention and Sun's specification.
For programmers
The getopt manual from GNU specifies such a usage for getopt:
#include <unistd.h>
int getopt(int argc, char * const argv[],
const char *optstring);
Here the argc and argv are defined exactly like they are in the C main function prototype; i.e., argc indicates the length of the argv array-of-strings. The optstring contains a specification of what options to look for (normal alphanumerals except W), and what options to accept arguments (colons). For example, "vf::o:" refers to three options: an argumentless v, an optional-argument f, and a mandatory-argument o. GNU here implements a W extension for long option synonyms.
getopt itself returns an integer that is either an option character or -1 for end-of-options. The idiom is to use a while-loop to go through options, and to use a switch-case statement to pick and act on options. See the example section of this article.
To communicate extra information back to the program, getopt sets a few global variables that the program can read:
extern char *optarg;
extern int optind, opterr, optopt;
; optarg: A pointer to the argument of the current option, if present.
; optind: Where getopt is currently looking at in argv. Can be used to control where to start parsing (again).
; opterr: A boolean switch controlling whether getopt should print error messages.
; optopt: If an unrecognized option occurs, the value of that unrecognized character.
The GNU extension interface is similar, although it belongs to a different header file and takes an extra option for defining the "short" names of long options and some extra controls. If a short name is not defined, getopt will put an index referring to the option structure in the longindex pointer instead.
#include <getopt.h>
int getopt_long(int argc, char * const argv[],
           const char *optstring,
           const struct option *longopts, int *longindex);
Examples
Using POSIX standard ''getopt''
#include <stdio.h>     /* for printf */
#include <stdlib.h>    /* for exit */
#include <unistd.h>    /* for getopt */
int main (int argc, char **argv)
Using GNU extension ''getopt_long''
#include <stdio.h>     /* for printf */
#include <stdlib.h>    /* for exit */
#include <getopt.h>    /* for getopt_long; POSIX standard getopt is in unistd.h */
int main (int argc, char **argv)
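The long-option variant follows the same loop, with a table of struct option entries added. The sketch below assumes two hypothetical options, -v/--verbose and -n/--name with a required argument; the names parse_long and struct settings are also invented for the example.

```c
#include <stddef.h>  /* for NULL */
#include <getopt.h>  /* for getopt_long, struct option */

/* Hypothetical settings filled in from the command line. */
struct settings {
    int verbose;
    const char *name;
};

/* Accepts -v/--verbose and -n ARG, --name ARG or --name=ARG.
 * Returns the index of the first non-option argument, or -1 on error. */
int parse_long(int argc, char **argv, struct settings *out)
{
    static const struct option longopts[] = {
        { "verbose", no_argument,       NULL, 'v' },
        { "name",    required_argument, NULL, 'n' },
        { NULL,      0,                 NULL, 0   }   /* terminator */
    };
    int c;

    out->verbose = 0;
    out->name = NULL;
    optind = 1;     /* only needed if called more than once */
    while ((c = getopt_long(argc, argv, "vn:", longopts, NULL)) != -1) {
        switch (c) {
        case 'v': out->verbose = 1; break;
        case 'n': out->name = optarg; break;
        default:  return -1;    /* unknown option or missing argument */
        }
    }
    return optind;
}
```

Because each long option here maps to a short letter through the last field of its table entry, the switch statement handles both spellings with the same case.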
In Shell
Shell script programmers commonly want a consistent way of accepting options. To achieve this goal, they turn to getopt and seek to port it to their own language.
The first attempt at porting was the program ''getopt'', implemented by Unix System Laboratories (USL). This version was unable to deal with quoting and shell metacharacters, as it makes no attempt at quoting. It has been inherited by FreeBSD.
In 1986, USL decided that being unsafe around metacharacters and whitespace was no longer acceptable, and they created the builtin getopts command for Unix SVR3 Bourne Shell instead. The advantage of building the command into the shell is that it now has access to the shell's variables, so values could be written safely without quoting. It uses the shell's own variables, OPTIND and OPTARG, to track the current position and the option argument, and returns the option name in a shell variable.
In 1995, getopts was included in the Single UNIX Specification version 1 / X/Open Portability Guidelines Issue 4. Now a part of the POSIX Shell standard, getopts has spread far and wide in many other shells trying to be POSIX-compliant.
''getopt'' was basically forgotten until util-linux came out with an enhanced version that fixed all of old getopt's problems by escaping. It also supports GNU's long option names. On the other hand, long options have been implemented rarely in the getopts command in other shells, ksh93 being an exception.
In other languages
''getopt'' is a concise description of the common POSIX command argument structure, and it is replicated widely by programmers seeking to provide a similar interface, both to themselves and to the user on the command-line.
* C: non-POSIX systems do not ship getopt in the C library, but gnulib and MinGW (both accept GNU-style), as well as some more minimal libraries, can be used to provide the functionality. Alternative interfaces also exist:
** The popt library, used by RPM package manager, has the additional advantage of being reentrant.
** The argp family of functions in glibc and gnulib provides some more convenience and modularity.
* D programming language: has a getopt module in the D standard library.
* Go: comes with the flag package, which allows long flag names. The getopt package supports processing closer to the C function. There is also another getopt package providing an interface much closer to the original POSIX one.
* Haskell: comes with System.Console.GetOpt, which is essentially a Haskell port of the GNU getopt library.
* Java: There is no implementation of getopt in the Java standard library. Several open source modules exist, including gnu.getopt.Getopt, which is ported from GNU getopt, and Apache Commons CLI.
* Lisp: has many different dialects with no common standard library. There are some third party implementations of getopt for some dialects of Lisp. Common Lisp has a prominent third party implementation.
* Free Pascal: has its own implementation as one of its standard units named GetOpts. It is supported on all platforms.
* Perl programming language: has two separate derivatives of getopt in its standard library: Getopt::Long and Getopt::Std.
* PHP: has a getopt function.
* Python: contains a module in its standard library based on C's getopt and GNU extensions. Python's standard library also contains other modules to parse options that are more convenient to use.
* Ruby: has an implementation of getopt_long in its standard library, GetoptLong. Ruby also has modules in its standard library with a more sophisticated and convenient interface. A third party implementation of the original getopt interface is available.
* .NET Framework: does not have getopt functionality in its standard library. Third-party implementations are available.
External links
POSIX specification
Full getopt port for Unicode and Multibyte Microsoft Visual C, C++, or MFC projects