m4 is a general-purpose macro processor included in most Unix-like operating systems, and is a component of the POSIX standard.
The language was designed by Brian Kernighan and Dennis Ritchie for the original versions of UNIX. It is an extension of an earlier macro processor, m3, written by Ritchie for an unknown AP-3 minicomputer.[Brian W. Kernighan and Dennis M. Ritchie. The m4 macro processor. Technical report, Bell Laboratories, Murray Hill, New Jersey, USA, 1977]
The macro preprocessor operates as a text-replacement tool. It is employed to re-use text templates, typically in computer programming applications, but also in text editing and text-processing applications. Most users require m4 as a dependency of GNU autoconf.
History
Macro processors became popular when programmers commonly used assembly language. In those early days of programming, programmers noticed that much of their code consisted of repeated text, and they invented simple means for reusing it. They soon discovered the advantages not only of reusing entire blocks of text, but also of substituting different values for similar parameters. This defined the range of uses for macro processors.
In the 1960s, an early general-purpose macro processor, M6, developed by Douglas McIlroy, Robert Morris and Andrew Hall, was in use at AT&T Bell Laboratories.
Kernighan and Ritchie developed m4 in 1977, basing it on the ideas of Christopher Strachey. The distinguishing features of this style of macro preprocessing included:
* free-form syntax (not line-based like a typical macro preprocessor designed for assembly-language processing)
* the high degree of re-expansion (a macro's arguments get expanded twice: once during scanning and once at interpretation time)
The implementation of Rational Fortran used m4 as its macro engine from the beginning, and most Unix variants ship with it.
Many applications continue to use m4 as part of the GNU Project's autoconf. It also appears in the configuration process of sendmail (a widespread mail transfer agent) and is used for generating footprints in the gEDA toolsuite. The SELinux Reference Policy relies heavily on the m4 macro processor.
m4 has many uses in code generation, but (as with any macro processor) problems can be hard to debug.[Kenneth J. Turner. Exploiting the m4 macro language. Technical Report CSM-126, Department of Computing Science and Mathematics, University of Stirling, Scotland, September 1994]
Features
m4 offers these facilities:
* a free-form syntax, rather than line-based syntax
* a high degree of macro expansion (arguments get expanded during scan and again during interpretation)
* text replacement
* parameter substitution
* file inclusion
* string manipulation
* conditional evaluation
* arithmetic expressions
* system interface
* programmer diagnostics
* independence from any programming language
* independence from any human language
* programming-language capabilities
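A few of the facilities listed above can be sketched with standard built-ins (eval, ifelse, len, index, substr). This is a minimal illustration rather than an excerpt from any real library; the macro name MAX is hypothetical and chosen only for this example.

```m4
dnl Conditional evaluation combined with arithmetic:
dnl MAX expands to the larger of its two integer arguments.
define(`MAX', `ifelse(eval(`$1 > $2'), 1, `$1', `$2')')dnl
dnl Arithmetic expression:
eval(`6 * 7')
dnl Parameter substitution plus conditional evaluation:
MAX(3, 8)
dnl String manipulation: length, search, and substring extraction.
len(`macro')
index(`macro processor', `pro')
substr(`macro processor', 6, 9)
```

Run through m4, this sketch should print 42, 8, 5, 6 and processor, one per line. Every line that is not meant to appear in the output ends in or starts with dnl, which discards the rest of the line.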
Unlike most earlier macro processors, m4 does not target any particular computer or human language; historically, however, it was developed to support the Ratfor dialect of Fortran. Unlike some other macro processors, m4 is Turing-complete as well as a practical programming language.
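Much of m4's programming power comes from recursion: a macro may expand to text that calls the same macro again, with ifelse supplying the terminating condition. The following is a small sketch of that idiom; the macro names FACT and COUNTDOWN are hypothetical and used only for illustration.

```m4
dnl FACT(n): factorial of a non-negative integer, by recursion.
dnl The base case (n = 0) expands to 1; otherwise the expansion
dnl multiplies n by FACT(n - 1) using eval and decr.
define(`FACT', `ifelse(`$1', `0', `1', `eval($1 * FACT(decr($1)))')')dnl
dnl COUNTDOWN(n): emits n, n-1, ..., 1 and then the word liftoff.
define(`COUNTDOWN', `ifelse(`$1', `0', `liftoff', `$1 COUNTDOWN(decr($1))')')dnl
FACT(5)
COUNTDOWN(3)
```

With these definitions, the two calls should produce 120 and 3 2 1 liftoff. Because m4 has no loop construct of its own, looping macros in practice are generally built on this kind of recursive expansion.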
Unquoted identifiers that match defined macros are replaced with their definitions. Placing an identifier in quotes suppresses its expansion until possibly later, such as when a quoted string is expanded as part of macro replacement. Unlike most languages, m4 quotes strings using the backtick (`) as the starting delimiter and the apostrophe (') as the ending delimiter. Having separate starting and ending delimiters allows quotation marks to be nested arbitrarily within strings, which gives fine control over how and when macro expansion takes place in different parts of a string.
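The effect of each quoting level can be seen in a short sketch. The macro names NAME and SHOW are hypothetical; the point is that every pass of the scanner strips exactly one level of quotes.

```m4
define(`NAME', `m4')dnl
dnl Unquoted: the identifier is replaced by its definition.
NAME
dnl One level of quotes: the quotes are stripped, the text is not expanded.
`NAME'
dnl Two levels: only the outer pair is stripped.
``NAME''
dnl SHOW rescans its argument as part of its own expansion.
define(`SHOW', `$1 expands to NAME')dnl
dnl One level is consumed when the argument is collected, so $1 is
dnl the bare identifier and gets expanded during the rescan.
SHOW(`NAME')
dnl Two levels: one survives argument collection and protects $1
dnl during the rescan.
SHOW(``NAME'')
```

The expected output is:

```
m4
NAME
`NAME'
m4 expands to m4
NAME expands to m4
```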
Example
The following fragment gives a simple example that could form part of a library for generating HTML code. It defines a commented macro to number sections automatically:
divert(-1)
m4 has multiple output queues that can be manipulated with the
`divert' macro. Valid queues range from 0 to 9, inclusive, with
the default queue being 0. As an extension, GNU m4 supports more
diversions, limited only by integer type size.
Calling the `divert' macro with an invalid queue causes text to be
discarded until another call. Note that even while output is being
discarded, quotes around `divert' and other macros are needed to
prevent expansion.
# Macros aren't expanded within comments, meaning that keywords such
# as divert and other built-ins may be used without consequence.
# HTML utility macro:
define(`H2_COUNT', 0)
# The H2_COUNT macro is redefined every time the H2 macro is used:
define(`H2',
`define(`H2_COUNT', incr(H2_COUNT))<h2>H2_COUNT. $1</h2>')
divert(1)dnl
dnl
dnl The dnl macro causes m4 to discard the rest of the line, thus
dnl preventing unwanted blank lines from appearing in the output.
dnl
H2(First Section)
H2(Second Section)
H2(Conclusion)
dnl
divert(0)dnl
dnl
undivert(1)dnl One of the queues is being pushed to output.
Processing this code with m4 generates the following text:
<h2>1. First Section</h2>
<h2>2. Second Section</h2>
<h2>3. Conclusion</h2>
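The output queues used in the example can also be left to flush on their own. As a minimal sketch (the words being diverted are arbitrary placeholder text), the fragment below writes to two numbered diversions out of order and never calls undivert:

```m4
divert(2)dnl
second
divert(1)dnl
first
divert(0)dnl
zeroth
```

Text sent to diversion 0 is written immediately, and at the end of input m4 flushes any remaining diversions in numerical order, so the output should be zeroth, first, second, one per line. This makes diversions a convenient way to collect text in one order and emit it in another.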
Implementations
FreeBSD, NetBSD, and OpenBSD provide independent implementations of the m4 language. Furthermore, the Heirloom Project Development Tools includes a free version of the m4 language, derived from OpenSolaris.
m4 has been included in the Inferno operating system. This implementation is more closely related to the original m4 developed by Kernighan and Ritchie in Version 7 Unix than its more sophisticated relatives in UNIX System V and POSIX.
GNU m4 is an implementation of m4 for the GNU Project.[GNU m4 website, "GNU M4", accessed January 25, 2020][GNU m4 manual, online and for download in HTML, PDF, and other forms, accessed January 25, 2020] It is designed to avoid many kinds of arbitrary limits found in traditional m4 implementations, such as maximum line lengths, maximum size of a macro, and number of macros. Removing such arbitrary limits is one of the stated goals of the GNU Project.[quote: "Avoid arbitrary limits on the length or number of any data structure".]
The GNU Autoconf package makes extensive use of the features of GNU m4.
GNU m4 is currently maintained by Gary V. Vaughan and Eric Blake. Released under the terms of the GNU General Public License, GNU m4 is free software.
See also
* C preprocessor
* Macro (computer science)
* Make
* Template processor
* Web template system
External links
GNU m4 website
Macro Magic: m4, Part One and Part Two