
printf is a C standard library function that formats text and writes it to standard output. The function accepts a format C-string argument and a variable number of value arguments that the function serializes per the format string. A mismatch between the format specifiers and the count and type of the values results in undefined behavior, and possibly a program crash or other vulnerability.

The format string is encoded as a template language consisting of verbatim text and ''format specifiers'' that each specify how to serialize a value. As the format string is processed left-to-right, a subsequent value is used for each format specifier found. A format specifier starts with a % character and has one or more following characters that specify how to serialize a value.

The standard library provides other, similar functions that form a family of ''printf-like'' functions. The functions share the same formatting capabilities but provide different behavior, such as output to a different destination or safety measures that limit exposure to vulnerabilities. Functions of the printf family have been implemented in other programming contexts (i.e. languages) with the same or similar syntax and semantics.

The scanf C standard library function complements printf by providing formatted input (a.k.a. lexing, a.k.a. parsing) via a similar format string syntax.

The name, ''printf'', is short for ''print formatted'', where ''print'' refers to output to a printer, although the function is not limited to printer output. Today, print refers to output to any text-based environment such as a terminal or a file.


History


1950s: Fortran

Early programming languages like Fortran used special statements, with a syntax different from that of other calculations, to build formatting descriptions. In this example, the format is specified on line 601, and the PRINT command refers to it by line number:

    PRINT 601, IA, IB, AREA
    601 FORMAT (4H A= ,I5,5H B= ,I5,8H AREA= ,F10.2, 13H SQUARE UNITS)

Hereby:
* 4H indicates a string of 4 characters " A= " (H means Hollerith Field);
* I5 indicates an integer field of width 5;
* F10.2 indicates a floating-point field of width 10 with 2 digits after the decimal point.

An output with input arguments 100, 200, and 1500.25 might look like this:

    A= 100 B= 200 AREA= 1500.25 SQUARE UNITS


1960s: BCPL and ALGOL 68

In 1967, BCPL appeared. Its library included the writef routine. An example application looks like this:

    WRITEF("%I2-QUEENS PROBLEM HAS %I5 SOLUTIONS*N", NUMQUEENS, COUNT)

Hereby:
* %I2 indicates an integer of width 2 (the order of the format specification's field width and type is reversed compared to C's printf);
* %I5 indicates an integer of width 5;
* *N is a BCPL ''language'' escape sequence representing a newline character (for which C uses the escape sequence \n).

In 1968, ALGOL 68 had a more function-like API, but still used special syntax (the $ delimiters surround special formatting syntax):

    printf(($"Color "g", number1 "6d,", number2 "4zd,", hex "16r2d,", float "-d.2d,", unsigned value"-3d"."l$, "red", 123456, 89, BIN 255, 3.14, 250));

In contrast to Fortran, using normal function calls and data types simplifies the language and compiler, and allows the implementation of the input/output to be written in the same language. These advantages were thought to outweigh the disadvantages (such as a complete lack of type safety in many instances) up until the 2000s, and in most newer languages of that era I/O is not part of the syntax.

People have since learned that this lack of type safety can have serious consequences, ranging from security exploits to hardware failures (e.g., a phone's networking capabilities being permanently disabled after trying to connect to an access point named "%p%s%s%s%s%n"). Modern languages, such as C++20 and later, tend to include format specifications as a part of the language syntax, which restores type safety in formatting to an extent and allows the compiler to detect some invalid combinations of format specifiers and data types at compile time.


1970s: C

In 1973, printf was included as a C standard library routine as part of Version 4 Unix.


1990s: Shell command

In 1990, the printf shell command, modeled after the C standard library function, was included with 4.3BSD-Reno. In 1991, a printf command was included with GNU shellutils (now part of GNU Core Utilities).


2000s: -Wformat safety

The need to do something about the range of problems resulting from lack of type safety has prompted attempts to make the C++ compiler printf-aware. The -Wformat option of GCC allows compile-time checks of printf calls, enabling the compiler to detect a subset of invalid calls (and issue either a warning or an error, stopping the compilation altogether, depending on other flags). Since the compiler is inspecting printf format specifiers, enabling this effectively extends the C++ syntax by making formatting a part of it.


2020s: std::print

To address usability issues with the existing C++ input/output support, as well as to avoid the safety issues of printf, the C++ standard library was revised to support a new type-safe formatting facility starting with C++20. The approach of std::format resulted from incorporating Victor Zverovich's libfmt API into the language specification (Zverovich wrote the first draft of the new format proposal); consequently, libfmt is an implementation of the C++20 format specification. In C++23, another function, std::print, was introduced that combines formatting and outputting and therefore is a functional replacement for printf. As the format specification has become a part of the language syntax, a C++ compiler is able to prevent invalid combinations of types and format specifiers in many cases. Unlike the -Wformat option, this is not an optional feature.

The format specification of libfmt and std::format is, in itself, an extensible "mini-language" (referred to as such in the specification), an example of a domain-specific language. As such, std::print completes a historical cycle, bringing the state of the art (as of 2024) back to what it was in the case of Fortran's first implementation in the 1950s.


Format specifier

Formatting of a value is specified as markup in the format string. For example, the following outputs "Your age is " and then the value of the variable age in decimal format:

    printf("Your age is %d", age);


Syntax

The syntax for a format specifier is:

    %[parameter][flags][width][.precision][length]type


Parameter field

The parameter field is optional. If included, then matching specifiers to values is ''not'' sequential: the field has the form n$, and the numeric value n selects the n-th value parameter. This is a POSIX extension; not C99. This field allows for using the same value multiple times in a format string instead of having to pass the value multiple times. If a specifier includes this field, then subsequent specifiers must also include it. For example,

    printf("%2$d %2$#x; %1$d %1$#x",16,17);

outputs:

    17 0x11; 16 0x10

This field is particularly useful for localizing messages to different natural languages that use different word orders. In Windows API, support for this feature is via a different function, _printf_p.
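As a hedged illustration of the localization use case, the sketch below produces the same three values in two different orders by switching only the format string. The helper name format_date and its parameters are hypothetical, and the code assumes a C library that implements the POSIX positional-parameter extension (such as glibc); it is not portable ISO C.

```c
#include <stdio.h>

/* Hypothetical helper: formats a date with positional (n$) specifiers.
   Positional parameters are a POSIX extension, so this assumes a
   POSIX-conforming C library. */
int format_date(char *buf, size_t size, int day_first,
                int year, int month, int day)
{
    /* Both templates consume the same argument list; only the order
       in which the values appear in the output differs. */
    const char *fmt = day_first ? "%3$02d.%2$02d.%1$d"
                                : "%1$d-%2$02d-%3$02d";
    return snprintf(buf, size, fmt, year, month, day);
}
```

Called with year 2024, month 3 and day 9, the year-first template yields 2024-03-09 and the day-first template yields 09.03.2024, without reordering the arguments at the call site.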


Flags field

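The flags field can be zero or more of the following (in any order):
* - (minus): left-align the output within the field width; the default is right-alignment;
* + (plus): always prefix a signed numeric value with a sign, using + for non-negative values;
* (space): prefix a non-negative signed numeric value with a space; ignored if the + flag is also given;
* 0 (zero): when a width is specified, pad numeric values with leading zeros instead of spaces;
* # (hash): use the alternate form, e.g. a 0x prefix for the x type and a 0 prefix for the o type.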


Width field

The width field specifies the minimum number of characters to output. If the value can be represented in fewer characters, then the value is left-padded with spaces so that the output is the number of characters specified. If the value requires more characters, then the output is longer than the specified width. A value is never truncated.

For example, printf("%3d", 12) specifies a width of 3 and outputs 12 with a space on the left to output 3 characters. The call printf("%3d", 1234) outputs 1234, which is 4 characters long since that is the minimum width for that value even though the width specified is 3.

If the width field is omitted, the output is the minimum number of characters for the value.

If the field is specified as *, then the width value is read from the list of values in the call. For example, printf("%*d", 3, 10) outputs 10 (preceded by one space), where the second parameter, 3, is the width (matches with *) and 10 is the value to serialize (matches with d).

Though not part of the width field, a leading zero is interpreted as the zero-padding flag (0), and a negative value passed for * is treated as the positive value in conjunction with the left-alignment flag (-).

The width field can be used to format values as a table (tabulated output). But columns do not align if any value is larger than fits in the width specified. For example, notice that the last line value (1234) does not fit in the first column of width 3 and therefore the column is not aligned.

      1   1
     12  12
    123 123
    1234 123
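The width rules can be checked with a minimal sketch such as the following. The helper format_cell is a hypothetical name introduced for illustration; it uses snprintf so that the result can be inspected as a string, and the brackets only make the padding visible.

```c
#include <stdio.h>

/* Hypothetical helper: places value in a column whose width is
   supplied at run time through the '*' width field. */
int format_cell(char *buf, size_t size, int width, int value)
{
    return snprintf(buf, size, "[%*d]", width, value);
}
```

With a width of 5, the value 42 is padded on the left to give [   42]; with a width of 3, the value 1234 is not truncated and gives [1234]; with a width of -5, the value 42 is left-aligned and gives [42   ].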


Precision field

The precision field usually specifies a limit on the output, depending on the particular formatting type. For floating-point numeric types, it specifies the number of digits to the right of the decimal point to which the output should be rounded; for the g and G types, it specifies the total number of significant digits (before and after the decimal point, not including leading or trailing zeroes) to round to. For the string type, it limits the number of characters that should be output, after which the string is truncated. The precision field may be omitted, be a numeric integer value, or be a dynamic value passed as another argument when indicated by an asterisk (*). For example, printf("%.*s", 3, "abcdef") outputs abc.
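The three meanings of the precision field can be compared side by side in a short sketch. The helper names below are hypothetical; each passes the precision dynamically through the asterisk form and writes into a caller-supplied buffer.

```c
#include <stdio.h>

/* Precision as digits after the decimal point (f type). */
int format_fixed(char *buf, size_t size, double value, int digits)
{
    return snprintf(buf, size, "%.*f", digits, value);
}

/* Precision as total significant digits (g type). */
int format_significant(char *buf, size_t size, double value, int digits)
{
    return snprintf(buf, size, "%.*g", digits, value);
}

/* Precision as a maximum character count (s type). */
int format_clipped(char *buf, size_t size, const char *text, int max_chars)
{
    return snprintf(buf, size, "%.*s", max_chars, text);
}
```

For instance, format_fixed with 3.14159 and 2 digits gives 3.14, format_significant with 1234.5678 gives 1234.57 for 6 digits and 1.23e+03 for 3 digits, and format_clipped with "abcdef" and 3 characters gives abc.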


Length field

The length field can be omitted or be any of the standard length modifiers hh, h, l, ll, L, z, j, and t, which tell printf the size of the corresponding argument (for example, l for a long and z for a size_t). Platform-specific length options came to exist prior to widespread use of the ISO C99 extensions, including I32 and I64 (Microsoft C runtime) and q (BSD).

ISO C99 includes the inttypes.h header file, which defines a number of macros for platform-independent coding. For example, printf("%" PRId64, t); specifies decimal format for a 64-bit signed integer. Since the macros evaluate to a string literal, and the compiler concatenates adjacent string literals, the expression "%" PRId64 compiles to a single string. Macros include PRId32, PRIu32, PRIx32, PRId64, PRIu64, and PRIx64.
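A minimal sketch of the inttypes.h approach follows. The helper names format_i64 and format_hex32 are hypothetical; the PRId64 and PRIx32 macros are the standard C99 ones.

```c
#include <inttypes.h>
#include <stdio.h>

/* "%" PRId64 is concatenated by the compiler into a single literal,
   for example "%ld" or "%lld" depending on the platform. */
int format_i64(char *buf, size_t size, int64_t value)
{
    return snprintf(buf, size, "%" PRId64, value);
}

/* PRIx32 selects lowercase hexadecimal for a 32-bit unsigned value. */
int format_hex32(char *buf, size_t size, uint32_t value)
{
    return snprintf(buf, size, "%" PRIx32, value);
}
```

The same source then formats INT64_C(-9000000000) as -9000000000 and UINT32_C(0xDEADBEEF) as deadbeef, regardless of whether long is 32 or 64 bits wide on the target.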


Type field

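The type field can be any of:
* %: outputs a literal % character; this type accepts no flags, width, precision, or length fields;
* d, i: signed int in decimal;
* u: unsigned int in decimal;
* f, F: double in fixed-point notation;
* e, E: double in exponential notation;
* g, G: double in either fixed-point or exponential notation, whichever suits the magnitude of the value;
* x, X: unsigned int in hexadecimal, in lower or upper case;
* o: unsigned int in octal;
* s: null-terminated string;
* c: a single character;
* p: a void* pointer in an implementation-defined format;
* a, A: double in hexadecimal notation;
* n: outputs nothing, but stores the number of characters written so far into the int pointed to by the argument.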


Custom data type formatting

A common way to handle formatting with a custom data type is to format the custom data type value into a string, then use the %s specifier to include the serialized value in a larger message.

Some printf-like functions allow extensions to the escape-character-based mini-language, thus allowing the programmer to use a specific formatting function for non-builtin types. One is the (now deprecated) glibc register_printf_function(). However, it is rarely used because it conflicts with static format string checking. Another is Vstr custom formatters, which allows adding multi-character format names.

Some applications (like the Apache HTTP Server) include their own printf-like function, and embed extensions into it. However, these all tend to have the same problems that register_printf_function() has.

The Linux kernel printk function supports a number of ways to display kernel structures using the generic %p specification, by appending additional format characters. For example, %pI4 prints an IPv4 address in dotted-decimal form. This allows static format string checking (of the %p portion) at the expense of full compatibility with normal printf.
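The first approach described in this section, serializing the custom value to a string and then embedding it with %s, might be sketched as follows. The struct point type and both helper functions are hypothetical names introduced only for illustration.

```c
#include <stdio.h>

/* Hypothetical custom data type. */
struct point {
    int x;
    int y;
};

/* Step 1: serialize the custom value into a string. */
int point_to_string(char *buf, size_t size, struct point p)
{
    return snprintf(buf, size, "(%d, %d)", p.x, p.y);
}

/* Step 2: embed the serialized value in a larger message with %s. */
int describe_point(char *out, size_t size, const char *label, struct point p)
{
    char tmp[32]; /* large enough for two ints plus punctuation */
    point_to_string(tmp, sizeof tmp, p);
    return snprintf(out, size, "%s is at %s", label, tmp);
}
```

This keeps every format string a literal that the compiler can check, at the cost of an intermediate buffer per custom value.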


Vulnerabilities


Format string attack

Extra value arguments are ignored, but if the format string has more format specifiers than value arguments passed, the behavior is undefined. For some C compilers, an extra format specifier results in consuming a value even though there isn't one, which allows the format string attack. Generally, for C, arguments are passed on the stack. If too few arguments are passed, then printf can read past the end of the stack frame, thus allowing an attacker to read the stack.

Some compilers, like the GNU Compiler Collection, will statically check the format strings of printf-like functions and warn about problems (when using the flags -Wall or -Wformat). GCC will also warn about user-defined printf-style functions if the non-standard "format" __attribute__ is applied to the function.
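As a sketch of the last point, a user-defined printf-style function can be annotated so that GCC (and compilers that emulate it) check its calls. The function log_message and the macro PRINTF_LIKE are hypothetical names; the format attribute itself is the non-standard GCC extension, so it is guarded and compiles away elsewhere.

```c
#include <stdarg.h>
#include <stdio.h>

/* The attribute is a compiler extension, so define it away when the
   compiler is not GCC-compatible. */
#if defined(__GNUC__)
#define PRINTF_LIKE(fmt_idx, args_idx) \
    __attribute__((format(printf, fmt_idx, args_idx)))
#else
#define PRINTF_LIKE(fmt_idx, args_idx)
#endif

/* Hypothetical logger: parameter 3 is the format string and the
   values to check start at parameter 4. */
int log_message(char *buf, size_t size, const char *fmt, ...)
    PRINTF_LIKE(3, 4);

int log_message(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    int prefix = snprintf(buf, size, "[log] ");
    if (prefix < 0 || (size_t)prefix >= size)
        return prefix; /* error, or no room left for the message */
    va_start(args, fmt);
    int n = vsnprintf(buf + prefix, size - (size_t)prefix, fmt, args);
    va_end(args);
    return n < 0 ? n : prefix + n;
}
```

With the annotation in place and -Wformat enabled, a call such as log_message(buf, sizeof buf, "%s", 42) is expected to draw a warning, just as the equivalent printf call would.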


Uncontrolled format string exploit

The format string is often a string literal, which allows static analysis of the function call. However, the format string can be the value of a variable, which allows for dynamic formatting but also a security vulnerability known as an uncontrolled format string exploit.
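A common mitigation, sketched here under the assumption that the untrusted text only needs to be output verbatim, is to pass it as a value for a literal "%s" format rather than as the format string itself. The helper name echo_safely is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: copies untrusted text into buf without ever
   interpreting it as a format string. */
int echo_safely(char *buf, size_t size, const char *user_text)
{
    /* Unsafe alternative (do not use): snprintf(buf, size, user_text);
       any '%' in user_text would then be treated as a specifier. */
    return snprintf(buf, size, "%s", user_text);
}
```

Input such as "100%n %s %x" is then reproduced character for character instead of being interpreted.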


Memory write

Although an output function on the surface, printf allows writing to a memory location specified by an argument via %n. This functionality is occasionally used as a part of more elaborate format-string attacks.

The %n functionality also makes printf accidentally Turing-complete even with a well-formed set of arguments. A game of tic-tac-toe written in the format string is a winner of the 27th IOCCC.
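The memory write can be observed in a benign setting with a sketch like the one below. The helper name is hypothetical, and the example assumes a C library that honors %n for literal format strings; some platforms restrict or disable the specifier for security reasons.

```c
#include <stdio.h>

/* Hypothetical helper: returns how many characters precede the '!'. */
int greeting_prefix_length(char *buf, size_t size, const char *name)
{
    int written = 0;
    /* %n prints nothing; it stores the number of characters produced
       so far into the int that the matching argument points to. */
    snprintf(buf, size, "Hello, %s%n!", name, &written);
    return written;
}
```

For the name "Ada" the buffer receives Hello, Ada! and the function returns 10, the length of the text before the exclamation mark. An attacker who controls the format string can use the same mechanism to write to an address of their choosing.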


Related functions


Family

Variants of printf in the C standard library include:
* fprintf outputs to a file instead of standard output.
* sprintf writes to a string buffer instead of standard output.
* snprintf provides a level of safety over sprintf since the caller provides a length ''n'' that is the length of the output buffer in bytes (including space for the trailing nul).
* asprintf provides for safety by accepting a string handle (char**) argument. The function allocates a buffer of sufficient size to contain the formatted text and outputs the buffer via the handle. It originated as a GNU and BSD extension rather than as part of ISO C.

For each function of the family, including printf, there is also a variant that accepts a single va_list argument rather than a variable list of arguments. Typically, these variants start with "v". For example: vprintf, vfprintf, vsprintf.

Generally, printf-like functions return the number of bytes output, or a negative value (commonly -1) to indicate failure.
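The allocate-then-format behavior described for asprintf can be approximated with standard C99 calls only, as in this sketch. The function format_alloc is a hypothetical name, not a library routine; it relies on vsnprintf returning the required length when given a zero-size buffer.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a malloc'd string holding the
   formatted text, or NULL on failure. The caller frees the result. */
char *format_alloc(const char *fmt, ...)
{
    va_list args, copy;
    char *result = NULL;

    va_start(args, fmt);
    va_copy(copy, args); /* the list is traversed twice */

    /* First pass: measure the output without writing anything. */
    int needed = vsnprintf(NULL, 0, fmt, copy);
    va_end(copy);

    if (needed >= 0) {
        result = malloc((size_t)needed + 1);
        if (result != NULL) {
            /* Second pass: format into the exactly sized buffer. */
            vsnprintf(result, (size_t)needed + 1, fmt, args);
        }
    }
    va_end(args);
    return result;
}
```

A call such as format_alloc("%s-%03d", "id", 7) returns the heap-allocated string id-007, which the caller releases with free.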


Other contexts

The following list includes notable programming languages that provide (directly or via a standard library) functionality that is the same or similar to the C printf-like functions. Excluded are languages that use format strings that deviate from the style in this article (such as AMPL and Elixir), languages that inherit their implementation from the JVM or other environment (such as Clojure and Scala), and languages that do not have a standard native printf implementation but have external libraries which emulate printf behavior (such as JavaScript).
* awk
* C
* C++
* D
* F#
* G
* GNU MathProg
* GNU Octave
* Go
* Haskell
* J
* Java (since version 1.5) and JVM languages
* Julia (via Printf standard library)
* Lua (string.format)
* Maple
* MATLAB
* Max (via the sprintf object)
* Objective-C
* OCaml (via the Printf module)
* PARI/GP
* Perl
* PHP
* Python (via % operator)
* R
* Raku (via printf, sprintf, and fmt)
* Red/System
* Ruby
* Tcl (via format command)
* Transact-SQL (via xp_sprintf)
* Vala (via print() and FileStream.printf())


See also

* "Hello, World!" program: a basic example program first featured in ''The C Programming Language'' (the "K&R Book"), which in the C example uses printf to output the message "Hello, World!"

