Source-code Formatter

Pretty-printing (or prettyprinting) is the application of any of various stylistic formatting conventions to text files, such as source code, markup, and similar kinds of content. These formatting conventions may entail adhering to an indentation style, using different color and typeface to highlight syntactic elements of source code, or adjusting size, to make the content easier for people to read and understand. Pretty-printers for source code are sometimes called code formatters or beautifiers.


Pretty-printing mathematics

Pretty-printing usually refers to displaying mathematical expressions similar to the way they would be typeset professionally. For example, in computer algebra systems such as Maxima or Mathematica the system may display a result in two-dimensional, textbook-like notation rather than as a single line of characters.

Some graphing calculators can perform pretty-printing, such as the Casio 9860 series, HP-49/50 series and HP Prime, TI-84 Plus, TI-89, and TI-Nspire, the TI-83 Plus with the PrettyPt add-on, or the TI-84 Plus with the same add-on or the "MathPrint"-enabled OSes. Additionally, a number of newer scientific calculators are equipped with dot matrix screens capable of pretty-printing, such as the Casio FX-ES series (Natural Display), Sharp EL-W series (WriteView), HP SmartCalc 300s, TI-30XB, and Numworks.

Many text formatting programs can also typeset mathematics: TeX was developed specifically for high-quality mathematical typesetting.
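
The idea of laying out an expression the way it would be typeset can be approximated even on a character display. The following C sketch stacks a numerator over a denominator with a rule between them. The function name render_fraction and its parameters are hypothetical, chosen for illustration; this is not how Maxima, Mathematica, or any calculator implements its display, and it handles only short, single-line operands.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write "num over den" as three text lines into dst.
 * Returns 0 on success, -1 if an operand is too long for this sketch,
 * both operands are empty, or dst is too small. */
static int render_fraction(const char *num, const char *den,
                           char *dst, size_t cap)
{
    char bar[64];
    size_t nl = strlen(num), dl = strlen(den);
    size_t width = nl > dl ? nl : dl;   /* the rule spans the wider operand */

    if (width == 0 || width >= sizeof bar)
        return -1;

    memset(bar, '-', width);
    bar[width] = '\0';

    /* "%*s" with an empty string prints only the left padding that
     * centers the narrower operand over or under the rule. */
    int written = snprintf(dst, cap, "%*s%s\n%s\n%*s%s\n",
                           (int)((width - nl) / 2), "", num,
                           bar,
                           (int)((width - dl) / 2), "", den);
    return (written >= 0 && (size_t)written < cap) ? 0 : -1;
}
```

Called with "x + 1" and "2", the sketch writes the numerator, a five-character rule, and the 2 centered beneath it. A real pretty-printer would apply the same kind of box layout recursively to exponents, roots, and nested fractions.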


Pretty-printing markup and tag-based code

Pretty-printing in markup language instances is most typically associated with indentation of tags and string content to visually determine hierarchy and nesting. Although the syntactical structures of tag-based languages do not significantly vary, the indentation may vary significantly due to how a markup language is interpreted or due to the data it describes.

In MathML, whitespace characters do not reflect data, meaning, or syntax above what is required by XML syntax. In HTML, whitespace characters between tags are considered text and are parsed as text nodes into the parsed result. While indentation may be generously applied to a MathML document, additional care must be taken in pretty-printing an HTML document to ensure additional text nodes are not created or destroyed in general proximity to the content or content-reflective tag elements.

This difference in complexity is non-trivial from the perspective of an automated pretty-print operation. The simpler MathML case needs no special rules or edge cases, whereas the HTML case may require a series of progressive, interrelated algorithms to account for various patterns of tag elements and content, so that the result conforms to a uniform style and is applied consistently across instances. An example is the markup.ts application component that the Pretty Diff tool uses to beautify HTML, XML, and related technologies.
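
The simpler case, where whitespace between tags carries no meaning, can be sketched in a few lines of C. The function below puts every tag and every non-blank text run on its own line, indented by nesting depth. The names indent_markup and emit_line are hypothetical, and the sketch assumes well-formed input in which no attribute value or comment contains a '>' character. It is not the algorithm used by Pretty Diff, and applying it to HTML would introduce exactly the extra whitespace text nodes described above.

```c
#include <ctype.h>
#include <string.h>

/* Append len bytes of s to dst as one line indented by level * width spaces. */
static size_t emit_line(char *dst, size_t n, size_t cap,
                        const char *s, size_t len, int level, int width)
{
    for (int i = 0; i < level * width && n + 1 < cap; i++)
        dst[n++] = ' ';
    for (size_t i = 0; i < len && n + 1 < cap; i++)
        dst[n++] = s[i];
    if (n + 1 < cap)
        dst[n++] = '\n';
    return n;
}

/* Hypothetical helper: indent whitespace-insensitive markup (cap must be >= 1). */
static void indent_markup(const char *src, char *dst, size_t cap, int width)
{
    int depth = 0;
    size_t n = 0;

    while (*src != '\0') {
        if (*src == '<') {
            const char *gt = strchr(src, '>');
            if (gt == NULL)
                break;                              /* malformed input: stop */
            size_t len = (size_t)(gt - src) + 1;
            int closing = (src[1] == '/');          /* </tag>  */
            int self_closing = (gt[-1] == '/');     /* <tag/>  */
            int special = (src[1] == '?' || src[1] == '!'); /* <?xml ..?>, <!-- --> */

            if (closing && depth > 0)
                depth--;                            /* closing tag lines up with its opener */
            n = emit_line(dst, n, cap, src, len, depth, width);
            if (!closing && !self_closing && !special)
                depth++;                            /* children go one level deeper */
            src = gt + 1;
        } else {
            /* Text run up to the next tag, trimmed of surrounding whitespace. */
            const char *lt = strchr(src, '<');
            const char *end = lt ? lt : src + strlen(src);
            const char *a = src, *b = end;
            while (a < b && isspace((unsigned char)*a))
                a++;
            while (b > a && isspace((unsigned char)b[-1]))
                b--;
            if (b > a)
                n = emit_line(dst, n, cap, a, (size_t)(b - a), depth, width);
            src = end;
        }
    }
    dst[n] = '\0';
}
```

Trimming and re-breaking text runs is acceptable here only because of the stated assumption that whitespace is insignificant. An HTML-aware formatter would instead have to decide, element by element, where a line break can be added without changing the parsed result.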


Programming code formatting

Programmers often use tools to format programming language source code in a particular manner. Proper formatting makes code easier to read and understand. Different programmers often prefer different styles of formatting, such as the use of code indentation and whitespace or positioning of braces. A code formatter or code indenter converts source code from one format style to another. This is relatively straightforward because of the unambiguous syntax of programming languages. Code beautification involves parsing the source code into component structures, such as assignment statements, if blocks, loops, etc. (see also control flow), and formatting them in a manner specified by the user in a configuration file.

Code beautifiers exist as standalone applications and built into text editors and integrated development environments. For example, Emacs' various language modes can correctly indent blocks of code attractively.
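
As a rough illustration of the indentation step alone, the following C sketch re-indents brace-delimited code by counting braces line by line. The function name reindent and its parameters are hypothetical. Unlike a real beautifier it does not parse the source into statements, so braces inside string literals, character constants, or comments would be miscounted, and it never moves a brace from one line to another.

```c
#include <string.h>

/* Hypothetical helper: rewrite src into dst with each line indented by
 * (brace depth * width) spaces. cap is the size of dst and must be >= 1. */
static void reindent(const char *src, char *dst, size_t cap, int width)
{
    int depth = 0;
    size_t n = 0;

    while (*src != '\0') {
        /* Drop whatever indentation the line already has. */
        while (*src == ' ' || *src == '\t')
            src++;

        const char *end = strchr(src, '\n');
        size_t len = end ? (size_t)(end - src) : strlen(src);

        /* A line that starts with '}' closes a block, so it is printed
         * one level further out than the lines inside that block. */
        int level = depth;
        if (len > 0 && src[0] == '}' && level > 0)
            level--;

        if (len > 0)                        /* blank lines stay empty */
            for (int i = 0; i < level * width && n + 1 < cap; i++)
                dst[n++] = ' ';

        for (size_t i = 0; i < len; i++) {
            if (src[i] == '{')
                depth++;
            else if (src[i] == '}' && depth > 0)
                depth--;
            if (n + 1 < cap)
                dst[n++] = src[i];
        }
        if (n + 1 < cap)
            dst[n++] = '\n';

        src += len;
        if (*src == '\n')
            src++;
    }
    dst[n] = '\0';
}
```

The width argument stands in for the kind of setting a user would normally place in a formatter's configuration file. Handling of line length, spacing around operators, and brace placement would each need real parsing rather than character counting.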



Lisp pretty-printer

An early example of pretty-printing was Bill Gosper's "GRINDEF" (i.e. 'grind function') program (c. 1967), which used combinatorial search with pruning to format LISP programs. Early versions operated on the executable (list structure) form of the Lisp program and were oblivious to the special meanings of various functions. Later versions had special read conventions for incorporating non-executable comments and also for preserving read macros in unexpanded form. They also allowed special indentation conventions for special functions such as if. The term "grind" was used in some Lisp circles as a synonym for pretty-printing.


Project style rules

Many open source projects have established rules for code layout. The most typical are the GNU formatting and the BSD style. The biggest difference between the two is the location of the braces: in the GNU style, opening and closing braces are on lines by themselves, with the same indent. BSD style places an opening brace at the end of the preceding line, and the closing braces can be followed by else. The size of indent and location of whitespace also differ.
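
The brace placement is easier to see than to describe. The following hand-formatted C sketch writes the same logic twice, once in each style. The function names classify_gnu and classify_bsd are hypothetical, the listing is not the output of any formatting tool, and eight spaces stand in for the tab that BSD code normally uses for indentation.

```c
#include <stdio.h>

/* GNU style: each brace sits on its own line, indented two spaces from
 * the statement it belongs to; the block body is indented two more.
 * A space separates a function name from its opening parenthesis. */
int
classify_gnu (int k)
{
  if (k < 1 || k > 2)
    {
      puts ("out of range");
      return -1;
    }
  else
    {
      puts ("in range");
      return k;
    }
}

/* BSD style: the opening brace ends the line that introduces the block,
 * and "else" follows the closing brace on the same line. */
int
classify_bsd(int k)
{
        if (k < 1 || k > 2) {
                puts("out of range");
                return -1;
        } else {
                puts("in range");
                return k;
        }
}
```

Both functions compile to the same behavior; only the layout differs, which is why a formatter can convert between the two styles mechanically.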


Example of formatting and beautifying code

The following example shows some typical C structures and how various indentation style rules format them. Without any formatting at all, it looks like this:

int foo(int k)

The GNU indent program produces the following output when asked to indent according to the GNU rules:

int foo (int k)

It produces this output when formatting according to BSD rules:

int foo(int k)


See also

Related concepts
* Elastic tabstop, a feature of many source code editors that detects and maintains aligned indents
* Minification, making source code compact, even if it becomes harder for humans to understand
* Obfuscation, deliberately making source code very difficult for humans to understand - especially as it becomes more convoluted

Utilities
* enscript, a text-to-PostScript converter, with pretty-printing features



External links


* Algorithm 268: ALGOL 60 reference language editor. William M. McKeeman: Commun. ACM 8(11): 667-668 (1965)
* lgrind. Comprehensive TeX Archive Network
* NEATER2: a PL/I source statement reformatter. Kenneth Conrow, Ronald G. Smith: Commun. ACM 13(11): 669-675 (1970)
* SOAP - Simplify Obscure Algol Programs. R. S. Scowen, D. Allin, A. L. Hillman, M. Shimell: National Physical Laboratory Central Computer Unit report CCU6 (April, 1969). Includes formatted listing of SOAP source code.
* SOAP - A Program which Documents and Edits ALGOL 60 Programs. R. S. Scowen, D. Allin, A. L. Hillman, M. Shimell: Comput. J. 14(2): 133-135 (1971)
* SOAP User's Guide (for Edinburgh IMP). Peter Salkeld Robertson (1976)
* SOAP Source Code in/for IMP9
* Soap80: A Program for Formatting IMP80 Source Programs. J.M. Murison, Edinburgh Regional Computer Center (1980)
* In/for IMP80. E. N. Gregory, University of Kent at Canterbury; Peter D. Stephens, Edinburgh Regional Computer Center
* PRETTYP.PAS. Early Pascal prettyprinter. Ledgard et al., Pascal With Style (1979)
* style(9). FreeBSD style guidelines
* The nixHeirloom Project
* Formatting your source code. GNU style guidelines