simple markup language

A lightweight markup language (LML), also termed a simple or humane markup language, is a markup language with simple, unobtrusive syntax. It is designed to be easy to write using any generic text editor and easy to read in its raw form. Lightweight markup languages are used in applications where it may be necessary to read the raw document as well as the final rendered output. For instance, a person downloading a software library might prefer to read the documentation in a text editor rather than a web browser. Another application for such languages is to provide for data entry in web-based publishing, such as weblogs and wikis, where the input interface is a simple text box. The server software then converts the input into a common document markup language like HTML.


History

Lightweight markup languages were originally used on text-only displays which could not display characters in italics or bold, so informal methods to convey this information had to be developed. This formatting choice was naturally carried forth to plain-text email communications. Console browsers may also resort to similar display conventions. In 1986, the international standard SGML provided facilities to define and parse lightweight markup languages using grammars and tag implication. XML, published by the W3C in 1998, is a profile of SGML that omits these facilities. However, no SGML document type definition (DTD) for any of the languages listed below is known.


Types

Lightweight markup languages can be categorized by their tag types. Like HTML (<b>bold</b>), some languages use named elements that share a common format for start and end tags (e.g. BBCode [b]bold[/b]), whereas proper lightweight markup languages are restricted to ASCII-only punctuation marks and other non-letter symbols for tags. Some mix both styles (e.g. Textile bq. ) or allow embedded HTML (e.g. Markdown), possibly extended with custom elements (e.g. MediaWiki). Most languages distinguish between markup for lines or blocks and for shorter spans of text, but some only support inline markup.

Some markup languages are tailored for a specific purpose, such as documenting computer code (e.g. POD, reST, RD) or being converted to a certain output format (usually HTML or LaTeX) and nothing else; others are more general in application. Languages also differ in whether they are oriented toward textual presentation or toward data serialization.

Presentation oriented languages include AsciiDoc, atx, BBCode, Creole, Crossmark, Epytext, Haml, JsonML, MakeDoc, Markdown, Org-mode, POD (Perl), reST (Python), RD (Ruby), SECST, Setext, SiSU, SPIP, Xupl, Texy!, Textile, txt2tags, UDO and Wikitext. Data serialization oriented languages include Curl (homoiconic, but also reads JSON; every object serializes), JSON, and YAML.


Comparison of language features

Markdown's own syntax does not support class attributes or id attributes; however, since Markdown supports the inclusion of native HTML code, these features can be implemented using direct HTML. (Some extensions may support these features.) txt2tags' own syntax does not support class attributes or id attributes; however, since txt2tags supports inclusion of native HTML code in tagged areas, these features can be implemented using direct HTML when saving to an HTML target.


Comparison of implementation features


Comparison of lightweight markup language syntax


Inline span syntax

Although usually documented as yielding italic and bold text, most lightweight markup processors output the semantic HTML elements em and strong instead. Monospaced text may result in either semantic code or presentational tt elements. Few languages make a distinction, e.g. Textile, or allow the user to configure the output easily, e.g. Texy!. LMLs sometimes differ for multi-word markup, where some require the markup characters to replace the inter-word spaces (''infix''). Some languages require a single character as prefix and suffix; others need doubled or even tripled ones, or support both with slightly different meanings, e.g. different levels of emphasis. Gemtext does not have any inline formatting; monospaced text (called preformatted text in the context of Gemtext) must have the opening and closing ``` on their own lines.


Emphasis syntax

In HTML, text is emphasized with the <em> and <strong> element types, whereas <i> and <b> traditionally mark up text to be italicized or bold-faced, respectively. Microsoft Word and Outlook, and accordingly other word processors and mail clients that strive for a similar user experience, support the basic convention of using asterisks for boldface and underscores for italic style. While Word removes the characters, Outlook retains them.
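
The asterisk and underscore convention can be sketched in a few lines of Python that emit the semantic elements. This is a minimal illustration only: the function name `emphasize` and both regular expressions are assumptions made for this example and do not describe how Word, Outlook or any particular markup processor works. Real processors treat nesting, escaping and underscores inside words with more care.

```python
import html
import re

# Hypothetical patterns: *text* marks strong emphasis, _text_ marks emphasis.
# The marked span must start and end with a non-space character.
_BOLD = re.compile(r"\*(\S(?:[^*]*\S)?)\*")
# Underscores inside identifiers such as snake_case are left alone.
_ITALIC = re.compile(r"(?<!\w)_(\S(?:[^_]*\S)?)_(?!\w)")


def emphasize(text: str) -> str:
    """Escape HTML special characters, then convert the two span markers."""
    escaped = html.escape(text, quote=False)
    escaped = _BOLD.sub(r"<strong>\1</strong>", escaped)
    return _ITALIC.sub(r"<em>\1</em>", escaped)


if __name__ == "__main__":
    print(emphasize("a *bold* and _italic_ word"))
```

Escaping before substitution keeps any angle brackets typed by the user from being read as markup in the output.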


Editorial syntax

In HTML, removed or deleted and inserted text is marked up with the <del> and <ins> element types, respectively. However, legacy element types <s> or <strike> and <u> are still also available for stricken and underlined spans of text. AsciiDoc, ATX, Creole, MediaWiki, PmWiki, reST, Slack, Textile, Texy! and WhatsApp do not support dedicated markup for underlining text. Textile does, however, support insertion via the +inserted+ syntax. AsciiDoc, ATX, Creole, MediaWiki, PmWiki, reST, Setext and Texy! do not support dedicated markup for striking through text.


Programming syntax

Quoted computer code is traditionally presented in typewriter-like fonts where each character occupies the same fixed width. HTML offers the semantic <code> and the deprecated, presentational <tt> element types for this task. MediaWiki and Gemtext do not provide lightweight markup for inline code spans.


Heading syntax

Headings are usually available in up to six levels, but the top one is often reserved to contain the same as the document title, which may be set externally. Some documentation may associate levels with divisional types, e.g. part, chapter, section, article or paragraph. Most LMLs follow one of two styles for headings, either Setext-like underlines or atx-like line markers ("atx, the true structured text format" by Aaron Swartz, 2002), or they support both.


Underlined headings

Level 1 Heading
===============

Level 2 Heading
---------------

Level 3 Heading
~~~~~~~~~~~~~~~
The first style uses underlines, i.e. repeated characters (e.g. equals =, hyphen - or tilde ~, usually at least two or four times) in the line below the heading text. reStructuredText (reST) determines heading levels dynamically, which gives authors more freedom on the one hand, but complicates merging content from external sources on the other.
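
The underline style, including dynamic level assignment, can be sketched as follows. This is a simplified illustration under stated assumptions: the names `underlined_headings` and `UNDERLINE_CHARS`, the accepted character set and the minimum underline length are choices made for this example, and levels are assigned purely by the order in which underline characters first appear. It is not the parser of any particular language.

```python
from typing import List, Tuple

# Assumed marker set; real languages accept different characters.
UNDERLINE_CHARS = "=-~"


def _is_underline(line: str, min_len: int) -> bool:
    """True if the line consists of one repeated underline character."""
    s = line.rstrip()
    return len(s) >= max(min_len, 1) and s[0] in UNDERLINE_CHARS and s == s[0] * len(s)


def underlined_headings(lines: List[str], min_len: int = 2) -> List[Tuple[int, str]]:
    """Return (level, title) pairs for every title followed by an underline.

    Levels are dynamic: the first underline character seen becomes level 1,
    the next new character level 2, and so on.
    """
    order: List[str] = []
    found: List[Tuple[int, str]] = []
    for title, under in zip(lines, lines[1:]):
        # Skip blank lines and lines that are themselves underlines.
        if not title.strip() or _is_underline(title, min_len):
            continue
        if _is_underline(under, min_len):
            char = under[0]
            if char not in order:
                order.append(char)
            found.append((order.index(char) + 1, title.strip()))
    return found
```

Because levels depend on first appearance, the same underline character can denote different levels in two documents, which is why merging text from external sources may require rewriting the underlines.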


Prefixed headings

# Level 1 Heading
## Level 2 Heading ##
### Level 3 Heading ###
The second style is based on repeated markers (e.g. hash #, equals = or asterisk *) at the start of the heading itself, where the number of repetitions indicates the (sometimes inverse) heading level. Most languages also support the reduplication of the markers at the end of the line, but whereas some make them mandatory, others do not even expect their numbers to match. Org-mode supports indentation as a means of indicating the level. BBCode does not support section headings at all. POD and Textile choose the HTML convention of numbered heading levels instead.

Microsoft Word supports auto-formatting paragraphs as headings if they contain no more than a handful of words, have no period at the end, and the user hits the Enter key twice. For lower levels, the user may press the Tab key the corresponding number of times before entering the text, i.e. one through eight tabs for heading levels two through nine.
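
A prefixed heading line can be recognised with a single regular expression. The sketch below assumes the hash marker, at most six levels, at least one space after the marker, and optional closing markers whose count need not match; the name `prefixed_heading` and these rules are illustrative assumptions rather than the grammar of any specific language.

```python
import re
from typing import Optional, Tuple

# One to six leading '#', required whitespace, the title, then optional
# closing '#' markers whose number is not checked against the opening run.
_PREFIXED = re.compile(r"^(#{1,6})[ \t]+(.*?)[ \t]*#*[ \t]*$")


def prefixed_heading(line: str) -> Optional[Tuple[int, str]]:
    """Return (level, title) for a prefixed heading line, else None."""
    m = _PREFIXED.match(line)
    if m is None:
        return None
    return len(m.group(1)), m.group(2)


if __name__ == "__main__":
    for sample in ("# Level 1 Heading", "## Level 2 Heading ##", "### Level 3 Heading ###"):
        print(prefixed_heading(sample))
```

A language that counts levels inversely, or that uses = or * as the marker, would need a different pattern and a different mapping from repetition count to level.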


Link syntax

Hyperlinks can either be added inline, which may clutter the source because of long URLs, or through named aliases or numbered id references. These references point to lines that contain nothing but the address and related attributes, and that can often be located anywhere in the document. Most languages allow the author to specify text (Text) to be displayed instead of the plain address (http://example.com), and some also provide methods to set a different link title (Title), which may contain more information about the destination.

LMLs that are tailored for special setups, e.g. wikis or code documentation, may automatically generate named anchors (for headings, functions etc.) inside the document, link to related pages (possibly in a different namespace) or provide a textual search for linked keywords. Most languages employ (double) square or angular brackets to surround links, but hardly any two languages are completely compatible. Many can automatically recognize and parse absolute URLs inside the text without further markup.

Gemtext and setext links must be on a line by themselves; they cannot be used inline. Org-mode's normal link syntax does a text search of the file. Dedicated targets can also be defined with <<target>>.
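
Automatic recognition of absolute URLs can be sketched as below. The function name `autolink`, the URL pattern and the rule for dropping trailing punctuation are assumptions for this illustration; actual processors use more elaborate rules and often support additional schemes.

```python
import html
import re

# Assumed pattern: http or https URLs running up to whitespace, quotes or angle brackets.
_URL = re.compile(r"https?://[^\s<>\"']+")
# Punctuation that usually belongs to the sentence rather than to the address.
_TRAILING = ".,;:!?)"


def autolink(text: str) -> str:
    """Wrap bare absolute URLs in <a> elements and escape everything else."""
    parts = []
    pos = 0
    for m in _URL.finditer(text):
        url = m.group(0).rstrip(_TRAILING)
        parts.append(html.escape(text[pos:m.start()], quote=False))
        parts.append(
            '<a href="{0}">{1}</a>'.format(
                html.escape(url, quote=True), html.escape(url, quote=False)
            )
        )
        # Resume after the cleaned URL so stripped punctuation stays in the text.
        pos = m.start() + len(url)
    parts.append(html.escape(text[pos:], quote=False))
    return "".join(parts)
```

The address is escaped separately for the attribute and for the visible text so that characters such as & do not produce malformed HTML.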


List syntax

HTML requires an explicit element for the list, specifying its type, and one for each list item, but most lightweight markup languages need only different line prefixes for the bullet points or enumerated items. Some languages rely on indentation for nested lists; others use repeated parent list markers.

Microsoft Word automatically converts paragraphs that start with an asterisk *, hyphen-minus - or greater-than bracket > followed by a space or horizontal tab into bullet list items. It will also start an enumerated list for the digit ''1'' and the case-insensitive letters ''a'' (for alphabetic lists) or ''i'' (for roman numerals), if they are followed by a period ., a closing round parenthesis ), a greater-than sign > or a hyphen-minus - and a space or tab; in the case of the round parenthesis, an optional opening one ( before the list marker is also supported.

Languages differ on whether digits in numbered list items are optional or mandatory, which kinds of enumerators they understand (e.g. decimal digit ''1'', roman numerals ''i'' or ''I'', alphabetic letters ''a'' or ''A'') and whether they preserve explicit values in the output format. Some Markdown dialects, for instance, will respect a start value other than 1, but ignore any other explicit value.

Language | | | nest
Markdown | 0–3 | 1–3 | indent
MediaWiki, TiddlyWiki | 0 | 1+ | repeat
Org-mode | 0+ | | indent
Jira, Textile | 0 | 1+ | repeat

Slack assists the user in entering enumerated and bullet lists, but does not actually format them as such, i.e. it just includes a leading digit followed by a period and a space or a bullet character • in front of a line.
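
The two nesting strategies, repeated markers and indentation, can be contrasted with a short sketch. The function names, the marker sets, the requirement of a space after the marker and the indentation width of two spaces are all assumptions chosen for this example; individual languages set these details differently.

```python
from typing import Optional, Tuple


def nesting_by_repeat(line: str, markers: str = "*#") -> Optional[Tuple[int, str]]:
    """Depth equals the number of leading marker characters, e.g. '** item' is depth 2."""
    stripped = line.lstrip(markers)
    depth = len(line) - len(stripped)
    # Assumption: at least one space must separate the markers from the text.
    if depth == 0 or not stripped.startswith(" "):
        return None
    return depth, stripped.strip()


def nesting_by_indent(line: str, indent_width: int = 2) -> Optional[Tuple[int, str]]:
    """Depth is derived from leading spaces, e.g. '  - item' is depth 2 at width 2."""
    body = line.lstrip(" ")
    spaces = len(line) - len(body)
    if len(body) < 2 or body[0] not in "*-+" or body[1] != " ":
        return None
    return spaces // indent_width + 1, body[2:].strip()


if __name__ == "__main__":
    print(nesting_by_repeat("** second level"))
    print(nesting_by_indent("  - second level"))
```

With repeated markers the depth is visible on each line in isolation, whereas the indentation style depends on an agreed indentation width.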


Historical formats

The following lightweight markup languages, while similar to some of those already mentioned, have not yet been added to the comparison tables in this article:
* EtText: circa 2000
* Grutatext: circa 2002


See also

* Comparison of document-markup languages
* Comparison of documentation generators
* Lightweight programming language
* Markdown
* Wikitext

