Pandoc is a free-software document converter, widely used as a writing tool (especially by scholars) and as a basis for publishing workflows. It was created by John MacFarlane, a philosophy professor at the University of California, Berkeley.
Functionality
Pandoc describes itself as a "markup format" converter: it takes a document in one of the supported formats and converts its markup to another format. Preserving the look and feel of the document is not a priority.
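A minimal sketch of such a conversion, assuming the `pandoc` executable is installed and on the PATH. The helper name `build_convert_command` and the file names are hypothetical; `--from`, `--to`, and `--output` are pandoc's command-line options for the source format, the target format, and the output file.

```python
import os
import shutil
import subprocess


def build_convert_command(src, dst, from_format, to_format):
    """Return the argv list for a pandoc markup-to-markup conversion."""
    return [
        "pandoc",
        "--from", from_format,   # format of the input document
        "--to", to_format,       # format to convert the markup into
        "--output", dst,
        src,
    ]


# Hypothetical file names, used only for illustration.
command = build_convert_command("notes.md", "notes.rst", "markdown", "rst")
print(" ".join(command))

# Run the conversion only if pandoc and the input file are actually present.
if shutil.which("pandoc") and os.path.exists("notes.md"):
    subprocess.run(command, check=True)
```

Building the argument list separately from running it keeps the sketch usable on machines where pandoc is not installed.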
Plug-ins for custom formats can also be written in Lua, which has been used to create an exporting tool for the Journal Article Tag Suite, for example.
An included CiteProc option allows pandoc to use bibliographic data from reference management software in any of five formats: BibTeX, BibLaTeX, CSL JSON, CSL YAML, or RIS. The information is automatically transformed into a citation in various styles (such as APA, Chicago, or MLA) using an implementation of the Citation Style Language. This allows the program to serve as a simpler alternative to LaTeX for producing academic writing.
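The citation workflow described above can be sketched as a command line. This assumes pandoc 2.11 or later, where citation processing is switched on with `--citeproc`; `--bibliography` names the bibliographic data file and `--csl` names a Citation Style Language style file. The helper `build_citation_command` and all file names are hypothetical.

```python
def build_citation_command(src, dst, bibliography, csl_style=None):
    """Return the argv list for a pandoc run that resolves citations."""
    command = ["pandoc", "--citeproc", "--bibliography", bibliography]
    if csl_style is not None:
        # Optional CSL style file, e.g. one describing APA or MLA formatting.
        command += ["--csl", csl_style]
    command += ["--output", dst, src]
    return command


# Hypothetical inputs: a Markdown paper, a BibTeX database, an APA style file.
command = build_citation_command("paper.md", "paper.html", "refs.bib", "apa.csl")
print(" ".join(command))
```

When no style file is passed, the sketch simply omits `--csl` and leaves the choice of style to pandoc.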
Supported file formats
Input formats
The input format with the most support is an extended version of Markdown. Pandoc can also read the following formats:
* Creole
* DocBook
* EPUB
* FictionBook (FB2)
* Haddock
* HTML
* Jira wiki markup
* Journal Article Tag Suite (JATS)
* JSON
* LaTeX
* Lightweight markup language
* man
* Markdown: Strict, CommonMark, GitHub Flavored Markdown (GFM), MultiMarkdown (MMD) and Markdown Extra (PHP Extra) variants
* OpenDocument (ODT)
* OPML
* Office Open XML: Microsoft Word variant
* Org-mode
* reStructuredText
* Textile
* txt2tags (t2t)
* Wiki markup: MediaWiki, Muse, TikiWiki, TWiki and Vimwiki variants
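The Markdown variants in the list above are selected on the command line by format name. The following sketch maps each variant to the identifier pandoc uses for it (`markdown`, `markdown_strict`, `commonmark`, `gfm`, `markdown_mmd`, `markdown_phpextra`); the dictionary, the helper `reader_for`, and the display names used as keys are hypothetical conveniences, not part of pandoc.

```python
# Display name (hypothetical) -> pandoc format identifier.
MARKDOWN_READERS = {
    "Pandoc Markdown": "markdown",          # pandoc's extended Markdown
    "Strict": "markdown_strict",
    "CommonMark": "commonmark",
    "GFM": "gfm",
    "MultiMarkdown": "markdown_mmd",
    "Markdown Extra": "markdown_phpextra",
}


def reader_for(variant):
    """Return the pandoc format identifier for a Markdown variant."""
    try:
        return MARKDOWN_READERS[variant]
    except KeyError:
        raise ValueError(f"unknown Markdown variant: {variant!r}") from None


# The identifier is what would follow pandoc's --from option.
print(["pandoc", "--from", reader_for("GFM"), "README.md"])
```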
Output formats
Pandoc can create files in the following output formats, which do not all coincide with the input formats:
* AsciiDoc
* ConTeXt
* DocBook: Versions 4 and 5
* EPUB: Versions 2 and 3
* FictionBook (FB2)
* Haddock
* HTML: HTML4 and HTML5 variants, respectively compliant with XHTML 1.0 Transitional and XHTML Strict
* InDesign ICML
* Jira wiki markup
* Journal Article Tag Suite (JATS)
* JSON
* LaTeX
* man
* Markdown: Strict, CommonMark, GitHub Flavored Markdown (GFM), MultiMarkdown (MMD) and Markdown Extra (PHP Extra) variants
* OpenDocument (ODT/ODF)
* OPML
* Office Open XML: Microsoft Word and Microsoft PowerPoint variants
* Org-mode
* PDF (needs a third-party add-on like ConTeXt, pdfroff, wkhtmltopdf, weasyprint or prince)
* Plain text
* reStructuredText
* Rich Text Format (RTF)
* TEI
* Texinfo
* Textile
* Slideshows: LaTeX Beamer, and the web-based Slideous, Slidy, DZSlides, reveal.js and S5 variants
* Wiki markup: DokuWiki, MediaWiki, Muse, TikiWiki, TWiki and Vimwiki variants
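The PDF entry in the list above depends on an external program. A hedged sketch of how a script might pick one: it looks for the programs named in that entry on the PATH and passes the first one found to pandoc's `--pdf-engine` option. The helper names, the search order, and the file names are assumptions made for illustration.

```python
import shutil

# Executable names for the programs listed in the PDF entry, in an arbitrary order.
PDF_ENGINES = ["context", "pdfroff", "wkhtmltopdf", "weasyprint", "prince"]


def pick_pdf_engine(candidates=PDF_ENGINES, which=shutil.which):
    """Return the first candidate found on the PATH, or None if none is installed."""
    for engine in candidates:
        if which(engine):
            return engine
    return None


def build_pdf_command(src, dst, engine):
    """Return the argv list for producing a PDF with the given engine."""
    return ["pandoc", src, "--pdf-engine=" + engine, "--output", dst]


engine = pick_pdf_engine()
if engine is None:
    print("no supported PDF engine found on PATH")
else:
    print(" ".join(build_pdf_command("report.md", "report.pdf", engine)))
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default `shutil.which` is enough.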
See also
* Round-trip format conversion
* Help authoring tool