XMLStarlet is a set of command-line utilities (a toolkit) for querying, transforming, validating, and editing XML documents and files with a simple set of shell commands, in a way similar to how UNIX commands such as grep, sed, awk, diff, patch, and join are used.
This set of command-line utilities can be used by those who want to test XPath queries or execute commands on the fly, who deal with many XML documents, or who automate XML processing with shell scripts.
To run the XMLStarlet utility, download it from the official site, then type 'xmlstarlet' on the command line followed by the command or query to execute (see Examples below).
Features
The toolkit's feature set includes the following options:
* Check or validate XML files (simple well-formedness check, DTD, XSD, RelaxNG)
* Calculate values of XPath expressions on XML files (such as running sums)
* Search XML files for matches to given XPath expressions
* Apply XSLT stylesheets to XML documents (including EXSLT support, and passing parameters to stylesheets)
* Query XML documents (e.g. query for the value of some elements or attributes, sorting, etc.)
* Modify or edit XML documents (e.g. delete some elements)
* Format or "beautify" XML documents (e.g. change indentation)
* Fetch XML documents using http:// or ftp:// URLs
* Browse the tree structure of XML documents (similar to the 'ls' command for directories)
* Include one XML document into another using XInclude
* XML c14n canonicalization
* Escape/unescape special XML characters in input text
* Print directory as XML document
* Convert XML into PYX format (based on ESIS, ISO 8879), and vice versa
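Two of the simpler features listed above, the well-formedness check and the escaping of special XML characters, can be illustrated without XMLStarlet itself. The following is a minimal sketch using only the Python standard library; it is not how XMLStarlet implements these features (XMLStarlet relies on libxml2), and the helper name is_well_formed and the sample strings are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, unescape


def is_well_formed(text: str) -> bool:
    """Return True if the text parses as XML, False on a syntax error.

    Hypothetical helper: it checks well-formedness only and performs
    no DTD, XSD, or RelaxNG validation.
    """
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Illustrative inputs (assumptions, not taken from the article).
good = "<project name='Wikipedia'><edition>en.wikipedia.org</edition></project>"
bad = "<project><edition>en.wikipedia.org</project>"  # <edition> is never closed

print(is_well_formed(good))
print(is_well_formed(bad))

# Escape the characters that are special in XML text, then reverse it.
raw = "a < b && c > d"
escaped = escape(raw)
print(escaped)
print(unescape(escaped) == raw)
```

A parser-based check like this only answers whether the input is syntactically valid XML; validating against a DTD or schema needs a validating processor.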
The XMLStarlet command-line utility is written in C and uses libxml2 and libxslt. The extensive choice of options in XMLStarlet was only possible because of the rich feature sets of both libraries. XMLStarlet is linked statically to both libxml2 and libxslt, so generally all you need to process XML documents is one executable file.
XMLStarlet is free, open-source software released under the MIT License, which allows free use and distribution for both commercial and non-commercial projects.
Examples
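Consider the following example XML document, 'xmlfile1.xml':
<?xml version="1.0" encoding="UTF-8"?>
<wikimedia>
  <projects>
    <project name="Wikipedia" launch="2001-01-05">
      <editions>
        <edition language="English">en.wikipedia.org</edition>
        <edition language="German">de.wikipedia.org</edition>
        <edition language="French">fr.wikipedia.org</edition>
        <edition language="Polish">pl.wikipedia.org</edition>
        <edition language="Spanish">es.wikipedia.org</edition>
      </editions>
    </project>
    <project name="Wiktionary" launch="2002-12-12">
      <editions>
        <edition language="English">en.wiktionary.org</edition>
        <edition language="French">fr.wiktionary.org</edition>
        <edition language="Vietnamese">vi.wiktionary.org</edition>
        <edition language="Turkish">tr.wiktionary.org</edition>
        <edition language="Spanish">es.wiktionary.org</edition>
      </editions>
    </project>
    <project name="Wikileaks" launch="2006-10-04">
      <editions>
        <edition language="English">en.wikileaks.org</edition>
      </editions>
    </project>
  </projects>
</wikimedia>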
At a command prompt, the following five XPath queries are executed on the above XML file 'xmlfile1.xml'.
* Example 1: The XPath expression to select all name attributes for all projects.
$ xmlstarlet sel -t -v "//wikimedia/projects/project/@name" xmlfile1.xml
Wikipedia
Wiktionary
Wikileaks
* Example 2: The XPath expression to select all attributes of the last Wikimedia project.
$ xmlstarlet sel -t -v "/wikimedia/projects/project[last()]/@*" xmlfile1.xml
Wikileaks
2006-10-04
* Example 3: The XPath expression to select addresses of all Wiktionary editions (text of all edition elements that exist under project element with a name attribute of Wiktionary).
$ xmlstarlet sel -t -v "/wikimedia/projects/project[@name='Wiktionary']/editions/edition" xmlfile1.xml
en.wiktionary.org
fr.wiktionary.org
vi.wiktionary.org
tr.wiktionary.org
es.wiktionary.org
* Example 4: The XPath expression to select addresses of all Wikimedia Wiktionary editions whose language is neither Turkish nor Spanish.
$ xmlstarlet sel -t -v "/wikimedia/projects/project[@name='Wiktionary']/editions/edition[@language!='Turkish' and @language!='Spanish']" xmlfile1.xml
en.wiktionary.org
fr.wiktionary.org
vi.wiktionary.org
* Example 5: The XPath expression to select all attributes of editions whose position is greater than or equal to 3 in the list of editions.
$ xmlstarlet sel -t -v "/wikimedia/projects/project/editions/edition[position() >= 3]/@*" xmlfile1.xml
French
Polish
Spanish
Vietnamese
Turkish
Spanish
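To cross-check the expected output of the five queries without XMLStarlet installed, the same selections can be approximated with Python's standard library. This is a rough sketch, not an equivalent of the tool: xml.etree.ElementTree supports only a subset of XPath, so the filters in Examples 4 and 5 are written in plain Python instead. The embedded document is an assumed reconstruction of 'xmlfile1.xml'; element names and the name and language attributes follow the queries and outputs above, while the launch attribute name and the dates of the first two projects are assumptions.

```python
import xml.etree.ElementTree as ET

# Assumed reconstruction of xmlfile1.xml. The "launch" attribute name and the
# first two launch dates are assumptions made for illustration.
XML = """
<wikimedia>
  <projects>
    <project name="Wikipedia" launch="2001-01-05">
      <editions>
        <edition language="English">en.wikipedia.org</edition>
        <edition language="German">de.wikipedia.org</edition>
        <edition language="French">fr.wikipedia.org</edition>
        <edition language="Polish">pl.wikipedia.org</edition>
        <edition language="Spanish">es.wikipedia.org</edition>
      </editions>
    </project>
    <project name="Wiktionary" launch="2002-12-12">
      <editions>
        <edition language="English">en.wiktionary.org</edition>
        <edition language="French">fr.wiktionary.org</edition>
        <edition language="Vietnamese">vi.wiktionary.org</edition>
        <edition language="Turkish">tr.wiktionary.org</edition>
        <edition language="Spanish">es.wiktionary.org</edition>
      </editions>
    </project>
    <project name="Wikileaks" launch="2006-10-04">
      <editions>
        <edition language="English">en.wikileaks.org</edition>
      </editions>
    </project>
  </projects>
</wikimedia>
"""

root = ET.fromstring(XML.strip())  # root is the <wikimedia> element

# Example 1: name attribute of every project.
names = [p.get("name") for p in root.findall("./projects/project")]

# Example 2: all attribute values of the last project.
last_project = root.find("./projects/project[last()]")
last_attrs = list(last_project.attrib.values())

# Example 3: text of every edition under the Wiktionary project.
wiktionary = root.findall(
    "./projects/project[@name='Wiktionary']/editions/edition"
)
wiktionary_hosts = [e.text for e in wiktionary]

# Example 4: Wiktionary editions that are neither Turkish nor Spanish.
# The comparison is done in Python rather than in the path expression.
filtered_hosts = [
    e.text for e in wiktionary
    if e.get("language") not in ("Turkish", "Spanish")
]

# Example 5: attributes of editions at position >= 3 within each project.
# XPath positions start at 1, so position 3 is list index 2.
late_attrs = []
for editions in root.findall("./projects/project/editions"):
    for edition in list(editions)[2:]:
        late_attrs.extend(edition.attrib.values())

for result in (names, last_attrs, wiktionary_hosts, filtered_hosts, late_attrs):
    print("\n".join(result))
    print()
```

Each printed group corresponds to one example, in order. A library with full XPath 1.0 support would accept the original expressions unchanged.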
An XML document can be validated against an XSD schema saved in file 'xsdfile.xsd' as follows:
$ xmlstarlet val -e -s xsdfile.xsd xmlfile1.xml
xmlfile1.xml - valid
See also
* XML (Extensible Markup Language) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable.
* XPath (XML Path Language) is a query language for selecting nodes from an XML document.
* XSLT (Extensible Stylesheet Language Transformations) is a language for transforming XML documents into other XML documents or other formats such as HTML for web pages, plain text, etc.
* Document type definition (DTD) defines the legal building blocks of an XML document.
External links
* XML Schema Definition (XSD)