Fast Infoset

Fast Infoset (or FI) is an international standard that specifies a binary encoding format for the XML Information Set (XML Infoset) as an alternative to the XML document format. It aims to provide more efficient serialization than the text-based XML format.

FI is effectively a lossless compression, analogous to gzip, for XML, except that while the original formatting is lost, no information is lost in the conversion from XML to FI and back to XML. While the purpose of compression is to reduce physical data size, FI aims to optimize both document size and processing performance.

The Fast Infoset specification is defined by both the ITU-T and the ISO/IEC standards bodies. FI is officially defined in ITU-T Rec. X.891 and ISO/IEC 24824-1, both entitled Fast Infoset. The standard was published by ITU-T on May 14, 2005, and by ISO on May 4, 2007. The Fast Infoset standard document can be downloaded from the ITU website. Though the document does not assert intellectual property (IP) restrictions on implementation or use, page ii warns that notices have been received and that the subject may not be completely free of IP assertions.

A common misconception is that FI requires ASN.1 tool support. Although the formal specification uses ASN.1 notation, the standard includes Encoding Control Notation (ECN), and implementations do not require ASN.1 tools.

An alternative to FI is FleXPath.
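The idea that formatting is lost while information is kept can be illustrated without any FI library. The sketch below is not Fast Infoset: it uses canonical XML from Python's standard library (Python 3.8 or later) as a stand-in for "same information", and the two sample documents and the helper name same_information are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents that differ only in formatting details:
# quote style, spacing inside the start tag, and empty-element syntax.
doc_a = '<order id="7"><item sku=\'A1\'/></order>'
doc_b = "<order   id='7'><item sku=\"A1\"></item></order>"


def same_information(xml_a: str, xml_b: str) -> bool:
    """Return True if both documents have the same canonical (C14N 2.0) form."""
    return ET.canonicalize(xml_a) == ET.canonicalize(xml_b)


print(same_information(doc_a, doc_b))
```

A round trip through a binary encoding of the Infoset behaves in a similar way: the text that comes back may be formatted differently from the text that went in, while the elements, attributes, and character data are unchanged.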


Structure

The underlying file format is ASN.1, with tag/length/value blocks. Text values of attributes and elements are stored with length prefixes rather than end delimiters, and data segments do not require escaping of special characters. The equivalent of end tags ("terminators") is needed only at the end of a list of child elements. Binary data is transmitted in native format and need not be converted to a transmission format such as base64.

Fast Infoset is a higher-level format built on ASN.1 forms and notation. Element and attribute names are stored within the octet stream, unlike traditional ASN.1 encoding schemes. In consequence, the conventional XML file can be recovered from the binary stream without reference to the XML Schema, and the XML Schema need not be expressed as an ASN.1 definition. (ASN.1 "tags" are just type names, e.g. String, Integer, or complex types.) ASN.1 together with ECN is used to define the file format.

An index table is built for most strings, which includes element and attribute names and their values. This means that the text of repeated tags and values appears only once per document.
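The two ideas above, length-prefixed strings and an index table for repeated strings, can be sketched with a toy encoder. This is not the X.891 wire format: the one-byte markers, the fixed 4-byte integers, and the function names are assumptions chosen only to keep the example short.

```python
import struct

# Hypothetical one-byte markers; these are NOT values from the FI standard.
LITERAL, INDEXED = 0, 1
HEADER = ">BI"  # marker byte + 4-byte big-endian integer (length or index)
HEADER_SIZE = struct.calcsize(HEADER)


def encode_strings(strings):
    """Encode strings; a repeated string is replaced by its table index."""
    table = {}
    out = bytearray()
    for text in strings:
        if text in table:
            out += struct.pack(HEADER, INDEXED, table[text])
        else:
            data = text.encode("utf-8")
            # Length prefix instead of an end delimiter: no escaping needed.
            out += struct.pack(HEADER, LITERAL, len(data)) + data
            table[text] = len(table)
    return bytes(out)


def decode_strings(blob):
    """Rebuild the original list, growing the same table while reading."""
    table, result, pos = [], [], 0
    while pos < len(blob):
        kind, value = struct.unpack_from(HEADER, blob, pos)
        pos += HEADER_SIZE
        if kind == LITERAL:
            text = blob[pos:pos + value].decode("utf-8")
            pos += value
            table.append(text)
        else:
            text = table[value]
        result.append(text)
    return result


names = ["item", "price", "a < b & c", "item", "price"]
blob = encode_strings(names)
print(decode_strings(blob) == names)
```

In this sketch the decoder needs no separate dictionary, because it rebuilds the table from the first occurrence of each string. The string "a < b & c" is stored as-is, since the reader knows its length in advance and never scans for a delimiter.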


Implementations


Reference implementation


A Java implementation of the FI specification is available as part of the Eclipse Implementation of JAXB. The library is open source and is distributed under the terms of the Apache License 2.0. Several projects use this implementation, including the reference implementation for JAX-WS used in Eclipse Metro.

The QtitanFastInfoset implementation for C++ is available under a commercial license as a component for the Qt framework.


Performance

Because Fast Infoset documents are compressed as part of the XML generation process, producing them is much faster than applying Zip-style compression algorithms to an XML stream, although the output is not as well compressed. SAX-type parsing of Fast Infoset is also much faster than parsing XML 1.0, even without any Zip-style compression. Typical increases in parsing speed observed for the reference Java implementation are a factor of 10 over Java Xerces and a factor of 4 over the Piccolo driver (one of the fastest Java-based XML parsers).


Typical applications

* Portable devices: Mobile devices typically have low-bandwidth data connections and slower CPUs. Fast Infoset uses less bandwidth than XML and is faster to process, which makes it well suited to such devices.
* Storing large volumes of data: When XML is stored in a file or database, the volume of data a system produces can often exceed reasonable limits, with several drawbacks: access times go up as more data is read, CPU load goes up as XML data takes more power to process, and storage costs go up. By storing XML data in Fast Infoset format, data volume may be reduced by as much as 80 percent.
* Passing XML through the Internet: When an application passes data over the Internet, network bandwidth can be a major bottleneck, seriously degrading the performance of client applications and limiting the server's capacity to process requests. Reducing the size of the data transferred reduces the time required to send or receive a message and increases the number of transactions a server can process per hour.
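The effect of the size reduction on transfer time is simple arithmetic. The sketch below uses hypothetical figures (a 10 MB XML document and a 1 Mbit/s link), takes the "as much as 80 percent" figure as a best case, and ignores latency and protocol overhead; the function names are illustrative only.

```python
def reduced_size(size_bytes: int, reduction_percent: int) -> int:
    """Size after removing reduction_percent of the original bytes."""
    return size_bytes * (100 - reduction_percent) // 100


def transfer_seconds(size_bytes: int, link_bits_per_second: int) -> float:
    """Idealized transfer time: payload bits divided by link speed."""
    return size_bytes * 8 / link_bits_per_second


xml_size = 10_000_000   # hypothetical 10 MB XML document
link = 1_000_000        # hypothetical 1 Mbit/s mobile link
fi_size = reduced_size(xml_size, 80)  # best case quoted in the text

print(transfer_seconds(xml_size, link), transfer_seconds(fi_size, link))
```

Under these assumptions the transfer takes 80 seconds as XML and 16 seconds in the reduced form, so the same link could carry about five times as many messages per hour.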


See also

* Binary XML
* Efficient XML Interchange
* X3D




External links


* A heavy technical description on OTN
* FastInfoset.NET home page
* FI project home page
* Free download of the Fast Infoset standard (ITU-T Rec. X.891) from the ITU Web site
* Liquid Fast Infoset .Net (an open source .NET implementation): https://github.com/LiquidTechnologies/fast-infoset