UBJSON
   HOME

TheInfoList



OR:

Universal Binary JSON (UBJSON) is a
computer A computer is a machine that can be programmed to Execution (computing), carry out sequences of arithmetic or logical operations (computation) automatically. Modern digital electronic computers can perform generic sets of operations known as C ...
data interchange format. It is a binary form directly imitating
JSON JSON (JavaScript Object Notation, pronounced ; also ) is an open standard file format and data interchange format that uses human-readable text to store and transmit data objects consisting of attribute–value pairs and arrays (or other ser ...
, but requiring fewer bytes of data. It aims to achieve the generality of JSON, combined with being much easier to process than JSON.


Rationale and Objectives

UBJSON is a proposed successor to BSON, BJSON and others. UBJSON has the following goals: * Complete compatibility with the JSON specification – there is a 1:1 mapping between standard JSON and UBJSON. * Ease of implementation – only including data types that are widely supported in popular programming languages so that there are no problems with certain languages not being supported well. * Ease of use – it can be quickly understood and adopted. * Speed and efficiency – UBJSON uses data representations that are (roughly) 30% smaller than their compacted JSON counterparts and are optimized for fast parsing. Streamed serialisation is supported, meaning that the transfer of UBJSON over a network connection can start sending data before the final size of the data is known.


Data types and syntax

UBJSON data can be either a ''value'' or a ''container''.


Value types

UBJSON uses a single binary tuple to represent all JSON value types: type ength ata Each element in the tuple is defined as:


type

The type is a 1-byte ASCII character used to indicate the type of the data following it. The ASCII characters were chosen to make manually walking and debugging data stored in the UBJSON format as easy as possible (e.g. making the data relatively readable in a hex editor). Types are available for the five JSON value types. There is also a
no-op In computer science, a NOP, no-op, or NOOP (pronounced "no op"; short for no operation) is a machine language instruction and its assembly language mnemonic, programming language statement, or computer protocol command that does nothing. Mac ...
type used for stream keep-alive. * Null: Z *
No-op In computer science, a NOP, no-op, or NOOP (pronounced "no op"; short for no operation) is a machine language instruction and its assembly language mnemonic, programming language statement, or computer protocol command that does nothing. Mac ...
: N - no operation, to be ignored by the receiving end * Boolean types: true (T) and false (F) * Numeric types:
int8 In computer science, an integer is a datum of integral data type, a data type that represents some range of mathematical integers. Integral data types may be of different sizes and may or may not be allowed to contain negative values. Integers are ...
(i), uint8 (U), int16 (I), int32 (l),
int64 In computer science, an integer is a datum of integral data type, a data type that represents some range of mathematical integers. Integral data types may be of different sizes and may or may not be allowed to contain negative values. Integers are ...
(L),
float32 Single-precision floating-point format (sometimes called FP32 or float32) is a computer number format, usually occupying 32 bits in computer memory; it represents a wide dynamic range of numeric values by using a floating radix point. A floating ...
(d), float64 (D), and high-precision (H) * ASCII character: C * UTF-8 string: S High-precision numbers are represented as an arbitrarily long, UTF-8 string-encoded numeric value.


length (optional)

The length is an integer number (e.g. uint8, or int64) encoding the size of the data payload in bytes. It is used for strings, high-precision numbers and optionally containers. They are omitted for other types. Length is encoded following the same convention as integers, thus including its own type. For example, the string hello is encoded as S,U,0x05,h,e,l,l,o.


data (optional)

A sequence of bytes representing the actual binary data for this type of value. All numbers are in big-endian order.


Container types

Similarly to JSON, UBJSON defines two container types: ''array'' and ''object''. Arrays are ordered sequences of elements, represented as a /code> followed by zero or more elements of value and container type and a trailing /code>. Objects are labeled sets of elements, represented as a . Each key is a string with the S character omitted, and each "value" can be any element of value or container type. Alternatively, arrays and objects may indicate the number of elements they contain as # followed by an integer number before their first element, in which case the trailing ] or } is omitted. Additionally, if all elements have the same type, the types can be omitted and replaced by a single $ followed by the type, in which case the element count must follow immediately. For example, the array a","b","c"/nowiki> may be represented as
strongly typed In computer programming, one of the many ways that programming languages are colloquially classified is whether the language's type system makes it strongly typed or weakly typed (loosely typed). However, there is no precise technical definition o ...
array of uint8 values. This ensures binary efficiency while maintaining compatibility with JSON, even though JSON has no direct support for binary data.


Representation

The MIME type 'application/ubjson' is recommended, as is the file extension '.ubj' when stored in a file-system.


Software support

* Teradata Database * The Wolfram Language introduced support for UBJSON in 2017, with version 11.1 of the language.


See also

* Comparison of data serialization formats *
JSON JSON (JavaScript Object Notation, pronounced ; also ) is an open standard file format and data interchange format that uses human-readable text to store and transmit data objects consisting of attribute–value pairs and arrays (or other ser ...
* CBOR * Smile (binary JSON) * Protocol Buffers * Action Message Format *
Apache Thrift Thrift is an interface definition language and binary communication protocol used for defining and creating services for numerous programming languages. It was developed at Facebook for "scalable cross-language services development" and as of ...
* MessagePack * Document-oriented database (e.g. MongoDB) * Abstract Syntax Notation One (ASN.1) *
Wireless Binary XML WAP Binary XML (WBXML) is a binary representation of XML. It was developed by the WAP Forum and since 2002 is maintained by the Open Mobile Alliance as a standard to allow XML documents to be transmitted in a compact manner over mobile networks and ...
(WBXML) * Efficient XML Interchange


References


External links

* {{Data Exchange Data serialization formats