Chunked transfer encoding is a
streaming
Streaming media is multimedia that is delivered and consumed in a continuous manner from a source, with little or no intermediate storage in network elements. ''Streaming'' refers to the delivery method of content, rather than the content it ...
data transfer mechanism available in
Hypertext Transfer Protocol
The Hypertext Transfer Protocol (HTTP) is an application layer protocol in the Internet protocol suite model for distributed, collaborative, hypermedia information systems. HTTP is the foundation of data communication for the World Wide Web, ...
(HTTP) version 1.1, defined in
RFC 9112 ยง7.1. In chunked transfer encoding, the data stream is divided into a series of non-overlapping "chunks". The chunks are sent out and received independently of one another. No knowledge of the data stream outside the currently-being-processed chunk is necessary for both the sender and the receiver at any given time.
Each chunk is preceded by its size in bytes. The transmission ends when a zero-length chunk is received. The ''chunked'' keyword in the
Transfer-Encoding header is used to indicate chunked transfer.
Chunked transfer encoding is not supported in
HTTP/2
HTTP/2 (originally named HTTP/2.0) is a major revision of the HTTP network protocol used by the World Wide Web. It was derived from the earlier experimental SPDY protocol, originally developed by Google. HTTP/2 was developed by the HTTP Working ...
, which provides its own mechanisms for data streaming.
Rationale
The introduction of chunked encoding provided various benefits:
* Chunked transfer encoding allows a server to maintain an
HTTP persistent connection
HTTP persistent connection, also called HTTP keep-alive, or HTTP connection reuse, is the idea of using a single TCP connection to send and receive multiple HTTP requests/responses, as opposed to opening a new connection for every single request/ ...
for dynamically generated content. In this case, the HTTP Content-Length header cannot be used to delimit the content and the next HTTP request/response, as the content size is not yet known. Chunked encoding has the benefit that it is not necessary to generate the full content before writing the header, as it allows streaming of content as chunks and explicitly signaling the end of the content, making the connection available for the next HTTP request/response.
* Chunked encoding allows the sender to send additional header fields after the message body. This is important in cases where values of a field cannot be known until the content has been produced, such as when the content of the message must be digitally signed. Without chunked encoding, the sender would have to buffer the content until it was complete in order to calculate a field value and send it before the content.
Applicability
For version 1.1 of the HTTP protocol, the chunked transfer mechanism is considered to be always and anyway acceptable, even if not listed in the
TE (transfer encoding) request header field, and when used with other transfer mechanisms, should always be applied last to the transferred data and never more than one time. This transfer coding method also allows additional entity header fields to be sent after the last chunk if the client specified the "trailers" parameter as an argument of the TE field. The origin server of the response can also decide to send additional entity trailers even if the client did not specify the "trailers" option in the TE request field, but only if the metadata is optional (i.e. the client can use the received entity without them). Whenever the trailers are used, the server should list their names in the Trailer header field; three header field types are specifically prohibited from appearing as a trailer field:
Transfer-Encoding,
Content-Length and
Trailer.
Format
If a field with a value of "" is specified in an HTTP message (either a request sent by a client or the response from the server), the body of the message consists of an unspecified number of chunks, a terminating chunk, trailer, and a final CRLF sequence (i.e.
carriage return
A carriage return, sometimes known as a cartridge return and often shortened to CR, or return, is a control character or mechanism used to reset a device's position to the beginning of a line of text. It is closely associated with the line feed a ...
followed by
line feed
Newline (frequently called line ending, end of line (EOL), next line (NEL) or line break) is a control character or sequence of control characters in character encoding specifications such as ASCII, EBCDIC, Unicode, etc. This character, or a ...
).
Each chunk starts with the number of
octets
Octet may refer to:
Music
* Octet (music), ensemble consisting of eight instruments or voices, or composition written for such an ensemble
** String octet, a piece of music written for eight string instruments
*** Octet (Mendelssohn), 1825 compos ...
of the data it embeds expressed as a
hexadecimal
In mathematics and computing, the hexadecimal (also base-16 or simply hex) numeral system is a positional numeral system that represents numbers using a radix (base) of 16. Unlike the decimal system representing numbers using 10 symbols, hexa ...
number in
ASCII
ASCII ( ), abbreviated from American Standard Code for Information Interchange, is a character encoding standard for electronic communication. ASCII codes represent text in computers, telecommunications equipment, and other devices. Because of ...
followed by optional parameters (''chunk extension'') and a terminating CRLF sequence, followed by the chunk data. The chunk is terminated by CRLF.
If chunk extensions are provided, the chunk size is terminated by a semicolon and followed by the parameters, each also delimited by semicolons. Each parameter is encoded as an extension name followed by an optional equal sign and value. These parameters could be used for a running
message digest
A cryptographic hash function (CHF) is a hash algorithm (a map of an arbitrary binary string to a binary string with fixed size of n bits) that has special properties desirable for cryptography:
* the probability of a particular n-bit output re ...
or
digital signature
A digital signature is a mathematical scheme for verifying the authenticity of digital messages or documents. A valid digital signature, where the prerequisites are satisfied, gives a recipient very high confidence that the message was created b ...
, or to indicate an estimated transfer progress, for instance.
The terminating chunk is a regular chunk, with the exception that its length is zero. It is followed by the trailer, which consists of a (possibly empty) sequence of entity header fields. Normally, such header fields would be sent in the message's header; however, it may be more efficient to determine them after processing the entire message entity. In that case, it is useful to send those headers in the trailer.
Header fields that regulate the use of trailers are ''TE'' (used in requests), and ''Trailers'' (used in responses).
Use with compression
HTTP servers often use
compression
Compression may refer to:
Physical science
*Compression (physics), size reduction due to forces
*Compression member, a structural element such as a column
*Compressibility, susceptibility to compression
*Gas compression
*Compression ratio, of a c ...
to optimize transmission, for example with or . If both compression and chunked encoding are enabled, then the content stream is first compressed, then chunked; so the chunk encoding itself is not compressed, and the data in each chunk is not compressed individually. The remote endpoint then decodes the stream by concatenating the chunks and uncompressing the result.
Example
Encoded data
In the following example, three chunks of length 4, 6 and 14 (hexadecimal "E") are shown. The chunk size is transferred as a hexadecimal number followed by \r\n as a line separator, followed by a chunk of data of the given size.
4\r\n (bytes to send)
Wiki\r\n (data)
6\r\n (bytes to send)
pedia \r\n (data)
E\r\n (bytes to send)
in \r\n
\r\n
chunks.\r\n (data)
0\r\n (final byte - 0)
\r\n (end message)
Note: the chunk size indicates the size of the chunk data and excludes the trailing CRLF ("\r\n"). In this particular example, the CRLF following "in" are counted as two octets toward the chunk size of 0xE (14). The CRLF in its own line are also counted as two octets toward the chunk size. The period character at the end of "chunks" is the 14th character, so it is the last data character in that chunk. The CRLF following the period is the trailing CRLF, so it is not counted toward the chunk size of 0xE (14).
Decoded data
Wikipedia in
chunks.
See also
*
List of HTTP header fields
A ''list'' is any set of items in a row. List or lists may also refer to:
People
* List (surname)
Organizations
* List College, an undergraduate division of the Jewish Theological Seminary of America
* SC Germania List, German rugby union ...
References
External links
*
{{DEFAULTSORT:Chunked Transfer Encoding
Data management
Hypertext Transfer Protocol headers