Query String

A query string is a part of a uniform resource locator (URL) that assigns values to specified parameters. A query string commonly includes fields added to a base URL by a Web browser or other client application, for example as part of an HTML form, choosing the appearance of a page, or jumping to positions in multimedia content. A web server can handle a Hypertext Transfer Protocol (HTTP) request either by reading a file from its file system based on the URL path or by handling the request using logic that is specific to the type of resource. In cases where special logic is invoked, the query string will be available to that logic for use in its processing, along with the path component of the URL.


Structure

A typical URL containing a query string consists of a base URL followed by a question mark and the query string, for example name=ferret. When a server receives a request for such a page, it may run a program, passing the query string, which in this case is name=ferret, unchanged to the program. The question mark is used as a separator, and is not part of the query string.

Web frameworks may provide methods for parsing multiple parameters in the query string, separated by some delimiter. Multiple query parameters are commonly separated by the ampersand, "&". The exact structure of the query string is not standardized. Methods used to parse the query string may differ between websites.

A link in a web page may have a URL that contains a query string. HTML defines three ways a user agent can generate the query string:
* an HTML form via the <form> element
* a server-side image map via the ismap attribute on the <img> element with an <img ismap> construction
* an indexed search via the now deprecated <isindex> element
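
The structure described in this section can be sketched with Python's standard urllib.parse module. The URL used here is a hypothetical example: only name=ferret comes from the text, while the host, the path and the color parameter are assumptions made for illustration. Since the format is not standardized, a given website may split its query strings differently.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical URL: host, path and the "color" parameter are illustrative assumptions.
url = "https://example.com/over/there?name=ferret&color=purple"

parts = urlsplit(url)

# The "?" separator is not part of the query string itself.
query_string = parts.query

# Split on "&" into (name, value) pairs; a framework may apply other rules.
params = parse_qsl(query_string)

print(parts.path)    # /over/there
print(query_string)  # name=ferret&color=purple
print(params)        # [('name', 'ferret'), ('color', 'purple')]
```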


Web forms

One of the original uses was to contain the content of an HTML form, also known as a web form. In particular, when a form containing the fields field1, field2, field3 is submitted, the content of the fields is encoded as a query string as follows:
* The query string is composed of a series of field-value pairs.
* Within each pair, the field name and value are separated by an equals sign, "=".
* The series of pairs is separated by the ampersand, "&" (semicolons ";" are no longer recommended by the W3C, see below).

While there is no definitive standard, most web frameworks allow multiple values to be associated with a single field (e.g. field1=value1&field1=value2&field2=value3).

For each field of the form, the query string contains a pair field=value. Web forms may include fields that are not visible to the user; these fields are included in the query string when the form is submitted. This convention is a W3C recommendation (Forms in HTML documents, W3.org, retrieved on 2013-09-08).
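
As a minimal sketch of the field-value convention, the following uses Python's urllib.parse to build and read back a query string in which one field carries two values. The field names and values are the ones from the example in the text; using a list to stand for a repeated field is a choice of this sketch, and other frameworks may expose repeated fields differently.

```python
from urllib.parse import urlencode, parse_qs

# Field names and values come from the example in the text;
# a list value stands for a field that appears more than once.
fields = {"field1": ["value1", "value2"], "field2": "value3"}

# doseq=True emits one field=value pair per list element.
query_string = urlencode(fields, doseq=True)
print(query_string)  # field1=value1&field1=value2&field2=value3

# parse_qs groups repeated fields into lists of values.
parsed = parse_qs(query_string)
print(parsed)  # {'field1': ['value1', 'value2'], 'field2': ['value3']}
```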
In the recommendations of 1999, W3C recommended that all web servers support semicolon separators in addition to ampersand separators, to allow application/x-www-form-urlencoded query strings in URLs within HTML documents without having to entity escape ampersands. Since 2014, W3C recommends using only the ampersand as the query separator.

The form content is only encoded in the URL's query string when the form submission method is GET. The same encoding is used by default when the submission method is POST, but the result is submitted as the HTTP request body rather than being included in a modified URL.
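
The difference between the two submission methods can be illustrated by constructing (without sending) two requests with Python's urllib.request. The endpoint URL and the field values are hypothetical and serve only to show where the encoded form content ends up.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint and field values, used only for illustration.
base_url = "https://example.com/search"
encoded = urlencode({"field1": "value1", "field2": "value2"})

# GET: the encoded form content becomes the URL's query string.
get_request = Request(base_url + "?" + encoded)

# POST: the same encoding is carried in the request body instead.
post_request = Request(base_url, data=encoded.encode("ascii"))

print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.full_url, post_request.data)
```

Neither request is sent over the network; the objects are only inspected to show that the URL is modified for GET and left unchanged for POST.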


Indexed search

Before forms were added to HTML, browsers rendered the <isindex> element as a single-line text-input control. The text entered into this control was sent to the server as a query string addition to a GET request for the base URL or another URL specified by the action attribute. This was intended to allow web servers to use the provided text as query criteria so they could return a list of matching pages.

When the text input into the indexed search control is submitted, it is encoded as a query string as follows:
* The query string is composed of a series of arguments by parsing the text into words at the spaces.
* The series is separated by the plus sign, '+'.

Though the <isindex> element is deprecated and most browsers no longer support or render it, there are still some vestiges of indexed search in existence. For example, this is the source of the special handling of the plus sign, '+', within browser URL percent encoding (which today, with the deprecation of indexed search, is all but redundant with %20). Also, some web servers supporting CGI (e.g., Apache) will process the query string into command line arguments if it does not contain an equals sign, '=' (as per section 4.4 of CGI 1.1). Some CGI scripts still depend on and use this historic behavior for URLs embedded in HTML.
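
The indexed-search convention can be sketched as a small Python function that turns a query string into a list of words, in the way a CGI-capable server might derive command line arguments. The function name indexed_search_args is hypothetical, and the sketch is a simplification based on the description above, not a copy of any server's implementation.

```python
from urllib.parse import unquote

def indexed_search_args(query_string: str) -> list[str]:
    """Split an indexed-search query into words, CGI-style (sketch).

    Returns an empty list when the query contains "=", since such a
    query is treated as form data rather than a list of words.
    """
    if not query_string or "=" in query_string:
        return []
    # Words are separated by "+"; each word is then percent-decoded.
    return [unquote(word) for word in query_string.split("+")]

print(indexed_search_args("red+ferret"))   # ['red', 'ferret']
print(indexed_search_args("name=ferret"))  # []
```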


URL encoding

Some characters cannot be part of a URL (for example, the space) and some other characters have a special meaning in a URL: for example, the character # can be used to further specify a subsection (or fragment) of a document. In HTML forms, the character = is used to separate a name from a value. The URI generic syntax uses URL encoding to deal with this problem, while HTML forms make some additional substitutions rather than applying percent encoding for all such characters. SPACE is encoded as '+' or "%20".

HTML 5 specifies the following transformation for submitting HTML forms with the "GET" method to a web server. The following is a brief summary of the algorithm:
* Characters that cannot be converted to the correct charset are replaced with HTML numeric character references
* SPACE is encoded as '+' or '%20'
* Letters (A-Z and a-z), numbers (0-9) and the characters '*', '-', '.' and '_' are left as-is
* + is encoded by %2B
* All other characters are encoded as a %HH hexadecimal representation with any non-ASCII characters first encoded as UTF-8 (or other specified encoding)

The octet corresponding to the tilde ("~") is permitted in query strings by RFC 3986 but required to be percent-encoded in HTML forms to "%7E". The encoding of SPACE as '+' and the selection of "as-is" characters distinguish this encoding from RFC 3986.
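
The difference between form-style encoding and generic percent-encoding can be seen with Python's urllib.parse functions. This is a sketch using a made-up input string. The library functions approximate the rules summarized above but are not an exact implementation of the HTML algorithm; for instance, recent Python versions leave "~" unescaped.

```python
from urllib.parse import quote, quote_plus

# Made-up sample containing a space, a plus sign, a slash and a non-ASCII letter.
text = "a b+c/é"

# Form-style encoding: SPACE becomes "+", a literal "+" becomes "%2B",
# and non-ASCII characters are UTF-8 encoded before percent-encoding.
print(quote_plus(text))      # a+b%2Bc%2F%C3%A9

# Generic percent-encoding: SPACE becomes "%20" instead of "+".
print(quote(text, safe=""))  # a%20b%2Bc%2F%C3%A9
```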


Example

If a form with two text fields named first and second and a submit button is embedded in an HTML page, and the user inserts the strings “this is a field” and “was it clear (already)?” in the two text fields and presses the submit button, the program test.cgi (the program specified by the action attribute of the form element) will receive the following query string: first=this+is+a+field&second=was+it+clear+%28already%29%3F.

If the form is processed on the server by a CGI script, the script may typically receive the query string as an environment variable named QUERY_STRING.
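
The example can be reproduced in Python as a sketch of both sides: the encoding a browser would apply to the two field values, and a CGI-style script reading QUERY_STRING. The helper name read_form is hypothetical, and the server environment is simulated with a plain dictionary rather than a real CGI invocation.

```python
import os
from urllib.parse import urlencode, parse_qs

# What a browser would send for the two text fields in the example.
form_values = {"first": "this is a field", "second": "was it clear (already)?"}
query_string = urlencode(form_values)
print(query_string)
# first=this+is+a+field&second=was+it+clear+%28already%29%3F

def read_form(environ=os.environ) -> dict:
    """Decode the fields a CGI script would find in QUERY_STRING."""
    return parse_qs(environ.get("QUERY_STRING", ""))

# Simulate the server's environment with a plain dict instead of os.environ.
print(read_form({"QUERY_STRING": query_string}))
# {'first': ['this is a field'], 'second': ['was it clear (already)?']}
```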


Tracking

A program receiving a query string can ignore part or all of it. If the requested URL corresponds to a file and not to a program, the whole query string is ignored. However, regardless of whether the query string is used or not, the whole URL including it is stored in the server log files. These facts allow query strings to be used to track users in a manner similar to that provided by HTTP cookies. For this to work, every time the user downloads a page, a unique identifier must be chosen and added as a query string to the URLs of all links the page contains. As soon as the user follows one of these links, the corresponding URL is requested from the server. This way, the download of this page is linked with the previous one.

For example, when a web page containing the links "see my page!" and "mine is better" is requested, a unique string, such as e0a72cb2a2c7, is chosen, and the page is modified so that the URL of each link ends with ?e0a72cb2a2c7. The addition of the query string does not change the way the page is shown to the user. When the user follows, for example, the first link, the browser requests the page foo.html?e0a72cb2a2c7 from the server, which ignores what follows ? and sends the page foo.html as expected, adding the query string to its links as well. This way, any subsequent page request from this user will carry the same query string e0a72cb2a2c7, making it possible to establish that all these pages have been viewed by the same user.

Query strings are often used in association with web beacons.

The main differences between query strings used for tracking and HTTP cookies are that:
# Query strings form part of the URL, and are therefore included if the user saves or sends the URL to another user; cookies can be maintained across browsing sessions, but are not saved or sent with the URL.
# If the user arrives at the same web server by two (or more) independent paths, the user will be assigned two different query strings, while the stored cookies are the same.
# The user can disable cookies, in which case using cookies for tracking does not work. However, using query strings for tracking should work in all situations.
# Different query strings passed by different visits to the page will mean that the pages are never served from the browser (or proxy, if present) cache, thereby increasing the load on the web server and slowing down the user experience.
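
The link-rewriting step described in this section can be sketched as follows. The function name tag_links is hypothetical, bar.html is an assumed target for the second link (only foo.html appears in the text), and the regular expression is a deliberately naive way of finding links; a real server would more likely use an HTML parser.

```python
import re
import secrets

def tag_links(html: str, token: str) -> str:
    """Append the token as a query string to every href="..." (naive sketch)."""
    def add_token(match: re.Match) -> str:
        url = match.group(2)
        # Use "&" if the link already carries a query string.
        separator = "&" if "?" in url else "?"
        return f'{match.group(1)}{url}{separator}{token}{match.group(3)}'
    return re.sub(r'(href=")([^"]*)(")', add_token, html)

# bar.html is a hypothetical second link target; foo.html comes from the text.
page = '<a href="foo.html">see my page!</a> <a href="bar.html">mine is better</a>'

token = secrets.token_hex(6)  # 12 hexadecimal characters, like e0a72cb2a2c7
print(tag_links(page, token))
```

Applying the same function to every page served to a visitor, with the token taken from the incoming request once it is present, would carry the identifier along the whole visit.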


Compatibility issues

According to the HTTP specification:
"Various ad hoc limitations on request-line length are found in practice. It is RECOMMENDED that all HTTP senders and recipients support, at a minimum, request-line lengths of 8000 octets."
If the URL is too long, the web server fails with the 414 Request-URI Too Long HTTP status code. The common workaround for these problems is to use POST instead of GET and store the parameters in the request body. The length limits on request bodies are typically much higher than those on URL length. For example, the limit on POST size, by default, is 2 MB on IIS 4.0 and 128 KB on IIS 5.0. The limit is configurable on Apache2 using the LimitRequestBody directive, which specifies the number of bytes from 0 (meaning unlimited) to 2147483647 (2 GB) that are allowed in a request body (core – Apache HTTP Server, Httpd.apache.org, retrieved on 2013-09-08).
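
A client could apply the workaround by checking the length of the request line before choosing a method. The following is a minimal sketch under stated assumptions: the function name choose_method and the /search path are hypothetical, the request line is approximated as "GET <target> HTTP/1.1", and the 8000-octet figure is the recommended minimum from the quoted text rather than the actual limit of any particular server.

```python
from urllib.parse import urlencode

RECOMMENDED_MIN_REQUEST_LINE = 8000  # octets, from the quoted HTTP text

def choose_method(path: str, params: dict, limit: int = RECOMMENDED_MIN_REQUEST_LINE) -> str:
    """Return "GET" if the approximate request line fits within limit, else "POST"."""
    request_line = f"GET {path}?{urlencode(params)} HTTP/1.1"
    return "GET" if len(request_line.encode("ascii")) <= limit else "POST"

print(choose_method("/search", {"q": "ferret"}))     # GET
print(choose_method("/search", {"q": "x" * 10000}))  # POST
```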


See also

* Clean URL
* Click identifier
* Common Gateway Interface (CGI)
* HTTP cookie
* HyperText Transfer Protocol (HTTP)
* Semantic URLs
* URI scheme
* UTM parameters
* Web beacon

