A uniform resource locator (URL), colloquially known as an address on the Web, is a reference to a resource that specifies its location on a computer network and a mechanism for retrieving it. A URL is a specific type of Uniform Resource Identifier (URI), although many people use the two terms interchangeably. URLs occur most commonly to reference web pages (HTTP/HTTPS) but are also used for file transfer (FTP), email (mailto), database access (JDBC), and many other applications. Most web browsers display the URL of a web page above the page in an address bar. A typical URL could have the form http://www.example.com/index.html, which indicates a protocol (http), a hostname (www.example.com), and a file name (index.html).
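The decomposition described above can be sketched with Python's standard urllib.parse module. This is only an illustration: the variable names are chosen here, and taking the last path segment as the "file name" is a simplifying assumption that happens to fit this example URL.

```python
from urllib.parse import urlsplit

# The example URL quoted in the text above.
url = "http://www.example.com/index.html"

parts = urlsplit(url)

scheme = parts.scheme      # the protocol part
hostname = parts.hostname  # the host part
# Assumption: treat the last path segment as the file name.
file_name = parts.path.rsplit("/", 1)[-1]

print(scheme, hostname, file_name)
```

Other URL shapes (paths ending in a slash, query strings, fragments) would need more care than this sketch takes.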

History

Uniform Resource Locators were defined in 1994 by Tim Berners-Lee, the inventor of the World Wide Web, and the URI working group of the Internet Engineering Task Force (IETF), as an outcome of collaboration started at the IETF Living Documents birds of a feather session in 1992. The format combines the pre-existing system of domain names (created in 1985) with file path syntax, where slashes are used to separate directory and filenames. Conventions already existed where server names could be prefixed to complete file paths, preceded by a double slash (//).

Berners-Lee later expressed regret at the use of dots to separate the parts of the domain name within URIs, wishing he had used slashes throughout, and also said that, given the colon following the first component of a URI, the two slashes before the domain name were unnecessary.

Early WorldWideWeb collaborators including Berners-Lee originally proposed the use of UDIs: Universal Document Identifiers. An early (1993) draft of the HTML Specification referred to "Universal" Resource Locators. This was dropped some time between June 1994 and October 1994 (draft-ietf-uri-url-08.txt). In his book Weaving the Web, Berners-Lee emphasizes his preference for the original inclusion of "universal" in the expansion rather than the word "uniform", to which it was later changed, and he gives a brief account of the contention that led to the change.

Syntax

Every HTTP URL conforms to the syntax of a generic URI. A web browser will usually dereference a URL by performing an HTTP request to the specified host, by default on port number 80. URLs using the https scheme require that requests and responses be made over a secure connection to the website.
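As a rough sketch of how a client might work out where to connect, the following Python function picks the host and port from a URL. The function name connection_target and the DEFAULT_PORTS table are hypothetical names introduced here. Port 80 for http comes from the text above; port 443 for https is the registered default and is an addition not stated in the text.

```python
from typing import Tuple
from urllib.parse import urlsplit

# Assumed default-port table: 80 for http is from the text,
# 443 for https is the registered default added for illustration.
DEFAULT_PORTS = {"http": 80, "https": 443}


def connection_target(url: str) -> Tuple[str, int]:
    """Return the (host, port) a client would connect to for this URL."""
    parts = urlsplit(url)
    if parts.scheme not in DEFAULT_PORTS:
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    if parts.hostname is None:
        raise ValueError("URL has no host")
    # An explicit port in the URL overrides the scheme's default.
    port = parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]
    return parts.hostname, port


print(connection_target("http://www.example.com/index.html"))
print(connection_target("http://www.example.com:8080/index.html"))
```

A real browser does considerably more (redirects, proxies, TLS negotiation); this only shows the default-port rule.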

Internationalized URL

Internet users are distributed throughout the world using a wide variety of languages and alphabets, and expect to be able to create URLs in their own local alphabets. An Internationalized Resource Identifier (IRI) is a form of URL that includes Unicode characters. All modern browsers support IRIs. The parts of the URL requiring special treatment for different alphabets are the domain name and path.

The domain name in the IRI is known as an Internationalized Domain Name (IDN). Web and Internet software automatically convert the domain name into Punycode usable by the Domain Name System; for example, the Chinese URL http://例子.卷筒纸 becomes http://xn--fsqu00a.xn--3lr804guic/. The xn-- prefix indicates that the label was not originally ASCII.

The URL path name can also be specified by the user in the local writing system. If not already encoded, it is converted to UTF-8, and any characters not part of the basic URL character set are escaped as hexadecimal using percent-encoding; for example, the Japanese URL http://example.com/引き割り.html becomes http://example.com/%E5%BC%95%E3%81%8D%E5%89%B2%E3%82%8A.html. The target computer decodes the address and displays the page.
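Both conversions can be reproduced in Python as a minimal sketch. It assumes the built-in "idna" codec (which implements the older IDNA 2003 rules and may differ from what a given browser applies) and urllib.parse.quote with its default settings. The variable names are illustrative.

```python
from urllib.parse import quote, unquote

# Domain name: each non-ASCII label becomes an "xn--" Punycode label.
host = "例子.卷筒纸"
ascii_host = host.encode("idna").decode("ascii")
round_trip_host = ascii_host.encode("ascii").decode("idna")

# Path: encode as UTF-8, then percent-escape every byte that is not
# in the basic URL character set ("/" is left alone by default).
path = "/引き割り.html"
encoded_path = quote(path)
decoded_path = unquote(encoded_path)

print("http://" + ascii_host + "/")
print("http://example.com" + encoded_path)
```

The two printed URLs should match the ASCII forms quoted in the text above.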

Protocol-relative URLs

Protocol-relative links (PRL), also known as protocol-relative URLs (PRURL), are URLs that have no protocol specified. For example, //example.com will use the protocol of the current page, typically HTTP or HTTPS.
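A small Python sketch of this resolution, using urllib.parse.urljoin to stand in for what a browser does. The page URLs on www.example.org and the /script.js path are hypothetical values chosen for illustration.

```python
from urllib.parse import urljoin

# A protocol-relative reference: host and path, but no scheme.
reference = "//example.com/script.js"  # hypothetical path

# The scheme is inherited from whichever page contains the link.
from_https_page = urljoin("https://www.example.org/page.html", reference)
from_http_page = urljoin("http://www.example.org/page.html", reference)

print(from_https_page)
print(from_http_page)
```

The same reference resolves to an https URL on a secure page and to an http URL on an insecure one.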


External links

URL specification at WHATWG

URL splitter that splits any URI into its parts