In programming, the file uniform resource identifier (URI) scheme is a URI format used to identify a file on a host computer. While URIs can identify anything, a specific syntax is associated with identifying files.


Format

A file URI has the format file://''host''/''path'' where ''host'' is the fully qualified domain name of the system on which the ''path'' is accessible, and ''path'' is a hierarchical directory path of the form ''directory''/''directory''/.../''name''. If ''host'' is omitted, it is taken to be "localhost", the machine from which the URL is being interpreted. Note that when the host is omitted, the slash is not omitted (while "file:///piro.txt" is valid, "file://simpen.txt" is not, although some interpreters manage to handle the latter). RFC 3986 includes additional information about the treatment of ".." and "." segments in URIs.

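As a rough illustration of the ''host'' and ''path'' components described above, the following sketch splits a file URI with Python's standard urllib.parse.urlsplit. The choice of Python and the helper name split_file_uri are assumptions made for this example; they are not part of the scheme itself.

```python
from urllib.parse import urlsplit

def split_file_uri(uri):
    """Return (host, path) for a file URI.

    Hypothetical helper: an omitted or empty host is reported as "localhost".
    """
    parts = urlsplit(uri)
    if parts.scheme != "file":
        raise ValueError(f"not a file URI: {uri!r}")
    host = parts.netloc or "localhost"
    return host, parts.path

print(split_file_uri("file://localhost/etc/fstab"))  # ('localhost', '/etc/fstab')
print(split_file_uri("file:///piro.txt"))            # ('localhost', '/piro.txt')
# With only two slashes the file name lands in the host position
# and the path is empty, which is why this form is invalid.
print(split_file_uri("file://simpen.txt"))           # ('simpen.txt', '')
```

The last call shows what a generic URI parser makes of "file://simpen.txt": the intended file name is read as a host name.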

Number of slash characters

* The character sequence of two slash characters (//) after the string ''file:'' denotes that either a hostname or the literal term ''localhost'' follows, although this part may be omitted entirely, or may contain an empty hostname.
* The single slash between ''host'' and ''path'' denotes the start of the local-path part of the URI and must be present.
* A valid file URI must therefore begin with either file:/path (no hostname), file:///path (empty hostname), or file://hostname/path.
* file://path (i.e. two slashes, without a hostname) is never correct, but is often used.
* Further slashes in ''path'' separate directory names in a hierarchical system of directories and subdirectories. In this usage, the slash is a general, system-independent way of separating the parts, and in a particular host system it might be used as such in any pathname (as in Unix systems).

There are two ways that Windows UNC filenames (such as \\server\folder\data.xml) can be represented. These are both described in RFC 8089, Appendix E as "non-standard". The first way (called here the 2-slash format) is to represent the server name using the ''Authority'' part of the URI, which then becomes file://server/folder/data.xml. The second way (called here the 4-slash format) is to represent the server name as part of the ''Path'' component, so the URI becomes file:////server/folder/data.xml. Both forms are actively used. Microsoft .NET (for example, the method new Uri(path)) generally uses the 2-slash form; Java (for example, the method new URI(path)) generally uses the 4-slash form. Either form allows the most common operations on URIs (resolving relative URIs, and dereferencing to obtain a connection to the remote file) to be used successfully.

However, because these URIs are non-standard, some less common operations fail: an example is the ''normalize'' operation (defined in RFC 3986 and implemented in the Java java.net.URI.normalize() method) which reduces file:////server/folder/data.xml to the unusable form file:/server/folder/data.xml.
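
The two UNC representations can be sketched in a few lines of Python. The function name unc_to_file_uris is hypothetical and the code uses only the standard urllib.parse module; it illustrates the 2-slash and 4-slash forms and does not reproduce what .NET or Java do internally.

```python
from urllib.parse import quote, urlsplit

def unc_to_file_uris(unc_path):
    """Return (two_slash_form, four_slash_form) for a Windows UNC path."""
    if not unc_path.startswith("\\\\"):
        raise ValueError("expected a UNC path starting with two backslashes")
    # Split off the server name; the remaining pieces are path segments.
    server, *segments = unc_path[2:].split("\\")
    path = "/".join(quote(segment) for segment in segments)
    two_slash = f"file://{server}/{path}"     # server in the authority part
    four_slash = f"file:////{server}/{path}"  # server inside the path part
    return two_slash, four_slash

two, four = unc_to_file_uris(r"\\server\folder\data.xml")
print(two)   # file://server/folder/data.xml
print(four)  # file:////server/folder/data.xml

# A generic parser sees the server in different components:
print(urlsplit(two).netloc, urlsplit(two).path)    # server /folder/data.xml
print(urlsplit(four).netloc, urlsplit(four).path)  # (empty) //server/folder/data.xml
```

In the 4-slash form the server name is carried only by the doubled slash at the start of the path, which is why an operation that collapses that prefix loses it.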


Examples


Unix

Here are two Unix examples pointing to the same ''/etc/fstab'' file:

file://localhost/etc/fstab
file:///etc/fstab

The KDE environment uses URIs without an authority field:

file:/etc/fstab
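
A minimal sketch, assuming Python and a hypothetical helper named local_path_from_file_uri, of how an application might map these spellings to one local path. It accepts an absent authority, an empty authority, or "localhost", and rejects any other host.

```python
from urllib.parse import unquote, urlsplit

def local_path_from_file_uri(uri):
    """Return the local path named by a file URI, or raise ValueError."""
    parts = urlsplit(uri)
    if parts.scheme != "file":
        raise ValueError(f"not a file URI: {uri!r}")
    if parts.netloc not in ("", "localhost"):
        raise ValueError(f"URI refers to another host: {parts.netloc!r}")
    # Undo percent-encoding to recover the file system path.
    return unquote(parts.path)

for uri in ("file://localhost/etc/fstab", "file:///etc/fstab", "file:/etc/fstab"):
    print(uri, "->", local_path_from_file_uri(uri))  # each yields /etc/fstab
```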


Windows

Here are some examples which may be accepted by some applications on Windows systems, referring to the same local file ''c:\WINDOWS\clock.avi'':

file://localhost/c:/WINDOWS/clock.avi
file:///c:/WINDOWS/clock.avi

Here is the URI as understood by the Windows Shell API:

file:///c:/WINDOWS/clock.avi

Note that the drive letter followed by a colon and slash is part of the acceptable file URI.


Implementations


Windows

On Microsoft Windows systems, the normal colon (:) after a device letter has sometimes been replaced by a vertical bar (|) in file URLs. This reflected the original URL syntax, which made the colon a reserved character in a path part.

Since Internet Explorer 4, file URIs have been standardized on Windows, and should follow the scheme below. This applies to all applications which use URLMON or SHLWAPI for parsing, fetching or binding to URIs. To convert a path to a URL, use UrlCreateFromPath, and to convert a URL to a path, use PathCreateFromUrl.

To access a file "the file.txt", the following might be used. For a network location:

file://hostname/path/to/the%20file.txt

Or for a local file, the hostname is omitted, but the slash is not (note the third slash):

file:///c:/path/to/the%20file.txt

This is not the same as providing the string "localhost" or the dot "." in place of the hostname. The string "localhost" will attempt to access the file as UNC path \\localhost\c:\path\to\the file.txt, which will not work since the colon is not allowed in a share name. The dot "." results in the string being passed as \\.\c:\path\to\the file.txt, which will work for local files, but not shares on the local system. For example file://./sharename/path/to/the%20file.txt will not work, because it will result in ''sharename'' being interpreted as part of the DOSDEVICES namespace, not as a network share.

The following outline roughly describes the requirements.

* The colon should be used, and should ''not'' be replaced with a vertical bar for Internet Explorer.
* Forward slashes should be used to delimit paths.
* Characters such as the hash (#) or question mark (?) which are part of the filename should be percent-encoded.
* Characters which are not allowed in URIs, but which are allowed in filenames, must also be percent-encoded. For example, any of "`^ " and all control characters. In the example above, the space in the filename is encoded as %20.
* Characters which are allowed in both URIs and filenames must NOT be percent-encoded.
* Legacy ACP encodings must not be used. (ACP code pages are specified by DOS CHCP or the Windows Control Panel language setting.)
* Unicode characters outside of the ASCII range must be UTF-8 encoded, and those UTF-8 encodings must be percent-encoded.

Use the provided functions if possible. If you must create a URL programmatically and cannot access SHLWAPI.dll (for example from script, or another programming environment where the equivalent functions are not available) the above outline will help.
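
For environments without SHLWAPI, the outline above might be approximated as in the following Python sketch. The function name windows_path_to_file_uri and the SAFE character set are assumptions made for this illustration; the set follows the path characters permitted by RFC 3986 and is not claimed to match the exact output of UrlCreateFromPath.

```python
from urllib.parse import quote

# Characters left unescaped, in addition to letters, digits and "-._~":
# the path separator, the drive colon, and the RFC 3986 sub-delimiters plus "@".
# This set is an assumption of the sketch.
SAFE = "/:!$&'()*+,;=@"

def windows_path_to_file_uri(path):
    """Build a file URI from a Windows drive path or UNC path."""
    forward = path.replace("\\", "/")  # forward slashes delimit the path
    if forward.startswith("//"):
        # UNC path: \\hostname\share\... becomes file://hostname/share/...
        host, _, rest = forward[2:].partition("/")
        return f"file://{host}/{quote(rest, safe=SAFE)}"
    # Local path: the hostname is empty, so three slashes precede the drive.
    # quote() encodes non-ASCII characters as UTF-8 before percent-encoding.
    return f"file:///{quote(forward, safe=SAFE)}"

print(windows_path_to_file_uri(r"c:\path\to\the file.txt"))
# file:///c:/path/to/the%20file.txt
print(windows_path_to_file_uri(r"\\hostname\path\to\the file.txt"))
# file://hostname/path/to/the%20file.txt
print(windows_path_to_file_uri(r"c:\notes\a#b.txt"))
# file:///c:/notes/a%23b.txt
```

Where the platform functions are available they remain the better choice, since they also handle cases this sketch ignores.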


Legacy URLs

To aid the installed base of legacy applications on Win32, PathCreateFromUrl recognizes certain URLs which do not meet these criteria, and treats them uniformly. These are called "legacy" file URLs as opposed to "healthy" file URLs. In the past, a variety of other applications have used other systems. Some added an additional two slashes. For example, UNC path \\remotehost\share\dir\file.txt would become file:////remotehost/share/dir/file.txt instead of the "healthy" file://remotehost/share/dir/file.txt.
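
The sketch below shows, under stated assumptions, how two legacy spellings could be rewritten into the "healthy" form: the extra two slashes before a UNC host, and a vertical bar after a drive letter. The function modernize_legacy_file_url is hypothetical and does not attempt to reproduce the actual behavior of PathCreateFromUrl.

```python
import re

def modernize_legacy_file_url(url):
    """Rewrite two common legacy file URL spellings; leave other URLs unchanged."""
    # Four slashes before a host: move the host into the authority position.
    if url.startswith("file:////"):
        return "file://" + url[len("file:////"):]
    # Drive letter followed by a vertical bar: restore the colon.
    return re.sub(r"^(file:///[A-Za-z])\|", r"\1:", url)

print(modernize_legacy_file_url("file:////remotehost/share/dir/file.txt"))
# file://remotehost/share/dir/file.txt
print(modernize_legacy_file_url("file:///c|/WINDOWS/clock.avi"))
# file:///c:/WINDOWS/clock.avi
```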


Web pages

File URLs are rarely used in web pages on the public Internet, since they are only useful if it is known that a specific file exists on the designated host or the local computer. Additionally, for security reasons, web browsers generally disable file URLs in web pages that were not themselves loaded from a file URL. The ''host'' specifier can be used to retrieve a file from an external source. However, no specific file-retrieval protocol is specified and the interpretation of the host specifier is not well standardized, so it is only useful in specific circumstances. If a web page wants to access files stored on the computer the web browser is running on, a modern alternative to file URLs is the HTML5 File API.

