Nameprep is the process of case-folding a
string
String or strings may refer to:
* String (structure), a long flexible structure made from threads twisted together, which is used to tie, bind, or hang other objects
Arts, entertainment, and media Films
* ''Strings'' (1991 film), a Canadian ani ...
to lowercase and removal of some generally invisible code points before it is suitable to represent a
domain name
A domain name is a string that identifies a realm of administrative autonomy, authority or control within the Internet. Domain names are often used to identify services provided through the Internet, such as websites, email services and more. As ...
, or other such canonical name. It is used by the
Internationalizing Domain Names in Applications (IDNA) standard, using the
Unicode
Unicode, formally The Unicode Standard,The formal version reference is is an information technology standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems. The standard, whic ...
standard for
NFKC normalization.
Nameprep is defined in RFC 3491, "Nameprep: A Stringprep Profile for Internationalized Domain Names (IDN)", as a profile of
stringprep, which is described in RFC 3454, "Preparation of Internationalized Strings ("stringprep")."
It does not map lookalike characters to a single character nor prohibit the use of lookalike characters. There are good reasons for this, such as the fact that same sets of characters may be lookalikes in some fonts but not in others, and the fact that any decision on which character to map to will obviously provide a bias towards users of one script; but it also has potentially grave implications for security if not considered by the designers and administrators of systems based on nameprep (the best known example of this being VeriSign's handling of IDNA names in .com and .net).
See also
*
Homoglyph
In orthography and typography, a homoglyph is one of two or more graphemes, characters, or glyphs with shapes that appear identical or very similar. The designation is also applied to sequences of characters sharing these properties.
Synoglyphs ...
*
Unicode
Unicode, formally The Unicode Standard,The formal version reference is is an information technology standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems. The standard, whic ...
*
Internationalization
In economics, internationalization or internationalisation is the process of increasing involvement of enterprises in international markets, although there is no agreed definition of internationalization. Internationalization is a crucial strateg ...
*
International Components for Unicode
International Components for Unicode (ICU) is an open-source project of mature C/ C++ and Java libraries for Unicode support, software internationalization, and software globalization. ICU is widely portable to many operating systems and environ ...
(ICU contains an implementation of nameprep)
*
Internationalized domain name
An internationalized domain name (IDN) is an Internet domain name that contains at least one label displayed in software applications, in whole or in part, in non-latin script or alphabet, such as Arabic, Bengali, Chinese (Mandarin, simplifie ...
*
IDN homograph attack
The internationalized domain name (IDN) homograph attack is a way a malicious party may deceive computer users about what remote system they are communicating with, by exploiting the fact that many different characters look alike (i.e., they ar ...
or "lookalike" character spoofing based on a URL's appearance as read by a web user or as entered by a web user ( read in a page font, entered in the user's font of choice.) Note: this is not URI ambiguity in encoding. Examples are provided in both of the above articles.
Unicode
Domain Name System
{{web-stub