Canonical Link Element

A canonical link element is an HTML element that helps webmasters prevent duplicate content issues in search engine optimization by specifying the "canonical" or "preferred" version of a web page. It is described in RFC 6596, which was published in April 2012.


Purpose

A major problem for search engines is determining the original source of documents that are available on multiple URLs. Content duplication can happen in many ways, including:
* Duplication due to GET parameters
* Duplication with multiple URLs due to the content management system (CMS)
* Duplication due to accessibility on different hosts/protocols
* Duplication due to print versions of websites

Duplicate content issues occur when the same content is accessible from multiple URLs. For example, https://example.com/page.php?parameter=1 would be considered by search engines to be an entirely different page from https://example.com/page.php, even though both URLs may reference the same content.

In February 2009, Google, Yahoo and Microsoft announced support for the canonical link element, which can be inserted into the <head> section of a web page, to allow webmasters to prevent these issues. The canonical link element helps webmasters make clear to the search engines which page should be credited as the original.
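
The parameter case can be illustrated with a minimal Python sketch. The helper name `without_query` is hypothetical and the URLs are the example.com placeholders used elsewhere in this article. It only shows why a crawler comparing raw URL strings sees two addresses; it is not how any particular search engine detects duplicates.

```python
from urllib.parse import urlsplit, urlunsplit


def without_query(url: str) -> str:
    """Return the URL with its query string and fragment removed (illustration only)."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


plain = "https://example.com/page.php"
with_param = "https://example.com/page.php?parameter=1"

# Compared as raw strings, the two addresses are different.
print(plain == with_param)                 # False
# They differ only in the query string.
print(without_query(with_param) == plain)  # True
```

Dropping parameters blindly is not safe in general, because a parameter may change the content that is served. The canonical link element lets the site owner state the preferred URL explicitly instead.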


How search engines handle rel=canonical

Search engines try to use canonical link definitions as an output filter for their search results. If multiple URLs in the result set contain the same content, the canonical link definitions will likely be used to determine the original source of the content. "For example, when Google finds identical content instances, it decides to show one of them. Its choice of the resource to display in the search results will depend upon the search query." According to Google, the canonical link element is not considered to be a directive, but rather a hint that the ranking algorithm will "honor strongly."

While the canonical link element has its benefits, Matt Cutts, then the head of Google's webspam team, has said that the search engine prefers the use of 301 redirects. Cutts said the preference for redirects is because Google's spiders (web crawlers) can choose to ignore a canonical link element if they deem it more beneficial to do so.


Implementation


Semantic tag

The canonical link element can either be used in the semantic HTML or sent with the HTTP header of a document. For non-HTML documents, the HTTP header is an alternative way to set a canonical URL. By the HTML5 standard, the <link rel="canonical"> element must be within the <head> section of the document.
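
As a rough sketch of the <head> requirement, the following Python code uses the standard library's `html.parser` to read a canonical URL from a page, ignoring any canonical link that appears outside <head>. The class and function names (`CanonicalFinder`, `find_canonical`) and the sample page are assumptions made for illustration. The sketch expects an explicit <head> tag and does not implement the full HTML parsing rules.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the first <link rel="canonical"> found inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link" and self.in_head and self.canonical is None:
            attr = dict(attrs)
            # rel is a space-separated, case-insensitive list of link types.
            rels = (attr.get("rel") or "").lower().split()
            if "canonical" in rels:
                self.canonical = attr.get("href")

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def find_canonical(html_text):
    """Return the canonical URL declared in <head>, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonical


page = """<!DOCTYPE html>
<html>
  <head>
    <link rel="canonical" href="https://example.com/page.php" />
  </head>
  <body>
    <link rel="canonical" href="https://example.com/ignored.php" />
  </body>
</html>"""

print(find_canonical(page))  # https://example.com/page.php
```

The second link is skipped because it sits in <body>, which mirrors the placement rule described above.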


Self-hyperlink

Some sites, such as Stack Overflow, have on-page hyperlinks that link to a clean URL of the page itself. This has several usability benefits: the hyperlink target URL or title is easy to copy if the browser or a browser extension offers a "Copy link text" context menu option for hyperlinks; the original URL can be retrieved from a saved page if the browser did not store it in a comment inside the file; and the opened page can be duplicated into a new tab right next to the current one if the browser lacks such a feature.


Examples


HTML

Below is an example of HTML code that uses rel="canonical" inside the <link> tag. The code could be used on a page such as https://example.com/page.php?parameter=1 to tell search engines that https://example.com/page.php is the preferred version of the webpage.

    <head>
      <link rel="canonical" href="https://example.com/page.php" />
    </head>
    ...
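
When such a tag is generated by a template or script, the URL should be escaped for use in an HTML attribute. The following is a minimal Python sketch under that assumption. The function name `canonical_link_tag` and the second URL with two parameters are hypothetical and serve only to show the escaping.

```python
from html import escape


def canonical_link_tag(url: str) -> str:
    """Build a canonical link element, escaping the URL for an HTML attribute."""
    return f'<link rel="canonical" href="{escape(url, quote=True)}" />'


print(canonical_link_tag("https://example.com/page.php"))
# An ampersand in the URL is written as &amp; inside the attribute value.
print(canonical_link_tag("https://example.com/page.php?a=1&b=2"))
```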


HTTP

    HTTP/1.1 200 OK
    Content-Type: application/pdf
    Link: <https://example.com/page.php>; rel="canonical"
    Content-Length: 4223
    ...
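
A client could read the canonical URL from such a header roughly as follows. This is a simplified Python sketch, not a complete Link header parser: the function name `canonical_from_link_header` is hypothetical, the URLs are placeholders, and edge cases such as commas or semicolons inside quoted parameter values are not handled.

```python
import re

# One link value: a target in angle brackets followed by its parameters.
LINK_ITEM = re.compile(r"<([^>]*)>([^<]*)")


def canonical_from_link_header(value):
    """Return the target of the first link whose rel includes "canonical", else None."""
    for target, params in LINK_ITEM.findall(value):
        for param in params.split(";"):
            name, _, raw = param.partition("=")
            if name.strip().lower() == "rel":
                # Remove surrounding quotes, spaces and a trailing list comma.
                rels = raw.strip(' \t,"').lower().split()
                if "canonical" in rels:
                    return target
    return None


header = '<https://example.com/page.php>; rel="canonical"'
print(canonical_from_link_header(header))  # https://example.com/page.php
```

The same function also works when the header carries several comma-separated links, returning only the one marked as canonical.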


See also

* URL normalization

