XHTML RDFa

Extensible HyperText Markup Language (XHTML) is part of the family of XML markup languages. It mirrors or extends versions of the widely used HyperText Markup Language (HTML), the language in which Web pages are formulated. While HTML, prior to HTML5, was defined as an application of Standard Generalized Markup Language (SGML), a flexible markup language framework, XHTML is an application of XML, a more restrictive subset of SGML. XHTML documents are well-formed and may therefore be parsed using standard XML parsers, unlike HTML, which requires a lenient HTML-specific parser. XHTML 1.0 became a World Wide Web Consortium (W3C) recommendation on 26 January 2000. XHTML 1.1 became a W3C recommendation on 31 May 2001. The standard known as XHTML5 is being developed as an XML adaptation of the HTML5 specification.
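
The claim that well-formed XHTML can be handled by a standard XML parser can be illustrated with a minimal sketch. It assumes Python's standard-library xml.etree.ElementTree as the generic XML parser; the helper name is_well_formed and both sample documents are hypothetical and not taken from any specification.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if a generic XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Every element is closed, so a plain XML parser can read this document.
xhtml_doc = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Example</title></head>"
    "<body><p>First line<br/>second line</p></body>"
    "</html>"
)

# HTML-style markup: <br> and <p> are left unclosed, which XML does not allow.
html_style_doc = (
    "<html><head><title>Example</title></head>"
    "<body><p>First line<br>second line</body></html>"
)

print(is_well_formed(xhtml_doc))       # True
print(is_well_formed(html_style_doc))  # False
```

The second document is acceptable to a lenient HTML parser but is rejected here, because the XML parser has no HTML-specific recovery rules.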


Overview

XHTML 1.0 is "a reformulation of the three HTML 4 document types as applications of XML 1.0". The World Wide Web Consortium (W3C) also continues to maintain the HTML 4.01 Recommendation, and the specifications for HTML5 and XHTML5 are being actively developed. In the current XHTML 1.0 Recommendation document, as published and revised in August 2002, the W3C commented that "The XHTML family is the next step in the evolution of the Internet. By migrating to XHTML today, content developers can enter the XML world with all of its attendant benefits, while still remaining confident in their content's backward and future compatibility."

However, in 2005, the Web Hypertext Application Technology Working Group (WHATWG) formed, independently of the W3C, to work on advancing ordinary HTML not based on XHTML. The WHATWG eventually began working on a standard that supported both XML and non-XML serializations, HTML5, in parallel to W3C standards such as XHTML 2. In 2007, the W3C's HTML working group voted to officially recognize HTML5 and work on it as the next-generation HTML standard. In 2009, the W3C allowed the XHTML 2 Working Group's charter to expire, acknowledging that HTML5 would be the sole next-generation HTML standard, including both XML and non-XML serializations. Of the two serializations, the W3C suggests that most authors use the HTML syntax, rather than the XHTML syntax.


Motivation

XHTML was developed to make HTML more extensible and increase interoperability with other data formats. In addition, browsers were forgiving of errors in HTML, and most websites were displayed despite technical errors in the markup; XHTML introduced stricter error handling. HTML 4 was ostensibly an application of Standard Generalized Markup Language (SGML); however, the specification for SGML was complex, and neither web browsers nor the HTML 4 Recommendation were fully conformant to it. The XML standard, approved in 1998, provided a simpler data format closer in simplicity to HTML 4. By shifting to an XML format, it was hoped HTML would become compatible with common XML tools; servers and proxies would be able to transform content, as necessary, for constrained devices such as mobile phones. By using namespaces, XHTML documents could provide extensibility by including fragments from other XML-based languages such as Scalable Vector Graphics and MathML. Finally, the renewed work would provide an opportunity to divide HTML into reusable components (XHTML Modularization) and clean up untidy parts of the language.
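
The namespace mechanism described above can be sketched as follows. This is an illustrative example only: it assumes Python's xml.etree.ElementTree as a generic XML tool, and the sample document and the helper namespace_of are hypothetical. The two namespace URIs are the published ones for XHTML and SVG.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"

# An XHTML document that embeds an SVG fragment by switching namespaces.
document = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Inline SVG</title></head>
  <body>
    <p>A circle:</p>
    <svg xmlns="http://www.w3.org/2000/svg" width="40" height="40">
      <circle cx="20" cy="20" r="15"/>
    </svg>
  </body>
</html>"""


def namespace_of(element) -> str:
    """ElementTree stores tags as '{namespace}local'; return the namespace part."""
    tag = element.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


root = ET.fromstring(document)

# Collect the local names of every element that belongs to the SVG namespace.
svg_names = [
    el.tag.split("}", 1)[1] for el in root.iter() if namespace_of(el) == SVG_NS
]
print(svg_names)  # ['svg', 'circle']
```

Because each element carries its namespace, a generic tool can separate the SVG fragment from the surrounding XHTML without knowing either vocabulary in advance.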


Relationship to HTML

There are various differences between XHTML and HTML. The Document Object Model (DOM) is a tree structure that represents the page internally in applications, and XHTML and HTML are two different ways of representing that in markup. Both are less expressive than the DOM (for example, "--" may be placed in comments in the DOM, but cannot be represented in a comment in either XHTML or HTML), and generally, XHTML's XML syntax is more expressive than HTML (for example, arbitrary namespaces are not allowed in HTML). XHTML uses an XML syntax, while HTML uses a pseudo-SGML syntax (officially SGML for HTML 4 and under, but never in practice, and standardized away from SGML in HTML5). Because the expressible contents of the DOM in syntax are slightly different, there are some changes in actual behavior between the two models. Syntax differences, however, can be overcome by implementing an alternate translational framework within the markup.

First, there are some differences in syntax:
* Broadly, the XML rules require that all elements be closed, either by a separate closing tag or using the self-closing syntax (e.g. <br/>), while HTML syntax permits some elements to be unclosed because either they are always empty (e.g. <br>) or their end can be determined implicitly ("omissibility", e.g. <p>).
* XML is case-sensitive for element and attribute names, while HTML is not.
* Some shorthand features in HTML are omitted in XML, such as (1) ''attribute minimization'', where attribute values or their quotes may be omitted (e.g. <option selected> or <option selected=selected>, while in XML this must be expressed as <option selected="selected">); (2) ''element minimization'', which may be used to remove elements entirely (such as tbody, which is inferred in a table if not given); and (3) the rarely used SGML syntax for element minimization ("shorttag"), which most browsers do not implement.
* There are numerous other technical requirements surrounding namespaces and precise parsing of whitespace and certain characters and elements. The exact parsing of HTML in practice has been undefined until recently; see the HTML5 specification for full details, or the working summary "HTML vs. XHTML".

In addition to the syntactical differences, there are some behavioral differences, mostly arising from the underlying differences in serialization. For example:
* Behavior on parse errors differs. A fatal parse error in XML (such as an incorrect tag structure) causes document processing to be aborted.
* Most content requiring namespaces will not work in HTML, except the built-in support for SVG and MathML in the HTML5 parser along with certain magic prefixes such as xlink.
* JavaScript processing is different in XHTML, with minor changes in case sensitivity to some functions, and further precautions to restrict processing to well-formed content. Scripts must not use the document.write() method; it is not available for XHTML. The innerHTML property is available, but will not insert non-well-formed content. On the other hand, it can be used to insert well-formed namespaced content into XHTML.
* Cascading Style Sheets (CSS) are also applied differently. Due to XHTML's case-sensitivity, all CSS selectors become case-sensitive for XHTML documents. Some CSS properties, such as backgrounds, set on the body element in HTML are 'inherited upwards' into the html element; this appears not to be the case for XHTML.
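
Two of the syntax differences listed above, case sensitivity and attribute minimization, can be observed with a short sketch. It assumes Python's html.parser.HTMLParser as a stand-in for a lenient HTML parser (it is tolerant, but it is not the HTML5 parsing algorithm used by browsers) and xml.etree.ElementTree as the XML parser. The class StartTagCollector and the helper xml_accepts are hypothetical names.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class StartTagCollector(HTMLParser):
    """Records each start tag as the lenient HTML parser reports it."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append((tag, attrs))


def xml_accepts(markup: str) -> bool:
    """Return True if the XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# HTML side: upper-case tag name and a minimized attribute are both tolerated.
collector = StartTagCollector()
collector.feed("<OPTION selected>One</OPTION>")
collector.close()
html_view = collector.start_tags
print(html_view)  # [('option', [('selected', None)])]

# XML side: a minimized attribute is a fatal error ...
xml_minimized = xml_accepts("<option selected>One</option>")
# ... and so is a closing tag whose case does not match the opening tag.
xml_mixed_case = xml_accepts("<P>text</p>")
# The fully spelled-out form parses and keeps the attribute value.
xml_strict = ET.fromstring('<option selected="selected">One</option>')

print(xml_minimized, xml_mixed_case, xml_strict.get("selected"))
```

The HTML parser normalizes the tag name to lower case and reports the minimized attribute with no value, while the XML parser aborts on both shorthand forms.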


Adoption

The similarities between HTML 4.01 and XHTML 1.0 led many websites and content management systems to adopt the initial W3C XHTML 1.0 Recommendation. To aid authors in the transition, the W3C provided guidance on how to publish XHTML 1.0 documents in an HTML-compatible manner, and serve them to browsers that were not designed for XHTML. Such "HTML-compatible" content is sent using the HTML media type (text/html) rather than the official Internet media type for XHTML (application/xhtml+xml). When comparing the adoption of XHTML with that of regular HTML, therefore, it is important to distinguish whether it is media type usage or actual document contents that are being compared.

Most web browsers have mature support for all of the possible XHTML media types. (Early implementations, such as Mozilla 0.7 and Opera 6.0, both released in 2001, do not incrementally render XHTML as it is received over the network, giving a degraded user experience; see the Mozilla Web Author FAQ. Later browsers such as Opera 9.0, Safari 3.0, and Firefox 3.0 do not have this issue.) The notable exception is Internet Explorer versions 8 and earlier by Microsoft; rather than rendering application/xhtml+xml content, a dialog box invites the user to save the content to disk instead. Both Internet Explorer 7 (released in 2006) and Internet Explorer 8 (released in March 2009) exhibit this behavior. Microsoft developer Chris Wilson explained in 2005 that IE7's priorities were improved browser security and CSS support, and that proper XHTML support would be difficult to graft onto IE's compatibility-oriented HTML parser; however, Microsoft added support for true XHTML in IE9. As long as support is not widespread, most web developers avoid using XHTML that is not HTML-compatible, so advantages of XML such as namespaces, faster parsing, and smaller-footprint browsers do not benefit the user.
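
One way a server might choose between the two media types discussed above is to inspect the request's Accept header. The following is a deliberately simplified sketch, not a complete content-negotiation implementation: it ignores wildcards and does not rank alternatives by quality value. The function name choose_media_type and the sample header strings are hypothetical.

```python
XHTML_MEDIA_TYPE = "application/xhtml+xml"
HTML_MEDIA_TYPE = "text/html"


def choose_media_type(accept_header: str) -> str:
    """Serve XHTML as XML only when the client explicitly lists the XHTML type."""
    for item in accept_header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_type = parts[0].lower()
        quality = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat an unreadable q value as "not acceptable"
        if media_type == XHTML_MEDIA_TYPE and quality > 0:
            return XHTML_MEDIA_TYPE
    # Fall back to the HTML-compatible media type.
    return HTML_MEDIA_TYPE


# A client that advertises XHTML support explicitly.
print(choose_media_type("text/html,application/xhtml+xml,application/xml;q=0.9"))
# A client that only sends generic types and a wildcard.
print(choose_media_type("image/gif, image/jpeg, */*"))
```

Falling back to text/html for clients that do not list application/xhtml+xml avoids the save-to-disk dialog described above, at the cost of those clients parsing the document with HTML rules.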


Criticism

In the early 2000s, some Web developers began to question why Web authors ever made the leap into authoring in XHTML. Others countered that the problems ascribed to the use of XHTML could mostly be attributed to two main sources: the production of invalid XHTML documents by some Web authors and the lack of support for XHTML built into Internet Explorer 6. They went on to describe the benefits of XML-based Web documents (i.e. XHTML) regarding searching, indexing, and parsing as well as future-proofing the Web itself.

In October 2006, HTML inventor and W3C chair Tim Berners-Lee, introducing a major W3C effort to develop a new HTML specification, posted in his blog that, "The attempt to get the world to switch to XML ... all at once didn't work. The large HTML-generating public did not move ... Some large communities did shift and are enjoying the fruits of well-formed systems ... The plan is to charter a completely new HTML group." The current HTML5 working draft says "special attention has been given to defining clear conformance criteria for user agents in an effort to improve interoperability ... while at the same time updating the HTML specifications to address issues raised in the past few years." Ian Hickson, who criticized the improper use of XHTML in 2002, is a member of the group developing this specification and is listed as one of the co-editors of the current working draft. Simon Pieters researched the XML-compliance of mobile browsers and concluded "the claim that XHTML would be needed for mobile devices is simply a myth".


Versions of XHTML


XHTML 1.0

December 1998 saw the publication of a W3C Working Draft entitled ''Reformulating HTML in XML''. This introduced Voyager, the codename for a new markup language based on HTML 4, but adhering to the stricter syntax rules of XML. By February 1999 the name of the specification had changed to ''XHTML 1.0: The Extensible HyperText Markup Language'', and in January 2000 it was officially adopted as a W3C Recommendation. There are three formal DTDs for XHTML 1.0, corresponding to the three different versions of HTML 4.01:
* XHTML 1.0 Strict is the XML equivalent to strict HTML 4.01, and includes elements and attributes that have not been marked deprecated in the HTML 4.01 specification. XHTML 1.0 Strict has been the document type used for the homepage of the website of the World Wide Web Consortium.
* XHTML 1.0 Transitional is the XML equivalent of HTML 4.01 Transitional, and includes the presentational elements (such as center, font and strike) excluded from the strict version.
* XHTML 1.0 Frameset is the XML equivalent of HTML 4.01 Frameset, and allows for the definition of frameset documents, a common Web feature in the late 1990s.

The second edition of XHTML 1.0 became a W3C Recommendation in August 2002.


Modularization of XHTML

Modularization provides an abstract collection of components through which XHTML can be subsetted and extended. The feature is intended to help XHTML extend its reach onto emerging platforms, such as mobile devices and Web-enabled televisions. The initial draft of ''Modularization of XHTML'' became available in April 1999, and reached Recommendation status in April 2001. The first modular XHTML variants were XHTML 1.1 and XHTML Basic 1.0. In October 2008 ''Modularization of XHTML'' was superseded by ''XHTML Modularization 1.1'', which adds an XML Schema implementation. It was superseded by a second edition in July 2010.


XHTML 1.1: Module-based XHTML

XHTML 1.1 evolved out of the work surrounding the initial ''Modularization of XHTML'' specification. The W3C released the first draft in September 1999; the Recommendation status was reached in May 2001. The modules combined within XHTML 1.1 effectively recreate XHTML 1.0 Strict, with the addition of ruby annotation elements (ruby, rbc, rtc, rb, rt and rp) to better support East-Asian languages. Other changes include the removal of the name attribute from the a and map elements, and (in the first edition of the language) the removal of the lang attribute in favor of xml:lang.

Although XHTML 1.1 is largely compatible with XHTML 1.0 and HTML 4, in August 2002 the Working Group issued a formal Note advising that it should not be transmitted with the HTML media type. With limited browser support for the alternate application/xhtml+xml media type, XHTML 1.1 proved unable to gain widespread use. In January 2009 a second edition of the document (''XHTML Media Types – Second Edition'') was issued, relaxing this restriction and allowing XHTML 1.1 to be served as text/html. This document supersedes the HTML Compatibility Guidelines originally found in XHTML 1.0 Appendix C.

The second edition of XHTML 1.1, issued on 23 November 2010, addresses various errata and adds an XML Schema implementation not included in the original specification. (It was first released briefly on 7 May 2009 as a "Proposed Edited Recommendation" before being rescinded on 19 May due to unresolved issues.)


XHTML Basic

Since information appliances may lack the system resources to implement all XHTML abstract modules, the W3C defined a feature-limited XHTML specification called XHTML Basic. It provides a minimal feature subset sufficient for the most common content-authoring. The specification became a W3C recommendation in December 2000.

Of all the versions of XHTML, XHTML Basic 1.0 provides the fewest features. With XHTML 1.1, it is one of the two first implementations of modular XHTML. In addition to the Core Modules (Structure, Text, Hypertext, and List), it implements the following abstract modules: Base, Basic Forms, Basic Tables, Image, Link, Metainformation, Object, Style Sheet, and Target.

XHTML Basic 1.1 replaces the Basic Forms Module with the Forms Module and adds the Intrinsic Events, Presentation, and Scripting modules. It also supports additional tags and attributes from other modules. This version became a W3C recommendation on 29 July 2008. The current version of XHTML Basic is 1.1 Second Edition (23 November 2010), in which the language is re-implemented in the W3C's XML Schema language. This version also supports the lang attribute.


XHTML-Print

XHTML-Print, which became a W3C Recommendation in September 2006, is a specialized version of XHTML Basic designed for documents printed from information appliances to low-end printers.


XHTML Mobile Profile

XHTML Mobile Profile (abbreviated XHTML MP or XHTML-MP) is a third-party variant of the W3C's XHTML Basic specification. Like XHTML Basic, XHTML MP was developed for information appliances with limited system resources. In October 2001, a limited company called the Wireless Application Protocol Forum began adapting XHTML Basic for WAP 2.0, the second major version of the Wireless Application Protocol. The WAP Forum based their DTD on the W3C's Modularization of XHTML, incorporating the same modules the W3C used in XHTML Basic 1.0, except for the Target Module. Starting with this foundation, the WAP Forum replaced the Basic Forms Module with a partial implementation of the Forms Module, added partial support for the Legacy and Presentation modules, and added full support for the Style Attribute Module. In 2002, the WAP Forum was subsumed into the Open Mobile Alliance (OMA), which continued to develop XHTML Mobile Profile as a component of their OMA Browsing Specification.


XHTML Mobile Profile 1.1

To this version, finalized in 2004, the OMA added partial support for the Scripting Module and partial support for Intrinsic Events. XHTML MP 1.1 is part of v2.1 of the OMA Browsing Specification (1 November 2002).


XHTML Mobile Profile 1.2

This version, finalized on 27 February 2007, expands the capabilities of XHTML MP 1.1 with full support for the Forms Module and OMA Text Input Modes. XHTML MP 1.2 is part of v2.3 of the OMA Browsing Specification (13 March 2007).


XHTML Mobile Profile 1.3

XHTML MP 1.3 (finalized on 23 September 2008) uses the XHTML Basic 1.1 document type definition, which includes the Target Module. Events in this version of the specification are updated to DOM Level 3 specifications (i.e., they are platform- and language-neutral).


XHTML 1.2

The XHTML 2 Working Group considered the creation of a new language based on XHTML 1.1. Had XHTML 1.2 been created, it would have included WAI-ARIA and role attributes to better support accessible web applications, and improved Semantic Web support through RDFa. The inputmode attribute from XHTML Basic 1.1, along with the target attribute (for specifying frame targets), might also have been present. The XHTML2 WG had not been chartered to carry out the development of XHTML 1.2. Since the W3C announced that it did not intend to recharter the XHTML2 WG, and closed the WG in December 2010, the XHTML 1.2 proposal did not eventuate.


XHTML 2.0

Between August 2002 and July 2006, the W3C released eight Working Drafts of XHTML 2.0, a new version of XHTML able to make a clean break from the past by discarding the requirement of backward compatibility. This lack of compatibility with XHTML 1.x and HTML 4 caused some early controversy in the web developer community. (See ''XHTML 2.0 Considered Harmful'' by browser developer Tantek Çelik, who criticizes early drafts of XHTML 2.0 for the absence of the style attribute and the cite element. Developer Daniel Glazman offers criticism as well, but also shows support for some backward-incompatible changes such as the decision to remove the ins and del elements.) Some parts of the language (such as the role and RDFa attributes) were subsequently split out of the specification and worked on as separate modules, partially to help make the transition from XHTML 1.x to XHTML 2.0 smoother.

The ninth draft of XHTML 2.0 was expected to appear in 2009, but on 2 July 2009, the W3C decided to let the XHTML2 Working Group charter expire by that year's end, effectively halting any further development of the draft into a standard. Instead, XHTML 2.0 and its related documents were released as W3C Notes in 2010.

New features to have been introduced by XHTML 2.0 included:
* HTML forms were to be replaced by XForms, an XML-based user input specification allowing forms to be displayed appropriately for different rendering devices.
* HTML frames were to be replaced by XFrames.
* The DOM Events were to be replaced by XML Events, which uses the XML Document Object Model.
* A new list element type, the nl element type, was to be included to specifically designate a list as a navigation list. This would have been useful in creating nested menus, which are currently created by a wide variety of means like nested unordered lists or nested definition lists.
* Any element was to be able to act as a hyperlink, similar to XLink. However, XLink itself is not compatible with XHTML due to design differences.
* Any element was to be able to reference alternative media with the src attribute.
* The alt attribute of the img element was removed: alternative text was to be given in the content of the img element, much like the object element.
* A single heading element (h) was added. The level of these headings was determined by the depth of the nesting. This would have allowed the use of headings to be infinite, rather than limiting use to six levels deep.
* The remaining presentational elements i, b and tt, still allowed in XHTML 1.x (even Strict), were to be absent from XHTML 2.0. The only somewhat presentational elements remaining were to be sup and sub for superscript and subscript respectively, because they have significant non-presentational uses and are required by certain languages. All other tags were meant to be semantic instead (e.g. strong for strong emphasis) while allowing the user agent to control the presentation of elements via CSS (e.g. rendered as boldface text in most visual browsers, but possibly rendered with changes of tone in a text-to-speech reader, larger and italic font per rules in a user-end stylesheet, etc.).
* RDF triples were to be expressible with the property and about attributes, to facilitate the conversion from XHTML to RDF/XML.
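
The nesting-based heading model in the list above can be made concrete with a small sketch. It is illustrative only: it assumes that nesting is expressed with section elements, omits the XHTML 2.0 namespace for brevity, and uses Python's xml.etree.ElementTree. The sample markup and the function heading_levels are hypothetical and are not taken from any XHTML 2.0 draft.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML 2.0-style fragment: one heading element, nested sections.
draft = """<body>
  <section>
    <h>Top</h>
    <section>
      <h>Nested</h>
      <section><h>Deeper</h></section>
    </section>
  </section>
</body>"""


def heading_levels(element, depth=0):
    """Return (heading text, level) pairs, where level is the section nesting depth."""
    results = []
    for child in element:
        if child.tag == "h":
            results.append((child.text, depth))
        elif child.tag == "section":
            # Entering a section makes every heading inside it one level deeper.
            results.extend(heading_levels(child, depth + 1))
        else:
            results.extend(heading_levels(child, depth))
    return results


levels = heading_levels(ET.fromstring(draft))
print(levels)  # [('Top', 1), ('Nested', 2), ('Deeper', 3)]
```

Because the level is derived from structure rather than from the element name, nothing in this scheme limits headings to six levels.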


XHTML5

HTML5 grew independently of the W3C, through a loose group of browser manufacturers and other interested parties calling themselves the WHATWG, or Web Hypertext Application Technology Working Group. The key motive of the group was to create a platform for dynamic web applications; they considered XHTML 2.0 to be too document-centric, and not suitable for the creation of internet forum sites or online shops.

HTML5 has both a regular text/html serialization and an XML serialization, which is also known as XHTML5. The language is more compatible with HTML 4 and XHTML 1.x than XHTML 2.0, due to the decision to keep the existing HTML form elements and events model. It adds many new elements not found in XHTML 1.x, however, such as section and aside tags.

The XHTML5 language, like HTML5, uses a DOCTYPE declaration without a DTD. Furthermore, the specification deprecates earlier XHTML DTDs by asking the browsers to replace them with one containing only entity definitions for named characters during parsing.


Semantic content in XHTML

XHTML+RDFa is an extended version of the XHTML markup language for supporting RDF through a collection of attributes and processing rules in the form of well-formed XML documents. This host language is one of the techniques used to develop Semantic Web content by embedding rich semantic markup.
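
To give a feel for how attributes can carry RDF statements, here is a deliberately naive Python sketch. The subject URI and literal values are invented for illustration, and simple_triples is a hypothetical helper: it reads only the about and property attributes and does not implement the actual RDFa processing rules (CURIE expansion, rel/rev, datatype, content and so on).

```python
import xml.etree.ElementTree as ET

# Illustrative XHTML+RDFa fragment; the subject URI and literal values are made up.
DOC = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:dc="http://purl.org/dc/elements/1.1/">
  <body>
    <div about="http://example.org/report">
      <span property="dc:title">Annual Report</span>
      <span property="dc:creator">A. Author</span>
    </div>
  </body>
</html>"""

def simple_triples(element, subject=None):
    """Collect (subject, property, literal) tuples from about/property attributes.

    Deliberately naive: the nearest enclosing 'about' is taken as the subject,
    and the element text is taken as the literal value.
    """
    subject = element.get("about", subject)
    triples = []
    if "property" in element.attrib:
        triples.append((subject, element.get("property"), element.text))
    for child in element:
        triples.extend(simple_triples(child, subject))
    return triples

triples = simple_triples(ET.fromstring(DOC))
for triple in triples:
    print(triple)
```

A real application would use a conforming RDFa processor rather than a hand-written walk like this one.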


Valid XHTML documents

An XHTML document that conforms to an XHTML specification is said to be ''valid''. Validity assures consistency in document code, which in turn eases processing, but does not necessarily ensure consistent rendering by browsers. A document can be checked for validity with the W3C Markup Validation Service (for XHTML5, the Validator.nu Living Validator should be used instead). In practice, many web development programs provide code validation based on the W3C standards.
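
Validity presupposes well-formedness, which can be checked locally with any XML parser. The Python sketch below tests only well-formedness, not validity against a DTD or schema; is_well_formed is an illustrative helper name.

```python
import xml.etree.ElementTree as ET

def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML (well-formed, not necessarily valid)."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True

good = '<p xmlns="http://www.w3.org/1999/xhtml">Line one<br/>Line two</p>'
bad = '<p xmlns="http://www.w3.org/1999/xhtml">Line one<br>Line two</p>'  # <br> is never closed

print(is_well_formed(good))  # True
print(is_well_formed(bad))   # False
```

A document that passes this check may still be invalid, for example by nesting elements in a way the chosen document type forbids; a validator is needed for that.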


Root element

The root element of an XHTML document must be html, and must contain an xmlns attribute to associate it with the XHTML namespace. The namespace URI for XHTML is http://www.w3.org/1999/xhtml. A root tag may additionally feature an xml:lang attribute to identify the document with a natural language, for example: <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
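
The root-element requirement can be checked programmatically. The following Python sketch uses only the standard library; root_info is a hypothetical helper, and the sample documents are minimal placeholders.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def root_info(markup: str):
    """Return (is_xhtml_root, language) for the document's root element."""
    root = ET.fromstring(markup)
    return root.tag == f"{{{XHTML_NS}}}html", root.get(XML_LANG)

doc = ('<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">'
       '<head><title>t</title></head><body/></html>')

print(root_info(doc))                            # (True, 'en')
print(root_info('<html><head/><body/></html>'))  # (False, None)
```

The second call shows that an html element without the xmlns attribute is not in the XHTML namespace, so a namespace-aware parser does not treat it as an XHTML root.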


DOCTYPEs

In order to validate an XHTML document, a Document Type Declaration, or ''DOCTYPE'', may be used. A DOCTYPE declares to the browser the Document Type Definition (DTD) to which the document conforms. A Document Type Declaration should be placed before the root element. The system identifier part of the DOCTYPE, typically a URL that begins with http://, need only point to a copy of the DTD to use if the validator cannot locate one based on the public identifier (the other quoted string). It does not need to be any specific URL; in fact, authors are encouraged to use local copies of the DTD files when possible. The public identifier, however, must match the one defined for the chosen DTD character for character.
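
The two quoted strings of a DOCTYPE can be read back with a standard XML parser. The Python sketch below uses the expat bindings from the standard library and the XHTML 1.0 Strict declaration as sample input; read_doctype is an illustrative helper, and the parser does not fetch the DTD itself.

```python
import xml.parsers.expat

def read_doctype(markup: str):
    """Return (name, public_id, system_id) from the DOCTYPE, or None if absent."""
    found = []

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # expat passes the system identifier before the public identifier.
        found.append((name, public_id, system_id))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(markup, True)
    return found[0] if found else None

STRICT = ('<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
          '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
          '<html xmlns="http://www.w3.org/1999/xhtml"><head><title>t</title></head>'
          '<body><p>text</p></body></html>')

print(read_doctype(STRICT))
```

The public identifier is the string a validator matches against its catalogue; the system identifier is only a fallback location, which is why it may point to a local copy.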


XML declaration

A character encoding may be specified at the beginning of an XHTML document in the XML declaration when the document is served using the application/xhtml+xml MIME type. (If an XML document lacks an encoding specification, an XML parser assumes that the encoding is UTF-8 or UTF-16, unless the encoding has already been determined by a higher protocol.) For example: <?xml version="1.0" encoding="UTF-8"?> This declaration may be omitted, because the encoding it declares is the default. However, if the document instead makes use of XML 1.1 or another character encoding, a declaration is necessary. Internet Explorer prior to version 7 enters quirks mode if it encounters an XML declaration in a document served as text/html.
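
The fallback rule can be sketched in Python as follows. This is a simplification under stated assumptions: declared_encoding is a hypothetical helper, it handles only ASCII-compatible byte streams, and it ignores byte order marks, UTF-16 detection and any encoding supplied by a higher protocol such as HTTP.

```python
import re

# Matches an XML declaration at the very start of the byte stream and
# captures the optional encoding pseudo-attribute.
_XML_DECL = re.compile(
    rb'^<\?xml\s+version\s*=\s*["\'][0-9.]+["\']'
    rb'(?:\s+encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\'])?'
)

def declared_encoding(data: bytes, default: str = "utf-8") -> str:
    """Return the encoding named in the XML declaration, else the default."""
    match = _XML_DECL.match(data)
    if match and match.group(1):
        return match.group(1).decode("ascii").lower()
    return default

print(declared_encoding(b'<?xml version="1.0" encoding="ISO-8859-1"?><html/>'))  # iso-8859-1
print(declared_encoding(b'<?xml version="1.0"?><html/>'))                        # utf-8
print(declared_encoding(b'<html/>'))                                             # utf-8
```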


Backward compatibility

XHTML 1.x documents are mostly backward compatible with HTML 4 user agents when the appropriate guidelines are followed. XHTML 1.1 is essentially compatible, although the elements for ruby annotation are not part of the HTML 4 specification and thus are generally ignored by HTML 4 browsers. Later XHTML 1.x modules such as those for the role attribute, RDFa, and WAI-ARIA degrade gracefully in a similar manner. XHTML 2.0 is significantly less compatible, although this can be mitigated to some degree through the use of scripting. (This can range from simple one-liners, such as the use of document.createElement() to register a new HTML element within Internet Explorer, to complete JavaScript frameworks, such as the FormFaces implementation of XForms.)


Examples

The following are examples of XHTML 1.0 Strict; both have the same visual output. The first follows the HTML Compatibility Guidelines of the XHTML Media Types Note, while the second breaks backward compatibility but provides cleaner markup.

Example 1. XHTML 1.0 Strict Example

This is an example of an XHTML 1.0 Strict document.
Valid XHTML 1.0 Strict

Example 2. XHTML 1.0 Strict Example

This is an example of an XHTML 1.0 Strict document.
Valid XHTML 1.0 Strict

Notes:
# The "loadpdf" function is actually a workaround for Internet Explorer. It can be replaced by adding within .
# The img element does not get a name attribute in the XHTML 1.0 Strict DTD. Use id instead.


Cross-compatibility of XHTML and HTML

HTML5 and XHTML5 serializations are largely inter-compatible if adhering to the stricter XHTML5 syntax, but there are some cases in which XHTML will not work as valid HTML5 (e.g., processing instructions are deprecated in HTML, are treated as comments, and close on the first ">", whereas they are fully allowed in XML, are treated as their own type, and close on "?>"). (Source: HTML vs. XHTML, WHATWG Wiki.)
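
The difference in where a processing instruction ends can be observed with Python's standard library. This is only an approximation: html.parser is a lenient tokenizer rather than a conforming HTML5 parser, and it reports the construct through handle_pi rather than as a comment, but like HTML it stops at the first ">". The snippet and class name are invented for the example.

```python
from html.parser import HTMLParser
import xml.parsers.expat

# A processing instruction whose data contains a ">" character.
SNIPPET = "<p><?demo a>b?></p>"

class PICollector(HTMLParser):
    """Record processing instructions and text as Python's HTML parser sees them."""

    def __init__(self):
        super().__init__()
        self.pis, self.text = [], []

    def handle_pi(self, data):
        self.pis.append(data)

    def handle_data(self, data):
        self.text.append(data)

html_view = PICollector()
html_view.feed(SNIPPET)
html_view.close()

xml_pis = []
xml_parser = xml.parsers.expat.ParserCreate()
xml_parser.ProcessingInstructionHandler = lambda target, data: xml_pis.append((target, data))
xml_parser.Parse(SNIPPET, True)

print(html_view.pis, html_view.text)  # HTML-style: the instruction ends at the first ">"
print(xml_pis)                        # XML: the instruction ends only at "?>"
```

In the HTML-style parse the remainder of the instruction leaks into the page as text, which is one reason processing instructions are best avoided in documents meant to work in both serializations.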


See also

* Extensible User Interface Protocol
* HTML
* List of XML and HTML character entity references


External links


W3C's Markup Home Page

XHTML 1.0 Recommendation

XHTML 1.1 Recommendation

XHTML 2.0 Working Group Note

XHTML Basic

XHTML 1.0 Strict / 1.1 Online Reference
* Links dealing with the MIME type of XHTML documents:
  * Beware of XHTML
  * Sending XHTML as text/html Considered Harmful
  * Serving up XHTML with the correct MIME type
  * Mark Pilgrim (3/19/2003). Includes examples for conditionally serving application/xhtml+xml using PHP, Python, and Apache (via URL rewriting).
  * Mozilla Web Author FAQ: How is the treatment of application/xhtml+xml documents different from the treatment of text/html documents? (summarizes one web browser's XHTML processing mode)




W3C's Markup Validator

HTML to XHTML conversion library for .NET