XHTML 1.0: Where XML and HTML meet
Any transition pains?
Unfortunately, yes. Some subtle differences between HTML and XML syntax cause
difficulties:
Boolean Attributes
Some HTML user agents cannot interpret boolean attributes when they appear in
their full, non-minimized form (for example checked="checked"), as required by XML 1.0.
User agents compliant with HTML 4 are not affected. The attributes involved are: compact,
nowrap, ismap, declare, noshade, checked, disabled, readonly, multiple, selected,
noresize, defer.
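The XML side of this requirement can be checked with any XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module as a stand-in for an XML user agent; the helper name is_well_formed and the sample markup are illustrative assumptions, not part of the XHTML specification.

```python
import xml.etree.ElementTree as ET

# HTML-style minimized boolean attribute (not allowed in XML 1.0).
MINIMIZED = '<input type="checkbox" checked />'
# Full, non-minimized form as required by XML 1.0.
FULL = '<input type="checkbox" checked="checked" />'


def is_well_formed(markup: str) -> bool:
    """Return True if an XML parser accepts the markup (hypothetical helper)."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(MINIMIZED))           # False
print(is_well_formed(FULL))                # True
print(ET.fromstring(FULL).get("checked"))  # checked
```

The sketch only shows why XHTML must spell the attribute out; whether an older HTML browser copes with the full form has to be tested in that browser.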
Document Object Model and XHTML
The Document Object Model level 1 Recommendation defines document object model
interfaces for XML and HTML 4. The HTML 4 document object model specifies that HTML
element and attribute names are returned in upper-case. The XML document object model
specifies that element and attribute names are returned in the case they are specified.
In XHTML 1.0, elements and attributes are specified in lower-case. This apparent
difference can be addressed in two ways:
- Applications that access XHTML documents served as Internet media type text/html via the
DOM can use the HTML DOM, and can rely upon element and attribute names being returned in
upper-case from those interfaces.
- Applications that access XHTML documents served as Internet media types text/xml or
application/xml can also use the XML DOM. Elements and attributes will be returned in
lower-case. Also, some XHTML elements may or may not appear in the object tree because
they are optional in the content model (e.g. the tbody element within table). This occurs
because in HTML 4 some elements were permitted to be minimized such that their start and
end tags are both omitted (an SGML feature). This is not possible in XML. Rather than
require document authors to insert extraneous elements, XHTML has made the elements
optional. Applications need to adapt to this accordingly.
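The XML DOM behavior described above can be observed with a generic XML DOM implementation. This is a rough sketch using Python's xml.dom.minidom as a stand-in for an XML user agent; the sample markup is a well-formed fragment made up for illustration (it is not a complete, valid XHTML document), and the HTML DOM side with its upper-case names is not reproduced here.

```python
from xml.dom.minidom import parseString

# Illustrative XHTML fragment: lower-case names, no explicit tbody.
XHTML = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<table><tr><td>cell</td></tr></table>'
    '</body></html>'
)

doc = parseString(XHTML)
table = doc.getElementsByTagName("table")[0]

# Names come back in the case in which they were written.
print(table.tagName)                            # table
# Lookups are case-sensitive, so the upper-case name finds nothing.
print(len(doc.getElementsByTagName("TABLE")))   # 0
# No tbody is inferred: the row is a direct child of the table.
print(len(doc.getElementsByTagName("tbody")))   # 0
print(table.firstChild.tagName)                 # tr
```

Code that walks such a tree should therefore accept rows both directly under table and under an explicit tbody.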
XML Processing Instructions
Be aware that some user agents render processing instructions as visible text. Note also
that when the XML declaration is omitted from a document, the document can only
use the default character encodings UTF-8 or UTF-16.
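The encoding rule can be illustrated with a small experiment. The sketch below again uses Python's xml.etree.ElementTree as an assumed stand-in for an XML processor; the helper parse_text and the sample strings are hypothetical and chosen only to include one non-ASCII character.

```python
import xml.etree.ElementTree as ET


def parse_text(data: bytes):
    """Return the root element's text, or None if parsing fails (hypothetical helper)."""
    try:
        return ET.fromstring(data).text
    except ET.ParseError:
        return None


# No XML declaration, UTF-8 bytes: accepted, UTF-8 is a default encoding.
utf8_no_decl = "<p>Straße</p>".encode("utf-8")
# No XML declaration, ISO-8859-1 bytes: the byte for "ß" is not valid UTF-8.
latin1_no_decl = "<p>Straße</p>".encode("iso-8859-1")
# ISO-8859-1 bytes with a declaration that names the encoding: accepted.
latin1_decl = (
    '<?xml version="1.0" encoding="ISO-8859-1"?><p>Straße</p>'
).encode("iso-8859-1")

print(parse_text(utf8_no_decl))    # Straße
print(parse_text(latin1_no_decl))  # None
print(parse_text(latin1_decl))     # Straße
```

Authors who leave out the declaration to avoid rendering problems in older browsers therefore need to save their documents as UTF-8 or UTF-16.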
Cascading Style Sheets (CSS) and XHTML
The Cascading Style Sheets level 2 Recommendation [CSS2] defines style properties which
are applied to the parse tree of the HTML or XML document. Differences in parsing will
produce different visual or aural results, depending on the selectors used. The following
hints will reduce this effect for documents which are served without modification as both
media types:
- CSS style sheets for XHTML should use lower-case element and attribute names.
- In tables, the tbody element will be inferred by the parser of an HTML user agent, but
not by the parser of an XML user agent. Therefore, always add a tbody element explicitly
if a CSS selector refers to it.
- Within the XHTML namespace, user agents are expected to recognize the "id" attribute as
an attribute of type ID. Therefore, style sheets should be able to continue using the
shorthand "#" selector syntax even if the user agent does not read the DTD.
- Within the XHTML namespace, user agents are expected to recognize the "class" attribute.
Therefore, style sheets should be able to continue using the shorthand "." selector syntax.
- CSS defines different conformance rules for HTML and XML documents; be aware that the
HTML rules apply to XHTML documents delivered as HTML and the XML rules apply to XHTML
documents delivered as XML.
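The hint about the "id" attribute exists because a generic XML processor that has not read a DTD has no way of knowing that an attribute called id is of type ID. The sketch below shows this with Python's xml.dom.minidom, used here as an assumed stand-in for a generic XML processor; the sample markup and variable names are illustrative. An XHTML-aware user agent is expected to make the equivalent of the setIdAttribute call on its own.

```python
from xml.dom.minidom import parseString

# Illustrative fragment in the XHTML namespace, parsed without any DTD.
doc = parseString(
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    '<body><p id="intro">Hello</p></body></html>'
)

# Without DTD information, "id" is just an ordinary attribute.
before = doc.getElementById("intro")
print(before)            # None

# Declare the attribute as type ID by hand, as an XHTML-aware agent would.
p = doc.getElementsByTagName("p")[0]
p.setIdAttribute("id")

after = doc.getElementById("intro")
print(after is p)        # True
```

A style sheet rule such as #intro can only match once the user agent treats id this way, which is why the guideline asks user agents to do so without relying on the DTD.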

Produced by Michael Claßen
All Rights Reserved.
URL: http://www.webreference.com/xml/column6/5.html
Created: Feb. 07, 2000
Revised: Feb. 07, 2000