HTML Unleashed. SGML and the HTML DTD: How to Define an SGML Application | WebReference

HTML Unleashed: SGML and the HTML DTD

How to Define an SGML Application


From SGML's point of view, a document is a hierarchical structure of nested elements (chapters, sections, paragraphs, and so on).  SGML has no means of specifying any presentational aspects of these elements, and was never intended to have any.  Strictly speaking, however, SGML cannot tell you the meaning or role of any element, either.  That information is implied by the creator of an SGML application and is usually provided in comments or in the documentation accompanying the formal specification.

SGML realizes the maxim of Wittgenstein, who said, "The meaning of a word is its use."  In SGML, the only information that can be formally communicated about an element is the contexts and levels of the document hierarchy in which it can or must occur.  This means that you cannot build an interpreter that applies meaningful formatting to a document based only on its SGML markup.  However, the purely formal dissection that SGML performs on a document is still surprisingly useful in many situations.
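To make the idea of purely structural information concrete, here is a minimal Python sketch.  It is not an SGML parser: it assumes a hypothetical, simplified table of which elements may appear inside which (the names CONTENT_MODEL and check_nesting are invented for this illustration, and the table is not taken from any real DTD), and it checks only whether each element occurs in a permitted context.

```python
# Hypothetical, simplified content model: parent -> set of allowed children.
# These element names are illustrative only; they come from no real DTD.
CONTENT_MODEL = {
    "chapter": {"section"},
    "section": {"section", "paragraph"},
    "paragraph": set(),
}


def check_nesting(tree, model=CONTENT_MODEL):
    """Return a list of nesting violations found in tree.

    tree is a (name, children) tuple; children is a list of such tuples.
    """
    name, children = tree
    errors = []
    if name not in model:
        errors.append(f"unknown element: {name}")
        return errors
    for child in children:
        child_name = child[0]
        if child_name not in model[name]:
            errors.append(f"{child_name} not allowed inside {name}")
        # Keep descending so that problems deeper in the tree are reported too.
        errors.extend(check_nesting(child, model))
    return errors


valid_doc = ("chapter", [("section", [("paragraph", []), ("paragraph", [])])])
invalid_doc = ("chapter", [("paragraph", [])])
```

For valid_doc the function returns an empty list; for invalid_doc it reports that a paragraph is not allowed directly inside a chapter.  Note that the checker knows where a paragraph may appear but nothing about what a paragraph means or how it should look, which is exactly the limitation described above.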

All documents that can be marked up with the same hierarchy of elements are said to belong to a certain document type.  Rather than describe a set of tools to mark up documents, SGML defines the structure of a particular type of document via what is called a document type definition (DTD).  Part of this chapter is devoted to analyzing the DTD for one particular SGML application, HTML version 4.0 (code-named Cougar).  Besides (and before) the DTD, some general features of an SGML application are specified in another formal construct called the SGML declaration, which is detailed in the next section.

As for SGML syntax, suffice it to say for now that it is pretty close to the syntax of HTML.  You will see that SGML statements, like HTML tags, are enclosed in angle brackets (<>) and contain a keyword or name followed by one or more parameters separated by spaces.  The only consistent difference is that SGML statements commonly have the ! character inserted between the open delimiter < and the statement keyword, for example:

<!ELEMENT IMG - O EMPTY -- Embedded image -->

You are probably already familiar with one type of statement that uses the <! syntax, namely comments in HTML documents, which are enclosed in <!-- and -->.  That's because the comment syntax of HTML is borrowed directly from SGML, where anything enclosed in double hyphens (--) within a <! statement is ignored by the SGML parser.  For example, the words Embedded image in the preceding code line are intended as a comment for human readers only.
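The shape of such a statement can be illustrated with a short Python sketch.  It is a rough approximation under stated assumptions, not a conforming SGML parser: it handles a single statement at a time, treats every run enclosed in double hyphens as a comment, and splits whatever remains on whitespace.  The function name split_declaration is invented for this example.

```python
import re

# Anything between a pair of double hyphens is treated as a comment.
# Assumption: no quoted parameter in the statement contains "--".
COMMENT = re.compile(r"--(.*?)--", re.DOTALL)


def split_declaration(text):
    """Split one '<! ... >' statement into (keyword, parameters, comments)."""
    text = text.strip()
    if not (text.startswith("<!") and text.endswith(">")):
        raise ValueError("not a <! ... > statement")
    body = text[2:-1]  # drop the '<!' and '>' delimiters
    comments = [c.strip() for c in COMMENT.findall(body)]
    tokens = COMMENT.sub(" ", body).split()  # what the parser would still see
    keyword = tokens[0] if tokens else ""
    return keyword, tokens[1:], comments


example = "<!ELEMENT IMG - O EMPTY -- Embedded image -->"
keyword, params, comments = split_declaration(example)
```

For the example line, keyword is ELEMENT, params holds IMG, -, O, and EMPTY, and comments holds the single string Embedded image.  An ordinary HTML comment such as <!-- just a comment --> yields an empty keyword and no parameters, since everything in it lies between the double hyphens.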

One more declaration that uses the <! syntax and needs to appear in HTML files is DOCTYPE, discussed briefly later in this chapter.


Created: Jun. 15, 1997
Revised: Jun. 16, 1997