Preparing for XML. Modular HTML | WebReference

  Modular HTML

In my previous article, I promoted academic style as a viable option for those who are not particularly interested in design but are serious about the quality of the information they deliver. On another occasion, I spoke about modular design as one of the best ways to ensure a neat and consistent visual presentation. Now, let's take the modular concept one step down the abstraction stairway and apply it not to design units, but to the HTML source of your pages.

I'm sure that those who do a lot of HTML coding have long since developed their own practical rules of thumb similar to those discussed below. My goal here is only to describe the modular HTML concept and suggest some of its important corollaries that you can benefit from. As we'll see shortly, modularizing your HTML may prove to be the only feasible way to migrate to XML safely and seamlessly.

HTML tags are usually sorted into two categories: logical tags (e.g. <H1>, <BLOCKQUOTE>, <ADDRESS>) and visual tags (e.g. <FONT> or <MARQUEE>). In practice the line is blurred: many tags combine logical and visual capabilities, and some, being essentially logical, are widely used for the sake of their visual side effects. This reflects the dual nature of HTML which, having originated as a strictly logical markup tool, had to meet the demand for visual formatting of Web documents. At this stage, while usage conventions similar to what I termed the academic style are still highly encouraged, the battle for "purely logical" HTML seems to be pretty much lost: in the great majority of cases, the language is used as little more than a simple formatting engine.

Nature, however, will not tolerate a vacuum, and the dire need for a logical markup system is not an invention of W3C standardizers: it is felt by everyone whose task is to manage Web documents. At the moment, this vacuum is most often filled with what could be called "Modular HTML," in which groups of tags are gathered into fixed "modules" that act as logical markup units and store their associated presentation parameters within their own code. This is not exactly what one would call "separation of content and presentation," as both aspects of the document are still stored in one file, but the important new feature is that they can now be separated algorithmically.
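To make "separated algorithmically" a little more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the tooling used on any real site: the module markup, the `{text}` placeholder, and the function names `render` and `extract` are all hypothetical. The idea is simply that when the markup of a module is fixed, the presentation can live in a template and the content can be recovered from a finished page by matching that template.

```python
import re

# Hypothetical module: the markup is fixed, and {text} marks the only
# place where variable content may appear.
HEADING_TEMPLATE = (
    "<!-- solid heading -->\n"
    "<table border=0><tr><td bgcolor=ffaf60><b>{text}</b></td></tr></table>"
)


def render(template: str, text: str) -> str:
    """Build one module instance by dropping content into the fixed markup."""
    return template.replace("{text}", text)


def extract(template: str, html: str) -> list[str]:
    """Recover the variable content of every instance of the module in html."""
    before, after = template.split("{text}")
    # Escape the fixed markup so it is matched literally; capture what varies.
    pattern = re.escape(before) + r"(.*?)" + re.escape(after)
    return re.findall(pattern, html, flags=re.DOTALL)


page = (
    render(HEADING_TEMPLATE, "Products")
    + "\n<p>Some text.</p>\n"
    + render(HEADING_TEMPLATE, "Services")
)
print(extract(HEADING_TEMPLATE, page))  # ['Products', 'Services']
```

The round trip only works because every instance repeats the template verbatim; a single hand-edited attribute would make that instance invisible to `extract`.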

Let's see how it works. My recent redesign of the Object International Web site was built from the ground up using the modular approach. Thus, in the source code of this typical page you can see separate modules for the top navigation bar, a "solid" heading (the one with an orange background), a "framed" heading (the one with an orange frame on top and right, meant to be a lower heading level than a "solid" heading), opening and closing text blocks, customer quotes, etc. For example, here is the HTML module for a framed heading:


<!-- framed heading -->
<table border=0 cellpadding=0 cellspacing=0><tr>
<td bgcolor=ffaf60><img alt="" src="e.gif" width=15 height=4></td>
<td bgcolor=ffaf60><img alt="" src="e.gif" width=350 height=4></td>
<td bgcolor=d8d8d8 align=right valign=top rowspan=2>
<img width=16 height=26 alt="" src="zak-gob.gif"></td>
</tr><tr>
<td bgcolor=d8d8d8><img alt="" src="e.gif" width=15 height=22></td>
<td bgcolor=d8d8d8 valign=bottom><small>DETAILS</small></td>
</tr></table>

  And here's how it renders:  

Figure 1

  Here, <!-- framed heading --> is a comment serving as a label that identifies the module type, and "DETAILS" is the only piece of text that changes from one instance of the module to another. By convention, only the module as a whole carries logical meaning, which makes the distinction between logical and visual HTML tags irrelevant: your modules may mix tags of either kind freely. Moreover, a module can even use unmatched opening or closing tags if you're sure that they will find their pairs in other modules in the file (that is, if the module can only be used in limited contexts). To summarize, here are the main rules to be observed in modular HTML:
  • There should be as few module types as possible. Once the site design has more or less settled, introducing a new module type must be an exception, justified only by the emergence of an essentially new type of content that doesn't fit into the existing templates.

  • Instances of the same module must be identical verbatim except for insertions of variable content (for example, heading text in a heading module).

  • There shouldn't be any "orphan" tags left outside the modules, except for a minimum tag set needed for marking up plain text (e.g. <P>, <STRONG>, and <EM> tags).

  • Each module type must be marked by its corresponding comment label in order to facilitate identifying the module type both in manual editing and in automatic processing.
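
Rules like these lend themselves to automatic checking. The Python sketch below is one possible way to verify that instances of the same module type differ only in their text; it is not a tool described in this article. It rests on an assumption of my own: each module is closed by a matching label such as <!-- /framed heading -->, whereas the example above uses only an opening label. The function names and the sample page are hypothetical as well.

```python
import re
from collections import defaultdict

# Assumption: a module is delimited by an opening comment label and a
# matching closing label, e.g. <!-- name --> ... <!-- /name -->.
MODULE = re.compile(
    r"<!--\s*([^/\s][^>]*?)\s*-->(.*?)<!--\s*/\1\s*-->", re.DOTALL
)


def split_modules(html: str) -> list[tuple[str, str]]:
    """Return (label, markup) pairs, one per labeled module found in html."""
    return [(label, body.strip()) for label, body in MODULE.findall(html)]


def skeleton(markup: str) -> str:
    """Blank out everything between tags, leaving only the fixed markup.

    Whitespace between tags is blanked too, so this check is slightly
    looser than a strict verbatim comparison.
    """
    return re.sub(r">[^<]+<", "><", markup)


def inconsistent_labels(html: str) -> list[str]:
    """List module types whose instances differ in more than their text."""
    shapes = defaultdict(set)
    for label, markup in split_modules(html):
        shapes[label].add(skeleton(markup))
    return sorted(label for label, seen in shapes.items() if len(seen) > 1)


page = """
<!-- framed heading -->
<table><tr><td bgcolor=d8d8d8><small>DETAILS</small></td></tr></table>
<!-- /framed heading -->
<p>Some text.</p>
<!-- framed heading -->
<table><tr><td bgcolor=d8d8d8><big>PRICING</big></td></tr></table>
<!-- /framed heading -->
"""
print(inconsistent_labels(page))  # ['framed heading']
```

Here the second heading uses <big> instead of <small>, so the two instances no longer share one skeleton and the module type is reported. A page that passes this check is one whose modules can safely be rewritten in bulk.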


Created: Sept. 17, 1998
Revised: Sept. 17, 1998