Simplified DocBk XML on the Web | 2 | WebReference

Writing Articles with Simplified DocBk

Starting Out

Before you start writing the body of your article, you must supply some information at the top of the file that tells the XML parser that the file is XML and which DTD it's associated with. A Document Type Definition (DTD) gives the XML parser the grammar that the document must follow: which elements we can use, where they are allowed to appear, and at what level of nesting.

The first thing at the top of each document is the XML declaration, followed by the document type (DOCTYPE) declaration that names the DTD.

<?xml version="1.0"?>
<!DOCTYPE article
          PUBLIC "-//Norman Walsh//DTD Simplified DocBk XML V3.1.7.1//EN"

Next, we need to complete the article header, which contains important information about you (the author) and the article itself. Since I've recently begun using the Simplified DocBk XML format to write all of my articles for WebReference, below is an example of an article header I might use.

<date>December 28, 1999</date>
<pubdate>January 21, 2000</pubdate>
<title>Unix Daemons in Perl</title>

Most of the elements above are self-explanatory. Some of them have special meaning for WebReference, though. The issuenum element is used to determine the path at the top left of each article page. It's also associated with the sub-directory that the article is sitting in. For example, each Mother of Perl tutorial sits in its own sub-directory. When writing a new article, I usually assign it a number that corresponds to the sub-directory it's in. I could just as easily have used a string or some other sequence, as long as it doesn't contain spaces.

The productname element is also critical: my Perl script uses it to work out the URL of the article, since different authors have different directory structures. It should be the same string that's used in your homepage URL. These are currently: 3d, dlab, dhtml, graphics, html, js, perl, and xml.

The keywords will (eventually) be used to create HTML meta keywords to get better search engine rankings. You can add as many keywords as you like.

Article Body

The main article body consists of multiple sections, numbered by nesting level (sect1 through sect5). Each section must contain a title element and should contain one or more para elements, which contain the article text. sect1 is the top-level section, and each new sect1 begins a new page. The sect1 title shows up centered at the top of each article page in an <H2> tag. The sect2 element marks a new subsection. Subsections normally let the reader know you are switching to a new topic or talking point within the context of the sect1 title. Their titles are converted to <H3> tags.
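
As a rough illustration of that nesting, here is a minimal sketch of one page with a single subsection. The titles and paragraph text are placeholders I made up for this example, not taken from a real article.

```xml
<!-- Hypothetical fragment: titles and text are placeholders -->
<sect1>
  <title>Starting Out</title>        <!-- page heading, rendered as H2 -->
  <para>Introductory text for this page.</para>
  <sect2>
    <title>A Narrower Topic</title>  <!-- subsection heading, rendered as H3 -->
    <para>Text that belongs under the sect1 title above.</para>
  </sect2>
</sect1>
```

Each additional sect1 placed after this one would start another page of the article.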

The paragraph is the real meat of the article. It contains all kinds of elements for lists, tables, images, and programming-related identifiers, all of which can exist inside the para element. I recommend taking a look at one of my articles as an example.

Some of the elements I've been using so far are: emphasis, function, programlisting, constant, varname, ulink, citation, literallayout, command, filename.
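
To give a feel for how a few of these read inline, here is a small sketch of a single paragraph. The file name, command, and link target are hypothetical and chosen only for illustration; the url attribute on ulink is the standard DocBook way to give the link target.

```xml
<!-- Hypothetical paragraph: daemon.pl and the example.com URL are made up -->
<para>
  Call <function>fork</function> from <filename>daemon.pl</filename>, then
  run <command>ps</command> to confirm that the process is
  <emphasis>really</emphasis> detached. See
  <ulink url="http://www.example.com/">the example site</ulink> for details.
</para>
```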

XML Rules

Since you're writing an article in XML, you must be aware of a few basic rules. First, you must encode the special characters &, <, and > as the entities &amp;, &lt;, and &gt; when they're not part of an element or entity reference. Second, every start tag must have a matching end tag. Third, when you have a large body of text with special characters, it's best to use a CDATA section. Look at the XML source of one of my articles to see what it looks like. Text wrapped in a CDATA section is printed verbatim, exactly as it appears inside.
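
The first and third rules can be compared side by side in the sketch below. The Perl snippet inside it is a made-up example, not code from one of my articles. The para escapes each special character individually, while the programlisting wraps the same kind of text in a CDATA section so it can be written as-is.

```xml
<!-- Rule 1: special characters escaped one by one -->
<para>The test is: if ($a &lt; $b &amp;&amp; $b &gt; 0) { ... }</para>

<!-- Rule 3: the same kind of text inside a CDATA section, no escaping needed -->
<programlisting><![CDATA[
if ($a < $b && $b > 0) {
    print "<b>ok</b>\n";
}
]]></programlisting>
```

The one thing a CDATA section can't contain is the sequence ]]> itself, since that is what closes the section.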

Produced by Jonathan Eisenzopf
All Rights Reserved.
Created: March 4, 2000
Revised: March 4, 2000