dtddoc step 3: Element and attribute descriptions (3/3) - exploring XML | WebReference

dtddoc step 3: Element and attribute descriptions

The PHP version listed below works similarly: we parse the DTD, filter out all comment items, and then test each comment against the dtddoc comment regular expression:

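#!/usr/bin/php -q
<?php
// Callback for array_filter(): keep only the comment items.
// Compared case-insensitively because get_class() returns lowercase names only in PHP 4.
function is_comment($item) {
    return strcasecmp(get_class($item), "dtdcomment") == 0;
}

$parser = new DTDparser(new URLReader($argv[1]), false);
$dtd = $parser->parse(false);
$comments = array_filter($dtd->items, "is_comment");
foreach ($comments as $comment) {
    // Groups: 1 = element name, 3 = attribute name, 4 = first line, 5 = remaining text
    if (preg_match('/<(\w*)(\s+(\w+)="[^"]*")?>\s*(.*)\n([\w\W]*)/m',
        $comment->text, $matches)) {
        echo $comment->text, "\n"; // assumed body: print the comment, as the Perl version does
    }
}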

The regular expression can be copied and pasted straight into the Perl version. Instead of filtering an array, the Perl version uses a callback function:

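#!/usr/bin/perl -w
use strict;
require "dtd.pl";

# Prints a comment if it matches the dtddoc comment syntax.
sub callback {
    local (*comment) = @_;
    if (${*comment} =~ /<(\w*)(\s+(\w+)="[^"]*")?>\s*(.*)\n([\w\W]*)/m) {
        print STDOUT ${*comment}, "\n";
    }
}

open(DTD, '<', $ARGV[0]) or die "Cannot open $ARGV[0]: $!";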

This is an instructive example of how the same task can be accomplished in different programming languages. The Java version is only slightly longer because it has to instantiate objects explicitly for the pattern matching and has to handle checked exceptions, which PHP and Perl do not have.
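
To make the comparison concrete, here is a minimal sketch of the explicit pattern objects on the Java side, using the standard java.util.regex package. It is not the listing from this series: the class name DtddocCommentDemo, the method parseComment, and the sample comment text are made up for illustration, and the pattern assumes that \w and \s are intended for names and whitespace. The I/O and parser exception handling mentioned above is left out.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of dtddoc.
class DtddocCommentDemo {
    // Assumed dtddoc comment syntax: <element attribute="">, a short
    // description on the same line, then a long description below it.
    static final Pattern DTDDOC = Pattern.compile(
        "<(\\w*)(\\s+(\\w+)=\"[^\"]*\")?>\\s*(.*)\\n([\\w\\W]*)",
        Pattern.MULTILINE);

    // Returns {element, attribute (null if absent), first line, remaining text},
    // or null when the text is not a dtddoc comment.
    static String[] parseComment(String text) {
        Matcher m = DTDDOC.matcher(text);   // explicit Matcher object
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(3), m.group(4), m.group(5) };
    }

    public static void main(String[] args) {
        String sample = " <book isbn=\"\"> Identifies a book\n"
                      + "The ISBN must be unique.\n";
        String[] parts = parseComment(sample);
        if (parts != null) {
            System.out.println("element:   " + parts[0]);
            System.out.println("attribute: " + parts[1]);
            System.out.println("short:     " + parts[2]);
            System.out.println("long:      " + parts[3]);
        }
    }
}
```

Where PHP and Perl apply the pattern in a single expression, Java needs a compiled Pattern and a Matcher per input, which accounts for some of the extra lines.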


We have now validated the annotated DTD approach in three programming languages. The next article will conclude version 1 of dtddoc with the generation of HTML around the extracted information. Hang in there, we are almost done!

Produced by Michael Claßen

URL: http://www.webreference.com/xml/column67/3.html
Created: Oct 28, 2002
Revised: Oct 28, 2002