Net Buzz with Richard Wiggins | 2 | WebReference

Volume 1, Number 19, March 18, 1998

East Lansing, Michigan
XML: What Every Webmaster Should Know

By Richard Wiggins


XML is on everyone's lips these days. What does XML mean to you as a Webmaster? How will you convert from HTML to XML? And is it really possible that XML will unlock proprietary formats such as Word, Excel, and PowerPoint? We asked Peter Flynn, an expert on SGML and co-author of the XML FAQ, for his insights.

Peter, in your XML FAQ, you say that SGML, the "mother tongue" of XML, can be used for an infinite variety of purposes:

…from transcriptions of ancient Sumerian scrolls to the technical documentation for stealth bombers, and from patients' clinical records to musical notation.

Each of those applications requires a user community to form and to agree upon a common set of tags. Does this really happen in practice?

Yep, that's almost exactly what they've done, except that the Sumerian scroll research would actually use the same set of definitions as most other Arts & Humanities projects, as they share a common set of requirements. The bomber makers would share needs with other military hardware users. Medical users would have common needs too, and encoders of Bach would share requirements with encoders of Elvis, so it's more likely per-industry than per-topic.

XML is inspired by SGML; in fact, XML is SGML with a few features removed so that millions of users on the Web can exploit the language. If millions of people adopt XML, won't we end up with many thousands of tag sets that are meaningful only to small groups of users? If so, is that bad?

That's entirely possible, although I suspect it will settle down to a few hundred in common use. But XML absolutely lets you do this, in exactly the same way that other sets of standards and practices are established on a per-industry or per-topic basis. It shouldn't matter to the end user, as the display they see will hide the XML markup, just as Web browsers hide the HTML.
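To make the point about hidden markup concrete, here is a minimal Python sketch. It assumes a made-up clinical-record tag set (the element names `record`, `patient`, `diagnosis`, and `treatment` are illustrative, not from any real standard) and a hypothetical `render` function that turns the markup into the plain display a reader would see.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag set for a clinical record; these names are invented
# for illustration and do not come from any real medical standard.
RECORD = """
<record>
  <patient>Jane Doe</patient>
  <diagnosis>Sprained ankle</diagnosis>
  <treatment>Rest and ice</treatment>
</record>
"""

# Display labels chosen by the (imaginary) user community.
LABELS = {
    "patient": "Patient",
    "diagnosis": "Diagnosis",
    "treatment": "Treatment",
}


def render(xml_text):
    """Return a plain-text view of the record with no markup visible."""
    root = ET.fromstring(xml_text.strip())
    lines = []
    for child in root:
        # Fall back to the raw tag name if the community defined no label.
        label = LABELS.get(child.tag, child.tag)
        lines.append(f"{label}: {(child.text or '').strip()}")
    return "\n".join(lines)


print(render(RECORD))
```

The reader sees labeled lines of text; the tags stay behind the scenes, much as HTML does in a browser.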

What will happen in the short term is that it will be new, so user communities will want to experiment, and reassure themselves that they are distinct and unique, and couldn't possibly use other people's solutions. In the long term, once the newness has worn off, they'll discover that, hey, those guys have a tag set that does just what we want, let's share. So subsets of tags will settle down to a common core, plus some specialist sets in each user community.

So although in theory the number of tag sets is infinite, in practical terms there's only a limited number of ways people want to tag lists, for example: (a) lists with bullets, (b) lists with numbers or letters, (c) lists with keywords, and (d) lists inside paragraphs, like this one. Some of the requirements overlap anyway. Generic ways of doing these things are called "architectures," and there's already a solid and well-established way to handle them in SGML and XML.
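The idea of a shared core can be sketched in Python. In the example below, a plain lookup table stands in for the real SGML architecture mechanism, which it does not implement; the element names (`itemize`, `bullets`, `steps`, and so on), the `ARCHITECTURE` table, and the `list_kinds` function are all assumptions made up for illustration. Two imaginary communities use different tag names, yet both reduce to the same handful of generic list kinds.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from community-specific element names to the
# generic list kinds mentioned above. A simple dictionary is used here
# only as a stand-in for a real architecture declaration.
ARCHITECTURE = {
    "bullets": "bulleted",
    "itemize": "bulleted",
    "steps": "numbered",
    "enumerate": "numbered",
    "glossary": "keyword",
    "inline-list": "inline",
}

# Two invented documents from different user communities.
HUMANITIES = (
    "<text><itemize><item>scroll</item></itemize>"
    "<glossary><term>stylus</term></glossary></text>"
)
MANUAL = (
    "<manual><bullets><item>wing</item></bullets>"
    "<steps><item>inspect</item></steps></manual>"
)


def list_kinds(xml_text):
    """Return the generic kind of each list element, in document order."""
    root = ET.fromstring(xml_text.strip())
    return [ARCHITECTURE[el.tag] for el in root.iter() if el.tag in ARCHITECTURE]


print(list_kinds(HUMANITIES))
print(list_kinds(MANUAL))
```

Software written against the generic kinds could then process either document without knowing each community's own vocabulary.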


Produced by Rich Wiggins and
Created: March 18, 1998
Revised: March 18, 1998