Practical XML for the Web, Chapter 8: Introduction to Server-side XML | WebReference

Chapter 8 of "Practical XML for the Web"

This is an excerpt from Chapter 8 of "Practical XML for the Web", by glasshaus (ISBN: 1904151086, copyright glasshaus 2002). It sets the scene for server-side XML and shows what you can do with it by way of a parallel example done in ASP, PHP, and JSP (we have only included the first of the example sections here). The three chapters that follow this one in the book are case studies, which go into using XML with the three server-side languages mentioned above in much more detail.


Introduction To Server-Side XML

In this chapter, and the ones to follow, we'll switch from client-side to server-side XML processing. We'll start by examining why you would want to consider server-side processing, and then we'll introduce the three main server-side languages used in web development: ASP, JSP, and PHP.

We'll discuss the pros and cons of each, and then go through some simple XML processing techniques with short code examples from each language. By way of a running example, we'll show you how to maintain an online list of your favorite CDs, stored in XML format. You'll learn how to add new elements to the XML document, modify existing ones, and delete unwanted items. You'll also learn how to transform the XML to HTML.

We'll go through this step-by-step, showing you the code needed in all three languages, giving you a solid overview of basic server-side XML techniques.

The chapters that follow will then show you in-depth case studies for each of these languages.
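To make the running example more concrete, here is a minimal sketch of what adding and deleting entries in a CD list could look like using the standard DOM API that ships with Java. The element names (`cdlist`, `cd`, `artist`, `title`), the sample data, and the class name are assumptions made for illustration only; they are not taken from the book's own listings.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical CD list; the vocabulary below is an assumption for illustration.
class CdListSketch {
    static final String SAMPLE =
        "<cdlist>"
      + "<cd><artist>Miles Davis</artist><title>Kind of Blue</title></cd>"
      + "<cd><artist>Nina Simone</artist><title>Pastel Blues</title></cd>"
      + "</cdlist>";

    // Load the XML text into an in-memory DOM tree.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Append a new <cd> element to the end of the list.
    static void addCd(Document doc, String artist, String title) {
        Element cd = doc.createElement("cd");
        Element artistEl = doc.createElement("artist");
        artistEl.setTextContent(artist);
        Element titleEl = doc.createElement("title");
        titleEl.setTextContent(title);
        cd.appendChild(artistEl);
        cd.appendChild(titleEl);
        doc.getDocumentElement().appendChild(cd);
    }

    // Remove every <cd> whose <title> matches; returns how many were removed.
    // Assumes each <cd> has a <title> child, as in the sample above.
    static int removeCd(Document doc, String title) {
        NodeList cds = doc.getElementsByTagName("cd");
        int removed = 0;
        // Walk backwards because the NodeList is live and shrinks as we delete.
        for (int i = cds.getLength() - 1; i >= 0; i--) {
            Element cd = (Element) cds.item(i);
            String t = cd.getElementsByTagName("title").item(0).getTextContent();
            if (t.equals(title)) {
                cd.getParentNode().removeChild(cd);
                removed++;
            }
        }
        return removed;
    }

    // Write the DOM tree back out as XML text.
    static String serialize(Document doc) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not serialize XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        addCd(doc, "Portishead", "Dummy");
        removeCd(doc, "Pastel Blues");
        System.out.println(serialize(doc));
    }
}
```

Modifying an existing entry follows the same pattern: locate the element, then call `setTextContent` on the child you want to change.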

Server-Side Versus Client-Side XML Processing

By now you're probably all excited about what's possible using client-side XML processing. So why would you want to learn about server-side techniques? Trust me, you do.

Server-side XML gives you even more power. For example, since the processing is done on the server, only the results are sent to the client, so you don't have to worry about making your code cross-browser compatible. It is true that processing XML on the server side puts more of the load on the server, but since web servers are usually powerful machines with the ability to cache data, that is probably nothing to be worried about.

In addition, the server-side approach also greatly reduces the amount of data that flows across the network connection. If you want to display different results for different browsers, then that is easily accomplished by detecting the browser-type on the server side and using the correct stylesheet for the transformation.
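As a rough illustration of the browser-detection idea, the sketch below picks a stylesheet name from the `User-Agent` request header. The stylesheet file names and the matching rules are hypothetical, and real user-agent strings vary far more than this; treat it as a starting point rather than a reliable detector.

```java
import java.util.Locale;

// Chooses an XSLT stylesheet based on the browser's User-Agent header.
// The file names and substring checks are assumptions for illustration.
class StylesheetChooser {
    static String chooseStylesheet(String userAgent) {
        String ua = userAgent == null ? "" : userAgent.toLowerCase(Locale.ROOT);
        if (ua.contains("wap") || ua.contains("mobile")) {
            return "cdlist-mobile.xsl";   // small-screen devices
        }
        if (ua.contains("msie")) {
            return "cdlist-ie.xsl";       // Internet Explorer
        }
        return "cdlist-default.xsl";      // everything else
    }

    public static void main(String[] args) {
        System.out.println(chooseStylesheet("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"));
        System.out.println(chooseStylesheet(null));
    }
}
```

On a real server the header value would come from the request object, and the chosen stylesheet would then be handed to the XSLT processor.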

The following points sum up the advantages of server-side processing:

·          Systems can provide better performance and maintainability for data-driven web sites by generating and caching frequently accessed pages ahead of time on the server.

·          You can have direct control over the security of your data – sensitive information can be filtered out before sending the data to the client. For example (an extreme one), if we have an XML document that contains a list of users and passwords, there is a security risk involved in sending the whole XML document to the client to be transformed there – it would be better to filter the passwords out on the server side.

·          Maintaining your code becomes easier, since you don't have to modify it when new browser types or versions become available.

·          By doing transformations on the server side you can greatly simplify what is sent to the client, avoiding the problem of designing functionality that works for all possible browser combinations (even mobile devices).
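The password example in the list above can be sketched as follows: load the document into a DOM tree on the server, remove the sensitive elements, and send only what remains. The `users`, `user`, `name`, and `password` element names are assumed for illustration, and the code uses Java's standard XML APIs rather than any listing from the book.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Strips <password> elements from a user list before it leaves the server.
// The document vocabulary is hypothetical.
class PasswordFilter {
    static String stripPasswords(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            NodeList secrets = doc.getElementsByTagName("password");
            // Iterate backwards: the list is live and shrinks on each removal.
            for (int i = secrets.getLength() - 1; i >= 0; i--) {
                Node n = secrets.item(i);
                n.getParentNode().removeChild(n);
            }

            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("filtering failed", e);
        }
    }

    public static void main(String[] args) {
        String users = "<users>"
            + "<user><name>alice</name><password>s3cret</password></user>"
            + "<user><name>bob</name><password>hunter2</password></user>"
            + "</users>";
        System.out.println(stripPasswords(users));
    }
}
```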

XML support in browsers is still limited, so if you want to dispense with browser-compatibility issues and suchlike, I would recommend that you use client-side XML processing only when developing something like closed intranets where all clients use the same browser. Your safest bet is still usually server-side processing.

Before We Continue

This chapter contains code examples using ASP, PHP, and JSP. If you wish to try running these examples on your computer, then you need to install some software before you continue.

Detailed installation instructions for each language can be found online at

Server-Side Languages

The server-side market is a crowded one. Developers can choose from a variety of languages – ASP, ColdFusion, JSP, Perl, PHP, and more. With such a wealth of options out there, determining which environment best suits your needs can be bewildering.

The choice of a server-side programming language is a constant source of heated debate. The languages can all pretty much achieve the same things, but there are differences in portability, scalability, performance, and learning curve.

In this section we'll cover the three biggest players in the server-side market: ASP, PHP, and JSP. We'll give you a very brief overview of the advantages and shortcomings of each platform, and an idea of the XML support they offer.


ASP

Active Server Pages (ASP) is a framework that lets you combine one of a number of scripting languages (VBScript and JScript being the most popular choices) with an expandable set of software components. It's easy to learn, powerful enough for most mainstream server-side web development, and performs well. (Note that ASP pages are run by a script engine rather than compiled to native code, whereas JSP files are converted to servlets the first time they are called and then reused, as described in the JSP section.)

Advantages

·          Professional support available (at a price).

·          Extensively documented on MSDN.

·          A large number of corporate intranets are already running on Windows NT/2000 servers, and ASP is ideal for intranet applications in these circumstances.

·          It's easy to learn for developers used to a Microsoft environment.

·          Although it isn't totally "free", it is widely available since it runs on all PWS or IIS servers, which are packaged free with most recent Windows operating systems.

Disadvantages

·          ASP is closely linked to the Windows operating system and the Microsoft IIS web server. It is neither practical nor desirable to run a web site based on ASP on anything but a Windows-based server, so in this way, it is rather limiting.

XML Support

Extensive support for XML is provided for ASP and indeed any kind of programming on the Windows platform through Microsoft XML Core Services 4.0 (MSXML 4.0), which is a full API for the parsing, validation, and processing of XML documents. Previous versions of the parser were distributed with various versions of Internet Explorer and other products, but to get the full functionality of the latest version, it needs to be downloaded.

Download the latest version (MSXML 4.0, Service Pack 1) from:

Microsoft's XML parser has gone through a number of generations, the latest of which has been renamed to reflect the fact that it is far more than just an XML parser. In previous versions, Microsoft have jumped the gun a bit and provided their own functionality, such as support for their own version of XPath and their own version of XSLT. However, in version 4.0 they have fully adhered to the W3C's recommendations and come up with a fully compliant validating parser and processor.

MSXML 4.0 supports the following:

·          The Document Object Model (DOM) – allows an XML document to be loaded into memory and manipulated. Nodes of the document can be read, written to, added, removed, moved, replaced etc.

·          The XML Path Language (XPath) 1.0 – the querying language used to navigate XML documents. Support for the full W3C standard for XPath is provided, as well as support for Microsoft's earlier implementation. XSLT uses XPath for document navigation, as we saw in Chapter 5.

·          Extensible Stylesheet Language Transformations (XSLT) 1.0 – the current W3C XML stylesheet language standard. Support remains for Microsoft's earlier XSL-WD implementation, though this should only be used for legacy applications (see Chapter 5 for more on these different XSLT versions).

·          The XML Schema definition language (XSD) – the current W3C standard for using XML to create XML Schemas. XML Schemas are used for the validation of XML documents, as an alternative to DTDs (we met both of these in Chapter 1).

·          The Schema Object Model (SOM), an additional set of APIs unique to MSXML for accessing XML Schema documents programmatically.

·          The Simple API for XML (SAX) – an alternative to the DOM for processing XML documents. It doesn't load the whole document into memory, so it's much lighter on server resources, but it is also more limited in its functionality. We first met SAX in Chapter 1.

·          There is also a unique API for transferring documents over HTTP, which comes in versions optimized for client or server use. This is particularly useful for facilitating communication between disparate systems.
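MSXML is a COM library normally driven from VBScript or JScript. As a language-neutral illustration of the XPath item in the list above, the following sketch runs an XPath 1.0 query using Java's standard `javax.xml.xpath` package instead of MSXML. The CD list vocabulary and sample data are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Finds CD titles by artist with an XPath 1.0 expression.
// Element names are hypothetical.
class XPathSketch {
    static final String SAMPLE =
        "<cdlist>"
      + "<cd><artist>Miles Davis</artist><title>Kind of Blue</title></cd>"
      + "<cd><artist>Nina Simone</artist><title>Pastel Blues</title></cd>"
      + "<cd><artist>Miles Davis</artist><title>Sketches of Spain</title></cd>"
      + "</cdlist>";

    static List<String> titlesBy(String xml, String artist) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Supply the artist as an XPath variable instead of pasting it
            // into the expression, so quotes in the name cannot break the query.
            xpath.setXPathVariableResolver(
                    name -> "artist".equals(name.getLocalPart()) ? artist : null);
            NodeList nodes = (NodeList) xpath.evaluate(
                    "/cdlist/cd[artist=$artist]/title", doc, XPathConstants.NODESET);
            List<String> titles = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                titles.add(nodes.item(i).getTextContent());
            }
            return titles;
        } catch (Exception e) {
            throw new IllegalStateException("XPath query failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(titlesBy(SAMPLE, "Miles Davis"));
    }
}
```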

Server Used for Examples

For the examples in this chapter, we used IIS 5.0 on Windows 2000 Professional. IIS comes with Windows 2000, and is an add-in component. The examples in this chapter will actually run with MSXML versions as old as MSXML 2.0, so if you have IE 5 or newer on your machine, you will be OK in this respect.


ASP.NET

You've probably heard of .NET – one of the latest Microsoft initiatives. Along with updates to much of its software and languages, we now find ASP.NET available to us. This extends the functionality of ASP to include all of the .NET Framework, including some expanded libraries for working with XML. However, this is rather a large area to explore, so we won't be covering it in any detail in this book.


PHP

PHP: Hypertext Preprocessor (PHP) is a server-side scripting language that can be embedded in HTML pages. It has been around for a few years now, but has undergone significant changes over that time. PHP borrows much of its syntax from Perl. When it was first created, it was intended to provide a more trimmed-down, easier-to-write, HTML-embeddable alternative to Perl, a task at which it seems to have succeeded. PHP is free, cross-platform, open source software; it integrates with all major web servers on all major operating systems.

Advantages

·          It's open source and freely available from

·          It's cross-platform.

·          It has a very active user community.

·          It's seen as having a light footprint and not being processor-intensive.


Disadvantages

·          It's relatively difficult to expand the language to add non-standard functionality that is not handled by its built-in functions.

·          PHP's extensibility is limited compared to, say, Java, ASP, and COM (although new libraries pop up with every release).

·          The function syntax to connect to each different brand of database is slightly different. Compare this to Java, which has a generic JDBC interface to connect to databases, or ASP, which has its ADO abstraction layer.

XML Support

PHP has 4 extensions for performing XML tasks. Perhaps the most widely used of these are the XML parser functions – these use the Expat library, a SAX-based parser. Although it can parse XML, it does not perform any validation of the document. It supports 3 character encodings, namely US-ASCII, ISO-8859-1 and UTF-8, but does not support UTF-16. As you already know, with a SAX parser you define event handlers for XML events: as the parser works through the XML document it will call these handlers as and when events occur. The Expat library can be found at
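The event-handler style described above is the same in every SAX-like API. The sketch below shows the idea using Java's built-in SAX parser rather than PHP's Expat functions, so the class and method names are Java's, not PHP's. The `title` element name is an assumption for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Collects the text of every <title> element as the parser streams through
// the document. No tree is built in memory.
class SaxTitleCollector extends DefaultHandler {
    final List<String> titles = new ArrayList<>();
    private StringBuilder text;   // non-null only while inside a <title>

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("title".equals(qName)) {
            text = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one text node, so append each chunk.
        if (text != null) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("title".equals(qName)) {
            titles.add(text.toString());
            text = null;
        }
    }

    static List<String> collectTitles(String xml) {
        try {
            SaxTitleCollector handler = new SaxTitleCollector();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.titles;
        } catch (Exception e) {
            throw new IllegalStateException("SAX parse failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(collectTitles(
            "<cdlist><cd><title>Kind of Blue</title></cd><cd><title>Dummy</title></cd></cdlist>"));
    }
}
```

The parser drives the program here: your code only reacts to start-tag, text, and end-tag events, which is why this approach uses so little memory.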

PHP can also do DOM parsing of the XML document, but at the moment this extension is considered experimental. The extension is being overhauled for PHP 4.3.0 and the behaviour of many of the functions may change, so when using this extension it is best to avoid any non-object-oriented function (a full list of deprecated functions is available with the documentation). The extension uses the Gnome XML library, which you can find at

The PHP extension that provides XSLT support currently supports the Sablotron library. This extension has recently been rewritten in order to provide support for other libraries like Xalan and libxslt. Sablotron can be found at

For our PHP examples here to work properly, you need to make sure you have the XSLT and XML DOM extensions installed. Don't worry, as they are included in the PHP package downloadable from (see the InstallPHP.txt file in the code download for more instructions on installing this properly).

Finally, PHP also contains RPC support through the XMLRPC extension, although this is also considered experimental.

Server Used for Examples

For the examples in this chapter, we used Apache 1.3.26 and PHP 4.2.2 on Windows 2000 Professional.


JSP

JavaServer Pages (JSP) are written in Java, which (unlike VBScript and PHP) is an object-oriented programming language that can be used to build enterprise-strength applications. Java is arguably the most powerful platform for server-side web development today. Portability, multithreading, extensive class libraries, object-oriented code, strong safety features, robust security measures, elegance, and extensibility are just a few of Java's advantages.

Java was designed to be platform-independent and very portable. Therefore, a web application developed in Java can be packaged as a WAR (web application archive) file and installed on any Java-enabled application server, on any platform.

The disadvantage is that Java is not very easy to learn. If you just need to get a simple site up and running quickly, and are not serious about learning an object-oriented language, a simpler language like PHP might be a better choice. By using JavaBeans and tag libraries, however, web designers can quickly learn to create JSPs that retrieve data from a database, process XML, and carry out other powerful functions, without having to know anything about the underlying technology. You don't need to be a sophisticated Java programmer to utilize the power of JSP.

JSP files can also be a little slower than ASP or PHP because of the way they work. The first time a JSP is called, it is converted into a servlet (a special Java class that produces output for sending over HTTP), which is stored by the JSP engine. After this, requests for the JSP file are served from the converted servlet (JSP engines also double as servlet engines). The most readily available JSP engine is Tomcat, which can also serve static content, though more slowly than a dedicated web server. If you want a fast web site, use a dedicated web server such as Apache or IIS to serve the static content, while Tomcat (or another engine) serves the JSP content.

Advantages

·          It's free. You can download the Tomcat application server from the Jakarta web site, and start coding in minutes (assuming you've got Java installed, which is also free).

·          It's cross-platform.

·          It has a very active user community.

·          It's extremely powerful and scalable.


Disadvantages

·          A steep learning curve.

·          Third-party hosting isn't common, and can cost extra for installation.

XML Support

There is enormous XML support in Java. There are loads of parsers and XSLT and XPath processors available, and most of them are open source. To name them all would be pointless, but the following is a short overview of some of the products available (the list includes the most popular products in each category).

XML Parsers

·          Xerces: Xerces is a high-performance, fully compliant XML parser from the Apache XML Project. It is a fully conforming XML Schema processor. It is free, and available both in sourcecode and precompiled binary (JAR file) form. For more information, visit

·          XML4J: XML Parser for Java is a validating XML parser and processor written in 100% pure Java – a library for parsing and generating XML documents, available as freeware from IBM. For more information, visit

·          XP: XP is a fully conforming XML 1.0 parser written in Java, which detects all non-well-formed documents. It is currently not a validating XML processor, but it can parse all external entities: external DTD subsets, external parameter entities, and external general entities. For more information, visit

·          MXP1: MXP1, or Maximum Perf. Minimum Size XML Parser, is a Java-based, non-validating pull parser that implements the Common Application Programming Interface (API) for XML Pull Parsing specification. MXP1 was designed for minimal footprint (less than 20k) and maximum speed (it claims up to 20% better performance than the nearest competitor) and is suited for fast serialization and deserialization of Simple Object Access Protocol (SOAP)-based XML objects. For more information, visit
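To show what "pull parsing" means in practice, here is a small sketch. It does not use MXP1 or the XML Pull Parsing API described above; it uses StAX (`javax.xml.stream`), a different pull-style API that is bundled with current Java releases, as a stand-in. The `artist` element name is an assumption for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Pull parsing: the application asks the parser for the next event,
// instead of the parser calling back into handlers as SAX does.
class PullSketch {
    static List<String> artists(String xml) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            List<String> found = new ArrayList<>();
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && "artist".equals(reader.getLocalName())) {
                    // Reads the element's text and advances to its end tag.
                    found.add(reader.getElementText());
                }
            }
            reader.close();
            return found;
        } catch (XMLStreamException e) {
            throw new IllegalStateException("pull parse failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(artists(
            "<cdlist><cd><artist>Miles Davis</artist></cd><cd><artist>Nina Simone</artist></cd></cdlist>"));
    }
}
```

Because the loop is ordinary application code, it can stop early or skip sections without the state-tracking fields that a SAX handler usually needs.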

XSLT Processors

·          Xalan: Xalan is an XSLT processor for transforming XML documents, from the Apache XML Project. It implements the W3C Recommendations for XSL Transformations (XSLT) and the XML Path Language (XPath). It can be used from the command line, in an applet or a servlet, or as a module in other programs. For more information, visit

·          XT: XT is a fast, free implementation of XSLT in Java. For more information, visit

·          SAXON: The SAXON package is a collection of tools for processing XML documents. It contains an XSLT processor, and Java libraries for access to the processor from Java applications. For more information, visit
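Whichever processor from the list above you choose, Java code normally talks to it through the standard `javax.xml.transform` API. The sketch below transforms a CD list into an HTML list using whatever XSLT 1.0 processor the Java runtime provides by default. The stylesheet, the element names, and the sample data are all assumptions made for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Server-side XML to HTML transformation with an inline XSLT 1.0 stylesheet.
class XsltSketch {
    // Hypothetical stylesheet: renders each <cd> as "artist: title" in a list item.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='html' indent='no'/>"
      + "<xsl:template match='/cdlist'>"
      + "<ul><xsl:for-each select='cd'>"
      + "<li><xsl:value-of select='artist'/>: <xsl:value-of select='title'/></li>"
      + "</xsl:for-each></ul>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String toHtml(String xml) {
        try {
            // Compile the stylesheet, then apply it to the input document.
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter html = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(html));
            return html.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toHtml(
            "<cdlist><cd><artist>Miles Davis</artist><title>Kind of Blue</title></cd></cdlist>"));
    }
}
```

In a real application the stylesheet would usually live in its own file, and the compiled form would be cached and reused across requests rather than rebuilt each time.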

Java-Specific Document Object Models

These are frameworks that provide a more Java-centric coding approach to parsing, transforming, etc. than the DOM and SAX interfaces. They can be configured to use DOM or SAX for parsing, but provide a much more convenient API for Java programs.

·          JDOM: JDOM is, quite simply, a Java representation of an XML document. JDOM provides a way to represent that document for easy and efficient reading, manipulation, and writing. It has a straightforward API, is lightweight, fast, and is optimized for the Java programmer. It's an alternative to DOM and SAX, although it integrates well with both of them. For more information, visit

·          dom4j: dom4j is an easy to use, open source library for working with XML, XPath, and XSLT on the Java platform using the Java Collections Framework and with full support for DOM, SAX, JAXP, TrAX, and XSLT. dom4j is distributed under an open source, Apache-style license that does not restrict users to creation of open source products only. For more information, visit


·          Java XML Pack: The Java XML Pack is an all-in-one download of Java technologies for XML from Sun. Java XML Pack brings together several of the key industry standards for XML – such as SAX, DOM, XSLT, SOAP, UDDI, ebXML, and WSDL – into one convenient download, thereby giving developers the technologies needed to get started with web applications and Web Services. Included in the bundle are: Java API for XML Processing (JAXP), Java Architecture for XML Binding (JAXB), Java API for XML Messaging (JAXM), Java API for XML-based RPC (JAX-RPC), and Java API for XML Registries (JAXR). For more information, visit

·          Cocoon: Cocoon is an XML framework that allows easy integrated usage of XML and XSLT technologies for server applications, around pipelined SAX processing, with a centralized configuration system to make things simple. It is available for usage under the Apache Software License. For more information, visit

There are also lots of utility packages available, such as XML tag libraries (XTags) for JSP from Jakarta, which give web designers with limited Java knowledge the full powers of XML processing through simple tags. For more information, visit

Server Used for Examples

For the examples in this chapter, we used Tomcat 4.0.4 on Windows 2000 Professional. Tomcat is available from (sourcecode is also available from if you want to compile Tomcat yourself).

As well as Tomcat, we also used XTags, nightly builds of which are available from (this is a work in progress), and dom4j, available from The JAR files for these two resources are provided in the code download.


The following table summarizes these three server-side languages:

ASP: VBScript, JavaScript (amongst others)

Platforms
ASP: Windows (other platforms need third-party porting software).
PHP: Any platform for which the sourcecode or binaries are available, which is most.
JSP: Any platform for which the sourcecode or binaries for a JSP/servlet engine such as Tomcat are available, which is any with Java.

Web Servers
ASP: Microsoft IIS (other servers need third party software).
PHP: Apache, IIS, Netscape, etc.
JSP: JSP files are served by a JSP/servlet engine (such as Tomcat). Any web server, including Apache, IIS, and Netscape, can be configured to send requests for JSP files to the JSP engine. Any J2EE-compliant application server should have a JSP/servlet engine.

Component Support
ASP: COM objects
JSP: Java classes, JavaBeans, Enterprise JavaBeans

Learning curve


The only major vendor for ASP is Microsoft. (Sun market an open source version of ASP called Sun ONE Active Server Pages, formerly known as Chili!Soft ASP. We won't go into this here – see for more details.) PHP is open source, so there is no vendor to deal with. JSP is a set of standards and interfaces that can be implemented by anyone interested. Sun provides a reference implementation of a Java application server, which uses Tomcat as the JSP engine, but there is currently a variety of implementations (both commercial and open source) on the market.

Produced by Michael Claßen

Created: Dec 16, 2002
Revised: Dec 16, 2002