The Voice of XML (1/4) - exploring XML | WebReference

The Voice of XML

VoiceXML, an XML vocabulary for specifying IVR (Interactive Voice Response) systems, was submitted to the W3C more than a year ago. Initially it received little attention, but now that services like Tellme and BeVocal provide developer platforms for such applications, interest has risen dramatically in the last couple of months.

VoiceXML is designed for creating audio dialogs that feature synthesized speech, digitized audio, recognition of spoken and DTMF key input, recording of spoken input, telephony, and mixed-initiative conversations. Its major goal is to bring the advantages of Web-based development and content delivery to interactive voice response applications.

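The language describes the human-machine interaction provided by voice response systems, which includes output of synthesized speech (text-to-speech), output of audio files, recognition of spoken input, recognition of DTMF (touch-tone) input, recording of spoken input, and telephony features such as call transfer and disconnect.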

Not all of these capabilities are mandatory for a VoiceXML platform, but at least output of synthesized speech and recognition of touch tones are required.

The language provides means for collecting keyed (DTMF) and/or spoken input, assigning the input to document-defined request variables, and making decisions that affect the interpretation of documents written in the language. A document may be linked to other documents through Uniform Resource Identifiers (URIs).
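To make the collect, assign, and decide cycle more concrete, here is a minimal sketch in Python rather than VoiceXML itself. It only models the idea and is not how any VoiceXML platform is implemented. The names `Field`, `Form`, and `next_document`, and the URIs `balance.vxml` and `operator.vxml`, are assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Field:
    name: str      # request variable that will hold the caller's input
    prompt: str    # text the platform would speak to the caller


@dataclass
class Form:
    fields: list
    variables: dict = field(default_factory=dict)

    def fill(self, inputs):
        """Assign one caller input (keyed or spoken) to each field, in order."""
        for form_field, value in zip(self.fields, inputs):
            self.variables[form_field.name] = value.strip().lower()
        return self.variables


def next_document(variables):
    """Decide which document to load next; both URIs are hypothetical."""
    if variables["choice"] in ("1", "one"):
        return "balance.vxml"
    return "operator.vxml"


menu = Form([Field("choice", 'Press or say "one" for Account Balance Inquiry, '
                             '"two" to speak to an operator.')])
menu.fill(["One"])
target = next_document(menu.variables)
```

In this sketch the caller's answer ends up in the variable `choice`, and the decision based on it selects the URI of the document to interpret next.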

A Sample Conversation

A phone conversation with a VoiceXML system at your bank could go like this:

System: Welcome to Big Buck's Bank.
        Press or say "one" for Account Balance Inquiry, "two" to speak to an operator.
You:    One
System: Please type in or spell out your account number.
You:    123456
System: Please enter or spell your PIN for your account 123456.
You:    PIN
System: Please enter or spell your four digit personal identification number, PIN.
You:    1234
System: Thank you. The balance on your account is ten dollars. If you wish to establish
        a credit line, please answer "yes", otherwise "no".
You:    No, thanks.
System: Thank you and goodbye.

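Note how the system echoes spoken or keyed-in information, such as the account number, and uses it to look up bank information such as the account balance in backend systems. Built-in error handling prompts the caller again when input is wrong or misunderstood, as when the reply "PIN" triggers a more detailed request for the four-digit number.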
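The echo and reprompt behavior in the conversation can be sketched as a small Python simulation. This is an illustration under stated assumptions, not the mechanism a VoiceXML platform actually uses: the helper names `collect` and `balance_dialog` are hypothetical, and the validation rules (digits only for the account number, exactly four digits for the PIN) are inferred from the sample dialog.

```python
def collect(prompts, is_valid, answers, transcript):
    """Ask until an answer passes is_valid.

    After each failed attempt the next, more detailed prompt is used;
    the last prompt repeats if the list runs out.
    `answers` must be an iterator so unused replies stay available.
    """
    attempt = 0
    for answer in answers:
        prompt = prompts[min(attempt, len(prompts) - 1)]
        transcript.append(("System", prompt))
        transcript.append(("You", answer))
        if is_valid(answer):
            return answer
        attempt += 1
    raise ValueError("caller gave no valid input")


def balance_dialog(replies):
    """Replay the account-number and PIN steps of the sample conversation."""
    answers = iter(replies)
    transcript = []
    account = collect(
        ["Please type in or spell out your account number."],
        str.isdigit, answers, transcript)
    pin = collect(
        # The first prompt echoes the account number collected above.
        [f"Please enter or spell your PIN for your account {account}.",
         "Please enter or spell your four digit personal identification number, PIN."],
        lambda s: len(s) == 4 and s.isdigit(), answers, transcript)
    return account, pin, transcript


account, pin, transcript = balance_dialog(["123456", "PIN", "1234"])
```

Here the reply "PIN" fails validation, so the second, more explicit prompt is issued before "1234" is accepted, which mirrors the exchange above.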

Produced by Michael Claßen

Created: Jun 06, 2001
Revised: Jun 06, 2001