Be Your Own Browser
BY Tony Rybczynski
PC-based browsing was the key enabler behind the Internet boom. But how often have you wanted to get Web-based information, or even complete a Web-based transaction, when you had no access to a PC or PDA or were outside WiFi hotspot coverage? Wouldn’t it be nice to give customers broader access wherever they are and whenever they want? Now you can, with Voice XML and advanced speech technologies! Phones are available just about everywhere in the world, are always on (and cell phones have longer battery life than WiFi devices), and don’t have to be booted up.
"Natural language advanced speech applications actually can make it easier to get information and complete a transaction."
What Is Voice XML?
XML (eXtensible Markup Language), a standard of the World Wide Web Consortium (W3C), is the most widely accepted, platform-independent standard for building structured documents for Web applications. Voice XML is a dialect of XML developed to write voice dialogs for self-service solutions, and is also being steered by the W3C.
In the Web world, Voice XML is to voice interfaces what HTML is to visual displays. HTML applications are accessed via a graphical Web browser with display, keyboard, and mouse.
In contrast, Voice XML applications are accessed via a voice-capable device, such as a telephone, that accepts audio and touch-tone keypad input and delivers audio output. In short, Voice XML empowers users to interact with the Web through any wired or wireless phone, making Internet content available to anyone with a phone. Users interact with the application in the most natural way, namely by speaking and listening. Voice XML scripts use a combination of speech and touch-tone commands to exchange data between people and machines, independent of the vendor’s hardware. Application developers can create audio dialogs that use speech and touch-tones as input, and deliver synthesized speech or digitized, pre-recorded audio as output.
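To make the HTML comparison concrete, here is a minimal sketch of what a Voice XML 2.0 document might look like: a menu that accepts either a spoken word or a touch-tone key and then plays a prompt. The menu wording and the dialog ids (main, weather, flights) are illustrative assumptions, not taken from any real application. The Python wrapper is only a convenience for checking that the markup is well-formed and for listing the choices; it is not a Voice XML interpreter.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"  # VoiceXML 2.0 namespace

# Hypothetical document: a caller can say "weather" or press 1,
# or say "flights" or press 2.
MENU_DOC = """<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <menu id="main">
    <prompt>Say weather or press 1. Say flights or press 2.</prompt>
    <choice dtmf="1" next="#weather">weather</choice>
    <choice dtmf="2" next="#flights">flights</choice>
  </menu>
  <form id="weather">
    <block><prompt>Here is the weather report.</prompt></block>
  </form>
  <form id="flights">
    <block><prompt>Here is the flight information.</prompt></block>
  </form>
</vxml>
"""


def menu_choices(vxml_text):
    """Return (touch-tone key, spoken phrase, target dialog) per <choice>."""
    root = ET.fromstring(vxml_text)  # raises ParseError if not well-formed
    ns = {"v": VXML_NS}
    return [
        (c.get("dtmf"), (c.text or "").strip(), c.get("next"))
        for c in root.findall("v:menu/v:choice", ns)
    ]


if __name__ == "__main__":
    for key, phrase, target in menu_choices(MENU_DOC):
        print(f"press {key} or say '{phrase}' -> {target}")
```

Each choice element carries both input styles at once: the dtmf attribute names the keypad key, and the element text is the phrase the recognizer listens for.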
Voice XML documents can perform a variety of functions, such as user prompting, natural language speech recognition enabling the user to provide multiple pieces of information in a single utterance, event-handling features (e.g., timeout or unrecognizable input), branching (“if-then-else”) and Boolean (“and, or, not”) logic, and progressive prompting to better handle invalid responses to a prompt.
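Several of these functions can be seen together in one small dialog. The sketch below assumes a hypothetical yes/no question about an account balance; the prompts and the form and field names are invented for illustration. It uses the noinput and nomatch event handlers, a count attribute for progressive prompting, and if/else branching once the field is filled. The boolean field type is one of the built-in grammars described in the Voice XML 2.0 specification, and platform support for it may vary. As before, the Python code only parses the markup and summarizes its event handlers.

```python
import xml.etree.ElementTree as ET

# Hypothetical dialog: prompts, form id and field name are illustrative.
FIELD_DOC = """<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="balance">
    <field name="confirm" type="boolean">
      <prompt>Do you want to hear your account balance?</prompt>
      <!-- Event handling: first timeout gets a short apology -->
      <noinput count="1">
        <prompt>Sorry, I did not hear you.</prompt>
        <reprompt/>
      </noinput>
      <!-- Progressive prompting: second timeout gets more guidance -->
      <noinput count="2">
        <prompt>Please say yes or no, or press 1 for yes and 2 for no.</prompt>
      </noinput>
      <!-- Unrecognizable input -->
      <nomatch>
        <prompt>I did not understand. Please say yes or no.</prompt>
      </nomatch>
      <!-- Branching once the caller has answered -->
      <filled>
        <if cond="confirm">
          <prompt>Your balance is being retrieved.</prompt>
        <else/>
          <prompt>Goodbye.</prompt>
        </if>
      </filled>
    </field>
  </form>
</vxml>
"""


def event_handlers(vxml_text):
    """List (event name, count) for each noinput/nomatch handler, in order."""
    root = ET.fromstring(vxml_text)  # raises ParseError if not well-formed
    handlers = []
    for el in root.iter():
        tag = el.tag.split("}")[-1]  # strip the namespace prefix
        if tag in ("noinput", "nomatch"):
            # count defaults to 1 when the attribute is omitted
            handlers.append((tag, int(el.get("count", "1"))))
    return handlers


if __name__ == "__main__":
    print(event_handlers(FIELD_DOC))
```

The count attribute is what drives progressive prompting: a handler with a higher count takes over once the same event has occurred that many times, so the guidance can become more explicit with each failed attempt.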
It’s The Application, Stupid!
Voice XML is well-suited for applications that require relatively little input from the user and deliver highly targeted output that generally is available from an HTML Web interface. In many ways, natural language advanced speech applications based on Voice XML actually can make it easier to get information and complete a transaction. Because the dialog is interruptible, the user can just say what he or she wants, minimizing the time spent completing the task at hand.
The simplest Voice XML application is information retrieval, such as getting account balances, airline flight information, or weather from a Web site. Voice input can often handle large vocabularies much more easily than touch-tone input. Free-form street addresses in a city and stock quotes for a specified company and period are good examples. Speech is also well-suited for single-utterance input of multiple related items. For example, the user could say, “What is the interest rate for a 20-year fixed-rate jumbo mortgage?”
Voice XML is also naturally suited for customer service applications, such as parcel shipment tracking and online banking, and for accelerating call center services. In addition, personal name dialing, one-number “follow-me” services, teleconferencing set-up, and other telephony features can be voice-enabled through Voice XML.
Because security features that apply to the Web, such as firewalls and encryption, can be applied to voice applications as well, Voice XML can be used to create secure intranet applications that voice-enable internal processes, such as supply ordering, HR self-service, and corporate news. This underscores an important benefit of Voice XML: it enables customers to share their Web-based infrastructure and resources with voice processing applications. Finally, Voice XML can unify voice and electronic channels, for example, by allowing users to read their voice mails, dictate their e-mails, and originate and terminate pager messages on the phone.
Voice XML is a powerful yet simple language for building dialogs that blend the voice world with the Web to enable innovative new self-service applications. For IT, voice applications can now be constructed with widely available Web application development tools. For the user, just say what you want!
Tony Rybczynski is director of Strategic Enterprise Technologies at Nortel. He has over 30 years of experience in the application of packet network technology. For more information, please visit www.nortel.com.
Return To The February 2005 Table Of Contents