TMCnet - World's Largest Communications and Technology Community



Mind Share

September 1999

Extending The Web With Phone-Based Voice Browsing

By Marc Robins


At this point in the CTI continuum, it’s pretty clear that traditional Interactive Voice Response (IVR) applications are under increasing assault from the meteoric rise of an 8,000-pound King Kong of a competitor — the World Wide Web. In pre-Web days, IVR was really the only way for companies to provide ubiquitous access to database information to their customers and prospects, allowing them to use a terminal they had easy access to and knew how to use — the telephone. But with the exploding popularity of the Web and e-commerce, most companies seem to be focusing increasingly on providing access to a wealth of content and account information through a browser-based interface running off a PC. The Web server has become the de facto information portal for corporate America.

Furthermore, the Web has become the destination of choice for people searching for information — whether it’s to find out what the best gas grill for the money is, or to book the best roundtrip ticket to Hong Kong. We “surf” the Web to discover everything from the mundane to the esoteric. Together with a browser and a search engine, the Web has become the ultimate “411.”

The conundrum, of course, is that Internet access is still by no means ubiquitous, especially when it comes to mobile access. In reality, “only” around 20 percent of the country has Internet access (although this number seems to be doubling every couple of years). According to some published research reports, as recently as 1997, 95 percent of customer interactions with a company took place over the telephone, either through contact with a live customer service rep or through an automated self-service application such as IVR. And although this percentage has declined due to a shift to the Web, phone-based customer contact is still forecast to account for 60–75 percent of the total. This state of affairs means that phone-based access to account and product information remains a highly viable, even arguably essential, component of a company’s overall customer service game plan for the foreseeable future.

Another reality: A great number of people still rely on the phone, coupled with their trusty, dog-eared yellow pages directory, as their main information retrieval device.

I hear grumbling coming from MIS, telecom management, and Web development teams across the land — and it’s understandable. The problem that stems from this scenario is that both Web-based and phone-based information access models have to be factored into the overall customer/prospect service equation. Information must be made available and easily accessible by phone AND by browser — and to most departments, this means a heavy load of double-duty. But have no fear — and hold the grumbles — for fortunately, due to the work of some forward-thinking developers at AT&T, Motorola, and Lucent Technologies, this load may not be so heavy after all.

A few years ago, a small number of IVR vendors, such as Intervoice and Syntellect, were quick to note the growing impact of the Web, and began to Web-enable their IVR platforms. In essence, a Web-enabled IVR system offers not only telephone access to a repository of database information, but also a Web interface to that same information. The individual seeking the information can access it either from a telephone or from a Web browser. Architecturally, the Web server has a connection to the Web-enabled IVR server, which exploits its existing links to databases that reside on a minicomputer, mainframe, or network computer. The benefits of such a deployment center on “terminal of choice” for the customer, and decreased administration and development for the vendor. The customer now has a choice about how to access information, and can freely move back and forth between those choices. The vendor’s benefit is that the application is written only once, and that the database information is maintained in a single location.
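The “written only once” idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of how Intervoice or Syntellect built their products: the in-memory ACCOUNTS table stands in for the host database, and the names lookup_balance, render_for_web, and render_for_phone are hypothetical.

```python
# Hypothetical sketch: one piece of application logic, two presentation channels.
# ACCOUNTS stands in for the mainframe or minicomputer database (an assumption).
ACCOUNTS = {"1001": 250.75, "1002": 0.0}


def lookup_balance(account_id):
    """Shared application logic: written once, used by both channels."""
    if account_id not in ACCOUNTS:
        raise KeyError(f"unknown account {account_id}")
    return ACCOUNTS[account_id]


def render_for_web(account_id):
    """Format the result as an HTML fragment for a browser."""
    balance = lookup_balance(account_id)
    return f"<p>Account {account_id} balance: ${balance:,.2f}</p>"


def render_for_phone(account_id):
    """Format the same result as text for a recorded or synthesized prompt."""
    balance = lookup_balance(account_id)
    dollars, cents = divmod(round(balance * 100), 100)
    digits = " ".join(account_id)  # read the account number digit by digit
    return f"Account {digits} has a balance of {dollars} dollars and {cents} cents."


if __name__ == "__main__":
    print(render_for_web("1001"))
    print(render_for_phone("1001"))
```

Both renderers call the same lookup function against the same data, which is the point of the architecture: only the presentation layer differs between the telephone and the browser.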

These solutions are great for “point-specific” problems. In other words, they help open a company’s intranet to its customer base, and resolve a big chunk of the “double-duty” dilemma by making information available through both electronic and telephone “channels.” However, they are also proprietary in nature, and they don’t extend to the general populace that lacks an Internet connection. They don’t address the need or desire of the Web-challenged to have general access to the wealth of information on the Web and, in essence, to be able to surf along with the rest of us.

For the last couple of years, AT&T, Lucent Technologies, and Motorola have each been quietly working on creating extensions to the HTML standard that would make the resources of the Web accessible by phone. (For those not in the know, HTML, or Hypertext Markup Language, is a markup language that simplifies Web-based content development. To place an image on a Web page, for example, a developer writes a simple HTML tag that calls for retrieval of a particular image file. Similarly, a content developer could use a voice markup language such as VXML to specify a particular audio prompt to play over the telephone.) Motorola worked on its VoxML, or Voice Markup Language, while AT&T and Lucent worked on PML, or Phone Markup Language. Each company based its work on the W3C (World Wide Web Consortium) eXtensible Markup Language (XML) standard. Over time, these new voice-centric extensions to HTML came to be commonly referred to as VXML, for Voice eXtensible Markup Language.
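To make the image-versus-prompt comparison concrete, here is a small Python sketch that parses one snippet of each kind. It is only an illustration: the <prompt> and <audio> element names and both file names are assumptions chosen for readability, and are not taken from VoxML or PML.

```python
import xml.etree.ElementTree as ET

# An HTML-style image reference, written as well-formed XML so it parses here.
HTML_SNIPPET = '<p><img src="logo.gif" alt="Company logo" /></p>'

# Hypothetical voice markup: the <prompt> and <audio> element names and the
# file name are illustrative assumptions, not taken from VoxML or PML.
VOICE_SNIPPET = '<prompt><audio src="welcome.wav">Welcome to our service.</audio></prompt>'


def referenced_media(markup, tag):
    """Return the src attribute of every <tag> element in the markup."""
    root = ET.fromstring(markup)
    return [element.get("src") for element in root.iter(tag)]


if __name__ == "__main__":
    print(referenced_media(HTML_SNIPPET, "img"))     # file a Web browser would fetch
    print(referenced_media(VOICE_SNIPPET, "audio"))  # file a voice browser would play
```

In both cases the markup only names a resource. It is the browser, visual or voice, that decides how to fetch and present it.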

The goal of all three companies was the same — to allow users to query Web servers anywhere in the world and gain access to Web-based content by simply using their phones and their voices. In theory, a user would be able to call into a “voice browser” by dialing a regular phone number from any wireline or wireless phone. This voice browser would allow the caller to surf the Web and interact with Internet and intranet applications hosted on any Web server. An example of a typical application would be a user who requests the flight status for a specific flight by calling into the browser. The voice browser, using speech recognition, recognizes the request and translates it into a URL for a travel service provider’s Web server. The Web server processes the request and responds with a “VXML” page. The browser interprets this page, and relays the flight information to the “phone surfer” using prerecorded or synthesized voice.
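The flight-status example can be walked through in code. The Python sketch below is a toy model under loud assumptions: the endpoint travel.example.com, the page format, and the functions utterance_to_url, fake_web_server, and page_to_speech are all hypothetical, and a real voice browser would use a speech recognizer and a network request where this sketch uses plain strings and a stub.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical travel provider endpoint (an assumption for illustration).
BASE_URL = "http://travel.example.com/flightstatus"


def utterance_to_url(recognized_text):
    """Step 1: map recognized speech such as 'flight status 123' to a URL."""
    words = recognized_text.lower().split()
    if words[:2] != ["flight", "status"] or len(words) != 3:
        raise ValueError("request not understood")
    return BASE_URL + "?" + urlencode({"flight": words[2]})


def fake_web_server(url):
    """Step 2: stand-in for the Web server, which answers with a voice page."""
    flight = parse_qs(urlparse(url).query)["flight"][0]
    return f"<page><prompt>Flight {flight} is on time.</prompt></page>"


def page_to_speech(page):
    """Step 3: extract the prompts a voice browser would speak to the caller."""
    return [prompt.text for prompt in ET.fromstring(page).iter("prompt")]


if __name__ == "__main__":
    url = utterance_to_url("Flight status 123")
    print(url)
    print(page_to_speech(fake_web_server(url)))
```

The three functions mirror the three hops in the scenario: the caller's request becomes a URL, the server returns a marked-up page, and the browser turns that page into something to say.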

In March of this year, VXML development got a big boost with the formation of the VXML Forum. AT&T, Lucent Technologies, and Motorola joined forces to help develop a standard specification for VXML, a computer language used to create Web content and services that can be accessed by phone. What makes things even more interesting is that the W3C is interested in codifying a standard. It has developed an entire initiative to support VXML development, including workshops, a working group, and an ambitious charter (see www.w3.org/voice for more info), and it is working with a number of groups, including the VXML Forum.

On the vendor front, AT&T, Lucent, and Motorola have agreed to contribute their markup language technologies to the development of the open VXML specification. Seventeen other leading companies from the speech, Internet, and communications markets have agreed to support the VXML Forum and play an active role in reviewing or contributing to the VXML specification. Industry supporters include 3Com Corporation, 4th Peripheral Technologies, Array Systems Computing, Blue Diamond, British Telecommunications plc, Dragon Systems, General Magic, Fletchers Communications Pte., Ltd., Hewlett-Packard, IBM, Lernout & Hauspie, Locus Dialogue, Nortel Networks, Nuance Communications, On-line Anywhere, passcal Advanced Technologies, Philips, Registry Magic, Sun Microsystems, SpeechWorks International, Unisys, Vocalis, and Vogo Networks. Other companies interested in seeing access to Internet information and content become voice- and phone-enabled may join as supporters, contributors, or adopters.

The VXML Forum wants to promote a broadly supported standard that creates an open, platform-independent environment and enables equipment and infrastructure providers, speech technology providers, speech application developers and content providers, and communications service providers to participate in the growth of this market. In addition to giving users the option of voice-enabled Internet and intranet access, expected benefits include new business opportunities for content developers, greater ease of application development (and thus an expanded developer base for the speech community), and more rapid creation of differentiated services for carriers.

From where I’m sitting, VXML can help plug a yawning gap — the gap between the rich content of the Web and the vast horde of the unconnected. But isn’t this gap a relatively short-term problem? With the advent of new, low-cost Internet appliances and plummeting PC prices, the price of admission will be within almost everyone’s reach. New broadband wireless technologies, such as the emerging 3G standard, promise to connect us on the go. I guess there will always be a few holdouts or situations where Internet access is a problem, in which case the ability to default to a phone would be a plus. But is this enough to ensure the long-term prospects for VXML? And what about the capabilities of the speech technologies involved, such as text-to-speech and speech recognition? Will they become robust enough to deliver the benefits?

There are some intriguing “what-ifs” I can think of with respect to VXML. What if a link could be standardized to enable voice browsers to make live, Internet telephony calls right from the VXML page? Sort of a “Hypercall Link,” in which the browser would simply ask, “Would you like to speak to a live representative?” and then transparently connect the phone surfer to a call center or some other location? Such a feature could certainly help Internet telephony and e-commerce measure up to all the rosy predictions. What do you think? Let me know where you stand on the issue, and if you have any interesting “what ifs” of your own. 
