From Foe to Friend

By Brendan B. Read, Senior Contributing Editor  |  March 01, 2011

This article originally appeared in the March 2011 issue of Customer Interaction Solutions

Automated voice systems, both DTMF (a.k.a. TouchTone IVR) and speech recognition (a.k.a. speech rec), have had an unfortunate reputation as customers’ foes thanks to too many poor implementations. They have driven customers to “zero-outs”, resulting in higher expenses and risking decreased loyalty and lower sales, both from those customers and from their friends, followers and others via social media, thereby clawing back the tools’ cost savings. Some firms have capitalized on these poor customer experiences by trumpeting that they do not use these technologies; they connect callers to live agents instead.

This does not have to be the case, though, for automated voice systems contain elements that make them customer-friendly by providing quality service that is often superior to that delivered by live agents. These include speed through queue elimination; consistent, accurate and clearly delivered responses; confidentiality (these “agents” are not going to blab about so-and-so); and privacy. Advances in these solutions have made reaching both efficiency and service/retention goals possible and affordable.

Joe Outlaw, principal analyst at Frost and Sullivan, is seeing that firms are getting the message that it pays to make automated voice the customers’ friend. So much so that there is a growing segment of the population that actually prefers self-service, both automated voice- and web-based. His firm estimates that up to 41 percent of inbound and outbound customer interactions will be handled by voice and web self-service by 2015, up from as low as 30 percent today.

“The message here for enterprises is that self-service applications must be done well,” says Outlaw. “They must be fast, easy-to-use, have intuitive interfaces, accessible 24/7 from a variety of devices and always offer live assistance when requested, so that customers and prospects will appreciate and use them.”

The Speech Trend

Making automated voice more approachable is a slow but gradual shift to speech rec from DTMF-based IVR. While speech rec currently accounts for only 16 to 23 percent of IVR and voice portal system ports, depending on contact center size, its share, says Frost and Sullivan, will climb steadily to 28 to 36 percent by about 2015.

The reason is that for most interactions speech rec can be easier and more intuitive for customers to use. As speech applications improve and, with this, become more widespread, customers will grow more comfortable with them.

“Customers have been exposed to speech for some years now; most every vertical has a speech interaction to offer,” Jeff Foley, Nuance senior manager of solutions marketing, points out. “For example, callers moving from one bank to another have come to expect speech self-service.”

Speech rec technologies have become sufficiently powerful, functional, practical and affordable to be built into and deployed as mobile self-service applications. These tools are becoming critical in enabling quality customer service as more customers port landlines to their wireless devices or go wireless only.

A key driver has been the advent of all-you-can-call plans. They have unshackled users from counting the minutes, making them freer to talk more and more often, including to computers.

“Doing away with the caps has given device manufacturers, OS developers and carriers a conduit to expose new services because customers do not feel they are limited in how they use wireless,” explains Grant Shirk, senior product manager of IVR at Microsoft Tellme. “It opens up the ability to make speech an integrated part of the device experience.”

Speech-based self-service solutions are a more practical and safer means to communicate over these increasingly popular devices than typing, touching and thumbing on raised or on-screen keyboards. Nuance has developed a mobile care application that intercepts smart phones’ calls to contact centers. It then provides answers to the most common customer service questions on the devices’ displays. Even so, contact centers should not be ready to let go of large numbers of live agents just yet, for going mobile has given customers more opportunities to contact organizations when they need them.

“The creation of mobile service applications is not so much displacing other self-service channels as it is decreasing the customer effort required to contact customer service by choosing the most convenient option available,” says Foley.

Refining the Technology

The gradual speech application adoption rate reflects what is still a painstaking application development process. These solutions can take weeks if not months to install, tune, test and refine before going live.

There are signs, though, that speech rec is in a gradually accelerating “virtuous wheel” of innovation, improvement, demand and affordability. The Frost and Sullivan analyst is seeing growing availability and maturity of application development tools, including development environments, grammar builders and reusable application modules. The firm is witnessing annual drops of three to four percent in average system sale prices.

“As the tools and techniques for speech application development have moved from the realm of the experienced speech engineer to a broader developer base, competition for application development has increased and costs are coming down accordingly,” says Outlaw.

Pre-programmed applications are also making speech applications more affordable and viable. Businesses can often tailor these applications to their requirements through parameter settings and configurations. For more specific requirements, some application customization may still be required, but with at least the shell of the application pre-built, the cost and time to deployment are lower than with applications that are entirely custom-built.

Helping to drive these improvements is a shift away from proprietary software and toward standards-based creation of applications, grammars, development environments, interfaces and communications. Key among these standards and open tools are CCXML, GRXML, VXML, SRGS, Eclipse, MRCP and SIP.

“Even though standards never seem to deliver on their full promise they are having a positive impact on the development and portability costs of speech applications,” says Outlaw.

Suppliers have been refining their speech-supporting solutions. In 2010 Voxeo came out with the Prophecy 10 platform, which can support over 6,000 concurrent calls per server, which it says is more than 10 times the performance of other standards-based platforms. This greater efficiency simplifies deployment and reduces its time and costs, both upfront and for ongoing support. For example, a Prophecy 10 software installation and configuration for a 16,000-port system took less than 30 minutes in total, while most other VoiceXML IVR vendors require more than a week to configure a similar-sized system.
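To put the figures above in perspective, the server count they imply can be worked out with simple ceiling division. The sketch below is illustrative only: it assumes one port equals one concurrent call, treats 6,000 calls as a hard per-server ceiling, ignores any spare capacity a real deployment would add for redundancy, and takes the "one-tenth" comparison figure of 600 calls per server straight from the vendor's claim rather than from any measurement. The function name `servers_needed` is hypothetical.

```python
def servers_needed(ports: int, calls_per_server: int) -> int:
    """Smallest whole number of servers able to carry `ports` concurrent calls."""
    if ports < 0 or calls_per_server <= 0:
        raise ValueError("ports must be >= 0 and calls_per_server must be > 0")
    # Integer ceiling division: round up so no call is left without a server.
    return (ports + calls_per_server - 1) // calls_per_server


PORTS = 16_000                # system size quoted in the article
PROPHECY_PER_SERVER = 6_000   # "over 6,000 concurrent calls per server"
OTHER_PER_SERVER = 600        # assumption: one-tenth, per the vendor's claim

print(servers_needed(PORTS, PROPHECY_PER_SERVER))
print(servers_needed(PORTS, OTHER_PER_SERVER))
```

Under these assumptions the gap in server count, rather than any single installation step, is what would account for most of the difference in configuration time.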

Prophecy 10 features bundled U.S. English speech recognition and synthesis (TTS) engines; it supports more than 30 additional language engines from vendors including Cepstral, IBM, Loquendo, Lumenvox, Microsoft, Nuance and Telisma. Prophecy supports industry standards such as VoiceXML, CCXML, SCXML and SIP.

Speech solutions have improved in specific verticals in part through experience gained from existing deployments. For example, when designers understand the language callers use when asking, say, about their health insurance claims, the options they create and pose are smarter and more relevant to everyday callers, reports Foley. These then generate more appropriate and successful responses from the automated systems.

Nuance measures key performance indicators (KPIs) by vertical application and analyzes which interactions perform best on a series of dimensions, including containment, caller satisfaction and authentication. With the insight gained, it can pinpoint common characteristics of best-in-class applications, which inform future designs. This knowledge enables Nuance to offer performance guarantees for vertical-specific applications, which makes them more likely to be greenlighted because the guarantees eliminate the risks and provide reliable ROI projections.

“KPI insight also uncovers applications that do not perform well with speech self-service, giving us and our customers the confidence to deploy only those interactions that are apt to work as desired,” says Foley. “This eliminates time and expense wasted on applications that will not meet enterprise performance expectations.”

It is people who lie at the heart of automated voice applications: designers, engineers and programmers. Outlaw sees increasing availability of skilled and experienced speech application developers, VUI (voice user interface) designers, grammar builders and project managers, whereas a few years ago this separate expertise, and these positions, did not exist.

“As application developers build more of these solutions their development expertise grows as well as their industry-specific business process knowledge,” says Outlaw. “Their skills will grow even faster, leading to more robust, feature-rich, more intuitive, lower cost, faster-to-deploy solutions that will improve the self-service experience.”

The shift to open standards is in turn driving more talent into the speech field: professionals who are increasing the technology’s accuracy and quality across all usage environments, including landlines, computers, games products and wireless devices.

“We process so many calls on our platform across so many industries our design teams and our partners’ design teams are really improving on these great experiences they are already delivering to make sure speech is easier, more efficient and more pleasant for their customers,” says Shirk.

Natural Language Developments

The key focus of speech work is natural language development. Natural language speech rec comes the closest of any automated voice technology to human interaction, which means applications that use it typically have the lowest zero-out rates.

Nuance’s Foley points out that natural language technology helps automated systems better understand humans’ words because it can recognize a wider variety of responses even if it has never heard them before. It studies examples of what callers might say and creates statistical models that help it understand their intents without anyone manually predicting each variation.
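The idea of learning caller intent from example phrases can be sketched in a few lines. What follows is not Nuance's method, which the article does not describe; it is a textbook naive Bayes classifier over word counts, swapped in purely as an illustration. The training phrases, the intent labels and the function names `train` and `classify` are all hypothetical, and a production system would work from far more data and from recognized speech rather than typed text.

```python
import math
from collections import Counter, defaultdict

# Hypothetical example utterances, each labeled with the caller's intent.
EXAMPLES = [
    ("what is my account balance", "balance"),
    ("how much money do i have", "balance"),
    ("check my balance please", "balance"),
    ("i want to pay my bill", "payment"),
    ("make a payment on my card", "payment"),
    ("pay the amount i owe", "payment"),
    ("i lost my card", "lost_card"),
    ("my card was stolen", "lost_card"),
    ("report a missing card", "lost_card"),
]


def train(examples):
    """Count how often each word appears under each intent."""
    word_counts = defaultdict(Counter)
    intent_counts = Counter()
    vocab = set()
    for text, intent in examples:
        words = text.lower().split()
        word_counts[intent].update(words)
        intent_counts[intent] += 1
        vocab.update(words)
    return word_counts, intent_counts, vocab


def classify(text, model):
    """Return the intent with the highest naive Bayes log-probability."""
    word_counts, intent_counts, vocab = model
    total_examples = sum(intent_counts.values())
    best_intent, best_score = None, -math.inf
    for intent, count in intent_counts.items():
        score = math.log(count / total_examples)  # prior for this intent
        denom = sum(word_counts[intent].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are skipped
                # Add-one smoothing keeps unseen word/intent pairs non-zero.
                score += math.log((word_counts[intent][word] + 1) / denom)
        if score > best_score:
            best_intent, best_score = intent, score
    return best_intent


model = train(EXAMPLES)
# A phrasing that does not appear verbatim in the training examples.
print(classify("someone stole my card", model))
```

The point of the sketch is that nobody wrote a rule for "someone stole my card"; the model scores it from word statistics gathered across the examples, which is the general sense in which such systems cope with phrasings they have not heard before.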

The flip side is that natural language is more expensive. It also typically takes longer to deploy and refine compared with directed dialogue and DTMF.

Suppliers have been working on making natural language more customer-usable, easier to deploy and less expensive. Nuance’s researchers have found ways to decrease the amount of data required to initially deploy natural language, dramatically reducing upfront costs. The firm’s professional services team has developed tools to automatically employ design best practices and decrease time to market. The efforts are paying off. About a fifth of Nuance’s top accounts use some sort of natural language system, reports Foley, and that number is growing rapidly.

Outlaw is seeing advances in natural language interfaces that will gradually make speech applications easier to use, leading to greater customer acceptance and adoption. They will also be capable of supporting increasingly sophisticated and interactive customer contacts.

“The promise of natural language technologies and techniques is to make computer-based applications more and more human-like and conversational,” says Outlaw.

Farewell DTMF IVR?

The unloved DTMF IVR is slowly being replaced by customer-friendlier speech rec systems. Yet it still accounts for, and for the next several years will continue to account for, the majority of automated voice system installations. And, thinks Joe Outlaw, principal analyst at Frost and Sullivan, it will deliver value for many years to come. That is because DTMF interfaces, where users are permitted to override prompts, can be faster for customers to use for routine interactions that entail a small number of options, and they also tend to cost less to deploy. DTMF also provides privacy and security; keying in credit card numbers and PINs is safer in crowded areas like airports than saying them out loud.

Jeff Foley, Nuance senior manager of solutions marketing, thinks, though, that even the latter rationale for DTMF will fade away. More sophisticated contact centers can deploy biometric voice verification, which is even more secure than touchtone since people cannot steal or guess one’s voice as they can a PIN.

“While many still believe that DTMF should be used as a backup to speech self-service in noisy situations, speech recognition technology has significantly advanced in the past few years and has addressed many of these factors,” says Foley.

The Hosted Speech Option

Sourcing cloud/hosted automated voice solutions, DTMF IVR in particular, has long been a popular alternative to buying them. It enables rapid capacity changes to match demand, offers shorter deployment times, avoids capital and support expenses and provides business continuity through being installed on offsite servers at secured and hardened locations. The hosting firms, not the purchasers, keep the technologies current. Hosted speech solutions have the added benefit that firms do not need skilled speech application developers on staff to create, maintain and tune these applications. Most service providers will build, develop, support and tune them as part of their service.

For example, USAN offers Automated Call Care, an integrated inbound and outbound multichannel (speech recognition and DTMF) hosted automated voice solution that can be standalone self-service or blended into live agent services. It provides in-depth speech application development services, including grammar and persona design, and application tuning, including for directed dialogue and natural language applications.

Clients can configure their Automated Call Care solutions and campaign management through web-based tools. This allows them to fine-tune their calling campaigns by changing application features and business-defined call rules.

The hosted platforms are incorporating and integrating with a wide range of functions and other channels. The Angel 4 Customer Experience Platform offers options including chat, e-mail, SMS, mobile and phone communications, inbound and outbound. Plug-in options include voice biometrics, name and address capture, phone payment solutions, seamless CRM integration, workforce management, real-time transcription and CTI.

“We have been amazed at how quickly large enterprise customers are moving everything to the cloud,” says Don Keane, vice president of marketing and product strategy at Angel.

The following companies participated in this article:


Microsoft Tellme




Brendan B. Read is TMCnet’s Senior Contributing Editor.

Edited by Stefania Viscusi