TMCnet - World's Largest Communications and Technology Community

Feature Article
February 2000

 

Service With A Smile: The Art Of Speech Recognition Over The Phone

BY STEVE CHAMBERS AND CHRISTOPHER "BLADE" KOTELLY, SPEECHWORKS INTERNATIONAL

We've all encountered products we don't like: hard-to-hold toothbrushes, impossible-to-program VCRs, confusing touch-tone phone systems. These products, and thousands like them, become even more frustrating when the technology is new and there is little or no precedent set for the best way to interact with them. It's far better for new technology to be designed appropriately, so that it delights the user from the first encounter with the product.

Speech recognition technology is a good example. While speech recognition systems over the telephone are beginning to go mainstream in companies such as United Airlines, E*Trade and HP, many callers are still using them for the first time, and that first user experience often sets the tone for how the system will be received by callers in the long term. Since you, as a stakeholder in your company's customer-facing applications, never get a second chance to make a first impression, the user interface design must be effective right from the first call.

While much has been written about the science and technology of automated speech recognition systems (the recognition engines, accuracy algorithms, databases and phone networks), little has been written about the art of the systems, or the way callers think and feel while using them. Ultimately, the art focuses on creating an engaging and satisfying caller experience.

Speech systems should be designed to help callers with what is important to them. Good systems excite callers and make them want to call back because they sense that the system works quickly and intuitively to provide the desired answers. A really good system makes callers feel as if they are speaking with a human and not a computer.

In contrast, poorly designed systems are frustrating and induce callers to switch to an operator immediately, resulting in increased hold times, over-worked agents and a lower ROI.

The Three Steps Of Speech-Interface Design
There is an art to designing systems that cause callers to hang up with a smile and choose to use the system again and again. Like many artistic endeavors, the "art" involves a disciplined process to get the right input and create the right interactions to which your audience will favorably respond. With over-the-telephone speech recognition, the process falls into three phases: researching the way callers think and speak about the application, designing the application so that callers understand and enjoy using it, and testing and refining the application with a focus on caller satisfaction.

Phase 1: Research. Ocean explorer Jacques Cousteau once said, "The best way to observe a fish is to become one." This holds true for the speech system design team. To understand what motivates callers, the team must know how callers are feeling when they call a company, their motivation for using the system, how experienced the callers might be with speech systems, how frequently they call, demographic information (will the callers have thick accents or want to speak in a different language?) and other aspects of the entire caller experience that are particular to each system. This may include knowing the culture of the industry.

The team must determine what callers really need to know and when they need to know it. For example, a caller may want to find out fare information for inexpensive flights, but she probably won't want to know about the kind of plane, availability of special meals and seat assignment until she has found the best airfare.

The designer also needs to know what's motivating the company providing the speech application. What services does the company want to provide through the speech system? What is the task at hand: booking flights? Trading stocks? Making a dinner reservation? On what kinds of callers should the company focus? This might be frequent fliers for an airline or high-volume stock traders for a brokerage. Why should the company focus on a specific customer segment? How can the brand of the company be communicated over the phone? Ultimately, the designer needs to be both the fish (callers) and the ocean (the company).

It is vital for the design team to understand the mental state of the caller. If designers look only for the obvious motivations of the caller (for example, the caller wants to access his or her bank account to transfer some funds), they might assume that the caller is relaxed and thinking clearly. But what if that same caller is transferring funds because five checks just bounced or his credit card was stolen? That caller might be frustrated and angry and may not want to hear long, slowly paced prompts. This caller would prefer to hear succinct prompts that offer helpful shortcuts and "barge-in," or the ability to talk over the prompts to jump quickly through the call.

During the research phase, the design team performs several tasks to get information about the proposed system and the role of the caller. They listen to calls coming into the company's call center, observe how customers use the company's Web site and touch-tone systems (if available) and role-play sample conversations. They then take the knowledge gained from their research and leverage it in the design phase.

Phase 2: Design. After the designer gains an understanding of both the callers' and the company's needs, the system design can be developed. This is where the team constructs a diagram, or call flow, which outlines the way questions are asked and describes the way the answers are handled. There is one major difference between a traditional touch-tone call flow and one designed for a speech recognition system, and it makes speech systems far more user-friendly: personality is built in. Personality refers to the way the computer speaks to the caller, the words it uses, its tone of voice (the designers may use a professional voice talent to record the audio prompts in keeping with the system's personality), and the pacing and pauses of the automated system. Personality affects the ability of the caller to understand and enjoy using the application.
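To make the idea of a call flow concrete, here is a minimal sketch in Python. It assumes a hypothetical flight-information line; the state names, the prompts and the `next_state` helper are all invented for illustration and do not describe any vendor's product. Each state pairs a prompt (where the personality lives) with the answers the system expects and where each one leads.

```python
# Hypothetical call flow: each state has a prompt (where the system's
# personality lives) and a map from recognized answers to the next state.
CALL_FLOW = {
    "welcome": {
        "prompt": "Hi, thanks for calling. Would you like fares or flight status?",
        "answers": {"fares": "ask_destination", "flight status": "ask_flight_number"},
    },
    "ask_destination": {
        "prompt": "Sure. Where are you flying to?",
        "answers": {},  # open-ended answer; a real system would handle it elsewhere
    },
    "ask_flight_number": {
        "prompt": "Okay. What's the flight number?",
        "answers": {},
    },
}


def next_state(state: str, answer: str) -> str:
    """Return the state that follows `answer`; stay in place if it isn't understood."""
    answers = CALL_FLOW[state]["answers"]
    return answers.get(answer.strip().lower(), state)
```

A real call flow would also describe error recovery, barge-in and hand-off to a live agent; this sketch only shows how questions and answer handling can be written down together.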

If a system has asked the caller to answer a long series of questions only to determine that the information the caller is seeking isn't available, the system must inform the caller in a way that does not upset him. If a warm, friendly system which had told the user, "I'll get that information for you right away" suddenly breaks out of character and informs the user in a stilted fashion that, "The requested information was not retrievable. Hang up and call again later," the user will feel stunned and confused. A live agent would most likely express sympathy for the caller's impending frustration, and so too should the automated system.

So how does the design team know when they have the call flow and the personality just right?

Phase 3: Testing. Usually, when people think about testing technology, they think about quality assurance (QA). But QA doesn't tell you if people will enjoy using the application. To ensure that callers will like the system and use it effectively, usability testing is imperative. In this stage, the design team gathers a focus group of potential callers and observes them using the system to complete particular tasks. After watching just a few people use a system, issues become readily apparent. Some of these may include callers who respond with answers the system cannot comprehend. Most of the time, after observing a random sample within the expected user community, the design team will understand about 95 percent of the usability issues and will know how to revise the design to fix issues that might prevent callers from having a satisfactory experience.

As an example, a system used by a bank in the Southern U.S. was tested with participants from that region. When the system asked a "yes" or "no" question, many callers responded with, "yes, ma'am" and "no, ma'am." This is something that would probably not have occurred with a system used primarily in the Northern U.S. The design team quickly added "ma'am" to the system's vocabulary and the callers were then able to complete the interaction flawlessly.
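The kind of vocabulary fix described above can be sketched in a few lines of Python. This is only an illustration working on transcribed text: the word lists and the `interpret_yes_no` function are assumptions made for the example, and a deployed recognizer would express the same idea in its own grammar format rather than in application code.

```python
import re

# Assumed vocabulary: core answers plus polite suffixes heard in usability testing.
YES_WORDS = {"yes", "yeah", "yep"}
NO_WORDS = {"no", "nope"}
POLITE_SUFFIXES = {"ma'am", "sir"}


def interpret_yes_no(utterance: str):
    """Map a transcribed utterance to True (yes), False (no) or None (not understood)."""
    # Lowercase and keep only letters and apostrophes: "Yes, ma'am." -> ["yes", "ma'am"]
    words = re.findall(r"[a-z']+", utterance.lower())
    # Drop polite suffixes so they don't block recognition of the core answer.
    core = [w for w in words if w not in POLITE_SUFFIXES]
    if len(core) != 1:
        return None
    if core[0] in YES_WORDS:
        return True
    if core[0] in NO_WORDS:
        return False
    return None
```

Adding a regional phrase then means adding one entry to a list, which is why the design team in the example could make the change quickly once testing had revealed it.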

When conducting usability testing, the team must test the call flow and also the personality of the system. After testing, some callers may think the voice prompts are too slow, but what they may actually mean is that they want the information faster. In this fairly common occurrence, the system can be redesigned so it need not keep returning to a database to retrieve information for the caller. It can be designed so that it draws on database information at only one point during the call, rather than repeatedly searching. This is another instance in which an expert design team is critical to ensuring a successful application.
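As a rough illustration of drawing on the database at only one point during the call, the Python sketch below fetches a caller's record once and reuses it for every later question. The `CallSession` class, the `fake_fetch` stand-in and the sample account fields are hypothetical; a real application would query its own back end.

```python
class CallSession:
    """Holds data for one call so the back end is queried only once."""

    def __init__(self, fetch_account):
        # fetch_account is a hypothetical callable that performs the slow lookup.
        self._fetch_account = fetch_account
        self._account = None
        self.lookups = 0  # counts back-end hits, for illustration only

    def account(self, caller_id: str) -> dict:
        # The first request hits the database; later requests in the same
        # call reuse the cached record (one session serves one caller).
        if self._account is None:
            self._account = self._fetch_account(caller_id)
            self.lookups += 1
        return self._account


def fake_fetch(caller_id: str) -> dict:
    # Stand-in for a real database query.
    return {"caller_id": caller_id, "balance": 120.50, "last_deposit": 40.00}


session = CallSession(fake_fetch)
balance = session.account("555-0100")["balance"]
deposit = session.account("555-0100")["last_deposit"]
```

Both answers come from a single lookup, so the caller hears the second one without waiting on the database again.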

There are two primary methods to determine if people enjoy using the system. First, designers can observe the expression on callers' faces and how they talk while using the system. If they grimace and speak in a strained voice, the system needs redesigning to make it more appealing. If callers smile, a primary goal of any speech recognition design team, the design is probably right on track. Second, sample callers can complete a survey to rate their impression of the system's automated voice, the call flow, ease of use, etc.

While testing a new system is necessary before it becomes available to all callers, the systems may require periodic tune-ups after they go live and throughout their lifecycles. For example, if a company launches a new ad campaign, a different caller population might begin to use the system and have some particular expectations which the system was not designed to address. Or, if a company changes its name or adds service features, the speech system will need updating to reflect the changes. As a result, maintenance of the system is an ongoing process.

Choosing A System
Following are some things to consider when assessing a vendor. Try some of the vendor's deployed systems yourself. Are they friendly and helpful? Do they meet the requirements you are seeking? Does the system enhance the caller's experience? What is the experience of the vendor's design team and how many systems have they designed? Do their deployed systems all sound the same or do they have their own personality? Is it important to you to have features that allow companies to monitor the application once it's deployed so that the design can be continually checked and improved? Finally, is the vendor willing to work with you well after the system is deployed to fine-tune it and make adjustments?

The important point is that the perfect system is one that meets both your needs and your callers' needs. A system that will make your callers smile and want to use it again makes all the difference between acquiring a short-term customer and retaining a long-term one.

Steve Chambers is vice president of worldwide marketing for SpeechWorks International, Inc. and Christopher "Blade" Kotelly is creative director of interface design.


Speech Starts A Revolution: Speech Recognition Ignites The Call Center Environment

BY DEBORAH MYRICK, PHILIPS SPEECH PROCESSING

A revolution is underway. This revolution can either propel you and your company to the top or it can pass you by, but there is little doubt that it's taking place. It's speech recognition and the word is starting to spread. Technological advancements such as Web-enabling the call center, call forecasting and scheduling programs, and self-healing support tools have allowed call centers to stay ahead of customer expectations. While all of these innovations are advantageous, speech recognition stands out as one of the most revolutionary tools in enhancing a call center's viability by increasing customer satisfaction and employee efficiency.

Progressive speech recognition utilities provide call centers with a resourceful solution, allowing customers to carry on a free-flowing dialog with the operator's systems, data and transaction services in each individual's natural language. Thanks to speech recognition applications, voice interfaces may be created to provide countless kinds of services to a call center, ranging from directory assistance to banking and more. Speech-enabling and automating many of the commonly performed tasks previously done on a telephone keypad not only makes good business sense, but in the long run becomes a key customer relationship builder.

Speech recognition is more than just the next level of DTMF or an alternative to IVR menus. It is a natural language understanding tool that works seamlessly with existing telecommunications and data infrastructure. Thanks to the advent of compact, self-contained systems designed to meet the user's needs, service providers are no longer required to invest heavily in equipment. Effective speech recognition applications can now be implemented faster and more easily, helping to increase productivity, improve call management and eliminate extended customer hold times.

An additional concern for today's call center manager is staffing. Finding qualified personnel and keeping agents (who are sometimes stretched in order to meet peak volumes and increasing workloads) motivated is a challenge. Speech recognition utilities allow a call center to maximize its resources by freeing up agents to focus on the more complex tasks and devote attention to critical and time-sensitive customer issues.

The returns on using advanced speech recognition technology are not solely for the external customers of the company or call center. Using speech recognition within wireless services and other telecommunications applications can enable a sales force or other internal customers to obtain information directly from company databases by voice while traveling, allowing them to devote added time to other projects.

Call centers considering practical ways to improve customer relationship management should examine the beneficial results of implementing natural language understanding speech recognition applications. Many examples are present in today's market and, because of high volume and global access, the trend continues to grow. Airlines are instituting speech-enabled applications within their reservation and flight information networks, automating their call centers so that customer queries are handled with maximum efficiency; financial institutions are using speech technology to handle routine information, freeing up their workforce to concentrate on handling other sensitive transactions; and travel service providers are turning to speech recognition technology instead of an increased workforce as a financially viable means of keeping pace with consumers' ever-increasing demands for information.

Say What's On Your Mind
Unlike typical IVR systems in which individual words are analyzed, speech recognition software understands words by individual phonetic components as it proceeds. It is speaker-independent and extracts the meaning from what is said, regardless of the specific information. Because users have a long list of selections when dealing with industries such as banking, travel and finance, these systems can be trained to have nearly the same success rate of understanding as a human operator, without demanding customers speak in a particular cadence or intonation. Callers may speak as fluently as they would with a live agent, an innovation attributable to advancements in speech-recognition technology which allow systems to possess natural language and grammar understanding.

The phonetic approach is important for adding vocabulary and new words. It is also remarkably useful when working with a language system that will permit the use of various idiomatic and colloquial phrases. This allows call centers to seamlessly add speech recognition capabilities to existing systems without demanding callers change the way they speak. From Dutch to Taiwanese, speech recognition systems can now be tailored to support any number of languages, allowing automated call centers to offer a human dialog system and approach.

In addition to the benefits offered regarding customer satisfaction, integrating speech recognition technology can also provide added safety benefits when used in a wireless environment. With the multitudes of automobile drivers using mobile phones, speech recognition can make a safety difference, allowing callers to speak the option they choose instead of having to punch in their selection on the keypad. Because speech recognition systems allow a casual stream of dialog, callers spend less time concentrating on using these devices and can pay more attention to driving.

Speaking To The Net
Increasingly, customers are becoming technologically savvy and eager to find information and solutions on their own. Without losing touch with the technologically advanced customers, call centers are looking at ways of expanding their informational services and resources. Many of today's call center operators are expecting to see a productivity boost in the convergence of the Internet and telephony. Combining the resources of the Internet with speech recognition technology is an appealing trend to call center operators, both economically and from a value-added marketing perspective.

Already call centers have taken the convergence one step further and created services that are evidence of a paradigm shift within the industry. Having gone "live" with the deployment of natural speech recognition in intelligent network environments, including the implementation of voice portal services, these trendsetters have increased customer satisfaction by leaps and bounds. As a result of their efforts, callers can access a large number of services customarily found via the Internet, just by using the telephone.

Customers tapping into these types of services can access a suite of hundreds of different databases, including restaurant guides, white/yellow pages, travel and banking. Additionally appealing to the tech-savvy customer is that their inquiries are answered by an automated system based on advanced natural dialog technology that understands what they say and intuitively anticipates where their inquiry may be headed, allowing them to quickly and efficiently accomplish their task.

Speech recognition has also been used by call centers to develop options that offer customers a convenient way to find out about new products, upcoming offerings and information services. For example, when customers dial a call center's telephone number, their inquiries are answered by an automated system where they may inquire about company services without having to spend excessive time on the phone or use extensive keypad menus. Since many companies operate several centers at different locations, speech technology can be pervasive throughout the system. Regardless of where the service is implemented, each facility's performance can be uniquely and universally enhanced.

With more overhead being required of call centers in order to meet customer demand for information and services, speech recognition technology can be an important aspect of efficiently managing operations and preserving high levels of customer satisfaction. Also important to recognize is that many of today's customers are calling from distant countries or speak different languages, resulting in the need for specialized programs for each user. Speech recognition technologies are advanced enough not only to address this issue by recognizing multiple languages, but also to handle the differences and intricacies of regional dialects.

The movement is happening right now. Speech recognition is revolutionizing the way call centers operate and handle the challenges of administering to the needs and demands of their customers. After all, speech recognition places important emphasis on what's vital to a call center: what the customer is saying. As the telecommunications industry moves closer to acceptance and widespread use of natural language understanding and speech recognition, a greater understanding of customers and what satisfies them will not be far behind.

Deborah Myrick is director of marketing, Americas with Philips Speech Processing. Philips Speech Processing maintains offices around the globe and has realized implementations of its telephony-based speech recognition applications throughout the U.S., Europe and Asia.






