TMCnet - The World's Largest Communications and Technology Community

February 2003


With this issue we are introducing a new monthly feature, the Executive Roundtable. Every month we will pose questions to executives from industry-leading companies, examining topics that are of vital importance to the industry. Featured this month in our Executive Roundtable are Joseph G. Brown, CEO of Edify Corporation (www.edify.com) and Chris Lotspeich, director of marketing at LumenVox LLC (www.lumenvox.com).
– The Editors

CIS: How are speech technologies improving the efficiencies of contact centers?

Brown: Speech technologies are impacting contact center efficiency in several ways.
a) Speech recognition expands the automation vocabulary from 30 words (the characters on a touch-tone keypad) to hundreds, thousands, or even tens of thousands of words. With larger vocabularies, automation solutions can now extend into previously nonautomated applications. 
b) Because speech is a natural way for users to interact, it is possible to develop powerful user interfaces into complex speech automation systems. These user interfaces provide not only higher automation rates but also measurable increases in customer satisfaction, through both application efficiency and application persona. 
c) Sound speech automation includes an intelligent transition to live agent assistance. Persona, application flow and CTI can all be key elements in the development of good speech applications. Any of these components can capture necessary data points that can then be transferred to a live agent. A good speech application can make the live agent transfer a much more intelligent and efficient process. 
d) The bottom line: with more tasks being automated, customers are spending less time on hold waiting for a live agent to get their issues resolved. Many more customers can complete their query via automation, and a greater number of live agents are available to work the complex situations that are too difficult to automate.

Lotspeich: Contact centers have long held an interest in self-service. When touch-tone was introduced, many thought that it would dramatically improve the efficiency of contact centers. Instead, more people now attempt to reach a live person immediately rather than attempting the touch-tone interactive voice response (IVR), because touch-tone systems are arduous, lengthy and confusing. Speech recognition allows a customer to say what he or she wants, removing these problems and improving efficiency. The improvements for a contact center are numerous:
a) Shorter queue times: Many customer questions can be answered much more quickly through an automated system. The customer does not wait for a human operator, and human operators do not have to answer the same question(s) over and over again.
b) Access to information 24/7: Customers can call the system at any time for access to account balances, telephony information, etc. Meanwhile, human operators need only be on location part of the time, resulting in significant savings in agent salary costs.
c) Ease of use: Touch-tone systems require customers to pull the phone away from their ear in order to push the button corresponding to their request. Speech technology enables hands-free access to information and help, particularly in nontraditional use environments such as a car. 
d) Saying it vs. spelling it: If a customer wants to locate a store in a national chain, he cannot spell out California on the keypad with touch-tone. With speech recognition, the customer can just say 'California.' 

CIS: Are speech technologies putting an end to push-button IVR systems?

Lotspeich: Absolutely not. Just as people thought that the Internet would remove the need for retail stores, speech technology is being touted as the 'replacement' for DTMF. In fact, speech technology is best utilized in conjunction with DTMF. The best example of where DTMF would still be utilized is for an account number or password. Some people do not feel comfortable saying private information out loud, so DTMF can still grant access to users as well as provide privacy. Extremely noisy environments require some method other than speech. Likewise, speech-impaired people (or just someone with laryngitis) may not be able to use a system that is only speech-enabled. 

Brown: Philosophically, absolutely. Speech technologies free the caller from the limited vocabulary of the telephone keypad. With a larger vocabulary and a more natural user interface, the application design is completely different. Gone are the rigid rules of IVR application nested menus. The biggest mistake an organization can make is to simply 'speech-enable' its existing DTMF applications.

What remains, however, is the essential integration of DTMF functionality in a speech application. In instances where a caller might be uncomfortable speaking a PIN or social security number over the phone or where a mobile caller is in a high-noise environment where speech recognition accuracy is not reliable, providing the application with a DTMF option allows the caller to continue his or her automation experience without needing live agent assistance.
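The dual-mode fallback Brown describes can be sketched in a few lines. This is a minimal, hypothetical illustration (not Edify's platform or any vendor's API): a single field, here a four-digit PIN, accepts either DTMF key presses or a spoken digit string, so the caller can continue in automation either way.

```python
import re

# Hypothetical dual-mode input handler: the same field accepts
# either DTMF digits or a spoken digit string.
SPOKEN_DIGITS = {"zero": "0", "oh": "0", "one": "1", "two": "2",
                 "three": "3", "four": "4", "five": "5", "six": "6",
                 "seven": "7", "eight": "8", "nine": "9"}

def interpret_pin(speech_text=None, dtmf_digits=None):
    """Return a 4-digit PIN from DTMF or speech, or None if neither parses."""
    # DTMF path: caller keyed the PIN directly (private, noise-proof).
    if dtmf_digits and re.fullmatch(r"\d{4}", dtmf_digits):
        return dtmf_digits
    # Speech path: map spoken digit words ("one two three four") to characters.
    if speech_text:
        digits = [SPOKEN_DIGITS.get(w) for w in speech_text.lower().split()]
        if len(digits) == 4 and all(digits):
            return "".join(digits)
    return None  # neither mode parsed: reprompt or transfer to a live agent
```

A `None` result is the cue for the application to reprompt or hand off to an agent rather than fail the caller outright.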

Edify's platform allows for the complete integration of DTMF and speech, allowing developers to build rich, customer-centric applications that can function as the caller requires depending on the application or environment.

CIS: Are speech technologies becoming more affordable for smaller businesses?

Brown: In examining speech recognition, it is important to view it within the overall context of the solution: the customer interaction. The speech technology is just a component of that overall solution. As with any technology, you will see a gradual downward trend in technology costs, but at the same time you will see an increase in expenditure on implementation and integration. 
A key benefit for smaller businesses is the increasing number of pre-built speech applications. These applications allow businesses to bring speech recognition into their organizations for less, providing a stronger and quicker ROI, and generating add-on orders for additional speech recognition solutions.

Lotspeich: Numerous speech companies state that they are more affordable, but will stack on professional services that increase costs substantially. At LumenVox, we are focused on the small to mid-sized companies that have never approached speech recognition or even IVR development before. We listen to what small business owners want. We developed a pricing model that reflects the needs of these small business owners. We do not make our revenue from professional services, which means we have a completely different business model. 

CIS: One of the drawbacks to implementing speech technologies has been the difficulty in setting up the systems. How is that improving?

Lotspeich: The most difficult aspect of implementing speech technologies is that specialists from three different disciplines are required for the entire speech application: someone with knowledge of the phone system, someone for the speech recognition portion, and someone for hardware/portal issues. Blending these different people can increase both the time and cost for a company implementing speech. 

LumenVox has improved this by providing easier tools for customers to use. We have made our toolkit extremely user-friendly and moved away from professional services, allowing companies to develop, implement and control their own applications. 

CIS: Please briefly explain the difference between natural language processing and deep linguistic processing, the advantages of each, and whether buyers should look to a solution with one, the other, or both.

Brown: Natural language processing can actually take on a couple of meanings. The literal meaning within the speech world is a spoken phrase by a caller that contains multiple values. For example, 'Sell 13 shares of IBM at market.' Within that phrase there are four values: sell, 13 shares, IBM, at market. This is in contrast to the term 'directed dialog,' where the speech application directs the caller to speak a word or phrase that has a single value: 'Please say buy or sell,' 'What is the stock symbol?' etc. 
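The multi-value utterance Brown cites can be illustrated with a toy slot-filling grammar. The pattern and function names below are invented for illustration (not any vendor's grammar format): a single pattern extracts all four values from the one natural-language phrase that a directed dialog would otherwise collect one prompt at a time.

```python
import re

# Toy grammar for the trading example: one utterance, four slots.
TRADE = re.compile(
    r"(?P<action>buy|sell)\s+(?P<qty>\d+)\s+shares?\s+of\s+"
    r"(?P<symbol>[A-Za-z]+)\s+at\s+(?P<price>market|limit)",
    re.IGNORECASE,
)

def parse_trade(utterance):
    """Return the four slot values as a dict, or None if the phrase doesn't match."""
    m = TRADE.search(utterance)
    return m.groupdict() if m else None
```

A directed-dialog system would instead issue four separate prompts and match each answer against a one-slot grammar.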

The extended meaning of natural language in the speech world deals with statistical language modeling, or SLM. SLM technologies predict what the caller will most likely say based on a large historical sample size of captured caller utterances into a system. Through statistical analysis, a speech system can accurately predict what the caller is saying, even if the actual utterance might not be completely understood or in the speech grammar. This methodology provides a more 'natural language' interface for callers, who might be prompted by a system saying, 'How may I help you?'
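A toy version of the statistical idea, assuming an invented corpus of past caller utterances: score candidate transcriptions by how common their word pairs are in the corpus, so the statistically likelier hypothesis wins even when the audio is ambiguous. Real SLM systems are far more sophisticated; this only shows the shape of the approach.

```python
from collections import Counter

def bigrams(sentence):
    """Split a sentence into adjacent word pairs."""
    words = sentence.lower().split()
    return list(zip(words, words[1:]))

def score(candidate, corpus_counts):
    """Average corpus frequency of the candidate's bigrams."""
    pairs = bigrams(candidate)
    if not pairs:
        return 0.0
    return sum(corpus_counts[p] for p in pairs) / len(pairs)

# Invented sample of captured caller utterances.
corpus = [
    "check my account balance",
    "check my balance please",
    "pay my bill",
]
counts = Counter(p for s in corpus for p in bigrams(s))

# Given two acoustically similar hypotheses, prefer the likelier one.
best = max(["check my balance", "czech my valence"],
           key=lambda c: score(c, counts))
```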

Deep Linguistic Processing (DLP) is an Edify-developed technology that understands the meaning of written text. Unlike the spoken medium, which usually occurs within a well-defined context, written text is often much less bounded. For example, customer e-mail is often free-form, and contains multiple topics and contexts. Additional challenges of written text are grammar and spelling inaccuracies. Many traditional text-based automation tools struggle in this environment and these tools rarely understand what is actually being written. More commonly, these tools categorize the written text for live agent processing. DLP allows Edify to fully understand the written message and then to automatically process that information properly and consistently.

Strategic buyers should be looking at the overall context of their business. How do customers want to interact with them? What media? What types of transactions do customers want to accomplish through automated and assisted service? The result of that discovery will lead the customer to a common lexicon of written and spoken languages, business processes, and an overall common view of their customer. From that perspective, organizations are much better able to develop a complete automation and assisted service solution that provides consistent customer experience across all the communication channels. In this environment, you will find DLP and natural language technologies, among others, leveraging the common lexicon that has been created for that business.

CIS: What is the most exciting application you have seen for speech technologies and what new applications do you foresee for speech technologies in the next few years?

Lotspeich: Because all current speech applications operate in a very controlled environment, there are no really 'exciting' applications in the marketplace. Most of the current applications provide basic information or database manipulation, and until speech recognition can leave that contained environment, they will remain very static and predictable. True, different voice talents can be utilized to 'spice up' an application, but they all still provide the same style of help or information. To me, the most exciting area of speech technologies is the development of toolkits that empower companies to develop their own speech applications without requiring professional services. 

Some intriguing applications coming up in the marketplace are freestanding informational kiosks and home automation. Kiosks can be used in locations ranging from the local grocery store to large amusement parks. For home automation, imagine arriving home and simply saying, 'Lights, television to channel 9 for the local news, and oven at 350 degrees,' and then having your lights turn on, the television switch to channel 9, and the oven begin heating for dinner. 
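The compound command in that home-automation scenario could be handled by a small parser that matches several device patterns against one utterance. Everything below (device names, grammar, command tuples) is invented for illustration.

```python
import re

# Hypothetical home-automation grammar: each pattern recognizes one
# device command embedded anywhere in a compound utterance.
PATTERNS = [
    (re.compile(r"\blights?\b", re.I),
     lambda m: ("lights", "on")),
    (re.compile(r"\btelevision to channel (\d+)", re.I),
     lambda m: ("television", "channel " + m.group(1))),
    (re.compile(r"\boven at (\d+) degrees", re.I),
     lambda m: ("oven", m.group(1) + " degrees")),
]

def parse_commands(utterance):
    """Return a list of (device, setting) commands found in the utterance."""
    commands = []
    for pattern, build in PATTERNS:
        m = pattern.search(utterance)
        if m:
            commands.append(build(m))
    return commands
```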

Brown: The most exciting speech applications are actually several applications integrated together within the context of an actual business. A great example of this is Edify's Automated Address Capture (AAC) application being integrated within a transaction environment. Standing alone, AAC functions as a speech recognition application that provides an automated way for customers to notify a company that they have changed their address. According to the U.S. Census Bureau, address changes happen 44 million times every year across the United States. It made sense to help companies automate the address change process; that is the original reason we created the AAC application. However, we now have customers who are embedding the AAC application within broader transaction speech applications in their businesses. An example: a retail company wants to automate the process of customers ordering its products over the phone. One of the biggest challenges in that process is capturing valid shipping and billing address information through automation. By incorporating Edify's AAC application within a specific speech transaction application, the overall speech application easily captures address information within the overall context of the automation solution. Without embedding AAC within the broader application, the automation would not be nearly as effective from a user interface and data integrity perspective.

My predictions for the future include: a continued refinement of the user interface; a more responsive service where speech is a critical piece of the overall customer experience; and a convergence of speech- and text-based solutions from a common grammar or 'knowledge base' perspective.
