Call Center Solutions Featured Article
November 23, 2009
White Paper Details the Advantages of VoltDelta's CrystalWAVE Speech Recognition Technology
By Patrick Barnard, Group Managing Editor, TMCnet
Thanks to advancements in speech recognition technology, today’s interactive voice response (IVR) systems have improved to the point where they can accurately carry out “natural dialogs” with callers – thus cementing their role as customer self-service/call routing solutions for the call center.
As speech recognition technology continues to improve, one can expect more organizations to deploy IVR systems for the purpose of reducing labor costs and improving customer service.
Up until about ten years ago, most IVR systems were considered technically deficient because they were unable to accurately “interpret” what a caller was saying – often resulting in the caller being transferred to a live agent. Due to their inaccuracy and “clunky” design, the early IVRs were derided by consumers as “awkward,” “difficult to use,” “inefficient” and “a waste of time.” So inefficient were some of these systems that consumers actually lashed out at the merchants, banks, utilities and other organizations that deployed them for “not caring about the customer.” And plenty of people joked about getting “lost in the corporate phone maze.”
Needless to say, speech recognition technology has come a long way since then. Today’s IVRs have improved both in terms of functionality and reliability – and, due to the growing number of deployments over the years, consumers have grown more accustomed to using them as well.
In fact, industry research shows that many consumers, especially those in the younger generations, actually prefer to use speech-enabled self-serve systems over live assistance when it comes to simple interactions – such as getting one’s bank balance, carrying out an over-the-phone transfer, or requesting a copy of a brochure – because it is usually faster than waiting on hold to speak with a live agent.
It seems we are reaching the point where consumers accept that it is now “OK” to carry on an interaction with a machine – and that it is no longer an “insult” when a customer first encounters a self-serve system before talking with a live person.
Underscoring the growth in IVR deployments is a recent research report from T3i Group predicting that the global IVR market will grow to $514 million by 2013, up from an estimated $431 million this year, due in part to the growth in VoiceXML (VXML) technology.
The firm’s “InfoTrack for Converged Applications 2008 IVR Market Report” forecasts that 95 percent of IVR ports shipped in 2013 will support VXML, compared to less than 75 percent today. VXML enables Web sites to offer the same text-based applications, such as order entry, with speech recognition.
The report finds that DTMF-only IVR systems are slowly being phased out, and predicts that speech-enabled (or hybrid DTMF/speech-enabled) systems will outnumber DTMF-only (i.e., touchtone-only) systems by almost 2 to 1 by 2013. IP/SIP port shipments will continue to grow: by 2013, only 10 percent of all IVR ports shipped will be TDM, compared with 42 percent today.
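As a rough check on the market forecast cited above, the implied annual growth rate can be worked out from the two dollar figures. The sketch below assumes the $431 million estimate applies to 2009 (the year of this article) and that growth compounds evenly over the four years to 2013; neither assumption comes from the T3i report itself, and the function name is purely illustrative.

```python
def implied_annual_growth(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate implied by two endpoint values."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted from the T3i forecast, in millions of US dollars.
# Assumption: "this year" means 2009, so the span to 2013 is four years.
rate = implied_annual_growth(431.0, 514.0, 2013 - 2009)
print(f"Implied growth: {rate:.1%} per year")
```

Under those assumptions the forecast works out to roughly 4.5 percent growth per year, a steady rather than explosive pace.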
At the heart of every IVR solution is the speech engine – this is the software that is used to “interpret” what a caller is saying, which in turn allows the system to render an appropriate response. One way today’s IVR systems have improved significantly is that the speech engines they incorporate are more adept at “self-tuning” – that is to say, the more the system is used, the more it automatically “learns” all the different words, phrases, utterances and vocal sounds used by the people interacting with it – and tags meaning to those words, phrases and sounds.
In a new white paper, hosted call center solutions provider VoltDelta explains how its CrystalWAVE speech recognition technology takes a different approach from other vendors by running multiple speech technologies in parallel to achieve even higher accuracy.
“The use of multiple technologies in parallel allows for flexibility in that the recognition grammars are equally suited to different types of callers; those that give short/succinct responses as well as those that use more natural-language like speech,” the white paper explains. “Within the same application different levels of speech complexity can be handled seamlessly, allowing for applications that are driven by need and not technology restrictions.”
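To make the idea of running recognizers in parallel more concrete, here is a minimal sketch in Python. It is not VoltDelta's implementation, which the white paper does not describe at this level. The two toy "engines," the `Hypothesis` type, the confidence values and the pick-the-highest-confidence rule are all assumptions made for illustration: one engine handles only short, exact commands, while the other spots keywords inside longer, natural-sounding requests.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Hypothesis:
    engine: str        # which (hypothetical) engine produced the result
    text: str          # recognized intent, empty if nothing matched
    confidence: float  # 0.0 to 1.0, made-up values for illustration


COMMANDS = {"balance", "transfer", "brochure"}


def directed_dialog_engine(utterance: str) -> Hypothesis:
    # Toy stand-in: succeeds only on short, exact command phrases.
    text = utterance.strip().lower()
    if text in COMMANDS:
        return Hypothesis("directed", text, 0.95)
    return Hypothesis("directed", "", 0.0)


def natural_language_engine(utterance: str) -> Hypothesis:
    # Toy stand-in: spots a known keyword anywhere in a longer sentence.
    for word in utterance.lower().split():
        word = word.strip(".,?!")
        if word in COMMANDS:
            return Hypothesis("natural", word, 0.70)
    return Hypothesis("natural", "", 0.0)


ENGINES: List[Callable[[str], Hypothesis]] = [
    directed_dialog_engine,
    natural_language_engine,
]


def recognize(utterance: str, engines=ENGINES) -> Hypothesis:
    # Run every engine on the same utterance at the same time,
    # then keep the hypothesis with the highest confidence.
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        results = list(pool.map(lambda engine: engine(utterance), engines))
    return max(results, key=lambda h: h.confidence)
```

In this sketch a terse caller who says "balance" is served by the directed-dialog engine, while "I would like to check my balance, please" falls through to the keyword-spotting engine, so both styles of caller reach the same result within one application.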
As the white paper explains, VoltDelta’s CrystalWAVE technology goes way beyond simple word or phrase spotting and can actually interpret meaning by analyzing the context in which those words or phrases are used.
“CrystalWAVE differs from other voice recognition techniques due in large part to the variety of data sources that are considered to more accurately recognize human speech in real time,” the white paper states. “Grammars can be of any size, from thousands of data entries to the hundreds of millions of records found in telephone directories. This data also provides CrystalWAVE with a sense of context. Recognition results can be immediately refined through a speech process known as robust parsing. This technique works to verify results that appear consistent with the data set or discounting those that fall out of logical parameters.”
With its advanced algorithms, VoltDelta’s CrystalWAVE is able to “ignore” (i.e., filter out) utterances that aren’t actually words – such as the “ums” and “uhs” that so frequently adorn caller responses. The system achieves this in part by segmenting words and phrases into three categories: those in the system’s vocabulary that it is “certain” of; those in its vocabulary that it is only “marginally certain” of; and those that are not in the system’s vocabulary at all.
In the event the system encounters a word or phrase it is only “marginally certain” of, it automatically triggers the generation of smaller, focused grammars, or WAVE-LETS, and in effect takes a “second look” at the utterances and then narrows the recognition task to the words most likely spoken. Through this approach, difficult/complex “problems” are reduced and made simpler. The use of this WAVE-LET technology is seamless and automatically applied when required.
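The filter, triage and “second look” steps described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the CrystalWAVE algorithm: the filler list, the thresholds, the word-overlap score used for the first pass and the character-level comparison used for the second pass are all invented here for illustration, and the input is text standing in for audio.

```python
from difflib import SequenceMatcher
from typing import List, Optional, Tuple

CERTAIN = 0.85   # assumed threshold: accept outright at or above this score
MARGINAL = 0.30  # assumed threshold: below this, treat as out of vocabulary
FILLERS = {"um", "uh", "er", "hmm"}


def tokens(utterance: str) -> List[str]:
    # Lowercase, strip punctuation and drop non-word fillers.
    words = (w.strip(".,?!") for w in utterance.lower().split())
    return [w for w in words if w and w not in FILLERS]


def word_overlap(heard: List[str], entry: str) -> float:
    # Cheap first-pass score: shared words divided by all distinct words.
    a, b = set(heard), set(entry.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


def triage(score: float) -> str:
    if score >= CERTAIN:
        return "certain"
    if score >= MARGINAL:
        return "marginal"
    return "out_of_vocabulary"


def recognize(utterance: str, grammar: List[str],
              shortlist_size: int = 2) -> Tuple[Optional[str], str]:
    # Assumes a non-empty grammar.
    heard = tokens(utterance)
    ranked = sorted(grammar, key=lambda e: word_overlap(heard, e), reverse=True)
    bucket = triage(word_overlap(heard, ranked[0]))
    if bucket == "certain":
        return ranked[0], bucket
    if bucket == "out_of_vocabulary":
        return None, bucket
    # Second look: build a small, focused grammar from the likeliest
    # entries and re-score only those with a finer, character-level match.
    focused = ranked[:shortlist_size]
    phrase = " ".join(heard)

    def fine_score(entry: str) -> float:
        return SequenceMatcher(None, phrase, entry.lower()).ratio()

    best = max(focused, key=fine_score)
    if fine_score(best) >= CERTAIN:
        return best, "second_look"
    return None, "marginal"
```

With a grammar of "Pizza Palace," "Pizza Plaza" and "Pasta Place," the input "um, pizza palace" is accepted on the first pass once the filler is dropped, while a slightly garbled "uh pizza palas" lands in the marginal band and is resolved to "Pizza Palace" only after the second look over the two-entry shortlist.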
The result is a much higher degree of speech recognition accuracy:
“Tests run on over 30,000 actual calls comparing CrystalWAVE performance to standard directed dialog and natural language voice recognition displays compelling advantages,” the white paper states. “Results revealed accuracy improvements of 10 percent when comparing either traditional phrase-based grammars or SLM-based natural language recognition with CrystalWAVE. More uniquely, CrystalWAVE reduced the false presentation rate by 5 percent. False presentation rate is used to highlight context sensitive benefits such as not presenting the caller with a phrase that might easily be misunderstood as correct.”
To download a free copy of the white paper and learn more about VoltDelta’s CrystalWAVE speech recognition technology, click here.
Patrick Barnard is a senior Web editor for TMCnet, covering call and contact center technologies. He also compiles and regularly contributes to TMCnet e-Newsletters in the areas of robotics, IT, M2M, OCS and customer interaction solutions. To read more of Patrick's articles, please visit his columnist page.
Edited by Patrick Barnard




