Unified Communications

The Call for Speech Recognition Grows Louder

By Paula Bernier, Executive Editor, TMC  |  March 01, 2012

This article originally appeared in the March issue of INTERNET TELEPHONY magazine.

Talk about speech recognition, and the first thing to come to mind is probably Apple’s Siri. Indeed, Siri has become a key driver to move speech recognition into the mainstream. But there’s a lot more going on with speech recognition, which is now being used to enable people to control their TVs, VCRs, set-top boxes, satellite dishes, DVD players, phone systems, and even devices like alarm clocks, in-vehicle systems and thermometers.

As Global Industry Analysts noted in a study released earlier this year, “the use of voice/speech recognition technology has transcended from conventional corporate uses, such as automated speech-enabled interactive voice response systems at contact centers, to mass-market products such as mobile phones, and car navigation systems, among others.”

That may explain why the global market for speech recognition is expected to increase at a compound annual growth rate of 8.8 percent between 2010 and 2015, according to a January 2011 MarketResearch.com study. The study estimated the total market was $38.4 billion in 2010, and forecasts it will reach $58.4 billion in 2015.

Nuance Communications is one of the key companies positioned to help make that happen. In fact, Apple licenses Nuance technology for some of its products, Matt Revis, vice president of product management at Nuance, tells INTERNET TELEPHONY, although he declines to comment as to whether it’s Nuance technology that is powering Siri.

In any case, Revis describes Siri’s impact on the speech recognition space as “massive, massive and massive.”

“Anything Apple does just has massive gravitas” across the consumer base and the mobile ecosystem, he notes. And, he adds, it doesn’t hurt that Siri is fun, has a “cute personality,” and is heavily advertised. “People much prefer to interact with speech if the experience is brilliant and delightful,” Revis notes.

One of Nuance’s more recent announcements involved the launch of Dragon TV at January’s Consumer Electronics Show. Revis explains that this solution can be embedded in products such as televisions and TV remote controls to enable people to use natural language to find and/or record the programming they desire. That means that rather than scrolling through an on-demand menu or a lengthy electronic programming guide, users can simply say something like “find comedies with Meryl Streep” to locate the program, movie or general type of content they’re looking for.

Revis says Nuance has partnered with just about everyone in the consumer electronics industry on this technology, and that it announced a deal with LG the same week it did the mid-January interview with INTERNET TELEPHONY. That deal, he says, will enable LG to bring speech functionality to its Magic Remote in the first half of this year.

Natural language like that described above is what’s new and exciting in the speech recognition space, notes Revis, adding this is a key area of investment for Nuance. Of course, natural language has taken its lumps from some consumers who complain that it doesn’t always work as expected. Revis says if that’s the case it’s probably because people are trying to use Siri or other solutions to do things for which these solutions were not designed. That’s why Nuance and others are working to expand natural language to new and additional domains and languages, he says.

Despite any limitations speech recognition might have, Global Industry Analysts says that the “market for voice recognition systems and software is projected to witness unfazed developments with players in the space striving to take the technology to higher grounds by improving the ability of these systems to accurately recognize and respond to natural human speech.”

In addition to mobile phones, TVs and remote controls, Nuance’s speech recognition technology is embedded in consumer computers. In fact, at CES Intel announced plans to embed Nuance technology in its Ultrabook chipset. A different business unit within Nuance, meanwhile, addresses speech recognition relative to call center applications. If you caught the recent CNBC special “Customer (Dis)service,” you saw Nuance’s Dan Faulkner talk about a next-generation answering machine that really cares, in the words of the program.

“We train the system to understand what people are saying, but more importantly what do they mean,” Faulkner noted.

That’s just another example of natural language at work, explains Revis, who notes that call center automation and health care are the two arenas in which speech recognition is the most widely used today. Airlines, banks and many other companies leverage voice recognition in their call center applications, while on the health care side physicians use the technology to record data that can be channeled into their databases quickly and easily.

Research firm KLAS in a study released in February 2011 said the prospects for speech remain strong even in areas like medicine, in which the technology already is well accepted.

“The speech recognition market is ripe for healthy growth,” says Ben Brown, author of “Speech Recognition 2010: Vocalizing Benefits.” “Currently, less than one in four hospitals use the technology; however, in light of meaningful use and the benefits providers point out in this study, we expect it will assume a more prominent place in the role of clinical documentation.”

Edited by Jennifer Russell