Parakeet Squawks a Mighty Solution to Speech Recognition Accuracy

September 03, 2010

By Brendan B. Read, Senior Contributing Editor


Individual words need context to impart meaning. Speech recognition programs supply it, reports an article in The Economist, by attempting “to match whole chunks of speech with statistical models of phrases and sentences.” The rationale, it says, “is that by knowing statistical rules of thumb for the way in which words are usually put together—an abstract probabilistic approximation of grammar, if you will—it is possible to narrow the search when attempting to identify individual words.”


The story notes, for example, that a noun phrase will typically consist of a noun preceded by a modifier, such as an article and possibly also an adjective. So if part of a speech pattern sounds like “ball,” the odds of it actually being “ball” will increase if the utterances preceding it sound like “the” and “bouncy.”
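To make the “bouncy ball” example concrete, here is a minimal sketch of how context can shift the odds between similar-sounding words. It is a toy illustration, not the method of any particular recognizer: the candidate words, the acoustic scores, the bigram probabilities and the `rescore` helper are all invented for this example.

```python
# Toy illustration only: every number below is invented for the example.

# Hypothetical acoustic scores: how well each candidate matches the sound alone.
ACOUSTIC = {"ball": 0.40, "bawl": 0.35, "bull": 0.25}

# Hypothetical context probabilities P(word | previous word).
BIGRAM = {
    ("bouncy", "ball"): 0.30,
    ("bouncy", "bawl"): 0.01,
    ("bouncy", "bull"): 0.02,
}


def rescore(previous_word, acoustic=ACOUSTIC, bigram=BIGRAM, floor=1e-4):
    """Multiply acoustic and context scores, then normalise so they sum to 1."""
    combined = {
        word: score * bigram.get((previous_word, word), floor)
        for word, score in acoustic.items()
    }
    total = sum(combined.values())
    return {word: value / total for word, value in combined.items()}


if __name__ == "__main__":
    # Print candidates from most to least likely once "bouncy" is known.
    ranked = sorted(rescore("bouncy").items(), key=lambda kv: -kv[1])
    for word, probability in ranked:
        print(f"{word}: {probability:.3f}")
```

On sound alone the three candidates are close; once the preceding word is taken into account, “ball” pulls well ahead of the others.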

Yet while this method has enabled speech rec systems to move from the labs and bleeding-edge users to enterprises large and small, and into consumers’ hands via popular software such as Nuance’s Dragon NaturallySpeaking (my wife has promised me version 11 for Christmas), it is, as the article says, by no means perfect.

“When [speech rec] gets things wrong, it often does so spectacularly,” says the story. That is because words are easily misidentified, and misidentifying “even a single word can take the program off on the wrong path as it tries to predict what the rest of the phrase is likely to be.”

The logic is this: speech rec applications are, like all software, ultimately computer programs whose code comes down to the computation of zeros and ones: instructions to open and close electrical circuits. If an early input is wrong, whatever is computed from it will be wrong as well.

Fixing speech rec mistakes can be as laborious as correcting computer code. The Economist story reports that typical programs require users “to correct each word individually from a drop-down list of alternatives, or else to retype or reutter the words.”

Yet a solution now in development, called Parakeet, could take much of the pain out of the correction process. A creation of Per Ola Kristensson and Keith Vertanen at the University of Cambridge’s Computer Laboratory, and now at the prototype stage, it permits speech rec applications “to share their thoughts, as it were, with the user in order to speed up the correction process.”

Parakeet’s heart is a touch-screen-based interface for phones and other mobile devices. It shows the words, phrases or sentences that scored highest in the program’s statistical model, along with any close alternatives.

“This [functionality] allows the user to select alternatives easily, with a quick tap of the finger,” says the story. “More subtly, if none of the predicted sentences is entirely correct, yet collectively they contain the words that were spoken, the user can simply slide his finger across the appropriate words to link them up.”
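The tap-and-slide correction described above can be sketched with a simple data structure. This is only an illustrative sketch under assumptions: it supposes the recognizer's alternatives are arranged as one column of candidates per word slot, which the article does not confirm for Parakeet, and the `COLUMNS` data and both function names are hypothetical.

```python
# Hypothetical recognizer output: one column per word slot, each holding
# (word, score) alternatives sorted best-first. All values are invented.
COLUMNS = [
    [("the", 0.90), ("a", 0.10)],
    [("bouncy", 0.50), ("bounty", 0.40), ("bound", 0.10)],
    [("bull", 0.55), ("ball", 0.45)],
]


def apply_selection(columns, picks):
    """Build a sentence from the columns.

    `picks` maps a column index to the row the user tapped or slid across;
    any column the user did not touch keeps its top-scoring word.
    """
    words = []
    for index, alternatives in enumerate(columns):
        row = picks.get(index, 0)
        words.append(alternatives[row][0])
    return " ".join(words)


def best_guess(columns):
    """The sentence the recognizer would output with no user correction."""
    return apply_selection(columns, {})


if __name__ == "__main__":
    print(best_guess(COLUMNS))                # the bouncy bull
    print(apply_selection(COLUMNS, {2: 1}))   # the bouncy ball
```

Here the uncorrected output gets the last word wrong, but the right word is already sitting in the list of alternatives, so a single selection in the third column fixes the sentence without retyping or re-speaking.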

Parakeet uses an open-source speech-recognition program called Pocket Sphinx, developed at Carnegie Mellon University, but Kristensson reckons it would be easy to apply the same approach to commercially available programs like Nuance’s Dragon. So far Drs. Kristensson and Vertanen have carried out only limited trials on a handful of people, says the story. Yet the results have been spectacular: users achieved rates of around 22 words per minute (WPM), as compared with 16 WPM for conventional predictive methods.

“In a sense, all Parakeet is doing is allowing the user to see which alternative words or sentences the program would have predicted,” says The Economist piece. The difference, it reports, is that with existing applications the correct strings of words were, more often than not, recognized, points out Dr. Kristensson, “but rejected by the speech-recognition program on statistical grounds. Parakeet [in contrast] makes them all available to the user.”

The commercial benefits? Expect to see more such solutions fly off the shelves and into the hands (literally) of people like me who have to pump out a lot of text, often on the road, where it is neither safe nor legal to key it in.

“With the likes of Google, Nuance and Vlingo now offering mobile speech-recognition services for phones and the development of speech-driven systems for use in vehicles, Parakeet may be flying into a growing market,” concludes the feature article.


Brendan B. Read is TMCnet’s Senior Contributing Editor.

Edited by Ed Silverstein






