How many of you remember the scene from the movie L.A. Story, in which Steve Martin hooks
up his new telephone that features dial-by-voice? In case you've forgotten, or haven't
seen the movie, Steve's character picks up the handset and says, "Call Mom." He
hears the other phone ringing, but is surprised when he reaches not his mother but
a pizza place. Obviously speech rec applications have improved dramatically since this
film was made. (Or maybe the mix-up was because Steve Martin did not speak clearly
enough!) But where will this technology take us in the coming months and years of the new
century?
I thought that there was no better way to glimpse the future than to pry inside the
minds of just a few industry leaders in this area. In the past few years, we've all
witnessed advancements in speech recognition technology. IVR has made it easier and faster
for us to get information over the phone. Personal assistants "listen" to our
commands and read our e-mail to us over the phone. And lately we've even seen
voice-enabled Web browsing. But what other kinds of changes will speech recognition
technology make in the way that people and machines communicate?
CH-CH-CH-CHANGES
According to Joe Yaworski, vice president and general manager of Unisys Natural Language
Understanding Business Initiative, speech will soon be everywhere -- from your PC, to your
VCR, and even your toaster. "Eventually speech will be one of the modes of
interfacing and controlling all different types of devices and services," he said.
"Not necessarily the exclusive mode, but certainly one of the preferred modes. Speech
will be used where it can simplify and improve the interface with a device, or where
hands-free operation is needed, or where access to the device can be provided over the
phone."
One of the problems that people have found with speech recognition products is that
they cannot yet communicate with machines naturally, without having to be computer
experts. "Eventually it will become easier to pick up the phone or talk to your
computer in a natural way to retrieve the information you want or to perform tasks,"
said Heather Howland, marketing director of Phonetic Systems.
"People want to be able to get information quickly and easily," she said.
"Too often people sit on hold for long periods of time just to get a simple answer.
Or they will navigate through mazes on a Web site or software application to perform a
simple task. Within the next few years, speech recognition will be much more commonplace.
While it is already starting to infiltrate the market, you will find more and more
applications, both on the e-commerce side and the telephony side, that will be driven with
speech recognition."
ADVANTAGES
There are plenty of advantages to using speech applications. Steve Gladstone, vice
president and general manager of Hammer Technologies, cites portability and accessibility
as two main benefits. "When I think of my 78-year-old mother, she hates both
computers and cell phones," he said. "Speech recognition will allow her to talk
to all sorts of applications that were previously unavailable to her."
Steve Ehrlich, vice president of marketing at Nuance, said, "For the consumers,
it's better and more convenient service and access to information that might have once
only been available over the Web. For enterprises, it comes down to the ability to provide
better service, often at a fraction of the cost of live operators. For dot-com companies,
speech offers a way to broaden the reach of their products, without building and staffing
a large call center. In many cases, speech also provides a competitive advantage."
HURDLING THE OBSTACLES
Yet, despite the advantages, speech recognition is still in its early stages of
development. Not all of the technology issues are being met. Bill Ledingham, vice
president of product development for SpeechWorks International, said, "The technology is
here and has proven itself in a number of large-scale customer-facing deployments at
mainstream companies over the past several years. What SpeechWorks is working on now is
making it less expensive to deploy (through increased processor performance and advanced
application tools), and adding capabilities to handle even more complex, natural language
dialogs. What we are also doing now is marrying the Web and the telephone through products
like SpeechSite. This proves that the technology is sophisticated enough to conduct major
transactions and handle multiple requests from many different callers just as you find
on mainstream e-commerce sites."
There is always room for improvement, and speech recognition technology still has a few
hurdles to leap. According to Ehrlich, "From an end-user standpoint, there are a
number of people who have never used a speech system and don't really trust that it works.
As the number of deployed applications grows, this will cease to be a problem. From an
enterprise standpoint, the biggest problem is the lack of skilled speech developers.
Reusable components (such as Nuance SpeechObjects), which encapsulate the voice interface,
the hardest part of the design and development process, should help to lower this
obstacle."
Yaworski also cites developers as an obstacle: "If we have to rely only on the
developers employed by the speech recognizer vendors, then the speech market will never
grow. The Unisys NLSA toolkit is intended to make speech so easy that any reasonably good
applications developer can now build a workable speech application."
Howland, however, sees that there are different bars to lower, including people's
expectations. "Many expect the futuristic vision of being able to have free-form
conversations with machines," she said. "But in reality, the technology just
isn't there yet. Right now the applications are more structured than most people want them
to be."
According to Howland, obstacles have been fostered and set by the industry, as well.
"There is too much talk about the underlying technology, and not enough about using
the technologies with applications to make them successful. Unfortunately, there are many
speech vendors in the industry that continue to dash the hopes of their customer base with
unrealized expectations. This becomes an impediment to the growth of an industry. Vendors
need to be more open about what they can actually deliver, and not what they want to be
able to deliver. Customers need to be careful, and should do their homework before
committing to a vendor. Asking the right questions can mean the difference between getting
a product that solves a problem, versus a product that creates one."
WHERE DO WE GO FROM HERE?
It seems that the Internet is the hotbed from which some new speech rec seedlings will
begin sprouting. Yaworski said, "The Internet is simply exploding, with the volume of
business being conducted online growing by leaps and bounds. Yet, even here in the United
States, two-thirds of the people do not have access to an Internet browser. Even those who
do have a browser only have access when sitting at their desk. There are ten times more
phones in the world than browsers. Using speech technology to voice-enable Web sites so
that they can be accessed by telephone would immediately expand the market for Web-based
commerce."
Ehrlich agrees. "The next few years will see the birth of the voice Web and that
will fuel broad adoption of the technology by carriers and voice portals around the
world," he said.
According to Ledingham, "We are seeing recent enormous demand from the dot-com
companies and mainstream companies that recognize the need to expand their e-business
channel strategies. Speech recognition gives these companies a way to reach customers who
might not have access to the Internet, while at the same time, offering those customers
the same self-service options available on the Web."
Moreover, Ledingham believes that dot-com companies will also need to differentiate
themselves in a very crowded market, and they will turn to speech systems to accomplish
that goal.
CONCLUSION
Certainly, a lot is going on in the area of speech recognition. Other companies like
Lernout & Hauspie, Parlance, Edify, Philips Speech Processing, Dragon Systems, and
Vodavi-CT (to name but a few) are helping to advance speech technology even further.
And if the industry trend toward speech-enabling Web sites continues, it could change
not only the reach of the Web, but also how the telephone is used. The marriage of the
two may seem unlikely, but it's happening, and as speech technology and the Web grow
(and grow up) together, it's very likely that the two will live together happily.