The principal standards and standardized
APIs (application programming interfaces) that guide the operation and
interaction of components in a speech architecture are listed and described
below. The agency responsible for each standard or API is shown in
parentheses after the standard's name.
CCXML. Call Control eXtensible Markup
Language is designed to provide telephony call control support for dialog
systems. CCXML is intended to serve as an adjunct language for use with a
VoiceXML, SALT or other dialog implementation platform.
Definition Du Jour
Speaker-independence
In the early days of speech recognition technology, the end user of a
speech application had to 'train' the system to his or her voice,
improving the odds that the system would recognize it. Developers
realized the impracticality of this method and the nuisance of
having to continually 'retrain' the system for different
voices, particularly in enterprise applications. Today, many
applications are 'speaker-independent,' meaning that the
system can recognize a multitude of different voices and speech
patterns, eliminating the need for constant retraining.
HTTP. Hypertext Transfer Protocol is an
application-level protocol for distributed, collaborative, hypermedia
information systems. It is a generic, stateless protocol that can be used
for many tasks beyond its use for hypertext, such as name servers and
distributed object management systems, through extension of its request
methods, error codes and headers.
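As a rough illustration of the request methods, status codes and headers mentioned above, the following Python sketch builds a request message by hand and reads the status line of a canned response. The host name, path and the helper names `build_request` and `parse_status_line` are assumptions made for the example; no network traffic is involved.

```python
def build_request(method, path, host, headers=None):
    """Assemble the text of an HTTP/1.1 request with no body."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # A blank line ends the header section.
    return "\r\n".join(lines) + "\r\n\r\n"


def parse_status_line(response_text):
    """Split the first line of a response into version, code and reason."""
    status_line = response_text.split("\r\n", 1)[0]
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason


# Hypothetical request for a grammar file on a hypothetical host.
request = build_request(
    "GET", "/grammars/main.grxml", "example.com",
    {"Accept": "application/srgs+xml"},
)

# A canned response, standing in for what a server might send back.
sample_response = "HTTP/1.1 404 Not Found\r\nContent-Length: 0\r\n\r\n"
version, code, reason = parse_status_line(sample_response)
```

Because the protocol is stateless, each request built this way carries everything the server needs; nothing is remembered from one exchange to the next.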
H.323. H.323 is a standard that
specifies the components, protocols and procedures that provide multimedia
communication services (real-time audio, video and data communications)
over packet networks, including Internet Protocol (IP)-based
networks. H.323 is part of a family of recommendations that provide
multimedia communication services over a variety of networks.
JDBC. Java Database Connectivity is an
API that allows developers access to virtually any tabular data source
from the Java programming language. It provides cross-DBMS connectivity to
a wide range of SQL databases and, with the JDBC API, it also provides
access to other tabular data sources, such as spreadsheets or flat files.
ODBC. Open Database Connectivity
is a widely accepted API for database access. It is based on the
Call-Level Interface (CLI) specifications from X/Open and ISO/IEC for
database APIs and uses Structured Query Language (SQL) as its database
access language.
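The call-level style of database access can be sketched in Python. This is not JDBC or ODBC code: it uses Python's built-in `sqlite3` module as a stand-in to show the same general pattern of opening a connection, passing SQL with parameters, and fetching rows. The table, its columns and the helper `lookup_caller` are hypothetical.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for whatever data
# source a real ODBC or JDBC driver would reach.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE callers (id INTEGER PRIMARY KEY, name TEXT, pin TEXT)"
)
conn.executemany(
    "INSERT INTO callers (name, pin) VALUES (?, ?)",
    [("Alice", "1234"), ("Bob", "9876")],
)


def lookup_caller(connection, pin):
    """Return the caller name for a spoken or keyed PIN, or None."""
    row = connection.execute(
        "SELECT name FROM callers WHERE pin = ?", (pin,)
    ).fetchone()
    return row[0] if row else None


found = lookup_caller(conn, "9876")
missing = lookup_caller(conn, "0000")
```

The SQL text is passed to the driver as data with placeholders, which is the part of the pattern that lets one application talk to different database products.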
SALT. Speech Application Language Tags is
a platform-independent standard that makes possible multimodal and
telephony-enabled access to information, applications and Web services
from PCs, telephones, tablet PCs and wireless PDAs (personal digital
assistants). The standard extends existing mark-up languages such as HTML,
XHTML and XML.
SIP, RTP, MGCP. SIP (Session Initiation
Protocol) is a signaling protocol for Internet conferencing, telephony,
presence, events notification and instant messaging. RTP (Real-time
Transport Protocol) is a protocol for the transport of real-time data,
including audio and video. MGCP/MEGACO (Media Gateway Control Protocol)
addresses the relationship between the media gateway, which converts
circuit-switched voice to packet-based traffic, and the media gateway
controller (sometimes called a softswitch), which dictates the service
logic of that traffic.
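To make the RTP description more concrete, the Python sketch below packs and unpacks the 12-byte fixed RTP header (version, marker, payload type, sequence number, timestamp and synchronization source). The function names and sample values are assumptions for illustration; padding, header extensions and CSRC entries are left out.

```python
import struct

RTP_VERSION = 2
PCMU_PAYLOAD_TYPE = 0  # G.711 mu-law, a static payload type from RFC 3551


def pack_rtp_header(sequence, timestamp, ssrc,
                    payload_type=PCMU_PAYLOAD_TYPE, marker=False):
    """Build the fixed RTP header in network byte order."""
    byte0 = RTP_VERSION << 6  # no padding, no extension, zero CSRC entries
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack(
        "!BBHII",
        byte0,
        byte1,
        sequence & 0xFFFF,
        timestamp & 0xFFFFFFFF,
        ssrc & 0xFFFFFFFF,
    )


def unpack_rtp_header(header):
    """Decode the first 12 bytes of an RTP packet into a dictionary."""
    byte0, byte1, sequence, timestamp, ssrc = struct.unpack(
        "!BBHII", header[:12]
    )
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


# A timestamp of 160 corresponds to 20 ms of audio sampled at 8 kHz.
header = pack_rtp_header(sequence=1, timestamp=160, ssrc=0x12345678)
fields = unpack_rtp_header(header)
```

The sequence number and timestamp are what let a receiver reorder packets and play audio back at the right pace.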
SRGS. Speech Recognition Grammar
Specification defines the syntax for grammar representation intended for
use by speech recognizers and other grammar processors so that developers
can specify the words and patterns of words to be listened for by a speech
recognizer.
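A small example may help show what such a grammar looks like. The Python sketch below embeds a minimal grammar in the XML form of SRGS and lists the phrases a recognizer would listen for. The rule name, the phrases and the helper `phrases_for_rule` are hypothetical, and a real grammar processor does far more than this.

```python
import xml.etree.ElementTree as ET

SRGS_NS = "http://www.w3.org/2001/06/grammar"

# A hypothetical grammar with one public rule offering three choices.
GRAMMAR = f"""<grammar version="1.0" xmlns="{SRGS_NS}"
         xml:lang="en-US" mode="voice" root="drink">
  <rule id="drink" scope="public">
    <one-of>
      <item>coffee</item>
      <item>tea</item>
      <item>hot chocolate</item>
    </one-of>
  </rule>
</grammar>"""


def phrases_for_rule(grammar_xml, rule_id):
    """Return the item phrases listed under the rule with the given id."""
    root = ET.fromstring(grammar_xml)
    for rule in root.iter(f"{{{SRGS_NS}}}rule"):
        if rule.get("id") == rule_id:
            return [
                item.text.strip()
                for item in rule.iter(f"{{{SRGS_NS}}}item")
            ]
    return []


phrases = phrases_for_rule(GRAMMAR, "drink")
```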
SSML. Speech Synthesis Markup Language is
a rich, XML-based markup language for assisting the generation of
synthetic speech in Web and other applications. Its essential role is to
give authors of synthesizable content a standard way to control aspects of
speech output such as pronunciation, volume, pitch, rate, etc., across
different synthesis-capable platforms.
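As a sketch of how an application might control rate and volume, the Python below builds a short SSML prompt and parses it back to confirm it is well-formed. The helper `ssml_prompt` and the prompt wording are assumptions for the example; sending the markup to a synthesizer is platform-specific and not shown.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def ssml_prompt(lead_in, emphasized, rate="slow", volume="loud"):
    """Wrap part of a prompt in a prosody element with rate and volume."""
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang="en-US">'
        f"{escape(lead_in)}"
        f"<prosody rate={quoteattr(rate)} volume={quoteattr(volume)}>"
        f"{escape(emphasized)}</prosody>"
        "</speak>"
    )


# Text is escaped so that characters such as "&" stay valid XML.
prompt = ssml_prompt("Your balance is ", "forty-two dollars & ten cents")

# Parsing the result is a cheap check that the markup is well-formed.
root = ET.fromstring(prompt)
prosody = root.find(f"{{{SSML_NS}}}prosody")
```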
SS7/ISUP. Signaling System 7 is an
architecture for performing out-of-band signaling in support of the
call-establishment, billing, routing and information-exchange functions of
the PSTN (public switched telephone network). It identifies functions to
be performed by a signaling-system network and a protocol to enable their
performance. ISUP (ISDN User Part) defines the messages and protocol used
in the establishment and tear down of voice and data calls over the PSTN,
managing the trunk network on which they rely.
VoiceXML. VoiceXML (Voice eXtensible
Markup Language) is designed for creating audio dialogs that feature
synthesized speech, digitized audio, recognition of spoken and DTMF key
input, recording of spoken input, telephony and mixed-initiative
conversations. Its major goal is to bring the advantages of Web-based
development and content delivery to interactive voice response
applications.
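For a sense of what a VoiceXML dialog looks like, the Python sketch below embeds a minimal one-field form and pulls out each field's opening prompt. The form, the field name and the grammar file name `drink.grxml` are hypothetical; a real voice browser would fetch the grammar, play the prompts and run recognition.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"

# A hypothetical dialog: ask one question, then confirm the answer.
DIALOG = f"""<vxml version="2.0" xmlns="{VXML_NS}">
  <form id="order">
    <field name="drink">
      <prompt>Would you like coffee or tea?</prompt>
      <grammar src="drink.grxml" type="application/srgs+xml"/>
      <filled>
        <prompt>You chose <value expr="drink"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>"""

root = ET.fromstring(DIALOG)

# Map each field name to the text of its first (opening) prompt.
fields = {
    field.get("name"): field.find(f"{{{VXML_NS}}}prompt").text
    for field in root.iter(f"{{{VXML_NS}}}field")
}
```

The dialog is an ordinary document that a Web server can deliver, which is the Web-style development model the entry describes.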
WAP/WML. Wireless
Application Protocol is a set of protocols for delivering content to
wireless devices. Wireless Markup Language is a markup language based on
XML that is intended for use in specifying content and user interface for
narrow band devices, including cellular phones and pagers.
XHTML. eXtensible HyperText Markup Language
is a family of current and future document types and modules that
reproduce, subset and extend HTML 4. The XHTML document types are
XML-based and are ultimately designed to work in conjunction with
XML-based user agents.
XML. eXtensible Markup Language is a
simple, very flexible text format derived from SGML (Standard Generalized
Markup Language). Originally designed to meet the challenges of
large-scale electronic publishing, XML is also playing an increasingly
important role in the exchange of a wide variety of data on the Web and
elsewhere.
X+V. XHTML + Voice brings spoken
interaction to standard Web content by integrating a set of mature Web
technologies such as XHTML and XML Events with XML vocabularies that were
developed as part of the W3C Speech Interface Framework. The profile
includes voice modules that support speech synthesis, speech dialogs,
command and control, speech grammars and the ability to attach voice event
handlers.