
SpeechWorld
August 2004

Can't Stand The Acronyms?
Customer Inter@ction Solutions Presents 'The Standards: A Primer'

By Dr. K. W. (Bill) Scholz, Ph.D, Unisys Corp.


The principal standards and standardized APIs (application program interfaces) that guide the operation and interaction of components in speech architecture are listed and described below. The agency responsible for each standard or API is shown in parentheses after the standard's name.

Definition Du Jour

Speaker-independence
In the early days of speech recognition technology, the end user of a speech application had to 'train' the system on his or her voice to improve the odds that the system would recognize it. Developers realized that this approach was impractical and that continually 'retraining' the system for different voices was a nuisance, particularly in enterprise applications. Today, many applications are 'speaker-independent,' meaning that the system can recognize a multitude of different voices and speech patterns, eliminating the need for constant retraining.

CCXML. Call Control eXtensible Markup Language is designed to provide telephony call control support for dialog systems. CCXML is intended to serve as an adjunct language for use with a VoiceXML, SALT or other dialog implementation platform.

HTTP. Hypertext Transfer Protocol is an application-level protocol for distributed, collaborative, hypermedia information systems. It is a generic, stateless protocol that can be used for many tasks beyond its use for hypertext, such as name servers and distributed object management systems, through extension of its request methods, error codes and headers.
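
The stateless request/response exchange described above can be sketched in a few lines of Python. This is only an illustration: it formats a request as text and parses a canned response, so no network is involved. The helper names (`build_request`, `parse_response`), the host `example.com` and the sample response are assumptions made for the example, not part of any library.

```python
def build_request(method, path, host, headers=None):
    """Format a minimal HTTP/1.1 request as text (hypothetical helper)."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # A blank line ends the header section.
    return "\r\n".join(lines) + "\r\n\r\n"


def parse_response(raw):
    """Split a raw response into (status code, reason, headers, body)."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    _version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers, body


# Each request carries everything the server needs; nothing is remembered
# between requests, which is what "stateless" means here.
request = build_request("GET", "/index.html", "example.com", {"Accept": "text/html"})
sample = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nContent-Length: 5\r\n\r\nhello"
status, reason, headers, body = parse_response(sample)
print(request.splitlines()[0])
print(status, reason, headers["content-type"], body)
```

The method, status code and headers are the extension points the definition mentions: new request methods, error codes and header fields can be added without changing this basic message shape.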

H.323. H.323 is a standard that specifies the components, protocols and procedures that provide multimedia communication services (real-time audio, video and data communications) over packet networks, including Internet Protocol (IP)-based networks. H.323 is part of a family of recommendations that provide multimedia communication services over a variety of networks.

JDBC. Java Database Connectivity is an API that gives developers access to virtually any tabular data source from the Java programming language. It provides cross-DBMS connectivity to a wide range of SQL databases, as well as access to other tabular data sources, such as spreadsheets or flat files.

ODBC. Open Database Connectivity is a widely accepted API for database access. It is based on the Call-Level Interface (CLI) specifications from X/Open and ISO/IEC for database APIs and uses Structured Query Language (SQL) as its database access language.
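
The call-level pattern described above (open a connection, send SQL text, fetch rows) can be sketched in Python. This example does not use ODBC itself: it uses the standard library's `sqlite3` module with an in-memory database as a stand-in, and the table `calls` and its columns are invented for illustration.

```python
import sqlite3

# An in-memory SQLite database stands in for an ODBC data source (assumption).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# SQL is the access language; the API only carries statements and results.
cur.execute("CREATE TABLE calls (caller TEXT, seconds INTEGER)")
cur.executemany(
    "INSERT INTO calls (caller, seconds) VALUES (?, ?)",
    [("alice", 120), ("bob", 45), ("alice", 30)],
)
conn.commit()

# Query with plain SQL and pull the result rows back through the cursor.
cur.execute(
    "SELECT caller, SUM(seconds) FROM calls GROUP BY caller ORDER BY caller"
)
totals = cur.fetchall()
print(totals)
conn.close()
```

Because the application only hands SQL strings to a connection object, the same code shape would apply to another database by swapping the connection, which is the portability a call-level interface aims for.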

SALT. Speech Application Language Tags is a platform-independent standard that makes possible multimodal and telephony-enabled access to information, applications and Web services from PCs, telephones, tablet PCs and wireless PDAs (personal digital assistants). The standard extends existing mark-up languages such as HTML, XHTML and XML.

SIP, RTP, MGCP. SIP (Session Initiation Protocol) is a signaling protocol for Internet conferencing, telephony, presence, events notification and instant messaging. RTP (Real-time Transport Protocol) is a protocol for the transport of real-time data, including audio and video. MGCP/MEGACO (Media Gateway Control Protocol) addresses the relationship between the media gateway, which converts circuit-switched voice to packet-based traffic, and the media gateway controller (sometimes called a softswitch), which dictates the service logic of that traffic. 
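
To make the RTP part of this entry concrete, here is a minimal Python sketch that packs and unpacks the 12-byte fixed RTP header (version, marker, payload type, sequence number, timestamp, SSRC). It assumes no padding, header extension or CSRC entries. The function names and the sample values (sequence 1, timestamp 160, the SSRC) are illustrative assumptions.

```python
import struct

RTP_VERSION = 2  # version field value used by current RTP


def pack_rtp_header(payload_type, sequence, timestamp, ssrc, marker=False):
    """Build the 12-byte fixed RTP header (no padding, extension or CSRCs)."""
    byte0 = RTP_VERSION << 6  # V=2, P=0, X=0, CC=0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack(
        "!BBHII",  # network byte order: two bytes, 16-bit seq, two 32-bit words
        byte0,
        byte1,
        sequence & 0xFFFF,
        timestamp & 0xFFFFFFFF,
        ssrc & 0xFFFFFFFF,
    )


def unpack_rtp_header(data):
    """Decode the fixed header fields from the first 12 bytes of a packet."""
    byte0, byte1, sequence, timestamp, ssrc = struct.unpack("!BBHII", data[:12])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


# Payload type 0 is PCMU (G.711 mu-law) in the standard audio/video profile.
header = pack_rtp_header(payload_type=0, sequence=1, timestamp=160, ssrc=0x1234ABCD)
packet = header + b"\xff" * 160  # 160 placeholder bytes = 20 ms of 8 kHz audio
fields = unpack_rtp_header(packet)
print(len(header), fields)
```

The sequence number and timestamp are what let a receiver reorder packets and schedule playback, which is why RTP suits real-time audio and video.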

SRGS. Speech Recognition Grammar Specification defines the syntax for grammar representation intended for use by speech recognizers and other grammar processors so that developers can specify the words and patterns of words to be listened for by a speech recognizer.
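
As a small illustration, the Python sketch below embeds a tiny grammar in SRGS XML form and lists the words a recognizer would be asked to listen for. The grammar content (a rule named `answer` with three alternatives) and the helper `alternatives` are made up for the example. A real recognizer does far more than read the list.

```python
import xml.etree.ElementTree as ET

SRGS_NS = "http://www.w3.org/2001/06/grammar"

# A hypothetical grammar: the root rule accepts one of three spoken words.
GRAMMAR = """
<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"
         xml:lang="en-US" mode="voice" root="answer">
  <rule id="answer" scope="public">
    <one-of>
      <item>yes</item>
      <item>no</item>
      <item>operator</item>
    </one-of>
  </rule>
</grammar>
""".strip()


def alternatives(grammar_xml):
    """Return (root rule id, words listed under that rule's <one-of>)."""
    root = ET.fromstring(grammar_xml)
    ns = {"g": SRGS_NS}
    rule_id = root.get("root")
    rule = root.find(f"g:rule[@id='{rule_id}']", ns)
    words = [item.text.strip() for item in rule.findall("g:one-of/g:item", ns)]
    return rule_id, words


rule_id, words = alternatives(GRAMMAR)
print(rule_id, words)
```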

SSML. Speech Synthesis Markup Language is a rich, XML-based markup language for assisting the generation of synthetic speech in Web and other applications. Its essential role is to give authors of synthesizable content a standard way to control aspects of speech output such as pronunciation, volume, pitch, rate, etc., across different synthesis-capable platforms.
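
The following Python sketch shows one way an application might generate such markup: it wraps a sentence in an SSML `speak` element with a `prosody` element that sets rate, pitch and volume. The function name `build_ssml` and the sample sentence are assumptions for illustration. How a given synthesizer renders these settings depends on the platform.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_ssml(text, rate="medium", pitch="medium", volume="medium", lang="en-US"):
    """Return an SSML document (as a string) that speaks `text` with the given prosody."""
    # Serialize the SSML namespace as the default namespace (no prefix).
    ET.register_namespace("", SSML_NS)
    speak = ET.Element(f"{{{SSML_NS}}}speak", {"version": "1.0", XML_LANG: lang})
    prosody = ET.SubElement(
        speak,
        f"{{{SSML_NS}}}prosody",
        {"rate": rate, "pitch": pitch, "volume": volume},
    )
    prosody.text = text
    return ET.tostring(speak, encoding="unicode")


ssml = build_ssml("Your call is important to us.", rate="slow", volume="loud")
print(ssml)
```

Because the controls are expressed in markup rather than in engine-specific calls, the same string could be handed to any synthesizer that accepts SSML.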

SS7/ISUP. Signaling System 7 is an architecture for performing out-of-band signaling in support of the call-establishment, billing, routing and information-exchange functions of the PSTN (public switched telephone network). It identifies functions to be performed by a signaling-system network and a protocol to enable their performance. ISUP (ISDN User Part) defines the messages and protocol used in the establishment and tear down of voice and data calls over the PSTN, managing the trunk network on which they rely. 

VoiceXML. VoiceXML (Voice eXtensible Markup Language) is designed for creating audio dialogs that feature synthesized speech, digitized audio, recognition of spoken and DTMF key input, recording of spoken input, telephony and mixed-initiative conversations. Its major goal is to bring the advantages of Web-based development and content delivery to interactive voice response applications.
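
To give a feel for what such a dialog looks like, the Python sketch below embeds a small VoiceXML document and lists each field with its prompt. The dialog itself (an `order` form with `size` and `pin` fields) and the grammar file name `size.grxml` are hypothetical. The script only inspects the markup and does not run a voice browser.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"

# A hypothetical two-field dialog: one field uses an external grammar,
# the other uses the built-in "digits" type for spoken or keyed input.
DIALOG = """
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="order">
    <field name="size">
      <prompt>What size would you like?</prompt>
      <grammar src="size.grxml" type="application/srgs+xml"/>
    </field>
    <field name="pin" type="digits">
      <prompt>Please enter or say your PIN.</prompt>
    </field>
  </form>
</vxml>
""".strip()


def list_fields(vxml_text):
    """Return (field name, prompt text) pairs in document order."""
    root = ET.fromstring(vxml_text)
    ns = {"v": VXML_NS}
    result = []
    for field in root.findall("v:form/v:field", ns):
        prompt = field.find("v:prompt", ns)
        result.append((field.get("name"), prompt.text.strip()))
    return result


fields = list_fields(DIALOG)
for name, prompt in fields:
    print(f"{name}: {prompt}")
```

Because the dialog is an ordinary XML document, it can be served, versioned and generated with the same tools used for Web pages, which is the Web-based development advantage the definition refers to.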

WAP/WML. Wireless Application Protocol is a protocol suite for delivering content to narrowband devices, including cellular phones and pagers. Wireless Markup Language is its XML-based markup language, intended for specifying content and user interface on those devices.

XHTML. eXtensible HyperText Markup Language is a family of current and future document types and modules that reproduce, subset and extend HTML 4. The XHTML document types are XML-based and are ultimately designed to work in conjunction with XML-based user agents.

XML. eXtensible Markup Language is a simple, very flexible text format derived from SGML (Standard Generalized Markup Language). Originally designed to meet the challenges of large-scale electronic publishing, XML is also playing an increasingly important role in the exchange of a wide variety of data on the Web and elsewhere.

X+V. XHTML + Voice brings spoken interaction to standard Web content by integrating a set of mature Web technologies, such as XHTML and XML Events, with XML vocabularies that were developed as part of the W3C Speech Interface Framework. The profile includes voice modules that support speech synthesis, speech dialogs, command and control, speech grammars and the ability to attach voice event handlers.
