
November 1997


TAPI 3.0: Excerpts From The Microsoft Whitepaper

As telephony and call control become more common at the desktop, a general telephony interface is needed to enable applications to access all the telephony options available on any machine. Additionally, it is imperative that the media or data on a call is available to applications in a standard manner.

Microsoft’s TAPI 3.0 provides simple and generic methods for making connections between two or more machines, and accessing any media streams involved in that connection. It abstracts call-control functionality to allow different, and seemingly incompatible, communication protocols to expose a common interface to applications. Much of TAPI’s design anticipates IP telephony, a demand poised for explosive growth as organizations begin an historic shift from expensive and inflexible circuit-switched public telephone networks to intelligent, flexible, and inexpensive IP networks. Now in its third major version, TAPI is suitable for quick and easy development of IP telephony applications.

INSIDE TAPI 3.0
TAPI 3.0 integrates multimedia stream control with legacy telephony. It is an evolution of the TAPI 2.1 API to the COM model. Besides supporting classic telephony providers, TAPI 3.0 supports standard H.323 conferencing and IP multicast conferencing. TAPI 3.0 utilizes the Windows NT 5.0 Active Directory service to simplify deployment within an organization, and it supports Quality of Service (QoS) features to improve conference quality and network manageability.

There are four major components to TAPI 3.0:

TAPI 3.0 COM API: In contrast to TAPI 2.1, the TAPI 3.0 API is implemented as a suite of Component Object Model (COM) objects. Moving TAPI to the object-oriented COM model allows component upgrades of TAPI features. It also allows developers to write TAPI-enabled applications in any language, such as Java, Visual Basic, or C/C++.

TAPI Server: The TAPI Server process (TAPISRV.EXE) abstracts the TSPI (TAPI Service Provider Interface) from TAPI 3.0 and TAPI 2.1 and maintains the internal state of TAPI. This allows TAPI 2.1 Telephony Service Providers to be used with TAPI 3.0.

Telephony Service Providers (TSPs): These are responsible for resolving the protocol-independent call model of TAPI into protocol-specific call control mechanisms. TAPI 3.0 provides backward compatibility with TAPI 2.1 TSPs. Two IP telephony service providers (and their associated MSPs) ship by default with TAPI 3.0: the H.323 TSP and the IP Multicast Conferencing TSP, which are discussed later in this document.

Media Stream Providers: TAPI 3.0 provides a uniform way to access the media streams in a call, supporting the DirectShow API as the primary media stream handler. TAPI Media Stream Providers (MSPs) implement DirectShow interfaces for a particular TSP and are required for any telephony service that makes use of DirectShow streaming. Generic streams are handled by the application.

CALL CONTROL MODEL

There are five objects in the TAPI 3.0 API:

  • TAPI.
  • Address.
  • Terminal.
  • Call.
  • CallHub.

The TAPI object is the application’s entry point to TAPI 3.0. This object represents all telephony resources to which the local computer has access, allowing an application to enumerate all local and remote addresses. An Address object represents the origination or destination point for a call. Address capabilities, such as media and terminal support, can be retrieved from this object. An application can wait for a call on an Address object, or can create an outgoing call object from an Address object.

A Terminal object represents the sink, or renderer, at the termination or origination point of a connection. The Terminal object can map to hardware used for human interaction, such as a telephone or microphone, but can also be a file or any other device capable of receiving input or creating output. The Call object represents an address’s connection between the local address and one or more other addresses. (This connection can be made directly or through a CallHub.) The Call object can be imagined as a first-party view of a telephone call. All call control is done through the Call object. There is a call object for each member of a CallHub.

The CallHub object represents a set of related calls. A CallHub object cannot be created directly by an application; it is created indirectly when incoming calls are received through TAPI 3.0. Using a CallHub object, a user can enumerate the other participants in a call or conference, and possibly (because of the location-independent nature of COM) perform call control on the remote Call objects associated with those users, subject to sufficient permissions.
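The relationships among the five objects can be sketched in a few lines of Python. This is only a toy model for illustration: the class names mirror the objects described above, but every method and field name here (enumerate_addresses, create_call, select_terminal, and so on) is an assumption, not the actual TAPI 3.0 COM interface.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Terminal:
    """A sink or renderer, such as a microphone, speaker, or file."""
    name: str
    media: str  # "audio" or "video"


@dataclass
class Address:
    """Origination or destination point for a call."""
    uri: str
    media: List[str]

    def create_call(self, remote: str) -> "Call":
        # Outgoing calls are created from an Address object.
        return Call(local=self, remote=remote)


@dataclass
class Call:
    """First-party view of a connection; all call control happens here."""
    local: Address
    remote: str
    state: str = "idle"
    terminals: List[Terminal] = field(default_factory=list)

    def select_terminal(self, terminal: Terminal) -> None:
        self.terminals.append(terminal)

    def connect(self) -> None:
        self.state = "connected"

    def disconnect(self) -> None:
        self.state = "disconnected"


@dataclass
class CallHub:
    """A set of related calls, one Call per participant.

    In TAPI 3.0 the hub is created by TAPI itself, never by the
    application; this toy model builds one directly for brevity.
    """
    calls: List[Call] = field(default_factory=list)

    def participants(self) -> List[str]:
        return [call.local.uri for call in self.calls]


@dataclass
class Tapi:
    """Entry point: every telephony resource the machine can reach."""
    addresses: List[Address] = field(default_factory=list)

    def enumerate_addresses(self, media: Optional[str] = None) -> List[Address]:
        if media is None:
            return list(self.addresses)
        return [a for a in self.addresses if media in a.media]
```

An application would start at the Tapi object, pick an Address that supports the media it needs, create a Call from it, select Terminals, and then connect.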

MEDIA STREAMING
The Windows operating system provides an extensible framework for efficient control and manipulation of streaming media called the DirectShow API. DirectShow, through its exposed COM interfaces, provides TAPI 3.0 with unified stream control.

At the heart of the DirectShow services is a modular system of pluggable components called filters, arranged in a configuration called a filter graph. A component called the filter graph manager oversees the connection of these filters and controls the stream’s data flow. Each filter’s capabilities are described by a number of special COM interfaces called pins. Each pin instance can consume or produce streaming data, such as digital audio. While COM objects are usually exposed in user mode programs, the DirectShow streaming architecture includes an extension to the Windows driver model that allows the connection of media streams directly at the device driver level.

These high-performance streaming extensions to the Windows driver model avoid user-to-kernel mode transitions, and allow efficient routing of data streams between different hardware components at the device driver level. Each kernel mode filter is mirrored by a corresponding user mode proxy that facilitates connection setup and can be used to control hardware-specific features.

DirectShow network filters extend the streaming architecture to machines connected on an IP network. The Real-time Transport Protocol (RTP), designed to carry real-time data over connectionless networks, transports TAPI media streams, and provides appropriate time stamp information. TAPI 3.0 includes a kernel mode RTP network filter. TAPI 3.0 utilizes this technology to present a unified access method for the media streams in multimedia calls. Applications can route these streams by manipulating corresponding filter graphs; they can also easily connect streams from multiple calls for bridging and conferencing capabilities.
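The filter graph idea can be illustrated with a small Python sketch. It is a conceptual model only and does not use the DirectShow interfaces: the Filter and FilterGraph classes, the build_send_graph helper, and the toy "codec" that simply drops every other byte are all hypothetical stand-ins chosen for illustration.

```python
from typing import Any, Callable, List, Optional


class Filter:
    """A processing stage; `transform` stands in for the filter's real work."""

    def __init__(self, name: str, transform: Callable[[Any], Any]) -> None:
        self.name = name
        self.transform = transform
        self.downstream: Optional["Filter"] = None

    def push(self, sample: Any) -> Any:
        # Process the sample, then hand the result to the next filter, if any.
        out = self.transform(sample)
        if self.downstream is not None:
            return self.downstream.push(out)
        return out


class FilterGraph:
    """Plays the role of the filter graph manager: it wires filters together."""

    def __init__(self) -> None:
        self.filters: List[Filter] = []

    def chain(self, *filters: Filter) -> None:
        # Connect each filter's output to the next filter's input.
        for upstream, downstream in zip(filters, filters[1:]):
            upstream.downstream = downstream
        self.filters = list(filters)

    def run(self, sample: Any) -> Any:
        return self.filters[0].push(sample)


def build_send_graph(sent: list) -> FilterGraph:
    """Capture -> codec -> RTP packetizer -> network sink (all toy stages)."""
    graph = FilterGraph()
    graph.chain(
        Filter("capture", lambda s: s),
        Filter("codec", lambda s: s[::2]),  # toy compression: drop every other byte
        Filter("rtp", lambda s: {"seq": len(sent), "payload": s}),
        Filter("network sink", lambda p: sent.append(p) or p),
    )
    return graph
```

Rerouting a stream in this model amounts to calling chain with a different sequence of filters, which is loosely what an application does when it manipulates a filter graph.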

TAPI 3.0 H.323 TSP
The H.323 Telephony Service Provider (TSP), along with its associated Media Stream Provider, allows TAPI-enabled applications to engage in multimedia sessions with any H.323-compliant terminal on the local-area network. Specifically, the H.323 TSP implements the H.323 signaling stack. The TSP accepts a number of different address formats, including name, machine name, and e-mail address. The H.323 MSP is responsible for constructing the DirectShow filter graph for an H.323 connection (including the RTP, RTP payload handler, codec, sink, and renderer filters).

INTEGRATION WITH WINDOWS NT 5.0 ACTIVE DIRECTORY
H.323 telephony is complicated by the reality that a user’s network address (in this case, a user’s IP address) is highly volatile and cannot be counted on to remain unchanged between H.323 sessions. The TAPI H.323 TSP utilizes the services of the Windows NT Active Directory to perform user-to-IP address resolution. Specifically, user-to-IP mapping information is stored and continually refreshed using the Internet Locator Service (ILS) Dynamic Directory, a real-time server component of the Active Directory.
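A directory whose entries must be continually refreshed can be modeled as a table of mappings that expire unless re-registered. The sketch below is an assumption about how such a scheme might work, not a description of ILS itself: the DynamicDirectory class, its method names, and the time-to-live mechanism are all hypothetical.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class DynamicDirectory:
    """Toy user-to-IP directory whose entries expire unless refreshed."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so the expiry logic can be tested
        self.entries: Dict[str, Tuple[str, float]] = {}

    def register(self, user: str, ip: str) -> None:
        # Registering again refreshes the entry and may change the address.
        self.entries[user] = (ip, self.clock() + self.ttl)

    def resolve(self, user: str) -> Optional[str]:
        entry = self.entries.get(user)
        if entry is None:
            return None
        ip, expires = entry
        if self.clock() >= expires:
            # Stale mapping: the client stopped refreshing, so drop it.
            del self.entries[user]
            return None
        return ip
```

A client that moves to a new IP address simply registers again; a client that goes offline disappears from the directory once its entry times out.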

IP MULTICAST CONFERENCING IN TAPI 3.0
IP Multicast is an extension to IP that allows for efficient group communication. IP Multicast arose out of the need for a lightweight, scalable conferencing solution that solved the problems associated with real-time traffic over a datagram, “best-effort” network. There are many advantages to using IP Multicast: scalability, fault tolerance, robustness, and ease of setup. The IP Multicast conferencing model incorporates the following key features:

  • No global coordination is needed to add and remove members from a conference.
  • To reach a multicast group, a user sends data to a single multicast IP address. No knowledge of the other users in a group is necessary.
  • To receive data, users register their interest in a particular multicast IP address with a multicast-aware router. No knowledge of the other users in a group is necessary.
  • Routers hide the multicast implementation details from the user.
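The send and receive sides of this model map onto ordinary UDP sockets. The following Python sketch uses the standard socket and ipaddress modules; the helper names (is_multicast, membership_request, open_receiver, send_to_group) are illustrative choices, and the code is independent of TAPI.

```python
import ipaddress
import socket


def is_multicast(address: str) -> bool:
    """True for IPv4 class D addresses (224.0.0.0 through 239.255.255.255)."""
    return ipaddress.ip_address(address).is_multicast


def membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Build the 8-byte ip_mreq structure: group address, then local interface."""
    if not is_multicast(group):
        raise ValueError(f"{group} is not a multicast address")
    return socket.inet_aton(group) + socket.inet_aton(interface)


def open_receiver(group: str, port: int) -> socket.socket:
    """Register interest in a group; the host and routers handle the rest."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock


def send_to_group(payload: bytes, group: str, port: int) -> None:
    """Send to the single group address; no list of members is needed."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (group, port))
```

Note that neither side ever names another participant: the sender knows only the group address, and the receiver knows only that it wants traffic for that address.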

TAPI 3.0 IP MULTICAST CONFERENCING TSP
The IP Multicast Conferencing TSP is chiefly responsible for resolving conference names to IP multicast addresses, using the Session Description Protocol (SDP) conference descriptors stored in the ILS Dynamic Directory Conference Server. It is complemented by the Rendezvous conference controls, described later in this document. The IP Multicast Conferencing MSP is responsible for constructing an appropriate DirectShow filter graph for an IP multicast connection (including RTP, RTP payload handler, codec, sink, and renderer filters).

TAPI 3.0 uses the IETF standard Session Description Protocol (SDP) to advertise IP multicast conferences across the enterprise. SDP descriptors are stored in the Windows NT Active Directory, specifically in the ILS Dynamic Directory Conference Server. In contrast to the Dynamic Directory servers utilized by the H.323 TSP, there is only one ILS Conference Server per enterprise: conference announcements are not continually refreshed and therefore consume little bandwidth.

TAPI 3.0 RENDEZVOUS CONTROLS
The Rendezvous Controls are a set of COM components that abstract the concept of a conference directory, providing a mechanism to advertise new multicast conferences and to discover existing ones. They provide a common schema (SDP) for conference announcement, as well as scriptable interfaces, authentication, encryption, and access control features.

A session description is broken into three main parts: a single Session Description, zero or more Time Descriptions, and zero or more Media Descriptions. The Session Description contains global attributes that apply to the whole conference or all media streams. Time Descriptions contain conference start, stop, and repeat time information, while Media Descriptions contain details that are specific to a particular media stream.
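The three-part structure can be made concrete with a small parser. The sketch below assumes the standard SDP text format of one type=value pair per line, with t= opening a time description and m= opening a media description; the parse_sdp function and the sample announcement are illustrative and are not part of the Rendezvous Controls.

```python
from typing import Dict, List, Tuple

Field = Tuple[str, str]


def parse_sdp(text: str) -> Dict[str, list]:
    """Split an SDP description into session, time, and media parts."""
    session: List[Field] = []
    times: List[List[Field]] = []
    media: List[List[Field]] = []
    for raw in text.strip().splitlines():
        line = raw.strip()
        if not line:
            continue
        kind, _, value = line.partition("=")
        if kind == "m":
            media.append([(kind, value)])        # start a new media description
        elif media:
            media[-1].append((kind, value))      # belongs to the current media
        elif kind == "t":
            times.append([(kind, value)])        # start a new time description
        elif kind == "r" and times:
            times[-1].append((kind, value))      # repeat info for the last t=
        else:
            session.append((kind, value))        # global, session-level field
    return {"session": session, "times": times, "media": media}


EXAMPLE = """
v=0
o=alice 2890844526 2890842807 IN IP4 10.0.0.5
s=Weekly staff meeting
c=IN IP4 224.2.17.12/127
t=2873397496 2873404696
a=recvonly
m=audio 49170 RTP/AVP 0
m=video 51372 RTP/AVP 31
a=rtpmap:31 H261/90000
"""
```

Calling parse_sdp(EXAMPLE) yields five session-level fields (including the a=recvonly attribute, which applies to the whole conference), one time description, and two media descriptions, the second of which carries its own rtpmap attribute.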

While traditional IP multicast conferences operating over the MBONE (IP Multicast Backbone) have advertised conferences using a push model based on the Session Announcement Protocol (SAP), TAPI 3.0 utilizes a pull-based approach using Windows NT Active Directory services. This approach offers numerous advantages, among them bandwidth conservation and ease of administration.

CONFERENCE SECURITY MODEL
TAPI 3.0’s conference security system addresses who can create, delete, and view conference announcements. The security system also serves to prevent conference eavesdropping. TAPI 3.0 utilizes the security features of the Windows NT Active Directory and LDAP to provide for secure conferencing over insecure networks such as the Internet. Each object in the Active Directory can be associated with an Access Control List (ACL) specifying object access rights on a user or group basis. By associating ACLs with SDP conference descriptors, conference creators can specify who can enumerate and view conference announcements. User authentication is provided by the Windows NT security subsystem.
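The effect of attaching an ACL to a conference descriptor can be sketched as a lookup keyed by user or group. This is a deliberately simplified model: the can_access function, the right names ("view", "delete"), and the dictionary layout are assumptions for illustration and do not reflect the actual Active Directory ACL format.

```python
from typing import Dict, Iterable, Set


def can_access(acl: Dict[str, Set[str]], user: str,
               groups: Iterable[str], right: str) -> bool:
    """Grant `right` if the user, or any group the user belongs to, holds it."""
    principals = [user, *groups]
    return any(right in acl.get(principal, set()) for principal in principals)


# Hypothetical ACL attached to one conference announcement.
conference_acl = {
    "alice": {"view", "delete"},   # the conference creator
    "engineering": {"view"},       # a group allowed to see the announcement
}
```

With this ACL, a member of the engineering group can enumerate and view the announcement but cannot delete it, and a user who matches no entry does not see the conference at all.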

QoS AND TAPI 3.0
Quality of Service (QoS) in TAPI 3.0 is handled through the DirectShow RTP filter, which negotiates bandwidth capabilities with the network based on the requirements of the DirectShow codecs associated with a particular media stream. These requirements are indicated to the RTP filter by the codecs via its own QoS interface. The RTP filter then uses the COM Winsock2 QoS interfaces to indicate, in an abstract form, its QoS requirements to the Winsock2 QoS service provider (QoS SP). The QoS SP, in turn, invokes a number of varying QoS mechanisms appropriate for the application, the underlying media, and the network, in order to guarantee appropriate end-to-end QoS. These mechanisms include:

  • The Resource Reservation Protocol (RSVP).
  • Local Traffic Control (Packet Scheduling, 802.1p, and appropriate layer 2 signaling mechanisms).
  • IP Type of Service and DTR header settings.

RSVP
The Resource Reservation Protocol (RSVP) is an IETF standard designed to support resource (for example, bandwidth) reservations through networks of varying topologies and media. Through RSVP, a user’s Quality of Service requests are propagated to all routers along the data path, allowing the network to reconfigure itself (at all network levels) to meet the desired level of service.

Local Traffic Control
Packet Scheduling: This mechanism can be used in conjunction with RSVP (if the underlying network is RSVP-enabled) or without RSVP. Traffic is identified as belonging to one flow or another, and packets from each flow are scheduled in accordance with the traffic control parameters for the flow. These parameters generally include a scheduled rate (token bucket parameter) and some indication of priority. The former is used to pace the transmission of packets to the network. The latter is used to determine the order in which packets should be submitted to the network when congestion occurs.
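The pacing role of the token bucket parameter can be shown with a minimal sketch. The TokenBucket class below is a generic textbook token bucket, not the Windows packet scheduler; the parameter names and the choice to pass the current time explicitly are assumptions made to keep the example deterministic. Priority ordering is not modeled here.

```python
class TokenBucket:
    """Pace a flow to `rate` bytes per second with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float) -> None:
        self.rate = rate            # tokens (bytes) added per second
        self.capacity = capacity    # maximum burst size in bytes
        self.tokens = capacity      # start with a full bucket
        self.last = 0.0             # time of the previous update, in seconds

    def allow(self, size: int, now: float) -> bool:
        """Return True if a packet of `size` bytes may be sent at time `now`."""
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size     # conforming packet: spend the tokens
            return True
        return False                # non-conforming: hold or queue the packet
```

For a flow limited to 1000 bytes per second with a 1500-byte burst, a full-size packet at time 0 drains the bucket; a 500-byte packet a quarter of a second later must wait because only 250 tokens have accumulated; by the one-second mark enough tokens are available again.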

802.1p: Traffic control can also be used to determine the 802.1 User Priority value (a MAC header field used to indicate relative packet priority) to be associated with each transmitted packet. 802.1p-enabled switches can then give preferential treatment to certain packets over others, providing additional Quality of Service support at the data link layer level.
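On Ethernet, the user priority value travels in the three high-order bits of the 802.1Q tag that follows the source MAC address. The sketch below packs and unpacks that four-byte tag; the function names are illustrative, and real applications would normally leave this to the network stack and driver rather than build tags by hand.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that identifies an 802.1Q tag


def vlan_tag(priority: int, vlan_id: int = 0, drop_eligible: bool = False) -> bytes:
    """Build the 4-byte 802.1Q tag: TPID, then priority, DEI, and VLAN ID."""
    if not 0 <= priority <= 7:
        raise ValueError("user priority is a 3-bit value (0-7)")
    if not 0 <= vlan_id <= 0xFFF:
        raise ValueError("VLAN ID is a 12-bit value (0-4095)")
    tci = (priority << 13) | (int(drop_eligible) << 12) | vlan_id
    return struct.pack("!HH", TPID_8021Q, tci)


def user_priority(tag: bytes) -> int:
    """Extract the 3-bit user priority from an 802.1Q tag."""
    _tpid, tci = struct.unpack("!HH", tag)
    return tci >> 13
```

A switch that honors 802.1p reads the same three bits that user_priority extracts here and uses them to choose an output queue.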

Layer 2 Signaling Mechanisms: In response to Winsock 2 QoS APIs, the QoS service provider may invoke additional traffic control mechanisms depending on the specific underlying data link layer. It may signal an underlying ATM network, for instance, to set up an appropriate virtual circuit for each flow. When the underlying media is a traditional 802 shared media network, the QoS service provider may extend the standard RSVP mechanism to signal a Subnet Bandwidth Manager (SBM). The SBM provides centralized bandwidth management on shared networks.

IP Type Of Service
Each IP packet contains a three-bit Precedence field, which indicates the priority of the packet. An additional field can be used to indicate a delay, throughput, or reliability preference to the network. Local traffic control can be used to set these bits in the IP headers of packets on particular flows. As a result, packets belonging to a flow will be treated appropriately later by layer-three devices on the network. These fields are analogous to 802.1p priority settings but are interpreted by higher layer network devices.
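The bit layout of the Type of Service byte can be worked through in code. The sketch follows the classic IPv4 header definition (three precedence bits, then one bit each for delay, throughput, and reliability); the function names are illustrative, and apply_tos assumes a platform whose socket module exposes IP_TOS.

```python
import socket


def tos_byte(precedence: int, low_delay: bool = False,
             high_throughput: bool = False, high_reliability: bool = False) -> int:
    """Compose the IPv4 Type of Service byte from its component fields."""
    if not 0 <= precedence <= 7:
        raise ValueError("precedence is a 3-bit value (0-7)")
    value = precedence << 5           # bits 7-5: precedence
    if low_delay:
        value |= 0x10                 # bit 4: D (minimize delay)
    if high_throughput:
        value |= 0x08                 # bit 3: T (maximize throughput)
    if high_reliability:
        value |= 0x04                 # bit 2: R (maximize reliability)
    return value


def apply_tos(sock: socket.socket, value: int) -> None:
    """Ask the stack to mark outgoing packets (assumes IP_TOS is available)."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, value)
```

For example, precedence 5 with the low-delay bit set gives 0xA0 | 0x10 = 0xB0, a marking a router could use to favor interactive audio.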

ENTERPRISE DEPLOYMENT OF TAPI 3.0
TAPI 3.0 has been designed to scale from the smallest business up to the largest organizations, while at the same time taking advantage of the Windows NT Active Directory to bring IP telephony to the enterprise.

The ILS Dynamic Directory Servers and the ILS Dynamic Directory Conference Server provide functionality for point-to-point and multiparty conferencing. IP telephony clients can utilize video and audio capture equipment, but can also support legacy telephones through the use of a PSTN add-in card. The IP/PSTN Gateway digitizes incoming analog voice calls from PSTN lines and encapsulates them in H.323 streams, and vice versa, providing users with the ability to send and receive legacy voice calls through existing telephony infrastructure. The H.323 Proxy allows H.323 clients connectivity with the Internet by forwarding H.323 streams through the enterprise firewall. This enables H.323 Internet, Intranet, and business-to-business connectivity.

The IP Multicast Proxy is somewhat similar in function to the H.323 Proxy in that it forwards multicast conference packets, but it also furnishes clients with the ability to propagate selected conference announcements to and from the Internet. The IP Multicast Proxy monitors conference announcements stored on the ILS Dynamic Directory Conference Server and broadcasts conferences with appropriate scope and security attributes to the Internet using the Session Announcement Protocol (SAP).

Conversely, the IP Multicast Proxy listens for appropriate conferences from those broadcast over the Internet and populates the ILS Dynamic Directory Conference Server with these announcements. In this manner, the IP Multicast Proxy allows users conference connectivity over the Internet while ensuring the confidentiality and security of private conferences.

The information contained in this article represents the current view of Microsoft Corporation on the issues discussed. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft. For more information on Microsoft’s CTI initiatives, TAPI 3.0, or for a complete version of the white paper, IP Telephony With TAPI 3.0, visit the Microsoft Web site at www.microsoft.com. Direct correspondence to: Microsoft Corporation, One Microsoft Way, Redmond, WA 98052-6399 USA.






