BY PAUL JONES & LARRY SCHESSEL
H.323, an industry-standard protocol suite for converging audio, video, and
data communication over packet-switched networks, is widely deployed over
both private and public IP networks. It forms the network foundation for
most of the world's VoIP services, and installations are growing.
A recent study by an independent testing company, for example, revealed
that of 23 IP PBXs from 16 vendors, 20 supported H.323 for call control, 21
supported the protocol for call signaling, and 15 supported it for
delivering features to endpoints. By contrast, the closest competing
protocol to H.323 was supported by just one IP PBX for call control and by
two products for signaling. On the Service Provider side, the H.323 Forum
(www.h323forum.org) identifies over 70 Service Providers worldwide using the
H.323 protocol to carry public voice traffic as well as nine carriers each
carrying over 1 billion voice minutes per month.
Since its inception in 1996 as a private, LAN-based conferencing
protocol, H.323, an International Telecommunication Union
Telecommunication Standardization Sector (ITU-T) standard, has evolved to fuel
carrier-grade public Internet telephony and conferencing services.
Significant scalability enhancements to the protocol over the years have
made these wide-ranging implementations possible.
In May of this year, the ITU-T approved H.323 version 5, which
concentrated on "protocol hardening," or stabilizing the core feature set.
This effort began with version 4, but is the primary focus of version 5.
Going forward, new H.323-standard features will generally be developed
through extensions to the base protocol, with few changes expected for the
Together, the scalability improvements and stabilized code have rendered
H.323 both extensible and mature, traits that have contributed to H.323's
market leadership. Initially having borrowed heavily from its H.320
predecessor for ISDN-based video conferencing systems, H.323 is now used in
commercially available network services that include toll bypass,
residential voice and video, wholesale voice transit, PC-to-phone, and
More than 90 percent of VoIP traffic is carried using H.323, and it is
supported on 80 percent of new videoconferencing systems. On public networks
alone, billions of minutes of traffic per month worldwide run on H.323,
according to the H.323 Forum.
A "Carrier Grade" Protocol
Scalability enhancements to H.323 that have enabled the large-scale
deployment of the protocol include the following:
• Use of direct H.225.0 call signaling between endpoints such as gateways,
user terminals, and multipoint conferencing units (MCUs).
• A reduction in the number of messages that must be exchanged between
network elements to set up and connect calls.
• Call capacity advertisements from gateway to gatekeeper.
• A well-defined means of performing address resolution across a large
• A means to recover from a network device failure.
Let's take a brief look at each of these characteristics.
Direct Endpoint Signaling
It is not necessary for an H.323 gatekeeper to route the call signaling for
every call. In fact, this is not typical in most large deployments.
H.323 gatekeepers are the network elements that determine how to route
calls. They can become network bottlenecks when routing call signaling on a
very large scale. Instead, in large networks, gatekeepers usually just
resolve the IP address of where a call should be routed and send the address
directly to the calling endpoint. Signaling takes place directly between the
two endpoints, rather than bogging down the gatekeeper.
If desired, gatekeepers can route the call signaling as a way to provide
network-based services to users. Such network architectures have been
successfully deployed. However, key to building large H.323 networks has
been the large-scale use of direct signaling and applying routed signaling
only where network-based services are needed.
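The difference in gatekeeper load between the two models can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the `Gatekeeper` class, `place_call` function, and the address used are hypothetical names invented for the example, and a call is reduced to a Setup/Connect exchange.

```python
from dataclasses import dataclass


@dataclass
class Gatekeeper:
    """Toy gatekeeper that counts the messages it has to process."""
    routes: dict                 # dialed alias -> IP address of the destination
    messages_handled: int = 0

    def resolve(self, alias: str) -> str:
        # Address resolution: one query from the calling endpoint.
        self.messages_handled += 1
        return self.routes[alias]

    def relay(self, message: str) -> None:
        # Routed model only: the gatekeeper forwards the signaling message.
        self.messages_handled += 1


def place_call(gk: Gatekeeper, alias: str, routed: bool = False) -> str:
    """Resolve the alias, then exchange Setup/Connect directly or via gk."""
    destination = gk.resolve(alias)
    for message in ("Setup", "Connect"):
        if routed:
            gk.relay(message)    # gatekeeper sits in the signaling path
        # else: the message travels endpoint to endpoint, bypassing gk
    return destination
```

In this toy model the gatekeeper touches one message per call with direct signaling and three with routed signaling, which hints at why direct signaling is favored at very large scale.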
Reduced Messaging To Set Up And Connect Calls
Another scalability enhancement to H.323 networks is a feature called Fast
Connect. Depending on the transport layer protocol in use, such as
Transmission Control Protocol (TCP) or User Datagram Protocol (UDP), it is
possible to establish an H.323 call in as few as 1.5 round trips.
Calls to check an IP voice mailbox could be set up and connected in just two
messages: Setup and Connect.
With Fast Connect, an H.323 endpoint sends a Setup message proposing an
open call with another endpoint using one of three popular codecs: G.711,
G.729, or G.723. The called endpoint selects one of those codecs and returns
it in the reply. For most endpoints, this works, and calls are established
with a minimum of messages. Also, with the use of a new feature called
Extended Fast Connect, the mechanisms of Fast Connect may be re-used to
re-negotiate capabilities on-the-fly while the call is established or being
In a worst-case scenario, the receiving endpoint doesn't support any of
the proposed codecs. In that case, the endpoints open an H.245 multimedia control
protocol session and exchange several handshaking messages to establish
For negotiating more complex capability sets, the H.245 control channel
is often necessary. The reason is that, as the number of options and
parameters within an endpoint's capability set increases, so does the need
for a well-defined negotiating mechanism. H.323 endpoints can establish
calls and media channels quickly and also have the luxury of exchanging and
using richer capability sets through the use of H.245.
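The two negotiation paths described above can be sketched roughly as follows. This is a simplified model, not an H.323 implementation: the `negotiate` function is a hypothetical name, capability sets are reduced to codec names, and the H.245 exchange is stood in for by a comparison of the full capability sets.

```python
# Codecs the text names as typical Fast Connect proposals.
FAST_CONNECT_PROPOSALS = ["G.711", "G.729", "G.723"]


def negotiate(proposed, caller_caps, callee_caps):
    """Return (mechanism, codec); codec is None when nothing is shared."""
    # Fast Connect: the called endpoint picks the first proposal it supports
    # and returns it in its reply, so no further negotiation is needed.
    for codec in proposed:
        if codec in callee_caps:
            return ("fast-connect", codec)
    # Fallback: compare full capability sets, standing in for an H.245 session.
    for codec in caller_caps:
        if codec in callee_caps:
            return ("h245", codec)
    return ("h245", None)
```

For example, `negotiate(FAST_CONNECT_PROPOSALS, FAST_CONNECT_PROPOSALS, {"G.729"})` settles on G.729 through Fast Connect, while an endpoint that supports none of the proposals falls through to the slower path.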
Another scalability improvement is the ability to tunnel H.245 messages
within the call signaling channel, rather than over a separate TCP connection.
Using H.245 tunneling, normal call signaling messages also carry H.245
messages. This consolidated-message approach reduces signaling latency
and the number of messages that the endpoints must
process. Further, when TCP is used as a transport, it reduces the number of
socket connections required for a single call.
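As a rough illustration of the effect of tunneling, the sketch below maps each TCP connection to the messages sent over it. The function name `plan_channels` and the rule of attaching all pending H.245 messages to the first signaling message are assumptions made for the example, not behavior taken from the standard.

```python
def plan_channels(signaling, h245, tunneling):
    """Map each TCP connection to a list of (message, tunneled H.245) pairs."""
    if not signaling:
        raise ValueError("at least one call signaling message is required")
    if not tunneling:
        # Two connections: one for call signaling, one only for H.245.
        return {
            "call-signaling": [(msg, []) for msg in signaling],
            "h245": [(msg, []) for msg in h245],
        }
    # Tunneling: H.245 messages ride inside a call signaling message
    # (toy rule: all of them are attached to the first one).
    carried = [(msg, []) for msg in signaling]
    carried[0][1].extend(h245)
    return {"call-signaling": carried}
```

With tunneling enabled, the plan contains a single connection and fewer separate messages, matching the two savings the text describes.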
Call Capacity Advertisement And Load Balancing
H.323 gateways can advertise their call capacity to gatekeepers for
efficient load balancing among available gateway resources. The gatekeeper
serves as a "traffic cop," forwarding traffic to a destination gateway that
it knows has the resources to terminate a call. Without the gatekeeper's
ability to monitor gateway resources, calls might frequently fail.
Gateways can report their availability to gatekeepers on a full-system
basis or more granularly, based on the capacity of a group of DS0 (64 Kbps)
To overcome gateway failures, gatekeepers can choose primary and
alternate gateways for terminating a call. If the most preferred gateway
fails, the gatekeeper can route the call via the alternate gateway.
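A minimal sketch of this selection logic, assuming each gateway advertises a simple call capacity and a current call count. The names `GatewayStatus` and `LoadBalancingGatekeeper` are hypothetical, and real deployments may weigh other factors.

```python
from dataclasses import dataclass


@dataclass
class GatewayStatus:
    capacity: int          # maximum simultaneous calls the gateway advertised
    active: int = 0        # calls currently in progress
    up: bool = True        # False once the gatekeeper considers it failed

    @property
    def free(self) -> int:
        return self.capacity - self.active if self.up else 0


class LoadBalancingGatekeeper:
    def __init__(self):
        self.gateways = {}

    def advertise(self, name, capacity, active=0):
        """Record (or refresh) a gateway's advertised capacity."""
        self.gateways[name] = GatewayStatus(capacity, active)

    def mark_failed(self, name):
        self.gateways[name].up = False

    def candidates(self):
        """Gateways with spare capacity, most free first: primary, then alternates."""
        usable = [(s.free, n) for n, s in self.gateways.items() if s.free > 0]
        return [n for _, n in sorted(usable, key=lambda item: -item[0])]
```

The first entry returned by `candidates()` plays the role of the preferred gateway and the rest serve as alternates; a gateway marked as failed simply drops out of the list.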
H.323 gatekeepers play a vital role in resolving addresses within H.323
networks. Typically, an endpoint does not have the wherewithal to resolve
the address of the destination endpoint, because it cannot account for
variables such as a remote gateway's current load or the least-cost route. The
technical and business rules that drive the address resolution functions --
especially within the service provider network -- support the notion that
such functions should be left to network elements such as the gatekeeper.
A single gatekeeper often does not know the address of a remote endpoint,
either. In very large public networks, there are generally many routes,
rules, and interconnection points. Gatekeepers generally communicate with
other gatekeepers to resolve addresses, which might account for gateway
resource availability, least-cost routing requirements, necessary
quality-of-service (QoS), bandwidth requirements, and so forth.
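The gatekeeper-to-gatekeeper lookup can be sketched as a query that is passed along to peers when the local table has no answer. This is a toy model: the `Gatekeeper` class and its `resolve` method are hypothetical, the addresses are documentation examples, and policy inputs such as cost, load, and QoS are left out.

```python
class Gatekeeper:
    def __init__(self, name, local=None):
        self.name = name
        self.local = dict(local or {})   # aliases this gatekeeper can resolve itself
        self.peers = []                  # neighboring gatekeepers it may ask

    def resolve(self, alias, _seen=None):
        """Return the address for alias, asking peers if needed, else None."""
        seen = _seen if _seen is not None else set()
        if self.name in seen:            # already asked: avoid query loops
            return None
        seen.add(self.name)
        if alias in self.local:
            return self.local[alias]
        for peer in self.peers:
            address = peer.resolve(alias, seen)
            if address is not None:
                return address
        return None
```

A short usage example:

```python
gk_a = Gatekeeper("gk-a", {"14085550100": "192.0.2.10"})
gk_b = Gatekeeper("gk-b", {"442075550100": "198.51.100.7"})
gk_a.peers.append(gk_b)
gk_b.peers.append(gk_a)
gk_a.resolve("442075550100")   # answered by gk-b on behalf of gk-a
```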
Recovering From Device Failure
In order for H.323 to succeed in the service provider market, it is
essential to prevent the loss of calls due to network device failures. H.323
has the wherewithal to recover from a failed network connection without
dropping the call. Of course, there is little that can be done from the
H.323 signaling standpoint if the physical connection to a DS0 is lost, but
it is certainly possible to "route around" failed H.323 entities. Work
currently underway within the ITU will also allow intermediate call
signaling entities (such as Gatekeepers that route call signaling, proxies,
etc.) to safely remove themselves from the call path. This leads to improved
scalability and reduces chances of call failure.
STILL FOCUSED ON MULTIMEDIA
With so much attention given to scalability, performance, and robustness
issues, one might be led to believe that H.323 is less useful for multimedia
and is primarily a "voice" protocol. Certainly, that is not the case. In
fact, H.323 still retains all of the strong multimedia capabilities that it
has had from the beginning. Along the way, additional capabilities have been
added to truly support a rich multimedia environment for the user, including
the use of text messages, audio, video, electronic whiteboarding, and
application sharing. While some of the multimedia components are not part of
the "core" H.323 protocol, those capabilities are tightly and smoothly
integrated for seamless operation.
What About SIP?
Interestingly, H.323 development began at the same time as that of the
Session Initiation Protocol (SIP), an emerging Internet Engineering Task
Force (IETF) standard. Whereas H.323's focus is on voice, video, and data
conferencing, SIP is currently more oriented toward VoIP-only
implementations, in that it doesn't support video control capabilities. Even
so, SIP has targeted a range of enhanced applications for which H.323 was
not designed, including instant messaging and multiplayer video games.
The choice of H.323 versus SIP often comes down to business and
application requirements. For example, if one wanted to build a
client/server multimedia application, SIP would be a clear winner as it has
the basic capabilities required for a closed client/server environment.
However, if one wanted to deploy a carrier network that supports video and
far-end camera control, SIP would not be the right choice, as it is still
missing some necessary features to perform that function. Specifically, SIP
includes neither a video control channel nor a far-end camera control
protocol, two features vital for video services.
In some cases, SIP and H.323 can complement each other. For example, a
call agent might use SIP when directing a call to voicemail and H.323 for
the trunking interface. This is an example where users can leverage the
strengths of each protocol and choose the protocol that is most suitable for
As it has matured, H.323 has established itself as the current
market-leading protocol for running voice and video over IP networks. In the
future, however, as other protocols are introduced, network operators will
likely look at establishing interoperability between them to gain the
benefits of each protocol.
Paul Jones is rapporteur of the ITU-T Q.2/16 committee, which is
responsible for standardization of H.323, and is a voice systems architect
at Cisco Systems. Larry Schessel is president of the H.323 Forum and manager
of product marketing, protocols, and session applications at
To The August 2003 Table Of Contents ]
Internet Telephony Protocols: Is Any One
BY MICHAEL O'HARA
Engage any telecom expert in a discussion of packet telephony, and your
conversation will likely be peppered with references to a wide variety of
networking protocols. While there are many protocols important in deploying
a converged voice and data network, four protocols stand out as being most
popular in implementing voice over packet networks: H.323, Megaco/H.248,
Media Gateway Control Protocol (MGCP), and Session Initiation Protocol
(SIP). Each protocol has its strengths and weaknesses with respect to issues
such as ease of implementation, extensibility, and suitability for various
network applications, Quality of Service (QoS), and security. The typical
next-generation network may include one or more of these protocols.
In comparing these protocols, one main differentiator is the model in which
they distribute intelligence. H.323 and SIP operate between peer clients,
while MGCP and Megaco operate between "master and slave" entities. With
master/slave devices such as MGCP- and Megaco-based gateways, IADs, and
telephones, the control model is quite similar to traditional telephony
equipment: The call agent supplies all instructions to the "dumb" end
device, directing it to wait for signals, collect digits, play tones, open
ports, and release connections. This offers simple implementation, low-cost
end devices, and few interoperability issues. On the flip side, intelligent
SIP-based endpoints give up some of that low cost and ease of network
implementation in exchange for a model that delivers much richer services.
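The master/slave control model can be pictured with a small sketch in which all call logic lives in the call agent and the end device only executes instructions. The class names, event names, and command names here are hypothetical and are not actual MGCP or Megaco verbs.

```python
class DumbEndpoint:
    """Executes whatever it is told; holds no call logic of its own."""

    def __init__(self):
        self.executed = []

    def execute(self, command, **params):
        self.executed.append((command, params))


class CallAgent:
    """Holds all call logic and drives the endpoint step by step."""

    def on_event(self, endpoint, event, digits=""):
        if event == "off-hook":
            endpoint.execute("play_tone", tone="dial")
            endpoint.execute("collect_digits")
        elif event == "digits":
            endpoint.execute("open_port", destination=digits)
        elif event == "on-hook":
            endpoint.execute("release_connection")
```

In a peer model, by contrast, the decisions made inside `CallAgent.on_event` would be taken by the endpoint itself.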
Developed by the ITU, H.323 actually encompasses several protocols,
including H.245 for media control, H.225 for connection establishment
between endpoints, H.332 for large conferences, H.450 for supplementary
services, and Real-time Transport Protocol (RTP) for transport. (SIP also
uses RTP for transport, which simplifies interworking between the two.) H.323 was
initially developed for multimedia conferencing over local-area networks
(LANs). Although it is the most widely deployed packet telephony standard
today, H.323 is rapidly losing ground to SIP, due to the perceived
inflexibility of the H.323 protocol suite. However, the standards continue
to evolve to try to accommodate the needs of Internet telephony, including
increasing efficiency and supporting additional services.
Developed by the IETF, SIP is a lightweight, text-based signaling protocol
used for establishing sessions in an IP network. It uses many of the
constructs and concepts of Internet protocols such as HTTP and SMTP. Based
on principles gained from the Internet community, SIP is an
application-independent protocol, which was designed at the outset to be
extremely flexible and extensible. As the name implies, SIP deals
generically with sessions, which can include voice, video, or data. The
sessions are described using a separate protocol called Session Description
Protocol (SDP). SDP is transported in the message body of a SIP message. The
media that is actually exchanged in a session is transparent to SIP. SIP has
been enthusiastically embraced for next-generation applications such as
telephony over packet services, voice-enabled e-commerce applications,
presence management, instant messaging services, and voice-controlled Web
MGCP and Megaco
While SIP and H.323 have distinct differences, MGCP and Megaco share many
similarities. Both operate in a master/slave configuration where a media
gateway controller instructs the media gateway to establish, control, and
release connections between one or more media streams. The similarities
between the two can be traced to the roots of the protocols. Megaco was the
final derivative of several draft standards, including MGCP. In general,
MGCP and Megaco have constructs to accomplish the same types of tasks.
However, because MGCP was a precursor to Megaco, Megaco has refined and
extended many of the functions. As such, Megaco is a more complex protocol.
The Megaco model allows for more flexibility and finer control by the
media gateway controller. In terms of resource reservation and control,
media processing and stream management, Megaco has greater capabilities as
well, which makes it a better protocol for applications such as multimedia
conferencing. Megaco is much more flexible when it comes to the underlying
transport type. While MGCP defines only UDP as a transport layer for
signaling messages, Megaco allows TCP, UDP, SCTP, and ATM. Megaco also has
better resource allocation and stream management mechanisms.
Because the functions of MGCP and Megaco concentrate only on the media
stream, both can enable the same types of telephony applications, although
the procedure for implementation may be much simpler in one protocol or the
other. Conversely, a simple implementation is often offset by added
Which Protocol Comes Out On Top?
In reality, there is no single protocol that wins out over the others.
H.323, SIP, MGCP, and Megaco are all integral protocols in the world of
Internet telephony. Similar services can be offered using various
combinations of these protocols. For example, a multimedia conferencing
application could be offered using H.323 or using a combination of SIP
between the media gateway controllers and MGCP between the media gateway
controller and media gateway.
For carriers and vendors, the decision as to which protocols to embrace
must begin with the decision as to where to locate the network intelligence
and control. Some carriers may be most comfortable with the "dumb" terminal
model of H.323, which allows for easier network management and upgrades.
Some may want the easier implementation of a rich class of services offered
by SIP endpoints. The MGCP/Megaco model is particularly well suited for
low-cost media gateways used for access such as IADs and IP telephones. What
is clear, however, is that despite all the debate, the consensus is that
H.323, SIP, MGCP, and Megaco will all be a part of the next-generation voice
Michael O'Hara is the vice president of marketing at
Sonus Networks. Sonus is a leading
provider of voice infrastructure products for the new public network. The
company's solutions are designed to enable service providers to quickly and
effectively deploy an integrated network capable of carrying both voice and
data traffic, and to deliver a range of innovative, new services.