April 1999
SIP Splashes Into Protocol Interoperability Scene
BY STEVEN MAYER
The impending acceptance of the Session Initiation Protocol (SIP) as an official IETF
standard marks an important milestone for the Internet telephony industry: the merging of
Internet-based distributed technologies with traditional telephony, a change poised to
have a dramatic impact. SIP is a true Internet protocol, patterned after HTTP, the
protocol used by Web servers. This is important for two reasons. First, the Internet and
its associated technologies are known for rapid innovation and deployment. Second,
interoperability issues are easier to solve with open standards than with proprietary,
closed systems.
SIP will enable communications equipment manufacturers and service providers to deploy
systems and services that require fewer resources, less complexity, and less bandwidth
than traditional telephony protocols.
HOW IT WORKS
SIP is a lightweight, transport-independent, text-based protocol that is used for
multimedia call control and enhanced telephony services. It is lightweight in that it has
only six method types. Combined, these methods allow complete control over a multimedia
call session while limiting complexity. SIP is transport-layer independent because it can
be used with any datagram or stream protocol (UDP, TCP, ATM, etc.). It is text-based: a
method is formed via a textual header with fields that contain call properties. This
text-based approach is easy to parse, thin in terms of packet overhead, and extremely
flexible.
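To illustrate how little machinery a text-based request needs, here is a minimal Python
sketch of a parser. The function name, the sample addresses, and the simplifications
(no folded or repeated headers) are assumptions made for illustration. The method list
is the six methods as named in RFC 2543, not something taken from this article.

```python
# Illustrative sketch only; parse_request is a hypothetical helper, not a library API.
# The six method names below are those defined in RFC 2543.
SIP_METHODS = {"INVITE", "ACK", "BYE", "CANCEL", "OPTIONS", "REGISTER"}


def parse_request(raw):
    """Split a SIP request into (method, request_uri, version, headers, body)."""
    # A blank line separates the textual header from the optional body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # The first line carries the method, the request URI, and the protocol version.
    method, uri, version = lines[0].split(" ", 2)
    if method not in SIP_METHODS:
        raise ValueError("unknown method: " + method)
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only, since values such as SIP URLs contain colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, uri, version, headers, body


# Sample request with made-up addresses.
EXAMPLE = (
    "INVITE sip:bob@example.com SIP/2.0\r\n"
    "From: sip:alice@example.com\r\n"
    "To: sip:bob@example.com\r\n"
    "Call-ID: 12345@example.com\r\n"
    "CSeq: 1 INVITE\r\n"
    "\r\n"
)

method, uri, version, headers, body = parse_request(EXAMPLE)
print(method, uri, headers["cseq"])
```

A production parser would also have to handle header folding, repeated headers, and
compact header names, all of which this sketch ignores.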
SIP clients, typically called user agents, communicate with SIP servers in a
client/server fashion. User agents also act as servers when the SIP request reaches its
final destination. These user agents contain the full SIP state machine and can be used
without intermediate servers.
SIP servers can act in two modes: as proxy servers or as redirect servers. SIP proxy
servers forward requests to the next hop, whether another SIP server or a user agent,
within an IP cloud. Redirect servers inform their clients of the address of the requested
server and allow the client to contact that server directly. Any number of hops can be
traversed until the final destination for the request is found. SIP servers, on occasion,
will need to contact an external location server to determine routing or user policy
information. The SIP specification allows for maximum flexibility, as it does not bind a
deployment to a single scheme for locating users.
SIP servers can either maintain state information or simply forward requests in a
stateless manner. The stateless option reduces the complexity of SIP servers and allows
for greater scalability than other protocols.
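The difference between the two server modes can be sketched in a few lines of Python.
Everything here is hypothetical: the LOCATIONS table stands in for an external location
server, the addresses are invented, and responses are plain dictionaries rather than real
SIP messages. The status codes used (200, 302, 404) are ones SIP shares with HTTP.

```python
# Illustrative sketch only. LOCATIONS is a stand-in for an external location server.
LOCATIONS = {"sip:bob@example.com": "sip:bob@pc42.example.com"}


def redirect_server(request_uri):
    """Tell the client where the user is; the client then contacts that address itself."""
    target = LOCATIONS.get(request_uri)
    if target is None:
        return {"status": 404, "reason": "Not Found"}
    return {"status": 302, "reason": "Moved Temporarily", "contact": target}


def proxy_server(request_uri, forward):
    """Forward the request to the next hop on the client's behalf and relay the answer.

    No per-call state is kept here, which corresponds to the stateless mode.
    """
    target = LOCATIONS.get(request_uri)
    if target is None:
        return {"status": 404, "reason": "Not Found"}
    return forward(target)


def user_agent(uri):
    """Stand-in for the final destination answering the request."""
    return {"status": 200, "reason": "OK", "answered_by": uri}


print(redirect_server("sip:bob@example.com"))
print(proxy_server("sip:bob@example.com", user_agent))
```

In the redirect case the client does the second hop itself. In the proxy case the server
does it, so the client sees only the final answer.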
BENEFITS FOR GATEWAYS
SIP is well-suited to provide call control for Internet telephony gateways. The user
agents would typically be run on the gateway device, while the SIP servers (proxy or
redirect) can run anywhere within the IP cloud. This architecture allows for a clear
delineation of call control from the "media channel." SIP also allows small endpoint
devices (devices with limited memory and CPU power) to run a thin call control layer
while relying on powerful servers within the network. SIP is independent of both media
stream and codec, as it uses the Session Description Protocol (SDP) to specify the format
of the media to be transmitted. It is fully multicast ready and can support very large
multipoint conferences without suffering any performance degradation.
When using UDP for transport, SIP has a built-in reliability mechanism that utilizes
ACK methods, which the client sends to confirm the server's response to an INVITE method.
The SIP state machine enables
reliability by specifying a retransmit mechanism with an exponential back-off strategy,
and a request timeout after the 11th packet has been attempted.
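As a rough illustration of the back-off strategy, the Python sketch below computes how
long a client would wait after each of its 11 packets. The starting interval of 0.5
seconds and the 4-second cap are assumptions (they match the defaults suggested in RFC
2543, but the article itself gives no timer values), and the function name is
hypothetical.

```python
def retransmit_schedule(t1=0.5, t2=4.0, max_packets=11):
    """Return the wait in seconds after each packet sent.

    Assumed values: t1 is the initial interval and t2 the cap on the interval.
    The interval doubles after every packet until it reaches the cap.
    """
    waits = []
    interval = t1
    for _ in range(max_packets):
        waits.append(interval)
        interval = min(interval * 2, t2)
    return waits


schedule = retransmit_schedule()
print(schedule)
print("request times out after", sum(schedule), "seconds")
```

Under these assumed timers the intervals grow from half a second to the cap within four
packets, so a lost request is retried quickly at first and then backs off to avoid
flooding a congested network.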
Call setup delays incurred by SIP are typically the time for 1.5 round trips from
client to server. The client will issue an INVITE, the server will respond, and the client
will send an ACK back to the server. This is not only light on bandwidth but also
minimizes call setup times. (See the sidebar entitled "The Six Methods of SIP" for more
information.)
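The 1.5 round-trip figure follows directly from counting messages: three one-way trips,
each worth half a round trip. A small Python sketch makes the arithmetic explicit. The
"200 OK" label for the server's response and the 100 ms round-trip time are assumptions
chosen for the example.

```python
# Illustrative sketch only. Each tuple is (sender, receiver, message).
HANDSHAKE = [
    ("client", "server", "INVITE"),
    ("server", "client", "200 OK"),  # assumed successful response
    ("client", "server", "ACK"),
]


def setup_delay_ms(rtt_ms, messages=HANDSHAKE):
    """Each message is one one-way trip, which is half a round trip."""
    return len(messages) * rtt_ms / 2


# With an assumed 100 ms round trip, setup takes 1.5 round trips.
print(setup_delay_ms(100))  # 150.0
```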
Responses to SIP methods are patterned after those of HTTP Web servers. Response codes
are organized into six classes in increments of 100, starting at 100, and include both
informational and final codes. For example, a response code in the 1xx range is
informational only, while a code in the 4xx range indicates a request failure.
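Because the class is simply the first digit of the code, classifying a response takes one
integer division. The Python sketch below shows the idea. The article names only the 1xx
and 4xx classes; the other four labels follow the class names used in RFC 2543, and the
function names are hypothetical.

```python
# Illustrative sketch only. Labels for 2xx, 3xx, 5xx, and 6xx follow RFC 2543.
RESPONSE_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Request failure",
    5: "Server failure",
    6: "Global failure",
}


def classify(code):
    """Map a numeric response code to the name of its class."""
    if not 100 <= code <= 699:
        raise ValueError("not a valid response code: %d" % code)
    return RESPONSE_CLASSES[code // 100]


def is_final(code):
    """1xx responses are informational only; every other class ends the request."""
    return code >= 200


print(classify(180), is_final(180))
print(classify(404), is_final(404))
```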
INTEROPERABILITY STRENGTH
Today, interoperability between H.323 gateways and clients from multiple vendors is not
guaranteed. SIP, however, has been designed with interoperability in mind, and with the
expectation that multivendor interoperability should be achieved between both clients and
gateways.
SIP is also well-suited to provide interoperability among other emerging Internet
telephony protocols. SIP and the Media Gateway Control Protocol (MGCP, the proposed merger
of SGCP and IPDC) will work together extremely well. MGCP is designed to enable external
control and management of multiservice packet networks operating at the edge of the
network. In this environment, SIP would provide the call control model for session
communication between endpoint clients and gateways, and the platform on which to build
enhanced services.
The SIP call control model provides an ideal paradigm for adding enhanced services.
Since SIP has a very simple flow of messages, it is straightforward for an intermediate
server or an endpoint to make decisions based on the current state. This is also made
possible because SIP is a text-based protocol. As a result, servers can easily modify a
message before passing it to its next hop.
For example, a SIP server could receive a request, examine it, and determine that it is
an audio-based call. Next, the server could decide to route this call to the user's
regular phone. If that same call had contained a media type of video (as described in the
SDP portion of the message), the server could first try to route the call to the user's
PC. Similarly, if an INVITE arrived after 5:00 P.M. for the user's work number, the server
may decide to route that call to the user's home phone instead.
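The routing rules in this example can be written down almost word for word. The Python
sketch below is one possible reading. The USER table, its addresses, the function name,
and the order in which the rules are checked are all assumptions. In a real server the
media types would come from the SDP portion of the INVITE.

```python
from datetime import time

# Hypothetical user profile with invented addresses.
USER = {
    "phone": "sip:bob@phone.example.com",  # the user's regular phone
    "pc": "sip:bob@pc.example.com",
    "home": "sip:bob@home.example.com",
    "work": "sip:bob@work.example.com",
}


def choose_target(media_types, arrival, called_number, user=USER):
    """Pick where to route an INVITE from its media types and time of arrival."""
    # A call that includes video is tried at the user's PC first.
    if "video" in media_types:
        return user["pc"]
    # Calls to the work number after 5:00 P.M. go to the home phone.
    if called_number == user["work"] and arrival >= time(17, 0):
        return user["home"]
    # Otherwise, an audio call goes to the regular phone.
    return user["phone"]


print(choose_target({"audio"}, time(10, 0), USER["work"]))
print(choose_target({"audio", "video"}, time(10, 0), USER["work"]))
print(choose_target({"audio"}, time(17, 30), USER["work"]))
```

None of these decisions changes the message flow seen by the caller: the INVITE still
goes out once, and the routing logic runs entirely inside the server.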
The concept, simple yet powerful, is that decisions can be made based on external
information without disruption of the flow of the call setup. The actions that are taken
can be dynamic, as with a Web server. This provides ultimate flexibility.
CONCLUSION
As the Internet telephony industry evolves, the enabling protocols must adapt to this
evolution without disrupting the already installed user base. A key evaluation metric of a
protocol is its ability to be rapidly extended. SIP can be extended in much the same way
that HTTP servers are extended, allowing a running SIP server to be dynamically extended
to support more advanced features and services. This level of flexibility is critical to
the rapidly moving Internet telephony industry.
Since SIP was designed from the beginning to be an Internet-based protocol, it offers a
high degree of flexibility, dynamic extensibility, and interoperability. Because of its
powerful set of methods and ability to handle dynamic actions based on current state, SIP
is exceptionally positioned to deliver enhanced services and provide rapid innovation to
the Internet telephony industry.
Steven Mayer is the director of technology for dynamicsoft. dynamicsoft is the
leading supplier of Java-technology based software for converged networks. A market-first,
dynamicsoft's Java technology-based jVoIP framework combines a standards-based open
architecture with innovative technologies to enable exceptional quality, scalability, and
configurability. Steven can be reached at [email protected].
For additional information, visit dynamicsoft's Web site at www.dynamicsoft.com.