Top Five Obsolete Design Decisions in SIP

The Voice of IP

By Jonathan Rosenberg, Chief Technology Strategist  |  May 22, 2012

This article originally appeared in the May 2012 issue of INTERNET TELEPHONY magazine.

Work on the Session Initiation Protocol (SIP) began in the Internet Engineering Task Force in 1996. It reached its first milestone in 1999 with the publication of RFC 2543, followed by the publication of the more mature RFC 3261 in 2002. SIP is therefore approaching its seventeenth birthday. When measured in Internet time, this is nearly an eternity.

To put things in perspective, IDC reported that there were 36 million Internet users worldwide in December of 1996 (representing 0.9 percent of the worldwide population). In December of 2011, this number was 2.26 billion (representing 32.7 percent of the worldwide population). In 1996, Netscape Navigator was at the height of its dominance, with almost 80 percent share. The dot-com bubble had yet to even start.

It’s a credit to SIP technology that it remains in widespread usage after so many years of existence. However, SIP is showing its age.

Several of its design choices, which made complete sense in the late nineties, no longer do. And so, as lead author of SIP, I bring to you my top five SIP design choices that are now obsolete.

NAT Traversal Not Built-In

SIP made a design choice early on to ignore the growth of network address translation in the Internet. This was a design choice that was already questionable in 1999. In the modern Internet, it has proven itself to be a terrible choice. With home routers everywhere and mobile Internet on the rise, NATted endpoints are the norm. Without a built-in solution, the market moved to fill the gap (e.g., session border controllers), and specifications were built after the fact (e.g., Interactive Connectivity Establishment, or ICE), whose design is complicated by being an extension rather than part of the core SIP spec.
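To make the problem concrete, here is a minimal Python sketch of what a client behind a home router ends up advertising. The SDP body, the addresses, and the helper names `advertised_media_address` and `is_rfc1918` are illustrative assumptions, not taken from any particular SIP stack.

```python
import ipaddress

# RFC 1918 private ranges, which are not routable on the public Internet.
PRIVATE_NETS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

# Hypothetical SDP offer from a client sitting behind a home router.
SDP_OFFER = "\r\n".join([
    "v=0",
    "o=alice 2890844526 2890844526 IN IP4 192.168.1.10",
    "s=-",
    "c=IN IP4 192.168.1.10",
    "t=0 0",
    "m=audio 49170 RTP/AVP 0",
]) + "\r\n"


def advertised_media_address(sdp: str) -> str:
    """Return the IPv4 address from the first SDP connection (c=) line."""
    for line in sdp.splitlines():
        if line.startswith("c=IN IP4 "):
            return line.split()[2]
    raise ValueError("no IPv4 c= line found")


def is_rfc1918(address: str) -> bool:
    """True if the address falls inside one of the RFC 1918 private ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in PRIVATE_NETS)


addr = advertised_media_address(SDP_OFFER)
if is_rfc1918(addr):
    # A peer on the public Internet cannot send media to this address.
    print(f"{addr} is private: unreachable from outside the NAT")
```

The client only knows its local address, so that is what it writes into the message. Something else (a session border controller, or an ICE exchange) has to discover or substitute an address the far end can actually reach.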

Stateless Proxies

SIP’s servers – called proxies – hold state just for the duration of a single transaction. This choice was made to achieve scale and high reliability. At the time, building out stateful servers in a data center was a challenge. Stateful servers were also less important, since most users had just a single VoIP endpoint. Fast forward to today, and everything has changed. The arrival of the cloud, along with services like Azure, has dramatically reduced the costs and complexities involved in server-side state management. Furthermore, the arrival of smartphones, tablets, and Internet-connected game consoles means that most users have multiple devices. They now expect these to be fully in sync with each other. In this kind of environment, it makes more sense to have more state in the servers to facilitate such synchronization. The market has realized this as well, and we’ve seen the rise of softswitches, B2BUAs and numerous other products that use SIP as an interface yet retain state in servers.
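As a rough illustration of transaction-scoped state, the toy Python model below keeps an entry only while a request is awaiting its final response. The class name `TransactionStatefulProxy` and its methods are hypothetical, and the timers and retransmission handling a real proxy needs are deliberately left out.

```python
class TransactionStatefulProxy:
    """Toy model: state exists only while a transaction is in progress."""

    def __init__(self):
        # Maps a transaction's branch ID to where responses should be sent.
        self.transactions = {}

    def on_request(self, branch, upstream):
        """Remember who sent the request so the response can be routed back."""
        self.transactions[branch] = upstream

    def on_response(self, branch, status):
        """Route a response upstream; forget the transaction once it is final."""
        upstream = self.transactions[branch]
        if status >= 200:  # 2xx-6xx are final responses
            del self.transactions[branch]
        return upstream


proxy = TransactionStatefulProxy()
proxy.on_request("z9hG4bK-1", "alice-client")
proxy.on_response("z9hG4bK-1", 180)  # provisional: state is kept
proxy.on_response("z9hG4bK-1", 200)  # final: state is discarded
print(len(proxy.transactions))       # nothing survives the transaction
```

Once the final response has passed through, the proxy remembers nothing about the call or the user's other devices, which is why keeping several devices in sync needs state held somewhere else.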

Registration and Connection Linked

In SIP, a user is registered to the service to receive calls. Practically speaking, that registration operates over a persistent connection, through which calls and chats are received. If the connection breaks, the user can no longer receive communications. They need to re-connect, and re-register. In essence, the user’s connection is isomorphic with their registration. This made a lot of sense in an era of always-on PCs. Now, in the modern mobile era, users are connecting with smartphones that don’t hold persistent connections. Their connectivity comes and goes. Furthermore, clients can be reached through push notification services offered by the mobile operating systems. SIP makes no provision for this kind of connectivity.
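The coupling described above can be sketched with a toy registrar in Python. This models the practical behavior, not the SIP specification itself, where bindings also carry an expiration time. `ConnectionBoundRegistrar`, its methods, and the connection identifiers are all hypothetical.

```python
class ConnectionBoundRegistrar:
    """Toy model: a user is reachable only while their connection is up."""

    def __init__(self):
        self.bindings = {}  # user -> connection ID

    def register(self, user, connection_id):
        """Bind the user to the connection the registration arrived on."""
        self.bindings[user] = connection_id

    def connection_lost(self, connection_id):
        """Drop every binding that depended on the broken connection."""
        stale = [u for u, c in self.bindings.items() if c == connection_id]
        for user in stale:
            del self.bindings[user]

    def route_call(self, user):
        """Return the connection to deliver a call on, or None if unreachable."""
        return self.bindings.get(user)


registrar = ConnectionBoundRegistrar()
registrar.register("alice", "conn-1")
print(registrar.route_call("alice"))  # reachable over conn-1

registrar.connection_lost("conn-1")   # phone goes to sleep or changes network
print(registrar.route_call("alice"))  # None until alice reconnects and re-registers
```

In this model there is no way to reach a user whose connection has dropped, such as handing the call to a push notification service.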

Notification-Centric Presence

The SIP presence protocol is built on the idea of a long-lived subscription. The client has a persistent connection to the service and subscribes to a buddy list. As the state of those buddies changes, notifications are sent to the client, indicating the new presence state. This model makes sense in a world of always-on endpoints with no constraints on battery life, where the user has a desktop application that runs all the time. This was a perfectly valid choice in 1996. In the modern Internet, where smartphones and tablets are becoming the focus, network connectivity comes at a cost – battery consumption. SIP’s subscription model is too chatty and consumes too much power when realized on a mobile device.
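A back-of-the-envelope Python sketch shows how the traffic scales. It assumes, as a simplification, one notification per state change of a watched buddy. The function name `notifications_for` and the event list are made up for illustration.

```python
from collections import Counter


def notifications_for(buddy_list, events):
    """Count the notifications a subscribed client would receive per buddy.

    `events` is a sequence of (user, new_state) pairs; only changes from
    users on the buddy list reach the client.
    """
    watched = set(buddy_list)
    return Counter(user for user, _state in events if user in watched)


# Hypothetical presence changes over some period.
events = [
    ("bob", "busy"),
    ("dave", "away"),       # not on the buddy list, so no notification
    ("bob", "available"),
    ("carol", "offline"),
]

counts = notifications_for(["bob", "carol"], events)
print(sum(counts.values()))  # total messages delivered to the client
```

The total grows with both the size of the buddy list and how often each buddy changes state. Each message arrives whether or not the user is looking at the screen, and that is where the battery cost comes from.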

Flexibility over Performance

SIP is actually a large compendium of specifications, allowing implementers to pick and choose which ones they wish to use. This, among other reasons, has led to a protocol that is not optimized – in terms of message counts or message sizes – for any particular use case. Optimizing for flexibility over messaging volume (and, in general, performance) was the right choice in the late nineties, when mobile Internet was just a dream. But, in the modern Internet, it has a negative impact on the most important resource – battery life. User expectations around application performance also have grown by leaps and bounds, tipping the scales on this tradeoff.


Jonathan Rosenberg is chief technology strategist at Skype.

Edited by Stefania Viscusi