
Feature Article
October 2003



Lessons from The Edge: VoIP In The WAN



IP telephony is an uncontested direction across the industry. What end users want is relatively well understood: high-quality voice, "always-on" dial tone, easy-to-use features, and new applications. Technically, this translates into "five nines" reliability, delays of less than 150 msec, low delay variation (jitter), and zero packet loss. Meeting these requirements across the WAN is complicated by the range of WAN connectivity options and their price/performance attributes. While multiple options exist, including IP VPNs and optical Ethernet, the workhorses for inter-site connectivity are leased lines and Frame Relay. So how do we deliver quality IP telephony over these two technologies?

Bandwidth in the WAN has to be carefully engineered because it costs money -- how much depends on where the two endpoints are. In the TDM world, every voice call represents a 64 kbps connection (or 32 kbps with ADPCM). IP telephony commonly uses the G.711 and G.729 codecs, which code voice at 64 and 8 kbps respectively. But it's not that simple. To explain this, it is important to understand the functions and operations of the protocol stack for VoIP. At a high level, each layer of the protocol stack has specific functions and adds header and trailer information to each "Protocol Data Unit" (PDU) to accomplish its functions. For the VoIP speech path, the relevant protocols (layers) are, starting from the top, RTP (Real-time Transport Protocol), UDP (User Datagram Protocol), IP (Internet Protocol), and the link layer protocol (Point-to-Point Protocol, or PPP, for leased lines, and Frame Relay).

In order to calculate bandwidth demand for a particular link, we need several pieces of information. We need to know:

S, the voice sampling (packetization) interval; packets are typically formed every 20 msec (a configuration option of many coding schemes and a tradeoff between delay and packet overhead);
P, the number of packets generated per second, typically 50 pps (=1/S);
C, the bit rate of the voice coding scheme (most commonly G.711, G.729).

As a result, we can calculate:

V, the voice payload in each packet, calculated as C * S and converted to bytes, or typically 160 and 20 bytes for G.711 and G.729 respectively;
I, the IP/UDP/RTP packet overhead, or 40 bytes, including 20 for IP, 8 for UDP and 12 for RTP (without RTP header compression);
L, the link layer overhead for the specific link type (PPP for leased lines and Frame Relay).

L, the link layer overhead, is unique to each link type. The responsibility of the link layer is frame delineation, initiation, control, multiplexing, and error detection. WAN link protocols such as PPP and Frame Relay are based on HDLC, whereby each frame consists of an opening one-byte flag, a two-byte HDLC control field, the PPP or Frame Relay header, the packet payload, a two-byte checksum and a closing flag. The PPP and Frame Relay headers are four and two bytes respectively.

The bandwidth requirement for each call is P*(V+I+L) bytes per second, or 8*P*(V+I+L) bits per second. For G.729, 8 kbps voice results in a bandwidth need of approximately 27 kbps. Use of speech activity detection (or equivalently silence suppression) can result in fewer packets per call, but it would not be wise to engineer less bandwidth for voice on this basis when the number of calls is small. The bandwidth per call has to be multiplied by the maximum number of active voice calls on the WAN link, determined using traditional telephony engineering.
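The calculation above can be sketched in a few lines of Python. This is an illustrative sketch only: the function and table names are hypothetical, and the link overhead values simply add up the byte counts given in the text (flags, HDLC control field, link header, checksum), so they are assumptions rather than measured figures.

```python
# Per-call VoIP bandwidth, following the formula P * (V + I + L).
# All names here are illustrative; the overhead figures follow the byte
# counts stated in the article and are assumptions of this sketch.

CODEC_BPS = {"G.711": 64_000, "G.729": 8_000}   # C, codec bit rate
IP_UDP_RTP_BYTES = 20 + 8 + 12                  # I, no RTP header compression

# L = opening flag (1) + HDLC control (2) + link header + checksum (2) + closing flag (1)
LINK_OVERHEAD_BYTES = {
    "PPP": 1 + 2 + 4 + 2 + 1,          # 10 bytes
    "FrameRelay": 1 + 2 + 2 + 2 + 1,   # 8 bytes
}


def payload_bytes(codec, interval_ms=20):
    """V: bytes of coded voice carried in one packet."""
    return CODEC_BPS[codec] * interval_ms // 8000


def per_call_bps(codec, link, interval_ms=20):
    """Bandwidth of one call in bits per second: 8 * P * (V + I + L)."""
    packets_per_second = 1000 / interval_ms     # P
    frame_bytes = (payload_bytes(codec, interval_ms)
                   + IP_UDP_RTP_BYTES
                   + LINK_OVERHEAD_BYTES[link])
    return packets_per_second * frame_bytes * 8


def link_voice_bps(codec, link, max_calls, interval_ms=20):
    """Voice bandwidth to engineer for the maximum number of active calls."""
    return per_call_bps(codec, link, interval_ms) * max_calls


if __name__ == "__main__":
    for codec in CODEC_BPS:
        for link in LINK_OVERHEAD_BYTES:
            kbps = per_call_bps(codec, link) / 1000
            print(f"{codec} over {link}: {kbps:.1f} kbps per call")
```

Under these assumptions, G.729 over Frame Relay comes to 27.2 kbps per call, in line with the approximate 27 kbps figure quoted above, and G.711 comes to a little over 83 kbps.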

If a particular link only supported VoIP, engineering the bandwidth of the link would be straightforward and QoS would not add any value. But network convergence is all about bringing together voice and data on one network. Controlling delays and ensuring zero packet loss are key requirements.

The primary purposes for QoS are to:

• Minimize end-to-end delay through the network;
• Minimize the variability in end-to-end delay (jitter);
• Prevent packet loss.

QoS mechanisms, such as IP Differentiated Services (DiffServ), can work effectively over properly engineered point-to-point leased lines. Proper engineering must address the "large packets on a slow link" problem. Let's explain. The typical default maximum frame size on a PPP link is 1,500 bytes. Every frame sent down a wire consumes a finite amount of time; this is called serialization delay and is a function of the number of bytes in the frame and the speed of the link.
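To put numbers on serialization delay, the short sketch below divides the frame size in bits by the link speed. The function name and the list of link speeds are illustrative choices, not taken from the article.

```python
def serialization_delay_ms(frame_bytes, link_kbps):
    """Time to clock one frame onto the wire, in milliseconds.

    bits / (kbit/s) gives milliseconds directly.
    """
    return frame_bytes * 8 / link_kbps


if __name__ == "__main__":
    # Illustrative link speeds in kbps; 1,500 bytes is the default PPP frame size.
    for kbps in (64, 128, 256, 512, 1544):
        delay = serialization_delay_ms(1500, kbps)
        print(f"1500-byte frame on {kbps} kbps link: {delay:.1f} ms")
```

A 1,500-byte frame takes 187.5 ms to send on a 64 kbps link but under 8 ms at T1 speed (1,544 kbps), which is why the problem is specific to slow links.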

QoS mechanisms are not pre-emptive. Therefore, a VoIP packet must wait for the complete transmission of any data packet that started transmission before the VoIP packet arrived. If the data frame is relatively small or the link speed is fast (e.g., above 1 Mbps), this is not a problem. However, large frames on slow links require large serialization times, which result in significant delay for VoIP packets -- whether QoS is implemented or not. This is essentially a jitter problem. The recommended solution for large frames on slow links is Layer 2 fragmentation. Fragmentation breaks up the large frames and allows VoIP frames to be transmitted sooner and more consistently. A simple rule of thumb is to select the link fragment size (in bytes) equal to the numerical value of the link speed (in kbps). For example, select 512 bytes for 512 kbps links. This rule always results in a maximum serialization delay of 8 ms.
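The rule of thumb can be checked with a small sketch. The helper names are hypothetical, and the sketch ignores any per-fragment header overhead, which a real fragmentation scheme would add.

```python
import math


def fragment_size_bytes(link_kbps):
    """Rule of thumb: fragment size in bytes equals link speed in kbps."""
    return int(link_kbps)


def worst_case_wait_ms(fragment_bytes, link_kbps):
    """Longest time a voice frame waits behind one fragment already in flight."""
    return fragment_bytes * 8 / link_kbps


def fragments_needed(frame_bytes, fragment_bytes):
    """Number of fragments for one data frame (fragment headers ignored)."""
    return math.ceil(frame_bytes / fragment_bytes)


if __name__ == "__main__":
    for kbps in (128, 256, 512):
        size = fragment_size_bytes(kbps)
        print(f"{kbps} kbps: {size}-byte fragments, "
              f"{fragments_needed(1500, size)} per 1500-byte frame, "
              f"wait <= {worst_case_wait_ms(size, kbps):.0f} ms")
```

Because the fragment size scales with the link speed, the worst-case wait stays at 8 ms regardless of the link; only the number of fragments per data frame changes.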

There are four sources of packet loss that must be dealt with. First, packet loss can be due to transmission errors. Link clock slipping, when the end device isn't properly synchronized to the network, also causes packet loss at regular intervals. Fortunately, low error rate links are generally available domestically and over many international links. Second, packets can be lost during link and node failures. The solution rests in highly reliable switch design and in fully leveraging networking techniques to minimize the impact of failures. Third, packets can be lost due to network congestion, which can result if unplanned traffic patterns are generated by users and applications. Last, VoIP packets that arrive at their destination too late to be useful are simply thrown away -- this is an underflow condition, and is sometimes referred to as "jitter buffer packet loss." QoS is a solution to these last two areas. In addition, the specific implementation of the receiver's jitter buffer has a strong bearing on whether a particular packet is usable or "just too late." From the end user's perspective, there is no functional difference between packets lost in the network and packets discarded at the jitter buffer. The difference is important, however, when trying to troubleshoot the network and fix the problem. In any case, packet loss, unless very infrequent, is a VoIP-killing impairment.
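The distinction between network loss and "too late" loss can be illustrated with a deliberately simplified model of a fixed jitter buffer. Everything in this sketch is an assumption for illustration: the function name, the fixed playout delay, and the idea that sender and receiver clocks are synchronized. Real jitter buffers are usually adaptive and more involved.

```python
def classify_packets(send_ms, arrive_ms, playout_delay_ms):
    """Label each packet 'played', 'late', or 'lost'.

    send_ms[i]   -- time packet i was sent
    arrive_ms[i] -- time it arrived, or None if the network dropped it
    A packet is usable only if it arrives by send time + playout delay.
    """
    labels = []
    for sent, arrived in zip(send_ms, arrive_ms):
        if arrived is None:
            labels.append("lost")     # dropped in the network
        elif arrived > sent + playout_delay_ms:
            labels.append("late")     # arrived, but missed its playout time
        else:
            labels.append("played")
    return labels


if __name__ == "__main__":
    send = [0, 20, 40, 60, 80]          # one packet every 20 ms
    arrive = [30, 55, None, 150, 112]   # hypothetical arrival times
    print(classify_packets(send, arrive, playout_delay_ms=60))
    # ['played', 'played', 'lost', 'late', 'played']
    print(classify_packets(send, arrive, playout_delay_ms=100))
    # ['played', 'played', 'lost', 'played', 'played']
```

In this example a deeper buffer rescues the fourth packet at the cost of 40 ms more delay, while the third packet is lost either way. The listener hears a gap in both cases, but only the labels tell a troubleshooter whether to look at congestion or at jitter.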

Since Frame Relay is seldom used as purely a point-to-point link (hub and spoke configurations are much more common), and since Frame Relay QoS services have limited deployment, running VoIP over Frame Relay is somewhat problematic. The "large packets on a slow link" issue can only be addressed by constraining the frame size on particular virtual circuits on an end-to-end basis. The multiplexed nature of Frame Relay and the limited availability of QoS mean that a remote site with heavy data traffic can take bandwidth away from, and thus impact, VoIP traffic from another remote site -- the only common factor is the shared access link at the central site location. QoS can be used by the transmitting devices at the edge of the network with positive results, but this does not address QoS prioritization from the network towards the user, nor across the network.

Frame Relay services are based on assigning Committed Information Rates (CIRs) to each virtual circuit. The CIR is a guaranteed throughput; traffic in excess of it can be discarded by the network. The principle is that the enterprise can implement traffic shaping (by delaying non-voice traffic) to smooth the transmitted traffic within the CIR -- but this is rarely done. Enterprises have learned to use Frame Relay very effectively for data, and often subscribe to zero CIR, on the basis that the service provider over-engineers the backbone network. Even when CIR is used, a common practice is to over-subscribe (i.e., under-engineer) access links at the central site. This is driven by the desire to save money, since CIR is a tariffed item. The law of large numbers and the temporal spacing of demand often make this a safe and wise game to play -- for data. These Frame Relay realities can make for a harsh environment for VoIP, because the voice frames have no way of being protected. But all is not lost.

The ideal method of supporting VoIP over Frame Relay is to overlay a separate set of virtual circuits for VoIP, with CIR set for each to support the voice traffic, and QoS (if available) enabled on a per virtual circuit basis. The enterprise would turn on traffic shaping, giving VoIP priority over data traffic. Local packet fragmentation (now defined but not yet widely available) could be used. An even more sophisticated bandwidth management approach uses traffic pacing, whereby the transmitter spreads out the transmission of data over a longer, measured time interval to prevent data traffic from bunching up at the remote slower speed location.
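Traffic shaping with voice priority can be sketched as follows. This is a simplified illustration, not any vendor's implementation: the function name and parameters are hypothetical, all frames are assumed to be queued at the start, and a frame larger than one interval's budget would never be sent. The budget per interval is the CIR multiplied by the interval length.

```python
from collections import deque


def shape_intervals(voice, data, cir_bps, interval_ms, intervals):
    """Send at most CIR * interval bits per interval, voice before data.

    voice, data -- lists of queued frame sizes in bytes
    Returns one list per interval of (kind, size_bytes) tuples sent.
    """
    budget_bits = cir_bps * interval_ms / 1000   # committed burst per interval
    voice_q, data_q = deque(voice), deque(data)
    schedule = []
    for _ in range(intervals):
        remaining = budget_bits
        sent = []
        # Voice is drained first; data only uses what is left of the budget.
        for kind, queue in (("voice", voice_q), ("data", data_q)):
            while queue and queue[0] * 8 <= remaining:
                size = queue.popleft()
                remaining -= size * 8
                sent.append((kind, size))
        schedule.append(sent)
    return schedule


if __name__ == "__main__":
    # Hypothetical load: six 68-byte voice frames, eight 512-byte data fragments,
    # shaped to a 128 kbps CIR in 125 ms intervals (16,000 bits each).
    plan = shape_intervals([68] * 6, [512] * 8, 128_000, 125, 3)
    for i, sent in enumerate(plan):
        kinds = [kind for kind, _ in sent]
        print(f"interval {i}: {kinds.count('voice')} voice, {kinds.count('data')} data")
```

In this run all six voice frames go out in the first interval, and the data fragments are delayed across three intervals so that the total never exceeds the CIR.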

So there are some viable approaches for running VoIP over Frame Relay. And the market continues to evolve. Service providers such as Sprint are offering IP-enabled Frame Relay services, which replace a partial mesh of virtual circuits with a single virtual circuit per site into a QoS-enabled IP-routed cloud. Such a service would provide an interesting, likely more cost-effective, converged network solution.

The bottom line on running VoIP over Frame Relay is to engineer the network carefully, leveraging access link speeds, CIR, traffic shaping and pacing, and, where available, IP-enabled Frame Relay services, QoS, and fragmentation.

Enterprises deploying IP telephony for business advantage need IP telephony-grade infrastructures to deliver the expected voice quality and reliability. In WAN environments, this means engineering the wide area physical and virtual connections to provide the basic bandwidth required for voice and critical data traffic, and mapping QoS deployed across the LAN into WAN capabilities. In this way, enterprises can leverage their leased line and Frame Relay WANs to gain the advantage of converged networking. They can also leverage IP VPNs for telecommuters and road warriors with broadband access. In the longer term, extending their campus networks to metro sites through Optical Ethernet services is a very attractive opportunity.

Tony Rybczynski is Director of Strategic Enterprise Technologies at Nortel Networks. He has over 30 years of experience in the application of packet network technology. Matthew F. Michels is a Senior Consulting Engineer in Nortel Networks' Succession Network Design and Systems Engineering group. For more information visit
