VoIP In The Call Center: A Guide To Call Quality And SLA Metrics
By Bob Massad, Telchemy
Use of Voice over IP (VoIP) is spreading. Not only are more people
turning to VoIP in general, but also it's moving into specific areas of
the enterprise. Call centers represent one such area. We at Telchemy
believe that applications in call centers will be a major market driver
for VoIP over the next several years. Industry forecasters have noted that
in a few years, call centers may account for a third of VoIP use. This is
all part of the 'IP-everywhere' trend in the communications world. To be
sure, there is no wholesale changeout under way, but a steady migration
to IP-based services is evident.
The benefits of VoIP for the enterprise, and for the call center
specifically, are many. Not only does VoIP provide a quality experience at
low cost, but it also makes better use of existing infrastructure and
reduces how much of it is needed. There is no longer a need for separate,
parallel 'telecom' and 'data comm' infrastructures and management teams,
for example. IP networks also provide a more open and efficient
environment that is less expensive and less complex, quicker to deploy,
and richer in application potential than the closed, circuit-switched
world of the PSTN.
Given the move to VoIP, how do we ensure that it is successful? That
is, how do we ensure that the quality is on par with the PSTN and that the
service expected (and paid for) is the service delivered?
Call Quality
Call quality measurement comes first. We need to understand how to measure
it and where to measure it. From that understanding, we can construct
service level agreements (SLAs) that focus on the metrics that are
important and accurately reflective of call quality.
Call quality, often misrepresented under the moniker of Quality of
Service (QoS), should be measured against real traffic and in real time.
There are tools available that inject pre-recorded audio files into the
network, record the received file, either at the other end or looped back
to the source, and then do a mathematical comparison of the sent and
received files. This process is complex and slow, and it is not
necessarily indicative of real traffic.
It is much better to build call quality metering technology into voice
end points, such as a VoIP gateway, media server or IP phone, and/or into
'on the wire' analyzer and probe devices that are normally installed
between domains and have access to live calls. It is often beneficial to
use both types of devices. For example, having metrics available in both
allows for quick 'problem isolation': knowing that the call was good at
the access point but not good at the end point may isolate the problem to
the local network. If the metrics show poor quality at both points, then
the problem may be in the access network.
Moreover, analyzers and probes can then be configured to capture traffic
at offending levels or from offending sources for detailed analysis and
problem resolution.
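
The isolation logic just described can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the
function name, the use of a single quality score per measurement point,
and the threshold of 80 are ours, not part of any particular product.

```python
def isolate_problem(access_point_score, end_point_score, good_threshold=80.0):
    """Guess where a call was degraded from two quality readings.

    Both scores are assumed to be on a 0-100 quality scale; the threshold
    of 80 is an illustrative assumption.
    """
    access_ok = access_point_score >= good_threshold
    end_ok = end_point_score >= good_threshold

    if access_ok and end_ok:
        return "no problem detected"
    if access_ok and not end_ok:
        # Good where the call entered the site, bad at the phone or gateway.
        return "possibly the local network"
    if not access_ok and not end_ok:
        # Already bad on arrival at the access point.
        return "possibly the access network"
    # Bad at the access point but good at the end point should not happen.
    return "inconsistent readings"


print(isolate_problem(access_point_score=88, end_point_score=62))
print(isolate_problem(access_point_score=61, end_point_score=59))
```

In practice the two readings would come from the end point agent and the
probe for the same call, which is why having both kinds of device matters.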
Once we've determined how (real traffic) and where to monitor (end
points and aggregation points between domains), we need to understand what
specifically to monitor. VoIP and video over IP traffic are of a different
nature than traditional packet network traffic, even though they are
packetized. Traditional IP traffic has been non-real-time, TCP-based
traffic (such as e-mail and file transfer) that is not 'perceptual' in
nature, whereas VoIP traffic is necessarily real-time, UDP-based and
totally perceptual in nature (i.e., 'could I hear what was said?').
Given that, the traditional utilization-based approaches and metrics do
not apply. Reporting the number of bytes or packets sent or lost, and
reporting the averages per call or per link are not of much help. Some
tools extend the traditional data set to include delay and jitter, but
these are not very helpful on their own either. None of the traditionally
available tools correlates these statistics so that a quick and clear
indication of call quality is possible, especially for nontechnical
staff. And none includes the end user's perception of the call. Let's
look a bit closer to see why.
Perception
Previously we noted that packet loss, delay and jitter metrics, taken by
themselves, are poor indicators of call quality. Consider jitter first.
End points such as IP phones or media gateways employ a jitter buffer,
sometimes called a de-jitter buffer. Its task is to remove jitter from
the listener's point of view. To do that, it adds a reasonable amount of
packet delay, on the order of 50 to 150 milliseconds (the upper boundary
on what would be considered 'toll quality'). It then plays out those
packets at a 'constant' rate, discarding any packets that have exceeded
the jitter buffer's delay boundary. Thus, the listener does not directly
experience jitter. The jitter has been turned into additional delay, and
excessive delay has been turned into packet loss.
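
A toy simulation makes the point concrete. The sketch below models a
simplified fixed jitter buffer; the function name, the sample delays and
the rule of timing play-out from the first packet are illustrative
assumptions, not a description of any real end point.

```python
def play_out(network_delays_ms, buffer_ms=100):
    """Simulate a simplified fixed jitter buffer.

    network_delays_ms: one-way network delay of each packet, in ms.
    buffer_ms: extra delay the buffer adds on top of the first packet's
               network delay (assumed value).
    Returns (playout_delay_ms, packets_played, packets_discarded).
    """
    # Every packet is played at the same fixed delay after it was sent,
    # so the listener hears a constant rate regardless of jitter.
    playout_delay_ms = network_delays_ms[0] + buffer_ms

    # A packet that arrives after its play-out time is useless and dropped.
    played = sum(1 for d in network_delays_ms if d <= playout_delay_ms)
    discarded = len(network_delays_ms) - played
    return playout_delay_ms, played, discarded


delays = [20, 35, 22, 90, 25, 180, 21, 40]   # made-up jittery arrivals
print(play_out(delays, buffer_ms=100))
```

With these sample values, every surviving packet is heard at the same
fixed delay, and the one packet that took 180 ms is discarded as if the
network had lost it.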
Taking delay a bit further, we find that delay relates not so much to
call quality (i.e., the listener's ability to discern exactly what was
said) as to conversational quality (i.e., the ability to carry on a
smooth conversation in which the person speaking changes based on some
occasional 'cue'). When delay is too long, people have a tendency to
think they missed a cue and will start or resume speaking. When the cue
finally arrives, it collides with the started or resumed speech of the
other party, causing what is called 'double talk.' The phonetics are
fine; the two parties are simply out of sync.
The last area of concern is packet loss. Here the issue becomes a bit
subtler. At a high level, packet loss is the issue that has the most
impact when determining call quality. But at a detailed level, it's a
specific kind of packet loss we need to be concerned about, and we need to
understand its specific source.
Packet loss is generally classified as randomly occurring (a packet
lost here or there) or as occurring in bursts (several consecutive packets
are lost). For VoIP, the real problem is burst loss. VoIP end points
employ codecs (coder/decoder) that typically implement packet loss
concealment (PLC) algorithms. The algorithms may, in the event of packet
loss, replay the last packet, insert comfort noise, interpolate, etc. In
effect, they render packet loss unnoticeable. For example, using the
values provided under ITU standard G.113, which defines network
transmission impairments such as packet loss and their impact on call
quality, a standard G.711 codec without PLC in a 2 percent loss scenario
has an impairment rating of 35, whereas with PLC in the same scenario its
impairment rating is only 7. The moral: be sure to deploy codecs that are
PLC-enabled.
One can see intuitively that these algorithms are only useful for random
or isolated packet loss. With burst loss, replaying the last packet
several times would produce a stuttering or 'rrrrrr' effect, continuous
insertion of comfort noise would produce extended silence, and
interpolation would lose its frames of reference. So clearly, the most
important network factor in call quality metering is burst packet loss.
It also turns out that most lost packets occur in bursts. Studies have
shown that approximately 50 percent of lost packets occur in the top 1
percent of burst loss occurrences. Other studies have shown that the
average burst spans seven or eight packets, well beyond what packet loss
concealment can handle.
In traditional monitoring tools and agents, packet loss is presented in
terms of percentage loss, average loss and total loss. None of these
measurements contains the notion of 'burst.' Moreover, as data points,
they range from unenlightening in the total loss case to misleading in
the percentage or average cases. For example, if one assumes 5 percent
packet loss in a 1,200 packet call, one would deduce that 60 packets were
lost and then likely deduce that roughly 1 packet in every 20 was lost.
From our previous discussion, we know that PLC will handle this situation
easily, rendering the loss unnoticeable. So, this may be construed as
being a 'toll quality' call. However, the actual packet loss distribution
may reveal that there were 6 bursts of 8 lost packets and 2 bursts of 6
lost packets (48 + 12 = 60 packets, the same 5 percent), all beyond the
scope of PLC, and therefore it was really a very poor call. The point is
that to understand and measure call quality, the monitoring device or
system must account for burst loss in order to be accurate.
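
The 1,200-packet example can be reproduced with a short Python sketch.
The helper names and the 'plc_limit' parameter (the longest burst we
assume PLC can conceal) are hypothetical, chosen only for illustration;
real concealment limits depend on the codec and implementation.

```python
from itertools import groupby


def loss_bursts(received):
    """Return the length of each run of consecutive lost packets.

    received: sequence of booleans, True if the packet arrived.
    """
    return [len(list(run)) for arrived, run in groupby(received) if not arrived]


def summarize(received, plc_limit=2):
    """Report average loss alongside burst statistics.

    plc_limit is an assumed value, not taken from any standard.
    """
    bursts = loss_bursts(received)
    return {
        "loss_percent": 100.0 * sum(bursts) / len(received),
        "bursts": len(bursts),
        "longest_burst": max(bursts, default=0),
        "lost_beyond_plc": sum(b for b in bursts if b > plc_limit),
    }


def build_call(total, burst_lengths):
    """Place the given loss bursts at evenly spaced offsets in a call."""
    received = [True] * total
    spacing = total // (len(burst_lengths) + 1)
    for i, length in enumerate(burst_lengths, start=1):
        start = i * spacing
        received[start:start + length] = [False] * length
    return received


# Call A: every 20th packet lost (isolated losses).
spread = [i % 20 != 0 for i in range(1200)]
# Call B: 6 bursts of 8 packets and 2 bursts of 6 packets.
bursty = build_call(1200, [8] * 6 + [6] * 2)

print(summarize(spread))
print(summarize(bursty))
```

Both calls report the same loss percentage, which is exactly why the
percentage alone is misleading; only the burst figures tell them apart.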
Other Perceptual Factors
There are other factors that contribute to call quality and must be taken
into account, i.e., correlated with burst loss, for accurate call quality
measurement. One such factor is codec type. Different codec types suffer
different levels of call quality degradation for similar packet loss
levels. This is because most codecs differ in compression level. The
greater the compression, which is implemented to save link bandwidth, the
more voice information is lost per loss event. For example, a standard
G.711 PCM codec is uncompressed. It consumes 64 Kbps. G.729A consumes
only 8 Kbps for the same amount of voice and therefore can pack more
voice into a packet. Thus, each packet lost with G.729A encoding loses
much more voice than one lost with G.711. For example, according to
G.113, a G.711 codec with PLC in a 2 percent packet loss environment
registers an impairment value of 7, while a G.729A codec registers an
impairment value of 19 in the same loss scenario.
By correlating the relevant factors, both with each other and in real
time, the VoIP call quality monitor can provide a single-number,
accurate, clear and unambiguous quality rating of the call. These ratings
have been the subject of much study, have been standardized by
international standards bodies such as ITU and ETSI, and are called 'R'
factors. They are very much like school test grades. For example, an 'R'
of 80 is a toll quality or good call; an 'R' of 60 is a very poor quality
call. Measuring call quality should be that simple.
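
As a rough illustration of how the impairment values quoted earlier turn
into a single number, the sketch below subtracts each impairment from a
starting rating. It is a deliberate simplification: the starting value of
93.2 is our assumption (the default rating of the ITU-T G.107 E-model
with no added impairments), and delay, burstiness and every other factor
are ignored.

```python
# Assumption: default E-model rating when nothing else degrades the call.
DEFAULT_R = 93.2

# Impairment values at 2 percent packet loss, as quoted in this article
# from G.113.
IMPAIRMENT_AT_2_PERCENT_LOSS = {
    "G.711 without PLC": 35,
    "G.711 with PLC": 7,
    "G.729A": 19,
}


def r_factor(codec):
    """Simplified rating: starting value minus the codec/loss impairment."""
    return DEFAULT_R - IMPAIRMENT_AT_2_PERCENT_LOSS[codec]


for codec in IMPAIRMENT_AT_2_PERCENT_LOSS:
    print(f"{codec}: R = {r_factor(codec):.1f}")
```

Read against the 80 and 60 grades above, G.711 with PLC stays in 'good'
territory at 2 percent loss, while G.711 without PLC falls below the
'very poor' mark.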
Summary
Three key points have emerged for making VoIP successful in the call
center, and across the enterprise in general. The first is to build voice
quality monitoring capabilities or agents into VoIP end points such as
gateways, IP phones and media servers, as well as into 'on the wire'
analyzers and probes, so that live traffic can be monitored in real time
and problems can be isolated and resolved. The second key point is that
for call quality metrics to be an accurate and direct indication of call
quality, they must consider burst packet loss, i.e., the actual packet
loss distribution. The third key point is that the truly relevant factors
in measuring call quality must be correlated against each other and in
real time to allow for a single-number metric of call quality.
Bob Massad is vice president of marketing for Telchemy. For more
information, please visit their Web site at www.telchemy.com.