This article originally appeared in the December 2010 issue of Unified Communications Magazine
Until now, mobile operators have distinguished themselves through geographic coverage, feature-rich handsets, and low subscription costs. In the U.S., Verizon’s recent ads touting its coverage, AT&T’s exclusive contract with Apple, and Sprint’s announcement that it will continue to offer unlimited data plans are some examples. However, as mature markets reach saturation, and as operators find ways to offer devices and coverage comparable to those of their competitors, some have started to look at voice quality as a way to distinguish themselves and gain a competitive advantage, even if only for a short period of time.
Initial experiments have shown great satisfaction on the part of subscribers. At the same time, Skype and other IP telephony applications have given the general public a taste of the benefits of high-definition audio. Thanks to Skype’s use of HD vocoders, calls placed from one Skype user to another exhibit much higher voice quality than landline-to-landline calls, which have been the so-called gold standard for voice communication until very recently.
High-definition voice, standardized in 2001 by 3GPP and known as adaptive multi-rate wideband, or AMR-WB, is an opportunity for wireless operators to leapfrog over landline telephony, offering subscribers voice quality comparable to Skype-to-Skype calls. In traditional telephony, the input voice is bandpass filtered to remove components below 200Hz and above 3400Hz. The filtered signal is then sampled at 8,000 samples per second and quantized at 8 bits per sample to generate the well-known 64kbps pulse code modulation (PCM) signal. In AMR-WB, the bandpass filter is set to 50-7000Hz, and the filtered signal is sampled at 16,000 samples per second. Subjective studies have concluded that adding the 50-200Hz band at the low end gives the impression of being in the same room with the speaker, while extending the high end from 3400Hz to 7000Hz improves intelligibility.
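The arithmetic behind these figures can be checked with a short sketch. The function names below are hypothetical and chosen only for illustration; the numbers are the ones quoted above.

```python
# Illustrative arithmetic only; the function names are hypothetical.

def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int) -> int:
    """Bit rate of uncompressed PCM, in bits per second."""
    return sample_rate_hz * bits_per_sample


def satisfies_nyquist(sample_rate_hz: int, highest_freq_hz: int) -> bool:
    """True if the sampling rate is at least twice the highest passband frequency."""
    return sample_rate_hz >= 2 * highest_freq_hz


# Traditional telephony: 8,000 samples per second at 8 bits per sample.
narrowband_bps = pcm_bit_rate(8000, 8)

# Both filter/sampling pairs leave headroom above the passband edge.
narrowband_ok = satisfies_nyquist(8000, 3400)    # 8000 >= 6800
wideband_ok = satisfies_nyquist(16000, 7000)     # 16000 >= 14000
```

Here `narrowband_bps` works out to 64,000 bits per second, the 64kbps PCM rate, and both sampling rates are more than twice the upper edge of their respective filters.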
The AMR-WB codec specifies nine bit rates, from 6.6kbps up to 23.85kbps. Thanks to its wider frequency range, AMR-WB provides much better voice quality than the best narrowband vocoders operating at comparable or even higher data rates. For example, AMR-WB at 8.85kbps outperforms traditional narrowband AMR at 12.2kbps in subjective voice quality tests.
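As a rough illustration, the nine rates can be turned into the number of speech bits carried in each frame. The article gives only the lowest and highest rates; the intermediate values below are the modes listed in the 3GPP AMR-WB specification, and the 20-millisecond frame length is the block size typical of mobile vocoders. The function name is hypothetical.

```python
# The nine AMR-WB modes in kbps. Only the endpoints appear in the article;
# the intermediate values are taken from the 3GPP AMR-WB specification.
AMR_WB_MODES_KBPS = (6.60, 8.85, 12.65, 14.25, 15.85, 18.25, 19.85, 23.05, 23.85)

FRAME_MS = 20  # frame length in milliseconds


def bits_per_frame(rate_kbps: float, frame_ms: int = FRAME_MS) -> int:
    """Speech bits per frame: kbps multiplied by milliseconds gives bits."""
    return round(rate_kbps * frame_ms)


frame_sizes = {rate: bits_per_frame(rate) for rate in AMR_WB_MODES_KBPS}
```

Under these assumptions a frame ranges from 132 bits at 6.6kbps to 477 bits at 23.85kbps, compared with 1,280 bits for 20 milliseconds of 64kbps PCM.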
AMR-WB is easy to implement when a call is placed from one mobile to another and both are on the same operator’s network. However, given the number of companies providing voice services – fixed line, mobile, over-the-net, corporate PBX/Centrex, voicemail boxes, conference bridges, etc. – it will be years before all interconnections are based on standards that support high-definition voice.
In the meantime, many calls that originate as HD will pass through interconnections that are based on traditional 64kbps PCM transport. The near-end switch has no easy way to determine whether the far-end device supports AMR-WB. It will therefore have to filter the voice signal to 200-3400Hz and sample it at 8,000 samples per second. The downsampling will negate all the advantages of high-definition voice described above, even if the signal could later be converted back to AMR-WB. Since the AMR-WB signal, even at its highest data rates, would easily fit into a 64kbps PCM channel, this downsampling would be unnecessary if there were a way for the near-end gateway to query the far-end gateway about its wideband capabilities.
Fortunately, there is a solution to this problem, based on an extension to a standard that was developed years ago to address a different problem – the rapid degradation in voice quality that results from passing a voice stream through various sets of narrowband encoders and decoders.
As mobile voice traffic is migrated over to packet-switched networks, HD voice can be carried end-to-end without the need for transcoding. However, in the short term, mobile operators that want to offer HD voice to more than a limited subscriber base have no choice but to enable tandem-free operation, or TFO, in their networks. In TFO, two mobile switches use in-band signaling to exchange information about the mobiles’ vocoder capabilities, select a common vocoder, and inform the two mobile devices. The compressed voice frames can then bypass the network-side encoders and decoders, carried in the least significant bit(s) of the 64kbps PCM channel.
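The selection step and the use of the least significant bits can be sketched as follows. This is a simplified model, not the TFO protocol itself: the preference order and function names are hypothetical, and the bit count ignores the framing and signaling overhead that real TFO frames carry, so an actual allocation may be larger.

```python
import math

PCM_SAMPLE_RATE_HZ = 8000  # each bit position taken from every sample carries 8000 bits/s

# Hypothetical preference order, best expected quality first.
PREFERENCE = ("AMR-WB", "AMR", "EFR", "FR", "HR")

# Highest speech bit rate of each codec in kbps (speech bits only, no overhead).
TOP_RATE_KBPS = {"AMR-WB": 23.85, "AMR": 12.2, "EFR": 12.2, "FR": 13.0, "HR": 5.6}


def select_common_codec(local, remote, preference=PREFERENCE):
    """Return the most preferred codec both sides support, or None.

    None models the fallback case: no common vocoder, so the call stays
    on ordinary transcoded 64kbps PCM.
    """
    common = set(local) & set(remote)
    for codec in preference:
        if codec in common:
            return codec
    return None


def lsbs_needed(rate_kbps: float) -> int:
    """Least significant bits per PCM sample needed to carry rate_kbps."""
    return math.ceil(rate_kbps * 1000 / PCM_SAMPLE_RATE_HZ)


chosen = select_common_codec({"AMR-WB", "AMR", "EFR"}, {"AMR", "EFR", "FR"})
bits = lsbs_needed(TOP_RATE_KBPS[chosen]) if chosen else 0
```

In this example the two sides settle on narrowband AMR, whose 12.2kbps stream fits in two of the eight bits of each PCM sample. Under the same simplification, even the 23.85kbps AMR-WB mode needs only three.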
Most mobile vocoders use a block size of 20 milliseconds. Many, including narrowband AMR, also use a look-ahead buffer of 5 milliseconds. This means that every encoder must collect 20-25 milliseconds of data before applying the vocoder algorithm, which itself introduces a few milliseconds of delay. At the decoder, the entire 20-millisecond frame must arrive before the decoding process can begin. This introduces another delay, which may or may not be significant, depending on the speed of the interface and the capabilities of the processor. Other sources of delay include interleaving, time division multiplexing, and queues and buffers in the transmission path.
Some users start to express dissatisfaction with speech quality when delay exceeds 150 milliseconds. In general, a one-way delay of 250 milliseconds is considered the absolute maximum that should be budgeted. While delay is a function of many factors, and varies from network to network, having more than two pairs of encoders and decoders can easily bring the delay close to the ideal limit of 150 milliseconds. Every tandem vocoding stage that can be avoided reduces the overall delay by at least 25 to 50 milliseconds.
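A back-of-the-envelope version of this delay budget, using only the figures quoted above, might look like the sketch below. The constant and function names are hypothetical, and the model counts vocoding delay alone, leaving out interleaving, multiplexing, and queuing.

```python
FRAME_MS = 20                  # vocoder block size
LOOKAHEAD_MS = 5               # look-ahead buffer used by narrowband AMR, among others
STAGE_DELAY_MS = (25, 50)      # delay range per tandem vocoding stage, as quoted above
COMPLAINT_THRESHOLD_MS = 150   # delay at which some users start to complain
MAX_BUDGET_MS = 250            # absolute maximum one-way delay to budget


def encoder_buffering_ms() -> int:
    """Audio the encoder must collect before it can start coding a frame."""
    return FRAME_MS + LOOKAHEAD_MS


def tandem_delay_range_ms(stages: int) -> tuple:
    """Lowest and highest vocoding delay for a given number of tandem stages."""
    low, high = STAGE_DELAY_MS
    return stages * low, stages * high


def stages_reaching_threshold(threshold_ms: int = COMPLAINT_THRESHOLD_MS) -> int:
    """Smallest stage count whose worst-case delay reaches the threshold."""
    stages = 1
    while tandem_delay_range_ms(stages)[1] < threshold_ms:
        stages += 1
    return stages
```

With these figures, three tandem stages contribute between 75 and 150 milliseconds, which is why more than two encoder/decoder pairs can consume the entire 150-millisecond allowance before any transmission delay is counted.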
Another reason to avoid tandem vocoding is the cost of the dedicated processing needed for real-time encoding and decoding of speech. Since real-time voice compression and decompression are very processor-intensive, the equipment manufacturer can realize significant cost savings by minimizing the need for this functionality in network equipment – cost savings that can be passed on to the operator. An AMR-WB vocoder requires twice the processing of a narrowband AMR vocoder. Therefore, introducing AMR-WB without widespread use of TFO would require roughly a doubling of processing resources in the operator’s transcoder and rate adapter unit. On the other hand, TFO, even when applied to narrowband voice calls, will free up vocoder resources that can then be allocated to AMR-WB calls.
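The resource trade-off can be illustrated with a toy model. It assumes, as stated above, that an AMR-WB call costs twice the processing of a narrowband AMR call, and it further assumes (a simplification for illustration only) that a call running in TFO needs no transcoding at all. The function name and the call counts in the tests are hypothetical.

```python
WIDEBAND_COST = 2.0    # AMR-WB costs twice the processing of narrowband AMR
NARROWBAND_COST = 1.0  # narrowband AMR is the unit of measure


def transcoder_load(nb_calls: int, wb_calls: int, tfo_fraction: float = 0.0) -> float:
    """Processing load in narrowband-call units.

    tfo_fraction is the share of calls running tandem-free, which this toy
    model treats as needing no transcoding resources.
    """
    if not 0.0 <= tfo_fraction <= 1.0:
        raise ValueError("tfo_fraction must be between 0 and 1")
    total = nb_calls * NARROWBAND_COST + wb_calls * WIDEBAND_COST
    return total * (1.0 - tfo_fraction)
```

In this model, moving 100 calls from narrowband to AMR-WB without TFO doubles the load from 100 to 200 units, while running half of a mixed 50/50 population tandem-free brings the load down to 75 units, below the all-narrowband starting point.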
Enabling TFO in the network will bring with it additional benefits for the operator’s entire subscriber base. Once TFO is enabled, other equipment interfacing to the mobile network over circuit-switched connections can also take advantage of this feature by essentially pretending to be another TFO-capable mobile switch. This will allow the third-party equipment to receive the voice signal as it was originally encoded by the mobile handset, without the first-stage decoding. Since TFO signaling is in-band on the 64kbps PCM channel, any device connected to this channel can listen for TFO handshaking messages and respond accordingly.
In some applications, such as digital circuit multiplication equipment (DCME) on long-distance international links, the primary beneficiary will be narrowband voice users. Anyone who places frequent international calls to mobile phones, especially in emerging markets, knows that voice quality can vary from barely acceptable to completely unintelligible. This is in part due to the third tandem vocoder on the international link, which can use quite aggressive compression rates, especially during peak hours. In other applications, such as conference bridges, HD voice users will benefit the most. With TFO, the conference bridge can signal the mobile network not to convert the voice stream into PCM. If most of the conference participants are using HD voice-capable handsets, these participants will enjoy much higher speech quality whenever one of them is speaking. A third application, and one where TFO can benefit both narrowband and HD voice users, is speech recognition. TFO will improve the reliability of speech recognition when the speaker is on a mobile handset.
These three cases are just some examples of the benefits of enabling TFO in the mobile network and adding this capability to third-party devices. Unfortunately, this has not been possible until now because most operators have not yet enabled TFO in their networks. But with the introduction of high-definition voice in the form of AMR-WB, mobile operators will have no choice but to enable tandem-free operation. Once TFO has been enabled, even subscribers who don’t have AMR-WB handsets will benefit. Narrowband mobile-to-mobile calls will experience improved voice quality and lower delay. Manufacturers of third-party equipment such as DCME devices and media gateways, conference bridges, speech recognition platforms, and others can incorporate TFO into their products and offer an improved user experience. Finally, the mobile operators will benefit economically by making more efficient use of their existing codec hardware.
Aram Falsafi is a network consultant at Tellabs (www.tellabs.com).
The above story is an excerpt of a longer piece, which can be found at http://voice-quality.tmcnet.com/topics/voice-quality/articles/120426-high-definition-voice-rollout-will-benefit-all-mobile.htm
Edited by Stefania Viscusi