Audio Over IP and the SIP Environment

By TMCnet Special Guest Johannes Rietschel | September 01, 2011

This article originally appeared in the Sept. 2011 issue of INTERNET TELEPHONY

The network-everywhere approach has infiltrated our lives. Most of us negotiate a network connection at some point every day. Negotiation is the very nature of the Session Initiation Protocol, or SIP as we fondly know it, which exists primarily to negotiate a voice or video connection between two or more points.

SIP is best known in the business world for signaling phone calls in the IP domain. What is less often understood is that SIP has little to do with the audio transfer itself. SIP handles the call setup negotiation between two or more devices at the initiation phase, signals call events (transfer requests, for example) during the connection, and handles call termination. It plays no role in the actual audio/video delivery.

In the SIP environment, audio is almost always carried using the Real-time Transport Protocol (RTP), with parameters and details negotiated using SIP. The combined power of SIP and RTP can support VoIP as well as video connections and audio contribution to studio environments. From this perspective, SIP provides a flexible architecture to enable device communication, mainly for point-to-point connections.
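To make that division of labor concrete, here is a minimal Python sketch of the 12-byte fixed header that RTP places in front of each audio payload. The function names and the SSRC value are illustrative assumptions; payload type 0 is the static assignment for G.711 u-law (PCMU). Nothing here involves SIP, which only agrees on values such as the payload type before the stream starts.

```python
import struct

RTP_VERSION = 2  # the only RTP version in general use


def build_rtp_header(payload_type, sequence, timestamp, ssrc, marker=False):
    """Pack the 12-byte fixed RTP header (no CSRC list, no extension)."""
    byte0 = RTP_VERSION << 6  # version=2, padding=0, extension=0, CSRC count=0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack(
        "!BBHII",  # network byte order: 1 + 1 + 2 + 4 + 4 = 12 bytes
        byte0,
        byte1,
        sequence & 0xFFFF,
        timestamp & 0xFFFFFFFF,
        ssrc & 0xFFFFFFFF,
    )


def parse_rtp_header(packet):
    """Unpack the fixed header from the first 12 bytes of an RTP packet."""
    byte0, byte1, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


# One 20 ms frame of 8 kHz audio advances the timestamp by 160 samples.
header = build_rtp_header(payload_type=0, sequence=1, timestamp=160, ssrc=0x1234ABCD)
fields = parse_rtp_header(header + b"\x00" * 160)  # header plus a dummy payload
```

The receiver needs only the sequence number and timestamp to reorder packets and play them out at the right time, which is why the audio path can run without any further help from the signaling side.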

The role of audio over IP in the SIP universe can be quite interesting, and very useful for mixing applications such as paging and public address with phone integration and value-added services like background music and priority communications. Audio over IP goes beyond session initiation to network-based audio distribution. It can be connection-oriented or always present for unending audio streams.

SIP experts are concerned with bridges, proxies and central servers, because they need to understand how the parts of the system talk to each other. This is instrumental in ensuring that temporary sessions are established via the protocol. Speaking strictly for voice, SIP is expected to act as the traffic cop for frequent connections, disconnections, busy signals and call re-routing. This is because phone systems must react to commands.

This is mostly not the case with audio over IP distribution.

Making Connections

Our company comes from the audio over IP space. In many cases, point-to-point connection initiation is trivial in our world: We switch on the send and receive devices to establish a connection, and switch them off when we wish to shut down the stream. The audio stream is configured, and audio is sent from one point to another. The sender and the receiver are both known, and two valid IP addresses are all the connection requires. Quality of delivery may be rated higher than low delay, especially when routing over the Internet.

Audio over IP distribution also suits scenarios with delivery to multiple destination points. An office environment might want a paging console that can reach various groups, zones or buildings, while also supporting all-call scenarios. This is easily configured using an IP-based paging console, where voice and other audio (such as background music or music-on-hold services) can be streamed to multiple points.

Merging SIP and audio over IP in the office environment, or in other multi-point scenarios such as hotels and retail settings, delivers a more powerful and comprehensive solution for communications. The integration of a SIP server into an audio over IP distribution network establishes a mediation point with central routing intelligence. This enables the system to recognize that a call is being targeted at a group, while also identifying destinations that are unable to accept such a request due to being busy, offline or otherwise unavailable.
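The central routing decision described here can be sketched in a few lines of Python. This is only an illustration under assumed names: the `Endpoint` class, the state labels and the group table are hypothetical and do not correspond to any particular SIP server.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    name: str
    state: str = "idle"  # assumed states: "idle", "busy" or "offline"


def resolve_group(groups, endpoints, group_name):
    """Split a group's members into those that can take the call and those that cannot."""
    reachable, skipped = [], []
    for member in groups[group_name]:
        endpoint = endpoints[member]
        if endpoint.state == "idle":
            reachable.append(endpoint.name)
        else:
            skipped.append((endpoint.name, endpoint.state))
    return reachable, skipped


# Hypothetical office with one paging group of three speakers.
endpoints = {
    "lobby": Endpoint("lobby"),
    "warehouse": Endpoint("warehouse", state="busy"),
    "cafeteria": Endpoint("cafeteria", state="offline"),
}
groups = {"ground-floor": ["lobby", "warehouse", "cafeteria"]}

reachable, skipped = resolve_group(groups, endpoints, "ground-floor")
```

Keeping the reason next to each skipped destination lets the server decide whether to retry, queue the announcement or simply report the gap.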

On the other hand, adding a dependency on a central server also has downsides. Call setup using a SIP server can be slow compared to direct streaming approaches.

Another strong point is that SIP can negotiate the choice of codec within an audio over IP network. SIP devices typically allow the codec to be negotiated on a per-connection basis, chiefly because bandwidth may be more important than audio quality in telephone communications. In other cases, a communication partner may not support the standard codec used for most connections. Audio over IP devices are intended to transmit very high-quality music, intelligible announcements and other quality-critical audio, such as tones and bells, and thus typically use different codecs than those used for VoIP/phone calls.

SIP can automatically negotiate the use of a matching, VoIP-supported codec when SIP-capable devices are operating within the audio over IP domain. Although naturally inclined to use high-quality audio codecs, the SIP-enabled audio over IP device will accept 8 kHz voice quality if that is what the VoIP system provides.
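A minimal sketch of that fallback, assuming the offer is an ordered list of codec names and the answering device simply takes the first one it supports. The codec lists are illustrative; a real device would read them from the session description exchanged during call setup.

```python
def negotiate_codec(offered, supported):
    """Return the first offered codec the answering device supports, or None."""
    for codec in offered:
        if codec in supported:
            return codec
    return None


# Hypothetical capabilities: the speaker prefers high-quality codecs
# but also understands 8 kHz telephone audio.
speaker_supported = {"MPA/90000", "L16/44100", "PCMU/8000"}

pbx_offer = ["PCMA/8000", "PCMU/8000"]      # what a phone system might offer
encoder_offer = ["MPA/90000", "PCMU/8000"]  # what a music encoder might offer

phone_call_codec = negotiate_codec(pbx_offer, speaker_supported)
music_codec = negotiate_codec(encoder_offer, speaker_supported)
no_match = negotiate_codec(["G729/8000"], speaker_supported)
```

When no codec matches, the call cannot be set up at all, which is why audio over IP devices that join a phone system usually keep at least one telephony codec available.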

SIP Gateway for Audio over IP

The introduction of a SIP gateway is the most effective way to merge audio over IP distribution with an already established SIP-based phone system. This is most useful when you are directing background music, paging and public address to loudspeaker systems as opposed to simply routing phone calls.

Let’s assume that you have 100 devices in an office-wide communications system, with high-quality, MP3-encoded background music streaming to each device from 10 channels. This can be easily accomplished without any need for SIP by using 10 multicast senders and 100 receivers with local control for channel selection.
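As a rough sketch of what local channel selection can look like on a receiver, the Python below maps each of the 10 channels to its own multicast group and joins the chosen one. The address range `239.10.0.x` and port 5004 are assumptions made for illustration, not values from any product.

```python
import socket
import struct

CHANNEL_COUNT = 10
PORT = 5004  # assumed UDP port shared by all channels


def channel_group(channel):
    """Map channel 1..10 to a hypothetical multicast group address."""
    if not 1 <= channel <= CHANNEL_COUNT:
        raise ValueError(f"channel must be 1..{CHANNEL_COUNT}, got {channel}")
    return f"239.10.0.{channel}"


def open_receiver(channel, port=PORT):
    """Open a UDP socket and join the multicast group for one channel."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Membership request: group address plus the local interface (any).
    membership = struct.pack(
        "4s4s",
        socket.inet_aton(channel_group(channel)),
        socket.inet_aton("0.0.0.0"),
    )
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, membership)
    return sock


# Usage on a device with a working network interface:
#   sock = open_receiver(3)
#   packet, sender = sock.recvfrom(2048)
```

Switching channels is then a purely local action: the device leaves one group and joins another, and the senders never need to know how many receivers are listening.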

Let's further assume that you want to make announcements to these 100 devices from your SIP-based phone system. This traditionally means that each device must be subscribed to the PBX system. The PBX needs to talk to each device to communicate that it wants to send an announcement, while also negotiating the codec. It can take some time before all the devices in the system are online and willing to listen. And it can be costly to add 100 telephone licenses to the PBX.

The gateway solution allows one device to translate a unicast SIP call to a broadcast/multicast in the audio over IP domain, for purposes such as group addressing. The announcement is then distributed as a multicast using a single codec that all the receiving devices understand.

A quality SIP gateway solution will also demand an access code to protect against unauthorized use. Security personnel roaming the building can use any office phone, dial the gateway and unlock the announcement capability on any capable device. Personnel can then select a group or zone, and the SIP gateway will translate the information into the audio over IP domain, enabling immediate, priority-based stream delivery to the audio over IP devices in the network.
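The gateway logic described above, checking an access code and then mapping the selected zone to a multicast destination, might be sketched as follows. The dialing format (access code followed by one zone digit), the code itself and the zone table are all hypothetical.

```python
import hmac

ACCESS_CODE = "4711"  # hypothetical access code

# Hypothetical zone table: dialed digit -> (multicast group, UDP port).
ZONE_GROUPS = {
    "0": ("239.20.0.0", 5004),  # all-call
    "1": ("239.20.0.1", 5004),  # zone 1
    "2": ("239.20.0.2", 5004),  # zone 2
}


def route_announcement(dialed_digits):
    """Return the multicast destination for an announcement, or None if refused.

    Assumes the caller dials the access code followed by a single zone digit.
    """
    code, zone = dialed_digits[:-1], dialed_digits[-1:]
    # Constant-time comparison avoids leaking how many digits were correct.
    if not hmac.compare_digest(code.encode(), ACCESS_CODE.encode()):
        return None
    return ZONE_GROUPS.get(zone)


zone_two = route_announcement("47112")
all_call = route_announcement("47110")
refused = route_announcement("00002")
unknown_zone = route_announcement("47119")
```

Once a destination is returned, the gateway would forward the audio from the single unicast call to that group, so the PBX only ever sees one extension.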

The SIP gateway can also be programmed to prioritize announcements over other audio stream sources. The gateway will deliver the priority announcement only to the targeted groups or zones, allowing the general audio over IP streaming to continue uninterrupted in all other areas. This, in essence, is the difference between the audio over IP and SIP concepts: continuous streams, zoning and priority versus temporary, exclusive connections.
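A small Python sketch of zone-based priority selection, under the assumption that each stream carries a numeric priority and a set of target zones. The stream names, priorities and zones are invented for the example.

```python
def select_source(zone, streams):
    """Pick the highest-priority stream addressed to a zone, or None if there is none."""
    candidates = [stream for stream in streams if zone in stream["zones"]]
    if not candidates:
        return None
    return max(candidates, key=lambda stream: stream["priority"])["name"]


# Hypothetical situation: music plays everywhere, an announcement targets the lobby.
streams = [
    {"name": "background-music", "priority": 1, "zones": {"lobby", "office", "cafeteria"}},
    {"name": "security-announcement", "priority": 10, "zones": {"lobby"}},
]

lobby_source = select_source("lobby", streams)
office_source = select_source("office", streams)
garage_source = select_source("garage", streams)
```

Because the decision is made per zone, the office keeps its music while the lobby hears the announcement, and the lobby returns to music as soon as the announcement stream is removed from the list.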

Generally speaking, by using a SIP-based approach for telephone routing and communications, the user can simply add some audio over IP devices as extensions to the PBX, provided they speak SIP. Existing paging systems can easily be added using just one SIP-enabled audio over IP device, perhaps with intelligent relays to select target zones. Premium devices can even output MP3 or AACplus background music while giving priority to SIP-based announcements. A SIP gateway makes sense as soon as group/zone announcements at various priority levels must be implemented in a (multi-channel) audio distribution system.

The combination of audio over IP and SIP is a perfect example of how users can take advantage of integrating multiple applications over a single network infrastructure. Audio over IP allows offices, commercial businesses and hospitality applications to stream high-quality audio and use a zone/priority-based concept. SIP is ideal for temporary voice connections, "phone style." Integrating both creates a solution that provides the best of both worlds.

Johannes Rietschel is CEO and founder of Barix AG (

Edited by Stefania Viscusi

