
TMCnet
July 2006
Volume 1 / Number 4
 

Identifying and Eliminating Common Single Points of Failure in Voice Services Architectures

The telecom industry continues its steady, impressive progress towards achieving the full potential of IP services deployment and adoption, but is it in danger of becoming a victim of its own success? With Tier 1 and mezzanine providers increasingly scaling and evolving IP services, the goal of achieving TDM-like resiliency and recovery speeds is becoming ever more urgent. Highly scaled, geographically dispersed deployments are on the rise, and providers are looking to price these services in competitive parity with other, more proven telco-class services. But doing so means first rationalizing exposure to service outages and the resulting SLA risks, a goal that hasn’t yet been realized. Until now, even mundane and comparatively common IP network component failures have disrupted services and triggered penalties, and recovery from a major disruption event, such as the failure of an entire service point of presence (POP), has required a recovery cycle of unacceptable duration.

The entry-level table stakes for IP voice services scalability are, of course, redundancy. Physical redundancy of servers and N+1 configurations have been dominant characteristics of TDM architectures since the world was discovered to be round. But services are now abstracted out, and physical redundancies alone aren’t sufficient. Because SIP keeps the control logic separate from the service logic, it enables an architecture flexible enough that availability of service features and preservation of call states can be programmed separately.

There are, of course, light years of difference between “can be” and “are.” There is no guarantee that service creation environment and IP voice services developers have implemented high availability (HA) features to leverage this model. In fact, you may find out what’s missing in high availability capabilities only by finding out what’s not getting to your customers: their services. For most network operators, that’s a far too costly discovery process.




Whatever else it was, TDM was safe. If a TDM architecture had the necessary redundant components and failover capabilities, it was readily apparent. From a network operator’s viewpoint and the PSTN standpoint, it was closed and secure.

In contrast, the stability of IP voice service applications, and of the services they drive, depends on the particulars of a given IP network configuration, which may or may not be secure, performance-optimized, or failure-resilient. Individual architectures may or may not themselves be market-hardened. An individual IP service architecture’s potential service capability gives no indication of whether it also has market-hardened failure recovery procedures for the use cases one would encounter.

For example, a single-site deployment may want to protect against an application server session failure. Class 5 switches have redundant line card capabilities in which service providers may choose to invest. That level of redundancy is also required at the application service network level, to ensure that subscribers can maintain server interaction in the event of any failure. It is especially important for collaborative applications, such as conferencing or prepaid calls, where the network must keep multiple simultaneous subscriber session states intact once the subscriber is speaking to the called party. What is the true cost of allowing these sessions to terminate because the application server has failed?

The best IP call session recovery model is based on a single IP voice session strategy that protects all services. Once a service broker or any other intermediary is introduced, state problems can arise between applications, and fundamental, expensive interruptions to service integrity can follow if the service broker itself fails.

Separating Service Architectures: Is Session Integrity Lost in the Stovepipe?

Proposed architectures regularly feature multiple application servers, running separate applications, controlled by a service broker or proxy. In principle, this model seems to be a good fit for best-of-breed solutions, but it doesn’t embody a unified, high-availability architecture. Heartbeat protocols between service brokers, application brokers, and databases maintained for single-subscriber models also aren’t addressed in this architecture. Most importantly, separate service stovepipe architectures don’t incorporate essential cross-over reporting and recovery capabilities among service applications and network elements for one reason: They can’t.

Call state preservation is lost: Ensuring that SIP call signaling remains intact during an application server failure requires the duplication of call sessions and implementation of basic — but, until now, missing — notification and failover procedures in real time.

Unified awareness of network status is absent: In multisite networks, a common scenario for Tier 1 providers, there must be full application awareness of any failure and of the session state it affects, such as RTP streams, participants, call legs, which audio channels are in use on the media server, who the moderator is, and so on.

Ongoing network integrity: Interoperability among various application servers, applications, and proxies is an objective, but not yet a given reality. Is SIP sufficiently standardized to guarantee this interoperability? Not yet. Are standards bodies and competing vendor bureaucracies up to the task of solving and guaranteeing interoperability? Good questions, for which every vendor has standard assurances. But why risk it, when a unified multi-service architecture enables both economical business model expansion and ongoing service stability and integrity?
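The first two gaps above, call state preservation and unified awareness of session status, can be illustrated with a minimal Python sketch. It is not drawn from any vendor's product: the names CallSession, AppServer, and ReplicatedPair are hypothetical, and in-memory dictionaries stand in for real SIP dialog state and for whatever replication transport a deployment would actually use. The idea shown is simply that every state change is mirrored to a standby before it is considered complete, so that a failover can be recorded and the call can continue with its participants, moderator, and media channels known.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class CallSession:
    """Hypothetical snapshot of one call's application-level state."""
    call_id: str
    participants: Tuple[str, ...]
    moderator: str
    media_channels: Tuple[str, ...]  # audio channels on the media server


class AppServer:
    """Stand-in for an application server; 'alive' is toggled to simulate failure."""
    def __init__(self, name: str) -> None:
        self.name = name
        self.alive = True
        self.sessions: Dict[str, CallSession] = {}


class ReplicatedPair:
    """Primary/standby pair that duplicates every session update (assumed design)."""
    def __init__(self, primary: AppServer, standby: AppServer) -> None:
        self.primary = primary
        self.standby = standby
        self.events: List[str] = []  # notification log for operators

    def _ensure_primary(self) -> None:
        # Promote the standby if the primary has failed, and record the event.
        if self.primary.alive:
            return
        if not self.standby.alive:
            raise RuntimeError("both application servers are down")
        self.events.append(f"failover: {self.primary.name} -> {self.standby.name}")
        self.primary, self.standby = self.standby, self.primary

    def update(self, session: CallSession) -> None:
        # Write to the primary and mirror to the standby in the same step.
        self._ensure_primary()
        self.primary.sessions[session.call_id] = session
        if self.standby.alive:
            self.standby.sessions[session.call_id] = session

    def lookup(self, call_id: str) -> Optional[CallSession]:
        self._ensure_primary()
        return self.primary.sessions.get(call_id)


if __name__ == "__main__":
    pair = ReplicatedPair(AppServer("as-1"), AppServer("as-2"))
    pair.update(CallSession("conf-42", ("alice", "bob"), "alice", ("ch0", "ch1")))
    pair.primary.alive = False      # simulate an application server failure
    print(pair.lookup("conf-42"))   # state is served by the former standby
    print(pair.events)              # the failover was recorded as a notification
```

A production system would also have to replicate SIP transaction and dialog state and detect failure through heartbeats rather than a flag; the sketch only shows why duplicated session state plus a failover notification keeps a call recoverable.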

So far, only the first tentative steps of vendor interoperability have been proven, but it’s too soon to become excited. For an example of how interoperability may progress, look at the evolution of database technology. Before the SQL standard drove various vendors’ access to data, every vendor had its own methods for accessing databases and maintaining transaction-level data integrity. More than two decades after the first major initiatives and shakeouts, interoperability at the SQL level is all that’s been achieved. History teaches us that we’re likely to achieve interoperability at the application level itself to enable issuance of service invites, but reliability and recovery are likely to remain unsolved for quite a while. This is why many established carriers have continued to balk at offering IP-based conferencing services, despite their obvious economic, flexibility, and innovation advantages.

What’s needed are 100% software-based high availability features for service applications and their underlying architecture, which will allow new, more intelligent management and resiliency across multiple, geographically dispersed service points of presence simultaneously. This service application model must:

  • Load balance service-enabling resources, including application server and media server resources, across multiple geographically-dispersed service POPs;
  • Re-route call traffic across multiple sites around media gateway, application server, and media server failures;
  • Provide intelligent, automatic failover of database processing to alternative service POPs; and
  • Re-route all call processing to alternative service POPs in the event of primary site failure.
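As a rough illustration of the load balancing and re-routing requirements in the list above, the following Python sketch picks a service POP for each new call. Everything in it is an assumption made for illustration: the Pop record, the choose_pop and route_call helpers, and the 3-second heartbeat timeout are hypothetical, and database failover (the third requirement) is not modeled. A POP whose heartbeat is stale is treated as failed, and new calls go to the least-loaded healthy POP.

```python
from dataclasses import dataclass
from typing import List, Optional

HEARTBEAT_TIMEOUT = 3.0  # seconds without a heartbeat before a POP counts as failed (assumed)


@dataclass
class Pop:
    """Hypothetical view of one service point of presence."""
    name: str
    capacity: int                 # maximum concurrent calls
    active_calls: int = 0
    last_heartbeat: float = 0.0   # timestamp of the most recent heartbeat

    def healthy(self, now: float) -> bool:
        return now - self.last_heartbeat <= HEARTBEAT_TIMEOUT

    def load(self) -> float:
        return self.active_calls / self.capacity


def choose_pop(pops: List[Pop], now: float) -> Optional[Pop]:
    """Return the least-loaded healthy POP with spare capacity, or None."""
    candidates = [p for p in pops if p.healthy(now) and p.active_calls < p.capacity]
    if not candidates:
        return None
    return min(candidates, key=Pop.load)


def route_call(pops: List[Pop], now: float) -> Optional[str]:
    """Assign a new call to a POP; failed or full POPs are skipped automatically."""
    pop = choose_pop(pops, now)
    if pop is None:
        return None
    pop.active_calls += 1
    return pop.name


if __name__ == "__main__":
    pops = [Pop("east", capacity=100, active_calls=50, last_heartbeat=10.0),
            Pop("west", capacity=100, active_calls=20, last_heartbeat=10.0)]
    print(route_call(pops, now=11.0))   # both healthy: the less loaded POP is chosen
    pops[0].last_heartbeat = 20.0       # only "east" keeps sending heartbeats
    print(route_call(pops, now=22.0))   # "west" is stale, so traffic shifts to "east"
```

Because the selection runs per call, a primary site failure needs no operator intervention in this model: once its heartbeat goes stale, all new call processing lands on the remaining POPs.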

While this may seem ambitious, IP voice service developers are achieving these levels of service-embedded intelligence, enabling service providers to sustain both overall service call capacity and individual call states under a myriad of common and extraordinary IP component and network failure conditions. This smarter SIP service architecture also has several side benefits, such as allowing service providers to replace complex, delay-prone system management intervention with automated, real-time call re-routing. It also enables ongoing optimization of the utilization of all service-enabling resources and investments.

Such smarter, unified SIP services architectures are counteracting IP’s inherent single points of voice service failure and enabling large carriers to eliminate their most likely sources of potential call state disruption.

Ken Osowski is VP of Product Management & Marketing at Pactolus Communications Software. For more information, please visit Pactolus online at http://www.pactolus.com.
