TMCnet - World's Largest Communications and Technology Community




November 1998

Carrier-Class IP Telephony: What Will It Take?


IP telephony has changed dramatically over the last two to three years. It has migrated from the realm of hobbyists making "free" phone calls over the Internet to the boardrooms of all major telecommunications service providers. Although the fundamental principles have stayed the same, this migration has forced a paradigm shift on IP telephony equipment vendors.

The most visible sign of this paradigm shift is the "carrier-class" label attached to most IP telephony products and services. It evokes images of big, rock-solid networks operating flawlessly day in and day out. But beyond these images, what does "carrier-class" really mean? This article examines the three quality characteristics that define a carrier-class system: scalability, interoperability, and reliability.

Scalability is a figure of merit that describes how cost-effective a system is over a specified capacity range. Phrases like "It doesn't scale very well," or "It scales well on the low end," are common. In general, "high scalability" means a system supports a wide dynamic capacity range in a cost-effective manner. Cost-effectiveness, however, does not simply mean low equipment cost. It also includes the cost of items such as installation, maintenance, operating expenses, and capacity changes.

Carrier-class IP-telephony systems must be highly scalable and are expected to cover capacities from a few thousand calls to potentially several hundred thousand calls. The dynamic property of scalability addresses the system's ability to cost-effectively change capacity after the initial installation. This is particularly important for IP telephony systems, since it is likely they will initially be installed with limited capacity, followed by rapid deployment.

During the introduction phase, the system's operation is verified, personnel are trained, and maintenance processes are implemented. The system's capacity is then rapidly increased during the deployment phase as older systems are removed from service. The system eventually enters the general availability phase where capacity changes are small and infrequent. Finally, when the system reaches the end of its useful life, its capacity is reduced as newer systems are introduced.

The rapid changes in system capacity during the deployment and end-of-service phases represent the greatest risks. This is when a system's scalability is most critical, since thousands of customers can be affected as their service is switched from one system to another. Carrier-class scalability is much more than the cost of adding equipment: it covers all of the costs associated with safely and efficiently changing capacity with minimal or no disruption of service. Doing that effectively requires a powerful network management system.
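The cost view of scalability described above can be illustrated with a small sketch. The cost model here (a fixed installation cost plus identical capacity modules, each with a purchase price and a yearly operating cost) and every number in the usage example are assumptions made for illustration only; they do not come from any real product.

```python
import math

def cost_per_call(capacity, fixed_cost, module_cost,
                  calls_per_module, yearly_opex_per_module, years):
    """Total cost of ownership divided by call capacity.

    Hypothetical model: one fixed installation cost, plus enough
    identical modules to reach the requested capacity. Each module
    has a purchase price and a yearly operating/maintenance cost.
    """
    modules = math.ceil(capacity / calls_per_module)
    total = fixed_cost + modules * (module_cost + yearly_opex_per_module * years)
    return total / capacity

# Compare the low end and the high end of a capacity range.
for capacity in (1_000, 10_000, 100_000):
    print(capacity, cost_per_call(capacity, 100_000, 50_000, 1_000, 10_000, 5))
```

A system "scales well" in this sense when the cost per call stays roughly flat, or falls, across the whole range rather than only at one end.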

Interoperability is sometimes defined narrowly as the ability of competing vendors' equipment to establish a call from a signaling perspective, and in that sense it is clearly a must for carrier-class solutions. However, we prefer a broader definition that encompasses all the features competing vendors' equipment needs not just to coexist in the same network, but to successfully share responsibility for providing services on a particular call. Common network models must therefore be developed that specify the network elements needed, and well-defined protocols must exist for interconnecting these elements. Even more importantly, common network management models must exist so that calls handled by one part of the network can be transferred seamlessly to another in case of network or equipment failure.

Network management is also crucial in handling the inevitable congestion problems, which are a particularly thorny issue for large carriers using IP trunks in the core of their networks. IP was designed to maximize the average utilization of the network, but it puts no bounds on worst-case performance, which is the deciding factor for voice communication. The new IPv6 protocol makes a first attempt at providing Quality of Service (QoS) features within the protocol by adding a priority field and a flow-label field to the packet header. Effective use of these fields, however, will require strict agreement between vendors and thus presents new interoperability issues.
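To make the two header fields concrete, the sketch below packs and unpacks the first 32-bit word of an IPv6 header. It assumes the layout of the original IPv6 specification (RFC 1883): a 4-bit version, a 4-bit priority, and a 24-bit flow label. Later revisions of the specification changed these field widths, so treat the bit positions as an assumption tied to that version; the function names are made up for this example.

```python
import struct

IPV6_VERSION = 6

def pack_first_word(priority, flow_label):
    """Pack version, priority and flow label into 4 network-order bytes.

    Assumed layout (RFC 1883): 4-bit version, 4-bit priority,
    24-bit flow label.
    """
    if not 0 <= priority <= 0xF:
        raise ValueError("priority must fit in 4 bits")
    if not 0 <= flow_label <= 0xFFFFFF:
        raise ValueError("flow label must fit in 24 bits")
    word = (IPV6_VERSION << 28) | (priority << 24) | flow_label
    return struct.pack("!I", word)

def unpack_first_word(data):
    """Return (version, priority, flow_label) from the first 4 bytes."""
    (word,) = struct.unpack("!I", data[:4])
    return word >> 28, (word >> 24) & 0xF, word & 0xFFFFFF

header_start = pack_first_word(priority=7, flow_label=0x1234)
print(header_start.hex(), unpack_first_word(header_start))
```

The interoperability concern in the text is visible here: the bits are easy to set, but they only help if every vendor's equipment interprets a given priority or flow label the same way.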

Security is yet another important component of interoperability. Unauthorized access or, worse, outright malicious attack costs service companies untold amounts of lost revenue. To combat these problems effectively, each equipment element or network segment must be able to authenticate other elements or segments in order to spot unwanted intruders. Within the ITU H.323 standards umbrella, security issues are being addressed in the H.235 specification. This work is still in its infancy, but it is supplying the framework for implementing specific security profiles.
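As a rough illustration of one element authenticating messages from another, the sketch below uses a keyed hash (HMAC) over a shared secret. This is a generic technique chosen for illustration and is not taken from H.235; the key, the message contents, and the function names are all hypothetical.

```python
import hashlib
import hmac

def sign_message(shared_key, message):
    """Compute an authentication tag over the message with a shared secret."""
    return hmac.new(shared_key, message, hashlib.sha256).digest()

def is_authentic(shared_key, message, tag):
    """Check the tag in constant time to avoid leaking timing information."""
    return hmac.compare_digest(sign_message(shared_key, message), tag)

# Hypothetical exchange between a gateway and another network element.
key = b"secret provisioned on both elements"
setup = b"call setup request from gateway A"
tag = sign_message(key, setup)

print(is_authentic(key, setup, tag))              # accepted
print(is_authentic(b"intruder key", setup, tag))  # rejected
```

A receiving element that cannot verify the tag can treat the sender as an unwanted intruder, which is the behavior the paragraph above calls for.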

Reliability is perhaps the most notable characteristic of carrier-class systems. It's the probability that a system does not fail during a given period of time. The scope of reliability can be extremely broad. It includes everything from component reliability, manufacturing materials, and processes to infant mortality rates, and software quality management systems. But taken as a whole, these seemingly endless component-level requirements contribute to one primary system-level objective: Minimize failures and their effects on call handling ability. System-level reliability requirements are specified as availability objectives, which define limits on the amount of time call handling ability can be affected.

A system is available when it is originating, terminating, or carrying calls. Availability is the probability that a system is available at any given time, and is normally expressed as a percentage. Its complement, downtime, is the amount of time the system is not available and is normally expressed in minutes per year. Any event or activity that prevents the system from operating at its specified capacity reduces its availability. This includes hardware and software failures as well as maintenance activities.

Downtimes are weighted sums that include contributions from all failures (or potential failures) that affect service. For example, consider a 100-port system where a single analog line fails to operate for 9 minutes/year. The downtime objective for that line has been met (<18 min/year), and its contribution to the partial system outage time is 1 percent (1 line of 100 total) of 9 minutes, or 5.4 seconds. If routine testing were performed on the same line, during which time calls can be neither originated nor received, the test time would also contribute to its downtime.
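The weighted sum in this example can be written out as a short calculation. The function name and the list-of-tuples input format are assumptions made for this sketch; the figures (100 ports, one line, 9 minutes per year, an 18-minute line objective) are the ones used in the paragraph above.

```python
def partial_outage_seconds(outages, total_ports):
    """Weighted system outage time in seconds per year.

    outages: list of (ports_affected, minutes_per_year) tuples.
    Each outage is weighted by the fraction of ports it affects.
    """
    return sum(ports * minutes * 60 / total_ports
               for ports, minutes in outages)

LINE_OBJECTIVE_MIN_PER_YEAR = 18  # per-line objective quoted in the text

line_downtime = 9  # minutes per year for the single failed line
print(line_downtime < LINE_OBJECTIVE_MIN_PER_YEAR)   # objective met
print(partial_outage_seconds([(1, line_downtime)], total_ports=100))
# about 5.4 seconds, matching the figure in the text
```

Routine test time on the same line would simply be added as another entry in the list, since it also prevents calls on that line.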

Minimizing system downtime is clearly a key goal within a carrier-class system. If the system is down, customers are not being served and revenues are not being generated. In order to meet the availability objectives, carrier-class systems must be both fault tolerant and maintainable. Fault tolerance strategies focus on preventing failures from affecting service. Maintenance strategies assume service will ultimately be affected and network management models therefore focus on minimizing the time it takes to restore service and repair failures.

Fault Tolerance
Even the highest-quality products fail, often during times of stress when we need them the most. Fault-tolerant systems seek to minimize the effects component failures have on service capacity. Carrier-class systems, not surprisingly, employ many traditional redundancy techniques to achieve high levels of fault tolerance, but the most distinctive ones are load sharing, load balancing, and diversity. Each of these techniques is, in some sense, a form of redundancy, and each is designed to minimize the number of ineffective call attempts and/or the call cutoff rate.

Load sharing is primarily focused on reducing ineffective call attempts. It does this by sharing service-related resources such that a failure in any one or more of the resources will not necessarily prevent a call from being completed. An example of load sharing is an IP telephony system that includes a pool of codecs that are shared among all of its analog lines. If there were enough codecs to handle the system's peak load, a single codec failure would only affect service during peak load periods. During average load periods, service would remain unaffected, since any one of the remaining functional codecs may be used. Clearly then, if there were more codecs than necessary, the system could suffer the loss of several codecs before service was affected. However, if the system is designed such that a particular codec serves a single line, a failure in that codec would prevent all calls on that line from being completed.
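The codec example can be expressed as a simple comparison between a shared pool and dedicated codecs. This is a minimal sketch under simplifying assumptions: every call needs exactly one codec, failed codecs are simply removed from the pool, and the function names and load figures are invented for illustration.

```python
def blocked_with_pool(offered_calls, total_codecs, failed_codecs):
    """Calls that cannot be completed when codecs are shared by all lines."""
    working = total_codecs - failed_codecs
    return max(0, offered_calls - working)

def blocked_with_dedicated(calling_lines, lines_with_failed_codec):
    """Calls blocked when each line owns one codec.

    Both arguments are sets of line identifiers; a line whose own
    codec has failed cannot complete any call.
    """
    return len(calling_lines & lines_with_failed_codec)

# Pool sized for a peak load of 10 calls, with one codec failed.
print(blocked_with_pool(offered_calls=6, total_codecs=10, failed_codecs=1))   # average load
print(blocked_with_pool(offered_calls=10, total_codecs=10, failed_codecs=1))  # peak load
# Two spare codecs absorb two failures even at peak.
print(blocked_with_pool(offered_calls=10, total_codecs=12, failed_codecs=2))
# Dedicated codecs: line 2 is blocked regardless of overall load.
print(blocked_with_dedicated({1, 2, 3}, {2}))
```

With the shared pool, the single failure only shows up at peak load; with dedicated codecs, the affected line loses service at any load.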

Load balancing attempts to minimize the call cutoff rate by balancing the number of active calls across independent platforms. If a particular platform were to fail, only a portion of the active calls would be affected. An IP telephony system, for example, could balance calls among multiple "media gateways" that contain the codecs of the previous example. One of the side effects of load balancing is "graceful degradation," which simply means that the system loses capability gradually as failures occur rather than being "dragged to its knees."
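A minimal sketch of this idea assigns each new call to the least-loaded gateway and then counts how many active calls a single gateway failure would cut off. The least-loaded policy, the gateway names, and the call counts are assumptions for illustration; real systems may balance on other criteria.

```python
def balance_calls(call_ids, gateway_names):
    """Assign each call to the gateway currently carrying the fewest calls."""
    assignment = {name: [] for name in gateway_names}
    for call in call_ids:
        target = min(gateway_names, key=lambda name: len(assignment[name]))
        assignment[target].append(call)
    return assignment

def calls_cut_off(assignment, failed_gateway):
    """Number of active calls lost if one gateway fails."""
    return len(assignment[failed_gateway])

gateways = ["gw1", "gw2", "gw3", "gw4"]   # hypothetical media gateways
active = balance_calls(range(100), gateways)

total = sum(len(calls) for calls in active.values())
lost = calls_cut_off(active, "gw1")
print(f"{lost} of {total} calls cut off")
```

With four equally loaded gateways, one failure cuts off a quarter of the active calls instead of all of them, which is the graceful degradation the text describes.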

Maintainability
With typical availability targets of 99.999 percent or better (less than six minutes of downtime per year), a carrier-class system must include robust support for maintenance activities such as problem detection, notification, isolation, and repair, as well as service recovery. Maintainability is a reliability attribute that describes how well a system supports these maintenance activities.
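The relationship between an availability target and the yearly downtime it allows is simple arithmetic. The sketch below assumes a 365-day year; the function name is chosen for this example.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a 365-day year

def downtime_minutes_per_year(availability_percent):
    """Yearly downtime allowed by a given availability percentage."""
    unavailable_fraction = (100.0 - availability_percent) / 100.0
    return unavailable_fraction * MINUTES_PER_YEAR

for target in (99.9, 99.99, 99.999):
    print(f"{target}% -> {downtime_minutes_per_year(target):.2f} min/year")
```

For 99.999 percent this works out to roughly 5.26 minutes per year, which is where the "less than six minutes" figure comes from.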

Given that failures are inevitable, carrier-class systems must be able to detect problems quickly, ideally before service is affected. Therefore, any equipment whose failure could affect 1 percent or more of a system's capacity must be continuously monitored. This includes operational as well as standby equipment. A 10,000-port IP telephony gateway, for example, would have to continuously monitor a line interface unit if it supported more than four T1 facilities, a scenario that is highly likely. A T1 carries 24 channels, so five T1s account for 120 ports, which exceeds the 100-port (1 percent) threshold.
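The 1 percent rule can be checked with integer arithmetic. The sketch assumes each T1 facility carries 24 voice channels and that one channel corresponds to one port; the function name is made up for this example.

```python
CHANNELS_PER_T1 = 24  # voice channels carried by one T1 facility

def must_monitor(t1_count, system_ports, threshold_percent=1):
    """True if losing this unit could affect threshold_percent or more of capacity.

    Assumes one port per T1 channel. Integer comparison avoids
    floating-point rounding at the boundary.
    """
    affected_ports = t1_count * CHANNELS_PER_T1
    return affected_ports * 100 >= threshold_percent * system_ports

for t1s in (4, 5):
    print(t1s, "T1s:", must_monitor(t1s, system_ports=10_000))
```

Four T1s (96 ports) fall just under the 100-port threshold for a 10,000-port gateway, while five T1s (120 ports) exceed it, which matches the "more than four T1 facilities" condition in the text.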

Once a problem is detected, maintenance personnel must be informed so that they can carry out the repair. The network management system does this by communicating with an operations system, or "OS," which in turn notifies the maintenance personnel of the problem. In addition to notification, the network management system must also automatically isolate the problem and provide a prioritized list of replaceable components where the problem is most likely located. The problem must then be repaired and service restored within the specified mean repair time.

The IP telephony industry has come a long way in a short period of time, but there are still many unresolved issues. Customers have come to expect that their phone works with no "ifs, ands, or buts" and are going to expect the same service in the future. To provide such a solution, equipment vendors must solve the problems discussed in this article, and a solid network management foundation is clearly the nucleus of the overall solution.

Scott McNutt, systems engineer, and Dr. Henrik Sorenson, VP of Advanced Technology, are part of the advanced technology products team at elemedia, a wholly-owned software venture of Lucent Technologies. elemedia is a leading provider of H.323-based software toolkits that enable high-quality solutions for Internet telephony and multimedia communications. Developed by engineers with years of experience in the technologies required for sophisticated telephony networks, elemedia's products link today's networks with tomorrow's while promoting standards and interoperability. For more information on elemedia, visit the company's Web site at www.elemedia.com.

