A fundamental issue when building a PC-based telephony system is the question of sizing: What type of host server do I need to handle X number of users on VoIP and PSTN?
Very often, you end up using a server that you know is oversized, and once it is up and running, you quickly confirm this by running vmstat and noting that the system never runs at more than 20% of capacity. And what is wrong with that? After all, PC computing power is cheap, and for little more than $1,000, you can buy a powerful, multiprocessor machine.
It is true that the server cost is overshadowed by other costs, such as the cost of installation and configuration, so for those who are installing single, complex systems, the answer is pretty universal: Bigger is Better!
But if you are in the business of replicating products, for instance as a producer of standard Asterisk-based PBX systems, then it does make a good deal of difference if you can save a few hundred dollars per system by right-sizing the server.
So, what is the nature of the load on, say, an Asterisk PBX or call center system, particularly as far as the PSTN access is concerned?
One of the characteristics of telephony, in general, is the huge amount of data that flows continuously through the system. For example, an 8-port E1 carries 240 calls when fully loaded. This amounts to about 32Mbps of changing data, in a continuous, never-ending stream. All of this flows through the system cache, which, in any case, is full of look-up tables and other fixed structures. So the cache hit rate is rather low, and the evidence is that the size of the cache has little bearing on performance. A Xeon and a Celeron may therefore perform very similarly.
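The 240-call and 32Mbps figures can be checked with a few lines of arithmetic. The sketch below uses standard E1 framing (32 timeslots of 64 kbit/s per span, 30 of them carrying voice) and assumes that the 32Mbps figure counts receive and transmit traffic together; the function and constant names are illustrative only.

```python
# Back-of-the-envelope data rate for a fully loaded 8-port E1 card.
# E1 framing: 32 timeslots of 64 kbit/s each (2.048 Mbit/s per span),
# of which 30 carry voice and 2 carry framing and signaling.
SPANS = 8
TIMESLOTS_PER_SPAN = 32
VOICE_CHANNELS_PER_SPAN = 30
BITS_PER_SECOND_PER_TIMESLOT = 64_000  # 8,000 samples/s x 8 bits


def card_totals(spans: int = SPANS) -> dict:
    """Return call capacity and raw data rate for a multi-span E1 card."""
    calls = spans * VOICE_CHANNELS_PER_SPAN
    one_way_bps = spans * TIMESLOTS_PER_SPAN * BITS_PER_SECOND_PER_TIMESLOT
    # Assumption: the quoted figure adds receive and transmit together.
    both_ways_bps = 2 * one_way_bps
    return {
        "calls": calls,
        "one_way_mbps": one_way_bps / 1e6,
        "both_ways_mbps": both_ways_bps / 1e6,
    }


totals = card_totals()
print(totals)
```

Under these assumptions the card moves a little over 16 Mbit/s in each direction, which is consistent with the rounded figure above.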
The result is that to a significant extent, the very fast processor that you purchased is throttled by the process of accessing the RAM on cache misses. And RAM access is rather slow. It is slow for a 500MHz PIII. It is almost as slow for a 3.5GHz Xeon.
So if your 1GHz machine has a certain average system load when running a PBX application, do not expect that a 2GHz machine will have half the load. Its load will likely be significantly more than half of the original.
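One way to see why doubling the clock does not halve the load is a simple two-part model: the portion of the load spent computing shrinks with clock speed, while the portion spent waiting on RAM stays roughly constant. This is a deliberately crude sketch; the memory-bound fraction is a hypothetical number that would have to be measured on a real system.

```python
def scaled_load(load_at_base: float, memory_bound_fraction: float,
                clock_ratio: float) -> float:
    """Estimate system load after a clock-speed change.

    load_at_base: measured load on the slower machine (0.0 to 1.0).
    memory_bound_fraction: share of that load spent waiting on RAM
        (hypothetical; not a figure from the article).
    clock_ratio: new clock divided by old clock, e.g. 2.0 for 1GHz -> 2GHz.
    """
    cpu_part = load_at_base * (1.0 - memory_bound_fraction) / clock_ratio
    memory_part = load_at_base * memory_bound_fraction  # RAM latency barely changes
    return cpu_part + memory_part


# Illustrative numbers only: 40% load at 1GHz, half of it memory stalls.
new_load = scaled_load(0.40, 0.5, 2.0)
print(f"{new_load:.2f}")
```

With these made-up inputs the load drops from 40% to about 30%, not to 20%. The larger the memory-bound share, the less a faster clock helps.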
The figure below shows the main loads that you will find on a PC-based telephony project.
Assuming that there is a PSTN component to the system, there exists a base idling interrupt load that depends on the system size (see the figure accompanying this article). This base load occurs in the interrupt handler for the PSTN board and consists of the overhead of handling several hundred to several thousand interrupts per second, and moving the PSTN data on and off the board, plus some PSTN housekeeping functions.
The base load is proportional to the number of channels being handled, so that the base load for an Octal E1 card will be about 8 times that for a single E1. The base load is highly dependent on the way that the PSTN drivers are organized. Older drivers such as Zaptel were optimized to run perhaps a single T1 span, so it is not surprising that newer drivers have far lower base loads.
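The proportionality between channel count and base load follows from how much data must be moved on each interrupt. The sketch below assumes an interrupt rate of 1,000 per second (one per millisecond), which falls within the range mentioned above but is not a figure taken from any particular driver.

```python
SAMPLES_PER_SECOND = 8_000  # one 8-bit sample per timeslot every 125 microseconds
TIMESLOTS_PER_E1 = 32


def bytes_per_interrupt(spans: int, interrupts_per_second: int) -> float:
    """Bytes moved on and off the board per interrupt (receive plus transmit)."""
    samples_per_interrupt = SAMPLES_PER_SECOND / interrupts_per_second
    return spans * TIMESLOTS_PER_E1 * samples_per_interrupt * 2


# Assumed interrupt rate: 1,000 per second.
single = bytes_per_interrupt(1, 1_000)
octal = bytes_per_interrupt(8, 1_000)
print(single, octal, octal / single)
```

The fixed per-interrupt overhead is the same in both cases, so a real driver will not scale by exactly a factor of 8, but the data-movement portion does.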
If the telephony system is making use of software-based echo cancellation, such as the routines provided in the Zaptel drivers used by Asterisk, then there is a very significant extra load inside the PSTN interrupt handler that is proportional to the number of active calls with echo cancellation. Of course, not all systems require echo cancellation. If there is no VoIP component, then usually echo cancellation of any kind is unnecessary. If the PSTN card itself has on-board echo cancellation, then the echo cancellation load on the server vanishes.
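To get a feel for why software echo cancellation is so expensive, consider a rough count of multiply-accumulate operations. The sketch assumes a basic adaptive FIR (LMS-style) canceller with one tap per sample of echo tail and a 16 ms tail; these are illustrative assumptions, not a description of the Zaptel routines.

```python
SAMPLE_RATE_HZ = 8_000


def echo_canceller_macs_per_second(active_calls: int, tail_ms: int = 16) -> int:
    """Rough multiply-accumulate count for an adaptive FIR echo canceller.

    Assumptions (not from the article): one filter tap per sample of echo
    tail, and two passes over the taps per sample (one to estimate the
    echo, one to adapt the coefficients).
    """
    taps = SAMPLE_RATE_HZ * tail_ms // 1000
    macs_per_sample = 2 * taps
    return active_calls * SAMPLE_RATE_HZ * macs_per_sample


for calls in (30, 240):
    print(calls, echo_canceller_macs_per_second(calls))
```

Under these assumptions a single call costs about 2 million multiply-accumulates per second, and a fully loaded octal E1 costs close to half a billion, all of it inside the interrupt handler.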
The vital point about the base load and the echo cancellation load is that they occur inside the PSTN interrupt handler. In most soft telephony applications under Linux, the interrupt service routine is written as a single, very long hardware and traffic handler. This type of interrupt handling is frowned upon, and, in fact, is illegal in the Windows world. One is supposed to do the absolute minimum in the interrupt handler and set a flag so that the bulk of the data handling is done in a deferred procedure by the operating system. The architecture of the current Linux drivers means that even if you have multiple processors, you get no help at all with the interrupt handler: One processor handles the interrupt all by itself.
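The "minimum work in the handler, bulk of the work deferred" pattern can be illustrated with a user-space simulation. This is not kernel code and does not reflect any actual driver; the names `interrupt_handler` and `deferred_worker` are hypothetical, and a thread and a queue stand in for the operating system's deferred-procedure mechanism.

```python
import queue
import threading

frames = queue.Queue()  # hand-off point between the two halves
processed = []          # results produced by the deferred worker


def interrupt_handler(frame: bytes) -> None:
    """Top half: do the minimum (queue the data), then return immediately."""
    frames.put(frame)


def deferred_worker() -> None:
    """Bottom half: the heavy per-frame work runs outside the handler."""
    while True:
        frame = frames.get()
        if frame is None:  # sentinel: no more frames
            break
        # Placeholder for the expensive work (echo cancellation, mixing, ...).
        processed.append(sum(frame))


worker = threading.Thread(target=deferred_worker)
worker.start()
for i in range(5):  # simulate five hardware interrupts
    interrupt_handler(bytes([i] * 8))
frames.put(None)
worker.join()
print(processed)
```

The design point is that the handler's cost no longer grows with the amount of processing per frame, and the deferred work can be scheduled on whichever processor is free.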
The combination of poor cache performance and single-processor interrupt handling can really limit performance. For instance, when software echo cancellation is required, the interrupt load of an Octal E1 is simply too large for a single processor, so this arrangement is impractical no matter how many processors you have or how fast they are. As a result, almost all 8-port T1/E1 cards are supplied complete with on-board echo cancellation.
In contrast, the application load of switching, controlling, repackaging as VoIP and mixing of voice streams (pink area on the figure) is a "normal" program that multiprocessor systems can deal with very well. Typically it will be handled by a separate processor, if available.
The transcoding load (green area on the figure) is also one that can be distributed amongst processors by the operating system. This is a good thing, because software compression/decompression between G.711 and G.729, G.723, GSM, etc. can absorb enormous system resources, often comprising the major load on a system.
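The different behavior of the load types suggests a simple sizing check: the interrupt-bound load (base load plus software echo cancellation) must fit on one processor, while the application and transcoding load can be spread over whatever capacity remains. The sketch below is a simplification under that assumption, with loads expressed in units of one processor and all input numbers hypothetical.

```python
def fits(interrupt_load: float, distributable_load: float, cores: int) -> bool:
    """Check a load budget, with each load expressed in units of one core.

    interrupt_load: base PSTN load plus software echo cancellation; per the
        article this is serviced by a single processor.
    distributable_load: application plus transcoding load, which the
        operating system can spread over the remaining capacity.
    """
    if interrupt_load >= 1.0:
        return False  # one core is saturated; extra cores cannot help
    spare_capacity = cores - interrupt_load
    return distributable_load <= spare_capacity


# Hypothetical figures for illustration only.
print(fits(1.2, 0.5, 4))  # interrupt load alone exceeds one core
print(fits(0.3, 2.0, 4))  # heavy transcoding, but four cores absorb it
print(fits(0.3, 2.0, 2))  # same load on two cores
```

The first case fails regardless of core count, which is the situation described above for an octal card with software echo cancellation. The other two show that the distributable load is the part that additional processors actually help with.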
In short, the problem of sizing is not straightforward, and many of the results of testing are non-intuitive. In the future we expect to be able to do quite a bit more with less, simply by writing better device drivers and getting a little help from the hardware.
David Mandelstam is President of Sangoma Technologies.