VoIP Logic provides hosted PBX platform solutions to more than 30 service providers and, by and large, they all share the same leading concern – uptime. Keep services to business customers running and you are very likely to keep those customers. Become associated with outages or other forms of poor service quality – whether they are your fault or not – and businesses rapidly go shopping for alternatives.
So what causes disrupted services, and what can you do ahead of time to minimize these problems?
Here is a brief primer on the top four causes of service problems and some common sense considerations to engineer into how you deliver the services.
Security: There is a growing list of vulnerabilities that technology can expose. Telephones and other devices or software can be cloned, passwords can be attacked with brute-force login attempts, servers can be bombarded with denial-of-service attacks, and media and signaling can be hijacked with man-in-the-middle attacks. There are technical answers to mitigate each of these gaps, though tightening security on your network can frustrate users as they log in from a new location, misremember passwords, or otherwise end up locked out by the very systems meant to protect them.
Suggestion: Buy protection for current problems – phones that support HTTPS authentication, session border controllers that shut down denial-of-service attacks, fraud monitoring software to identify and limit toll fraud exposure. Create a working group that meets quarterly to assess new threats and responses. Communicate what you are doing and why to your business customers – security mitigation should be a competitive advantage for them, not an annoyance.
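The brute-force mitigation mentioned above usually comes down to simple bookkeeping. As a minimal sketch (the thresholds, window lengths, and IP addresses here are illustrative assumptions, not recommendations), failed login attempts per source address can be counted inside a sliding window and the source locked out once it trips the limit:

```python
import time
from collections import defaultdict

# Illustrative thresholds only - tune these for your own environment.
MAX_FAILURES = 5       # failures allowed inside the window
WINDOW_SECONDS = 300   # 5-minute sliding window
LOCKOUT_SECONDS = 900  # 15-minute lockout once tripped

_failures = defaultdict(list)  # source IP -> timestamps of recent failures
_locked_until = {}             # source IP -> time the lockout expires

def record_failure(ip, now=None):
    """Log a failed login attempt and lock the source if it trips the limit."""
    now = now if now is not None else time.time()
    window = [t for t in _failures[ip] if now - t < WINDOW_SECONDS]
    window.append(now)
    _failures[ip] = window
    if len(window) >= MAX_FAILURES:
        _locked_until[ip] = now + LOCKOUT_SECONDS

def is_locked(ip, now=None):
    """True while the source is inside its lockout period."""
    now = now if now is not None else time.time()
    return _locked_until.get(ip, 0) > now
```

The lockout expires on its own, which keeps the legitimate-user frustration described above temporary rather than permanent.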
Redundancy: Every component you use to deliver hosted PBX services should have the technical option to be redundant, and you should always choose that option. Power does fail, servers stop working mysteriously and, as we are learning, catastrophic weather and other large-scale natural (and man-made) phenomena can impact your uptime.
Suggestion: Buy three or four servers (or whatever type of equipment you are using) for every one you intend to use – one for primary live functionality, one for backup in the same geographic location, and one or two for failover to another geographic location. Operate dual power feeds, dual data center locations (geographic redundancy), and dual BGP Internet connections. Test that your redundancies work by performing failover during maintenance intervals, confirming your dual power sources do not both feed from the same generator, and making sure phones on your network respond properly to failures by using the redundancies available.
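The primary/local-backup/remote-failover ordering above is, at its core, a priority list. A minimal sketch of the selection logic, assuming hypothetical server names and a caller-supplied health check (how you actually probe a server – SIP OPTIONS ping, HTTP health endpoint, etc. – is deployment-specific):

```python
# Hypothetical server names for illustration; order expresses preference:
# primary, then same-site backup, then the remote failover site.
PRIORITY = [
    "pbx-primary.example.net",
    "pbx-backup-local.example.net",
    "pbx-failover-remote.example.net",
]

def select_server(is_healthy, servers=PRIORITY):
    """Return the most-preferred server that passes the health check.

    is_healthy is a callable taking a server name and returning True/False.
    Returns None if every server in the list is down.
    """
    for server in servers:
        if is_healthy(server):
            return server
    return None
```

Exercising this logic with a health check that deliberately fails the primary is exactly the kind of failover test worth running during a maintenance interval.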
Network: The most common point of interruption is the network. There are increasingly expensive ways to protect your network from the core all the way out to the premises of the business you are serving. In addition, the carriers you choose to handle toll call origination and termination at low cost can undermine your best efforts by delivering high-latency, jittery, or otherwise unacceptable-for-business voice quality.
Suggestion: Choose quality network providers as your vendors. Historical quality and transparency about issues that might occur are important to understand. Operators that claim 100 percent uptime should be prepared to back it up with detailed empirical data. A managed network beats the public Internet if your business customers can afford it; if not, make sure they understand the trade-off. Quality-of-service monitoring is crucial to identify and resolve network issues so they do not recur and leave you powerless. Multiple options for access and egress call routing will serve you well, both in operating your network and when explaining it to your customers.
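The latency and jitter figures that quality-of-service monitoring reports are simple arithmetic over packet timestamps. A minimal sketch (production systems would typically derive these from RTCP reports, and RFC 3550 defines a smoothed jitter estimator; this simplified version just averages the change in delay between consecutive packets):

```python
def latency_and_jitter(sent_ms, received_ms):
    """Compute average one-way latency and a simple jitter figure.

    sent_ms / received_ms are matched per-packet timestamps in
    milliseconds. Jitter here is the mean absolute change in delay
    between consecutive packets (a simplification for illustration).
    """
    delays = [r - s for s, r in zip(sent_ms, received_ms)]
    avg_latency = sum(delays) / len(delays)
    diffs = [abs(b - a) for a, b in zip(delays, delays[1:])]
    jitter = sum(diffs) / len(diffs) if diffs else 0.0
    return avg_latency, jitter
```

For example, packets sent at 0, 20, 40, 60 ms and received at 50, 72, 88, 112 ms give per-packet delays of 50, 52, 48, 52 ms – an average latency of 50.5 ms and roughly 3.3 ms of jitter, both comfortably within usable range for voice.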
Maintenance: The single largest cause of outages is self-inflicted – software upgrades, hardware maintenance, and other scheduled events necessary to keep your machinery and services running at optimal performance.
Suggestion: Run lab systems of all the hardware and software that deliver your hosted PBX services so you can practice and identify problems before touching production systems. Use a thorough process – a standard operating procedure – that double- and triple-checks potential primary and ancillary issues. These are complex integrated systems that require checklists and meticulous attention to detail. Take it slowly as you plan and execute software and hardware changes, and do not do this work under duress or on a compressed timeframe if at all possible. Choose trusted vendors and experienced personnel to be in command during maintenance.
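A checklist like the one described above can be enforced in software rather than trusted to memory. A minimal sketch, assuming a hypothetical set of gate checks (the check names are illustrative examples, not a complete standard operating procedure):

```python
# Each entry pairs a human-readable check name with a callable that
# returns True when the check passes. These checks are placeholders.
PRE_MAINTENANCE_CHECKS = [
    ("change rehearsed in lab", lambda: True),
    ("rollback plan documented", lambda: True),
    ("backups verified", lambda: True),
    ("maintenance window announced to customers", lambda: True),
]

def ready_for_maintenance(checks=PRE_MAINTENANCE_CHECKS):
    """Return True only if every gate check passes; name the failures."""
    failed = [name for name, passed in checks if not passed()]
    if failed:
        print("HOLD - failing checks:", ", ".join(failed))
        return False
    return True
```

The point of the gate is that a single failing item – an unrehearsed change, an unverified backup – holds the entire window, which is exactly the discipline that prevents self-inflicted outages.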
I use the word defensible often when I consider the hosted PBX platform decisions we make to maximize uptime. Are we taking every step possible to minimize expected technical problems, network problems, problems of malicious intent, and problems of human error? If all of us are essentially perfect in our preparation, execution, and delivery, 99.999 percent is the goal (not 100 percent).
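It is worth making "five nines" concrete. The arithmetic (using a 365-day year for simplicity):

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000 seconds

def downtime_per_year_minutes(availability_pct):
    """Minutes of allowed downtime per year at a given availability."""
    return SECONDS_PER_YEAR * (1 - availability_pct / 100) / 60

# 99.9%   -> roughly 8.8 hours of downtime a year
# 99.99%  -> roughly 53 minutes
# 99.999% -> roughly 5.3 minutes
```

Five nines leaves barely five minutes of total outage per year – which is why every one of the defenses above has to hold at once.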
Edited by Maurice Nagle