Today's latency-sensitive applications and services, such as IPTV, VoIP, and mobile video streaming, impose strict requirements for low network latency, as even a slight increase can harm productivity and customer satisfaction. For instance, if the overall latency allowed for transport and processing is 50 milliseconds, an increase of just 5 milliseconds consumes 10 percent (!) of that budget.
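That budget arithmetic is worth keeping in view when evaluating any single device or hop; a minimal sketch (the 50 ms figure is the article's, the helper function is illustrative):

```python
# Share of a fixed end-to-end latency budget consumed by one increase.
BUDGET_MS = 50.0                 # allowed transport + processing latency

def budget_share(increase_ms: float) -> float:
    return increase_ms / BUDGET_MS

print(f"{budget_share(5.0):.0%} of the budget")   # -> 10%
```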


Broadband network operators are therefore working to reduce network latency and improve quality of service, driven primarily by competitive necessity and consumer expectations.

The Origin of Latency: Back to the Roots

Considered a "silent killer," latency is not a local problem caused by a single device or application, but rather the result of a systemic, cumulative effect that can wreak havoc on networks when left unattended. To understand latency's origins, one needs to take a holistic look at the transaction processing path, spanning servers, networks, and clients.

Servers: latency can build up at the application level as large volumes of user requests overload the CPU, memory, and disks. As the load on the server grows, response time slows significantly. Latency can increase further in the server's operating system and the underlying host machine. In particular, customers deploying server virtualization benefit from improved server utilization and greater cost savings, but the hosted applications compete for the same computing resources, which increases latency.
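To see why response time degrades sharply rather than gradually as requests pile up, consider a minimal sketch using the classic M/M/1 queueing model; the request rates are hypothetical and the model is an illustration of the effect, not something prescribed here:

```python
# Mean response time in an M/M/1 queue: T = 1 / (service_rate - arrival_rate).
# Rates are in requests per second; result is converted to milliseconds.

def mm1_response_time_ms(arrival_rate: float, service_rate: float) -> float:
    if arrival_rate >= service_rate:
        raise ValueError("server saturated: latency grows without bound")
    return 1000.0 / (service_rate - arrival_rate)

# A server that can handle 1,000 requests/s, under rising load:
for load in (500, 800, 900, 950, 990):
    print(f"{load} req/s -> {mm1_response_time_ms(load, 1000):6.1f} ms")
# Latency doubles from 900 to 950 req/s and explodes near saturation,
# which is the cumulative overload effect described above.
```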

Networks: network capacity continues to increase, thanks to higher-capacity ISP links and the spread of 1GE, 10GE, and even faster port speeds in the network infrastructure. Latency shows no comparable improvement, however, because end-to-end latency depends on the combined latency characteristics of every device along the path.
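The cumulative nature of path latency is easy to illustrate; a minimal sketch with hypothetical per-device figures:

```python
# End-to-end latency is the sum of every device on the path,
# regardless of how fast any single link is. Figures are illustrative.

path_latency_ms = {
    "client access link": 4.0,
    "edge router":        0.5,
    "firewall":           1.5,
    "core switch":        0.05,
    "load balancer":      0.8,
    "server NIC/stack":   0.3,
}

one_way = sum(path_latency_ms.values())
print(f"one-way: {one_way:.2f} ms, round trip: {2 * one_way:.2f} ms")
# Upgrading an already-fast hop (the core switch) barely moves the total;
# the slowest devices dominate the end-to-end figure.
```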

Clients: more and more Web 2.0 and other complex, modern applications employ technologies such as AJAX, JSON, and ATOM, and invoke remote APIs. This means a transaction is no longer complete once server processing ends, because some of the work takes place on the client side. Collecting the multiple objects presented to the user, or embedding several frames or panes in the same browser window, triggers additional requests that can increase transaction completion time and extend latency.
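How much those extra object requests cost depends on the round-trip time and on how many can run in parallel; a minimal sketch with assumed timings:

```python
# Transaction completion time for a page of many objects,
# fetched one by one versus over parallel connections.

rtt_ms = 40          # assumed round-trip time per request
objects = 20         # images, scripts, frames on one page
parallel = 6         # typical per-host browser connection limit

sequential = objects * rtt_ms
concurrent = -(-objects // parallel) * rtt_ms   # ceiling division

print(f"sequential: {sequential} ms, "
      f"with {parallel} connections: {concurrent} ms")
# 20 objects at 40 ms each: 800 ms fetched serially,
# but only 160 ms when 6 requests run in parallel.
```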

Recommended Practices to Effectively Reduce Latency

Despite recognizing the importance of maintaining low-latency networks, one might assume that increasing network bandwidth will address the problem. Bandwidth alone does not remove latency, however; to effectively reduce latency and shorten response time, the following practices, relevant to all of the layers discussed above, are recommended:

Servers: latency originating in the server's application, OS, or network stack can be significantly reduced by offloading processing functions from the server to dedicated devices such as an application delivery controller (ADC). By offloading SSL processing and serving cached content, ADCs reduce the number of requests the server must process, reducing overall server-layer latency.
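A back-of-the-envelope sketch of the offload effect; the cache hit ratio and SSL cost factor below are assumptions for illustration, not measured figures:

```python
# Effect of ADC offload on server load: cached responses never reach the
# server, and SSL handshakes terminate on the ADC instead of the server CPU.

requests_per_s = 10_000
cache_hit_ratio = 0.40        # assumed share served from the ADC cache
ssl_cpu_cost = 2.5            # assumed relative CPU cost of SSL vs plain HTTP

to_server = requests_per_s * (1 - cache_hit_ratio)
cpu_units_before = requests_per_s * ssl_cpu_cost   # server does SSL + everything
cpu_units_after = to_server * 1.0                  # ADC handles SSL and cache hits

print(f"requests reaching the server: {to_server:,.0f}/s")
print(f"relative server CPU load: {cpu_units_after / cpu_units_before:.0%}")
# 40% caching plus SSL offload leaves the server at 24% of its original
# CPU load, pulling it away from the saturation point where latency spikes.
```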

Network: application acceleration services such as compression, caching, and TCP optimization are vital for reducing the amount of traffic sent over the network. For an enterprise's remote branch offices and users in particular, deploying a WAN optimization controller (WOC) and content delivery networks (CDNs) brings content "closer" to end users, reducing latency. Equally important for controlling latency is the smart activation of proxy services, typically deployed in mobile content networks, so that they process only the portions or types of traffic for which they are relevant. Unlike web browsing, there is no value in scanning video streams for viruses, so bypassing such inspection proxies for video traffic reduces latency and helps guarantee the quality of service of the video service.
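The selective activation described above amounts to a traffic-steering policy; a minimal sketch, with a hypothetical content-type policy rather than any vendor's actual feature:

```python
# Steer only traffic that benefits from inspection through the anti-virus
# proxy; video bypasses it to keep its latency low.

# Content types that warrant anti-virus inspection (assumed policy).
INSPECTED_TYPES = {"text/html", "application/octet-stream", "application/zip"}

def next_hop(content_type: str) -> str:
    """Decide whether a flow goes through the inspection proxy."""
    if content_type in INSPECTED_TYPES:
        return "av-proxy"      # adds inspection latency, but needed here
    return "direct"            # e.g. video/mp4: no value in scanning it

for ctype in ("text/html", "video/mp4", "application/zip"):
    print(f"{ctype:28s} -> {next_hop(ctype)}")
```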

Additionally, it is highly recommended to deploy low-latency network infrastructure devices, including switches, routers, firewalls, and application delivery controllers, that maintain low latency even under extreme traffic surges, whether in throughput, transactions per second, or session concurrency.

Clients: it is highly recommended to leverage client-side techniques that let browsers run some of the business logic locally, avoiding round trips over the network. Additionally, storing application data on client machines, and smartly adapting the content sent to slow clients such as mobile devices, can also reduce processing and thereby latency.
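Content adaptation for slow clients can be sketched as a server-side decision; the variant sizes, link speed, and user-agent check below are illustrative assumptions:

```python
# Send a lighter page variant to mobile devices so less data crosses
# the slow last-mile link.

VARIANTS = {                     # assumed pre-generated page variants
    "full":  250_000,            # bytes: desktop page with all assets
    "light":  60_000,            # bytes: reduced images, inlined CSS
}

def pick_variant(user_agent: str) -> str:
    """Crude device detection; real deployments use richer signals."""
    return "light" if "Mobile" in user_agent else "full"

def transfer_ms(size_bytes: int, link_kbps: int) -> float:
    return size_bytes * 8 / link_kbps   # bits / (kbit/s) = milliseconds

ua = "Mozilla/5.0 (Linux; Android 14) Mobile Safari"
variant = pick_variant(ua)
print(f"variant: {variant}, "
      f"transfer on a 2,000 kbit/s link: {transfer_ms(VARIANTS[variant], 2000):.0f} ms "
      f"(vs {transfer_ms(VARIANTS['full'], 2000):.0f} ms for the full page)")
```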


Amir Peles is Chief Technology Officer at Radware. To read more of his articles, please visit his columnist page.

Edited by Michael Dinan