May 1999
Use And Misuse Of Call Center Performance Measures
BY ED POWERS, QUALITY ALLIANCES
In most call centers, data are abundant and inexpensive. Modern CTI tools churn out
reams of statistics on call volumes, length of calls, number of open call records, wait
times and various other measures. A common management practice is to use these data to
evaluate individual agents, basing compensation and incentives on their relative
performance. While intentions are honorable in this pursuit, practical applications lead
to high costs, bad morale and lowered performance. The assumptions upon which this
practice is founded are deeply flawed.
To illustrate this point, Figure 1 shows data
gathered on individual agents. In this case, the data are the number of monthly closed
orders, but any statistic deemed important can be used. Results are compared on a monthly
basis. For example, agents with top results in May (such as Carol and Henry) are
commended, while agents with lower numbers (Judy and Dorothy) receive extra
"coaching." Over a period of time, averages are computed for each agent, and on
a yearly basis, wage increases are awarded based on relative ranking. The top performers
are invited to attend a prestigious company awards dinner.
This "carrot and stick" approach is widely practiced in management circles.
The reasoning is simple: improve business results by rewarding people who perform well and
punishing those who do not. After all, motivation drives results. This method is assumed to
be the best way to manage people, consistent with the well-known concept of management by
objectives (MBO).
In the trenches, however, things are much different. Some agents compete fiercely with
one another for top honors, working independently and hoarding their knowledge. Others who
cannot make their numbers are demoralized, dreading invitations to meet in the
supervisor's office. Still others are confused and buffeted by a system that seems on some
days to reward and on others to punish employees who are working at a consistent pace.
Quality guru W. Edwards Deming described schemes such as these as "simple,
obvious and wrong." While intentions are good, the results are chaos, confusion and
ill will. Deming pointed to a lack of what he called "profound knowledge": a
deeper understanding of the true nature of systems, variation, learning and psychology.
Quality theory asserts that the management system described above is based on errant
thinking and reinforced with poor mathematics. Implicit in the system is the belief that
overall results are achieved through the sum of individual efforts, when in fact, numerous
studies have shown that synergistic teams dramatically outperform individuals. Also
assumed is that variation is caused exclusively by differences in people, when processes
also vary due to other factors, such as differences in equipment, information, methods,
inputs and environment. Finally, expecting to identify meaningful differences between
people by simply averaging and ranking the results shows that managers lack a basic
knowledge of statistics.
To see why the math is wrong, conduct a simple experiment using a coin. Flipping a coin
results in either heads or tails, and if the coin is "fair," the probability of
getting either one is 0.5, or 50 percent. If one were to flip the coin 50 times, one would
expect to get heads 25 times. In practice, however, the results vary, with the outcome
being perhaps 16, 22 or 30 heads. What does this mean? According to the binomial
distribution, the number of heads can fall anywhere between about 14 and 36 (three
standard deviations on either side of 25) before there are grounds to suspect anything
unusual with the coin or the manner in which it is being tossed. Outcomes are due to
natural randomness in the system from minute changes in airflow, muscle contraction, spin
velocity and any number of other known and unknown factors. The probability of heads is
0.5 on every flip, but the observed proportion of heads settles at 0.5 only over a very
large number of flips, allowing minor differences on each flip to be "averaged out."
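The band of 14 to 36 heads can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not part of the original article; the function names (three_sigma_limits, prob_within, simulate_heads) and the random seed are illustrative assumptions. It computes the mean plus or minus three standard deviations for 50 fair flips, the exact binomial probability of landing in that band, and a handful of simulated 50-flip runs.

```python
import math
import random

def three_sigma_limits(n, p=0.5):
    """Mean and mean +/- 3 standard deviations for a binomial count."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return mean, mean - 3 * sd, mean + 3 * sd

def prob_within(n, lo, hi, p=0.5):
    """Exact binomial probability that the number of heads is in [lo, hi]."""
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(lo, hi + 1))

def simulate_heads(n, runs, seed=0):
    """Count heads in each of `runs` experiments of `n` fair flips."""
    rng = random.Random(seed)  # seed is arbitrary; chosen only for repeatability
    return [sum(rng.random() < 0.5 for _ in range(n)) for _ in range(runs)]

mean, lower, upper = three_sigma_limits(50)
print(f"expected {mean:.0f} heads, 3-sigma band {lower:.1f} to {upper:.1f}")
print(f"P(14 <= heads <= 36) = {prob_within(50, 14, 36):.4f}")
print("ten simulated runs:", simulate_heads(50, 10))
```

The three-sigma band works out to roughly 14.4 to 35.6 heads, which matches the rounded figures of 14 and 36. The simulated runs scatter around 25 even though nothing about the coin changes from run to run.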
Of course, it would be silly to blame outcomes on an inanimate object such as a coin.
Getting more heads does not mean the coin worked any harder to get them, just as getting
fewer does not mean the coin was slacking on the job. One would not pay a coin a
compliment or dock its pay for the results it gives. The laws of physics are clear on this
point: variation is a natural part of the universe. All measurements must be considered
in the context of the system that produced them to avoid confusing the "signal"
with the "noise."
Let's refer back to the example in Figure 1. Rather than computing
individual averages, each data point is plotted using a statistical tool called a control
chart, shown in Figure 2. On the left is the frequency distribution,
showing the shape of a normal distribution, the "bell curve." This
shape, along with the fact that none of the data points fall outside the +/- 3 sigma limit
lines, indicates with roughly 99 percent confidence that the variation in the system is due
to randomness. Even though computing averages for each agent is easier to do, any
comparisons between agents are meaningless; chance is the best explanation for any
differences. Just like the coin flipping experiment, conclusions about human behavior must
be drawn within the context of normal systemic variation. Data-driven reward systems that
do not account for chance are as effective as holding lotteries.
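As a rough illustration of how +/- 3 sigma limit lines are drawn, here is a minimal Python sketch. It is not the calculation behind Figure 2: the data are made up, the function names are hypothetical, and sigma is estimated from the average moving range (divided by the constant 1.128), which is one common convention for charts of individual values. Other conventions exist and may give slightly different limits.

```python
from statistics import mean

def control_limits(values):
    """Return (lower limit, center line, upper limit) for an individuals chart."""
    center = mean(values)
    # Moving range: absolute difference between consecutive points.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    # 1.128 is the standard constant for ranges of two observations.
    sigma = mean(moving_ranges) / 1.128
    return center - 3 * sigma, center, center + 3 * sigma

def signals(values):
    """Points outside the limits, i.e. candidates for a special cause."""
    lcl, _, ucl = control_limits(values)
    return [v for v in values if v < lcl or v > ucl]

# Hypothetical monthly closed-order counts, NOT the data behind Figure 1.
orders = [42, 47, 39, 51, 44, 40, 48, 45, 43, 50, 41, 46]

lcl, center, ucl = control_limits(orders)
print(f"center {center:.1f}, limits {lcl:.1f} to {ucl:.1f}")
print("points outside limits:", signals(orders))
```

With these made-up numbers every point sits inside the limits, so the month-to-month highs and lows give no basis for singling anyone out. A point well outside the limits would be the "signal" worth investigating.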
A better system of management is one based on "profound knowledge."
Recognizing that there is a common process used by all agents, the task is then to
systematically improve the process. Causes of variation are examined and steps taken to
improve the process, increasing the overall performance and making results more consistent
among agents. Teams are formed to share best practices on a regular basis, using synergy
to brainstorm and experiment with new ways to sell more. All agents, not just those who
devise better ways on their own, improve performance. Using quality methods and building a
culture of cooperation instead of competition leads to improved quality, higher employee
morale and better business results.
Even after applying quality principles, there will always be differences in individual
skills and motivation. Some people may be in jobs that are just not suited for them. On
the other hand, some agents occasionally deserve special recognition. If data are to be
used for making these decisions, more advanced techniques should be employed before
drawing conclusions. Analysis of Variance (ANOVA) compares group means while
accounting for random error. Figure 3 shows the results of proper
ANOVA, and once again, there are no significant differences between agents, even between
Bob and Dorothy, who represent the high and low averages. The result for this data set is
p = 0.396, or only about 60 percent confidence that there are any real
differences. Looking at average scores is only marginally better than flipping a coin to
determine who is the better performer, Bob or Dorothy.
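For readers who want to try this kind of comparison, the sketch below computes a one-way ANOVA F statistic in plain Python. Everything in it is an assumption made for illustration: the agent labels and order counts are invented, the function names are hypothetical, and the p-value is estimated with a permutation test (reshuffling the numbers among agents) rather than the F-distribution lookup a statistics package would normally perform. It will not reproduce the figures in Figure 3.

```python
import random
from statistics import mean

def f_statistic(groups):
    """One-way ANOVA F: between-group variance over within-group variance."""
    values = [v for g in groups for v in g]
    grand = mean(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    if ss_within == 0:
        return float("inf")  # no spread inside groups; F is undefined
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

def permutation_p_value(groups, trials=2000, seed=0):
    """Share of random reshuffles whose F is at least the observed F."""
    rng = random.Random(seed)  # arbitrary seed, for repeatability only
    observed = f_statistic(groups)
    pooled = [v for g in groups for v in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled) >= observed:
            hits += 1
    return hits / trials

# Hypothetical monthly closed orders per agent, invented for this sketch.
monthly_orders = {
    "agent_a": [42, 47, 39, 51, 44, 40],
    "agent_b": [45, 43, 50, 41, 46, 48],
    "agent_c": [40, 49, 44, 38, 47, 43],
}
groups = list(monthly_orders.values())
print(f"F = {f_statistic(groups):.2f}")
print(f"estimated p = {permutation_p_value(groups):.3f}")
```

A large p-value here means that shuffling the numbers at random among agents often produces gaps between averages as big as the ones observed, which is the situation the article describes for Bob and Dorothy.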
Call center managers need to look more closely at their underlying assumptions and how
they use performance data. Doing anything less means leaving success up to chance.
Ed Powers is president of Quality Alliances, a firm that provides quality
management, outsourcing and supplier management services tailored for marketing, sales and
support organizations. He has 12 years of sales, marketing and quality management
experience with Hewlett-Packard.