Finding The Holy Grail Of Truth In Your Recorded Call Data

By Anna Convery
Nexidia

Advanced recording technologies are a key investment for today’s businesses. These technologies provide a way to capture customer interactions in their entirety, in context. Within this recorded audio is some of the most valuable business performance intelligence any business can have: the truth of your customer interactions.

Whether you measure upsell or cross-sell percentages, call wait times, average call handling time or any of dozens of other performance indicators, your goal is to obtain an objective measure of success. Ultimately, the real truth about the value of your customer interactions can be gleaned only from the spoken words that constitute the actual interaction. All other interpretations are subjective; if you really want to understand what’s going on, you have to go back to the audio data.

Take, for example, average call handling time. Many companies rely on it as a key metric, but it can be a poor performance indicator in situations such as online procurement. A company with short call handling times, for example, might find it’s not seeing the sales performance it expected. It’s not until that company analyzes its audio data that it finds out why. To perform to the metrics, the company’s reps were taking the sale and then hanging up to shorten the call time. In the process, the agents were failing to verify the sale while they were on the call. As a result, when a sale didn’t go through, the rep would have to call back and reacquire the customer data. Although this mode of operation might result in shorter call handling times, it significantly raises the overhead associated with each sale and erodes profitability. It took an analysis of the actual customer interactions to reveal the disconnect.

While audio data are much richer than what’s captured in traditional reporting metrics, the information’s value is a two-edged sword. The more comprehensive the recordings, the more actionable intelligence they can provide for making strategic business decisions. At the same time, the more audio data that are gathered, the more critical and challenging becomes the task of extracting and analyzing that intelligence; that is, of getting to the real truth of customer interactions (the “holy grail” of call recording) in a way that is effective and efficient. The only way to unlock the value and get a good return on your recording investment is through audio mining, or speech analytics.

There are a number of different definitions, aspects and approaches to speech analytics, but the key to making this technology pay off for maximum return on investment is to be sure the method you use can access all of your recorded data, accurately, quickly and in a cost-effective and flexible manner.

Evaluating Speech Analytics
There are two main categories of audio mining technologies that employ different types of speech recognition. The first, speech-to-text (STT), relies on a dictionary-based approach. This method maps all words or phrases from the recorded audio into lexicon entries, converting them into text to create a searchable index. This mapping is based on a predefined dictionary of key words and phrases, incorporating advance decisions about appropriate word bindings. The key to effective STT analytics is to ensure that your dictionary contains the right key words and phrases and all of their variations. The inherent flexibility of the human language can make this challenging, and proper names or unusual phrases can cause inaccurate or incomplete results. Some speech-to-text systems address this issue by introducing semantics-based constraints; i.e., the probability of word sequences. While this improves the accuracy of the dictionary-dependent approaches, it can extend processing time — a significant impact in a situation where you are indexing and searching hundreds or even thousands of hours of recorded audio.

Unlike the speech-to-text approach, the second type of speech analytics, phonetic audio mining, processes recorded audio with a phonetic recognizer to generate the index file. With this method, search terms or phrases are converted into a phonetic sequence (phonemes are the individual sounds that make up the spoken word) and matches for this phonetic sequence are retrieved from the phonetic index files. In general, this method offers a number of advantages over STT. First, it’s faster. STT systems can pre-process audio data (create a searchable index) at two to three times real time (real time being the time it would take to play the audio at normal speed). The best phonetic search technologies, on the other hand, can pre-process the data at more than 60 times real time.
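To put those rates in perspective, the short Python sketch below converts a processing speed into indexing time. The 1,000-hour archive is a hypothetical figure chosen only for illustration, and the function name is ours; the 3x and 60x factors are the ones quoted above.

```python
def indexing_hours(audio_hours: float, speed_factor: float) -> float:
    """Hours needed to index audio when processing runs at
    `speed_factor` times real time."""
    if speed_factor <= 0:
        raise ValueError("speed_factor must be positive")
    return audio_hours / speed_factor


# Hypothetical archive size, used only for illustration.
ARCHIVE_HOURS = 1000.0

stt_hours = indexing_hours(ARCHIVE_HOURS, 3.0)        # upper end of the 2-3x range
phonetic_hours = indexing_hours(ARCHIVE_HOURS, 60.0)  # the 60x figure

print(f"STT at 3x real time:       {stt_hours:.1f} hours")       # 333.3 hours
print(f"Phonetic at 60x real time: {phonetic_hours:.1f} hours")  # 16.7 hours
```

Under these assumptions, the same archive takes roughly two weeks of continuous processing at 3x and well under a day at 60x.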

There is another drawback to the complex language model required for STT approaches to generate good search results: the need to re-index recorded audio every time a new word or term is introduced. If the word was not incorporated into the lexicon against which the recorded audio was processed, it cannot be located by subsequent STT searches, even if it was spoken many times throughout the recordings. Therefore, in order to add the term to the searchable index, it will have to be added to the lexicon, and all of the recorded audio re-processed. Phonetic speech analytics systems, on the other hand, maintain an open vocabulary, since they operate at the level of spoken sounds, not words. This means you can search indexed audio for new terms as needed, without having to re-process the original recordings.
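The difference can be illustrated with a deliberately simplified Python sketch. Everything in it is an assumption made for illustration: the pronunciation table, the made-up brand name "zorvex," the "<unk>" placeholder for out-of-lexicon words and the exact-match lookup. Real recognizers work on audio and score candidate matches probabilistically; this toy version only shows why a word missing from the lexicon is lost to a word-level index but remains findable in a sound-level one.

```python
from typing import Dict, List, Set

# Toy pronunciation table (hypothetical words and phoneme symbols).
PRONUNCIATIONS: Dict[str, List[str]] = {
    "the": ["DH", "AH"],
    "open": ["OW", "P", "AH", "N"],
    "bottle": ["B", "AA", "T", "AH", "L"],
    "zorvex": ["Z", "AO", "R", "V", "EH", "K", "S"],  # made-up brand name
}


def spoken_phonemes(words: List[str]) -> List[str]:
    """Stand-in for the audio: the sounds produced when the words are spoken."""
    return [p for w in words for p in PRONUNCIATIONS[w]]


def build_stt_index(words: List[str], lexicon: Set[str]) -> List[str]:
    """Word-level index: anything outside the lexicon is not captured."""
    return [w if w in lexicon else "<unk>" for w in words]


def stt_search(index: List[str], term: str) -> bool:
    return term in index


def phonetic_search(phoneme_index: List[str], term: str) -> List[int]:
    """Return positions where the term's phoneme sequence occurs in the index."""
    target = PRONUNCIATIONS[term]
    n = len(target)
    return [i for i in range(len(phoneme_index) - n + 1)
            if phoneme_index[i:i + n] == target]


call = ["the", "zorvex", "bottle"]
lexicon = {"the", "open", "bottle"}  # "zorvex" was never added

stt_index = build_stt_index(call, lexicon)
phoneme_index = spoken_phonemes(call)

print(stt_search(stt_index, "zorvex"))           # False: lost at indexing time
print(phonetic_search(phoneme_index, "zorvex"))  # [2]: found without re-indexing
```

In the word-level index, the only remedy is to add the term to the lexicon and rebuild the index from the audio; in the sound-level index, the new term is simply a new query.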

Are We There Yet?
As with any technology investment, one of the key determinants of time-to-value for speech analytics is how quickly you can go from installation to productivity. This involves the tasks that must happen in terms of implementation, customization and learning curve before the solution delivers you to your destination: that “aha!” moment when the audio you have recorded actually emerges as actionable business intelligence.

With a phonetic approach, you can begin indexing your audio immediately, because there is no need to pre-define special terms or phrases such as proper names, brand names, acronyms or slang terms to be able to find them in the processed audio. STT systems, on the other hand, require that you add such terms to the lexicon prior to processing if you wish to be able to search for them. If you choose the STT approach, be sure to evaluate what is required in terms of a user learning curve in order to manage this customization, and whether it will be necessary to engage the STT vendor each time you update the lexicon.

Accents and dialects are another area where a phonetic-based solution will give you an advantage. Because this type of software searches on sounds, not text, it is typically able to search effectively regardless of the speaker’s accent, dialect or speaking style.

The Needle In The Haystack
The relevance of returned search results has a significant impact on their usefulness. The surest way to find a needle in a haystack is to remove everything that isn’t a needle. Look for the ability to search for specific words and phrases in proximity to other content to generate the most relevant results. For example, a manufacturer of pharmaceuticals may want to identify any occurrence where “package” or “bottle” occurs in proximity to “open” in all toll-free number calls that make reference to the company’s arthritis medication. Such a contextual search will help the manufacturer spot and correct a packaging problem that makes it hard for customers to use the product before it translates into a drop in market share. Robust contextual search capabilities will help narrow the range of returned data and avoid unintended and insignificant results. Specific contextual searches are most often required by product managers, quality assurance personnel, compliance officers, and manufacturing and marketing managers. Therefore, in selecting a speech analytics system, be sure that both power users and casual users can quickly and easily perform ad-hoc contextual searches on large sets of audio data to maximize the usability and accessibility of the results.
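The pharmaceutical example can be sketched in a few lines of Python. This is a minimal illustration, not any vendor’s query language: it assumes the search engine has already returned term hits with time offsets for each call, and the call IDs, offsets and five-second window are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

Hit = Tuple[str, float]  # (term found, offset into the call in seconds)


def proximity_matches(hits: List[Hit], anchors: Set[str], context: Set[str],
                      window_seconds: float) -> List[Tuple[Hit, Hit]]:
    """Pair each anchor-term hit with context-term hits within the time window."""
    anchor_hits = [h for h in hits if h[0] in anchors]
    context_hits = [h for h in hits if h[0] in context]
    return [(a, c) for a in anchor_hits for c in context_hits
            if abs(a[1] - c[1]) <= window_seconds]


# Hypothetical search hits per call.
calls: Dict[str, List[Hit]] = {
    "call-001": [("arthritis", 12.0), ("bottle", 40.5), ("open", 42.0)],
    "call-002": [("arthritis", 8.0), ("package", 20.0), ("open", 95.0)],
    "call-003": [("bottle", 30.0), ("open", 31.0)],  # product never mentioned
}

flagged = [
    call_id for call_id, hits in calls.items()
    if any(term == "arthritis" for term, _ in hits)
    and proximity_matches(hits, {"package", "bottle"}, {"open"}, 5.0)
]

print(flagged)  # ['call-001']
```

Only the first call is returned: the second mentions both words but too far apart to be related, and the third never refers to the product. That narrowing is what keeps the result set small enough to act on.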

Usability will also be greatly enhanced by a speech analytics solution that provides sophisticated user reporting capabilities. Metadata — the data about the information that helps to classify and structure search results — should be presented through a flexible reporting interface that enables users to easily manipulate search results for the most meaningful presentation of the data. This allows recorded calls to be easily grouped for analysis based on your specific business model. The interface should facilitate the sorting of the information to receive true multi-dimensional views of the intelligence contained in the recorded audio.
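As a rough illustration of what grouping by metadata means, the Python sketch below sorts a handful of search results by different combinations of fields. The field names (“queue,” “product,” “term”) and the records themselves are hypothetical; a real system would expose whatever metadata your recording platform captures.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical search results with call metadata attached.
results = [
    {"call_id": "c1", "queue": "toll-free", "product": "arthritis", "term": "bottle"},
    {"call_id": "c2", "queue": "toll-free", "product": "arthritis", "term": "package"},
    {"call_id": "c3", "queue": "sales", "product": "allergy", "term": "bottle"},
]


def group_by(records: List[dict], *fields: str) -> Dict[Tuple[str, ...], List[str]]:
    """Group call IDs by the chosen metadata fields."""
    groups: Dict[Tuple[str, ...], List[str]] = defaultdict(list)
    for record in records:
        key = tuple(record[f] for f in fields)
        groups[key].append(record["call_id"])
    return dict(groups)


by_queue = group_by(results, "queue")
by_product_and_term = group_by(results, "product", "term")

print(by_queue)
print(by_product_and_term)
```

The same result set yields a different view each time the grouping fields change, which is the kind of flexibility to look for in a reporting interface.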

The next step in your search for the truth of your customer interactions will ultimately ensure the accurate analysis of the data: human listening. Your goal with the implementation of advanced recording technologies is to capture the actual spoken interactions between your customers and your representatives. Speech analytics gives you the ability to pinpoint the precise audio data you are looking for within hundreds or thousands of hours of recorded calls. In addition to creating reports and metrics based on those data, your speech analytics system should also make it easy for users to drill down to the referenced audio files at any stage of the analysis. This way, the audio can immediately be played and the result can be listened to within the context of the original file. This brings into play the uniquely discerning human listening skills that no technology can completely duplicate. For this reason, it is very important that your system enables easy access to the original recorded files.

It is also important to look for a search engine that can deliver relevant, accurate data even from poor-quality audio such as cell phone calls. There is little benefit in faithfully recording every customer call if you can ultimately analyze only those recordings that are clear and crisp.

Putting Analysis Into Action
There are a lot of reasons why you should record your customer interactions, but the leading driver for this is to evaluate and improve customer service and call center performance. For this purpose, analysis of the recordings will illuminate hard call center metrics — such as transfer of calls, first-call resolution, average handle time and lost sales opportunities — with root-cause intelligence, providing insight into apparent trends and performance levels. This intelligence enables customer service and call center managers to implement more effective improvement strategies. With the ability to identify, drill down and listen to specific problem calls, they gain an understanding of problem areas that enables them to develop targeted training and mentoring plans, streamlining the evaluation process and reducing agent turnover.

But with the right speech analytics, an even broader range of strategic business value can be derived from these recordings.

For example, product and brand management can benefit from unlocking the intelligence contained in spoken customer interactions. In the retail world, brand value, product lifecycle management and consumer knowledge are the foundations of success. The recordings of targeted customer interactions, such as customer surveys and focus groups, as well as customer calls to product hotlines, contain crucial information about customers, products and projected buying patterns. Speech analytics turn this captured audio into a major intelligence asset for the retail company’s analysts, product and brand managers, quality assurance and liability managers, and customer satisfaction managers. Eliciting relevant contextual information from captured audio enables them to conduct detailed, complex analyses of product groups and extensions, and provides the necessary intelligence to perform automated trending of product lifecycle indicators. Speech analytics give product managers an effective tool to automate reporting of key product launch metrics (competitors, pricing, product directions, etc.), as well as performing predictive analyses on new product ideas.

For purposes of market analysis and competitive positioning, recorded audio from product hotlines and market research focus groups can be a gold mine of information. With targeted, contextual search capabilities, analysts can identify and search on key product elements to narrow and focus the retrieved data for deep, detailed analysis. This will allow them to ascertain the real story of how the market views their products or services. With the right speech analytics system, it should be easy to translate the elements of a successful product launch into audio search terms, and thereby identify, quantify and act on unsolicited consumer feedback. This feedback — even simple suggestions regarding flavor or packaging — can translate into a multimillion-dollar market share.

Obviously, the bottom-line value of speech analytics is its impact on...well, the bottom line. All of the various commercial uses and applications of the technology are ultimately focused on maximizing financial performance, with revenue generation as the leading initiative. Recorded customer interactions enable companies to identify and maximize revenue opportunities based on the most important component in any market analysis: the voice of the customer. Regardless of whether you initially recorded your customer interactions with revenue generation in mind, these recordings offer an unparalleled information resource to develop and tune your sales strategies. Good speech analytics will unlock the information in recorded audio to provide a more comprehensive view of the organization. This allows you to mine patterns, trends and cause/effect relationships to help uncover new revenue opportunities, identify upsell and cross-sell opportunities, analyze lost sales and replicate successful agent methods.

Once you have a strong speech analytics system in place, you will discover more opportunities to gain actionable insight into your business performance, derived from the truth of your customer interactions. From basic customer service initiatives to product roll-outs to compliance and standards enforcement, leveraging the intelligence within your audio data with speech analytics will take you beyond recording to responding and beyond analysis to action that has a positive impact on the bottom line.

Anna Convery is senior vice president, marketing and product management, for Nexidia (www.nexidia.com), a provider of highly scalable, highly accurate rich media search and speech analytics software.
