Chatbots Are Confused by Figurative Language and Metaphors
Many companies are increasingly relying on artificial intelligence (AI)-driven chatbots to help handle routine customer inquiries. These chatbots can answer common questions – “Are you closed on Sundays?” or “What’s my balance?” – and route more complex queries to the right human agents. Chatbots do have their limits, however, and one of them is people’s propensity to use figurative language.
Harsh Jhamtani, a Ph.D. student at Carnegie Mellon University, and Taylor Berg-Kirkpatrick, a faculty member in the UC San Diego Department of Computer Science and Engineering, recently completed a study examining how well chatbot technology copes with idioms and metaphors. Jhamtani, a native Hindi speaker, drew on his own experience learning the quirks of English.
“We want to enable more natural conversations between people and dialog systems,” Jhamtani, the paper’s first author, was quoted as saying on Newswise.
In one example from the study, a customer using a fashion-oriented chatbot remarked that a garment’s pattern might be too “loud,” which prompted the chatbot to advise turning down the volume. The chatbot, of course, lacked the knowledge that “loud” can also describe an overly bright or busy fabric pattern.
In the study, the researchers tested five different systems designed to “speak” with humans, including GPT-2, a model developed by the research company OpenAI and trained to predict the next word across 40GB of Internet text.
The researchers first ran the dialog systems on a dataset of 13.1K conversations covering colloquial topics such as tourism and health. They then extracted the conversations containing figurative language and ran the systems on only that subset, observing a drop in performance ranging from 10 to 20 percent.
They then wrote a script that let the systems quickly consult dictionaries that translate figurative speech into literal speech. This is faster and more efficient than re-training the systems on the complete content of those dictionaries. With this preprocessing in place, performance improved by as much as 15 percent. The researchers noted that while the results were promising, further study is required.
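To illustrate the general idea, the preprocessing step can be sketched as a simple lookup-and-substitute pass over the user's message before it reaches the dialog system. This is only a minimal sketch: the dictionary entries, function names, and plain string matching below are illustrative assumptions, not the researchers' actual script or data.

```python
# Illustrative idiom-to-literal dictionary (hypothetical entries,
# not taken from the study's resources).
IDIOM_DICTIONARY = {
    "piece of cake": "very easy",
    "under the weather": "slightly ill",
    "loud": "brightly patterned",
}

def literalize(message: str) -> str:
    """Replace known figurative phrases with literal paraphrases."""
    result = message.lower()
    # Match longer phrases first so multi-word idioms take priority
    # over any shorter entries they might contain.
    for idiom in sorted(IDIOM_DICTIONARY, key=len, reverse=True):
        result = result.replace(idiom, IDIOM_DICTIONARY[idiom])
    return result

# The rewritten text would then be handed to the dialog system as usual.
print(literalize("This pattern is a bit loud for me"))
```

A real system would need more careful matching (word boundaries, context sensitivity, inflected forms), but the appeal of the approach is exactly what the article describes: a cheap lookup at inference time rather than re-training the model.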
The research was presented at the 2021 Conference on Empirical Methods in Natural Language Processing, which took place from November 7 to 11, 2021.
Edited by Luke Bellos