How Natural Language Processing can augment voice of customer research

7 minute read


Artificial intelligence (AI) has become increasingly popular and embedded in our daily lives. Natural language processing (NLP), a branch of AI, is an interdisciplinary field that draws on linguistics and computer science and deals with how computers understand and generate natural language, in both text and speech. For example, it is used to understand the queries we enter on a search website or the tasks we ask a voice assistant to do on the phone. The recently released ChatGPT has drawn more attention to NLP and sparked conversations about AI again.

When it comes to user experience, AI has been widely used for personalization, such as recommendations based on a user's previous interactions with a system. More recently, NLP has increasingly been used to analyse text responses in customer feedback and reviews. This article explains three popular ways in which language technologies can facilitate the analysis of text data and assist humans in the research process. In each case, however, researchers are still needed to connect the dots and make sense of the data.

Language technology and applications

Thematic analysis and code framing

Thematic analysis is a common qualitative approach to analysing text data, such as user interviews, in which researchers identify themes in different pieces of text and annotate them accordingly. There are usually two ways of conducting thematic analysis: inductive analysis, a bottom-up approach where researchers create codes based on the text data itself, and deductive analysis, a top-down approach where researchers use a pre-defined code frame to annotate each piece of text. Although manual thematic analysis can achieve a high level of accuracy, it is labor-intensive and time-consuming when there is a large amount of data to process.

In contrast to traditional thematic analysis, which is done manually, automated thematic analysis uses AI to tag texts with topics, either according to pre-defined topics supplied by human researchers or based on topics identified using machine learning techniques. For example, a retail company wanted to understand what people talked about when they mentioned the company online during COVID-19. Twitter posts that mentioned the company were collected for analysis. Because there was no prior research in this new context, automated thematic analysis was used to identify common topics among the posts, such as shipping, delivery, product availability, staffing, and waiting time. Human researchers could then see that shipping and delivery were linked, and so were staffing and waiting time, and regrouped the topics into shipping and delivery, product availability, and staffing management. In this case, AI identified the common topics of a large text dataset, which saved researchers a lot of time and effort. Researchers then made connections between different pieces of information and constructed a better picture of the data.
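
The pre-defined-topics route and the regrouping step described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the case study: the keyword lists, the sample posts, and the function names (`tag_topics`, `tag_themes`, `theme_counts`) are all assumptions made up for this example, and the machine-learning route for discovering topics is not shown.

```python
import re
from collections import Counter

# Hypothetical code frame: topic -> trigger keywords (illustrative only).
CODE_FRAME = {
    "shipping": ["shipping", "shipped"],
    "delivery": ["delivery", "delivered"],
    "product availability": ["out of stock", "sold out", "in stock"],
    "staffing": ["staff", "cashier"],
    "waiting time": ["queue", "waiting", "wait"],
}

# Regrouping chosen by human researchers, following the example in the text.
THEME_OF_TOPIC = {
    "shipping": "shipping and delivery",
    "delivery": "shipping and delivery",
    "product availability": "product availability",
    "staffing": "staffing management",
    "waiting time": "staffing management",
}

# Made-up posts, used only to exercise the functions below.
SAMPLE_POSTS = [
    "Shipping took two weeks and the delivery window kept changing.",
    "Everything I wanted was sold out again.",
    "Only one cashier working, the queue was out the door.",
]


def tag_topics(post):
    """Return the set of code-frame topics whose keywords appear in the post."""
    text = post.lower()
    topics = set()
    for topic, keywords in CODE_FRAME.items():
        # Whole-word (or whole-phrase) match, so "staff" does not match "staffing".
        if any(re.search(r"\b" + re.escape(kw) + r"\b", text) for kw in keywords):
            topics.add(topic)
    return topics


def tag_themes(post):
    """Map the fine-grained topics of a post onto the regrouped themes."""
    return {THEME_OF_TOPIC[topic] for topic in tag_topics(post)}


def theme_counts(posts):
    """Count how many posts mention each regrouped theme."""
    counts = Counter()
    for post in posts:
        counts.update(tag_themes(post))
    return counts


print(theme_counts(SAMPLE_POSTS))
```

Keeping the regrouping in a separate mapping means researchers can revise how topics are combined without touching the tagging step.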

Sentiment analysis

Sentiment analysis is the use of NLP to analyse and determine the affective state expressed in a piece of writing. It is commonly used in social listening tools, in combination with automated thematic analysis, to understand how people talk about a brand or product. The output of sentiment analysis is a set of quantitative scores, which are then categorized into positive, negative, and neutral sentiments.
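
The step from scores to categories can be sketched as a simple threshold rule. The score range of -1 to 1 and the cut-off of 0.05 are assumptions chosen for illustration; actual tools define their own scales and thresholds, and the function names here are hypothetical.

```python
from collections import Counter


def categorize(score, threshold=0.05):
    """Map a sentiment score (assumed to lie in [-1, 1]) to a label.

    Scores within `threshold` of zero are treated as neutral.
    """
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


def summarize(scores, threshold=0.05):
    """Count how many scores fall into each sentiment category."""
    return Counter(categorize(score, threshold) for score in scores)


print(summarize([0.6, -0.4, 0.0, 0.02]))
```

The width of the neutral band is a design choice: a wider band sends more borderline posts to "neutral", which is one reason different tools can label the same sentence differently.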

Take the retail company mentioned above as an example. In addition to understanding what customers talked about when they mentioned the brand online, the company also wanted to capture real-time feedback on how people perceived the brand. After identifying the topics, the social listening tool tagged each post with the relevant topics. It then analysed the sentiment of each topic in a post by looking at words that expressed affective states. This showed the company that customers expressed negative sentiment toward product availability and shipping and delivery because of supply shortages. With this kind of analysis, brands can better understand what they have done well and what they need to improve in the future.

Sentiment analysis is helpful when there are large datasets or continuous data. It is an efficient way to get a feel for customer reviews and track changes and trends. However, it still needs human researchers to double-check and make sense of the data. For example, researchers need to carefully examine how sentiment is tagged. In theory, a neutral sentiment is one that is neither positive nor negative, or that lacks sentiment altogether. The sentence “I had the toilet paper shipped to me” was tagged as “neutral” because it did not indicate any sentiment. “The shipping took forever but that was cheap” was automatically tagged as either positive or negative, depending on the software. A human reader, however, might judge it to be closer to neutral, because it mixes a complaint with praise.

Detecting sarcasm is even more challenging for computers performing sentiment analysis, because interpreting sarcasm depends on more than the literal words. Even humans need different cues, such as semantics and context, to determine whether a comment is sarcastic. For example, a human can understand that “I just LOVE it when my online orders turn up late” is a sarcastic comment on shipping and delivery. However, a sentiment analysis tool tags the sentence as positive with high confidence because of the uppercase word “love.” Therefore, researchers may need to review ambiguous cases in the output and adjust the models where necessary to improve accuracy.
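
To see why word-level scoring struggles with mixed and sarcastic sentences, consider the toy lexicon-based scorer below. It is a deliberately simplified sketch, not the tool referred to in the text: the four-word lexicon, its weights, and the all-caps boost are invented for illustration. The `needs_review` helper shows one possible way to flag ambiguous cases for a human researcher.

```python
import re

# Toy lexicon with made-up weights; real tools use much larger, tuned
# lexicons or trained models.
LEXICON = {"love": 3.0, "cheap": 1.0, "late": -1.0, "forever": -1.5}
CAPS_BOOST = 1.5  # assumed multiplier for words written in ALL CAPS


def word_scores(text):
    """Return the weights of all lexicon words found in the text."""
    scores = []
    for token in re.findall(r"[A-Za-z']+", text):
        weight = LEXICON.get(token.lower())
        if weight is None:
            continue
        if token.isupper() and len(token) > 1:
            weight *= CAPS_BOOST  # emphasis makes the word count for more
        scores.append(weight)
    return scores


def label(text):
    """Label a text by the sign of its summed word weights."""
    total = sum(word_scores(text))
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


def needs_review(text):
    """Flag texts that contain both positive and negative words."""
    scores = word_scores(text)
    return any(s > 0 for s in scores) and any(s < 0 for s in scores)


for sentence in [
    "I had the toilet paper shipped to me",
    "The shipping took forever but that was cheap",
    "I just LOVE it when my online orders turn up late",
]:
    print(label(sentence), needs_review(sentence), sentence)
```

With these made-up weights, the sarcastic sentence is labelled positive because the boosted "LOVE" outweighs "late", while the mixed-polarity check still routes it, and the shipping sentence, to a human for review.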

Word cloud

Lastly, a word cloud is a visual representation of words based on their frequency and relevance in a set of text data. For example, for an open-ended question about people’s perception of a brand, a word cloud can help researchers quickly grasp what customers talk about in their feedback. Researchers can also decide which words are relevant and which to exclude to improve the visualization. However, it is worth noting that a word cloud is not the result of the analysis; it is only a tool to assist the analysis of text data. Since word clouds only show the most relevant words and their weight (by frequency of use), human analysis is still needed to filter the results and interpret the data in context.
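
The weights behind a word cloud are essentially word counts after uninformative words have been excluded. The sketch below computes those counts only and does not draw anything; the stop-word list, the sample responses, and the `word_weights` function are assumptions made for illustration.

```python
import re
from collections import Counter

# Small illustrative stop-word list; real tools ship much longer ones.
STOP_WORDS = {"the", "a", "an", "is", "are", "was", "to", "of", "and",
              "for", "too", "i", "it", "my", "in"}

# Made-up survey responses, used only to exercise the function below.
SAMPLE_RESPONSES = [
    "The exam fee is too high",
    "Exam support was great",
    "The fee for the exam went up",
]


def word_weights(responses, extra_exclusions=(), top_n=5):
    """Return the top_n most frequent words as (word, count) pairs.

    `extra_exclusions` lets a researcher drop words they judge irrelevant.
    """
    excluded = STOP_WORDS | {word.lower() for word in extra_exclusions}
    counts = Counter()
    for response in responses:
        for word in re.findall(r"[a-z']+", response.lower()):
            if word not in excluded:
                counts[word] += 1
    return counts.most_common(top_n)


print(word_weights(SAMPLE_RESPONSES, top_n=2))  # [('exam', 3), ('fee', 2)]
```

Passing `extra_exclusions=["exam"]` mirrors the researcher's decision to hide a dominant word so that less frequent ones become visible.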

Take the feedback in a satisfaction survey for a professional body as an example. A word cloud showed researchers the most commonly mentioned areas, such as “exam”, “study”, “fee”, “qualification”, and “support”. However, this high-level analysis was still too broad to yield meaningful insights and actionable advice. Having formed a preliminary idea of the responses, researchers needed to dive deeper to interpret what these words meant: to discover the context of each word and which words collocate with it. By narrowing the field to the five most frequent words, researchers could identify potential subtopics around exam: “study for exam”, “exam fee”, “exams required for a qualification”, or “support for exam”. After quickly examining the responses that contained the keyword “exam”, researchers found that the main theme was “exam fee”, specifically customers complaining about the high cost of the exams. The combination of AI and human input provides a more holistic understanding of text responses.
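
Finding which words collocate with a keyword such as “exam” can be approximated by counting its neighbours once stop words are removed. This is a rough sketch under stated assumptions rather than the procedure the researchers followed: the responses are invented, the stop-word list is minimal, and `collocates` with its `window` parameter is a hypothetical helper.

```python
import re
from collections import Counter

# Small illustrative stop-word list.
STOP_WORDS = {"the", "a", "an", "is", "are", "was", "to", "of", "and",
              "for", "too", "i", "it"}

# Made-up survey responses, used only to exercise the function below.
SAMPLE_RESPONSES = [
    "The exam fee is too high",
    "I cannot afford the exam fee",
    "More support for the exam please",
    "Hard to study for the exam",
]


def collocates(responses, keyword, window=1):
    """Count words that appear within `window` content words of `keyword`."""
    counts = Counter()
    for response in responses:
        # Drop stop words first so "study for the exam" yields study + exam.
        tokens = [t for t in re.findall(r"[a-z']+", response.lower())
                  if t not in STOP_WORDS]
        for i, token in enumerate(tokens):
            if token != keyword:
                continue
            start = max(0, i - window)
            neighbours = tokens[start:i] + tokens[i + 1:i + 1 + window]
            counts.update(neighbours)
    return counts


print(collocates(SAMPLE_RESPONSES, "exam").most_common(1))  # [('fee', 2)]
```

The counts only point to candidate subtopics; reading the matching responses is still what reveals that the comments about fees are complaints about cost.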


To conclude, language technologies based on NLP and AI can help researchers conduct both qualitative and quantitative analysis efficiently, especially with large amounts of text data. A growing number of case studies suggest that AI has great potential in assisting market and design research. In the future, the use of NLP in customer research may extend to the analysis of paralinguistic features, such as intonation and speech rate. For example, sentiment analysis of customer calls may help contact centers understand how satisfied customers are with their experiences. Future NLP tools and models may also allow more customization, which would help businesses gain more insight into the voice of the customer by analysing language data specific to their context or business. However, we should always be careful when incorporating AI into research and understand that AI cannot replace human researchers. Instead, it should be used to augment our capacity to conduct better, human-centered research.