
Master's Thesis Dilara Ademoglu

Last modified Feb 15, 2023

Early Detection of Issues in New Car Models using Natural Language Processing on Customer Call Data

Abstract

Early detection of quality issues can give businesses a significant competitive advantage. Many automotive companies, including BMW Group, conduct tests during production to detect potential production errors. Even when a vehicle passes all tests, some issues go unnoticed and surface only on the customer's side after purchase. BMW Group seeks to detect such unnoticed issues by inspecting customer call data. The aim is to improve both the tests and the vehicle quality, and to prioritize known issues by their frequency and importance, with the end goal of improving overall customer satisfaction. Customer call data is collected regularly through call centers to get feedback on customer satisfaction, and it is unstructured. Annotating this type of data is expensive and time-consuming, which makes it challenging to apply supervised machine learning methods.

This thesis aims to implement NLP techniques as part of the root cause analysis. The techniques used are topic modeling and context-based text matching. We compare two topic modeling approaches, one traditional and one embedding-based, to obtain topic representations of customer call data and production data, and we discuss which representation is preferable. The selected topic representations are then compared using similarity measures to match similar topics between the two datasets. We analyze the results of the topic matching task to discover recurring issues in customer call data that are not recorded in the production error data. We evaluate our findings with topic coherence measures and with a user study. With the help of state-of-the-art NLP models and data analysis, we demonstrate that potential issues reported by customers can be detected and mapped to the corresponding issues in the production error data, which is crucial for prioritizing issues and improving customer satisfaction.
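The topic matching step described above can be illustrated with a minimal sketch. The abstract does not say which similarity measure or threshold the thesis uses, so cosine similarity, the threshold of 0.8, the function names, and the toy topic names and vectors below are all assumptions made for illustration only. Each topic is assumed to be represented by a fixed-length vector; customer topics whose best match falls below the threshold are flagged as candidates for issues missing from the production data.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def match_topics(customer_topics, production_topics, threshold=0.8):
    """Pair each customer topic with its most similar production topic.

    Returns (matches, unmatched): matches maps a customer topic name to
    (production topic name, score); unmatched lists customer topics whose
    best score is below the (hypothetical) threshold.
    """
    matches = {}
    unmatched = []
    for c_name, c_vec in customer_topics.items():
        best_name, best_score = None, -1.0
        for p_name, p_vec in production_topics.items():
            score = cosine_similarity(c_vec, p_vec)
            if score > best_score:
                best_name, best_score = p_name, score
        if best_name is not None and best_score >= threshold:
            matches[c_name] = (best_name, best_score)
        else:
            unmatched.append(c_name)
    return matches, unmatched


# Invented toy vectors, not data from the thesis.
customer_topics = {
    "brake noise": [0.9, 0.1, 0.0],
    "infotainment freeze": [0.0, 0.2, 0.9],
}
production_topics = {
    "brake assembly defect": [0.8, 0.2, 0.1],
    "paint blemish": [0.1, 0.9, 0.1],
}

matches, unmatched = match_topics(customer_topics, production_topics)
for name, (target, score) in matches.items():
    print(f"{name} -> {target} ({score:.2f})")
print("No production match:", unmatched)
```

In this toy setup the unmatched list plays the role of the recurring customer issues that have no counterpart in the production error data. In practice the threshold would need to be tuned, for example against the judgments collected in a user study.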

 

Research Questions:

  1. Which state-of-the-art topic modeling approaches would provide better insight into BMW customer feedback datasets?

  2. Which text similarity techniques give better matches between BMW customer feedback datasets?

  3. How can interactive topic visualizations of BMW customer feedback datasets support quality control departments?
