
Guided Research Rajna Fani


A Human Assessment of Reference-Free and Reference-Based Evaluation Approaches in the HR Domain

 

Abstract and Motivation:

In the era of Large Language Models (LLMs), assessing the quality of generated text remains an open challenge. This study explores the effectiveness of reference-free metrics for evaluating the quality of text produced by advanced language models, comparing them with traditional reference-based evaluation methods.

The practical motivation is the prolonged waiting time employees face when seeking information from the Human Resources department. SAP HR chatbots that harness advanced text generation models have the potential to expedite responses and reduce the HR department's workload.

Moreover, the study examines the reliability of reference-free evaluation metrics against traditional reference-based metrics, and assesses how automatic metrics compare with human evaluation by domain experts. Two approaches, the Fine-tuned Language Model (LM) Approach and the LLM-Powered Approach, are evaluated on a question-answering dataset of FAQs and user utterances drawn from chatbot logs to gauge generative model performance. A minimal contrast between the two metric families is sketched below.
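To make the contrast concrete, the following sketch scores a single HR-style answer both ways: once against a gold reference with a traditional overlap metric (ROUGE-L, via Google's rouge-score package), and once reference-free by prompting a language model to act as a judge. The example texts, the JUDGE_PROMPT, and the score_with_llm_judge helper are illustrative assumptions, not the study's actual protocol.

# pip install rouge-score
from rouge_score import rouge_scorer

question = "How do I request parental leave?"
candidate = "You can request parental leave through the HR self-service portal."
reference = "Parental leave requests are submitted via the HR self-service portal."

# Reference-based scoring: lexical overlap between the generated answer
# and a gold reference answer (ROUGE-L, a traditional metric).
scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
rouge_l_f1 = scorer.score(reference, candidate)["rougeL"].fmeasure
print(f"ROUGE-L F1: {rouge_l_f1:.3f}")

# Reference-free scoring: a language model judges the answer from the
# question alone, with no gold reference. The prompt and helper below
# are illustrative assumptions.
JUDGE_PROMPT = (
    "Rate the following answer to an HR question on a 1-5 scale for "
    "correctness and helpfulness. Reply with the number only.\n\n"
    f"Question: {question}\nAnswer: {{answer}}"
)

def score_with_llm_judge(answer: str) -> int:
    """Send JUDGE_PROMPT.format(answer=answer) to an LLM endpoint of your
    choice and parse the returned 1-5 rating (left unimplemented here)."""
    raise NotImplementedError("plug in your LLM client here")

The key practical difference: the reference-based path needs a curated gold answer for every question, while the reference-free path needs only the question and a judge model, which is what makes it attractive for chatbot logs where gold answers are scarce.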

Research Questions

1. What are the emerging state-of-the-art metrics in the evaluation of generative conversational agents, and how do they compare to traditional metrics?

2. Are reference-free evaluation metrics, especially those leveraging advanced language models, a more reliable indicator of a generative model's performance than traditional reference-based metrics?

3. How closely do automatic metrics agree with human evaluation by domain experts when assessing generative model performance? (See the meta-evaluation sketch below this list.)
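Question 3 is commonly operationalized as meta-evaluation: collect expert ratings for a sample of generated answers and measure how strongly each automatic metric's scores correlate with them. A minimal sketch using Spearman's rank correlation follows; the scores and ratings are made-up illustrative values, not results from the study.

# pip install scipy
from scipy.stats import spearmanr

# Hypothetical per-answer scores for one automatic metric (e.g., ROUGE-L F1)
# and the matching 1-5 ratings from HR domain experts.
metric_scores = [0.72, 0.41, 0.88, 0.35, 0.60]
expert_ratings = [4, 2, 5, 2, 3]

# Spearman's rank correlation is standard here: a metric is useful if it
# ranks answers in roughly the same order as the human experts do.
rho, p_value = spearmanr(metric_scores, expert_ratings)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")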

 


Files and Subpages

Name                                    Size     Last Modification
Final_presentation_GR.pdf               5.79 MB  25.06.2024
Kick-off Presentation.pdf               3.38 MB  13.10.2023
Rajna-Fani_Guided-Research-Report.pdf   581 KB   25.06.2024