
Master's Thesis Andrei Staradubets

Last modified Jan 29

Investigating Fact Checking Approaches for Faithful Text Generation based on Structured Knowledge Bases

 

Some of the most impressive recent advancements in the field of Natural Language Processing (NLP) are centered around generative language models that can produce complex and human-sounding text. Despite being well structured and grammatically correct, generated text often contains factual errors and made-up claims not supported by evidence. This phenomenon is known as model hallucination and is a common problem in most Natural Language Generation (NLG) applications, such as dialogue systems, machine translation, and text summarization.

 

Abstractive text summarization deals with models that generate shorter versions of long source documents. Generating summaries that are faithful to the original text and factually consistent is an open research problem [1]. Some approaches to this problem introduce new training objectives and factuality metrics into the pre-training process of models. Another promising direction is post-editing with fact correction: methods that fact-check the claims in candidate summaries and correct the detected errors by grounding them in background knowledge.

 

The aim of this master's thesis is to develop and test approaches for post-editing and fact correction in generated summaries, making them more faithful to the original text and factually correct. Candidate approaches include retrieving evidence from external knowledge bases [2], iterative editing with in-filling from masked language models [3], or a combination of the two. The developed methods will be evaluated using factuality metrics on common summarization datasets such as XSum and CNN/DM, as well as on domain-specific biomedical datasets.
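The post-editing idea described above can be illustrated with a deliberately simplified sketch. It is not the method developed in the thesis: the entity detection is a crude regular expression standing in for a real entity recognizer, and the "in-filling" step simply picks a replacement from the source text rather than querying a masked language model. All function names (`extract_entities`, `find_unsupported`, `correct_summary`) and the example texts are hypothetical.

```python
import re

# Crude stand-in for entity recognition (an assumption of this sketch):
# capitalized words and numbers are treated as checkable "entities".
ENTITY_RE = re.compile(r"\b(?:[A-Z][A-Za-z]+|\d+(?:\.\d+)?)\b")


def extract_entities(text):
    """Return all entity-like tokens in order of appearance."""
    return ENTITY_RE.findall(text)


def find_unsupported(summary, source):
    """Fact-checking step: summary entities that never occur in the source."""
    supported = set(extract_entities(source))
    return [e for e in extract_entities(summary) if e not in supported]


def correct_summary(summary, source):
    """Correction step: swap each unsupported entity for a source entity.

    A real system would mask the span and let a masked language model fill
    it in, conditioned on the source. Here the replacement is the first
    source entity of the same kind (number vs. name) that the summary does
    not already mention.
    """
    source_entities = extract_entities(source)
    summary_entities = set(extract_entities(summary))
    corrected = summary
    for wrong in find_unsupported(summary, source):
        candidates = [
            e for e in source_entities
            if e[0].isdigit() == wrong[0].isdigit() and e not in summary_entities
        ]
        if candidates:
            pattern = rf"\b{re.escape(wrong)}\b"
            corrected = re.sub(pattern, candidates[0], corrected, count=1)
    return corrected


if __name__ == "__main__":
    # Invented example texts, used only to exercise the functions.
    source = "Acme opened a new lab in Munich in 2019 with 40 researchers."
    summary = "Acme opened a lab in Berlin in 2018 with 40 researchers."
    print(find_unsupported(summary, source))
    print(correct_summary(summary, source))
```

The sketch separates detection (`find_unsupported`) from correction (`correct_summary`), mirroring the fact-check-then-edit structure of post-editing approaches. Either step could be swapped for a stronger component, such as evidence retrieval from a knowledge base or a masked language model.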

 

Files and Subpages

Name Type Size Last Modification Last Editor
24015 Andrei Staradubets Master Thesis.pdf 2,77 MB 19.02.2024
Staradubets Final Presentation.pdf 1,38 MB 29.01.2024
Staradubets Kick-off Presentation.pdf 1,20 MB 29.01.2024