Master's Thesis Meruyert Zhakpekova

Last modified Nov 13, 2023

Abstract: 

The thesis investigates the use of Differential Privacy and Text Simplification techniques to protect the privacy of individuals while still allowing further analysis or modeling of text data. The study addresses authorship attribution problems in domains where anonymity is critical, such as privacy-preserving data sharing or whistleblower protection. The main assumption is that integrating Text Simplification models with Differential Privacy techniques improves privacy guarantees and minimizes the risk of re-identification without compromising the accuracy of the original text. The study seeks to improve the understanding of the relationship between text simplification and privacy, thus enabling the development of more effective and robust models in the future.

Research questions: 

1. How can the fine-tuning of large text simplification models be leveraged as a basis for authorship obfuscation?

2. To what extent is it feasible to integrate Differential Privacy techniques into the proposed pipeline, and at which stage is noise addition optimal?

3. How can the effectiveness of the proposed approach be evaluated from a privacy standpoint through both manual and automatic means?

 

 
