Master's Thesis Endrit Jashari

Last modified May 26

Assessing the Effect of Post-processing on User Acceptance in Differentially Private Text Rewriting

Abstract

Differential Privacy (DP) has long been the standard for protecting structured data. However, as AI advances and unstructured data becomes more prevalent, applying DP to fields such as Natural Language Processing (NLP) presents new challenges. In NLP, DP can take different forms, such as privacy-preserving model training with differentially private weight updates, or text rewriting, which transforms sensitive input into a private version while maintaining its utility.

This master's thesis will focus on the latter approach, proposing various mechanisms for post-processing the outputs of differentially private language models to enhance utility while preserving privacy in the resulting text. Building on “Just Rewrite It Again: A Post-Processing Method for Enhanced Semantic Similarity and Privacy Preservation of Differentially Private Rewritten Text” (Meisenbacher & Matthes, 2024), which demonstrates that iterative rewriting improves both semantic alignment and empirical privacy, we explore different approaches to post-processing DP-rewritten text. The importance of post-processing mechanisms in differentially private text generation is highlighted in “Investigating User Perspectives on Differentially Private Text Privatization” (Meisenbacher, Klymenko, Karpp, & Matthes, 2025), which reveals a paradox in user preferences: while people claim to value privacy, when shown real examples they often prefer outputs that offer higher utility, even at the cost of some privacy.

To further explore this, a user study will be conducted to assess how different post-processing techniques affect both text quality and privacy satisfaction. The objective is to identify the combination of methods that best balances utility and user satisfaction while maintaining DP. In this thesis, the same techniques as in “Investigating User Perspectives on Differentially Private Text Privatization” will be used to push towards a uniform evaluation process for post-processing mechanisms in differential privacy.

Research Questions

RQ1: Can post-processing techniques improve the privacy-utility trade-off in differentially private text rewriting?
RQ2: How does post-processed DP rewritten text affect user willingness to share sensitive data, particularly in comparison to non-post-processed outputs?
