
Master's Thesis Şükrü Can Gültop


Experimental Analysis of the Interaction of Methods for Post-hoc Interpretability and Differential Privacy in Natural Language Processing

 

Abstract

Recent developments in deep learning and the widespread use of big data in natural language processing (NLP) have significantly heightened the importance of privacy and explainability. As NLP models become increasingly powerful and pervasive, they are often trained on vast amounts of sensitive textual data, raising serious concerns about data privacy and confidentiality. Moreover, as these models grow in complexity, understanding their decision-making processes and ensuring transparency have become crucial, particularly in high-stakes applications such as healthcare and finance. Consequently, there is a pressing need to balance the potential of NLP technologies against individual privacy rights while also providing meaningful insights into how these models arrive at their predictions. In this context, our primary emphasis is on differential privacy as the chosen privacy-enhancing technique, coupled with post-hoc interpretability methods valued for their formal and quantifiable characteristics. Our study comprehensively investigates how different post-hoc interpretability methods interact with differential privacy techniques, with a specific focus on NLP tasks. The core objective of this research is to provide an experimental analysis of the effects of differential privacy methods on explanations generated by post-hoc interpretability methods, and to uncover potential synergies, conflicts, and trade-offs.

Our methodology involves systematically applying various post-hoc interpretability techniques and differential privacy mechanisms to NLP models. We then rigorously evaluate their combined effects on model performance and interpretability. By quantifying these interactions, our findings aim to shed light on the feasibility of achieving both transparency and data protection in the NLP domain.
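
As a rough illustration of what such a pipeline can look like, the sketch below applies a simplified word-level DP step in the spirit of metric-DP word replacement: noise scaled by the privacy parameter is added to a word's embedding, and the noisy vector is mapped back to the nearest vocabulary word. The vocabulary, the toy embeddings, and the per-coordinate Laplace noise are illustrative assumptions, not the specific mechanisms evaluated in the thesis.

```python
# Minimal sketch of the word-level privatisation step of the pipeline
# (hypothetical setup, NumPy only). In practice the embeddings would come
# from a pretrained model and the noise would follow the calibrated
# multivariate distribution of the chosen word-level DP mechanism.
import numpy as np

rng = np.random.default_rng(0)

# Toy vocabulary and embedding matrix.
vocab = ["good", "bad", "movie", "plot", "boring", "great"]
emb = rng.normal(size=(len(vocab), 8))

def privatize_token(token: str, epsilon: float) -> str:
    """Add noise scaled by 1/epsilon to the token's embedding and return the
    nearest vocabulary word (a simplification of metric-DP word replacement)."""
    idx = vocab.index(token)
    noisy = emb[idx] + rng.laplace(scale=1.0 / epsilon, size=emb.shape[1])
    dists = np.linalg.norm(emb - noisy, axis=1)
    return vocab[int(np.argmin(dists))]

def privatize_sentence(tokens: list[str], epsilon: float) -> list[str]:
    return [privatize_token(t, epsilon) for t in tokens]

# Smaller epsilon -> larger noise -> more word substitutions (stronger privacy).
sentence = ["great", "movie", "boring", "plot"]
for eps in (0.5, 5.0, 50.0):
    print(eps, privatize_sentence(sentence, eps))
```

Models trained on the privatised text can then be explained with the selected post-hoc methods and compared against their non-private counterparts; the privacy parameter is exactly the knob varied in RQ2.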

 

Research Questions

RQ1: Which existing word-level differential privacy and post-hoc explainability methods can be systematically evaluated in a single pipeline?

RQ2: In what ways do varying the differential privacy parameter (ε) and the choice of word-level DP mechanism affect post-hoc model interpretations?

RQ3: How can we evaluate the trade-off between differential privacy and post-hoc model interpretations?
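
One conceivable way to operationalise RQ3, assuming token-level attribution scores (e.g., from Integrated Gradients or LIME) are available for both a non-private and a DP-trained model, is to measure how strongly the two explanations agree as the privacy budget ε varies. The metrics and toy scores below are illustrative placeholders, not the evaluation protocol of the thesis.

```python
# Illustrative agreement metrics between two attribution vectors over the same
# tokens: Spearman rank correlation and top-k overlap. Plotted against epsilon
# (together with task accuracy), such scores expose the trade-off empirically.
import numpy as np
from scipy.stats import spearmanr

def topk_overlap(a: np.ndarray, b: np.ndarray, k: int = 3) -> float:
    """Fraction of the k most important tokens shared by both explanations."""
    top_a = set(np.argsort(-np.abs(a))[:k])
    top_b = set(np.argsort(-np.abs(b))[:k])
    return len(top_a & top_b) / k

# Toy attribution scores for the same 6-token input: non-private model vs. a
# model trained on DP-privatised text.
attr_nonprivate = np.array([0.9, 0.1, -0.4, 0.05, 0.7, -0.2])
attr_private    = np.array([0.6, 0.2, -0.1, 0.30, 0.5, -0.3])

rho, _ = spearmanr(attr_nonprivate, attr_private)
print(f"Spearman rho: {rho:.2f}, "
      f"top-3 overlap: {topk_overlap(attr_nonprivate, attr_private):.2f}")
```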
