
Guided Research: Stephen Meisenbacher


Investigating the Application of Differential Privacy to Mitigate Privacy Issues in Natural Language Processing

The notion of privacy has become a central topic in recent years, especially where technology is concerned. The advent of big data and the techniques used to harness it has undoubtedly produced promising and meaningful results, but at the same time, the increasing use of data that often contains private information has raised concerns about the protection of privacy. To address this issue, many novel solutions have been proposed to ensure some degree of privacy, particularly in the fields of Machine Learning and Deep Learning. One such technique, Differential Privacy, offers a quantifiable privacy guarantee, along with the added properties of composability and robustness irrespective of the type of attack.
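For background (this formulation is standard in the literature and is not taken from the project description itself), the quantifiable guarantee mentioned above is usually stated as follows: a randomized mechanism M satisfies ε-differential privacy if, for all neighboring datasets D and D' differing in a single record and for every set of outputs S,

    \Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[\mathcal{M}(D') \in S].

Composability then takes its simplest, sequential form: applying an ε₁-DP mechanism followed by an ε₂-DP mechanism to the same data yields an (ε₁ + ε₂)-DP procedure, so the overall privacy loss remains quantifiable.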

Although Differential Privacy has gained growing attention in recent years, the focus has largely remained on Machine Learning and Deep Learning in general. One might expect, however, that this privacy-preserving technique is also highly applicable and relevant to Natural Language Processing (NLP), and a brief survey of current research indeed confirms this. The goal of this Guided Research, therefore, is to gain insight into the current state of Differential Privacy in Natural Language Processing, particularly its ability to address privacy vulnerabilities in NLP techniques. Building on this knowledge, the next goal is to examine the application of Differential Privacy to NLP techniques, including an evaluation of its merits and possible limitations. To accomplish these goals, the following research questions have been defined:

  1. Which vulnerabilities of current NLP techniques is Differential Privacy capable of preventing?
  2. What are the foundations of Differential Privacy, and how can it be applied to NLP tasks?
  3. What are the distinct benefits and limitations of applying Differential Privacy to NLP tasks?

The structure of this Guided Research will take the form of a systematic literature review. Accordingly, the main method of answering the stated research questions will be to seek out relevant academic literature and previous research, which will serve as the primary source for data synthesis. In addition, “grey” literature as well as interviews with experts in the field will supplement the primary data. Ultimately, the systematic literature review will provide readers with a clear motivation for the application of Differential Privacy to NLP and, furthermore, an account of how and to what degree privacy preservation can be achieved.
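To make the subject of these questions more concrete, the sketch below illustrates one common pattern from the DP-NLP literature: perturbing a word's embedding with calibrated noise and mapping the result back to the vocabulary. It is a toy illustration only; the vocabulary, embedding values, noise mechanism, and epsilon values are assumptions made here for demonstration and are not prescribed by this project.

# Toy sketch of word-level privatization via noisy embeddings.
# The vocabulary entries, embeddings, and epsilon values are illustrative
# assumptions; a real system would use trained embeddings and a noise
# distribution calibrated to the chosen privacy definition.
import numpy as np

rng = np.random.default_rng(0)

# Tiny, made-up 3-dimensional embeddings (assumption for illustration).
vocab = {
    "doctor":  np.array([0.9, 0.1, 0.0]),
    "nurse":   np.array([0.8, 0.2, 0.1]),
    "teacher": np.array([0.1, 0.9, 0.0]),
    "lawyer":  np.array([0.0, 0.2, 0.9]),
}

def privatize_word(word: str, epsilon: float) -> str:
    """Add Laplace noise to a word's embedding and return the nearest
    vocabulary word. Smaller epsilon means more noise, hence more privacy
    but less utility."""
    vec = vocab[word]
    noisy = vec + rng.laplace(loc=0.0, scale=1.0 / epsilon, size=vec.shape)
    # Post-processing: map the noisy vector back to the closest known word.
    return min(vocab, key=lambda w: np.linalg.norm(vocab[w] - noisy))

if __name__ == "__main__":
    for eps in (0.5, 2.0, 10.0):
        print(eps, [privatize_word("doctor", eps) for _ in range(5)])

With a small epsilon the output word frequently differs from the input, obscuring the original token; with a large epsilon it almost always stays the same, illustrating the privacy-utility trade-off at the heart of the third research question.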
