The revival of deep learning has yielded astonishing results in recent years across many tasks, from computer vision and machine translation to speech recognition. This progress has been driven by the increasing availability of datasets and computational resources. The legal domain, by contrast, has a serious demand for natural language processing applications but cannot benefit to the same degree, since appropriately preprocessed legal datasets are highly limited or barely exist at all. Rather than resorting to datasets from other domains, we propose the use of multi-task deep learning to exploit task-independent commonalities and overcome the dataset shortage in the legal domain.
As part of this work, we created six corpora for legal translation, legal text summarization and legal document classification. Five of the six corpora are derived from the DCEP1, Europarl2 and JRC-Acquis3 corpora provided by the European Union, which we processed for immediate use with neural-network-based models. The sixth corpus is a collection of 42k documents containing court decisions of the seven federal courts of Germany, scraped from their official website.
Based on these newly created corpora, we trained various multi-task combinations within a task family (e.g. only translation tasks) and across task families (e.g. translation, summarization & classification) on a state-of-the-art multi-task deep learning model, the MultiModel4. In addition, we compared the single-task and multi-task performance of the MultiModel under two different sets of hyperparameters to the state-of-the-art translation model, the Transformer5. The MultiModel trained on joint tasks matches the Transformer. Through experiments in which a jointly trained MultiModel outperforms both a single-task-trained MultiModel and the Transformer, we show that multi-task deep learning is advisable in situations where training data is sparse. Surprisingly, a combination across task families surpasses several combinations within task families. Finally, we trained a combination that beats the JRC EuroVoc Indexer JEX6 on the German multi-label classification task by 14 points on the F1 metric.
Links:
1https://ec.europa.eu/jrc/en/language-technologies/dcep
2http://www.statmt.org/europarl/
3https://ec.europa.eu/jrc/en/language-technologies/jrc-acquis
4https://arxiv.org/abs/1706.05137
5https://arxiv.org/abs/1706.03762
6https://ec.europa.eu/jrc/en/language-technologies/jrc-eurovoc-indexer
Corpora:
https://mediatum.ub.tum.de/1446648
https://mediatum.ub.tum.de/1446650
https://mediatum.ub.tum.de/1446651
https://mediatum.ub.tum.de/1446653
https://mediatum.ub.tum.de/1446654
https://mediatum.ub.tum.de/1446655
Name | Type | Size | Last Modification | Last Editor
---|---|---|---|---
Final_Gebendorfer.pdf | | 2,14 MB | 07.08.2018 |
Kickoff_Gebendorfer.pdf | | 796 KB | 14.02.2018 |
Thesis_Gebendorfer.pdf | | 2,10 MB | 16.07.2018 |