
Master's Thesis by Christoph Gebendorfer


Multi-Task Deep Learning in the Legal Domain

The revival of deep learning has yielded astonishing results in recent years on tasks ranging from computer vision and machine translation to speech recognition. This advancement is favored by the increasing availability of datasets and computational resources. The legal domain, on the other hand, with its serious demand for natural language processing applications, cannot benefit to the same extent, since suitably preprocessed legal datasets are highly limited or barely exist at all. Instead of resorting to datasets from other domains, we propose multi-task deep learning to exploit task-independent commonalities and overcome the dataset shortage in the legal domain.

As part of this work, we have created six corpora for legal translation, legal text summarization and legal document classification. Five of the six corpora descend from the DCEP [1], Europarl [2] and JRC-Acquis [3] corpora provided by the European Union, which we processed for immediate use with neural-network-based models. The sixth corpus is a collection of 42k documents containing court decisions of the seven federal courts of Germany, scraped from their official website.
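For illustration, the following is a minimal sketch of the kind of filtering commonly applied when preparing a line-aligned parallel corpus such as Europarl for neural machine translation. The file names, length limit and ratio threshold are illustrative assumptions, not the exact pipeline used in this thesis.

    # Sketch: read two line-aligned files and keep plausible sentence pairs.
    def load_parallel(src_path, tgt_path, max_len=100, max_ratio=3.0):
        """Yield (source, target) sentence pairs after simple filtering."""
        with open(src_path, encoding="utf-8") as src, \
             open(tgt_path, encoding="utf-8") as tgt:
            for s, t in zip(src, tgt):
                s, t = s.strip(), t.strip()
                if not s or not t:
                    continue  # drop empty alignments
                ls, lt = len(s.split()), len(t.split())
                if ls > max_len or lt > max_len:
                    continue  # drop overly long sentences
                if max(ls, lt) / max(min(ls, lt), 1) > max_ratio:
                    continue  # drop implausible length ratios
                yield s, t

    # Hypothetical usage with Europarl-style German-English files:
    # pairs = list(load_parallel("europarl-v7.de-en.de", "europarl-v7.de-en.en"))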

Based on these newly created corpora, various multi-task combinations within a task family (e.g. only translation tasks) and across task families (e.g. translation, summarization and classification) were trained with the state-of-the-art multi-task deep learning model, the MultiModel [4]. In addition, we compared the single-task and multi-task performance of the MultiModel under two different sets of hyperparameters to the state-of-the-art translation model, the Transformer [5]. The MultiModel trained on joint tasks matches the performance of the Transformer. Through experiments in which a jointly trained MultiModel outperforms both a single-task trained MultiModel and the Transformer, we show that multi-task deep learning is advisable when training data is sparse. Surprisingly, a combination across task families surpasses several combinations within task families. Finally, we trained a combination that beats the JRC EuroVoc Indexer JEX [6] on the German multi-label classification task by 14 points on the F1 metric.
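To make the final comparison concrete, the following is a minimal sketch of micro-averaged F1 for multi-label classification, the kind of score used when comparing EuroVoc label assignment. The label IDs below are toy values, and the averaging variant is an assumption, since the abstract does not specify it.

    # Sketch: micro-averaged F1 over per-document label sets.
    def micro_f1(gold, pred):
        """gold, pred: lists of label sets, one set per document."""
        tp = sum(len(g & p) for g, p in zip(gold, pred))  # correct labels
        fp = sum(len(p - g) for g, p in zip(gold, pred))  # spurious labels
        fn = sum(len(g - p) for g, p in zip(gold, pred))  # missed labels
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall == 0.0:
            return 0.0
        return 2 * precision * recall / (precision + recall)

    # Toy example with EuroVoc-style descriptor IDs (hypothetical values):
    gold = [{"100142", "100160"}, {"100211"}]
    pred = [{"100142"}, {"100211", "100237"}]
    print(round(micro_f1(gold, pred), 3))  # prints 0.667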


Links:

[1] https://ec.europa.eu/jrc/en/language-technologies/dcep
[2] http://www.statmt.org/europarl/
[3] https://ec.europa.eu/jrc/en/language-technologies/jrc-acquis
[4] https://arxiv.org/abs/1706.05137
[5] https://arxiv.org/abs/1706.03762
[6] https://ec.europa.eu/jrc/en/language-technologies/jrc-eurovoc-indexer


Corpora:

https://mediatum.ub.tum.de/1446648
https://mediatum.ub.tum.de/1446650
https://mediatum.ub.tum.de/1446651
https://mediatum.ub.tum.de/1446653
https://mediatum.ub.tum.de/1446654
https://mediatum.ub.tum.de/1446655

Files and Subpages

Name                      Size     Last Modified
Final_Gebendorfer.pdf     2.14 MB  07.08.2018
Kickoff_Gebendorfer.pdf   796 KB   14.02.2018
Thesis_Gebendorfer.pdf    2.10 MB  16.07.2018