Back to top

Extending Full Text Search for Legal Document Collections using Word Embeddings

Last modified Oct 18, 2016



Traditional full text search allows fast search for exact matches. However, full text search is not optimal to deal with synonyms or semantically related terms and phrases. In this paper we explore a novel method that provides the ability to find not only exact matches, but also semantically similar parts for arbitrary length search queries. We achieve this without the application of ontologies, but base our approach on Word Embeddings. Recently, Word Embeddings have been applied successfully for many natural language processing tasks. We argue that our method is well suited for legal document collections and examine its applicability for two different use cases: We conduct a case study on a stand-alone law, in particular the EU Data Protection Directive 94/46/EC (EU-DPD) in order to extract obligations. Secondly, from a collection of publicly available templates for German rental contracts we retrieve similar provisions.


Files and Subpages

Name Type Size Last Modification Last Editor
La16c.pdf 1,58 MB 18.10.2016