The phrase "I have read and understood the terms of service" is often referred to as the biggest lie on the internet. Every time we buy something online, we are confronted with Terms of Service (ToS). However, few people actually read these terms before accepting them, often to their disadvantage. General terms and conditions included in standard form contracts are of significant economic value, as most companies use these terms when entering into contractual relationships with their customers.
The interdisciplinary computer science and legal research project SaToS (Software Aided Analysis of ToS) from the chair of Software Engineering for Business Information Systems (sebis) at TU Munich aims to automatically identify Terms of Service and summarise them in simplified language with regard to their lawfulness and customer friendliness. In this way, SaToS aims to empower customers to make an educated decision within seconds about whether to buy from a shop, directly addressing the imbalance of power and fostering the constitutional principle of legal clarity.
SaToS performs six processing steps to assess and summarize ToS:
Fig. 1: SaToS prototype architecture
SaToS uses a hybrid approach to identify pages containing Terms of Service, combining machine learning with rules. The main classification is done with a Naive Bayes classifier, while rules based on regular expressions parse URLs and identify likely candidates for ToS pages. Applying the rules first minimizes processing time and allows SaToS to show the ToS as soon as a customer visits the main page of an online shop.
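The hybrid idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the URL pattern, the toy training sentences, and the names `is_candidate_url`, `NaiveBayes`, and `find_tos_pages` are hypothetical and are not taken from the SaToS implementation. The Naive Bayes classifier is a minimal multinomial variant with add-one smoothing, written with the standard library only.

```python
import math
import re
from collections import Counter
from urllib.parse import urlparse

# Hypothetical URL rule: a path segment that looks like a ToS page name.
TOS_URL_PATTERN = re.compile(
    r"(?:^|[/_\-.])(agb|terms|tos|conditions)(?:$|[/_\-.])", re.IGNORECASE
)


def is_candidate_url(url):
    """Cheap rule-based pre-filter applied before any text is classified."""
    return bool(TOS_URL_PATTERN.search(urlparse(url).path))


def tokenize(text):
    return re.findall(r"[a-zäöüß]+", text.lower())


class NaiveBayes:
    """Minimal multinomial Naive Bayes with add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = {label: Counter() for label in self.class_counts}
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(doc):
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


def find_tos_pages(pages, classifier):
    """Classify only the pages whose URL passes the rule-based filter."""
    return [
        url
        for url, text in pages.items()
        if is_candidate_url(url) and classifier.predict(text) == "tos"
    ]


# Toy training data, invented for this sketch.
clf = NaiveBayes().fit(
    [
        "The customer can return purchased goods within 30 days. "
        "The contract is governed by these terms.",
        "Liability and warranty: the seller delivers the goods, "
        "the customer pays the price under this contract.",
        "Buy red running shoes now, great price, free shipping on all sneakers.",
        "New summer collection: dresses, shirts and shoes in our shop.",
    ],
    ["tos", "tos", "other", "other"],
)

pages = {
    "https://shop.example/agb.html": "These terms govern the contract between "
    "seller and customer, including warranty and return of goods.",
    "https://shop.example/products/shoes": "Red running shoes, great price.",
    "https://shop.example/terms/sale": "Summer sale: buy shoes and sneakers, "
    "great price on dresses.",
}
tos_pages = find_tos_pages(pages, clf)
```

Because the URL rule runs first, the comparatively expensive text classification is skipped for most pages of a shop. The third page shows why the classifier is still needed: its URL looks like a ToS page, but its text does not.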
Information extraction is based on rules created by legal experts. From the sentence "The customer can return purchased goods within 30 days from delivery", the rule engine would extract the following information:
{ "topic": "withdrawal", "dataType": "timespan", "value": 30, "unit": "day" }
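A single extraction rule of this kind could look roughly like the following sketch. The regular expression, the unit table, and the function name `extract_withdrawal` are assumptions made for illustration; the actual rules written by the legal experts are not shown in this text and are likely more elaborate.

```python
import re

# Hypothetical mapping from surface forms to the normalized unit.
UNIT_ALIASES = {
    "day": "day", "days": "day",
    "week": "week", "weeks": "week",
    "month": "month", "months": "month",
}

# Hypothetical rule: "return ... within <number> <unit>".
WITHDRAWAL_RULE = re.compile(
    r"\breturn\b.*?\bwithin\s+(?P<value>\d+)\s+(?P<unit>days?|weeks?|months?)\b",
    re.IGNORECASE,
)


def extract_withdrawal(sentence):
    """Return a structured record for a withdrawal period, or None."""
    match = WITHDRAWAL_RULE.search(sentence)
    if match is None:
        return None
    return {
        "topic": "withdrawal",
        "dataType": "timespan",
        "value": int(match.group("value")),
        "unit": UNIT_ALIASES[match.group("unit").lower()],
    }


info = extract_withdrawal(
    "The customer can return purchased goods within 30 days from delivery"
)
```

For the example sentence, `info` has the same fields and values as the record shown above. A sentence such as "We ship within 3 days." yields `None`, because the rule requires the word "return".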
To assess the extracted information, SaToS uses a knowledge base with structured information about the Distance Selling Act and other relevant laws and judgments. The knowledge base can also store personal preferences, so that ToS can be compared not only to the minimum legal standards but also to a customer's personal expectations. The assessment determines whether the extracted information meets the standard defined in the knowledge base, falls short of it, or even surpasses it.
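The three-way assessment can be illustrated with a small sketch. Everything here is an assumption for illustration: the layout of the knowledge base, the `min_days` field, the 14-day value used as the legal minimum for withdrawal, and the simplification that a month counts as 30 days. None of it is taken from the SaToS knowledge base.

```python
from enum import Enum


class Assessment(Enum):
    BELOW = "does not meet the standard"
    MEETS = "meets the standard"
    EXCEEDS = "surpasses the standard"


# Simplified conversion; a month is treated as 30 days in this sketch.
DAYS_PER_UNIT = {"day": 1, "week": 7, "month": 30}

# Hypothetical knowledge-base entry with an illustrative minimum.
KNOWLEDGE_BASE = {"withdrawal": {"dataType": "timespan", "min_days": 14}}


def assess(info, knowledge_base=KNOWLEDGE_BASE, preferences=None):
    """Compare an extracted timespan with the legal or personal standard."""
    standard = dict(knowledge_base[info["topic"]])
    # Personal preferences, if present, override the legal minimum.
    if preferences and info["topic"] in preferences:
        standard.update(preferences[info["topic"]])
    days = info["value"] * DAYS_PER_UNIT[info["unit"]]
    if days < standard["min_days"]:
        return Assessment.BELOW
    if days == standard["min_days"]:
        return Assessment.MEETS
    return Assessment.EXCEEDS


extracted = {"topic": "withdrawal", "dataType": "timespan", "value": 30, "unit": "day"}
legal_result = assess(extracted)
personal_result = assess(extracted, preferences={"withdrawal": {"min_days": 30}})
```

Keeping the preferences in the same shape as the knowledge-base entries means one comparison function serves both cases: against the assumed legal minimum the 30-day period surpasses the standard, while against a personal expectation of 30 days it merely meets it.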
Based on the extracted information and the generated assessment, SaToS summarizes the individual clauses of the ToS in simplified language and presents them together with the assessment. SaToS also shows the original sentence from which the information was extracted and highlights the words that were key to the conclusion drawn.
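A template-based sketch of this last step is shown below. The template text, the use of `**` markers for highlighting, and the function names are hypothetical; how SaToS actually generates and renders its summaries is not specified here.

```python
import re

# Hypothetical summary template per topic.
TEMPLATES = {
    "withdrawal": "You can return your purchase within {value} {unit}.",
}


def pluralize(unit, value):
    return unit if value == 1 else unit + "s"


def summarize(info, assessment_text):
    """Fill the topic template and append the assessment in parentheses."""
    text = TEMPLATES[info["topic"]].format(
        value=info["value"], unit=pluralize(info["unit"], info["value"])
    )
    return f"{text} ({assessment_text})"


def highlight(sentence, keywords):
    """Wrap each keyword occurrence in ** markers (an assumed notation)."""
    if not keywords:
        return sentence
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    return pattern.sub(lambda m: f"**{m.group(0)}**", sentence)


clause = {"topic": "withdrawal", "dataType": "timespan", "value": 30, "unit": "day"}
summary = summarize(clause, "surpasses the standard")
marked = highlight(
    "The customer can return purchased goods within 30 days from delivery",
    ["return", "within 30 days"],
)
```

Showing the marked original sentence next to the generated summary lets a customer check the conclusion against the source clause.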
2019 | |
---|---|
[Br19d] | Braun, Daniel; Scepankova, Elena; Holl, Patrick; Matthes, Florian. The Potential of Customer-Centered LegalTech. Datenschutz und Datensicherheit - DuD, 43(12), 760-766, doi: 10.1007/s11623-019-1202-7 |
[Br19b] | Braun, Daniel; Scepankova, Elena; Holl, Patrick; Matthes, Florian. Consumer Protection in the Digital Era: The Potential of Customer-Centered LegalTech. INFORMATIK 2019: 50 Jahre Gesellschaft für Informatik – Informatik für Gesellschaft. Bonn: Gesellschaft für Informatik e.V. (p. 407-420), doi: 10.18420/inf2019_58 |
2018 | |
[Br18c] | Braun, Daniel; Scepankova, Elena; Holl, Patrick; Matthes, Florian. Customer-Centered LegalTech: Automated Analysis of Standard Form Contracts. IRIS 2018 - Proceedings of the 21st International Legal Informatics Symposium (nominated for LexisNexis best paper award) |
2017 | |
[Br17b] | Braun, Daniel; Scepankova, Elena; Holl, Patrick; Matthes, Florian. SaToS: Assessing and Summarising Terms of Services from German Webshops. INLG 2017 - Proceedings of the 9th International Natural Language Generation Conference |