Every day, zettabytes of data are generated, with 80% being unannotated, unstructured text. Developing AI applications on such data is challenging because it requires manual annotation, which, although precise and domain-specific, is costly, inefficient, and not scalable. Data scientists often spend up to 80% of their time preparing these datasets. CD4AI (CreateData4AI) is a digital platform in late-stage development that aims to assist data scientists by reducing the manual work this process requires to a fraction of its current level. Its core is a human-in-the-loop workflow model in which a small amount of manual action triggers extrapolation algorithms that automate most of the data annotation. An overlying user interface makes it possible to organize projects, tasks, and collaboration. This thesis aims to investigate how the functional platform design and user experience (UX) can be optimized to move the platform from a sandbox environment to the real world, and how Human-Computer Interaction (HCI) theory can be leveraged in that context. The approach is chronological and threefold, covering the following overall research questions (RQ):
RQ1: What efforts can be taken to reduce human labor on the platform to a minimum?
This RQ takes an endogenous perspective, as no validation against real users takes place in this stage. First, the manual action currently required in the platform’s workflow model is assessed. In addition, the workflow is checked for compliance with HCI theory to identify existing human-in-the-loop antipatterns. Based on these findings, UX improvements, mainly automation features, are proposed and implemented.
RQ2: How can the platform be iteratively optimized for perceived UX?
This RQ takes an exogenous perspective, as the platform is validated against real-world user requirements. In this stage, the platform is iteratively evaluated with real users from different backgrounds. The goal is to identify further UX-related challenges and weaknesses in real-world usage scenarios. In each iteration, solutions to these challenges are proposed and implemented before advancing to the next user evaluation round.
RQ3: What general technicalities must be considered to roll out CD4AI?
This RQ aims to identify the technicalities and formalities required to move CD4AI from the sandbox to the real world. This involves artifacts such as a blueprint for the deployment architecture, an operational cost assessment, and an assessment of compliance with the European Union’s General Data Protection Regulation (GDPR).
Beyond that, the thesis aims to make theoretical and practical contributions. Regarding theory, potential outcomes include the extent to which the findings generalize to other contexts, new UX design patterns, and insight into how users respond to the implemented HCI theory. Regarding practice, potential outcomes include how useful CD4AI is in the real world, how key user requirements for this research-focused digital platform differ from those in other domains (e.g., e-commerce), and which tradeoffs between user and system requirements had to be made.