Advised in cooperation with Sebastian Sartor.
Artificial intelligence (AI) has become a primary focus of both academic institutions and industry research laboratories. Between 2018 and 2022, the number of AI-related publications rose significantly, from approximately 145,000 to over 240,000 [1]. The development of foundation models (FMs), used, for example, in large language models (LLMs) such as ChatGPT, has substantially enhanced AI’s capabilities and broadened its range of applications. New AI research is published each week across diverse domains, including robotics, medicine, and even Earth observation [6]. Given the rapid expansion of AI research and the overwhelming volume of publications, systematically identifying trends and comparing findings across domains has become increasingly difficult. Comprehensive summaries are therefore crucial for guiding future AI advancements and improving the evaluation of model performance.
AI technologies now enable the efficient summarization of vast numbers of research papers within a comparatively short time. Moreover, AI can be leveraged within the scientific process to facilitate large-scale data collection, aggregation, and analysis, thereby enhancing research efficiency and knowledge dissemination [5].
Prior studies have analyzed the history and development of AI [2], examined its opportunities and risks [3], and extensively surveyed AI advancements to continuously abstract and summarize findings [9], [10], [11]. These surveys offer a broad overview of AI adoption in specific application domains, structuring findings and recent developments in the rapidly changing field of AI and foundation models [6], [7]. Additionally, some meta-reviews have been published to compare and provide common benchmark tests and evaluation methods for specific AI technologies across different domains [8].
Existing surveys are often highly specific in scope, focusing either on a certain application domain or on a certain model type. As a result, systematically comparing the adoption and performance of different model types across application domains is currently methodologically complex due to the fragmented nature of existing studies.
Furthermore, most existing surveys rely on manual reviews of a limited number of publications (typically a few dozen to a few hundred), restricting their ability to capture broader trends in AI adoption. One of the most extensive summaries is provided by the Epoch AI research group, whose flagship dataset currently contains information on over 900 notable models. Nevertheless, their methodology is also centered on manual paper review and information retrieval [12].
This study aims to bridge these gaps by leveraging AI to conduct a large-scale, systematic literature review of foundation model development across various application domains.
The objective is to create a comprehensive overview of models and their characteristics, similar to the Epoch AI dataset, by developing a software tool that automates the manual paper review and uses LLMs for information retrieval, thereby extending the existing overviews. The research questions are (a sketch of the envisioned pipeline follows the list below):
1. Can a fully automated system identify papers introducing FMs?
2. Are NLP tools able to extract relevant objective parameters from the identified papers, and what is the quality of the automated results compared to the Epoch AI dataset?
3. What is the status quo in FM development for the specific field of robotics, and how do these models compare in their characteristics to models from other fields?
4. Are there statistically significant differences between the robotics domain and other domains (such as language)?
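To make the intended automation concrete, the following is a minimal sketch of such a pipeline, covering paper identification (question 1), LLM-based parameter extraction (question 2), and the cross-domain comparison (question 4). The arXiv endpoint and library calls are real, but the prompt wording, extracted fields, model choice, and function names are illustrative assumptions rather than the study's final design:

```python
"""Minimal sketch of the proposed review pipeline.

Prompt wording, extracted fields, the LLM model name, and all function
names here are illustrative assumptions, not the study's final design.
"""
import json

import requests
from openai import OpenAI  # assumes an OpenAI-compatible LLM endpoint
from scipy.stats import mannwhitneyu

ARXIV_API = "http://export.arxiv.org/api/query"
client = OpenAI()


def fetch_candidate_papers(query: str, max_results: int = 50) -> str:
    """Step 1 (question 1): retrieve candidate papers from the arXiv API."""
    resp = requests.get(
        ARXIV_API, params={"search_query": query, "max_results": max_results}
    )
    resp.raise_for_status()
    return resp.text  # Atom feed; classifying which papers introduce FMs is omitted


def extract_parameters(abstract: str) -> dict:
    """Step 2 (question 2): prompt an LLM to return structured model metadata."""
    prompt = (
        "Extract the following fields from this paper abstract as JSON: "
        "model_name, parameter_count, training_compute, application_domain. "
        "Use null for fields that are not stated.\n\n" + abstract
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model choice
        messages=[{"role": "user", "content": prompt}],
    )
    # Assumes the model returns valid JSON; robust parsing and validation
    # against the Epoch AI dataset are omitted in this sketch.
    return json.loads(resp.choices[0].message.content)


def compare_domains(robotics: list[float], language: list[float]) -> float:
    """Step 3 (question 4): non-parametric test for differences between domains."""
    _, p_value = mannwhitneyu(robotics, language)
    return p_value
```

The Mann-Whitney U test appears here only as one plausible option: model characteristics such as parameter counts are typically heavy-tailed, which makes a non-parametric test a natural default, though the study itself may settle on different methods.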
This research is relevant both theoretically and practically. Theoretically, it will provide insights into how well large language models can support and automate the scientific process. Practically, the findings will be valuable for AI practitioners and organizations seeking to understand how foundation models are developed and adapted for specific application domains.
This study will be conducted in five phases, combining quantitative analysis and AI-powered tools to provide a comprehensive and systematic review of AI research, with a specific focus on foundation models.