The steadily increasing rate of new scientific publications, and with it the emergence of new research areas, brings innovation but also raises new challenges [1]. One challenge is to organize scientific knowledge so that researchers can easily find relevant results and discover new findings. Because scientific knowledge is usually available in large quantities as unstructured text, it is very difficult for researchers to obtain an overview of research fields or scientific domains. It is similarly difficult to gain insight into the topics being researched at research institutions. Information about conducted research often exists only as unstructured text on homepages or intranet pages. In addition, institutional websites are usually organized according to the institution's organizational structure rather than a logical structure based on research areas, so navigating them by research area is hardly possible. This makes exploring the topics researched at an institution very difficult for external and internal users alike. Structuring the scientific knowledge of research institutions and linking semantically related scientific domains offers researchers the potential for enhanced exploration of research areas.
To address these issues, this project aims to develop a Knowledge Graph-based framework, called the Research Institution Knowledge Graph (RIKG), that links semantically related scientific domains of research institutions. The RIKG is intended to enhance the exploration of research areas studied at research institutions and to provide an overview of their quality and quantity. In research institutions, however, not only the research areas matter but especially the people working on the different research topics. Therefore, to enable researchers to identify other relevant researchers from a scientific domain within a research institution, the RIKG will link institution members to scientific domains. In addition, the RIKG computes the relevance of people to scientific domains at research institutions using a variety of metrics.
To construct an RIKG, the following steps are required:
To link semantically related scientific domains, the RIKG needs an ontology of scientific concepts. First, this ontology must include hierarchical relationships between scientific domains to enhance knowledge exploration and navigation for users. Second, the ontology should contain advanced relationships that model the knowledge of research domains in a structured format. The current generation of Knowledge Graphs still lacks this explicit representation of knowledge [1]. To model this knowledge in the ontology, aspect-based modeling of research domains is considered: scientific domains are linked by relations that depend on different aspects, such as the research problem addressed, the application domain, the methods used for experiments, or the evaluation methods used. The ontology is learned from scientific publications of different domains. The underlying data therefore consists of unstructured texts, which are transformed into a structured format by a pipeline of NLP algorithms from the "ontology learning layer cake" [2]. Previous attempts to construct an ontology of scientific concepts were either based on supervised approaches and domain-specific [3], or based only on keywords from publications [4, 5, 6], which leaves the ontology unable to model advanced relationships. By using unsupervised NLP algorithms for ontology learning and incorporating the publication texts, the project aims to obtain advanced, domain-independent relations between scientific concepts.
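As a rough illustration of the two kinds of relationships described above, the following sketch stores hierarchical and aspect-based edges between scientific domains. The class name, the relation names (`subdomain_of`, `uses_method`, and so on), and the example domains are assumptions made for illustration only; they are not the project's actual schema, and the relations would be learned from publication texts rather than entered by hand.

```python
from collections import defaultdict

# Illustrative relation names (assumptions, not the project's schema).
HIERARCHY = "subdomain_of"
ASPECTS = {"addresses_problem", "applied_in", "uses_method", "evaluated_with"}


class Ontology:
    """Toy ontology holding hierarchical and aspect-based relations."""

    def __init__(self):
        # edges[relation][source] -> set of target concepts
        self.edges = defaultdict(lambda: defaultdict(set))

    def add(self, source, relation, target):
        if relation != HIERARCHY and relation not in ASPECTS:
            raise ValueError(f"unknown relation: {relation}")
        self.edges[relation][source].add(target)

    def ancestors(self, domain):
        """Return all broader domains reachable via the hierarchy."""
        seen, stack = set(), [domain]
        while stack:
            for parent in self.edges[HIERARCHY][stack.pop()]:
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    def related_by(self, domain, aspect):
        """Return other domains that share a target for the given aspect."""
        targets = self.edges[aspect][domain]
        return {
            other
            for other, other_targets in self.edges[aspect].items()
            if other != domain and other_targets & targets
        }


# Hypothetical example data.
onto = Ontology()
onto.add("keyphrase extraction", "subdomain_of", "information extraction")
onto.add("information extraction", "subdomain_of", "natural language processing")
onto.add("keyphrase extraction", "uses_method", "pretrained language models")
onto.add("text classification", "uses_method", "pretrained language models")

print(onto.ancestors("keyphrase extraction"))
print(onto.related_by("keyphrase extraction", "uses_method"))
```

The hierarchy supports navigation from narrow to broad domains, while the aspect relations connect domains that are not hierarchically related but share, for example, a method.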
After the ontology is created, the RIKG must be populated with information about researchers and their work. The data sources include internal systems that contain information about institution members as well as external sources such as homepages or Google Scholar profiles to obtain information about their work. This data is parsed and linked in the RIKG using the learned ontology.
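The text does not specify how parsed data is linked to the ontology. As one possible minimal sketch, a researcher's publication titles could be matched against ontology concept labels with simple case-insensitive phrase matching. The function name `link_researcher`, the example titles, and the concept labels are hypothetical; a real pipeline would likely need more robust entity linking.

```python
import re
from collections import Counter


def link_researcher(texts, concept_labels):
    """Count how often each concept label occurs in a researcher's texts.

    Matching is case-insensitive and restricted to whole phrases.
    This is an illustrative assumption, not the project's linking method.
    """
    counts = Counter()
    for text in texts:
        lowered = text.lower()
        for label in concept_labels:
            pattern = r"\b" + re.escape(label.lower()) + r"\b"
            hits = len(re.findall(pattern, lowered))
            if hits:
                counts[label] += hits
    return counts


# Hypothetical input: publication titles of one institution member.
titles = [
    "Unsupervised Keyphrase Extraction with Pretrained Language Models",
    "A Survey of Knowledge Graphs in Natural Language Processing",
]
labels = ["keyphrase extraction", "knowledge graphs", "ontology learning"]

links = link_researcher(titles, labels)
print(dict(links))
```

The resulting counts could serve as weighted edges between an institution member and the scientific domains in the RIKG.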
To identify the relevance of researchers to scientific domains and the relevance of scientific domains to a research institution, various metrics need to be computed. On the one hand, simple externally obtained metrics such as the number of citations or the h-index might be used. On the other hand, more advanced inferred metrics, such as importance or trending scores, must be considered. Such advanced metrics could be computed, for example, with graph centrality measures or the PageRank algorithm.
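To make the two kinds of metrics concrete, the sketch below computes an h-index from a list of citation counts and a basic PageRank by power iteration over a small directed graph. The graph, its node names, and the parameter defaults are illustrative assumptions; which graph the project ranks (for example, citation or co-authorship links) and how it weights edges is not stated here.

```python
def h_index(citations):
    """Largest h such that h publications have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, count in enumerate(ranked, start=1) if count >= rank)


def pagerank(graph, damping=0.85, iterations=50):
    """Basic PageRank by power iteration.

    graph maps each node to a list of nodes it links to. Rank held by
    nodes without outgoing links is redistributed uniformly.
    """
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        dangling = sum(rank[node] for node in nodes if not graph.get(node))
        new_rank = {
            node: (1 - damping) / n + damping * dangling / n for node in nodes
        }
        for source, targets in graph.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical data: citation counts and a tiny link graph.
print(h_index([10, 8, 5, 4, 3]))  # 4

toy_graph = {"a": ["c"], "b": ["c"], "c": ["a"]}
scores = pagerank(toy_graph)
print(max(scores, key=scores.get))  # c
```

In the toy graph, node `c` receives links from both other nodes and therefore ends up with the highest score, while `b`, which nothing links to, gets the lowest.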
Many search engines are limited in finding relevant results because they rely on a purely syntactic search approach. This often amounts to simple fuzzy string matching that does not take the intent behind the user's search query into account. In contrast, a graph-based semantic search does not restrict search results to string matches: it can return results that do not literally match the search query but have a high semantic similarity to it. Additionally, a graph-based recommendation engine (e.g., for related researchers or research topics) can enhance knowledge exploration even further.
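One simple way to picture the difference from string matching is to expand the query over the concept graph and rank items by how close their concepts are to the query concept. The following sketch does this with a breadth-first search; the function name, the scoring formula `1 / (1 + hops)`, and the example data are assumptions for illustration, and the project may well use a different notion of semantic similarity.

```python
from collections import deque


def semantic_search(query, concept_graph, tagged_items, max_hops=2):
    """Rank items by graph distance between their concepts and the query.

    concept_graph maps a concept to its neighboring concepts;
    tagged_items maps an item to the concepts it is linked to.
    """
    # Breadth-first search from the query concept, limited to max_hops.
    distance = {query: 0}
    queue = deque([query])
    while queue:
        concept = queue.popleft()
        if distance[concept] == max_hops:
            continue
        for neighbor in concept_graph.get(concept, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[concept] + 1
                queue.append(neighbor)

    # Score each item by its closest concept; closer means more relevant.
    scored = []
    for item, concepts in tagged_items.items():
        hops = [distance[c] for c in concepts if c in distance]
        if hops:
            scored.append((1.0 / (1 + min(hops)), item))
    return [item for _, item in sorted(scored, key=lambda p: (-p[0], p[1]))]


# Hypothetical concept graph and tagged items.
concept_graph = {
    "keyphrase extraction": ["information extraction"],
    "information extraction": ["keyphrase extraction", "named entity recognition"],
    "named entity recognition": ["information extraction"],
    "computer vision": [],
}
tagged_items = {
    "paper_a": ["keyphrase extraction"],
    "paper_b": ["named entity recognition"],
    "paper_c": ["computer vision"],
}

results = semantic_search("keyphrase extraction", concept_graph, tagged_items)
print(results)
```

Here `paper_b` is returned although it never mentions the query string, because its concept is two hops away in the graph; `paper_c` is excluded because its concept is not connected to the query.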
To visualize the RIKG and enhance navigation and knowledge exploration for users, a Web App Front-End with an appropriate UI/UX is needed.
Main research question (RQ): How can the exploration of research conducted at research institutions be enhanced?
| Year | Publication |
|---|---|
| 2023 | Schopf, Tim; Machner, Nektarios; Matthes, Florian. A Knowledge Graph Approach for Exploratory Search in Research Institutions. Proceedings of the 15th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - KMIS, Rome, Italy, 2023 |
| 2022 | Schopf, Tim; Klimek, Simon; Matthes, Florian. PatternRank: Leveraging Pretrained Language Models and Part of Speech for Unsupervised Keyphrase Extraction. In Proceedings of the 14th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - KDIR (IC3K 2022), Valletta, Malta, 2022 |
| 2022 | Schneider, Phillip; Schopf, Tim; Vladika, Juraj; Galkin, Mikhail; Simperl, Elena; Matthes, Florian. A Decade of Knowledge Graphs in Natural Language Processing: A Survey. In Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (AACL-IJCNLP 2022), Online, 2022 |