
Research Institution Knowledge Graph (RIKG)

Last modified May 20

Knowledge Graph-based Exploration of Research Institutions by Semantic Linking of Scientific Domains

Motivation

The constantly increasing rate of new scientific publications, and with it the emergence of new research areas, brings innovation but also raises new challenges [1]. One of these challenges is to organize scientific knowledge so that researchers can easily find relevant results and discover new findings. Because scientific knowledge is usually available in large quantities as unstructured text, it is very difficult for researchers to obtain an overview of research fields or scientific domains. It is similarly difficult to gain insight into the topics being researched at research institutions. Information about the research conducted there often exists only as unstructured text on homepages or intranet pages. In addition, institutional websites are usually organized according to the institution's organizational structure rather than a logical structure based on research areas, so navigating them by research area is hardly possible. This makes exploring the topics researched at an institution very difficult for external and internal users alike. Structuring the scientific knowledge of research institutions and linking semantically related scientific domains offers researchers the potential for enhanced exploration of research areas.

 

Focus and Goals

To address the issues mentioned above, this project aims to develop a Knowledge Graph-based framework that links semantically related scientific domains of research institutions, called the Research Institution Knowledge Graph (RIKG). The RIKG will enhance the exploration of research areas studied at research institutions and provide an overview of their quality and quantity. In research institutions, however, not only the research areas are important but especially the people working on the different research topics. Therefore, to enable researchers to identify other relevant researchers from a scientific domain within a research institution, the RIKG will link institution members to scientific domains. In addition, the RIKG will compute the relevance of people to scientific domains at research institutions using a variety of metrics.

To construct a RIKG, the following steps are required:

1. Ontology Learning for Scientific Concepts

To link semantically related scientific domains, the RIKG needs an ontology of scientific concepts. First, this ontology must include hierarchical relationships between scientific domains to enhance knowledge exploration and navigation for users. Second, the ontology should contain advanced relationships that model the knowledge of research domains in a structured format. The current generation of Knowledge Graphs still lacks this explicit representation of knowledge [1]. To model this knowledge in the ontology, aspect-based modeling of research domains is considered: scientific domains are linked by relations that depend on different aspects, such as the research problem addressed, the application domain, the methods used for experiments, or the evaluation methods used. The ontology is learned from scientific publications of different domains. The input data therefore consists of unstructured texts, which are transformed into a structured format by a pipeline of NLP algorithms from the "ontology learning layer cake" [2]. Previous attempts to construct an ontology of scientific concepts were either based on supervised approaches and domain-specific [3], or based only on keywords from publications [4, 5, 6], which leaves the ontology unable to model advanced relationships. By using unsupervised NLP algorithms for ontology learning and incorporating publication texts, advanced, domain-independent relations between scientific concepts should be obtained.

2. Knowledge Graph Population

After the ontology is created, the RIKG must be populated with information about researchers and their work. The data sources include internal systems that contain information about institution members as well as external sources such as homepages or Google Scholar profiles to obtain information about their work. This data is parsed and linked in the RIKG using the learned ontology.  
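The population step can be illustrated with a minimal sketch. It assumes the graph is stored as subject-predicate-object triples and that profile keywords are matched to ontology concepts by exact, case-insensitive comparison. The concept hierarchy, the profile records, the predicate names `broader` and `worksOn`, and both function names are hypothetical and serve only as illustration; they are not taken from the project.

```python
# Hypothetical ontology fragment: concept -> broader (parent) concept.
BROADER = {
    "knowledge graphs": "semantic web",
    "ontology learning": "semantic web",
    "semantic web": "computer science",
}

# Hypothetical records, as they might look after parsing profile pages.
PROFILES = [
    {"name": "A. Researcher", "keywords": ["Knowledge Graphs", "ontology learning"]},
    {"name": "B. Researcher", "keywords": ["Semantic Web", "quantum gardening"]},
]


def populate(profiles, broader):
    """Build a set of (subject, predicate, object) triples.

    Keywords that do not match a concept in the ontology are skipped.
    """
    concepts = set(broader) | set(broader.values())
    triples = {(child, "broader", parent) for child, parent in broader.items()}
    for profile in profiles:
        for keyword in profile["keywords"]:
            concept = keyword.strip().lower()
            if concept in concepts:
                triples.add((profile["name"], "worksOn", concept))
    return triples


def researchers_in_domain(triples, domain, broader):
    """Return researchers linked to `domain` or to any of its sub-domains."""
    found = set()
    for subject, predicate, concept in triples:
        if predicate != "worksOn":
            continue
        # Walk up the hierarchy from the linked concept towards the root.
        current = concept
        while current is not None:
            if current == domain:
                found.add(subject)
                break
            current = broader.get(current)
    return found


triples = populate(PROFILES, BROADER)
print(sorted(researchers_in_domain(triples, "semantic web", BROADER)))
```

Because researchers are linked to concepts rather than to free-text keywords, a query for a broad domain also returns people who are only linked to one of its sub-domains.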

3. Relevance Metrics

To identify the relevance of researchers to scientific domains, and of scientific domains to a research institution, various metrics need to be computed. On the one hand, simple externally obtained metrics such as the number of citations or the h-index might be used. On the other hand, advanced inferred metrics such as importance or trending scores must be considered. Such advanced metrics could be computed with graph centrality measures or the PageRank algorithm.
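As a rough sketch of both kinds of metric, the following code computes an h-index from a list of citation counts and a PageRank score by power iteration over a small directed graph. The function names, the example edges, the damping factor of 0.85, and the fixed iteration count are assumptions made for illustration; how the RIKG will actually weight or combine such metrics is not specified here.

```python
from collections import defaultdict


def h_index(citations):
    """Largest h such that h publications have at least h citations each."""
    h = 0
    for rank, count in enumerate(sorted(citations, reverse=True), start=1):
        if count < rank:
            break
        h = rank
    return h


def pagerank(edges, damping=0.85, iterations=50):
    """PageRank by power iteration over directed (source, target) edges.

    Nodes without outgoing edges spread their score evenly over all nodes.
    """
    nodes = sorted({node for edge in edges for node in edge})
    out_links = defaultdict(list)
    for source, target in edges:
        out_links[source].append(target)

    n = len(nodes)
    score = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        dangling = sum(score[node] for node in nodes if not out_links[node])
        new_score = {node: (1 - damping) / n + damping * dangling / n for node in nodes}
        for source in nodes:
            targets = out_links[source]
            for target in targets:
                new_score[target] += damping * score[source] / len(targets)
        score = new_score
    return score


# Hypothetical graph: researchers point to domains, domains to related domains.
EDGES = [
    ("alice", "machine learning"),
    ("bob", "machine learning"),
    ("bob", "nlp"),
    ("nlp", "machine learning"),
]

print(h_index([10, 8, 5, 4, 3]))
scores = pagerank(EDGES)
print(max(scores, key=scores.get))
```

In this toy graph the node with the most incoming links receives the highest score, which is the intuition behind using such measures as an importance metric for scientific domains.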

4. Semantic Search & Recommendation

Many search engines are limited in finding relevant results because they rely on a purely syntactic search approach. This often involves simple fuzzy string matching that does not take the intent of the user's search query into account. In contrast, a graph-based semantic search does not restrict search results to simple string matches. It can return results that do not necessarily match the search query syntactically but have a high semantic similarity to it. Additionally, a graph-based recommendation engine (e.g., for related researchers or research topics) can enhance knowledge exploration even further.
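The difference between the two approaches can be sketched as follows. The example uses the number of hops between concepts in the graph as a simple stand-in for semantic similarity; this is an assumption for illustration, not the similarity measure chosen by the project. The concept links, the researcher assignments, and both function names are hypothetical.

```python
from collections import deque

# Hypothetical undirected links between related concepts.
RELATED = {
    "ontology learning": ["knowledge graphs", "natural language processing"],
    "knowledge graphs": ["ontology learning", "semantic search"],
    "natural language processing": ["ontology learning"],
    "semantic search": ["knowledge graphs"],
}

# Hypothetical links from researchers to concepts.
WORKS_ON = {
    "A. Researcher": ["knowledge graphs"],
    "B. Researcher": ["natural language processing"],
    "C. Researcher": ["semantic search"],
}


def syntactic_search(query, works_on):
    """Return researchers with a topic that contains the query string."""
    needle = query.lower()
    return sorted(
        name for name, topics in works_on.items()
        if any(needle in topic for topic in topics)
    )


def semantic_search(query, related, works_on, max_hops=2):
    """Return researchers whose topics lie within `max_hops` of the query
    concept, closest first (breadth-first search over concept links)."""
    distance = {query: 0}
    queue = deque([query])
    while queue:
        concept = queue.popleft()
        if distance[concept] == max_hops:
            continue
        for neighbour in related.get(concept, []):
            if neighbour not in distance:
                distance[neighbour] = distance[concept] + 1
                queue.append(neighbour)

    hits = []
    for name, topics in works_on.items():
        reachable = [distance[topic] for topic in topics if topic in distance]
        if reachable:
            hits.append((min(reachable), name))
    return [name for _, name in sorted(hits)]


print(syntactic_search("ontology learning", WORKS_ON))
print(semantic_search("ontology learning", RELATED, WORKS_ON))
```

For the query "ontology learning", the string-based search finds nobody, because no researcher lists that exact topic, while the graph-based search returns researchers working on neighbouring concepts. The same traversal could serve as a basis for recommending related researchers or topics.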

5. Web App Front-End

To visualize the RIKG and enhance navigation and knowledge exploration for users, a Web App Front-End with an appropriate UI/UX is needed.

 

Research Questions 

Main RQ: How to enhance exploration of conducted research in research institutions? 

  1. How to construct an ontology of scientific concepts from research publication texts and metadata?
  2. How to construct a Research Institution Knowledge Graph from an ontology of scientific concepts and researcher profiles?
  3. What are useful metrics for scientific concepts and entities within a scientific knowledge graph?
  4. How to construct a semantic search for the Research Institution Knowledge Graph?
  5. How to visualize the Research Institution Knowledge Graph?

 

References

  1. Dessì, D., Osborne, F., Recupero, D.R., Buscaldi, D., & Motta, E. (2021). Generating Knowledge Graphs by Employing Natural Language Processing and Machine Learning Techniques within the Scholarly Domain. Future Gener. Comput. Syst., 116, 253-264.
  2. Asim, M.N., Wasim, M., Khan, M.U., Mahmood, W., & Abbasi, H.M. (2018). A survey of ontology learning techniques and applications. Database: The Journal of Biological Databases and Curation, 2018.
  3. Luan, Y., He, L., Ostendorf, M., & Hajishirzi, H. (2018). Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction. EMNLP.
  4. Osborne, F., & Motta, E. (2012). Mining Semantic Relations between Research Areas. SEMWEB.
  5. Osborne, F., & Motta, E. (2015). Klink-2: Integrating Multiple Web Sources to Generate Semantic Topic Networks. SEMWEB.
  6. Salatino, A.A., Thanapalasingam, T., Mannocci, A., Birukou, A., Osborne, F., & Motta, E. (2020). The Computer Science Ontology: A Comprehensive Automatically-Generated Taxonomy of Research Areas. Data Intelligence, 2, 379-416.