Master's Thesis Simon Klimek

Last modified Feb 28, 2023

Learning Hierarchical Relations between Research Concepts from Abstract and Titles of NLP Publications

Extended Abstract

According to the Association of Scientific, Technical and Medical Publishers, the average growth rate of published scientific articles ranged from 5 to 6.5%, and that of journals from 2 to 3%, between 2015 and 2020 [1]. The more publications are available, the harder and more time-consuming it becomes to get a quick overview of currently active research areas. Thus, a fast way to visualize the research space is needed.

One way to achieve this is to establish a hierarchy between research concepts. A concept represents an area of research, and the same area might go by various names. For example, the task of finding information in a text using natural language processing (NLP) techniques is described by the term “text mining” as well as by “text analytics”. Whether a term belongs to a concept is not always clear-cut and may be open to discussion. In the end, a domain expert is needed to decide whether the assignment is correct. Research concepts are related to each other. When we talk about hierarchies, we usually refer to the hypernym-hyponym relation. This is a relation between a broader and a narrower term, with the hyponym being the more specific concept subsumed by the hypernym. To continue the earlier example, “text mining” would be a hyponym of “text processing”, at least according to the Computer Science Ontology portal [2]. A large hierarchy of research concepts can help in exploring a research domain, as the following examples show.

The concept hierarchy can be used to answer various questions a researcher might have. Getting the bigger picture of an area of interest is the most obvious use case. The hierarchy shows how topics are connected and how they relate to one another. If we go down the hierarchy looking for hyponyms, we find the sub-areas and techniques used within the concept we started from. If we go the other way, i.e. up the hierarchy, we see what our chosen topic is used for. This can provide motivation and use cases for conducting research on the topic. Another benefit is the possibility of finding niche topics that might otherwise be overlooked. Because a hierarchy carries almost no detail about the individual concepts, it makes it easy to spot related topics without having to dive deep into the literature itself.
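The navigation described above can be illustrated with a minimal sketch. It is not the method developed in the thesis, only one possible way to store and query such a hierarchy. The class name ConceptHierarchy, its methods, and the edge list are hypothetical. Only the pair “text processing” / “text mining” and the synonym “text analytics” are taken from the text; the remaining edges are assumptions added for illustration.

```python
from collections import defaultdict

# (hypernym, hyponym) pairs. Only the first pair comes from the text;
# the other two are assumptions made up for this illustration.
EDGES = [
    ("text processing", "text mining"),
    ("natural language processing", "text processing"),  # assumption
    ("text mining", "sentiment analysis"),               # assumption
]

# Alternative names that refer to the same concept.
ALIASES = {"text analytics": "text mining"}


class ConceptHierarchy:
    """Hypothetical container for hypernym-hyponym relations."""

    def __init__(self, edges, aliases=None):
        self.aliases = dict(aliases or {})
        self.children = defaultdict(set)  # hypernym -> direct hyponyms
        self.parents = defaultdict(set)   # hyponym -> direct hypernyms
        for hypernym, hyponym in edges:
            self.children[hypernym].add(hyponym)
            self.parents[hyponym].add(hypernym)

    def _walk(self, concept, neighbours):
        # Collect every concept reachable from the start concept.
        start = self.aliases.get(concept, concept)
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in neighbours.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        seen.discard(start)
        return seen

    def hyponyms(self, concept):
        """Go down the hierarchy: sub-areas and techniques."""
        return self._walk(concept, self.children)

    def hypernyms(self, concept):
        """Go up the hierarchy: areas the concept is used in."""
        return self._walk(concept, self.parents)


hierarchy = ConceptHierarchy(EDGES, ALIASES)
print(sorted(hierarchy.hyponyms("text processing")))
print(sorted(hierarchy.hypernyms("text analytics")))
```

Storing both directions of each edge keeps the upward and downward lookups symmetric, and resolving aliases first means that “text analytics” and “text mining” return the same neighbours.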

Research Questions

How to learn a hierarchy of NLP research concepts from publication abstracts and titles?

  1. How to model domain research concepts from scientific texts?
  2. How to infer hierarchical relationships between domain research concepts?
  3. How to derive a hierarchy from the extracted research concepts?
  4. How can this approach be transferred to other research domains?

References

  [1] STM global brief 2021 - economics & market size, 2021.
  [2] Salatino, A. A., Thanapalasingam, T., Mannocci, A., Osborne, F., and Motta, E. The computer science ontology: a large-scale taxonomy of research areas. In International Semantic Web Conference (2018), Springer, pp. 187–205.

Files and Subpages

Name                                  Size      Last Modification
Simon Klimek Final Presentation.pdf   7,90 MB   15.02.2023
Simon Klimek KickOff.pdf              1,18 MB   15.02.2023
Simon Klimek Thesis.pdf               1023 KB   15.02.2023