
Anum Afzal


Faculty of Informatics
Chair of Informatics 19
Software Engineering for Business Information Systems (sebis)    

Technical University of Munich
Boltzmannstraße 3
85748 Garching, Germany

 

anum.afzal [at] tum.de

Room FMI 01.12.055

Office hours: by appointment

  

 

 

I do not supervise industry theses.

I currently don't have any thesis topics.

Curriculum Vitae

Anum is a Ph.D. candidate at the Technical University of Munich (TUM), focusing on topics such as Efficiency and Domain Adaptation of Large Language Models. She has been a researcher at the chair of Software Engineering for Business Information Systems (sebis) at TUM since September 2021.

She is also part of the Industry on Campus initiative and supports the SAP @ TUM Collaboration Lab. Apart from her theoretical research on domain-specific text summarization, she works with SAP on a research project on the application of LLMs in a business context. She also collaborates with the Holtzbrinck Publishing Group on a domain-specific text summarization project as part of the Software Campus initiative.

She holds a master's degree in Computer Science from TUM, where she wrote her master's thesis on Topic Modeling for Employee Objectives using Word Embeddings in collaboration with the Merck Group. She has also worked as a research assistant at the Chair of Information Systems at TUM and completed a student internship at Munich Re during her master's studies.

 

Research Interests

  • Text Summarization
  • Domain Adaptation
  • LLM Performance Prediction
  • Efficient Transformers, Parameter-efficient fine-tuning
  • Natural Language Generation Evaluation

 

Research Projects


Enterprise AI at SAP

SAP and TUM collaborate in strategic areas where applied research can have a positive impact on business and people. The portfolio of joint activities covers a broad set of areas, as SAP's solutions are used in many contexts. "Enterprise AI" is a crucial part of that portfolio, under which several research projects are being carried out. There are currently two ongoing projects and one completed project.

Current Project:

AutoRAG - Leveraging Bayesian Optimization for Accelerating RAG Pipeline Optimization

 

ATESD: Abstractive Text Summarization for Domain-Specific Documents

Large Language Models work quite well with general-purpose data and many Natural Language Processing tasks. However, they show several limitations when used for a task such as domain-specific abstractive text summarization. This project identifies three of those limitations as research problems in the context of abstractive text summarization: 1) the quadratic complexity of transformer-based models with respect to the input text length; 2) Model Hallucination, i.e., a model's tendency to generate factually incorrect text; and 3) Domain Shift, which occurs when the distributions of the model's training and test corpora differ. Along with a discussion of the open research questions, the project addresses these research gaps in the context of abstractive text summarization.
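
As a rough illustration of the first limitation only (this sketch is not part of the project and uses made-up dimensions): in standard self-attention every token attends to every other token, so the attention-weight matrix for an input of n tokens has n × n entries, which is where the quadratic cost in input length comes from.

import numpy as np

def attention_weights(n_tokens, d_model=64, seed=0):
    """Softmax-normalised self-attention weights for random queries and keys.
    The returned matrix has shape (n_tokens, n_tokens), so memory and compute
    grow quadratically with the input length."""
    rng = np.random.default_rng(seed)
    q = rng.standard_normal((n_tokens, d_model))   # toy query vectors
    k = rng.standard_normal((n_tokens, d_model))   # toy key vectors
    scores = q @ k.T / np.sqrt(d_model)            # shape: (n_tokens, n_tokens)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return weights / weights.sum(axis=-1, keepdims=True)

for n in (512, 1024, 2048):
    # doubling the number of tokens quadruples the number of attention entries
    print(n, attention_weights(n).shape)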

 
 

 

Teaching (in reverse chronological order)

Term      Level              Title                                                          Type                Role
SS 25     Master             Natural Language Processing - Methods and Applications        Seminar             Organizer
SS 25     Master             Software Engineering for Business Applications (SEBA Master)  Lab Course          Advisor
WS 24/25  Master             SEBA Lab Course                                                Lab Course          Advisor
SS 24     Master             Natural Language Processing - Methods and Applications        Seminar             Organizer
SS 24     Master             SEBA Lab Course (NLP)                                          Lab Course          Advisor
SS 24     Master             Software Engineering for Business Applications (SEBA Master)  Lab Course          Advisor
SS 23     Master             Natural Language Processing - Methods and Applications        Seminar             Organizer
SS 23     Master             SEBA Lab Course (NLP)                                          Lab Course          Advisor
WS 22/23  Master             SEBA Lab Course                                                Lab Course          Advisor
SS 22     Master             Natural Language Processing - Methods and Applications        Seminar             Organizer
SS 22     Master / Bachelor  Conversational AI workshop                                     Certificate Course  Organizer
SS 22     Master             Software Engineering for Business Applications (SEBA Master)  Lab Course          Advisor
WS 21/22  Master             SEBA Lab Course                                                Lab Course          Advisor

 

Publications (in reverse chronological order)

2025
[Link TBD] Anum Afzal, Florian Matthes, and Alexander R. Fabbri. DA-Pred: Performance Prediction for Text Summarization under Domain-Shift and Instruct-Tuning. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP 2025), Suzhou, China. Association for Computational Linguistics, 2025.
[Link TBD] Anum Afzal, Ishwor Subedi, and Florian Matthes. Candidate Profile Summarization - A RAG Approach with Synthetic Data Generation for Tech Jobs. In Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing (RANLP 2025), Varna, Bulgaria. Association for Computational Linguistics, 2025.
[Link] Anum Afzal, Juraj Vladika, and Florian Matthes. FActBench: A Benchmark for Fine-grained Automatic Evaluation of LLM-Generated Text in the Medical Domain. In Proceedings of the 8th International Conference on Natural Language and Speech Processing (ICNLSP 2025), Odense, Denmark. Association for Computational Linguistics, 2025.
[Link] Anum Afzal, Florian Matthes, Gal Chechik, and Yftah Ziser. Knowing Before Saying: LLM Representations Encode Information About Chain-of-Thought Success Before Completion. In Findings of the Association for Computational Linguistics (ACL 2025), Vienna, Austria. Association for Computational Linguistics, 2025.
[Link] Anum Afzal, Alexandre Mercier, and Florian Matthes. JaccDiv: A Metric and Benchmark for Quantifying Diversity of Generated Marketing Text in the Music Industry. In Proceedings of the 9th International Conference on Innovation in Artificial Intelligence (ICIAI 2025), Singapore. Springer, 2025.
[Link] Anum Afzal, Juraj Vladika, Gentrit Fazlija, Andrei Staradubets, and Florian Matthes. Towards Optimizing a Retrieval Augmented Generation using Large Language Model on Academic Data. In Proceedings of the 8th International Conference on Natural Language Processing and Information Retrieval (NLPIR 2024), Okayama, Japan. Association for Computing Machinery, 2025.

2024

[Link] Anum Afzal, Ribin Chalumattu, Florian Matthes, and Laura Mascarell Espuny. AdaptEval: Evaluating Large Language Models on Domain Adaptation for Text Summarization. In Proceedings of the 1st Workshop on Customizable NLP: Progress and Challenges in Customizing NLP for a Domain, Application, Group, or Individual (EMNLP 2024), Miami, USA. Association for Computational Linguistics, 2024.
[Link] Anum Afzal, Rajna Fani, Alexander Kowsik, and Florian Matthes. Towards Optimizing and Evaluating a Retrieval Augmented QA Chatbot using LLMs with Human-in-the-Loop. In Proceedings of the Workshop on Data Science with Human in the Loop at NAACL 2024, Mexico City, Mexico. Association for Computational Linguistics, 2024. [Best Paper Award]
[Link] Anum Afzal, Tao Xiang, and Florian Matthes. A Semi-Automatic light-weight Approach towards Data Generation for a Domain-Specific FAQ chatbot using Human-in-the-Loop. In Proceedings of the 16th International Conference on Agents and Artificial Intelligence (ICAART 2024), Rome, Italy. SCITEPRESS - Science and Technology Publications, 2024.
2023
[Link] P. Schneider, A. Afzal, J. Vladika, D. Braun, and F. Matthes. Investigating Conversational Search Behavior for Domain Exploration. In Proceedings of the European Conference on Information Retrieval (ECIR 2023), Dublin, Ireland. Springer, 2023.
[Link] A. Afzal, J. Vladika, D. Braun, and F. Matthes. Challenges in Domain-Specific Abstractive Summarization and How to Overcome Them. In Proceedings of the 15th International Conference on Agents and Artificial Intelligence (ICAART 2023), Lisbon, Portugal. SCITEPRESS - Science and Technology Publications, 2023. [Best Paper Runner-up]