Master's Thesis Ishwor Subedi

Last modified May 9

Candidate Profile Evaluation: A RAG Approach with Synthetic Data Generation for Tech Jobs

Introduction & Motivation

Applicant Tracking Systems (ATS) have been integral to candidate profile evaluation for many years. These systems assist companies by parsing and analyzing resume content, often with the help of machine learning techniques, to provide insights into candidates' profiles. However, a significant challenge has been the lack of sufficient resumes to either train the ATS or test its functionality after deployment. Moreover, existing systems often struggle to provide a comprehensive overview of resumes and effectively compare them to the requirements outlined in job descriptions.

To address these issues, we propose the use of synthetic data generation through Large Language Models (LLMs) to create resumes for technical roles (e.g., software developers, data scientists, machine learning engineers) that closely resemble real-world resumes. To assess the quality of this synthetic data, we suggest evaluation methods that compare the generated resumes to actual resumes in the field. Additionally, we propose the implementation of a Retrieval-Augmented Generation (RAG) system to enhance the comparison of resumes against specific job descriptions, offering deeper insights into the alignment between candidates' qualifications and job requirements.
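The retrieval step of such a RAG comparison can be illustrated with a minimal sketch. It assumes a resume has already been split into short text chunks and uses plain term-frequency cosine similarity as a stand-in for a learned embedding model. The function names (`tokenize`, `cosine`, `retrieve`, `build_prompt`) and the sample texts are hypothetical and are not taken from the thesis.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase the text and keep only alphanumeric tokens.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    # Cosine similarity between two term-frequency Counters.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(job_description, resume_chunks, k=2):
    # Score every resume chunk against the job description, keep the top k.
    query = Counter(tokenize(job_description))
    scored = [(cosine(query, Counter(tokenize(c))), c) for c in resume_chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


def build_prompt(job_description, retrieved):
    # Assemble the text that would be passed to an LLM (assumed prompt layout).
    evidence = "\n".join(f"- {chunk}" for _, chunk in retrieved)
    return (
        f"Job description:\n{job_description}\n\n"
        f"Relevant resume excerpts:\n{evidence}\n\n"
        "Assess how well the candidate matches the job."
    )


# Hypothetical sample data for illustration only.
job = "Machine learning engineer with Python and PyTorch experience"
chunks = [
    "Built PyTorch models in Python for image classification",
    "Organized the university chess club",
    "Deployed machine learning pipelines on Kubernetes",
]
top = retrieve(job, chunks, k=2)
prompt = build_prompt(job, top)
print(prompt)
```

In this sketch only the two chunks that share vocabulary with the job description reach the prompt, so the generation step sees job-relevant evidence rather than the whole resume. A practical system would likely replace the term-frequency scoring with dense embeddings and a vector index.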

Research Questions

R1: How can we generate synthetic resumes that match the real-world distribution of resumes while overcoming privacy barriers?
R2: How do we evaluate the quality of synthetic resume data?
R3: Can a RAG-based approach to candidate selection outperform the Named Entity Recognition (NER) baseline?
R4: Which open-source or proprietary models perform well on candidate summarization and matching?
