Master's Thesis Gentrit Fazlija

Last modified Jun 3, 2024

Toward Optimizing a Retrieval Augmented Generation Pipeline using Large Language Model

Introduction & Motivation

Hello! Welcome to my project page :)

I'm currently working on my master's thesis. As a student of Mathematics in Data Science, I'm deeply interested in data and in how to maximize the value within it. I have been hooked on NLP ever since I was introduced to it. My current focus is an information retrieval model that aims to help both current and prospective students understand the different study programs that TUM offers.

To this end, I am leveraging the reasoning capabilities of Large Language Models to extract current data about the study programs at TUM. The goal is to build a model pipeline that answers a variety of questions one might have about these programs.

Join me on this journey either by checking back on this page around mid-February or connecting with me on LinkedIn.

Research Questions

Q1: Would a multi-query formulation system improve the pipeline's performance?

Q2: Would optimization approaches, such as an ensemble retriever in combination with child-parent chunking, improve the performance of the passage retriever?

Q3: How much does few-shot prompting help compared to zero-shot prompting?

Q4: How does the performance change when using a free open-source model compared to a paid closed-source model? How can open-source models be optimized?

References

tba

Files and Subpages

Name	Type	Size	Last Modification
Checkliste_Masterthesis_Gentrit Fazlija.pdf	File	473 KB	03.06.2024
Master Thesis_Gentrit Fazlija_signed.pdf	File	724 KB	03.06.2024
Masterthesis Final_Gentrit Fazlija.pdf	File	11,11 MB	03.06.2024
Masterthesis Kick-Off_Gentrit Fazlija.pdf	File	6,83 MB	03.06.2024
