Balahari Vignesh Balu

General Info Balu

General information about the courses taught by B. V. Balu

Instructor: Balahari Vignesh Balu

More courses

Project: Building Reliable LLM Workflows with Structured Data

While large closed-source LLMs perform reasonably well on widely used open-source programming languages and publicly available repositories, their effectiveness decreases in industrial environments. Companies rely on proprietary codebases, internal documentation, architectural models, and domain-specific artifacts that are largely absent from public training data. Consequently, directly applying generic LLMs often leads to limited reliability and contextual understanding. To make LLMs genuinely useful in such environments, smaller, adaptable models and agent-based workflows must be developed that can integrate structured and domain-specific knowledge in a controlled and reliable manner.

This project explores how structured software-engineering artifacts can enhance LLM-based systems for code understanding. Students will work with two datasets: a UML corpus containing heterogeneous diagrams and multilingual descriptions, and a dataset of vulnerability–fix pairs collected from programming-language repositories. The project focuses on two main challenges: (1) preparing high-quality structured datasets from noisy sources, including deduplication, normalization, multilingual processing, and semantic preservation, and (2) evaluating methods for integrating this information into LLM-based systems, such as parameter-efficient fine-tuning, retrieval-augmented generation, and agent-based approaches with external knowledge.

The project will follow an Agile/Scrum framework to ensure iterative development and integration. Students will be organized into self-managing sub-teams responsible for planning, tracking, and version control. The structure includes weekly sprints, milestone reviews, and MVP demonstrations to manage progress toward the defined goals. Students will develop reproducible experimental pipelines and empirically analyze how structured knowledge improves the reliability and interpretability of LLM-supported analysis.

Instructor: Balahari Vignesh Balu
Instructor: Gokul Srinivasagan

Project: THI Code Assistant

Instructor: Sebastian Apel
Instructor: Balahari Vignesh Balu
Instructor: Raviteja Boddu