Diploma thesis/Master's thesis
Supervisor: Martin Kampel
Status: open
Motivation
Real estate management systems like metamagix.ICRS handle increasingly complex and diverse data streams, including rent rolls, accounting data, contracts, and unstructured documents. Traditional data processing methods often struggle with the heterogeneity of this information, leading to inefficiencies in data classification, normalization, and analysis. The initial implementation of RAG (Retrieval-Augmented Generation) and LLM (Large Language Model) technologies within metamagix.ICRS has demonstrated promising capabilities, but significant opportunities for improvement and extension remain unexplored.
Recent advances in LLM capabilities, particularly through models available on AWS Bedrock, have shown remarkable potential for understanding complex financial and contractual data, detecting patterns, and generating accurate forecasts. However, challenges persist in data normalization across different sources, accurate anomaly detection in financial streams, and generating reliable forecasts that incorporate both historical trends and macroeconomic factors. As real estate management becomes increasingly data-driven, the need for sophisticated AI-powered solutions that can process, analyze, and forecast from diverse data sources becomes critical for competitive advantage.
This thesis aims to build upon the existing RAG/LLM foundation in metamagix.ICRS to create a more comprehensive, accurate, and adaptable system for real estate data processing that addresses these challenges while maintaining enterprise-grade reliability and performance.
Goal of the Work
The goal of this master's thesis is to enhance and extend the existing RAG/LLM solution within metamagix.ICRS to improve three key capabilities:
- Intelligent classification and normalization of incoming data (rent rolls, accounting data, and documents) through REST API/MCP interfaces
- Advanced anomaly detection in financial data streams to identify outliers and potential errors
- Sophisticated forecasting of income and cost data based on historical patterns, contract information, and macroeconomic parameters
The solution will use the Java Spring AI framework and state-of-the-art LLMs via AWS Bedrock, ensuring enterprise-level performance, scalability, and security. The improved system should demonstrate measurable gains in data processing accuracy, anomaly detection precision, and forecast reliability compared to the existing implementation.
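As a point of reference for the anomaly detection goal, a simple statistical baseline can help quantify what an LLM-based approach adds. The following is a minimal sketch and not part of metamagix.ICRS or Spring AI: the class `RobustOutlierSketch`, its method names, the sample values, and the threshold of 3.5 are assumptions chosen for illustration. It flags values whose modified z-score, computed from the median and the median absolute deviation (MAD), exceeds a threshold.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative baseline only; all names are hypothetical. */
final class RobustOutlierSketch {

    // Scales the MAD so the score is comparable to a standard z-score
    // for normally distributed data.
    private static final double MAD_SCALE = 0.6745;

    /** Returns the median; the input array is not modified. */
    static double median(double[] values) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        return n % 2 == 1 ? sorted[n / 2] : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }

    /** Returns the indices whose modified z-score exceeds the threshold. */
    static List<Integer> outlierIndices(double[] values, double threshold) {
        List<Integer> flagged = new ArrayList<>();
        if (values.length == 0) {
            return flagged;
        }
        double med = median(values);
        double[] deviations = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            deviations[i] = Math.abs(values[i] - med);
        }
        double mad = median(deviations);
        if (mad == 0.0) {
            // At least half of the values equal the median; the score is undefined.
            return flagged;
        }
        for (int i = 0; i < values.length; i++) {
            double score = MAD_SCALE * deviations[i] / mad;
            if (score > threshold) {
                flagged.add(i);
            }
        }
        return flagged;
    }

    public static void main(String[] args) {
        // Hypothetical monthly rent postings with one obvious entry error.
        double[] monthlyRent = {1200.0, 1210.0, 1195.0, 1205.0, 12000.0, 1198.0};
        System.out.println(outlierIndices(monthlyRent, 3.5)); // prints [4]
    }
}
```

For the sample series, only the entry at index 4 (12000.0) is flagged. A baseline of this kind gives the thesis a fixed yardstick against which the precision and explainability of the LLM-based detection could be measured.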
Tasks
- Conduct a comprehensive analysis of the current RAG/LLM implementation in metamagix.ICRS, identifying strengths, limitations, and opportunities for improvement
- Review recent literature on LLM applications in financial data processing, document classification, anomaly detection, and time-series forecasting
- Design and implement enhanced data classification and normalization pipelines for diverse input types (rent rolls, accounting data, contracts, and unstructured documents)
- Create advanced anomaly detection mechanisms that can identify outliers in financial data streams with high precision and explainability
- Implement a forecasting system that integrates historical data, contract information, and the macroeconomic parameters already used in the application's existing cash flow model to generate income and cost projections
- Document the architectural improvements, integration patterns, and best practices for enterprise-grade LLM implementations in Java environments
- (Optional) Develop visualization components to improve the interpretability of anomaly detection and forecasting results
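To make the forecasting task more concrete, the sketch below shows one possible deterministic baseline that combines the three input types named above: the latest observed rent (historical data), the remaining lease term (contract information), and a yearly indexation rate (macroeconomic parameter). It is an assumption-laden illustration, not the cash flow model of metamagix.ICRS: the class `RentForecastSketch`, the method `projectRent`, and the rule of one indexation step every 12 months are hypothetical.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.ArrayList;
import java.util.List;

/** Illustrative baseline only; all names and rules are hypothetical. */
final class RentForecastSketch {

    /**
     * Projects monthly rent for the given number of months.
     *
     * @param currentRent             latest observed monthly rent (historical data)
     * @param remainingContractMonths months until the lease ends (contract information)
     * @param annualIndexRate         assumed yearly indexation, e.g. 0.03 for 3%
     * @param months                  length of the projection horizon
     */
    static List<BigDecimal> projectRent(BigDecimal currentRent, int remainingContractMonths,
                                        BigDecimal annualIndexRate, int months) {
        List<BigDecimal> projection = new ArrayList<>();
        BigDecimal factor = BigDecimal.ONE.add(annualIndexRate);
        BigDecimal rent = currentRent;
        for (int m = 0; m < months; m++) {
            if (m > 0 && m % 12 == 0) {
                rent = rent.multiply(factor); // yearly indexation step
            }
            // Income drops to zero once the lease has ended.
            BigDecimal value = m < remainingContractMonths ? rent : BigDecimal.ZERO;
            projection.add(value.setScale(2, RoundingMode.HALF_UP));
        }
        return projection;
    }

    public static void main(String[] args) {
        List<BigDecimal> p = projectRent(
                new BigDecimal("1000.00"), 18, new BigDecimal("0.03"), 24);
        // prints 1000.00 1030.00 0.00
        System.out.println(p.get(0) + " " + p.get(12) + " " + p.get(18));
    }
}
```

In this example the rent rises from 1000.00 to 1030.00 after twelve months and falls to 0.00 once the 18-month lease has ended. A transparent baseline like this could serve as the comparison point when evaluating whether LLM-assisted forecasts are more reliable.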
Required Skills
- Java programming skills with experience in Spring Framework and API development
- Knowledge of the Spring AI framework
- Understanding of data processing pipelines, REST APIs, and enterprise integration patterns
- Familiarity with machine learning concepts, especially classification, anomaly detection, and forecasting
- Basic understanding of prompt engineering and LLM capabilities/limitations
- Interest in real estate management systems and financial data processing
Contact: martin.kampel@tuwien.ac.at