Master Thesis
Status: open
Supervisors: Florian Kleber & Rafael Sterzinger
Active Learning for Segmentation with Vision Foundation Models
The goal of this thesis is to develop and evaluate Active Learning methods for semantic segmentation using modern Vision Foundation Models (VFMs) such as SAM, DINOv3, or RADIO. The focus is on reducing annotation effort, especially in out-of-distribution (OOD) settings, where the target domain differs significantly from the source data used during pretraining.
The work investigates how representation-space information from foundation models can be used to identify informative samples for annotation. In contrast to classical uncertainty-based Active Learning approaches, the thesis focuses on feature-space novelty, segmentation instability, and representation diversity as acquisition signals for sample selection.
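As an illustration of what a feature-space novelty signal could look like, the following minimal sketch scores each unlabeled image by its mean cosine distance to its k nearest labeled images in embedding space and selects the highest-scoring ones. It assumes that each image has already been reduced to a single feature vector (for example, a pooled VFM embedding); the function names, the choice of cosine distance, and the parameter k are illustrative assumptions, not a prescribed method for the thesis.

```python
import numpy as np

def novelty_scores(unlabeled, labeled, k=3):
    """Mean cosine distance from each unlabeled feature to its k nearest labeled features.

    unlabeled: (n_unlabeled, d) array, labeled: (n_labeled, d) array.
    Both are assumed to be precomputed image-level embeddings (hypothetical input).
    """
    # L2-normalise rows so that dot products equal cosine similarities
    u = unlabeled / np.linalg.norm(unlabeled, axis=1, keepdims=True)
    l = labeled / np.linalg.norm(labeled, axis=1, keepdims=True)
    dist = 1.0 - u @ l.T                      # shape (n_unlabeled, n_labeled)
    k = min(k, l.shape[0])                    # cannot use more neighbours than exist
    nearest = np.sort(dist, axis=1)[:, :k]    # k smallest distances per row
    return nearest.mean(axis=1)

def select_most_novel(unlabeled, labeled, budget, k=3):
    """Indices of the `budget` unlabeled samples with the highest novelty score."""
    scores = novelty_scores(unlabeled, labeled, k)
    return np.argsort(-scores)[:budget]

# Toy example with 2-D features
labeled = np.array([[1.0, 0.0], [0.9, 0.1]])
unlabeled = np.array([[1.0, 0.05], [0.0, 1.0], [-1.0, 0.0]])
picked = select_most_novel(unlabeled, labeled, budget=2, k=2)
```

In this toy example the third and second unlabeled vectors point away from the labeled set and are therefore selected first. A diversity-aware variant would additionally update the labeled set after each pick so that near-duplicate samples are not selected together.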
The data used may include natural image datasets as well as specialized domains such as medical imaging or cultural heritage data. A particular emphasis is placed on evaluating how Active Learning strategies behave under domain shift and how efficiently foundation models can be adapted to new segmentation tasks with limited labeled data.
The work includes a literature review, the preparation of suitable datasets, the implementation of Active Learning pipelines, and the evaluation of different acquisition strategies. A central part is the development and investigation of novel acquisition functions tailored to Vision Foundation Models and out-of-distribution segmentation settings. Finally, the results should be documented and analysed with respect to annotation efficiency, segmentation performance, and robustness under OOD conditions. Applicants with an interest in scientific writing are especially welcome, and publication of the developed methods is encouraged.

Core Related Work
Revisiting Active Learning in the Era of Vision Foundation Models
Tasks
- Conduct a literature review on Active Learning, semantic segmentation, and Vision Foundation Models
- Prepare and analyse (cross-domain) segmentation datasets
- Implement segmentation pipelines using pretrained Vision Foundation Models
- Develop and evaluate Active Learning strategies based on feature-space novelty and segmentation instability
- Compare representation-based acquisition methods with classical uncertainty-based approaches
- Assess annotation efficiency and robustness under domain shift
- Document results and discuss practical applicability and future research directions
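One possible reading of the segmentation instability signal named in the tasks above is sketched below: the same image is segmented several times under perturbations (for instance different augmentations or prompts), and the score is the average fraction of predictions that disagree with the per-pixel majority vote. The function name, the input layout, and the majority-vote formulation are assumptions made for illustration only.

```python
import numpy as np

def instability(pred_masks, num_classes):
    """Average per-pixel disagreement with the majority vote.

    pred_masks: (T, H, W) integer array of class labels obtained from T
    perturbed forward passes of the same image (hypothetical input).
    Returns 0.0 when all passes agree everywhere.
    """
    pred_masks = np.asarray(pred_masks)
    num_passes = pred_masks.shape[0]
    # votes[c, h, w] = number of passes that predicted class c at pixel (h, w)
    votes = np.stack([(pred_masks == c).sum(axis=0) for c in range(num_classes)])
    agreement = votes.max(axis=0) / num_passes   # share of passes matching the majority
    return float(1.0 - agreement.mean())

# Toy example: four passes over a 2x2 image that all agree,
# and four passes over a single pixel that split evenly between two classes
stable = np.zeros((4, 2, 2), dtype=int)
unstable = np.array([0, 0, 1, 1]).reshape(4, 1, 1)
score_stable = instability(stable, num_classes=2)
score_unstable = instability(unstable, num_classes=2)
```

Images with a higher score would be prioritised for annotation. Unlike entropy over softmax outputs, this score needs only hard label maps, which may be convenient when a model does not expose calibrated probabilities.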
Contact
To apply or for further enquiries, please send an email to:
Rafael Sterzinger: rafael.sterzinger@tuwien.ac.at