Secure Technologies for Reintegration

Digital technologies, such as videotelephony and controlled internet access, have expanded inmates’ opportunities to maintain social ties, access educational resources, and prepare for reintegration. Monitoring these interactions is resource-intensive, particularly for multilingual communication that often requires interpreters, and staff shortages make supervision challenging. The STeRn project addresses these challenges by implementing AI-based monitoring that intervenes only in case of violations, ensuring secure, efficient, and legally compliant communication while supporting rehabilitation and reducing staff workload.

To achieve this, the project develops a multifunctional system combining advanced AI technologies with social-scientific evaluation. It provides real-time multilingual speech interpretation with automatic transcription, facial recognition for secure identity verification, and content control through symbol/logo and explicit content detection. The system is designed to preserve privacy, intervene selectively, and enable iterative testing in prison settings to ensure usability, acceptance, and operational effectiveness. In parallel, legal, ethical, and social considerations guide the design to align with human rights and rehabilitation goals.
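The idea of intervening only when a violation occurs can be illustrated with a small thresholding sketch. This is not the project's implementation: the detector names, the `Detection` type, and the threshold values below are hypothetical, chosen only to show how content-control scores might gate an intervention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    detector: str  # hypothetical detector name, e.g. "symbol_logo"
    score: float   # detector confidence in [0, 1]


# Hypothetical per-detector thresholds; real values would come from evaluation.
THRESHOLDS = {"symbol_logo": 0.8, "explicit_content": 0.9}


def violations(detections, thresholds=THRESHOLDS):
    """Return only the detections whose score reaches the detector's threshold.

    Detectors without a configured threshold are flagged only at score 1.0.
    """
    return [d for d in detections if d.score >= thresholds.get(d.detector, 1.0)]


def should_intervene(detections, thresholds=THRESHOLDS):
    """True if at least one detection counts as a violation."""
    return bool(violations(detections, thresholds))


# Usage: one low-confidence and one high-confidence detection for a video frame.
frame = [Detection("symbol_logo", 0.35), Detection("explicit_content", 0.95)]
print(should_intervene(frame))  # True
```

Under this assumption, a call with no detections above threshold would pass without staff attention, which is the workload reduction the project aims for.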

Multilingual Speech Recognition

A key component of STeRn is its multilingual speech recognition, which converts spoken language into digital text using state-of-the-art models, including massively multilingual models trained on datasets covering over 100 languages. OpenAI’s Whisper, for example, transcribes audio in many languages and can translate speech into English. The pipeline enables real-time communication between individuals who do not share a common language, such as an inmate and a social worker: it transcribes the speaker's audio, translates the text, and synthesizes a natural-sounding signal in the recipient’s language, then reverses the process for responses. Beyond direct communication, the system also supports multilingual monitoring: as illustrated in the figure below, a third party, such as a prison officer, can follow the conversation in another language while the original participants speak freely.

Figure: Adapted STeRn pipeline for multilingual speech interpretation and automatic transcription.
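The transcribe, translate, and synthesize chain described above, including the optional monitoring view, can be sketched as follows. This is a minimal illustration under stated assumptions, not the STeRn implementation: `InterpretationPipeline`, `RelayResult`, and the `toy_*` stages are hypothetical names, and the toy stages stand in for real speech recognition, machine translation, and speech synthesis models.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Stage signatures (assumed for this sketch).
Transcriber = Callable[[bytes, str], str]    # (audio, language) -> text
Translator = Callable[[str, str, str], str]  # (text, source, target) -> text
Synthesizer = Callable[[str, str], bytes]    # (text, language) -> audio


@dataclass(frozen=True)
class RelayResult:
    transcript: str              # text in the speaker's language
    translation: str             # text in the recipient's language
    audio_out: bytes             # synthesized audio for the recipient
    monitor_text: Optional[str]  # text for a third party, if requested


@dataclass
class InterpretationPipeline:
    transcribe: Transcriber
    translate: Translator
    synthesize: Synthesizer

    def relay(self, audio, source, target, monitor=None):
        """Carry one utterance from the source to the target language."""
        transcript = self.transcribe(audio, source)
        translation = self.translate(transcript, source, target)
        audio_out = self.synthesize(translation, target)
        monitor_text = None
        if monitor is not None:
            # The monitoring party reads the conversation in its own language.
            monitor_text = self.translate(transcript, source, monitor)
        return RelayResult(transcript, translation, audio_out, monitor_text)


# Toy stages so the sketch runs without any models (purely illustrative).
_PHRASES = {("en", "de", "hello"): "hallo", ("en", "fr", "hello"): "bonjour"}


def toy_transcribe(audio, language):
    return audio.decode("utf-8")  # pretend the bytes are already text


def toy_translate(text, source, target):
    if source == target:
        return text
    return _PHRASES.get((source, target, text), f"[{target}] {text}")


def toy_synthesize(text, language):
    return text.encode("utf-8")  # pretend the text is audio


pipeline = InterpretationPipeline(toy_transcribe, toy_translate, toy_synthesize)
result = pipeline.relay(b"hello", source="en", target="de", monitor="fr")
print(result.translation, result.monitor_text)  # hallo bonjour
```

Responses travel the same path with `source` and `target` swapped. Keeping the three stages as plain callables means any one of them could be replaced (for example, by a Whisper-based transcriber) without touching the relay logic.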

The main goals of this project are:

Expand secure video calls and internet access in prisons.
Provide multilingual speech interpretation with automatic transcription.
Ensure identity verification via facial recognition and content control through symbol/logo detection.
Evaluate technologies in terms of acceptance, effectiveness, and usability in real-world settings.
Conduct legal, ethical, and social assessments to ensure compliance and user acceptance.

Project Partners

This project is funded by the Austrian security research program KIRAS of the Austrian Research Promotion Agency FFG (grant 999926276).
