Document Understanding

Status: open
Supervisors: Florian Kleber, Markus Diem, Stefan Fiel

Problem Statement

Analysis of the logical layout of documents makes it possible to map the content of a document image to a marked-up electronic representation, on which higher-level functionality such as advanced search (e.g. limiting a search to titles, or fetching all document images with one specific layout) can be built.



The goal of the Master's thesis is to develop a method that maps the physical layout structure of a document to a logical structure. Additionally, these logical structures should be compared with each other in order to search for similar documents.
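
To make the two steps above more concrete, here is a minimal C++ sketch of one possible approach. It is not the method to be developed in the thesis. Everything in it is an assumption made for illustration: the `Block` structure, the three labels, the text-height thresholds (1.8 and 1.2), and the use of an edit distance over label sequences as the similarity measure. In practice the blocks would come from a physical layout analysis step (e.g. implemented with OpenCV), which is not shown here.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical physical layout block: bounding box and mean text height (pixels).
struct Block {
    int x, y, w, h;
    double textHeight;
};

// Hypothetical, deliberately small set of logical labels.
enum class Label { Title, Heading, Paragraph };

// Assumed rule: label a block by its text height relative to the body text height.
// The thresholds are illustrative only and would need to be evaluated on real data.
Label classify(const Block& b, double bodyTextHeight) {
    const double ratio = b.textHeight / bodyTextHeight;
    if (ratio >= 1.8) return Label::Title;
    if (ratio >= 1.2) return Label::Heading;
    return Label::Paragraph;
}

// Map blocks to a sequence of logical labels in simple top-to-bottom order.
std::vector<Label> logicalStructure(std::vector<Block> blocks, double bodyTextHeight) {
    std::sort(blocks.begin(), blocks.end(),
              [](const Block& a, const Block& b) { return a.y < b.y; });
    std::vector<Label> labels;
    for (const Block& b : blocks) labels.push_back(classify(b, bodyTextHeight));
    return labels;
}

// Levenshtein edit distance between two label sequences;
// a smaller value means the two logical structures are more similar.
std::size_t editDistance(const std::vector<Label>& a, const std::vector<Label>& b) {
    std::vector<std::size_t> prev(b.size() + 1), cur(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = j;
    for (std::size_t i = 1; i <= a.size(); ++i) {
        cur[0] = i;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const std::size_t substitute = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            cur[j] = std::min({substitute, prev[j] + 1, cur[j - 1] + 1});
        }
        std::swap(prev, cur);
    }
    return prev[b.size()];
}
```

With these definitions, searching for similar documents could amount to computing `logicalStructure` for each document and ranking candidates by `editDistance` to the query's label sequence. A sequence-based comparison ignores two-dimensional arrangement (e.g. multi-column layouts), so a tree- or graph-based representation may be more appropriate in practice.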


  • Literature research
  • Implementation in Matlab or C++ with OpenCV
  • Development and evaluation of the system
  • Master's thesis (in English) and final presentation


  • Matlab, or C++ and OpenCV