Table Structure Recognition

Supervisor: Florian Kleber

Start: as soon as possible

Problem Statement

Document Image Analysis (DIA) covers the analysis and recognition of document images. Tasks range from skew estimation and layout analysis to Handwritten Text Recognition (HTR). Tables in documents contain structured information that can give deeper insight into specific data. Table Structure Recognition (TSR) is the task of recognizing the table content together with its structure (cells, table rows, and table columns).
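To make the notion of table structure concrete, the following is a minimal sketch of how the output of a TSR system could be represented in Python. The class name Cell, its fields, and the helper functions are hypothetical choices for illustration only; they do not correspond to any particular dataset format or to the method to be used in this project.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """One table cell (hypothetical representation, for illustration)."""
    row: int                 # index of the first row the cell occupies
    col: int                 # index of the first column the cell occupies
    row_span: int = 1        # number of rows the cell covers
    col_span: int = 1        # number of columns the cell covers
    bbox: tuple[int, int, int, int] = (0, 0, 0, 0)  # (x_min, y_min, x_max, y_max) in pixels
    text: str = ""           # recognized cell content, if available


def grid_shape(cells: list[Cell]) -> tuple[int, int]:
    """Return (number of rows, number of columns) covered by the cells."""
    n_rows = max((c.row + c.row_span for c in cells), default=0)
    n_cols = max((c.col + c.col_span for c in cells), default=0)
    return n_rows, n_cols


def cells_in_row(cells: list[Cell], row: int) -> list[Cell]:
    """Return all cells that cover the given row, ordered by column."""
    covering = [c for c in cells if c.row <= row < c.row + c.row_span]
    return sorted(covering, key=lambda c: c.col)


# A small made-up table: one header cell spanning two columns, then two cells.
example = [
    Cell(row=0, col=0, col_span=2, text="Header"),
    Cell(row=1, col=1, text="b"),
    Cell(row=1, col=0, text="a"),
]
```

With such a representation, rows and columns are implied by the cell indices and spans, so merged cells need no special handling; for the example above, grid_shape(example) yields a grid of two rows and two columns.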

Goal

The goal of the practical course (Informatik Praktikum) / Bachelor thesis is to train a state-of-the-art neural network to recognize tables in document images. Datasets with annotated tables (ground truth, GT) are available.

Workflow
Literature research
Implementation in Python or C++
Evaluation of the system
Written report or thesis (in English) and final presentation

Requirements
Python or C++
Basic knowledge of Computer Vision
Basic knowledge of Deep Learning (TensorFlow, PyTorch) and Machine Learning