Offline Handwritten Text Recognition (HTR) is the task of transcribing images of handwritten text into digital text. Compared to Optical Character Recognition (OCR), HTR is considerably more challenging and remains an open problem. Recently, a Transformer-based framework named TrOCR was proposed in [1].
Goal
The aim of this internship is to fine-tune the existing HTR models from [1] on different data sources. As a preprocessing step, the layout information should be determined with the LayoutReader framework proposed in [2], which requires fine-tuning an additional model.
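To illustrate how the two stages fit together, the following is a minimal sketch, not the method of [1] or [2]. It assumes that text lines have already been detected as bounding boxes, and it replaces LayoutReader with a naive top-to-bottom, left-to-right heuristic so that the example stays self-contained. The names LineBox, naive_reading_order, transcribe_page, row_tolerance and the recognize callable are hypothetical and introduced only for this example. In the actual project, a fine-tuned LayoutReader model would supply the order and a fine-tuned TrOCR model would play the role of recognize.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class LineBox:
    """Axis-aligned bounding box of one text line in pixels (hypothetical type)."""
    x0: int
    y0: int
    x1: int
    y1: int


def naive_reading_order(boxes: List[LineBox], row_tolerance: int = 10) -> List[LineBox]:
    """Order boxes top-to-bottom, then left-to-right within a row.

    A box joins the current row if its top edge lies within `row_tolerance`
    pixels of the top edge of the first box in that row. This is a heuristic
    stand-in for a learned reading-order model, not LayoutReader itself.
    """
    rows: List[List[LineBox]] = []
    for box in sorted(boxes, key=lambda b: b.y0):
        if rows and abs(box.y0 - rows[-1][0].y0) <= row_tolerance:
            rows[-1].append(box)
        else:
            rows.append([box])
    # Flatten the rows, sorting each row by its left edge.
    return [box for row in rows for box in sorted(row, key=lambda b: b.x0)]


def transcribe_page(boxes: List[LineBox], recognize: Callable[[LineBox], str]) -> str:
    """Run a line recognizer over the boxes in reading order, one line per box.

    `recognize` is an assumed placeholder for the HTR model applied to the
    image crop that belongs to a box.
    """
    return "\n".join(recognize(box) for box in naive_reading_order(boxes))
```

Separating the ordering step from the recognizer means that either stage can be swapped for its fine-tuned model without changing the other.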
Workflow
Literature review
Data preparation
Implementation
Written report/thesis and final presentation
References
[1] Minghao Li, Tengchao Lv, Lei Cui, Yijuan Lu, Dinei A. F. Florêncio, Cha Zhang, Zhoujun Li, Furu Wei: TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models. CoRR abs/2109.10282 (2021)
[2] Zilong Wang, Yiheng Xu, Lei Cui, Jingbo Shang, Furu Wei: LayoutReader: Pre-training of Text and Layout for Reading Order Detection. EMNLP (1) 2021: 4735-4744