Offline Handwritten Text Recognition (HTR) is the task of transcribing handwritten text into digital text. Compared to Optical Character Recognition (OCR), HTR is much more challenging and remains an open problem. Recently, a transformer-based framework named TrOCR was proposed by Li et al. (2021).
The aim of this internship is to fine-tune existing TrOCR-based HTR models (Li et al., 2021) on different data sources. As a preprocessing step, the layout information should be determined with the LayoutReader framework proposed by Wang et al. (2021). For this purpose, an additional model should be fine-tuned.
Written Report/Thesis and final presentation
 Minghao Li, Tengchao Lv, Lei Cui, Yijuan Lu, Dinei A. F. Florêncio, Cha Zhang, Zhoujun Li, Furu Wei: TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models. CoRR abs/2109.10282 (2021)
 Zilong Wang, Yiheng Xu, Lei Cui, Jingbo Shang, Furu Wei: LayoutReader: Pre-training of Text and Layout for Reading Order Detection. EMNLP (1) 2021: 4735-4744