Writer Adaptation for Handwritten Text Recognition of Historical Documents

Status: taken
Supervisors: Marco Peer

The digitization and preservation of historical documents rely on accurate transcription of handwritten text. However, historical documents pose particular challenges because writing styles vary widely and the material is often degraded. This thesis should explore writer identification and writer-specific style extraction within Handwritten Text Recognition (HTR) systems, treating the writer as a key attribute of handwriting. The objective is to develop techniques that use writer-specific characteristics to make the recognition and transcription of handwritten text from historical documents more accurate and reliable. An example of the variation across different writers is shown in Figure 1.

Figure 1: Two line images from two different writers in the Bullinger dataset [1].

To evaluate the proposed writer adaptation techniques, benchmark datasets widely used in HTR research, such as IAM or CVL, should be employed. In addition, the evaluation should include a challenging historical dataset such as the Bullinger dataset [1]. Existing approaches to writer adaptation in HTR rely primarily on meta-learning [2] or on additional networks that extract writer styles [3]. More recently, diffusion models have been shown to generate synthetic handwriting conditioned on a specific writer style [4]. This is a promising way to augment the training data and improve the performance of HTR models.

References

[1] A. Scius-Bertrand, “Bullinger Dataset for Writer Adaptation (BullingerDB)”, 1, ID: BullingerDB_1, URL: https://tc11.cvc.uab.es/datasets/BullingerDB_1
[2] A. Bhunia et al., “MetaHTR: Towards writer-adaptive handwritten text recognition”, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021.
[3] Z. Wang and J. Du, “Fast writer adaptation with style extractor network for handwritten text recognition”, Neural Networks, vol. 147, 2022.
[4] K. Nikolaidou et al., “WordStylist: Styled Verbatim Handwritten Text Generation with Latent Diffusion Models”, accepted for ICDAR 2023, 2023.

The thesis consists of

  • Literature Review – getting to know the methods
  • Implementation & Evaluation
    • Evaluate state-of-the-art methods on the datasets chosen
    • Develop and apply your writer adaptation algorithm for HTR
      • Code for writer identification/retrieval methods can be provided
    • Comparison and thorough evaluation (e.g., improvement of CER/WER)
  • Written Report/Thesis and final presentation
  • Summarize your work in a publication (optional)
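
The CER/WER comparison mentioned in the list above could be computed along the following lines. This is only a minimal sketch in plain Python, not a prescribed evaluation protocol: the function names `edit_distance` and `error_rate` are hypothetical, and it assumes that errors are summed over all lines and divided by the total reference length (rather than averaged per line) and that words are split on whitespace.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: insertions, deletions, substitutions each cost 1."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from the reference
                curr[j - 1] + 1,     # insert h into the reference
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def error_rate(refs: List[str], hyps: List[str], unit: str = "char") -> float:
    """CER (unit="char") or WER (unit="word") over a set of text lines.

    Assumption: edits are accumulated over all lines and normalized by the
    total reference length, so long lines weigh more than short ones.
    """
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must have the same number of lines")
    total_edits = 0
    total_len = 0
    for ref, hyp in zip(refs, hyps):
        r = list(ref) if unit == "char" else ref.split()
        h = list(hyp) if unit == "char" else hyp.split()
        total_edits += edit_distance(r, h)
        total_len += len(r)
    if total_len == 0:
        raise ValueError("reference text is empty")
    return total_edits / total_len


if __name__ == "__main__":
    # Toy transcriptions, made up purely for illustration.
    references = ["hello world"]
    hypotheses = ["hallo world"]
    print(f"CER: {error_rate(references, hypotheses, 'char'):.3f}")
    print(f"WER: {error_rate(references, hypotheses, 'word'):.3f}")
```

Running the same function on the outputs of a baseline HTR model and of a writer-adapted model, using identical reference transcriptions, would give the kind of CER/WER improvement the evaluation asks for.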

Helpful experience

  • Python
  • Basic/Good understanding of deep learning
  • Machine Learning frameworks (preferably PyTorch)
  • Interest in deep learning, document analysis, historical documents and/or handwritten text recognition