Ruling Database

The CVL ruling dataset was synthetically generated to allow different ruling removal methods to be compared. It is based on the ICDAR 2013 Handwriting Segmentation database [1]. It was created by synthetically adding four different ruling images to the handwriting images, resulting in a total of 600 test images. The pixel values are:

  • 255 background
  • 155 ruling
  • 100 text
  • 0 ruling and text (overlapping)

For processing, a binary image must be generated by setting all pixels that are not 255 to 0. For evaluation, the line GT image is obtained by selecting all pixels with value 155 (e.g. linImg = img == 155). The text GT image is obtained by selecting all pixels with values below 155, i.e. text and overlapping pixels (e.g. txtImg = img < 155). True positives (tp), false positives (fp) and false negatives (fn) are then defined as:

  • tp = result & linImg & !txtImg
  • fp = result & !linImg & !txtImg
  • fn = !result & linImg & !txtImg
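
The binarization and the pixel counts above can be sketched in a few lines of plain Python. This is a minimal illustration, not the evaluation script shipped with the database: the function names binarize and count_errors, the representation of images as nested lists, and the tiny example image are all assumptions made for this sketch. It also assumes that false positives are detected pixels that are neither ruling nor text.

```python
# Pixel values used in the ground truth images
BACKGROUND, RULING, TEXT, OVERLAP = 255, 155, 100, 0


def binarize(img):
    """Build the input for a ruling removal method: non-background pixels become 0."""
    return [[255 if p == BACKGROUND else 0 for p in row] for row in img]


def count_errors(img, result):
    """Count (tp, fp, fn) pixels.

    img    -- ground truth image as rows of pixel values (255, 155, 100, 0)
    result -- rows of booleans, True where a method detected ruling
    """
    tp = fp = fn = 0
    for img_row, res_row in zip(img, result):
        for p, detected in zip(img_row, res_row):
            is_line = p == RULING  # linImg = img == 155
            is_text = p < RULING   # txtImg = img < 155 (values 100 and 0)
            if is_text:
                continue  # text pixels are excluded from all three counts
            if detected and is_line:
                tp += 1
            elif detected:
                fp += 1  # assumption: detected, but neither ruling nor text
            elif is_line:
                fn += 1
    return tp, fp, fn


# Hand-made 3x4 example, not taken from the dataset
img = [
    [255, 255, 255, 255],
    [155, 155, 0, 155],
    [255, 100, 100, 255],
]
result = [
    [False, True, False, False],  # spurious detection on background
    [True, False, True, True],    # one ruling pixel missed, overlap pixel ignored
    [False, False, False, False],
]

print(binarize(img))
print(count_errors(img, result))
```

In the example, the detection on the overlapping pixel (value 0) is ignored because all three counts exclude text pixels.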

The database ships with a Matlab script that computes the evaluation results once all images have been processed.




Robert Sablatnig, CVL, Vienna University of Technology



[1] N. Stamatopoulos, B. Gatos, G. Louloudis, U. Pal and A. Alaei. ICDAR 2013 Handwriting Segmentation Contest. In Proceedings of the 12th International Conference on Document Analysis and Recognition, pages 1402-1406, 2013.
[2] Markus Diem, Florian Kleber and Robert Sablatnig. Ruling Analysis and Classification of Torn Documents. In ACM Symposium on Document Engineering, Colorado, USA, pages 63-72, 2014.