Emilio Granell, Carlos D. Martínez-Hinarejos. The Rodrigo corpus (Version 1.0.0). Zenodo 2018. Data set

The Rodrigo corpus was obtained from the digitisation of the book “Historia de España del arçobispo Don Rodrigo”, written in ancient Spanish in 1545. It is a single-writer book in which most pages consist of a single block of well-separated lines of calligraphic text. The dataset is freely available for research purposes. It contains 15,010 text-line images with their paleographic transcriptions, divided into three partitions: 9,000 text lines for training, 1,000 for validation, and 5,010 for testing.
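As a quick sanity check on the partition sizes stated above, the following minimal sketch confirms that the three partitions add up to the full corpus and reports each partition's share. The names `PARTITION_SIZES`, `TOTAL_LINES`, and `partition_fractions` are hypothetical and chosen for illustration only; they do not reflect the file layout or any tooling shipped with the corpus.

```python
# Partition sizes as stated in the dataset description.
# The dictionary keys are illustrative labels, not official file or folder names.
PARTITION_SIZES = {"train": 9000, "validation": 1000, "test": 5010}
TOTAL_LINES = 15010  # total number of text-line images stated in the description


def partition_fractions(sizes):
    """Return each partition's share of the total number of text lines."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


# The three partitions should account for every text line in the corpus.
assert sum(PARTITION_SIZES.values()) == TOTAL_LINES

for name, share in partition_fractions(PARTITION_SIZES).items():
    print(f"{name}: {share:.1%}")
```

The split works out to roughly 60% training, 7% validation, and 33% testing, which may be useful when comparing results against corpora with a different train/test balance.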