Libraries and archives hold huge numbers of documents that are enormously valuable, not only as masterpieces but also for their contents. These historical handwritten documents are an important source of cultural knowledge accumulated over the centuries, with information useful to researchers, businesses, educational organizations and the public at large. In recent times, considerable effort has been devoted to digitizing large numbers of these historical documents. Once the documents are digitized, one of the most important challenges in the near future is to facilitate access to their contents. Any system that aims to ease access to this information must take the following considerations into account: the heterogeneity of the different kinds of documents, the large scale of the data, cross-referencing and user ubiquity.
The SearchInDocs project takes a step in this direction by studying the aspects mentioned above in the processing of ancient documents. Although different document collections will be considered during the project, SearchInDocs will focus on a particular scenario based on historical social networks. Ancient manuscript documents will be at the heart of the research in SearchInDocs. The complexity of the issues described above requires efficient solutions from different research areas: document image analysis, recognition of handwritten documents, document augmentation and multimodal systems.
SearchInDocs will pay special attention to user interaction in order to improve the performance of the developed system. Therefore, techniques for the interactive recognition of ancient documents will be studied.
SearchInDocs is a coordinated project with two subprojects: “Search in Transcribed Manuscripts and Document Augmentation” (STraDA) and “Contextual Recognition of Ancient Documents” (Co-READ).
STraDA (TIN2012-37475-C02-01) aims to develop word spotting techniques for transcriptions of ancient handwritten texts. This subproject will also explore document augmentation techniques for historical handwritten documents.
Co-READ (TIN2012-37475-C02-02) will study information spotting techniques to search for words and shapes in ancient document images. In addition, contextual search in documents will be investigated. Here, context means the structural and syntactic relations between basic components (words and symbols).
Please send your comments to Joan Andreu Sánchez.