During this project, the card index of the Retrospective Bibliography of Czech Literature, 1775–1945, was fully digitised and made accessible online through the purpose-built RETROBI web software.
The solution developed by the CLB comprises tools for processing the scanned cards and web software for displaying the data. The digitisation tools were used to scan the index cards and to prepare the images for the web software: empty images were deleted, and entries spanning two or more card sides were combined into single database units. CLB staff then linked the data to the OCR transcripts of the individual images to enable full-text searching.
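The preparation steps described above (discarding empty images, merging the sides of one card into a single unit and attaching the OCR text) can be illustrated with a minimal Java sketch. It is not taken from the RETROBI source code: the class names `ScannedSide` and `CardUnit`, the `blank` flag and the assumption that sides arrive in scan order are all hypothetical and serve only to show the idea.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: hypothetical names, not the RETROBI data model. */
final class CardGrouping {

    /** One scanned image, i.e. a single side of an index card. */
    static final class ScannedSide {
        final String cardId;
        final int sideNumber;
        final String ocrText;
        final boolean blank; // assumed to be set by an earlier empty-image check

        ScannedSide(String cardId, int sideNumber, String ocrText, boolean blank) {
            this.cardId = cardId;
            this.sideNumber = sideNumber;
            this.ocrText = ocrText;
            this.blank = blank;
        }
    }

    /** One database unit: the non-blank sides of a card and their joined OCR text. */
    static final class CardUnit {
        final String cardId;
        final List<Integer> sides;
        final String fullText;

        CardUnit(String cardId, List<Integer> sides, String fullText) {
            this.cardId = cardId;
            this.sides = sides;
            this.fullText = fullText;
        }
    }

    /** Drops blank images and merges the remaining sides of each card into one unit. */
    static List<CardUnit> group(List<ScannedSide> scans) {
        // LinkedHashMap keeps the cards in the order in which they were scanned.
        Map<String, List<ScannedSide>> byCard = new LinkedHashMap<>();
        for (ScannedSide side : scans) {
            if (side.blank) {
                continue; // empty images are discarded
            }
            byCard.computeIfAbsent(side.cardId, k -> new ArrayList<>()).add(side);
        }

        List<CardUnit> units = new ArrayList<>();
        for (Map.Entry<String, List<ScannedSide>> entry : byCard.entrySet()) {
            List<Integer> sideNumbers = new ArrayList<>();
            StringBuilder text = new StringBuilder();
            for (ScannedSide side : entry.getValue()) {
                sideNumbers.add(side.sideNumber);
                if (text.length() > 0) {
                    text.append('\n'); // separate the OCR text of consecutive sides
                }
                text.append(side.ocrText);
            }
            units.add(new CardUnit(entry.getKey(), sideNumbers, text.toString()));
        }
        return units;
    }

    public static void main(String[] args) {
        List<ScannedSide> scans = new ArrayList<>();
        scans.add(new ScannedSide("card-1", 1, "front text", false));
        scans.add(new ScannedSide("card-1", 2, "back text", false));
        scans.add(new ScannedSide("card-2", 1, "single side", false));
        scans.add(new ScannedSide("card-2", 2, "", true));
        for (CardUnit unit : group(scans)) {
            System.out.println(unit.cardId + " sides=" + unit.sides);
        }
    }
}
```

Joining the OCR text of all sides into one string per unit is one simple way to make a whole card searchable as a single full-text record; the actual system may index the transcripts differently.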
The digitised cards are made available to web users through this software, which offers a range of supporting features.
Users can search the index cards by specific criteria, assemble their own research datasets and export edited data for use in their own projects. Other tools let them correct and improve the displayed data. Any registered user may transcribe individual cards or correct existing transcriptions, an option particularly useful for handwritten cards and for cards too illegible for usable OCR conversion. Users with higher privileges may add a feature to a defined set of cards or authorise its conversion to a structured database format. Newly added data are immediately available for structured searches. By late 2020, some 15,000 cards had been transcribed in this way, and sub-features had been added semi-automatically to over one million cards.
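The tiered editing rights described above can be pictured as a small role-to-action mapping. The following Java sketch is an assumption made for illustration, not the RETROBI permission model: the role names (`GUEST`, `REGISTERED`, `EDITOR`), the action names and the rule that unregistered visitors may search are all hypothetical.

```java
import java.util.EnumSet;
import java.util.Set;

/** Illustrative only: hypothetical roles and actions, not the RETROBI permission model. */
final class EditingPermissions {

    enum Role { GUEST, REGISTERED, EDITOR }

    enum Action { SEARCH, TRANSCRIBE, BATCH_ANNOTATE }

    /** Returns the actions a role may perform; each higher role adds to the one below it. */
    static Set<Action> allowed(Role role) {
        switch (role) {
            case EDITOR:
                // privileged users may also add a feature to a whole set of cards
                return EnumSet.allOf(Action.class);
            case REGISTERED:
                // any registered user may transcribe or correct a transcription
                return EnumSet.of(Action.SEARCH, Action.TRANSCRIBE);
            default:
                // assumption: unregistered visitors can only search
                return EnumSet.of(Action.SEARCH);
        }
    }

    static boolean can(Role role, Action action) {
        return allowed(role).contains(action);
    }

    public static void main(String[] args) {
        for (Role role : Role.values()) {
            System.out.println(role + " -> " + allowed(role));
        }
    }
}
```

Keeping the mapping in one method makes it easy to see at a glance which role may change which data, which matters when corrections made by users become searchable immediately.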
The solution also includes a robust administration system to ensure data integrity. The system supports a range of data operations (adding new cards or image files, transferring cards and integrated series within the card index, etc.).
The entire system is now published under an AGPL licence. The code is written in Java and uses the NoSQL database CouchDB. The source code is available here (.zip file).
More detailed information about the RETROBI system and its technical features can be found here.
The RETROBI system has been used in other ICL card index catalogues. Other institutions have successfully adopted it to make their card indexes web accessible.
The RETROBI software can be found in these resources:
This project was undertaken between 2009 and 2011 with the support of the Czech Ministry of Education, Youth and Sports under the Information Resources for Research (INFOZ) programme.
|Title:||Digitization of the Retrospective Article Bibliography of Czech Literary Studies (1775–1945) card index catalogue|
|Duration:||1 October 2009 – 31 December 2011|
|Investigator:||Mgr. Vojtěch Malínek, Ph.D.|
|Investigating team:||Vojtěch Malínek, Jan Anděl, Daniel Řehák, Petra Černohorská, Pavel Mika|