DATeCH 2019: Brussels, Belgium
- Proceedings of the 3rd International Conference on Digital Access to Textual Cultural Heritage, DATeCH 2019, Brussels, Belgium, May 08-10, 2019. ACM 2019, ISBN 978-1-4503-7194-0
Evaluation and improvement of OCR
- Matthias Boenig, Konstantin Baierer, Volker Hartmann, Maria Federbusch, Clemens Neudecker:
Labelling OCR Ground Truth for Usage in Repositories. 3-8
- Anna-Maria Sichani, Panagiotis Kaddas, Georgios K. Mikros, Basilis Gatos:
OCR for Greek polytonic (multi accent) historical printed documents: development, optimization and quality control. 9-13
- Hsiang-An Wang, Pin-Ting Liu:
Towards a Higher Accuracy of Optical Character Recognition of Chinese Rare Books in Making Use of Text Model. 15-18
- Tobias Englmeier, Florian Fink, Klaus U. Schulz:
A-I-PoCoTo: Combining Automated and Interactive OCR Postcorrection. 19-24
Applications
- Emad Mohamed, Zeeshan Ali Sayyed:
Arabic-SOS: Segmentation, Stemming, and Orthography Standardization for Classical and pre-Modern Standard Arabic. 27-32
- Christian Reul, Sebastian Göttel, Uwe Springmann, Christoph Wick, Kay-Michael Würzner, Frank Puppe:
Automatic Semantic Text Tagging on Historical Lexica by Combining OCR and Typography Classification: A Case Study on Daniel Sander's Wörterbuch der Deutschen Sprache. 33-38
- Juri Opitz, Leo Born, Vivi Nastase, Yannick Pultar:
Automatic Reconstruction of Emperor Itineraries from the Regesta Imperii. 39-44
- Karin Hofmeester, Ashkan Ashkpour, Katrien Depuydt, Jesse de Does:
Diamonds in Borneo: Commodities as Concepts in Context. 45-50
OCR and HTR in practise
- Clemens Neudecker, Konstantin Baierer, Maria Federbusch, Matthias Boenig, Kay-Michael Würzner, Volker Hartmann, Elisa Herrmann:
OCR-D: An end-to-end open source OCR framework for historical printed documents. 53-58
- Kimmo Kettunen, Teemu Ruokolainen, Erno Liukkonen, Pierrick Tranouez, Daniel Antelme, Thierry Paquet:
Detecting Articles in a Digitized Finnish Historical Newspaper Collection 1771-1929: Early Results Using the PIVAJ Software. 59-64
- Christian Clausner, Apostolos Antonacopoulos, Christy Henshaw, Justin Hayes:
Towards the Extraction of Statistical Information from Digitised Numerical Tables: The Medical Officer of Health Reports Scoping Study. 65-71
- Arnau Baró, Jialuo Chen, Alicia Fornés, Beáta Megyesi:
Towards a Generic Unsupervised Method for Transcription of Encoded Manuscripts. 73-78
Digitisation of historical languages
- Bruno Bon, Krzysztof Nowak, Laura Vangone:
Challenges of Mass OCR-isation of Medieval Latin Texts in a Resource-Limited Project. 81-85
- Eliese-Sophia Lincke, Kirill Bulert, Marco Büchler:
Optical Character Recognition for Coptic fonts: A multi-source approach for scholarly editions. 87-91
- Thomas Milo, Alicia González Martínez:
A New Strategy for Arabic OCR: Archigraphemes, Letter Blocks, Script Grammar, and shape synthesis. 93-96
- Senka Drobac, Pekka Kauppinen, Krister Lindén:
Improving OCR of historical newspapers and journals published in Finland. 97-102
Access to data
- Anne Gorter, Rutger van Koert, Ismee Tames, Edwin Klijn, Marielle Scherer:
From Tribunal Archive to Digital Research Facility (TRIADO): Exploring ways to make archives accessible and useable. 105-110
- Tom Derrick, Nora McGregor:
Cross-disciplinary Collaborations to Enrich Access to Non-Western Language Material in the Cultural Heritage Sector. 111-116
- Georg Rehm, Martin Lee, Julián Moreno Schneider, Peter Bourgonje:
Curation Technologies for Cultural Heritage Archives: Analysing and transforming a heterogeneous data set into an interactive curation workbench. 117-122
- Evagelos G. Varthis, Marios Poulos, Ilias Yarenis, Sozon Papavlasopoulos:
Implementation of a Databaseless Web REST API for the Unstructured Texts of Migne's Patrologia Graeca with Searching capabilities and additional Semantic and Syntactic expandability. 123-129
Natural language processing
- Helmut Schmid:
Deep Learning-Based Morphological Taggers and Lemmatizers for Annotating Historical Texts. 133-137
- Jeremi K. Ochab, Holger Essler:
Stylometry of literary papyri. 139-142
- Sandra Young:
Using lexicography to characterise relations between species mentions in the biodiversity literature. 143-148
- Giuseppe G. A. Celano:
Standoff Annotation for the Ancient Greek and Latin Dependency Treebank. 149-153
Metadata
- Liviu-Ovidiu Pop:
Hidden Metadata in Plain Sights: Romanian Folklore Catalogues. 157-159
- Péter Király:
Validating 126 million MARC records. 161-168
- Katrien Depuydt, Hennie Brugman:
Turning Digitised Material into a Diachronic Corpus: Metadata Challenges in the Nederlab Project. 169-173