Yesterday at TPDL 2023, Herman Kroll and Mirjam Cuper presented “Aspect-Driven Structuring of Historical Dutch Newspaper Archives”.
The authors discussed the challenges of working with a corpus that has unreliable OCR, inconsistent metadata, and licensing restrictions.
Ref:
doi.org/10.1007/978-...