DH explorations

When machines read texts

Created by Jan Odstrčilík for a lecture at UniWien, 16th December 2025

December 16, 2025

GitHub repository for the lecture: https://github.com/ciceronianus/lecture-when-machines-read

Bluesky of DigiLab IMAFO: https://bsky.app/profile/digilab-imafo.bsky.social


How HTR works (simulators)
Warning: They only simulate how various technologies work.

Retro OCR Simulator
https://pattern-ocr-simulator-626729193090.us-west1.run.app/
  • Made by Jan Odstrčilík

NeuroScript | CRNN Visualizer
https://how-htr-works.vercel.app/
  • Made by Martin Roček

The Algorithmic Scribe
https://how-llm-work.vercel.app/
  • Made by Martin Roček


Overview of ATR/HTR systems
Made for: AI Meets Humanities and Social Sciences conference, 23rd-25th June 2025, Vienna, Austria
https://leaflet.pub/8a9f8f63-8c33-4f6b-926c-c314417a0337

Tools to try out

Images: https://github.com/ciceronianus/lecture-when-machines-read

Latin – Carolingian Minuscule
This model is based on 46 different manuscripts from ca. 800 until ca. 1100, mainly written in Carolingian minuscule. The manuscripts cover different genres such as biblical texts, theological treatises, penitentiary books, histories, mathematical treatises, canon law, capitularies, and many more. A complete description of the model will soon be published by Tim Geelhaar on AMAD.org.
https://www.transkribus.org/en/model/latin-carolingian-minuscule

Google AI Studio
The fastest path from prompt to production with Gemini
https://aistudio.google.com/

SmartSkriptor
https://smartskriptor-859641874612.us-west1.run.app/


Character error rate

Calculation of CER:

$$\mathrm{CER} = \frac{S + D + I}{N}$$

S = Substitutions

D = Deletions

I = Insertions

N = Number of characters in the reference (ground-truth) text
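
The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not necessarily how the CER Simulator or CERberus compute their figures: the function names `cer_counts` and `cer` are hypothetical, the counts come from one minimal-cost Levenshtein alignment, and no normalisation (whitespace, Unicode, case) is applied, which real tools often make configurable.

```python
def cer_counts(reference: str, hypothesis: str) -> tuple[int, int, int]:
    """Return (S, D, I) for one minimal-cost alignment of hypothesis to reference."""
    n, m = len(reference), len(hypothesis)
    # dp[i][j] = (cost, S, D, I) for reference[:i] versus hypothesis[:j]
    dp = [[(0, 0, 0, 0)] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = (i, 0, i, 0)  # every reference character is missing: deletions
    for j in range(1, m + 1):
        dp[0][j] = (j, 0, 0, j)  # every hypothesis character is extra: insertions
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost, s, d, ins = dp[i - 1][j - 1]
            if reference[i - 1] == hypothesis[j - 1]:
                diagonal = (cost, s, d, ins)          # characters match, no error
            else:
                diagonal = (cost + 1, s + 1, d, ins)  # substitution
            cost, s, d, ins = dp[i - 1][j]
            deletion = (cost + 1, s, d + 1, ins)
            cost, s, d, ins = dp[i][j - 1]
            insertion = (cost + 1, s, d, ins + 1)
            dp[i][j] = min(diagonal, deletion, insertion, key=lambda t: t[0])
    _, s, d, ins = dp[n][m]
    return s, d, ins


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: (S + D + I) / N, with N = len(reference)."""
    if not reference:
        raise ValueError("reference text must not be empty")
    s, d, i = cer_counts(reference, hypothesis)
    return (s + d + i) / len(reference)


# Ground truth "minuscule" read by the model as "minuscvle":
# one substitution in nine characters.
print(cer_counts("minuscule", "minuscvle"))
print(cer("minuscule", "minuscvle"))
```

Because N counts the characters of the reference, a transcription with many insertions can have a CER above 1.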

CER Simulator
https://vibe-cer-simulator.vercel.app/
  • Made with AI by Jan Odstrčilík

CERberus -- guardian against character errors
https://cerberus.humanities.tools/
  • For independent calculations of CER

  • Created by Wouter Haverals

  • Instance hosted by Martin Roček


Processing results
Working with PageXML

HTR Reader
https://htr-reader.humanities.tools/
  • Creates a simple edition from PageXML

  • Made by Martin Roček
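
For readers who want to see what working with PageXML involves, here is a minimal Python sketch that pulls the transcribed lines out of a PageXML document. It is not the code behind HTR Reader. The function name `pagexml_lines` and the sample document are made up for illustration, and the sketch assumes the usual TextLine > TextEquiv > Unicode nesting, reads lines in document order (ignoring any ReadingOrder element), and matches tags by local name so that it does not depend on a particular schema version.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def pagexml_lines(xml_text: str) -> list[str]:
    """Return the text of every TextLine, in document order."""
    root = ET.fromstring(xml_text)
    lines = []
    for elem in root.iter():
        if local_name(elem.tag) != "TextLine":
            continue
        # Use only the TextEquiv directly under the line, not those of nested words.
        for child in elem:
            if local_name(child.tag) == "TextEquiv":
                for sub in child:
                    if local_name(sub.tag) == "Unicode":
                        lines.append(sub.text or "")
                break
    return lines


# Hypothetical two-line page, for illustration only.
SAMPLE = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="sample.jpg" imageWidth="1000" imageHeight="1500">
    <TextRegion id="r1">
      <TextLine id="l1"><TextEquiv><Unicode>In principio erat verbum</Unicode></TextEquiv></TextLine>
      <TextLine id="l2"><TextEquiv><Unicode>et verbum erat apud deum</Unicode></TextEquiv></TextLine>
    </TextRegion>
  </Page>
</PcGts>"""

for line in pagexml_lines(SAMPLE):
    print(line)
```

Joining the returned lines with newlines gives a plain reading text; a fuller edition would also use the coordinates and region structure that PageXML stores alongside the text.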


Learning HTR
Annual Winter School at IMAFO, ÖAW

HTR Winter School 2025 | Call for Application!
Handwritten Text Recognition of Historical Sources
https://www.oeaw.ac.at/imafo/veranstaltungen/detail/htr-of-historical-sources

TranscriboQuest

TranscriboQuest 2025 - Sciencesconf.org
https://transcriboquest.sciencesconf.org/


atr
htr
ocr
