Made for: AI Meets Humanities and Social Sciences conference, 23rd-25th June 2025, Vienna, Austria
Author of the overview: Jan Odstrčilík
Last update: 16th December 2025
Drag and drop systems
No configuration, no custom models, intended for processing of single pages.
Transkribus.ai
URL: https://transkribus.ai/
Costs: free
Note: Uses large AI models provided by Transkribus. Suitable only for single pages.
Transkribus public models
URL: https://app.transkribus.org/models/public
Costs: free
Note: Any public model can be tested for free on single pages. Export options are very limited.
Example - Carolingian Minuscule Model
URL: https://app.transkribus.org/models/public/text/51210
Simple systems
Simple installation, no manual creation of ground truth, no training of custom models.
Full systems but without custom model training
Manual creation of ground truth and manual correction of automatically recognized text, but no custom models.
Projekt PERO
URL: https://pero.fit.vutbr.cz/
Costs: free
Warning: No possibility of training custom models.
Full Integrated Transcription Environments
Possibility to train (and use) custom models in the graphical user interface (GUI).
Transkribus
URL: https://www.transkribus.org/
Costs: 50 credits per month for free, otherwise a subscription model
Pros:
very large community
easy to sign up and use
a lot of public models
part of an ecosystem (ScanTent, Transkribus, Transkribus Sites)
Cons:
advanced features (field training, advanced export options, etc.) require a subscription
some solutions could be cheaper
not open-source
not possible to import/export models
eScriptorium
URL: https://escriptorium.inria.fr/
Costs: free, but you need your own infrastructure
Pros:
large community
open-source
suitable for non-Latin scripts
possibility to import/export Kraken models (to be found on HTR-United and Zenodo)
privacy of the data
Cons:
need for own servers/infrastructure
less user-friendly (a new version should appear in 2025)
OCR4All
URL: https://www.ocr4all.org/
Costs: free, but you need your own infrastructure
Source of the image: https://www.ocr4all.org/about/ocr4all
Calfa Vision
URL: https://vision.calfa.fr/
Costs: 3500 euros per 3500 pages; subsequent pages are cheaper
Focus: scripts not based on the Latin alphabet
Scribblesense
URL: https://scribblesense.cz/
Costs: currently free
From the same developers as Projekt PERO.
Command-line tools
Programming skills required.
Kraken
URL: https://kraken.re/main/index.html
Costs: free, but you need your own infrastructure.
Note: Used by eScriptorium.
Other similar command-line tools: PyLaia, Tesseract, Calamari.
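To give a sense of what "programming skills required" means in practice, here is a minimal Python sketch that wraps a Kraken command-line run for a single page. It assumes the invocation shape shown in the Kraken documentation (kraken -i IMAGE OUTPUT segment -bl ocr -m MODEL); options may differ between Kraken versions. The file names page.png and model.mlmodel, and the helper names build_kraken_command and transcribe, are hypothetical and only serve the illustration.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional


def build_kraken_command(image: str, output: str, model: str) -> List[str]:
    """Build the argument list for one segment-then-recognize Kraken run.

    Assumed CLI shape (check `kraken --help` for your installed version):
        kraken -i IMAGE OUTPUT segment -bl ocr -m MODEL
    """
    return ["kraken", "-i", image, output, "segment", "-bl", "ocr", "-m", model]


def transcribe(image: str, model: str) -> Optional[Path]:
    """Run Kraken on one page image and return the path of the text output.

    Returns None (and prints the command) when Kraken is not installed.
    """
    output = Path(image).with_suffix(".txt")
    command = build_kraken_command(image, str(output), model)
    if shutil.which("kraken") is None:
        print("kraken not found on PATH; would run:", " ".join(command))
        return None
    # check=True raises CalledProcessError if Kraken exits with an error.
    subprocess.run(command, check=True)
    return output


if __name__ == "__main__":
    # Hypothetical file names; replace with a real page image and model file.
    transcribe("page.png", "model.mlmodel")
```

Building the argument list separately from running it makes it easy to inspect the exact command before processing a whole folder of pages.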
Visual language models
Uses Gemini 3
Made by Anna Michalcová