Optimizing Crowdsourced Transcription using Handwritten Text Recognition


ACLS Digital Extension Grants


History, University Libraries


This project explores the use of machine learning within online crowdsourced text transcription projects. In this collaboration, University of Minnesota researchers will train a machine-learning model for handwritten text recognition (HTR) using data from Anti-Slavery Manuscripts, a crowdsourced transcription project hosted on Zooniverse. Developers at the Adler Planetarium will create a new online workflow that combines machine-generated transcriptions with crowdsourced effort, using existing Zooniverse tools for collaborative text transcription. The Adler and Minnesota teams will test the HTR model on similar datasets from Minnesota's Archives & Special Collections. The outputs of this effort will be a data pipeline for uploading machine transcription data into the Zooniverse platform and an evaluation of best practices for combining human and machine effort in the production of high-quality transcription data.