2020
Darryl Wright
- Research Associate
- University of Minnesota, Twin Cities
Abstract
This project explores the use of machine learning within online crowdsourced text transcription projects. In this collaboration, University of Minnesota researchers will train a machine-learning model for handwritten text recognition (HTR) using data from Anti-Slavery Manuscripts, a crowdsourced transcription project hosted on Zooniverse.org. Zooniverse developers at the Adler Planetarium will create a new online workflow that combines machine-generated transcriptions with crowdsourced effort, using existing Zooniverse tools for collaborative text transcription. The Adler and Minnesota teams will then test the HTR model on similar datasets from Minnesota's Archives & Special Collections. This effort will produce two outputs: a data pipeline for uploading machine transcription data into the Zooniverse platform, and an evaluation of best practices for combining human and machine effort to produce high-quality transcription data.