Program

ACLS Digital Justice Seed Grants, 2022

Project

The Personal Writes the Political: Rendering Black Lives Legible Through the Application of Machine Learning to Anti-Apartheid Solidarity Letters

Department

History

Abstract

Project Narrative - This project uses machine learning (ML) models to extract data from an archive of anti-apartheid solidarity letters predominantly written by Black South African women. It will use newly developed optical character recognition (OCR) and handwritten text recognition (HTR) methods to render images of handwritten letters into machine-readable text. Once the letters are processed, we will train custom ML models to produce triplets, meaning two or more nouns linked by a verb to indicate a qualitative relationship between two categories of data. A knowledge base derived from these entity triplets will permit us to better understand the lives, struggles, and contributions of Black women in South Africa by collecting data on relations embedded in their own words.

Abstract

The Personal Writes the Political (PWP) is a digital humanities project that applies advanced machine learning (ML) models to anti-apartheid solidarity letters predominantly authored by Black South African women. We created a software pipeline called Careful Recall (Caracal), which automates the transcription of handwritten materials into machine-readable text and conducts named entity recognition to distinguish private identifying information from relevant research data in highly sensitive handwritten archives. We will use Caracal to extract modest datasets from selections of thematically united letters that we call focal clusters. In addition, we will conduct skills transfer to this collection's home archive, the Mayibuye Centre Archive.