Full Citation
Title: Link-Lives, Historical Big Data: Reconstructing Millions of Life Courses from Archival Records Using Domain Experts and Machine Learning
Citation Type: Miscellaneous
Publication Year: 2021
ISBN:
ISSN:
DOI:
NSFID:
PMCID:
PMID:
Abstract: The Danish archives comprise some of the world's most comprehensive source coverage, but despite large-scale digitization and transcription projects by diverse actors, there are no shared standards or possibilities for data linkage. The Denmark-based Link-Lives research project (2019-2024) is tackling this disparity by linking individual-level Danish records in census and parish record sources from 1787-1968 to create a multigenerational database for research using a combination of domain expertise and machine learning techniques. In contrast to small-sample linking or fully automated processes, Link-Lives is creating its own manually linked data to train machine learning as well as exploring the impacts of different approaches to linking. Due to personal data protection legislation and proprietary agreements, the data cannot be fully open access, but data outputs will be made available to both researchers and the general public via a website. The project's interdisciplinary team is based at the Danish National Archives and the University of Copenhagen, in partnership with Copenhagen City Archives, and funded by Carlsberg and Innovation Fund Denmark.
Url: http://ceur-ws.org/Vol-3019/LinkedArchives_2021_paper_9.pdf
User Submitted?: No
Authors: Revuelta-Eugercios, Bárbara A.; Robinson, Olivia; Løkke, Anne
Publisher:
Data Collections: IPUMS USA
Topics: Population Data Science
Countries: