Full Citation
Title: The problem of false positives in automated census linking: Nineteenth-century New York’s Irish immigrants as a case study
Citation Type: Journal Article
Publication Year: 2024
ISBN:
ISSN: 1940-1906
DOI: 10.1080/01615440.2024.2312293
NSFID:
PMCID:
PMID:
Abstract: Automated census linkage algorithms have become popular for generating longitudinal data on social mobility, especially for immigrants and their children. But what if these algorithms are particularly bad at tracking immigrants? This study utilizes a database on nineteenth-century Irish immigrants, generated from the most widely used algorithms, created by Abramitzky, Boustan, and Eriksson (ABE). Our objective is to assess the extent to which different individuals are erroneously linked together across census years and the consequences of these “false positives” for calculating social mobility. Our findings raise serious questions about the quality of the matches generated by the “first generation” of automated census linkage algorithms. False positives range from about one-third to one-half of all links. These bad links lead to sizeable estimation errors when measuring Irish immigrant social and geographic mobility.
Url: https://www.tandfonline.com/doi/abs/10.1080/01615440.2024.2312293
User Submitted?: No
Authors: Ó Gráda, Cormac; Anbinder, Tyler; Connor, Dylan; Wegge, Simone A.
Periodical (Full): Historical Methods: A Journal of Quantitative and Interdisciplinary History
Issue:
Volume:
Pages: 1-21
Data Collections: IPUMS USA - Ancestry Full Count Data
Topics: Migration and Immigration, Population Data Science
Countries: