Full Citation
Title: Simple strategies for improving inference with linked data: a case study of the 1850–1930 IPUMS linked representative historical samples
Citation Type: Journal Article
Publication Year: 2020
ISSN: 0161-5440
DOI: 10.1080/01615440.2019.1630343
Abstract: New large-scale linked data are revolutionizing quantitative history and demography. This paper proposes two complementary strategies for improving inference with linked historical data: the use of validation variables to identify higher-quality links and a simple, regression-based weighting procedure to increase the representativeness of custom research samples. We demonstrate the potential value of these strategies using the 1850–1930 Integrated Public Use Microdata Series Linked Representative Samples (IPUMS-LRS), a high-quality, publicly available linked historical dataset. We show that, while incorrect linking rates appear low in the IPUMS-LRS, researchers can reduce error rates further using validation variables. We also show how researchers can reweight linked samples to balance observed characteristics in the linked sample with those in a reference population using a simple regression-based procedure.
URL: https://www.tandfonline.com/doi/full/10.1080/01615440.2019.1630343
User Submitted?: No
Authors: Bailey, Martha; Cole, Connor; Massey, Catherine
Periodical (Full): Historical Methods: A Journal of Quantitative and Interdisciplinary History
Issue: 2
Volume: 53
Pages: 80-93
Data Collections: IPUMS USA - Ancestry Full Count Data
Topics: Methodology and Data Collection, Other