BIBLIOGRAPHY

Publications, working papers, and other research using data resources from IPUMS.

Full Citation

Title: Predicting Death Using Random Forests

Citation Type: Conference Paper

Publication Year: 2019

Abstract: Machine learning methods have become very popular across scientific disciplines. Using Breiman’s random forests and data from the National Health Interview Survey (NHIS) and its mortality follow-up, we asked: 1) Can these methods predict the occurrence of death? 2) Which variables are important for these predictions? We assessed the accuracy of the forests by estimating the area under the ROC curve (AUC) on test data and found that they perform relatively well, with AUCs from 0.83 to 0.87. To indicate the predictive power of each variable, we estimated the mean decrease in accuracy (MDA). Not surprisingly, “age” is by far the most predictive variable, followed by “mobility limitations” and “self-rated health”. Typical sociodemographic mortality determinants such as “sex”, “education”, and “income” appear to have very weak predictive ability in each of the six selected intervals.

Url: http://paa2019.populationassociation.org/uploads/191620

User Submitted?: No

Authors: Sauer, Torsten; Rau, Roland

Conference Name: PAA 2019

Publisher Location: Austin, TX

Data Collections: IPUMS Health Surveys - NHIS

Topics: Fertility and Mortality, Health

Countries: United States
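
Illustration: The abstract relies on two evaluation quantities, the area under the ROC curve (AUC) and the mean decrease in accuracy (MDA). The sketch below shows one simple way to compute each in plain Python. It is not the authors' code. The rank-based AUC formula is standard, but the importance function is a simplified permutation variant evaluated on held-out rows rather than Breiman's out-of-bag procedure, and the function names, the toy rows, and the threshold rule standing in for a fitted forest are all hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, preds):
    """Fraction of predictions that match the labels."""
    return sum(1 for y, p in zip(labels, preds) if y == p) / len(labels)


def mean_decrease_in_accuracy(predict, rows, labels, column,
                              n_repeats=20, seed=0):
    """Average accuracy lost when one column is randomly shuffled.

    Simplified, assumption-laden variant: shuffles the column across the
    given rows instead of using per-tree out-of-bag samples.
    """
    rng = random.Random(seed)
    baseline = accuracy(labels, [predict(r) for r in rows])
    drops = []
    for _ in range(n_repeats):
        values = [r[column] for r in rows]
        rng.shuffle(values)
        permuted = [dict(r, **{column: v}) for r, v in zip(rows, values)]
        permuted_acc = accuracy(labels, [predict(r) for r in permuted])
        drops.append(baseline - permuted_acc)
    return sum(drops) / len(drops)


# Hypothetical toy data: "age" drives the outcome, "noise" does not.
ages = [30, 40, 50, 60, 70, 80, 90, 85, 75, 45]
rows = [{"age": a, "noise": i % 3} for i, a in enumerate(ages)]
labels = [1 if a >= 70 else 0 for a in ages]


def toy_predict(row):
    """Stand-in for a fitted model: predicts death (1) at age 70 or older."""
    return 1 if row["age"] >= 70 else 0


age_mda = mean_decrease_in_accuracy(toy_predict, rows, labels, "age")
noise_mda = mean_decrease_in_accuracy(toy_predict, rows, labels, "noise")
toy_auc = auc(labels, [r["age"] for r in rows])
```

In this toy setup, shuffling "noise" cannot change any prediction, so its MDA is exactly zero, while shuffling "age" lowers accuracy. That mirrors the contrast the abstract draws between strong and weak predictors.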