Full Citation
Title: Predicting Death Using Random Forests
Citation Type: Conference Paper
Publication Year: 2019
ISBN:
ISSN:
DOI:
NSFID:
PMCID:
PMID:
Abstract: Machine learning methods have become very popular in various scientific disciplines. Using Breiman’s random forests and data from the National Health Interview Survey (NHIS) and its mortality follow-up, we asked two questions: 1) Can these methods be used to predict the occurrence of death? 2) Which variables are important for these predictions? We checked the accuracy of the forests by estimating the area under the ROC curve (AUC) on test data and showed that they perform relatively well, with AUCs ranging from 0.83 to 0.87. To indicate the predictive power of each variable, we estimated the mean decrease in accuracy (MDA). Not surprisingly, “age” is by far the most predictive variable, followed by “mobility limitations” and “self-rated health”. Typical sociodemographic mortality determinants such as “sex”, “education”, and “income” appear to have very weak predictive ability in each of the six selected intervals.
Url: http://paa2019.populationassociation.org/uploads/191620
User Submitted?: No
Authors: Sauer, Torsten; Rau, Roland
Conference Name: PAA 2019
Publisher Location: Austin, TX
Data Collections: IPUMS Health Surveys - NHIS
Topics: Fertility and Mortality, Health
Countries: United States