Full Citation
Title: Continuously Updated Data Analysis Systems
Citation Type: Miscellaneous
Publication Year: 2019
ISBN:
ISSN:
DOI:
NSFID:
PMCID:
PMID:
Abstract: When doing data science, it’s important to know what you’re building. This paper describes an idealized final product of a data science project, called a Continuously Updated Data-Analysis System (CUDAS). The CUDAS concept synthesizes ideas from a range of successful data science projects, such as Nate Silver’s FiveThirtyEight. A CUDAS can be built for any context, such as the state of the economy, the state of the climate, and so on. To demonstrate, we build two CUDAS systems. The first provides continuously-updated ratings for soccer players, based on the newly developed Augmented Adjusted Plus-Minus statistic. The second creates a large dataset of synthetic ecosystems, which is used for agent-based modeling of infectious diseases.
Url: https://arxiv.org/pdf/1907.09333.pdf
User Submitted?: No
Authors: Richardson, Lee F.
Publisher: Carnegie Mellon University
Data Collections: IPUMS International
Topics: Health, Methodology and Data Collection, Population Data Science
Countries: