Title: Learning from Data: The Two Cultures
Adji Bousso Dieng
Date: July 12, 2021
ABSTRACT
In his influential 2001 paper Statistical Modeling: The Two Cultures, Leo Breiman identified and contrasted two approaches to statistical modeling: one that assumes a probabilistic model generates the data--the data modeling culture--and another that focuses on mapping inputs to outputs through a black box--the algorithmic modeling culture. Twenty years later, a growing community of researchers works on methodologies that embrace both cultures. However, when we look at the broader problem of learning from data, to which statistical modeling is one approach, we can identify two cultures held by two separate communities. The first is the statistical modeling culture itself, which starts with a question and/or data. The second, which is driving many of the AI breakthroughs, is the task modeling culture, which takes a task-first approach. We revisit Breiman’s take on statis