Blackwell Seminar: Simpler Machine Learning Models for a Complicated World
Seminar presented by Cynthia Rudin
While machine learning has trended towards more complicated (black box) models, such models have not shown any performance advantage on many real-world datasets, and they are more difficult to troubleshoot and use. For these datasets, simpler models (sometimes small enough to fit on an index card) can be just as accurate. However, the design of interpretable models is quite challenging due to the "interaction bottleneck" where domain experts must interact with machine learning algorithms.
I will present a new paradigm for interpretable machine learning that solves the interaction bottleneck. In this paradigm, machine learning algorithms are not focused on finding a single optimal model, but instead capture the full collection of good (i.e., low-loss) models, which we call "the Rashomon set." Finding Rashomon sets is extremely computationally difficult, but the benefits are massive. I will present TreeFARMS, the first algorithm for finding Rashomon sets for a nontrivial function class (sparse decision trees). TreeFARMS, along with its user interface TimberTrek, mitigates the interaction bottleneck for users. TreeFARMS also allows users to incorporate constraints (such as fairness constraints) easily.
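As a rough illustration of the Rashomon set definition above (and not of TreeFARMS itself), the sketch below enumerates every model in a tiny, hypothetical class of one-split classifiers and keeps those whose loss is within a tolerance of the best one. The toy data, the `stump` model class, and the `epsilon` tolerance are all assumptions made for the example; real Rashomon sets over sparse decision trees are far too large for brute-force enumeration like this.

```python
from itertools import product

# Toy data: (feature vector, label). Purely illustrative, not from the talk.
DATA = [
    ((0, 0), 0),
    ((0, 1), 0),
    ((1, 0), 1),
    ((1, 1), 1),
    ((1, 1), 0),  # noisy point, so no one-split model is perfect
    ((0, 1), 1),  # another noisy point
]


def stump(feature, polarity):
    """Hypothetical one-split model: predict x[feature], flipped if polarity is 0."""
    return lambda x: x[feature] if polarity == 1 else 1 - x[feature]


def loss(model, data):
    """Fraction of misclassified points (0-1 loss)."""
    return sum(model(x) != y for x, y in data) / len(data)


def rashomon_set(data, epsilon):
    """Return {(feature, polarity): loss} for all stumps within epsilon of the best loss."""
    candidates = {
        (f, p): loss(stump(f, p), data)
        for f, p in product(range(2), (0, 1))
    }
    best = min(candidates.values())
    return {k: v for k, v in candidates.items() if v <= best + epsilon}


if __name__ == "__main__":
    for eps in (0.0, 0.2):
        print(eps, sorted(rashomon_set(DATA, eps)))
```

With `epsilon = 0.0` only the single best model survives; loosening the tolerance admits more models, which is the sense in which a Rashomon set captures many good models rather than one optimum.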
I will also present a "path," that is, a mathematical explanation, for the existence of simpler-yet-accurate models and the circumstances under which they arise. In particular, problems where the outcome is uncertain tend to admit large Rashomon sets and simpler models. Hence, the Rashomon set can shed light on the existence of simpler models for many real-world high-stakes decisions. This conclusion has significant policy implications, as it undermines the main reason for using black box models for decisions that deeply affect people's lives.
This is joint work with my colleagues Margo Seltzer and Ron Parr, as well as our exceptional students Chudi Zhong, Lesia Semenova, Jiachang Liu, Zack Boner, Rui Xin, Zhi Chen, and Harry Chen. It builds upon the work of many past students and collaborators over the last decade.
Here are papers I will discuss in the talk:
Cynthia Rudin, Chudi Zhong, Lesia Semenova, Margo Seltzer, Ronald Parr, Jiachang Liu, Srikar Katta, Jon Donnelly, Harry Chen, Zachery Boner. Amazing Things Come From Having Many Good Models. ICML (spotlight), 2024.
Rui Xin, Chudi Zhong, Zhi Chen, Takuya Takagi, Margo Seltzer, Cynthia Rudin. Exploring the Whole Rashomon Set of Sparse Decision Trees. NeurIPS (oral), 2022.
Zijie J. Wang, Chudi Zhong, Rui Xin, Takuya Takagi, Zhi Chen, Duen Horng Chau, Cynthia Rudin, Margo Seltzer. TimberTrek: Exploring and Curating Sparse Decision Trees with Interactive Visualization. IEEE VIS, 2022.
Lesia Semenova, Cynthia Rudin, Ron Parr. On the Existence of Simpler Machine Learning Models. ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT), 2022.
Lesia Semenova, Harry Chen, Ronald Parr, Cynthia Rudin. A Path to Simpler Models Starts With Noise. NeurIPS, 2023.
Zachery Boner, Lesia Semenova, Harry Chen, Cynthia Rudin, Ronald Parr. Using Noise to Infer Aspects of Simplicity Without Learning. NeurIPS, 2024.
Jiachang Liu, Chudi Zhong, Boxuan Li, Margo Seltzer, Cynthia Rudin. FasterRisk: Fast and Accurate Interpretable Risk Scores. NeurIPS, 2022.
Blackwell Seminar:
The Blackwell Seminar was established in 2020 to honor the contributions of David Blackwell to mathematical statistics, game theory, probability theory, and information theory.
David Blackwell had an extraordinary career and overcame numerous obstacles in the process. After receiving his Ph.D. in mathematics in 1941 at the age of 22 from the University of Illinois at Urbana-Champaign, he spent a year at the Institute for Advanced Study at Princeton. During this time, Jerzy Neyman interviewed him for a position at the University of California (UC), Berkeley Department of Mathematics. Prof. Neyman was in favor of offering Prof. Blackwell a faculty position, and his department agreed. However, the department head’s spouse objected based on Blackwell’s race, and so the department did not extend him a job offer.
In 1942, Prof. Blackwell left Princeton and went on to teach at three Historically Black Colleges and Universities (HBCUs), namely Southern University, Clark Atlanta University, and Howard University. Prof. Blackwell remained at Howard for 10 years, where he was quickly promoted to full professor and head of the Department of Mathematics.
In 1954, Prof. Blackwell moved to Berkeley after Jerzy Neyman recruited him, this time successfully, as the first hire in the newly formed Department of Statistics. There, he became the first tenured Black professor in the UC system. During his nearly 35 years at Berkeley, he supervised over 50 doctoral students and served as both department chair and associate dean. He also broke racial barriers in 1965 by becoming the first Black scholar elected to the National Academy of Sciences.
The Blackwell Seminar focuses on topics at the interface between statistics and equity issues such as racial disparities, social justice, and ethics. Speakers are invited based on their contributions to these areas. The seminar takes place each year during the fall quarter.
More information about the life and career of David Blackwell can be found in the oral history conducted by Nadine Wilmot in 2002 and 2003.
