Lecture 8: Modern Decision Trees

From Hard Splits to Differentiable Soft Trees

Overview

In this lecture, we move from classical decision trees to modern perspectives:

  • Why axis-aligned splits can be limiting, and how depth affects overfitting
  • Practical strategies for class imbalance (resampling vs. class weighting)
  • Differentiable (soft) decision trees that enable gradient-based learning (see the soft-routing sketch after this list)
  • How to evaluate beyond accuracy with AUC/PR-AUC, calibration, and confusion matrices

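As a preview of the soft-tree idea, the sketch below routes each example through a small complete binary tree with sigmoid gates, where a temperature parameter controls how sharp the routing is (low temperature approaches hard splits). This is a minimal PyTorch illustration only; the class name SoftTree, the depth-2 tree, and the random inputs are assumptions for this example, not the lecture's reference implementation.

```python
# Minimal soft decision tree sketch (illustrative; names and sizes are assumptions).
import torch
import torch.nn as nn

class SoftTree(nn.Module):
    """A complete binary soft decision tree with sigmoid (temperature-scaled) routing."""
    def __init__(self, n_features, n_classes, depth=2, temperature=1.0):
        super().__init__()
        self.depth = depth
        self.temperature = temperature
        n_inner = 2 ** depth - 1                 # routing (inner) nodes
        n_leaves = 2 ** depth                    # leaf nodes
        # Each gate is an oblique hyperplane over all features (not axis-aligned).
        self.gates = nn.Linear(n_features, n_inner)
        self.leaf_logits = nn.Parameter(0.1 * torch.randn(n_leaves, n_classes))

    def forward(self, x):
        # Probability of routing right at every inner node; small temperature -> sharper,
        # more "hard-split-like" routing.
        right = torch.sigmoid(self.gates(x) / self.temperature)   # (batch, n_inner)
        path = torch.ones(x.size(0), 1, device=x.device)          # prob. of reaching each node
        for d in range(self.depth):
            start = 2 ** d - 1                                     # first inner node at depth d
            gate = right[:, start:start + 2 ** d].unsqueeze(-1)    # (batch, 2**d, 1)
            path = path.unsqueeze(-1)                              # (batch, 2**d, 1)
            # Each current path splits into its left child (1 - gate) and right child (gate).
            path = torch.cat([path * (1 - gate), path * gate], dim=-1).flatten(1)
        # Mixture of leaf class distributions, weighted by leaf-arrival probabilities.
        return path @ torch.softmax(self.leaf_logits, dim=-1)     # (batch, n_classes)

# Lower temperature -> routing closer to a conventional hard tree.
model = SoftTree(n_features=10, n_classes=2, depth=2, temperature=0.5)
probs = model(torch.randn(4, 10))   # (4, 2) class probabilities
```

Because every operation is differentiable, the gating hyperplanes and leaf distributions can be trained end to end with standard gradient descent, which is what makes gradient-based learning possible for these trees.
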
Learning Objectives

By the end of this lecture, you will:

  • Understand tree impurity criteria and how splits are chosen
  • Diagnose overfitting and complexity growth as depth increases
  • Address class imbalance with data rebalancing and class weighting (and know the math behind it; see the sketch after this list)
  • Build intuition for soft trees and temperature-controlled routing
  • Compare hard vs soft boundaries and assess probability calibration

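To make class weighting concrete, the snippet below compares an unweighted tree with class_weight="balanced" on a synthetic imbalanced dataset and reports ROC-AUC and PR-AUC. The make_classification data and the 95/5 class split are placeholders standing in for the Home Credit data. In scikit-learn, "balanced" assigns each class the weight n_samples / (n_classes * n_samples_in_class), so minority-class errors count proportionally more when splits are scored.

```python
# Minimal class-weighting sketch on synthetic imbalanced data (not the Home Credit dataset).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import roc_auc_score, average_precision_score

X, y = make_classification(n_samples=5000, weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

for cw in (None, "balanced"):
    clf = DecisionTreeClassifier(max_depth=4, class_weight=cw, random_state=0)
    clf.fit(X_tr, y_tr)
    p = clf.predict_proba(X_te)[:, 1]           # predicted probability of the minority class
    print(f"class_weight={cw}: ROC-AUC={roc_auc_score(y_te, p):.3f}, "
          f"PR-AUC={average_precision_score(y_te, p):.3f}")
```
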
Materials

Datasets & Acknowledgments

  • Home Credit Default Risk (Kaggle): real-world credit risk dataset used throughout the lecture.
    • Source: https://www.kaggle.com/competitions/home-credit-default-risk
    • Please review the dataset license and Kaggle terms of use before redistribution
  • Libraries: scikit-learn (trees/metrics/visualization), PyTorch (soft trees)
  • Soft Decision Trees (prior work):
    • Frosst, N., & Hinton, G. (2017). Distilling a Neural Network Into a Soft Decision Tree
    • Kontschieder, P., Fiterau, M., Criminisi, A., & Rota Bulò, S. (2015). Deep Neural Decision Forests (ICCV)

Previous: ← Lecture 7: Overfitting and Regularization | Next: Lecture 9: Ensemble Methods →