Lecture 7: Overfitting and Regularization
From Memorization to Generalization
Overview
This lecture examines overfitting and generalization using Fashion-MNIST. We will diagnose overfitting by comparing training and validation performance, and then study both model-centric (regularization) and data-centric (quantity/diversity) interventions to improve generalization. The emphasis is on principled evaluation and reproducible procedures.
Learning Objectives
By the end of this lecture, you will:
- Detect overfitting early using validation sets and learning curves
- Master classical regularization techniques (L1/L2, early stopping)
- Apply modern approaches (label smoothing, mixup augmentation)
- Explore data-centric solutions through quantity and diversity experiments
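Two of the modern techniques listed above, label smoothing and mixup, come down to a few lines of array arithmetic. A minimal NumPy sketch, assuming illustrative values for the smoothing factor `eps` and the mixup concentration `alpha` (neither is specified in the lecture):

```python
import numpy as np

def smooth_labels(y, num_classes, eps=0.1):
    """Replace one-hot targets with (1 - eps) on the true class
    plus eps / num_classes spread uniformly over all classes."""
    onehot = np.eye(num_classes)[y]
    return onehot * (1.0 - eps) + eps / num_classes

def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Blend two examples and their target distributions with a
    Beta(alpha, alpha)-distributed mixing coefficient."""
    rng = rng if rng is not None else np.random.default_rng(0)
    lam = rng.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2

# Smoothed targets remain valid distributions: each row still sums to 1.
y_smooth = smooth_labels(np.array([3]), num_classes=10, eps=0.1)[0]
```

Both tricks soften hard targets, which discourages the network from driving logits to extremes on memorized training examples.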
Materials
Note: Three Fundamental Curves You’ll Generate
- Model Complexity Curve: The classic U-shaped validation error showing underfitting → optimal → overfitting zones
- Data Quantity Curve: Diminishing returns as training samples increase
- Data Diversity Effect: Diverse samples can outperform larger homogeneous sets for out-of-distribution generalization
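The first curve can be reproduced on synthetic data by fitting polynomials of increasing degree and comparing training and validation error. A minimal sketch, assuming an illustrative noisy-sine target and degree range (not specified in the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)
# Small noisy training set and a larger validation set from the same curve.
x_train = np.sort(rng.uniform(0, 1, 20))
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.2, 20)
x_val = np.sort(rng.uniform(0, 1, 50))
y_val = np.sin(2 * np.pi * x_val) + rng.normal(0, 0.2, 50)

degrees = range(1, 13)
train_err, val_err = [], []
for d in degrees:
    # Model complexity = polynomial degree.
    coeffs = np.polyfit(x_train, y_train, d)
    train_err.append(np.mean((np.polyval(coeffs, x_train) - y_train) ** 2))
    val_err.append(np.mean((np.polyval(coeffs, x_val) - y_val) ** 2))

# Training error keeps falling as degree grows, while validation error
# traces the U shape: underfitting at low degrees, overfitting at high ones.
```

Plotting `train_err` and `val_err` against `degrees` gives the classic diverging pair of learning curves; the validation minimum marks the sweet spot between the two failure zones.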
Datasets & Acknowledgments
- Fashion-MNIST Dataset (Kaggle)
Source: Zalando Research
Size: 70,000 grayscale images (28×28) of 10 clothing categories
Why This Dataset: Complex enough to demonstrate overfitting, simple enough to train quickly, real-world relevance
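Fashion-MNIST ships as 60,000 training and 10,000 test images, so the validation-based diagnostics above require carving a third split out of the training set. A minimal index-splitting sketch, assuming an illustrative 10,000-image validation size:

```python
import numpy as np

def train_val_split(n_train=60_000, n_val=10_000, seed=0):
    """Shuffle the training indices once (with a fixed seed, for
    reproducibility) and reserve the first n_val as validation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_train)
    return idx[n_val:], idx[:n_val]  # train indices, val indices

train_idx, val_idx = train_val_split()
# 50,000 train / 10,000 validation, with no index in both sets;
# the official 10,000-image test set stays untouched until the end.
```

Fixing the seed matters: a validation set that changes between runs makes learning curves incomparable across experiments.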
Previous: ← Lecture 6: Model Evaluation and Data Deception | Next: TBD