Lecture 2: k-Nearest Neighbors
Framing Prediction Problems & Instance-Based Learning
Overview
In this hands-on lecture, we tackle a real problem: finding compatible ML project teammates using k-Nearest Neighbors. Yes, you’ll actually use this system to form your project teams! We’ll explore how to frame real-world problems as ML tasks, understand the elegance of lazy learning, and confront the challenges of high-dimensional spaces.
Learning Objectives
By the end of this lecture, you will:
- Frame real-world problems as ML tasks with inputs (X) and outputs (y)
- Apply k-NN for classification and regression tasks
- Analyze the impact of distance metrics and feature scaling
- Understand the curse of dimensionality and its practical implications
- Design fair and effective matching systems with domain constraints
- Evaluate trade-offs between different similarity measures
Materials
Tip: Quick Access
Important: Pre-Class Requirements
Complete the Project Matchmaker Form by Mon 8/26, 12:00 PM ET. Required for Lecture 2: k-NN; counts toward participation.
Interactive Demo
Note: 🎮 Team Matcher Visualization
Experience k-NN in action with our Interactive Team Matching Demo
This real-time visualization lets you:
- Adjust the k parameter and see the immediate effects
- Explore different dimension pairs
- See how proximity translates to similarity
- Discover natural clustering patterns in the class
Key Topics
- The Data Journey: From text surveys → NLP features → 8D vectors
- k-NN Fundamentals: The beautiful simplicity of “you are your neighbors”
- Distance Metrics: Euclidean, Manhattan, Cosine, and when each matters
- Curse of Dimensionality: When all points become equidistant
- Fairness in ML: Ensuring inclusive team formation
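The core ideas above, k-NN's lazy learning and the choice of distance metric, fit in a few lines of code. Here is a minimal sketch in plain Python; the student names, 8D feature vectors, and the assumption that features are already scaled to [0, 1] are all hypothetical, not taken from the actual matching system.

```python
import math

def euclidean(a, b):
    # Straight-line distance; sensitive to feature scale,
    # which is why scaling matters before using k-NN
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    # Sum of absolute per-dimension differences
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_distance(a, b):
    # 1 - cosine similarity: compares direction, ignores magnitude
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (na * nb)

def k_nearest(query, candidates, k=3, metric=euclidean):
    # "Lazy learning": no training step at all --
    # just rank every candidate by distance at query time
    ranked = sorted(candidates.items(), key=lambda kv: metric(query, kv[1]))
    return [name for name, _ in ranked[:k]]

# Hypothetical 8D vectors (e.g., interests, schedule, skills), scaled to [0, 1]
students = {
    "ada":   [0.9, 0.1, 0.8, 0.2, 0.5, 0.7, 0.3, 0.6],
    "grace": [0.8, 0.2, 0.7, 0.3, 0.4, 0.8, 0.2, 0.5],
    "alan":  [0.1, 0.9, 0.2, 0.8, 0.6, 0.1, 0.9, 0.4],
}
me = [0.9, 0.1, 0.8, 0.2, 0.5, 0.7, 0.3, 0.55]
print(k_nearest(me, students, k=2))  # → ['ada', 'grace']
```

Swapping `metric=cosine_distance` into the `k_nearest` call changes who counts as "close" without touching anything else, which is exactly the trade-off between similarity measures the lecture explores.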
Additional Resources
Previous: ← Lecture 1: Welcome to ML | Next: Lecture 3: Linear Regression →