Lecture 2: k-Nearest Neighbors

Framing Prediction Problems & Instance-Based Learning

Overview

In this hands-on lecture, we tackle a real problem: finding compatible ML project teammates using k-Nearest Neighbors. Yes, you’ll actually use this system to form your project teams! We’ll explore how to frame real-world problems as ML tasks, understand the elegance of lazy learning, and confront the challenges of high-dimensional spaces.

Learning Objectives

By the end of this lecture, you will:

  • Frame real-world problems as ML tasks with inputs (X) and outputs (y)
  • Apply k-NN for classification and regression tasks
  • Analyze the impact of distance metrics and feature scaling
  • Understand the curse of dimensionality and its practical implications
  • Design fair and effective matching systems with domain constraints
  • Evaluate trade-offs between different similarity measures
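To preview why distance metrics and feature scaling matter, here is a minimal sketch using hypothetical teammate vectors (the feature names and values are invented for illustration, not taken from the actual matchmaker data). When one feature spans a much larger numeric range than another, unscaled Euclidean distance is dominated by that feature, and min-max scaling can change who your nearest neighbor is:

```python
import math

# Hypothetical teammate vectors: (years_of_experience, preferred_meeting_hour).
# The hour feature spans a wider numeric range than experience, so it
# dominates unscaled Euclidean distance.
points = {
    "alice": (1.0, 9.0),
    "bob":   (5.0, 10.0),
    "carol": (1.2, 14.0),
}

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def min_max_scale(data):
    """Rescale each feature to [0, 1] across the dataset."""
    dims = list(zip(*data.values()))
    lo = [min(d) for d in dims]
    hi = [max(d) for d in dims]
    return {
        name: tuple((v - l) / (h - l) for v, l, h in zip(vec, lo, hi))
        for name, vec in data.items()
    }

def nearest(name, data):
    """Return the name of the closest other point under Euclidean distance."""
    return min((k for k in data if k != name),
               key=lambda k: euclidean(data[name], data[k]))

print(nearest("alice", points))                 # raw features: "bob"
print(nearest("alice", min_max_scale(points)))  # scaled features: "carol"
```

On the raw features, Alice matches Bob because their meeting hours are close; after scaling, the experience gap to Bob matters equally, and Carol becomes the nearest neighbor. The lecture examines when each choice is appropriate.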

Materials

Important: Pre-Class Requirements

Complete the Project Matchmaker Form by Mon 8/26, 12:00 PM ET. Required for Lecture 2: k-NN; counts toward participation.

Interactive Demo

Note: 🎮 Team Matcher Visualization

Experience k-NN in action with our Interactive Team Matching Demo

This real-time visualization lets you:

  • Adjust k parameter and see immediate effects
  • Explore different dimension pairs
  • View how proximity translates to similarity
  • Discover natural clustering patterns in the class
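The mechanics behind the demo can be sketched in a few lines: a k-NN classifier is just "sort by distance, take a majority vote among the k closest." The toy 2-D points and interest labels below are made up for illustration:

```python
import math
from collections import Counter

# Toy 2-D "teammate" points with interest labels (hypothetical data).
train = [
    ((1.0, 1.0), "vision"),
    ((1.2, 0.8), "vision"),
    ((0.9, 1.1), "vision"),
    ((3.0, 3.0), "nlp"),
    ((3.2, 2.9), "nlp"),
]

def knn_predict(query, train, k):
    """Classify `query` by majority vote among its k nearest training points."""
    by_dist = sorted(train, key=lambda item: math.dist(query, item[0]))
    votes = Counter(label for _, label in by_dist[:k])
    return votes.most_common(1)[0][0]

print(knn_predict((1.1, 1.0), train, k=3))  # -> "vision"
print(knn_predict((3.1, 3.0), train, k=3))  # -> "nlp"
```

Changing k trades off noise sensitivity (small k) against oversmoothing (large k), which is exactly the effect the slider in the demo lets you explore.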

Key Topics

  1. The Data Journey: From text surveys → NLP features → 8D vectors
  2. k-NN Fundamentals: The beautiful simplicity of “you are your neighbors”
  3. Distance Metrics: Euclidean, Manhattan, Cosine, and when each matters
  4. Curse of Dimensionality: When all points become equidistant
  5. Fairness in ML: Ensuring inclusive team formation
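Topic 4, the curse of dimensionality, can be seen in a short simulation. For random points in the unit hypercube, the relative spread of distances, (d_max − d_min) / d_min, shrinks as the number of dimensions grows, so "nearest" neighbors become barely nearer than anyone else (the sample sizes and dimensions below are arbitrary choices for illustration):

```python
import random

random.seed(0)

def contrast(dim, n=200):
    """(d_max - d_min) / d_min over distances from the origin to n uniform
    random points in [0, 1]^dim. As dim grows, distances concentrate around
    their mean and this ratio shrinks toward zero."""
    dists = []
    for _ in range(n):
        p = [random.random() for _ in range(dim)]
        dists.append(sum(x * x for x in p) ** 0.5)
    return (max(dists) - min(dists)) / min(dists)

for d in (2, 8, 100, 1000):
    print(d, round(contrast(d), 3))  # ratio drops steadily as d grows
```

This is why the 8-D teammate vectors already need care, and why adding many weakly informative survey features can make every classmate look roughly equidistant.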

Additional Resources


Previous: ← Lecture 1: Welcome to ML | Next: Lecture 3: Linear Regression →