Week 5: Clustering & Dimensionality Reduction
Learning Objectives
By the end of this week, students will be able to:
- Distinguish unsupervised from supervised learning
- Apply k-means clustering and choose k using the elbow method and silhouette score
- Apply PCA to reduce dimensions and interpret principal components
- Use t-SNE or UMAP for 2D visualization of high-dimensional data
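The objectives above can be sketched in a few lines of code. This is a minimal illustration assuming scikit-learn is installed; the synthetic dataset and names like `X` and the range of candidate `k` values are placeholders, not taken from the course materials or lab.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# Synthetic data standing in for the lab dataset (4 true clusters, 5 features).
X, _ = make_blobs(n_samples=300, centers=4, n_features=5, random_state=0)
X = StandardScaler().fit_transform(X)  # scale features before k-means / PCA

# Choosing k: the elbow method looks for a bend in inertia as k grows;
# the silhouette score is highest at a well-separated clustering.
for k in range(2, 7):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    print(k, round(km.inertia_, 1), round(silhouette_score(X, km.labels_), 3))

# PCA: project onto the top 2 principal components and check how much
# variance each direction explains.
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)
print(pca.explained_variance_ratio_)  # fraction of variance per component

# t-SNE: a nonlinear 2D embedding for visualization only (distances in the
# embedding are not meaningful globally, unlike PCA projections).
X_tsne = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
print(X_tsne.shape)
```

In the lab, the `print` calls would typically be replaced by plots: inertia and silhouette versus k for choosing a cluster count, and scatter plots of the PCA and t-SNE embeddings colored by cluster label.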
Perspectival Reading
Reading: TBD
Reflection Questions
- Clustering imposes structure on data — what happens when the groups we find reflect historical inequities?
- PCA finds directions of maximum variance. Whose variation is centered, and whose is treated as noise?
- Unsupervised methods have no ground truth. How should that affect our confidence in their outputs?
Slides
Notebook Demo
Open in Google Colab (link TBD)
Lab Assignment
Week 5 Lab — GitHub Classroom (link TBD)