Week 9: Feature Engineering
Learning Objectives
By the end of this week, students will be able to:
- Apply scaling and normalization to numerical features
- Encode categorical variables using one-hot and ordinal encoding
- Handle missing data with imputation strategies
- Build a scikit-learn Pipeline to chain preprocessing and modeling steps
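
The objectives above can be tied together in one short sketch. The example below is only an illustration, not the course's lab solution: the toy DataFrame, its column names (age, income, city, education), the labels, and the choice of LogisticRegression are all assumptions made for demonstration. It imputes missing values, scales the numeric columns, one-hot encodes an unordered category, ordinal-encodes an ordered one, and chains everything into a single scikit-learn Pipeline.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder, StandardScaler

# Hypothetical toy data with missing values (np.nan) in several columns.
df = pd.DataFrame({
    "age": [25, 32, np.nan, 51, 46, 38],
    "income": [40000, 52000, 61000, np.nan, 87000, 58000],
    "city": ["Austin", "Boston", "Austin", np.nan, "Chicago", "Boston"],
    "education": ["high school", "bachelor", "master",
                  "bachelor", "master", "high school"],
})
y = [0, 0, 1, 1, 1, 0]  # hypothetical binary labels

# Numeric columns: fill gaps with the median, then standardize.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Unordered category: fill gaps with the most frequent value, then one-hot encode.
nominal_steps = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("onehot", OneHotEncoder(handle_unknown="ignore")),
])

# Ordered category: map levels to integers in an explicitly stated order.
ordinal_steps = OrdinalEncoder(
    categories=[["high school", "bachelor", "master"]]
)

# Route each group of columns to its own preprocessing steps.
preprocess = ColumnTransformer([
    ("numeric", numeric_steps, ["age", "income"]),
    ("nominal", nominal_steps, ["city"]),
    ("ordinal", ordinal_steps, ["education"]),
])

# Chain preprocessing and the model so both are fit together.
model = Pipeline([
    ("preprocess", preprocess),
    ("classifier", LogisticRegression()),
])

model.fit(df, y)
predictions = model.predict(df)
```

Keeping the preprocessing inside the Pipeline means the imputation values and scaling statistics are learned only from the data passed to fit, which helps avoid leaking information from a held-out test set.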
Perspectival Reading
Reading: TBD
Reflection Questions
- Feature engineering requires domain knowledge — whose knowledge counts, and who is excluded from this process?
- Imputation fills in missing values with estimates. What assumptions does a chosen strategy make about why data is missing?
- When you encode a variable like gender or ethnicity, what are you doing to how the model treats those groups?
Slides
Notebook Demo
Open in Google Colab (link TBD)
Lab Assignment
Week 9 Lab — GitHub Classroom (link TBD)