3: Data Transformation I
Content for Monday, April 6, 2026
Before class
📖 Reading:
- R4DS Ch 3: Data transformation (sections 3.1–3.4)
Important: Assignment 1 is due today
Assignment 1: Getting Started — due Sunday, April 5 at 11:59 PM.
During class
We’ll cover:
- Introduction to dplyr
  - filter(): pick rows by their values
  - arrange(): reorder rows
  - select(): pick columns by name
  - mutate(): create new columns
- The pipe operator (|>)
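As a preview, the four verbs and the pipe can be chained into one pipeline. This is a minimal sketch, not the in-class example: it assumes dplyr is installed and R is version 4.1 or later (needed for |>), and it uses the built-in mtcars data so the practice exercises stay untouched. The column name hp_per_wt is invented for illustration.

```r
library(dplyr)

# mtcars ships with base R; it stands in for any data frame
result <- mtcars |>
  filter(cyl == 4) |>               # keep only rows where cyl equals 4
  mutate(hp_per_wt = hp / wt) |>    # add a new column computed from two others
  select(mpg, hp, hp_per_wt) |>     # keep only these three columns
  arrange(desc(hp_per_wt))          # sort rows, largest hp_per_wt first

head(result)
```

Each step receives the data frame produced by the step before it, so the pipeline reads top to bottom in the order the operations happen.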
After class
✅ Practice:
Using the flights dataset from the nycflights13 package:
- Filter to flights departing in December
- Find all flights to Los Angeles (LAX)
- Create a new variable for flight speed (distance / air_time * 60)
- Select only the carrier, origin, destination, and your new speed variable
- Arrange by speed to find the fastest flights
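The speed formula in the exercises works because, in flights, distance is recorded in miles and air_time in minutes, so multiplying by 60 gives miles per hour. The sketch below checks that arithmetic on two made-up rows; the values are hypothetical and are not taken from nycflights13.

```r
library(dplyr)

# Hypothetical toy rows for checking the unit conversion only
toy <- data.frame(
  dest     = c("LAX", "BOS"),
  distance = c(2475, 187),   # miles
  air_time = c(330, 40)      # minutes
)

toy_speed <- toy |>
  mutate(speed = distance / air_time * 60)   # miles per minute times 60 = mph

toy_speed
```

The same mutate() call should carry over to the real flights data once the earlier filtering steps are in place.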
Tip: Keyboard shortcut
The pipe (|>) is so common that there’s a keyboard shortcut:
- Windows/Linux: Ctrl + Shift + M
- Mac: Cmd + Shift + M
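To see what the pipe actually does, it can help to compare a nested call with its piped equivalent. This small sketch uses only base R and assumes R 4.1 or later, where the native pipe is available; the vector x is an arbitrary example.

```r
x <- c(4, 9, 16)

# Nested form: read from the inside out
nested <- round(mean(sqrt(x)), 1)

# Piped form: read left to right; each result becomes the
# first argument of the next function
piped <- x |> sqrt() |> mean() |> round(1)

identical(nested, piped)
```

Both forms compute the same value; the piped version simply lists the steps in the order they run.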
Installing the flights dataset
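# Install once per machine (requires an internet connection)
install.packages("nycflights13")
# Load in every session that uses the flights data
library(nycflights13)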