6: Data Import

Content for Wednesday, April 15, 2026

Before class

📖 Readings:

During class

We’ll cover:

  • The data science workflow: where import fits in
  • read_csv() — reading CSV files and handling common problems
  • Column types and how to fix them when R guesses wrong
  • Missing value codes (-999, "N/A", etc.)
  • readxl::read_excel() — working with Excel files
  • A quick look at SPSS files with haven::read_sav()
  • Practical tips for importing Qualtrics exports
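
As a rough sketch of the column-type and missing-value topics above, the example below writes a small made-up CSV to a temporary file and imports it twice: once with the defaults and once with explicit NA codes and column types. The file contents and the column names (id, age, score) are invented for illustration; the NA codes mirror the ones listed above.

```r
library(readr)

# Hypothetical messy file, written to a temp path so the example is self-contained
path <- tempfile(fileext = ".csv")
writeLines(c(
  "id,age,score",
  "1,34,88",
  "2,N/A,-999",
  "3,29,75"
), path)

# Default import: "N/A" is not a recognised NA code, so age is guessed as
# character, and -999 is kept as an ordinary number in score
raw <- read_csv(path, show_col_types = FALSE)

# Declare the missing-value codes and the intended column types
clean <- read_csv(path,
  na = c("", "N/A", "-999"),
  col_types = cols(
    id = col_integer(),
    age = col_integer(),
    score = col_double()
  )
)
```

Comparing raw and clean side by side is one way to see what changed: age should go from character to integer, and both sentinel values should turn into NA.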

Slides

View slides in new tab | Download PDF


After class

Practice:

  1. Find a CSV file on your computer (or download one) and import it with read_csv(). Run glimpse() — do the types look right?
  2. Check problems() after importing. Does it flag anything?
  3. Try reading the messy CSV we created in class. Clean up the column names and missing values.
  4. If you have an Excel file handy, try read_excel(). What happens with multiple sheets?
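
For practice items 1 and 2, a minimal sketch of the glimpse() and problems() check might look like the following. The file is made up (a hypothetical id/age table with one unparseable value), and forcing age to integer is an assumption chosen only to trigger a parsing problem; readr should also print a warning when it hits the bad value.

```r
library(readr)

# Hypothetical file with one value that cannot be parsed as an integer
path <- tempfile(fileext = ".csv")
writeLines(c(
  "id,age",
  "1,34",
  "2,unknown",
  "3,29"
), path)

# Forcing age to integer turns the bad value into NA and records a problem
people <- read_csv(path,
  col_types = cols(id = col_integer(), age = col_integer())
)

# Check the column types at a glance
dplyr::glimpse(people)

# One row per parsing failure: location, expected type, actual value
issues <- problems(people)
issues
```

If problems() comes back with zero rows, the import parsed cleanly under the column types readr used.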

Note: Qualtrics import cheat sheet

Most Qualtrics CSV exports have two extra description rows after the header. The standard fix:

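# Read just the header row to get the column names
col_names <- names(read_csv("my_export.csv", n_max = 0))
# Skip the header and the two description rows, then reapply the names
qualtrics_data <- read_csv("my_export.csv",
  col_names = col_names,
  skip = 3,
  na = c("", "N/A", "-999")
)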

Always glimpse() right after — Qualtrics column names are ugly but fixable with rename().
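
To illustrate the rename() step, here is a small sketch that uses a hand-built tibble as a stand-in for an imported export. The column names and values are hypothetical, and the new names are only suggestions.

```r
library(dplyr)

# Hypothetical stand-in for an imported Qualtrics export
qualtrics_data <- tibble(
  ResponseId = c("R_001", "R_002"),
  `Duration (in seconds)` = c(312, 275),
  Q1 = c(4, 5),
  Q2_1 = c("Yes", "No")
)

glimpse(qualtrics_data)

# rename() takes new_name = old_name; names with spaces need backticks
survey <- qualtrics_data |>
  rename(
    response_id = ResponseId,
    duration_sec = `Duration (in seconds)`,
    satisfaction = Q1,
    consent = Q2_1
  )
```

Renaming right after import keeps the rest of the script free of backticked names.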

Package check

If you haven’t already, make sure readxl is installed:

install.packages("readxl")

readxl is installed along with the tidyverse, but it is not one of the core packages, so library(tidyverse) does not load it automatically. Load it yourself with library(readxl), or call its functions as readxl::read_excel().
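
Once readxl is installed, a quick way to confirm it works and to see how multiple sheets behave is to read the example workbook that ships with the package. This is a sketch only: it uses readxl's bundled datasets.xlsx rather than a file from class, and the object names are arbitrary.

```r
library(readxl)  # must be loaded explicitly

# Example workbook bundled with readxl
path <- readxl_example("datasets.xlsx")

# List the sheet names in the workbook
sheets <- excel_sheets(path)
sheets

# read_excel() reads the first sheet unless told otherwise
first_sheet <- read_excel(path)

# Pick a specific sheet by name (a position such as sheet = 2 also works)
mtcars_xl <- read_excel(path, sheet = "mtcars")

# Read every sheet into a named list of tibbles
all_sheets <- lapply(
  setNames(sheets, sheets),
  function(s) read_excel(path, sheet = s)
)
```

Reading into a named list keeps each sheet as its own tibble, which is usually easier to inspect than stacking sheets that may have different columns.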