Assignment 3: Tidying & Importing Data

Due by 11:59 PM on Sunday, April 26, 2026

Note: Assignment Details

Assigned: Monday, April 20 (Session 7)
Due: Sunday, April 26 at 11:59 PM
Submit: Quarto document (.qmd) AND rendered HTML on Canvas

Tip: First Quarto assignment!

This is the first assignment submitted as a Quarto document. See the guides: Setting Up an R Project | Using Quarto Documents

Overview

This assignment practices reshaping data with pivot_longer() and pivot_wider(), and importing data from CSV and Excel files. You’ll work with psychology-style datasets.

Setup

# Assignment 3: Tidying & Importing Data
# Your Name
# Date

library(tidyverse)

Part 1: Pivoting Wide to Long (30 points)

A researcher collected anxiety scores at three time points. The data is in “wide” format:

anxiety_wide <- tibble(
  participant_id = 1:5,
  condition = c("treatment", "treatment", "control", "control", "treatment"),
  anxiety_t1 = c(45, 52, 48, 55, 42),
  anxiety_t2 = c(38, 45, 47, 54, 35),
  anxiety_t3 = c(32, 40, 46, 52, 30)
)

Task 1.1

Pivot this data to long format so you have columns for:

  • participant_id
  • condition
  • time (values: “t1”, “t2”, “t3”)
  • anxiety (the score)
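As a reminder of the general pattern, here is a minimal sketch of pivot_longer() on a different, made-up dataset. The reaction-time columns and the names rt_wide and rt_long are illustrative assumptions, not part of the assignment data. The names_prefix argument strips the shared prefix so the new column holds only the block label.

```r
library(tidyr)
library(tibble)

# Hypothetical wide data: one reaction-time column per block
rt_wide <- tibble(
  id = 1:3,
  rt_b1 = c(510, 480, 530),
  rt_b2 = c(495, 470, 515)
)

# Gather the rt_* columns into a block/rt pair of columns
rt_long <- rt_wide |>
  pivot_longer(
    cols = starts_with("rt_"),
    names_to = "block",
    names_prefix = "rt_",
    values_to = "rt"
  )
```

You will need to adapt the column selection, prefix, and new column names to the anxiety data.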

Task 1.2

Using your long dataset, calculate the mean anxiety score at each time point, separately for each condition.
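A grouped summary like this is usually written with group_by() and summarise(). The sketch below uses made-up reaction-time data (the names rt_long, group, block, and rt are assumptions for illustration) and shows one common way to get one mean per combination of two grouping variables.

```r
library(dplyr)
library(tibble)

# Hypothetical long data: two groups measured in two blocks
rt_long <- tibble(
  group = c("a", "a", "b", "b", "a", "a", "b", "b"),
  block = c("b1", "b1", "b1", "b1", "b2", "b2", "b2", "b2"),
  rt = c(500, 520, 480, 460, 490, 510, 470, 450)
)

# One row per group-by-block combination
rt_means <- rt_long |>
  group_by(group, block) |>
  summarise(mean_rt = mean(rt), .groups = "drop")
```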

Task 1.3

Create a line plot showing anxiety over time, with separate lines for each condition. Add points for the individual observations.
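One possible way to layer individual points under group-level lines is sketched below on made-up data (rt_long and its columns are illustrative assumptions). It reads "separate lines for each condition" as one line of means per group; other readings of the task are possible, so treat this as a starting point rather than the expected answer.

```r
library(ggplot2)
library(tibble)

# Hypothetical long data: two groups measured in two blocks
rt_long <- tibble(
  group = c("a", "a", "b", "b", "a", "a", "b", "b"),
  block = c("b1", "b1", "b1", "b1", "b2", "b2", "b2", "b2"),
  rt = c(500, 520, 480, 460, 490, 510, 470, 450)
)

p <- ggplot(rt_long, aes(x = block, y = rt, color = group)) +
  geom_point() +                                  # individual observations
  stat_summary(aes(group = group), fun = mean,    # one line of means per group
               geom = "line")
```

Because block is a discrete variable, the group aesthetic is needed so that the line connects the means within each group.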

Part 2: Pivoting Long to Wide (25 points)

A survey measured different emotions. The data is in long format:

emotions_long <- tibble(
  participant = rep(1:4, each = 3),
  emotion = rep(c("happy", "sad", "anxious"), 4),
  rating = c(7, 2, 3, 5, 4, 6, 8, 1, 2, 6, 5, 4)
)

Task 2.1

Pivot this to wide format so each emotion is its own column.
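For reference, a minimal pivot_wider() sketch on a different, made-up dataset follows. The names scores_long, scale, and score are assumptions for illustration only.

```r
library(tidyr)
library(tibble)

# Hypothetical long data: two scales per participant
scores_long <- tibble(
  id = rep(1:2, each = 2),
  scale = rep(c("stress", "mood"), 2),
  score = c(12, 30, 18, 25)
)

# Spread the scale labels into their own columns
scores_wide <- scores_long |>
  pivot_wider(names_from = scale, values_from = score)
```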

Task 2.2

Using the wide format, create a new variable that is the sum of all three emotion ratings for each participant.
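Row-wise totals can be computed directly in mutate(). The sketch below, on made-up data with assumed column names, shows two equivalent approaches: adding the columns by name, and using rowSums() with across() when there are many columns.

```r
library(dplyr)
library(tibble)

# Hypothetical wide data
scores_wide <- tibble(id = 1:2, stress = c(12, 18), mood = c(30, 25))

scores_total <- scores_wide |>
  mutate(
    total = stress + mood,                          # explicit sum
    total_alt = rowSums(across(c(stress, mood)))    # same result via across()
  )
```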

Part 3: Importing Data (35 points)

Download the provided data files from Canvas and save them in your project’s data/raw/ folder:

  • survey_data.csv — Survey responses with some messy formatting
  • demographics.xlsx — Participant demographics in Excel format

Task 3.1

Import survey_data.csv using read_csv(). Note any warnings or issues.

Task 3.2

The CSV has some problems:

  • Missing values are coded as “N/A” instead of blank
  • Some columns have wrong types

Re-import the data handling these issues using the appropriate read_csv() arguments (hint: na = and col_types =).
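To illustrate how these two arguments work, the sketch below reads a small made-up CSV supplied as literal text. The column names id, age, and score are assumptions and will not match survey_data.csv, and wrapping the text in I() assumes readr 2.0 or later.

```r
library(readr)

# Hypothetical CSV text standing in for a file on disk
csv_text <- I("id,age,score\n1,23,4.5\n2,N/A,3.0\n3,31,N/A\n")

survey_example <- read_csv(
  csv_text,
  na = c("", "NA", "N/A"),        # treat "N/A" as missing
  col_types = cols(
    id = col_integer(),
    age = col_integer(),
    score = col_double()
  )
)
```

Without the na argument, the "N/A" strings would force age and score to be read as character columns.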

Task 3.3

Import demographics.xlsx using the readxl package. The data is on the second sheet.

library(readxl)
# Your code here
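As a sketch of how sheet selection works, the example below reads the second sheet of a demo workbook that ships with readxl (not the assignment file). A sheet can be chosen by position or by name; excel_sheets() lists what is available.

```r
library(readxl)

# Example workbook bundled with readxl
path <- readxl_example("datasets.xlsx")

sheet_names <- excel_sheets(path)             # list the sheet names first
second_sheet <- read_excel(path, sheet = 2)   # select a sheet by position
```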

Task 3.4

Join the survey data with the demographics data by participant ID. How many participants have complete data in both files?
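A minimal sketch of one way to approach this kind of question follows, using made-up tables (responses, people, and pid are assumed names). It takes "complete data" to mean that a participant appears in both tables and has no missing values after the join; check that this reading matches what you intend to report.

```r
library(dplyr)
library(tibble)

# Hypothetical tables sharing an ID column
responses <- tibble(pid = c(1, 2, 3), score = c(10, NA, 14))
people <- tibble(pid = c(1, 2, 4), age = c(21, 25, 30))

# Keep only participants present in both tables
both <- inner_join(responses, people, by = "pid")

# Count joined rows with no missing values in any column
n_complete <- sum(complete.cases(both))

# Participants with responses but no demographics
unmatched <- anti_join(responses, people, by = "pid")
```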

Grading Rubric

Component                        Points
Part 1: Pivoting wide to long    30
Part 2: Pivoting long to wide    25
Part 3: Importing data           35
Code runs without errors         10
Total                            100

Submission

Submit your .qmd file and your rendered .html file on Canvas.


Note: PSY 510 (Graduate Students)

Students enrolled in PSY 510 must complete the following extension in addition to all tasks above.

Graduate Extension: Qualtrics API

Many psychology researchers collect data through Qualtrics. Instead of logging in and downloading a CSV by hand, you can pull data directly into R using the qualtRics package. This workflow scales to repeated data collection and eliminates manual steps that introduce error.

For this extension you’ll pull data from the anonymous start-of-term survey that your classmates completed in Week 1. Because it was collected anonymously, the data are safe to work with directly — no masking needed.

Setup

# install.packages("qualtRics")
library(qualtRics)

Task G.1

Authenticate with the Qualtrics API using your UO API key and data center ID. You’ll find both in Qualtrics under Account Settings → Qualtrics IDs.

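# Replace the placeholders with your own API key and data center ID.
# Treat the API key like a password and avoid sharing it.
qualtrics_api_credentials(
  api_key  = "YOUR_API_KEY",
  base_url = "YOUR_DATACENTER.qualtrics.com"
)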

Task G.2

Pull responses from the start-of-term survey using fetch_survey(). The survey ID is posted on Canvas.

survey_id <- "SV_XXXXXXXXXXXXXXX"  # replace with actual ID from Canvas
raw <- fetch_survey(surveyID = survey_id)
glimpse(raw)

Task G.3

The raw API output includes a lot of Qualtrics metadata columns (timing, location, status flags) alongside the actual responses. Select only the columns that correspond to the survey questions and give them informative names.
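select() can keep and rename columns in a single step. The sketch below uses a small made-up tibble standing in for API output; the column names (StartDate, Status, Q1, Q2) and the new names are assumptions for illustration and will differ from the real survey.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-in for raw API output
raw_example <- tibble(
  StartDate = c("2026-01-05", "2026-01-06"),
  Status = c("IP Address", "IP Address"),
  Q1 = c(4, 5),
  Q2 = c("R", "Python")
)

# Keep only the question columns, renaming as new_name = old_name
clean_example <- raw_example |>
  select(stats_comfort = Q1, preferred_language = Q2)
```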

Task G.4

Compare your cleaned API output to the manually exported CSV version of the same survey (posted on Canvas). Note at least two differences in column names, data types, or structure. What would you need to do to make them match exactly?
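One way to make such a comparison systematic is to compare column names with set operations and then tabulate the column types side by side. The sketch below does this for two made-up tibbles (api_example and csv_example are assumed names, and their contents are invented).

```r
library(tibble)

# Hypothetical versions of the same data from two sources
api_example <- tibble(
  stats_comfort = c(4, 5),
  started = as.Date(c("2026-01-05", "2026-01-06"))
)
csv_example <- tibble(
  Q1 = c("4", "5"),
  started = c("2026-01-05", "2026-01-06")
)

# Columns that appear in only one version
only_in_api <- setdiff(names(api_example), names(csv_example))
only_in_csv <- setdiff(names(csv_example), names(api_example))

# Side-by-side types for the columns both versions share
shared <- intersect(names(api_example), names(csv_example))
first_class <- function(x) class(x)[1]
type_table <- tibble(
  column = shared,
  api_type = unname(vapply(api_example[shared], first_class, character(1))),
  csv_type = unname(vapply(csv_example[shared], first_class, character(1)))
)
```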

Task G.5

Write a short reflection (~half a page) answering: When would you use the API rather than a manual export in your own research? What are the tradeoffs in terms of effort, reliability, and reproducibility?

Submission: Add your code and reflection to your .qmd file under a clearly marked ## Graduate Extension section.