
PSY 410: Data Science for Psychology
2026-05-20
You’ve learned to:
But technical skills ≠ communication skills

Stories are memorable:
Stories are persuasive:
Decisions are emotional:
Don’t just show data — tell a story that:
Before making any visualization, ask:
Finding: CBT reduces depression by 8 points on the BDI-II (d = 0.65)
For researchers:
For clinicians:
Match your plot type to your message:
| Goal | Good choice | Bad choice |
|---|---|---|
| Show change over time | Line plot | Pie chart |
| Compare groups | Bar chart, boxplot | 3D pie chart |
| Show distribution | Histogram, density | Table |
| Show relationship | Scatterplot | Multiple pie charts |
| Show parts of whole | Stacked bar, treemap | 3D bar chart |
Clutter is anything that doesn’t help your audience understand the message.
Common clutter:
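library(tidyverse)

set.seed(410)  # fixed seed so the simulated scores are reproducible

# Simulated depression scores: 30 participants per condition
therapy_data <- tibble(
  condition = rep(c("Control", "CBT", "Mindfulness"), each = 30),
  depression = c(rnorm(30, 18, 5), rnorm(30, 12, 5), rnorm(30, 14, 5))
)

# Cluttered example: colored background, heavy gridlines, redundant fill legend
ggplot(therapy_data, aes(x = condition, y = depression, fill = condition)) +
  geom_boxplot() +
  labs(title = "Depression Scores by Treatment Condition") +
  theme_gray() +
  theme(
    panel.background = element_rect(fill = "lightblue"),
    # linewidth replaces size for line elements (size is deprecated since ggplot2 3.4.0)
    panel.grid.major = element_line(color = "darkgray", linewidth = 1),
    panel.grid.minor = element_line(color = "gray", linewidth = 0.5)
  )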
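For comparison, here is a minimal sketch of the same boxplot with the clutter removed. It assumes only ggplot2 and tibble are installed; the seed and the object name `decluttered` are arbitrary choices for this illustration, not part of the original slides.

```r
library(ggplot2)
library(tibble)

set.seed(410)  # arbitrary seed so the simulated scores are reproducible

# Simulated scores with the same structure as therapy_data on the slide
therapy_data <- tibble(
  condition  = rep(c("Control", "CBT", "Mindfulness"), each = 30),
  depression = c(rnorm(30, 18, 5), rnorm(30, 12, 5), rnorm(30, 14, 5))
)

decluttered <- ggplot(therapy_data, aes(x = condition, y = depression)) +
  # One neutral fill: the x-axis already names each condition, so no legend is needed
  geom_boxplot(fill = "gray85", color = "gray40") +
  labs(
    title = "Depression Scores by Treatment Condition",
    x = NULL,  # the condition labels speak for themselves
    y = "Depression score"
  ) +
  # White background and no gridlines
  theme_classic(base_size = 14)

decluttered
```

Dropping the fill mapping removes the legend, and `theme_classic()` removes the background and gridlines, so the boxes are the only thing left competing for attention.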

theme_classic() is a great starting point. You can customize it into a reusable function:
theme_story <- function(base_size = 14) {
  # Use + rather than %+replace% so properties not set here (sizes, margins)
  # are inherited from theme_classic() instead of being reset to NULL
  theme_classic(base_size = base_size) +
    theme(
      text = element_text(color = "grey40"),
      axis.line = element_line(color = "grey60"),
      axis.ticks = element_line(color = "grey60"),
      axis.text = element_text(color = "grey40"),
      plot.title = element_text(color = "grey30", face = "bold", hjust = 0, size = rel(1.3)),
      plot.subtitle = element_text(color = "grey40", hjust = 0),
      plot.title.position = "plot",
      plot.caption.position = "plot"
    )
}
Now you can use theme_story() anywhere, and every figure looks consistent.
theme_classic() vs theme_story()
Grey text, grey axes, title aligned to the full plot: small changes, big improvement.
Your brain groups things automatically based on:
Use these principles intentionally!
Preattentive attributes are processed by the brain in under 500 ms:
Use these to direct attention to what matters
# One row per condition with its mean depression score
therapy_summary <- therapy_data |>
  group_by(condition) |>
  summarize(mean_depression = mean(depression))

# Every bar gets the same gray, so nothing stands out
ggplot(therapy_summary, aes(x = condition, y = mean_depression)) +
  geom_col(fill = "gray50") +
  labs(
    title = "Mean depression by condition",
    x = "Condition",
    y = "Mean depression score"
  ) +
  theme_story()
# Flag the bar we want the audience to look at
therapy_summary <- therapy_summary |>
  mutate(highlight = if_else(condition == "CBT", "Highlight", "Normal"))

ggplot(therapy_summary, aes(x = condition, y = mean_depression, fill = highlight)) +
  geom_col() +
  scale_fill_manual(values = c("Highlight" = "steelblue", "Normal" = "gray70")) +
  labs(
    title = "CBT reduces depression more than other conditions",
    subtitle = "Mean post-treatment BDI-II scores",
    x = NULL,
    y = "Depression score"
  ) +
  theme_story() +
  theme(legend.position = "none")
Visual hierarchy guides the eye:
Size matters:
Bad title: “Depression scores by condition”
Better title: “CBT most effective at reducing depression”
Even better (with context):
viridis or ColorBrewer
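As a rough sketch of how either palette family can be applied in ggplot2: the condition means below are made up for illustration, and the object names (`palette_demo`, `base_plot`) are arbitrary. Both scale functions ship with ggplot2. `Dark2` is one of the qualitative ColorBrewer palettes flagged as colorblind-friendly, but check any palette against your own figure.

```r
library(ggplot2)
library(tibble)

# Illustrative condition means (invented for this sketch)
palette_demo <- tibble(
  condition = c("Control", "CBT", "Mindfulness"),
  mean_depression = c(18, 12, 14)
)

base_plot <- ggplot(palette_demo, aes(x = condition, y = mean_depression, fill = condition)) +
  geom_col() +
  labs(x = NULL, y = "Mean depression score") +
  theme_classic(base_size = 14) +
  theme(legend.position = "none")

# Option 1: viridis discrete scale; end = 0.9 avoids the palest yellow
plot_viridis <- base_plot + scale_fill_viridis_d(end = 0.9)

# Option 2: a ColorBrewer qualitative palette
plot_brewer <- base_plot + scale_fill_brewer(palette = "Dark2")

plot_viridis
plot_brewer
```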

Visualizations can deceive (intentionally or not):
treatment_effect <- tibble(
  condition = c("Control", "Treatment"),
  score = c(18, 16)
)

ggplot(treatment_effect, aes(x = condition, y = score)) +
  geom_col(fill = "steelblue") +
  coord_cartesian(ylim = c(15, 19)) +  # Truncated!
  labs(
    title = "MISLEADING: Treatment looks very effective",
    subtitle = "Y-axis starts at 15, not 0",
    x = NULL,
    y = "Depression score"
  ) +
  theme_story()
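For contrast, a minimal sketch of the same two bars with the y-axis left at zero. It uses `theme_classic()` so it runs on its own; the object name `honest_plot` and the title wording are assumptions made for this illustration.

```r
library(ggplot2)
library(tibble)

treatment_effect <- tibble(
  condition = c("Control", "Treatment"),
  score = c(18, 16)
)

honest_plot <- ggplot(treatment_effect, aes(x = condition, y = score)) +
  # geom_col() draws bars from 0 by default, so bar height is proportional to the score
  geom_col(fill = "steelblue") +
  # Remove the padding below 0 so the bars sit on the axis
  scale_y_continuous(expand = expansion(mult = c(0, 0.05))) +
  labs(
    title = "Treatment lowers scores by 2 points",
    subtitle = "Y-axis starts at 0",
    x = NULL,
    y = "Depression score"
  ) +
  theme_classic(base_size = 14)

honest_plot
```

With the full axis, the 2-point difference looks like what it is: a drop from 18 to 16, not a collapse.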

Truncation is fine when:
Never truncate:
Eight lines going everywhere. What’s the takeaway?
Eight slices. Which is biggest? By how much?
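One common alternative to an eight-slice pie is a sorted horizontal bar chart, sketched below. The data are entirely hypothetical (invented shares for eight coping strategies) and exist only to show the technique.

```r
library(ggplot2)
library(tibble)

# Hypothetical shares for eight categories (invented for illustration; sums to 100)
coping <- tibble(
  strategy = c("Exercise", "Sleep", "Friends", "Music",
               "Gaming", "Journaling", "Meditation", "Therapy"),
  percent = c(19, 16, 15, 13, 12, 10, 8, 7)
)

sorted_bars <- ggplot(coping, aes(x = percent, y = reorder(strategy, percent))) +
  # reorder() sorts the bars by value, so the largest category ends up on top
  geom_col(fill = "gray60") +
  labs(
    title = "Exercise is the most common coping strategy",
    subtitle = "Hypothetical data, shown as percent of respondents",
    x = "Percent of respondents",
    y = NULL
  ) +
  theme_classic(base_size = 14)

sorted_bars
```

Sorted bars on a shared axis answer both questions at a glance: which category is biggest, and by how much.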
Every figure should answer a question:
Ask yourself: If someone only sees this figure for 5 seconds, what should they remember?
Here’s a messy figure:
stress_data <- tibble(
  profession = c("Teacher", "Nurse", "Engineer", "Retail", "Admin"),
  stress = c(7.2, 8.1, 5.5, 6.8, 6.2),
  burnout = c(6.8, 7.9, 4.8, 6.5, 5.9)
)

ggplot(stress_data, aes(x = profession, y = stress, fill = profession)) +
  geom_col() +
  labs(title = "Stress by Profession") +
  theme_gray()

theme_story()
Time: 10 minutes
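If you want to check your work afterwards, here is one possible cleanup, offered as a sketch rather than the answer. It uses `theme_classic()` so it runs without the custom theme, and the choice to highlight the highest-stress profession is an assumption about what the story is.

```r
library(ggplot2)
library(dplyr)
library(tibble)

stress_data <- tibble(
  profession = c("Teacher", "Nurse", "Engineer", "Retail", "Admin"),
  stress = c(7.2, 8.1, 5.5, 6.8, 6.2),
  burnout = c(6.8, 7.9, 4.8, 6.5, 5.9)
)

stress_plot_data <- stress_data |>
  mutate(
    # Order professions by stress so the bars are sorted
    profession = reorder(profession, stress),
    # Flag the single bar that carries the message
    highlight = if_else(stress == max(stress), "Highlight", "Normal")
  )

cleaned <- ggplot(stress_plot_data, aes(x = stress, y = profession, fill = highlight)) +
  geom_col() +
  scale_fill_manual(values = c("Highlight" = "steelblue", "Normal" = "gray70")) +
  labs(
    title = "Nurses report the highest stress",
    subtitle = "Stress score by profession",
    x = "Stress score",
    y = NULL
  ) +
  theme_classic(base_size = 14) +
  theme(legend.position = "none", plot.title.position = "plot")

cleaned
```

The changes are the ones from this session: sort the bars, use color for one highlight instead of five categories, drop the legend, and let the title state the finding.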
Your final project should tell a story with three acts:
Weak narrative:
“I looked at depression and anxiety. Here’s a histogram. Here’s a scatterplot. Here’s a boxplot. The correlation was 0.65.”
Strong narrative:
“Depression and anxiety often co-occur, but we don’t know how strongly they’re related in college students. I analyzed 200 student surveys and found a strong correlation (r = .65). This suggests these conditions may share underlying mechanisms and should be treated together.”
At the end of your presentation, your audience should remember one key takeaway.
What’s yours?
Every figure, every sentence should support that one key message.
You’ll receive a handout on APA figure formatting guidelines.
Key points:
Note: Focus on clarity and communication first, then adjust formatting as needed for specific journals.
Not included in the figure itself:
For each figure, identify:


Apply these same questions to your own final project draft.
Before finalizing any figure, ask:
📖 Read:
✅ Do:
Next session (Correlation & Regression) we’ll reveal Fun Challenge 10: The Final Prediction.
It’s a quick one — you’ll look at a scatterplot and predict the correlation. But the deadline is Tuesday at 11:59 PM, so you’ll get time in class on Monday to work on it with your team.
A figure without a story is just a picture. Ask “so what?” until you have the answer.
See you next week for Quarto and reproducible reports!
PSY 410 | Session 15