
Making your research accessible: Data Visualization and Interactive Graphics

Session 3 - Facets and Small Multiples


Felix Haass

1 / 18

facets

Faceted plots (or small multiples) split your data by a categorical variable and draw one panel per category. Facets are "not a geom, but rather a way of organizing a series of geoms" (Kieran Healy).

2 / 18

facets: think about the comparison!

Development of GDP per capita by continent.

In ggplot, we use the facet_wrap() building block to specify the faceting variable(s).

library(tidyverse)
library(gapminder)

p <- ggplot(data = gapminder,
            mapping = aes(x = year,
                          y = gdpPercap)) +
  geom_line(color = "gray70",
            aes(group = country)) +  # recall we need to map group to country
  facet_wrap(~ continent,            # "~" introduces the faceting variable
             ncol = 5)               # how many columns?
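The faceting variable can also be written with `vars()` instead of the `~` formula. The sketch below builds the same plot both ways and then counts the panels in the built layout. The object names `base_plot`, `p_formula`, `p_vars` and `panel_layout` are only illustrative.

```r
library(ggplot2)
library(gapminder)

base_plot <- ggplot(gapminder, aes(x = year, y = gdpPercap)) +
  geom_line(color = "gray70", aes(group = country))

# formula notation, as used on the slide
p_formula <- base_plot + facet_wrap(~ continent, ncol = 5)

# vars() notation, which requests the same facets
p_vars <- base_plot + facet_wrap(vars(continent), ncol = 5)

# the built plot stores one row per panel in its layout table
panel_layout <- ggplot_build(p_formula)$layout$layout
nrow(panel_layout)               # number of panels
unique(panel_layout$ROW)         # rows used when ncol = 5
```

With five continents and `ncol = 5`, all panels end up in a single row, which makes the y axis directly comparable across continents.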
3 / 18

facets: think about the comparison!

print(p)

4 / 18

facets: all elements

library(tidyverse)
library(gapminder)

p <- ggplot(data = gapminder,
            mapping = aes(x = year,
                          y = gdpPercap)) +
  geom_line(color = "gray70", aes(group = country)) +
  # add smoother (linewidth replaces size for lines since ggplot2 3.4.0)
  geom_smooth(linewidth = 1.1, method = "loess", se = FALSE) +
  # log y axis (could've also wrapped y=log(gdpPercap) in aes() above)
  scale_y_log10(labels = scales::dollar) +
  # facet command
  facet_wrap(~ continent, ncol = 5) +
  # labels and appearance tweaks
  labs(x = "Year",
       y = "GDP per capita",
       title = "GDP per capita on Five Continents") +
  theme_bw() +
  theme(axis.text.x = element_text(size = 5))
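The comment about the log axis mentions two routes: transforming the scale or transforming the variable inside `aes()`. A minimal sketch of the difference, using points only to keep it short (the names `p_scale`, `p_aes`, `y_scale` and `y_aes` are illustrative):

```r
library(ggplot2)
library(gapminder)

# Option 1: transform the scale; axis labels stay in dollars
p_scale <- ggplot(gapminder, aes(x = year, y = gdpPercap)) +
  geom_point(alpha = 0.3) +
  scale_y_log10(labels = scales::dollar)

# Option 2: transform the variable; the axis is labelled in natural-log units
p_aes <- ggplot(gapminder, aes(x = year, y = log(gdpPercap))) +
  geom_point(alpha = 0.3)

# the y values ggplot actually draws in each case
y_scale <- layer_data(p_scale)$y  # base-10 log of gdpPercap
y_aes   <- layer_data(p_aes)$y    # natural log of gdpPercap
```

Both versions have the same shape. The scale version is usually easier to read because the axis still shows dollar amounts.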
5 / 18

facets: all elements

print(p)

6 / 18

facets: more applications

Recall our example: relationship between GDP per capita and population in Asia.

gdp_pop_plot <- ggplot(gapminder %>% filter(continent == "Asia"),
                       aes(x = log(pop),
                           y = log(gdpPercap))) +
  geom_point(alpha = 0.5, size = 2) +
  geom_smooth(method = "lm")
print(gdp_pop_plot)

7 / 18

Really a negative relationship?
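One way to probe this question numerically is to compare the slope of the pooled regression with the slopes of separate regressions for each country. This is only a sketch; `pooled_slope` and `country_slopes` are illustrative names, and the models mirror the `lm` fit drawn by `geom_smooth(method = "lm")`.

```r
library(dplyr)
library(gapminder)

gapminder_asia <- gapminder %>%
  filter(continent == "Asia")

# slope of the pooled regression across all Asian country-years
pooled_slope <- coef(lm(log(gdpPercap) ~ log(pop),
                        data = gapminder_asia))[["log(pop)"]]

# one slope per country
country_slopes <- gapminder_asia %>%
  group_by(country) %>%
  summarise(beta = coef(lm(log(gdpPercap) ~ log(pop)))[["log(pop)"]])

pooled_slope
table(sign(country_slopes$beta))  # how many slopes are negative / positive
```

If the pooled slope and most of the within-country slopes point in different directions, the pooled line mainly reflects differences between countries rather than changes within them.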

8 / 18

facets: more applications

Plot regression line by country (without facets)

gdp_pop_plot <- ggplot(gapminder %>% filter(continent == "Asia"),
                       aes(x = log(pop),
                           y = log(gdpPercap))) +
  geom_point(aes(color = country),
             alpha = 0.5, size = 2) +
  geom_smooth(aes(fill = country, color = country),
              method = "lm") +
  theme(legend.position = "none")
9 / 18

facets: more applications

10 / 18

facets: more applications

The previous plot might belong in MoMA, but it does not help the data analyst. How can we do better?

Facets!

gapminder_asia <- gapminder %>%
  filter(continent == "Asia")

gdp_pop_plot <- ggplot(gapminder_asia,
                       aes(x = log(pop),
                           y = log(gdpPercap))) +
  geom_point(alpha = 0.5, size = 1) +
  geom_smooth(method = "lm", linewidth = 0.7) +  # linewidth replaces size since ggplot2 3.4.0
  facet_wrap(~ country, scales = "free") +       # scales = "free" lets axis limits vary by panel
  theme_bw() +
  theme(axis.text = element_text(size = 4),
        strip.text = element_text(size = 6))
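Besides `"free"`, `facet_wrap()` also accepts `scales = "free_x"` and `scales = "free_y"`, and the default `"fixed"` shares both axes across panels. The sketch below compares the options by counting how many distinct y scales the built plot uses. The helper `n_y_scales()` is a hypothetical convenience function written for this illustration, not part of ggplot2.

```r
library(ggplot2)
library(dplyr)
library(gapminder)

gapminder_asia <- gapminder %>%
  filter(continent == "Asia")

base_plot <- ggplot(gapminder_asia,
                    aes(x = log(pop), y = log(gdpPercap))) +
  geom_point(alpha = 0.5, size = 1)

p_fixed  <- base_plot + facet_wrap(~ country)                     # shared axes (default)
p_free_y <- base_plot + facet_wrap(~ country, scales = "free_y")  # only y limits vary
p_free   <- base_plot + facet_wrap(~ country, scales = "free")    # x and y limits vary

# hypothetical helper: number of distinct y scales in the built layout
n_y_scales <- function(p) {
  length(unique(ggplot_build(p)$layout$layout$SCALE_Y))
}

n_y_scales(p_fixed)
n_y_scales(p_free)
```

Free scales make each country's trend easier to see, but the panels can no longer be compared by eye on a common axis. Which option fits depends on the comparison you want to make.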
11 / 18

facets: more applications

12 / 18

Useful tips from the dataviz ninja

  1. Think hard about what you want to visualize!

  2. Don't use too many aesthetics - just use those that help you clarify your comparison!

  3. Trial and error is your friend!

  4. Alphabetical order is one of the least useful ways to organize information.
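The last tip can be sketched with `fct_reorder()` from forcats, which reorders the levels of a factor by another variable. The toy vectors `group` and `value` are made up purely for illustration.

```r
library(forcats)

# hypothetical toy data: three groups and one summary value per group
group <- factor(c("a", "b", "c"))
value <- c(3, 1, 2)

levels(group)                      # alphabetical: "a" "b" "c"
levels(fct_reorder(group, value))  # ordered by value: "b" "c" "a"
```

Because ggplot2 arranges facets, axes and legends by factor levels, reordering the levels is enough to reorder the plot.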

13 / 18

facets: order by summary statistic

library(forcats) # useful to reorder factors or ordered categorical variables

gapminder_asia <- gapminder %>%
  filter(continent == "Asia") %>%
  # do all data manipulation by country
  group_by(country) %>%
  # extract the slope from a regression of log GDP per capita on log population
  mutate(beta = coef(lm(log(gdpPercap) ~ log(pop)))[2]) %>%
  # remove country grouping
  ungroup() %>%
  # reorder the levels of "country" by beta
  mutate(country_order = fct_reorder(country, beta))

head(gapminder_asia, 5)
head(gapminder_asia, 5)
## # A tibble: 5 x 8
## country continent year lifeExp pop gdpPercap beta
## <fctr> <fctr> <int> <dbl> <int> <dbl> <dbl>
## 1 Afghanistan Asia 1952 28.801 8425333 779.4453 -0.03799305
## 2 Afghanistan Asia 1957 30.332 9240934 820.8530 -0.03799305
## 3 Afghanistan Asia 1962 31.997 10267083 853.1007 -0.03799305
## 4 Afghanistan Asia 1967 34.020 11537966 836.1971 -0.03799305
## 5 Afghanistan Asia 1972 36.088 13079460 739.9811 -0.03799305
## # ... with 1 more variables: country_order <fctr>
14 / 18

facets: order by summary statistic
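The plotting code for this slide is not in the text. A plausible sketch, assuming it reuses the earlier faceted plot and only swaps `country` for `country_order` in `facet_wrap()` (the object name `gdp_pop_ordered` is illustrative):

```r
library(ggplot2)
library(dplyr)
library(forcats)
library(gapminder)

# same data preparation as on the previous slide
gapminder_asia <- gapminder %>%
  filter(continent == "Asia") %>%
  group_by(country) %>%
  mutate(beta = coef(lm(log(gdpPercap) ~ log(pop)))[2]) %>%
  ungroup() %>%
  mutate(country_order = fct_reorder(country, beta))

# assumption: identical to the earlier plot except for the faceting variable
gdp_pop_ordered <- ggplot(gapminder_asia,
                          aes(x = log(pop),
                              y = log(gdpPercap))) +
  geom_point(alpha = 0.5, size = 1) +
  geom_smooth(method = "lm", linewidth = 0.7) +
  facet_wrap(~ country_order, scales = "free") +  # panels sorted by slope
  theme_bw() +
  theme(axis.text = element_text(size = 4),
        strip.text = element_text(size = 6))

print(gdp_pop_ordered)
```

Panels now run from the most negative slope to the most positive one, so countries with similar trends sit next to each other.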

15 / 18

facets: Exercise

Plot the relationship between gdpPercap and lifeExp in the Americas, faceted by country.

Bonus: sort the countries by the direction and strength of the relationship between gdpPercap and lifeExp.

What is surprising?

16 / 18

facets: Exercise Solution

gapminder_americas <- gapminder %>%
  filter(continent == "Americas") %>%
  group_by(country) %>%
  # slope from a regression of life expectancy on log GDP per capita
  mutate(beta = coef(lm(lifeExp ~ log(gdpPercap)))[2]) %>%
  ungroup() %>%
  mutate(country_order = fct_reorder(country, beta))

gdp_lifeexp_americas <- ggplot(gapminder_americas,
                               aes(x = log(gdpPercap),
                                   y = lifeExp)) +
  geom_point(alpha = 0.5, size = 1) +
  geom_smooth(method = "lm", linewidth = 0.7) +  # linewidth replaces size since ggplot2 3.4.0
  facet_wrap(~ country_order,                    # facet by "country_order"!
             scales = "free") +                  # scales = "free" lets axis limits vary by panel
  theme_bw() +
  theme(axis.text = element_text(size = 4),
        strip.text = element_text(size = 6))
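To see which panels deserve a closer look, it can help to list the slopes in a small table first. A minimal sketch, using `summarise()` to keep one row per country (the name `americas_slopes` is illustrative):

```r
library(dplyr)
library(gapminder)

americas_slopes <- gapminder %>%
  filter(continent == "Americas") %>%
  group_by(country) %>%
  # one slope per country: life expectancy regressed on log GDP per capita
  summarise(beta = coef(lm(lifeExp ~ log(gdpPercap)))[["log(gdpPercap)"]]) %>%
  arrange(beta)

head(americas_slopes, 3)  # weakest (or negative) relationships
tail(americas_slopes, 3)  # strongest positive relationships
```

The countries at the top of this table appear in the first panels of the faceted plot, since both are ordered by the same slope.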
17 / 18

facets: Exercise Solution

18 / 18
