Set-up


library(ggplot2)
library(tidyverse)
library(janitor)
library(glue)
library(purrr)

Loading the data

For details on how to load the data, see Post #1 by Brock. For details about this dataset, see Post #2 by Akhila and Stephanie.


files <- list.files(here::here("data"),
                    full.names = TRUE)
files
d <- read_csv(files[1]) %>%
  clean_names()

Nesting and using `pmap()` for plotting

Nesting

Let’s say we want to create a separate plot for mobility trends in each country. First, we need to group_by country and nest() the rest of the data. This creates a list of tibbles, one for each country. I’ve called reactable() here so that we can return a simple, interactive table to see the results of nesting.


d %>%
  group_by(country) %>% 
  nest() %>%
  reactable::reactable()

Now we can use this as the input for our plots.

pmap() and ggplot() for generating a series of plots

About `pmap()`: pmap(.l, .f, …)

pmap(.l, .f, ...) is a variation of map() which allows us to apply a function to multiple vectors simultaneously. According to the help documentation, it takes 2 or more arguments:

.d = a list of vectors (such as a data frame).
.f = A function, formula, or vector (not necessarily atomic). In the example below, we’ve supplied another vector.
… = Additional arguments passed on to the mapped function. This is what makes pmap() so flexible!

Getting started with plotting

In the syntax below, you’ll see that the first vector listed corresponds to ..1 (in this case, country) and the second vector listed (data) corresponds to ..2. The vectors listed in aes() are from within the data, which we specifed with ..2. For this example, we’ll create one plot for each country, for just one of the mobility categories (workplaces).


plots_workplaces <- d %>%
  group_by(country) %>% 
  nest() %>% 
  mutate(plots = pmap(list(country, data),  
                      ~ggplot(..2, aes(date, workplaces)) +
                        geom_point() + 
                        geom_line()))

Test to see what the plots look like:


plots_workplaces$plots[1]


[[1]]


plots_workplaces$plots[2]


[[1]]


# or we could view all of them like this: 
# plots_workplaces$plots[1:19]

Plot improvements and customization

We can add a unique title to each plot with {..1} to select the 1st column (country) to insert country names into the title.

Plus a few other visualization adjustments…


plots_workplaces <- d %>%
  group_by(country) %>% 
    nest() %>%
    mutate(plots = pmap(list(country, data),
    ~ggplot(..2, aes(date, workplaces)) +
      geom_line(color = "grey70") +
      geom_point(aes(color = workplaces > 0), 
                 size = 2) +
      scale_color_brewer(palette = "Set2") +      
      theme(axis.title.x = element_blank(), 
            legend.position = 'none', 
            plot.background = element_rect(fill="gray20"),
            panel.background = element_rect(fill="gray20"),
            panel.grid.minor = element_blank(), 
            plot.title = element_text(color="white",
                                      hjust= 0,
                                      vjust=1, 
                                      size=rel(1.5)), 
            axis.text = element_text(color="white", 
                               size=rel(1.5)),
            axis.text.y  = element_text(hjust=1), 
            axis.title.y = element_text(color = "white"),
            plot.caption = element_text(color = "white")) +
      labs(title = glue("Mobility trends to/from workplaces in {..1}"), #unique title
           y = "% change in mobility", 
           caption = "Googlemaps data from https://www.kaggle.com/gustavomodelli/covid-community-measures"))
    )

# testing: 
plots_workplaces$plots[1]


[[1]]


plots_workplaces$plots[2]


[[1]]


# or we could see all of them: 
# plots_workplaces$plots[1:19]

Another application of pmap()

What if we wanted to see plots for all mobility categories, not just workplaces? We can apply the same logic as above, but first nest by both country and mobility category.

In order to do so, we’ll first have to restructure the data so that mobility category is it’s own column. This can be done using pivot_longer():


by_country_mobil_cat <- d %>%
  pivot_longer(
    cols = 4:9,
    names_to = "mobil_category",
    values_to = "perc_change"
  ) %>%
  nest(-country, -mobil_category)

#head(by_country_mobil_cat)

reactable::reactable(by_country_mobil_cat)

Now we can use the same code structure as before for plotting. pmap() is super useful here because we can list() any number of columns. Note that here the first argument for ggplot() is now ..3, referring to the 3rd item (“data”) in the pmap(list()). Also, the aes() now calls for perc_change instead of parks or residential, etc., because that is the name of the column after using pivot_longer.


all_plots <- by_country_mobil_cat %>% 
  mutate(plots = pmap(list(country, mobil_category, data),
                      ~ggplot(..3, aes(date, perc_change)) +
      geom_line(color = "grey70") +
      geom_point(aes(color = perc_change > 0), 
                 size = 2) +
      scale_color_brewer(palette = "Set2") +      
      theme(axis.title.x = element_blank(), 
            legend.position = 'none', 
            plot.background = element_rect(fill="gray20"),
            panel.background = element_rect(fill="gray20"),
            panel.grid.minor = element_blank(), 
            plot.title = element_text(color="white",
                                      hjust= 0,
                                      vjust=1, 
                                      size=rel(1.5)), 
            axis.text = element_text(color="white", 
                               size=rel(1.5)),
            axis.text.y  = element_text(hjust=1), 
            axis.title.y = element_text(color = "white"),
            plot.caption = element_text(color = "white")) +
      labs(title = glue("Mobility trends to/from {..2} in {..1}"), #unique title
           y = "% change in mobility", 
           caption = "Googlemaps data from https://www.kaggle.com/gustavomodelli/covid-community-measures"))
    )

all_plots$plots[1]


[[1]]


all_plots$plots[10]


[[1]]


all_plots$plots[20]


[[1]]

Using `walk2()` to save plots

We can use walk2() to save our plots. It’s similar to map() but doesn’t print anything to the screen, instead creating files and file pathways to save to.


fs::dir_create(here::here("plots", "mobility-plots")) # creates a folder called "plots" and a subdirectory called "mobility"

# then create a vector that has all of the file paths: 

files_c <- str_replace_all(tolower(all_plots$country), " ", "-")
files_m <- str_replace_all(all_plots$mobil_category, "_", "-")
paths <- here::here("plots", "mobility-plots", glue("{files_c}-{files_m}.png"))
#paths

walk2(paths, all_plots$plots, ggsave,
      width = 9.5, 
      height = 6.5,
      dpi = 500)

Creating a function to plot linear model

So far, we’ve only explored mobility trends, not mobility trends in relation to infections. And, we’ve illustrated one strategy for applying a function to multiple vectors - parellel iteration. Another strategy that is especially useful for processes that you find yourself repeating is to write a function yourself! Here’s an example of writing a function to plot linear models of mobility trends by infection data.

Before writing a function, it’s usually helpful to just write the code that you would normally use to achieve the result you want. Here is the code for plotting a linear model:

(first I unnested the data)


d_unnest <- by_country_mobil_cat %>%
  unnest(data)


lm_total_mob <- lm(perc_change ~ total_cases, data = d_unnest)
call <- as.character(match.call()) # for customized title in labs() 
plot(perc_change ~ total_cases, data = d_unnest)
abline(lm_total_mob)

Now let’s use that code to generalize it to a function. This function will have 3 arguments:

the data frame (df)
the mobility category of interest (mobil_cat), e.g. parks, residential
the infection variable of interest (infect_var), e.g. total_cases, fatalities

Here’s the function:


plot_lm <- function(df, mobil_cat, infect_var) {
  lm_mob_by_infect <- lm(mobil_cat ~ infect_var, data = df)
  plot(mobil_cat ~ infect_var, data = df)
  abline(lm_mob_by_infect)
}

Test plot_lm() on one mobility category (parks):


plot_lm(d, d$parks, d$total_cases)

Using map() to loop through our plot_lm() function

We can use map() to loop through each mobilility category column to plot a lm for each.

map(.d, .f) takes a dataframe argument and a function argument.

Here, the ~ means “apply this function” and is paired with the .x argument indicating “over all columns”. For the data argument in map(), I’ve selected only the columns I want to loop through: d[5:9].


map(d[5:9], ~plot_lm(d, .x, d$total_cases))

This is just to demonstrate creating a function and using map() with ~ and .x to loop through that function. Of course, these plots would be more useful with customized title and axes labels, as shown above.

Interesting to see how different the 2nd plot is from the rest! This plot corresponds to “residential” mobility – while the other plots show decreasing mobility to parks, workplaces, etc., this plot shows a large increase in people staying at home as total infections rose.

Post 3 - Using parellel iteration and writing functions to generate plots

Table of Contents

Set-up

Loading the data

Nesting and using `pmap()` for plotting

Nesting

pmap() and ggplot() for generating a series of plots

About `pmap()`: pmap(.l, .f, …)

Getting started with plotting

Plot improvements and customization

Another application of pmap()

Using `walk2()` to save plots

Creating a function to plot linear model

Using map() to loop through our plot_lm() function

Post 3 - Using parellel iteration and writing functions to generate plots

Table of Contents

Set-up

Loading the data

Nesting and using pmap() for plotting

Nesting

pmap() and ggplot() for generating a series of plots

About pmap(): pmap(.l, .f, …)

Getting started with plotting

Plot improvements and customization

Another application of pmap()

Using walk2() to save plots

Creating a function to plot linear model

Using map() to loop through our plot_lm() function

Nesting and using `pmap()` for plotting

About `pmap()`: pmap(.l, .f, …)

Using `walk2()` to save plots