Critique and Create Project 5: Maps

Introduction

As with previous projects, you’re being asked to critique a series of visualizations and then to apply best practices as you create your own. Follow along below, but ultimately do the work using the corresponding file on Posit Cloud.

This project’s data

library(tidyverse)
library(sf)
library(tigris)

spills2019 <- read_csv("../data/spills_2019.csv")

In addition to these packages and data, a helper function called prepare_gps_columns() is provided to prepare some of the columns for mapping. It’s kind of ugly, so I’m hiding its definition by default here:

prepare_gps_columns <- function(df, rm.na = TRUE) {
  result <- df |>
    # Convert degrees, minutes, and seconds to decimal degrees
    mutate(
      latitude = LAT_DEG +
        LAT_MIN / 60 +
        LAT_SEC / 3600,
      longitude = LONG_DEG +
        LONG_MIN / 60 +
        LONG_SEC / 3600
    ) |>
    # Apply signs by quadrant: southern latitudes and
    # western longitudes are negative
    mutate(
      latitude = if_else(
        LAT_QUAD != "S",
        true = latitude,
        false = latitude * -1
      ),
      longitude = if_else(
        LONG_QUAD != "E",
        true = longitude * -1,
        false = longitude
      )
    ) |>
    # Drop the original eight coordinate columns
    select(-starts_with("LAT_"),
           -starts_with("LONG_"))
  
  # Optionally drop rows missing either coordinate
  if (rm.na) {
    result <- result |> 
      drop_na(latitude, longitude)
  }
  
  return(result)
}

You’ve used the spills2019 data set in a previous week. It’s sourced from the U.S. Coast Guard’s National Response Center. Take a look to remind yourself of the kind of data it includes. Only 100 rows and 50 columns are shown here, but the full set we’re using is more than 300 times longer and five times wider.

Pay particular attention to the columns indicating latitude and longitude.

spills2019 |> 
  select(c(5, 27:34)) |> 
  drop_na(LAT_SEC) |> 
  head()

Columns 27 through 30 show the latitudinal coordinates divided up in degrees, minutes, seconds, and quadrant. Columns 31 through 34 do the same for longitudinal coordinates. We can simplify these eight columns down to two using the helper function prepare_gps_columns():

spills2019 |> 
  select(c(5, 27:34)) |> 
  prepare_gps_columns() |> 
  head()
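To see what the helper is doing under the hood, here is the degrees-minutes-seconds arithmetic worked through in base R for one made-up coordinate (29°45′30″ in the "S" quadrant, a hypothetical value rather than one drawn from the data):

```r
# DMS-to-decimal conversion, as prepare_gps_columns() applies it,
# for a hypothetical coordinate: 29 deg, 45 min, 30 sec, quadrant "S"
deg <- 29
min <- 45
sec <- 30
decimal <- deg + min / 60 + sec / 3600
decimal        # 29.75833

# Southern-quadrant latitudes get a negative sign:
decimal * -1   # -29.75833
```

Each minute is 1/60 of a degree and each second is 1/3600, so the three columns collapse to a single decimal value, with the quadrant column supplying the sign.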

Assignment

Respond to the three Critique questions, then complete the two Create code chunks below, and finish with a Self-Critique.

Critique

As an example, I’ll show two visualizations of maps that can be drawn from this data set. The polished visualization doesn’t show you the code used to create it; instead, much of the relevant code is sprinkled throughout the methods page for maps. As you critique, consider how the visualizations were made, and try to think about why each visualization is well or poorly made.

Preparation

Before mapping, we need to get everything ready. First we’ll make prepared_data by starting from spills2019, renaming the LOCATION_STATE column to state, and limiting things to the continental U.S. before using prepare_gps_columns(). Next, we’ll download shape files to us_shapes. These first two steps can then be combined into mappable_states. Finally, mappable_incidents will be defined by limiting reports within our map area, converting the data to an {sf} object, and matching the CRS from mappable_states.

The following code will do all of these steps. Run it, and feel free to take a look at any of the resulting data objects to see how they look.

prepared_data <- spills2019 |> 
  rename(state = LOCATION_STATE) |> 
  filter(state %in% state.abb,
         state != "AK", 
         state != "HI") |> 
  prepare_gps_columns()

us_shapes <- 
  states(cb = TRUE, 
         progress_bar = FALSE) |> 
  rename(state = STUSPS)

mappable_states <-
  right_join(
    us_shapes,
    count(prepared_data, state),
    by = "state")

mappable_incidents <- prepared_data |> 
  filter(longitude < -60,
         longitude > -130,
         latitude < 52,
         latitude > 22) |> 
  st_as_sf(
    coords = c("longitude", "latitude"), 
    crs = 4326) |> 
  st_transform(crs = st_crs(mappable_states))
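As a quick sanity check (this snippet assumes the objects created above exist in your session), you can confirm that the incident points and the state shapes ended up in the same coordinate reference system, which is what allows them to be layered on one map:

```r
# After st_transform(), both layers should report an identical CRS
st_crs(mappable_states) == st_crs(mappable_incidents)
```

If this comparison isn't TRUE, geom_sf() layers drawn from the two objects won't line up correctly.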

Now that the data is ready, we can use it for mapping.

Rough figure

Consider this rough figure, including whether it is faithful to the data and how useful it might be for understanding what it shows:

ggplot() +
  geom_sf(
    data = mappable_states, 
    aes(fill = n)) +
  geom_sf(
    data = mappable_incidents,
    fill = "orange",
    color = "white",
    shape = 21,
    alpha = 0.4) +
  labs(fill = "incidents per state") +
  theme(legend.position = "bottom")

Compare this map to the polished version to answer questions in the next two sections.

Polished figure

Here’s another way of showing the same data. As you compare it to the previous visualization, think about the ways it is changed and how these changes affect the viewer’s ability to understand and respond to the data. You won’t get to see the code that was used to make it, but it builds from what was used in the previous code chunk, with changes and additions.

Critique the visualization

This second version of the map includes many changes intended to improve the visualization.

You respond

When you answer the following questions, it might be helpful to consider not only Wilke’s chapter 15 but also other chapters on aesthetics and argument.

  1. These two maps present the same information two different ways. In what noticeable ways has the polished version improved upon the rough draft?
  2. Which of these changes do you think made the biggest difference? How do they affect the viewer’s experience?
  3. What could still be improved in the polished version?

Create

Recreating a visualization

Taking inspiration from the above map, use the mappable_states and mappable_incidents data to try to emulate the polished version of the map shown here. You probably won’t be able to get it exactly the same, but the reference page on maps and the reference page for color should help you make something quite close.

You code

Write code to recreate this polished figure as closely as you can.

Creating your own visualization

Create another map from spills2019. You might do any one of the following, or do something else entirely:

  1. Focus on one state or region to show where incidents were reported in 2019.
  2. Go back to the original spills2019 data and prepare things to map by RESPONSIBLE_STATE instead of LOCATION_STATE, showing which states house company headquarters for the most reported incidents.
  3. Color code the incident points to show whether the military or some other organization was responsible for an incident, as shown by the RESPONSIBLE_ORG_TYPE column.

You code

Write code to make your own figure, and polish it as much as you’re able.

Self-Critique

It’s helpful to learn to acknowledge the things in our own work that we’re proud of, just as it is good to be able to point out weaker points. Reflecting through self-critique is a valuable way to grow in any skill.

You respond

Explain your process and goals for the visualization you created. Critique it, including describing things that you wish you could do but that you don’t yet know how. Write at least a paragraph.