---
title: "The Population Density Problem"
author: "Homer White"
date: "9/14/2017"
output:
  flexdashboard::flex_dashboard:
    source_code: embed
    storyboard: true
---

```{r include = FALSE}
knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE)
library(DataComputing)
```


### In class we had a bit of trouble with the third part of Problem 12 in Chapter 7:  the choropleth map of population density for U.S. states appeared not to vary in color by state. {data-commentary-width=450}

```{r echo = FALSE}
ZipGeography %>%
  group_by(State) %>%
  summarise(area = sum(LandArea,
                       na.rm = TRUE),
            population = sum(Population,
                             na.rm = TRUE)) %>%
  mutate(Density = (population / area)) %>%
  DT::datatable(rownames = FALSE)
```

***

Here was our initial wrangle of the data:

```{r eval = FALSE}
ZipGeography %>%
  group_by(State) %>%
  summarise(area = sum(LandArea,
                       na.rm = TRUE),
            population = sum(Population,
                             na.rm = TRUE)) %>%
  mutate(Density = (population / area))
```

Look at the data table to your left.  You will notice that Washington, DC is listed among the states.  Because DC is entirely urban, its population density (about 3596) is *very* large compared with that of any state.
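To make the aggregation step concrete, here is a minimal sketch on a made-up table. The column names mirror `ZipGeography`, but the data frame `toy`, its values, and the state labels "A" and "B" are hypothetical and only serve to show what the pipeline computes.

```r
library(dplyr)

# Hypothetical zip-level rows; values are invented for illustration.
toy <- data.frame(
  State      = c("A", "A", "B", "B"),
  LandArea   = c(10, 30, 5, NA),
  Population = c(1000, 600, 2000, 50)
)

# Same steps as above: total area and population per state, then divide.
toy_density <- toy %>%
  group_by(State) %>%
  summarise(area = sum(LandArea, na.rm = TRUE),
            population = sum(Population, na.rm = TRUE)) %>%
  mutate(Density = population / area)

toy_density
```

Note that with `na.rm = TRUE` a row whose `LandArea` is missing still contributes its `Population` to the state total, as the last row of `toy` does here.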

### DC's density is so high that all of the states appear to be "squished together" at the low end of a scale. {data-commentary-width=550}


```{r echo = FALSE}
p <- ZipGeography %>%
  group_by(State) %>%
  summarise(area = sum(LandArea,
                       na.rm = TRUE),
            population = sum(Population,
                             na.rm = TRUE)) %>%
  mutate(Density = (population / area)) %>%
  ggplot(aes(x = Density)) +
    geom_density() +
    geom_rug(aes(text = paste0(State, ", Density: ", 
                               round(Density, 1)))) +
    labs(x = "Population Density", y = "")
plotly::ggplotly(p, tooltip = "text")
```


***

We can see how DC stands out with the simple density curve shown on the left.

```{r eval = FALSE}
p <- ZipGeography %>%
  group_by(State) %>%
  summarise(area = sum(LandArea,
                       na.rm = TRUE),
            population = sum(Population,
                             na.rm = TRUE)) %>%
  mutate(Density = (population / area)) %>%
  ggplot(aes(x = Density)) +
    geom_density() +
    geom_rug(aes(text = paste0(State, ", Density: ", 
                               round(Density, 1)))) +
    labs(x = "Population Density", y = "")
plotly::ggplotly(p, tooltip = "text")
```


### As a result, the map showed almost no color variation from state to state. {data-commentary-width=450}

```{r echo = FALSE, fig.width=7, fig.height=5}
ZipGeography %>%
  group_by(State) %>%
  summarise(area = sum(LandArea,
                       na.rm = TRUE),
            population = sum(Population,
                             na.rm = TRUE)) %>%
  mutate(Density = (population / area)) %>%
  USMap(key = State, fill = Density)
```

***

Here is the original code for the graph:

```{r eval = FALSE}
ZipGeography %>%
  group_by(State) %>%
  summarise(area = sum(LandArea,
                       na.rm = TRUE),
            population = sum(Population,
                             na.rm = TRUE)) %>%
  mutate(Density = (population / area)) %>%
  USMap(key = State, fill = Density)
```

As you can see, there is no visible variation in color from state to state.
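A rough way to see why the colors look uniform is to place a few densities on a 0-to-1 scale. The sketch below assumes the fill color is chosen by linear rescaling, as a default continuous gradient typically does; that is an assumption about how the map is drawn, not something checked against `USMap()`. The helper `rescale01()` and the two unnamed state values are hypothetical; only 499 (New Jersey) and 3596 (DC) come from the data.

```r
# Hypothetical state densities; only 499 (New Jersey) and 3596 (DC)
# are taken from the data discussed here.
density <- c(sparse_state = 1, typical_state = 100, NJ = 499, DC = 3596)

# Linear rescaling onto [0, 1] (assumed behavior of the color scale).
rescale01 <- function(x) (x - min(x)) / (max(x) - min(x))

position <- rescale01(density)
round(position, 3)
```

Under that assumption every value except DC lands in roughly the lowest 14% of the scale, so the states receive nearly the same color.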



### Modify the wrangling so that no density is allowed to be more than a fixed value. {data-commentary-width=450}

```{r echo = FALSE}
ZipGeography %>%
  group_by(State) %>%
  summarise(area = sum(LandArea,
                       na.rm = TRUE),
            population = sum(Population,
                             na.rm = TRUE)) %>%
  mutate(Density = (population / area)) %>%
  mutate(Density = pmin(Density, 500)) %>%
  DT::datatable(rownames = FALSE)
```

***

This can be accomplished with a call to the `pmin()` function, which computes the parallel (element-wise) minimum of its arguments: each density is compared with the cap, and the smaller of the two values is kept.

Since New Jersey is the state with the highest population density at around 499 people per unit area, let's set 500 as the maximum possible density.

```{r eval = FALSE}
ZipGeography %>%
  group_by(State) %>%
  summarise(area = sum(LandArea,
                       na.rm = TRUE),
            population = sum(Population,
                             na.rm = TRUE)) %>%
  mutate(Density = (population / area)) %>%
  mutate(Density = pmin(Density, 500))
```


Checking the table to your left, you see that DC now has a density of 500.
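For readers who want to see the capping step in isolation, here is a small base-R sketch. The vector `density` is illustrative: 35.2 and the `NA` are made up, while 499 and 3596 are the values quoted above.

```r
# Illustrative values; only 499 and 3596 are taken from the text.
density <- c(35.2, 499, 3596, NA)

# The single value 500 is recycled, so every element is compared with 500.
capped <- pmin(density, 500)
capped
# Values at or below 500 are unchanged, 3596 becomes 500, and NA stays NA.
```

By contrast, `min(density, 500)` would collapse everything to a single value, which is why `pmin()` is the right tool inside `mutate()`.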

### We make a new graph from the second wrangling. {data-commentary-width=450}

```{r echo = FALSE, fig.width=7, fig.height=5}
ZipGeography %>%
  group_by(State) %>%
  summarise(area = sum(LandArea,
                       na.rm = TRUE),
            population = sum(Population,
                             na.rm = TRUE)) %>%
  mutate(Density = (population / area)) %>%
  mutate(Density = pmin(Density, 500)) %>%
  USMap(key = State, fill = Density)
```

***

Here is the code for the new graph:

```{r eval = FALSE}
ZipGeography %>%
  group_by(State) %>%
  summarise(area = sum(LandArea,
                       na.rm = TRUE),
            population = sum(Population,
                             na.rm = TRUE)) %>%
  mutate(Density = (population / area)) %>%
  mutate(Density = pmin(Density, 500)) %>%
  USMap(key = State, fill = Density)
```