Problem 1

The U.S. Government Department of Labor, Bureau of Labor Statistics (BLS) publishes a Local Area Unemployent Report, overview available here:

Data format description here:

Assignment: Tidy the data set provided and generate four separate ggplot line plots (including points) of the following:

  1. Employment
  2. Labor Force
  3. Unemploment Rate
  4. Unemployment

Tidy the Data

## # A tibble: 4 × 325
##            `Series ID` `Jan 1990` `Feb 1990` `Mar 1990` `Apr 1990`
##                  <chr>      <dbl>      <dbl>      <dbl>      <dbl>
## 1 LAUMT064186000000003        3.5        3.3        3.2        3.4
## 2 LAUMT064186000000004    70296.0    66517.0    63840.0    67539.0
## 3 LAUMT064186000000005  1952945.0  1948964.0  1951817.0  1942696.0
## 4 LAUMT064186000000006  2023241.0  2015481.0  2015657.0  2010235.0
## # ... with 320 more variables: `May 1990` <dbl>, `Jun 1990` <dbl>, `Jul
## #   1990` <dbl>, `Aug 1990` <dbl>, `Sep 1990` <dbl>, `Oct 1990` <dbl>,
## #   `Nov 1990` <dbl>, `Dec 1990` <dbl>, `Jan 1991` <dbl>, `Feb
## #   1991` <dbl>, `Mar 1991` <dbl>, `Apr 1991` <dbl>, `May 1991` <int>,
## #   `Jun 1991` <dbl>, `Jul 1991` <dbl>, `Aug 1991` <dbl>, `Sep
## #   1991` <dbl>, `Oct 1991` <int>, `Nov 1991` <dbl>, `Dec 1991` <dbl>,
## #   `Jan 1992` <dbl>, `Feb 1992` <int>, `Mar 1992` <dbl>, `Apr
## #   1992` <dbl>, `May 1992` <dbl>, `Jun 1992` <dbl>, `Jul 1992` <dbl>,
## #   `Aug 1992` <dbl>, `Sep 1992` <dbl>, `Oct 1992` <dbl>, `Nov
## #   1992` <dbl>, `Dec 1992` <dbl>, `Jan 1993` <dbl>, `Feb 1993` <dbl>,
## #   `Mar 1993` <dbl>, `Apr 1993` <dbl>, `May 1993` <dbl>, `Jun
## #   1993` <dbl>, `Jul 1993` <dbl>, `Aug 1993` <dbl>, `Sep 1993` <dbl>,
## #   `Oct 1993` <dbl>, `Nov 1993` <dbl>, `Dec 1993` <dbl>, `Jan
## #   1994` <dbl>, `Feb 1994` <dbl>, `Mar 1994` <int>, `Apr 1994` <dbl>,
## #   `May 1994` <dbl>, `Jun 1994` <int>, `Jul 1994` <dbl>, `Aug
## #   1994` <dbl>, `Sep 1994` <dbl>, `Oct 1994` <dbl>, `Nov 1994` <dbl>,
## #   `Dec 1994` <dbl>, `Jan 1995` <dbl>, `Feb 1995` <dbl>, `Mar
## #   1995` <dbl>, `Apr 1995` <dbl>, `May 1995` <dbl>, `Jun 1995` <dbl>,
## #   `Jul 1995` <dbl>, `Aug 1995` <dbl>, `Sep 1995` <dbl>, `Oct
## #   1995` <dbl>, `Nov 1995` <dbl>, `Dec 1995` <dbl>, `Jan 1996` <dbl>,
## #   `Feb 1996` <dbl>, `Mar 1996` <dbl>, `Apr 1996` <dbl>, `May
## #   1996` <dbl>, `Jun 1996` <dbl>, `Jul 1996` <dbl>, `Aug 1996` <dbl>,
## #   `Sep 1996` <dbl>, `Oct 1996` <dbl>, `Nov 1996` <int>, `Dec
## #   1996` <dbl>, `Jan 1997` <dbl>, `Feb 1997` <dbl>, `Mar 1997` <int>,
## #   `Apr 1997` <dbl>, `May 1997` <dbl>, `Jun 1997` <dbl>, `Jul
## #   1997` <dbl>, `Aug 1997` <dbl>, `Sep 1997` <dbl>, `Oct 1997` <dbl>,
## #   `Nov 1997` <dbl>, `Dec 1997` <dbl>, `Jan 1998` <dbl>, `Feb
## #   1998` <dbl>, `Mar 1998` <dbl>, `Apr 1998` <dbl>, `May 1998` <dbl>,
## #   `Jun 1998` <dbl>, `Jul 1998` <dbl>, `Aug 1998` <dbl>, ...
## # A tibble: 1,296 × 3
##             `Series ID`     Date     Value
##                   <chr>    <chr>     <dbl>
## 1  LAUMT064186000000003 Jan 1990       3.5
## 2  LAUMT064186000000004 Jan 1990   70296.0
## 3  LAUMT064186000000005 Jan 1990 1952945.0
## 4  LAUMT064186000000006 Jan 1990 2023241.0
## 5  LAUMT064186000000003 Feb 1990       3.3
## 6  LAUMT064186000000004 Feb 1990   66517.0
## 7  LAUMT064186000000005 Feb 1990 1948964.0
## 8  LAUMT064186000000006 Feb 1990 2015481.0
## 9  LAUMT064186000000003 Mar 1990       3.2
## 10 LAUMT064186000000004 Mar 1990   63840.0
## # ... with 1,286 more rows
## Observations: 1,296
## Variables: 5
## $ series id <fctr> unemployment_rate, unemployment, employment, labor_...
## $ date      <dttm> 1990-01-01, 1990-01-01, 1990-01-01, 1990-01-01, 199...
## $ value     <dbl> 3.5, 70296.0, 1952945.0, 2023241.0, 3.3, 66517.0, 19...
## $ month     <ord> Jan, Jan, Jan, Jan, Feb, Feb, Feb, Feb, Mar, Mar, Ma...
## $ year      <fctr> 1990, 1990, 1990, 1990, 1990, 1990, 1990, 1990, 199...

Plot the Data

Employment Statistics

Labor Force Statistics

Unemployment Statistics

Unemployment Rate

Problem 2

The file assignment2_problem2_data_files.zip contains two files:

  • A geospatial polygon file of USA counties: gz_2010_us_050_00_20m.json

  • a tsv file with monthy employment statistics for all California counties for 2016: california_counties_monthly_employment_2016.tsv

2.1 - Choropleth Map

Create a leaflet choropleth map showing the unemployment rates (only, with legend) for all California counties for December 2016 in your HTML report, generated from RMarkdown

## OGR data source with driver: GeoJSON 
## Source: "gz_2010_us_050_00_20m.json", layer: "OGRGeoJSON"
## with 3221 features
## It has 6 fields
## Observations: 58
## Variables: 6
## $ geo_id     <fctr> 0500000US06005, 0500000US06021, 0500000US06033, 05...
## $ state      <fctr> 06, 06, 06, 06, 06, 06, 06, 06, 06, 06, 06, 06, 06...
## $ county     <fctr> 005, 021, 033, 043, 055, 089, 099, 115, 003, 007, ...
## $ name       <fctr> Amador, Glenn, Lake, Mariposa, Napa, Shasta, Stani...
## $ lsad       <fctr> County, County, County, County, County, County, Co...
## $ censusarea <dbl> 594.583, 1313.947, 1256.464, 1448.816, 748.362, 377...
## Observations: 696
## Variables: 11
## $ area_code       <chr> "CN0600100000000", "CN0600300000000", "CN06005...
## $ fips_state      <chr> "06", "06", "06", "06", "06", "06", "06", "06"...
## $ fips_county     <chr> "001", "003", "005", "007", "009", "011", "013...
## $ area_title      <chr> "Alameda County, CA", "Alpine County, CA", "Am...
## $ period          <date> 2016-01-01, 2016-01-01, 2016-01-01, 2016-01-0...
## $ labor_force     <dbl> 825565, 672, 14174, 101597, 20268, 10897, 5499...
## $ employed        <dbl> 789959, 636, 13237, 94191, 18985, 8462, 524952...
## $ unemployed      <dbl> 35606, 36, 937, 7406, 1283, 2435, 24957, 864, ...
## $ unemployed_rate <dbl> 4.3, 5.4, 6.6, 7.3, 6.3, 22.3, 4.5, 8.7, 5.4, ...
## $ fips            <chr> "06001", "06003", "06005", "06007", "06009", "...
## $ fips_ssamatab1  <chr> "06001", "06003", "06005", "06007", "06009", "...

2.2 - Shiny App

A Shiny app that allows the user to choose the month (in a dropdown) for which to display the above choropleth map:

See the UI and Server tabs for the respective code.

UI.R

library(shiny)
library(tidyverse)
library(forcats)
library(lubridate)
library(rgdal)
library(leaflet)
library(sp)
library(spdplyr)
library(htmltools)

#Read, Tidy, and Merge the Datasets

usa.counties.mapping <- readOGR(dsn = "gz_2010_us_050_00_20m.json")

colnames(usa.counties.mapping@data) <- fix_column_names(colnames(usa.counties.mapping@data))

cal.counties.mapping <- usa.counties.mapping[usa.counties.mapping@data$state == '06', ]

employment.stats <- read_tsv("california_counties_monthly_employment_2016.tsv", col_names = T) 
colnames(employment.stats) <- fix_column_names((colnames(employment.stats)))

employment.stats$period <- parse_date_time(employment.stats$period, "%Y-%m-%d")

employment.stats$fips_county<- factor(employment.stats$fips_county)

cal.merged <- sp::merge(cal.counties.mapping, 
                        employment.stats, by.x = "county", 
                        by.y = "fips_county",
                        duplicateGeoms = TRUE)

cal.merged$period <- month(cal.merged$period, label = TRUE)

#UI.R
navbarPage(title = "California Counties Employment - 2016", id = 'nav',
           tabPanel('Interactive Map',
                    
                    leafletOutput('county.map', 
                                  width = "800px", 
                                  height = "800px"),
                    
                    absolutePanel(id = "controls", 
                                  fixed = TRUE, 
                                  draggable = TRUE, 
                                  top = "auto", 
                                  left = 20, 
                                  right = "auto",
                                  bottom = 100,
                                  width = "auto", 
                                  height = "auto",
                                  selectInput(inputId = "statistic",
                                              label = 'Select Statistic',
                                              choices = c("employed",
                                                          "labor_force",
                                                          "unemployed", 
                                                          "unemployed_rate")
                                              ),
                                  
                                  selectInput(inputId = "months", 
                                              label = "Select Month", 
                                              choices = sort(cal.merged$period) 
                                              ),
                                  
                                  checkboxInput(inputId = "legend",
                                                label = "Show Legend",
                                                value = FALSE)
                                  ) #end of absolute panel
           ), # End of tab panel 1
           
           tabPanel('Data',
                    
                    dataTableOutput('data.table')
                    )# end of tab panel 2
) #end of navbar page
Server.R
#Server.R

function(input, output){

  #############################  
  ### Tab - Interactive Map ###

  by.month <- reactive({cal.merged[cal.merged$period == input$months, ]
  })
    
  by.stat <- reactive({
    switch(input$statistic,
             "employed" = by.month()$employed,
             "labor_force" = by.month()$labor_force,
             "unemployed" = by.month()$unemployed,
             "unemployed_rate" = by.month()$unemployed_rate
         )
  })
  
  my_colors <- reactive({
    colorNumeric(palette = "RdYlBu", reverse = TRUE, domain = by.stat())
  })
  
  output$county.map <- renderLeaflet({
    leaflet(data = cal.merged) %>%
      addProviderTiles(provider = providers$OpenStreetMap.France) %>%
      setMaxBounds(lng1 = -125, lat1 = 31, lng2 = -113, lat2 = 43) %>%
      setView(lng = -119.417931, lat = 36.778259, zoom = 6)
  }) #end of output$county.map
  
  observe({
    pal <- my_colors()
    
    add_labels <- paste(by.month()$name,':', by.stat())
    
    leafletProxy('county.map', data = by.month()) %>%
      clearShapes() %>%
      addPolygons(fillColor = ~pal(by.stat()),
                  weight = 2,
                  opacity = 1,
                  color = "white",
                  dashArray = "3",
                  fillOpacity = .7,
                  highlight = highlightOptions(weight = 2,
                                               fillColor = 'silver',
                                               color = 'blue',
                                               fillOpacity = 0.7,
                                               bringToFront = TRUE),
                  label = add_labels,
                  labelOptions = labelOptions(
                    style = list("font-weight" = "bold",
                                 padding = "3px 8px"),
                    textsize = "15px",
                    direction = "auto")
                  )
     
    })
  
  observe({
    proxy <- leafletProxy("county.map", data = by.month())

    proxy %>% clearControls()
    if (input$legend) {
    proxy %>% addLegend(position = "topright",
                        pal = my_colors(),
                        opacity = .7,
                        values = ~by.stat(),
                        title = paste(input$statistic,'for', input$months)
                        )
      }
    }) # End of Observe
  
##################  
### Tab - Data ###
  
  output$data.table <- renderDataTable(data.frame(by.month()[ , c('name', 
                                                                  'period', 
                                                                  input$statistic)]))
    } #end of function