STAT 451, Day 3

Let's do this!

BayCloud

  • You can access BayCloud directly, or through Blackboard.
  • Be careful to save your work!
  • You can access most software the University has licenses for.
  • BayCloud can be accessed from any platform (even your phone), but it requires internet access.

R Notebook

  • An R Notebook is an R Markdown document with chunks that can be executed independently and interactively, with output visible immediately beneath the input.
  • R Notebooks are a method of literate programming that allows for direct interaction with R while producing a reproducible document with publication-quality output.
  • Let's take a look at R_Notebook_Ex1.Rmd
    • Add a new chunk by clicking the Insert Chunk button on the toolbar or by pressing Ctrl+Alt+I.
    • When you save the notebook, an HTML file containing the code and output will be saved alongside it (click the Preview button or press Ctrl+Shift+K to preview the HTML file).
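
To make the structure of a notebook concrete, here is a minimal sketch that writes a small R Notebook file from R. The file name, title, and chunk contents are made up for illustration and are not taken from R_Notebook_Ex1.Rmd; the only essential piece is the `output: html_notebook` line in the header, which is what makes an R Markdown file behave as a notebook.

```r
# Three backticks, built with strrep() so this example can itself sit inside a chunk
fence <- strrep("`", 3)

# Text of a minimal notebook: YAML header, some narrative, and two independent chunks
notebook_lines <- c(
  "---",
  'title: "Example notebook"',   # illustrative title
  "output: html_notebook",       # this output format makes the file an R Notebook
  "---",
  "",
  "Some narrative text describing the analysis.",
  "",
  paste0(fence, "{r}"),          # first chunk: can be run on its own
  "summary(cars)",
  fence,
  "",
  paste0(fence, "{r}"),          # second chunk: output appears beneath it
  "plot(cars)",
  fence
)

# Write the notebook to a temporary folder (hypothetical file name)
notebook_path <- file.path(tempdir(), "example_notebook.Rmd")
writeLines(notebook_lines, notebook_path)
notebook_path
```

Opening the resulting file in RStudio should show two chunks that can each be run separately, with the output of each appearing directly beneath it.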

Visualizing Maps

Visualization is one of the best ways to explore and try to understand a large dataset.

# Load the packages used in the map examples
loadstuff <- c("ggplot2", "devtools", "dplyr", "stringr", "maps", "mapdata")
lapply(loadstuff, require, character.only = TRUE)
states <- map_data("state")  # state boundaries as a data frame of polygon vertices
ggplot(data = states) +
  geom_polygon(aes(x = long, y = lat, fill = region, group = group), color = "white") +
  coord_fixed(1.3) +
  guides(fill = "none")  # do this to leave off the color legend

[Figure: map of the contiguous U.S. states, each state filled with its own color and outlined in white]
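
To see why the plot maps both `fill = region` and `group = group`, it can help to look at the data frame that `map_data("state")` returns. The following is a small sketch for exploration only; the variable names `n_groups`, `n_regions`, and `pieces` are introduced here for illustration.

```r
library(ggplot2)  # map_data() also needs the maps package to be installed

states <- map_data("state")

# One row per boundary vertex: long, lat, group, order, region, subregion
str(states)

# Count distinct polygons (group) and distinct regions;
# a region made of several pieces, such as one with islands, has more than one group
n_groups  <- length(unique(states$group))
n_regions <- length(unique(states$region))
c(groups = n_groups, regions = n_regions)

# List the regions that are drawn from more than one polygon
pieces <- tapply(states$group, states$region, function(g) length(unique(g)))
pieces[pieces > 1]
```

Without `group = group`, geom_polygon() would connect the vertices of separate pieces as if they were one outline, which typically shows up as stray lines across the map.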

Maps

[Figure: map plot]

Maps Cont.

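# viridis supplies the color scale; mapproj is needed by coord_map()
library(viridis); library(mapproj)
# Load the prepared data; the plot below expects map.county_unemp and map.states
url.unemploy_map <- url("http://sharpsightlabs.com/wp-content/datasets/unemployment_map_data_2016_nov.RData")
load(url.unemploy_map)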
ggplot() +
  geom_polygon(data = map.county_unemp, aes(x = long, y = lat, group = group, fill = unemployed_rate)) +
  geom_polygon(data = map.states, aes(x = long, y = lat, group = group), color = "#EEEEEE", fill = NA, size = .3) +
  coord_map("albers", lat0 = 30, lat1 = 40) +
  labs(title = "U.S. unemployment rate, by county" , subtitle = "November, 2016") +
  labs(fill = "% unemployed") +
  scale_fill_viridis() +
  theme(text = element_text(family = "Gill Sans", color = "#444444")
        ,plot.title = element_text(size = 30)
        ,plot.subtitle = element_text(size = 20)
        ,axis.text = element_blank()
        ,axis.title = element_blank()
        ,axis.ticks = element_blank()
        ,panel.grid = element_blank()
        ,legend.position = c(.9,.4)
        ,legend.title = element_text(size = 16)
        ,legend.background = element_blank()
        ,panel.background = element_blank()
        )

[Figure: U.S. unemployment rate, by county, November 2016]
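
The county map above relies on a prepared data file. As a rough sketch of the same idea (join a variable onto the map polygons, then fill by that variable), the example below uses the built-in USArrests dataset as a stand-in. That dataset, the choice of the Murder column, and the names `arrests`, `choro`, and `p` are assumptions made for illustration and are not part of the course material. It uses coord_fixed() rather than coord_map() to keep the dependencies small.

```r
library(ggplot2)
library(dplyr)

# State polygons (needs the maps package installed)
states <- map_data("state")

# USArrests stores state names as row names; lower-case them to match states$region
arrests <- data.frame(region = tolower(rownames(USArrests)), USArrests)

# left_join() keeps the vertex order of the polygons intact
choro <- left_join(states, arrests, by = "region")

p <- ggplot(choro, aes(x = long, y = lat, group = group, fill = Murder)) +
  geom_polygon(color = "white") +
  coord_fixed(1.3) +
  scale_fill_viridis_c(na.value = "grey80") +  # regions with no data are drawn in grey
  labs(title = "Murder arrests per 100,000 residents, by state", fill = "Rate") +
  theme_void()

print(p)
```

Regions in the map that have no matching row in the stand-in data get NA after the join, which is why the fill scale sets an explicit `na.value`.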

Unemployment

[Figure: unemployment plot]

Brief History of Motivation

  • Modern data visualization started as a means to visually justify quantitative analyses.
  • It has evolved to communicate ideas and tell stories.
  • Finding patterns in real data can help with real-world problems.
    • e.g., traffic patterns in the Bay Area
  • This class is about “learning by doing.”

Telling Stories

  • Stories can be made much more compelling by visualizing the data behind them.
  • When analyzing data, it's easy to fall into the trap of treating it as “just a bunch of numbers.”
    • Data is a representation of real life; these numbers have meaning.
  • Visualizing data can help tell stories that are informative (as in newspapers and blogs), entertaining (as in art and fun observations), and compelling enough to convince people to take action.