This eCOTS 2018 virtual poster (video) assumes you know
If you’re new to R, see ModernDive Chapter 2, “Getting Started with Data”.
The decision of whether to use R can be viewed in terms of a ratio:
\[ \frac{\mbox{Payoffs of using R}}{\mbox{Costs of learning R}} \]
This ratio has increased lately for many reasons, in particular DataCamp (free academic licence).
Our proposal to increase the ratio: provide data that is
In other words, “taming” sets a balance between data
| As it exists “in the wild” | Completely safe |
|---|---|
We propose the following “tame” data principles with novices in mind. All data should have
The fivethirtyeight R package:
Two data visualizations via:
ggformula package
Code below available at bit.ly/ecots_2018
# Load fivethirtyeight and other needed packages
library(fivethirtyeight)
library(dplyr)
library(ggformula)
# Ex 1: US Births ---------------------------------------------------
View(US_births_1994_2003)
?US_births_1994_2003
# Use filter command from dplyr package for data wrangling
US_births_1999 <- US_births_1994_2003 %>%
  filter(year == 1999)
View(US_births_1999)
# Plot time series via base R:
plot(x = US_births_1999$date, y = US_births_1999$births, type = "l")
# Ex 2: Hate crimes -------------------------------------------------
View(hate_crimes)
?hate_crimes
# Create scatterplot & regression line via ggformula package
gf_point(hate_crimes_per_100k_splc ~ share_vote_trump, data = hate_crimes) %>%
  gf_lm()
# Ex 3: Campaign stops of last 10 weeks of 2016 US election ---------
View(pres_2016_trail)
?pres_2016_trail
# Create map of Clinton vs Trump campaign stops via ggplot2 package, using
# US state outlines from the maps package (retrieved with ggplot2's map_data())
library(ggplot2)
library(maps)
# Note: coord_map() also needs the mapproj package to be installed
ggplot(data = pres_2016_trail, aes(x = lng, y = lat)) +
  facet_wrap(~candidate) +
  geom_point(col = "black", size = 3) +
  coord_map() +
  geom_path(data = map_data("state"), aes(x = long, y = lat, group = group), size = 0.1)

fivethirtyeight in TISE article (HTML, PDF)
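
Ex 1 above draws the 1999 births time series with base R, while the poster also names the ggformula package as a route to visualization. As a rough sketch, not part of the original script, the same series could be drawn through ggformula's formula interface. It assumes the fivethirtyeight, dplyr, and ggformula packages are installed; the object name `births_plot` and the axis labels are illustrative choices.

```r
# Sketch only: a ggformula variant of Ex 1 (not from the original poster code)
library(fivethirtyeight)
library(dplyr)
library(ggformula)

# Same wrangling step as Ex 1: keep only the rows for 1999
US_births_1999 <- US_births_1994_2003 %>%
  filter(year == 1999)

# Line plot of daily births using the formula interface: y ~ x
# (births_plot and the axis labels are illustrative, not from the source)
births_plot <- gf_line(births ~ date, data = US_births_1999) %>%
  gf_labs(x = "Date", y = "Number of births")

# Display the plot
births_plot
```

The formula `births ~ date` mirrors the `y ~ x` pattern used with `gf_point()` in Ex 2, so novices meet one consistent syntax across plot types.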