Ben Bellman
August 28, 2018
Base functions for graphics are ugly and outdated
However, they can be useful in a pinch, especially plot()
plot.histogram() vs. plot.xy()
Give plot() a try!
Standard function and syntax for any plot, regardless of style
Uses logic similar to dplyr and piping
Full suite of functions for customizing graphs
Clean and professional visual styles
Lots of accessory packages, like ggmap and gganimate
Initialize a plot with the ggplot() function
aes() arguments:
aes() settings
Layers are added with the + operator, like %>% in dplyr
Argument | Property |
---|---|
x | x values |
y | y values |
col | color of line/point |
fill | color of area |
alpha | transparency |
size | size of point/line |
shape | point symbol |
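As a minimal sketch of how these arguments behave, the example below contrasts mapping an aesthetic inside aes() with setting it to a fixed value outside aes(). It uses the built-in mtcars data as a stand-in, which is an assumption for illustration and not the salaries data used in these slides.

```r
library(ggplot2)

# mtcars ships with R; it is only a stand-in for illustration

# Mapping: col is inside aes(), so color varies with the cyl column
p_mapped <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(aes(col = factor(cyl)))

# Setting: col, alpha, and size are outside aes(), so they are constants
p_fixed <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(col = "steelblue", alpha = 0.6, size = 3)

# Layers are added with +, so a saved plot can be extended later
p_extended <- p_fixed + geom_smooth(method = "lm")
```

Printing any of the three objects draws the plot. A mapped aesthetic gets a legend automatically; a fixed setting does not.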
We'll start with the White House salaries data from the dplyr slides
library(tidyverse)
library(here)
salaries <- read_csv(here("data","white-house-salaries.csv"))
# create the tenure variable
salaries <- salaries %>%
  rename(name = employee_name) %>%
  arrange(name) %>%
  group_by(name) %>%
  mutate(tenure = rank(year))
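To see what the grouped rank() is doing, here is a small sketch on made-up data. The toy table and its values are hypothetical; only the group_by() and mutate() pattern mirrors the code above.

```r
library(dplyr)

# Hypothetical toy data, not the real salaries file
toy <- tibble(
  name = c("A", "A", "A", "B", "B"),
  year = c(2003, 2001, 2002, 2010, 2009)
)

# rank() runs separately within each name, so the earliest year
# for each person gets tenure 1, the next year 2, and so on
toy_tenure <- toy %>%
  group_by(name) %>%
  mutate(tenure = rank(year)) %>%
  ungroup()
```

One thing to watch: rank() averages ties by default, so a person listed twice in the same year would get a fractional tenure value.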
Once we've initialized with ggplot(), we add a geometry feature to generate a basic graph
ggplot(salaries, aes(x = year, y = salary)) +
  geom_point()
When using a new plot type, I like to look up its page on the package website
Let's take a look at the geom_point() documentation and see some ways it can be used:
ggplot(salaries, aes(x = year, y = salary)) +
  geom_point(alpha = .1) +
  geom_smooth(aes(col = gender, fill = gender), method = "loess")
There are functions for changing plot text, axis formats, and specifying colors and legends
ggplot(salaries, aes(x = year, y = salary)) +
  geom_point(alpha = .05) + # alpha sets transparency
  geom_smooth(aes(col = gender), method = "loess") + # color by gender
  labs(x = "Year", # change labels
       y = "Employee Salary for Year ($)",
       title = "White House Salaries Since 2001") +
  scale_color_manual(name = "Gender", # set legend attributes
                     values = c("green", "blue"),
                     breaks = c("female", "male"),
                     labels = c("Women", "Men"))
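Axis formats work the same way as labels and legends. The sketch below formats a y axis as dollars using label_dollar() from the scales package (installed alongside ggplot2). The toy data frame is hypothetical and only stands in for the salaries table.

```r
library(ggplot2)

# Hypothetical toy data standing in for the salaries table
toy <- data.frame(
  year   = c(2001, 2002, 2003, 2004),
  salary = c(45000, 52000, 61000, 75000)
)

p_dollar <- ggplot(toy, aes(x = year, y = salary)) +
  geom_point() +
  scale_y_continuous(labels = scales::label_dollar()) + # y axis ticks as dollars
  labs(x = "Year", y = "Employee Salary for Year")
```

With the axis ticks formatted this way, the "($)" in the axis title becomes optional.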
Other functions change the overall look of the grid
ggplot(salaries, aes(x = year, y = salary)) +
  geom_point(alpha = .05) +
  geom_smooth(aes(col = gender), method = "loess") +
  theme_minimal()
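Complete themes like theme_minimal() can be combined with individual theme() overrides. A minimal sketch, again using the built-in mtcars data as an assumed stand-in:

```r
library(ggplot2)

p_theme <- ggplot(mtcars, aes(x = wt, y = mpg, col = factor(cyl))) +
  geom_point() +
  theme_minimal() +                          # complete theme first
  theme(legend.position = "bottom",          # then individual overrides
        panel.grid.minor = element_blank())  # drop minor grid lines
```

Order matters here: a complete theme added after theme() would replace the overrides, so the theme() call goes last.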
scale_x_continuous(), scale_y_discrete(), scale_x_log10(), scale_y_reverse(), etc.
xlim() and ylim()
scale_color_discrete() and scale_color_manual()
scale_color_brewer()
scale_size_manual(), scale_shape_manual(), scale_alpha_manual(), etc.
geom_abline(), geom_vline(), geom_hline()
guide_colorbar() and guide_legend()
facet_grid() and facet_wrap()
theme() options
Create a histogram of salaries
salaries %>%
  ggplot(aes(x = salary)) +
  geom_histogram()
salaries %>%
  ggplot(aes(x = salary, fill = president)) +
  geom_histogram(binwidth = 5000)
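Without a bins or binwidth argument, geom_histogram() falls back to 30 bins and prints a message suggesting a better value. The sketch below shows both ways of controlling the bins. The salary values are randomly generated for illustration and are not real data.

```r
library(ggplot2)

set.seed(1)
# Hypothetical salaries drawn at random, for illustration only
toy <- data.frame(salary = round(runif(200, min = 30000, max = 180000)))

# Option 1: fix the number of bins
p_bins <- ggplot(toy, aes(x = salary)) +
  geom_histogram(bins = 15)

# Option 2: fix the width of each bin, in data units (dollars here)
p_width <- ggplot(toy, aes(x = salary)) +
  geom_histogram(binwidth = 10000)

# layer_data() returns the computed bins, one row per bar
head(layer_data(p_width)[, c("xmin", "xmax", "count")])
```

binwidth is usually easier to explain to readers because it is in the units of the variable.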
Look at salaries for 1-5 years of tenure
salaries %>%
  filter(tenure >= 1 & tenure <= 5) %>%
  ggplot(aes(tenure, salary, group = tenure)) +
  geom_boxplot(aes(fill = as.factor(tenure))) +
  scale_fill_brewer(palette = "Accent")
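The as.factor() call matters because tenure is numeric, and scale_fill_brewer() is a discrete scale. A possible variation, sketched on hypothetical random data, converts tenure on the x axis as well, which makes the axis discrete and removes the need for a group argument:

```r
library(ggplot2)

set.seed(1)
# Hypothetical toy data: tenure is numeric, salary is random
toy <- data.frame(
  tenure = rep(1:3, each = 20),
  salary = c(rnorm(20, 50000, 5000),
             rnorm(20, 60000, 5000),
             rnorm(20, 70000, 5000))
)

p_box <- ggplot(toy, aes(x = as.factor(tenure), y = salary)) +
  geom_boxplot(aes(fill = as.factor(tenure))) +  # discrete fill, one color per box
  scale_fill_brewer(palette = "Accent", name = "Tenure") +
  labs(x = "Tenure (years)")
```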
Look at salaries across all tenure values with a violin plot
salaries %>%
  ggplot(aes(tenure, salary, group = as.factor(tenure))) +
  geom_violin(aes(col = as.factor(tenure),
                  fill = as.factor(tenure))) +
  theme(legend.position = "none") +
  coord_flip()
Look at salary densities by gender for each president
salaries %>%
  ggplot(aes(salary, col = gender, fill = gender)) +
  geom_density(alpha = 0.4) +
  facet_grid(president ~ .)
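For comparison, here is a rough sketch of facet_grid() next to facet_wrap(), using the built-in mtcars data as an assumed stand-in, with cyl playing the role of president:

```r
library(ggplot2)

# facet_grid(rows ~ cols): one row of panels per cyl value
p_grid <- ggplot(mtcars, aes(x = mpg)) +
  geom_density(alpha = 0.4, fill = "grey60") +
  facet_grid(cyl ~ .)

# facet_wrap(): panels fill a grid in reading order, here 2 columns wide
p_wrap <- ggplot(mtcars, aes(x = mpg)) +
  geom_density(alpha = 0.4, fill = "grey60") +
  facet_wrap(~ cyl, ncol = 2)
```

facet_grid() suits one or two variables laid out as rows and columns; facet_wrap() is handy when a single variable has many levels.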
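ggsave()
here() only builds the file path; the results folder should exist in the project before saving
density_plot <- salaries %>%
  ggplot(aes(salary, col = gender, fill = gender)) +
  geom_density(alpha = 0.4) +
  facet_grid(president ~ .)
# call ggsave() on its own; adding it with + errors in recent ggplot2 versions
ggsave(here("results", "test_plot.pdf"), plot = density_plot, width = 5, height = 4)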
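A minimal sketch of a save step that does not depend on the output folder already existing. It writes to a temporary directory and uses the built-in mtcars data; both are assumptions for illustration, and in a project the path would come from here().

```r
library(ggplot2)

# Hypothetical output folder inside the session's temp directory
out_dir <- file.path(tempdir(), "results")
dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)

# Store the plot in an object so it can be passed to ggsave() explicitly
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

out_file <- file.path(out_dir, "test_plot.pdf")
ggsave(out_file, plot = p, width = 5, height = 4) # width and height in inches
```

Passing plot = explicitly avoids relying on ggsave()'s default of saving whichever plot was displayed last.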