Import data
library(tidyverse)  # provides ggplot2, dplyr::count(), and the %>% pipe
salary <- read.csv("../00_data/Salaries.csv")
Introduction
Questions
Variation
Visualizing distributions
ggplot(data = salary) +
  geom_bar(mapping = aes(x = rank))

salary %>% count(rank)
##        rank   n
## 1 AssocProf  64
## 2  AsstProf  67
## 3      Prof 266
ggplot(data = salary) +
  geom_histogram(mapping = aes(x = salary))
## `stat_bin()` using `bins = 30`. Pick better value `binwidth`.

ggplot(data = salary, mapping = aes(x = salary, colour = rank)) +
  geom_freqpoly()
## `stat_bin()` using `bins = 30`. Pick better value `binwidth`.

Typical values
Unusual values
Missing values
Covariation
A categorical and continuous variable
Two categorical variables
Two continuous variables
Patterns and models