Import data

library(tidyverse)  # provides ggplot2, dplyr::count(), and the %>% pipe used below
salary <- read.csv("../00_data/Salaries.csv")

Introduction

Questions

Variation

Visualizing distributions

ggplot(data = salary) +
  geom_bar(mapping = aes(x = rank))

salary %>% count(rank)
##        rank   n
## 1 AssocProf  64
## 2  AsstProf  67
## 3      Prof 266

ggplot(data = salary) +
  geom_histogram(mapping = aes(x = salary))
## `stat_bin()` using `bins = 30`. Pick better value `binwidth`.
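
The message above asks for an explicit `binwidth`. The following is a minimal sketch of one way to set it. It runs on a small made-up data frame, `salary_demo`, which is a hypothetical stand-in for the real file, and the width of 10000 is an arbitrary illustrative choice rather than a recommended value.

```r
library(ggplot2)

# Hypothetical stand-in for the Salaries data; the values are made up.
salary_demo <- data.frame(
  rank   = c("AsstProf", "AsstProf", "AssocProf", "AssocProf", "Prof", "Prof", "Prof"),
  salary = c(78000, 82000, 91000, 97000, 115000, 134000, 151000)
)

# binwidth is expressed in the units of the x variable (salary units here).
# Setting it explicitly replaces the default of 30 bins.
p_hist <- ggplot(data = salary_demo) +
  geom_histogram(mapping = aes(x = salary), binwidth = 10000)
p_hist
```

Trying several widths is usually worthwhile, since different widths can reveal different patterns in the same variable.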

ggplot(data = salary, mapping = aes(x = salary, colour = rank)) +
  geom_freqpoly()
## `stat_bin()` using `bins = 30`. Pick better value `binwidth`.

Typical values

Unusual values
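
One common convention for flagging unusual values is the 1.5 x IQR rule, which is also what box plots use to draw outlying points. The sketch below applies it in base R to a made-up data frame, `salary_demo` (hypothetical, not the real Salaries file); whether this rule suits the actual data is an assumption to check against the histogram.

```r
# Hypothetical stand-in for the Salaries data; the last value is an
# intentionally extreme made-up salary.
salary_demo <- data.frame(
  rank   = c("AsstProf", "AsstProf", "AssocProf", "AssocProf", "Prof", "Prof", "Prof"),
  salary = c(78000, 82000, 91000, 97000, 115000, 134000, 400000)
)

# Quartiles and interquartile range of salary.
q <- quantile(salary_demo$salary, probs = c(0.25, 0.75))
iqr <- q[[2]] - q[[1]]

# Fences: values beyond 1.5 * IQR from the quartiles are flagged.
lower_fence <- q[[1]] - 1.5 * iqr
upper_fence <- q[[2]] + 1.5 * iqr
is_unusual <- salary_demo$salary < lower_fence | salary_demo$salary > upper_fence

# Rows flagged as unusual.
salary_demo[is_unusual, ]
```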

Missing values
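
A quick first check is to count the missing values in each column. The sketch below does this in base R on a made-up data frame, `salary_demo`, with one `NA` inserted on purpose; it is hypothetical and says nothing about whether the real Salaries file contains missing values.

```r
# Hypothetical stand-in for the Salaries data; one salary is deliberately NA.
salary_demo <- data.frame(
  rank   = c("AsstProf", "AssocProf", "Prof", "Prof"),
  salary = c(78000, NA, 115000, 134000)
)

# Number of missing values in each column.
na_counts <- colSums(is.na(salary_demo))
na_counts

# Summaries return NA unless missing values are dropped with na.rm = TRUE.
mean_salary <- mean(salary_demo$salary, na.rm = TRUE)
mean_salary
```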

Covariation

A categorical and continuous variable
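
A box plot is one common way to compare a continuous variable across the levels of a categorical one. The sketch below plots `salary` by `rank`, the two columns used earlier, on a made-up data frame, `salary_demo` (hypothetical values, not the real file).

```r
library(ggplot2)

# Hypothetical stand-in for the Salaries data; the values are made up.
salary_demo <- data.frame(
  rank   = c("AsstProf", "AsstProf", "AssocProf", "AssocProf", "Prof", "Prof", "Prof"),
  salary = c(78000, 82000, 91000, 97000, 115000, 134000, 151000)
)

# One box per rank, showing the median and quartiles of salary in that group.
p_box <- ggplot(data = salary_demo, mapping = aes(x = rank, y = salary)) +
  geom_boxplot()
p_box
```

Because `rank` is stored as text here, the boxes appear in alphabetical order; converting it to a factor with explicit levels would control the order.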

Two categorical variables

Two continuous variables

Patterns and models