Introduction

Welcome to the famous kids TV show … THE POWER RANGERS

Questions

What type of variation occurs within my variables?

What type of covariation occurs between my variables?

Variation

Variation is the tendency of the values of a variable to change from measurement to measurement. In the bar chart below, the number of episodes changes quite a bit from season to season!
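Variation can also be put into numbers before plotting anything. The sketch below uses a small made-up data frame called `toy` (the seasons and ratings are invented for illustration and are not taken from `powerrangers`) and base R functions to summarise how spread out the ratings are.

```r
# Hypothetical toy data: invented values, NOT the real powerrangers data
toy <- data.frame(
  season      = c("A", "A", "A", "B", "B", "B"),
  imdb.rating = c(6, 7, 8, 5, 7, 9)
)

# Three common summaries of variation in a numeric variable
rating_range <- range(toy$imdb.rating)  # smallest and largest rating
rating_sd    <- sd(toy$imdb.rating)     # standard deviation
rating_iqr   <- IQR(toy$imdb.rating)    # interquartile range (Q3 - Q1)

print(rating_range)
print(rating_sd)
print(rating_iqr)
```

The same three calls should work on `powerrangers$imdb.rating`, with `na.rm = TRUE` added to each if the column contains missing values.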

Visualizing distributions

ggplot(data = powerrangers) +
    geom_bar(mapping = aes(x = season))

powerrangers %>% count(season)
## # A tibble: 28 × 2
##    season                                  n
##    <chr>                               <int>
##  1 "Be the first one to add a plot.\""     1
##  2 "Beast Morphers (Season 1)"            22
##  3 "Beast Morphers (Season 2)"            22
##  4 "Dino Charge"                          22
##  5 "Dino Super Charge"                    22
##  6 "Dino Thunder"                         38
##  7 "In Space"                             43
##  8 "Jungle Fury"                          32
##  9 "Lightspeed Rescue"                    40
## 10 "Lost Galaxy"                          45
## # ℹ 18 more rows
ggplot(data = powerrangers) +
    geom_histogram(mapping = aes(x = imdb.rating))
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 1 row containing non-finite outside the scale range
## (`stat_bin()`).
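The `stat_bin()` message above asks for an explicit `binwidth`. To get a feel for what a bin width does, the sketch below bins a few made-up ratings by hand with base R's `cut()` and `table()`. The `ratings` vector and the 0.5 width are assumptions chosen for illustration, and ggplot2 places its bin edges by its own rules, so its counts need not match these exactly.

```r
# Hypothetical ratings, invented for illustration
ratings <- c(5.9, 6.1, 6.4, 7.0, 7.2, 7.3, 8.1)

# Bin edges every 0.5 rating points, i.e. a bin width of 0.5
breaks <- seq(5.5, 8.5, by = 0.5)

# cut() assigns each rating to an interval (open on the left, closed on the right)
bins <- cut(ratings, breaks = breaks)

# table() counts how many ratings fall in each interval
bin_counts <- table(bins)
print(bin_counts)
```

A smaller `by` value gives more, narrower bins and a spikier picture; a larger one smooths the shape but hides detail.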

ggplot(data = powerrangers, mapping = aes(x = imdb.rating, colour = season)) +
    geom_freqpoly()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 1 row containing non-finite outside the scale range
## (`stat_bin()`).

Typical values

ggplot(data = powerrangers, mapping = aes(x = imdb.rating)) +
  geom_histogram(binwidth = 0.01)
## Warning: Removed 1 row containing non-finite outside the scale range
## (`stat_bin()`).

Unusual values

ggplot(powerrangers) + 
  geom_histogram(mapping = aes(x = imdb.rating), binwidth = 0.5)
## Warning: Removed 1 row containing non-finite outside the scale range
## (`stat_bin()`).
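Unusual values can be hard to spot in a histogram when there are only a few of them. One common rule of thumb (an assumption here, not something the analysis above applies) flags values that lie more than 1.5 times the interquartile range outside the quartiles. The ratings below are invented for illustration.

```r
# Hypothetical ratings with one obviously low value
ratings <- c(6.8, 7.0, 7.1, 7.2, 7.3, 7.4, 7.5, 3.0)

# Quartiles and the 1.5 * IQR fence width
q     <- unname(quantile(ratings, c(0.25, 0.75)))
fence <- 1.5 * IQR(ratings)

# Keep only the values outside [Q1 - fence, Q3 + fence]
unusual <- ratings[ratings < q[1] - fence | ratings > q[2] + fence]
print(unusual)
```

Values flagged this way are candidates for a closer look, not automatically errors.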

Missing Values

ggplot(data = powerrangers, mapping = aes(x = imdb.rating, y = season)) + 
  geom_point()
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
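The warning says one row was dropped because its rating is missing. The `count()` output earlier shows a single-row "season" whose label looks like stray page text, which may be that row, though this is a guess. The sketch below shows how such rows could be found and set aside, using an invented `toy` data frame rather than the real data.

```r
# Hypothetical toy data: the last row imitates a junk row with no rating
toy <- data.frame(
  season      = c("A", "A", "B", "not a season"),
  imdb.rating = c(7.1, 6.8, 7.4, NA)
)

# How many ratings are missing?
n_missing <- sum(is.na(toy$imdb.rating))

# Inspect the offending rows before deciding what to do with them
missing_rows <- toy[is.na(toy$imdb.rating), ]
print(missing_rows)

# Keep only rows with a rating
complete <- toy[!is.na(toy$imdb.rating), ]
print(nrow(complete))
```

Looking at the missing rows first makes it easier to tell a genuine gap from a data-collection artifact.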

Covariation

A categorical and continuous variable

ggplot(data = powerrangers, mapping = aes(x = imdb.rating)) + 
  geom_freqpoly(mapping = aes(colour = season), binwidth = 0.5)
## Warning: Removed 1 row containing non-finite outside the scale range
## (`stat_bin()`).
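With many seasons, overlapping frequency polygons can be hard to read. A per-group summary is one way to compare a continuous variable across categories. The sketch below computes the median rating per season with base R's `aggregate()`, again on an invented `toy` data frame.

```r
# Hypothetical toy data: invented values, NOT the real powerrangers data
toy <- data.frame(
  season      = c("A", "A", "A", "B", "B", "B"),
  imdb.rating = c(6, 7, 8, 7, 8, 9)
)

# Median rating within each season (rows with a missing rating are dropped by default)
by_season <- aggregate(imdb.rating ~ season, data = toy, FUN = median)
print(by_season)
```

Swapping `median` for `mean` or `sd` gives other per-season summaries in the same shape.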

Two categorical variables

ggplot(data = powerrangers) +
  geom_count(mapping = aes(x = imdb.rating, y = season))
## Warning: Removed 1 row containing non-finite outside the scale range
## (`stat_sum()`).

Two continuous variables

ggplot(data = powerrangers) +
  geom_point(mapping = aes(x = imdb.rating, y = season)) +
  coord_cartesian(xlim = c(4, 11), ylim = c(4, 11))
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).

Patterns and models

ggplot(data = powerrangers) + 
  geom_point(mapping = aes(x = imdb.rating, y = season))
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).