Import data

# read_csv() and ggplot() both come from the tidyverse
library(tidyverse)

data <- read_csv("../00_data/myData.csv")
## New names:
## • `` -> `...1`
## Rows: 236 Columns: 21
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (3): TEAM, F4PERCENT, CHAMPPERCENT
## dbl (18): ...1, TEAMID, PAKE, PAKERANK, PASE, PASERANK, GAMES, W, L, WINPERC...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
data
## # A tibble: 236 × 21
##     ...1 TEAMID TEAM   PAKE PAKERANK  PASE PASERANK GAMES     W     L WINPERCENT
##    <dbl>  <dbl> <chr> <dbl>    <dbl> <dbl>    <dbl> <dbl> <dbl> <dbl>      <dbl>
##  1     1      1 Abil…   0.7       45   0.7       52     3     1     2      0.333
##  2     2      2 Akron  -0.9      179  -1.1      187     4     0     4      0    
##  3     3      3 Alab…  -2.1      211  -2.9      220    10     5     5      0.5  
##  4     4      4 Alba…  -0.4      147  -0.3      138     3     0     3      0    
##  5     5      6 Amer…  -0.5      160  -0.4      150     3     0     3      0    
##  6     6      8 Ariz…  -1.7      206  -2.5      216    28    17    11      0.607
##  7     7      9 Ariz…  -2        209  -1.9      206     5     1     4      0.2  
##  8     8     10 Arka…   4.3       11   3.5       16    18    11     7      0.611
##  9     9     11 Arka…   0         76   0         78     1     0     1      0    
## 10    10     12 Aubu…   0.6       53   1.4       30    11     7     4      0.636
## # ℹ 226 more rows
## # ℹ 10 more variables: R64 <dbl>, R32 <dbl>, S16 <dbl>, E8 <dbl>, F4 <dbl>,
## #   F2 <dbl>, CHAMP <dbl>, TOP2 <dbl>, F4PERCENT <chr>, CHAMPPERCENT <chr>

Introduction

Questions

Variation

ggplot(data = data) +
    geom_bar(mapping = aes(x = CHAMPPERCENT))
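CHAMPPERCENT was read as a character column (see the column specification above), so geom_bar() draws one bar per distinct string. If the values are percent strings, they could be converted to numbers first. The sketch below is only an illustration: the strings in champ_percent are hypothetical and are not taken from myData.csv, whose actual format is not shown here.

```r
library(readr)

# Hypothetical strings, NOT taken from myData.csv; the real format may differ
champ_percent <- c("5%", "12.5%", "0%")

# parse_number() drops the non-numeric characters and returns doubles
champ_numeric <- parse_number(champ_percent)
champ_numeric
```

If the real column follows this pattern, the same call inside mutate() would let the variable be plotted with geom_histogram() instead of geom_bar().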

Visualizing distributions

ggplot(data = data) + 
  geom_histogram(aes(x =  WINPERCENT))
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
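The message above asks for an explicit bin width. A minimal sketch, assuming a width of 0.1 suits a proportion that runs from 0 to 1; the data frame sample_rows is a stand-in that holds only the ten WINPERCENT values printed in the tibble preview, not the full 236 rows.

```r
library(ggplot2)

# WINPERCENT values copied from the ten rows printed above (not the full data)
sample_rows <- data.frame(
  WINPERCENT = c(0.333, 0, 0.5, 0, 0, 0.607, 0.2, 0.611, 0, 0.636)
)

# binwidth = 0.1 is an assumed choice; setting it silences the stat_bin() message
p <- ggplot(data = sample_rows) +
  geom_histogram(mapping = aes(x = WINPERCENT), binwidth = 0.1)

p
```

With the real data, the same binwidth argument can be added to the geom_histogram() call above; other widths may show the shape better.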

Typical values

0.0

Unusual values

0.8

Missing Values
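The 0.0 and 0.8 noted above can be checked by counting WINPERCENT in bins and by flagging high values. This is a sketch under two assumptions: 0.8 is read as a cutoff for unusual values, and sample_rows again holds only the ten values from the tibble preview rather than the full data.

```r
library(dplyr)
library(ggplot2)

# WINPERCENT values copied from the ten rows printed above (not the full data)
sample_rows <- data.frame(
  WINPERCENT = c(0.333, 0, 0.5, 0, 0, 0.607, 0.2, 0.611, 0, 0.636)
)

# Typical values: count observations per bin of width 0.1
bin_counts <- sample_rows %>%
  count(bin = cut_width(WINPERCENT, 0.1))
bin_counts

# Unusual and missing values: replace anything at or above the assumed
# 0.8 cutoff with NA instead of dropping the whole row
flagged <- sample_rows %>%
  mutate(WINPERCENT = ifelse(WINPERCENT >= 0.8, NA, WINPERCENT))
sum(is.na(flagged$WINPERCENT))
```

In these ten rows the bin around 0 is the most common and no value reaches 0.8, so nothing is replaced; the full data may differ.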

Covariation

A categorical and continuous variable

Two categorical variables

Two continuous variables

Patterns and models