Week 8: Apply it to your data 7

Import your data

library(tidyverse)  # provides %>%, select(), slice(), pivot_*(), separate(), unite()

results <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-09-07/results.csv')

## Warning: One or more parsing issues, see `problems()` for details

## Rows: 25220 Columns: 18
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (8): position, positionText, time, milliseconds, fastestLap, rank, fast...
## dbl (10): resultId, raceId, driverId, constructorId, number, grid, positionO...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
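
Eight columns come in as character because the file marks missing values with `\N` (it prints as "\\N" in the tibbles further down). A minimal sketch of one way to handle that, assuming it is acceptable to treat `\N` as `NA`: pass it through the `na` argument of `read_csv()`. The two-row `demo_csv` below is made-up illustrative text in the same shape as the file, not the real data.

```r
library(readr)

# Hypothetical two-row sample in the same shape as results.csv (not the real file).
# "\\N" in R source is the two characters \N, which is what the CSV contains.
demo_csv <- I("position,laps\n1,58\n\\N,47\n")

# Treat \N as missing, alongside readr's default NA strings
demo <- read_csv(demo_csv, na = c("", "NA", "\\N"), show_col_types = FALSE)

demo
```

With `\N` read as `NA`, `position` is guessed as a numeric column instead of character. The same `na` argument could be added to the `read_csv()` call above, but the outputs in the rest of this report were produced without it.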

skimr::skim(results)

Data summary
Name	results
Number of rows	25220
Number of columns	18
_______________________
Column type frequency:
character	8
numeric	10
________________________
Group variables	None

Variable type: character

skim_variable	complete_rate	min	max	n_unique
position	1	1	2	34
positionText	1	1	2	39
time	1	2	11	6488
milliseconds	1	2	8	6687
fastestLap	1	1	2	80
rank	1	1	2	26
fastestLapTime	1	2	8	6266
fastestLapSpeed	1	2	7	6395

Variable type: numeric

skim_variable	n_missing	complete_rate	mean	sd	p0	p25	p50	p75	p100	hist
resultId	0	1	12611.23	7281.58	1	6305.75	12610.5	18915.25	25225	▇▇▇▇▇
raceId	0	1	517.95	290.34	1	287.00	503.0	762.00	1064	▆▇▇▆▆
driverId	0	1	250.84	258.25	1	56.00	158.0	347.00	854	▇▃▂▁▂
constructorId	0	1	47.48	58.39	1	6.00	25.0	57.00	214	▇▂▁▁▁
number	6	1	17.59	14.80	0	7.00	15.0	23.00	208	▇▁▁▁▁
grid	0	1	11.21	7.27	0	5.00	11.0	17.00	34	▇▇▇▃▁
positionOrder	0	1	12.93	7.74	1	6.00	12.0	19.00	39	▇▇▆▂▁
points	0	1	1.80	4.03	0	0.00	0.0	2.00	50	▇▁▁▁▁
laps	0	1	45.79	30.04	0	21.00	52.0	66.00	200	▅▇▁▁▁
statusId	0	1	17.72	26.10	1	1.00	11.0	14.00	139	▇▁▁▁▁

data_small <- results %>%
    select(rank, position, laps) %>% 
    slice(1:10)

Pivoting

Long to wide form

data_wide <- data_small %>%
    pivot_wider(names_from = rank, values_from = laps, values_fill = 0)

data_wide

## # A tibble: 9 × 11
##   position   `2`   `3`   `5`   `7`   `1`  `14`  `12`   `4`   `9`  `13`
##   <chr>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 "1"         58     0     0     0     0     0     0     0     0     0
## 2 "2"          0    58     0     0     0     0     0     0     0     0
## 3 "3"          0     0    58     0     0     0     0     0     0     0
## 4 "4"          0     0     0    58     0     0     0     0     0     0
## 5 "5"          0     0     0     0    58     0     0     0     0     0
## 6 "6"          0     0     0     0     0    57     0     0     0     0
## 7 "7"          0     0     0     0     0     0    55     0     0     0
## 8 "8"          0     0     0     0     0     0     0    53     0     0
## 9 "\\N"        0     0     0     0     0     0     0     0    47    43

Wide to long form

data_long <- data_wide %>%
    pivot_longer(cols = `2`:`13`, names_to = "rank", values_to = "laps")

data_long

## # A tibble: 90 × 3
##    position rank   laps
##    <chr>    <chr> <dbl>
##  1 1        2        58
##  2 1        3         0
##  3 1        5         0
##  4 1        7         0
##  5 1        1         0
##  6 1        14        0
##  7 1        12        0
##  8 1        4         0
##  9 1        9         0
## 10 1        13        0
## # … with 80 more rows
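
The long table has 90 rows instead of the original 10 because `values_fill = 0` created a laps value for every position and rank combination. A small sketch of a round trip that gets back to 10 rows, assuming the zeros are not wanted: leave the empty cells as `NA` and drop them with `values_drop_na = TRUE`. `data_small` is rebuilt by hand here from the values printed in this report so the block runs on its own.

```r
library(dplyr)
library(tidyr)

# data_small rebuilt by hand from the output shown in this report
data_small <- tibble(
    rank     = c("2", "3", "5", "7", "1", "14", "12", "4", "9", "13"),
    position = c(as.character(1:8), "\\N", "\\N"),
    laps     = c(58, 58, 58, 58, 58, 57, 55, 53, 47, 43)
)

data_roundtrip <- data_small %>%
    # no values_fill, so missing combinations stay NA
    pivot_wider(names_from = rank, values_from = laps) %>%
    # every column except position is a rank column; NA cells are dropped
    pivot_longer(cols = -position, names_to = "rank", values_to = "laps",
                 values_drop_na = TRUE)

nrow(data_roundtrip)
```

Using `cols = -position` instead of `` `2`:`13` `` also avoids depending on the order in which the rank columns happened to be created.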

Separating and Uniting

Separate a column

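# separate() splits on non-alphanumeric characters by default; only "\\N" has one
# (the backslash), so the other 8 rows give one piece and NA fills the second column.
data_separate <- data_small %>%
    separate(col = "position", into = c("//N", "position"))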

## Warning: Expected 2 pieces. Missing pieces filled with `NA` in 8 rows [1, 2, 3,
## 4, 5, 6, 7, 8].

data_separate

## # A tibble: 10 × 4
##    rank  `//N` position  laps
##    <chr> <chr> <chr>    <dbl>
##  1 2     "1"   <NA>        58
##  2 3     "2"   <NA>        58
##  3 5     "3"   <NA>        58
##  4 7     "4"   <NA>        58
##  5 1     "5"   <NA>        58
##  6 14    "6"   <NA>        57
##  7 12    "7"   <NA>        55
##  8 4     "8"   <NA>        53
##  9 9     ""    N           47
## 10 13    ""    N           43

Unite two columns

data_unite <- data_small %>%
    unite(col = "position_laps", c("position", "laps"))

data_unite

## # A tibble: 10 × 2
##    rank  position_laps
##    <chr> <chr>        
##  1 2     "1_58"       
##  2 3     "2_58"       
##  3 5     "3_58"       
##  4 7     "4_58"       
##  5 1     "5_58"       
##  6 14    "6_57"       
##  7 12    "7_55"       
##  8 4     "8_53"       
##  9 9     "\\N_47"     
## 10 13    "\\N_43"
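
The united column gives `separate()` a real delimiter to work with. A minimal sketch of undoing the unite, assuming the goal is to get `position` and `laps` back: set `sep = "_"` explicitly, because the default separator would also split on the backslash in "\\N". The two rows below are copied from the output above.

```r
library(dplyr)
library(tidyr)

# Two rows copied from the data_unite output above
data_unite_demo <- tibble(
    rank          = c("2", "9"),
    position_laps = c("1_58", "\\N_47")
)

data_split <- data_unite_demo %>%
    # split only on the underscore; convert = TRUE re-guesses column types
    separate(col = position_laps, into = c("position", "laps"),
             sep = "_", convert = TRUE)

data_split
```

With `convert = TRUE`, `laps` comes back as a number, while `position` stays character because of the "\\N" entry.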

Amanda Simpson

2022-10-05
