HW03

Author

Xiangzhe Li

Packages

library(tidyverse)
mario_kart <- read_csv("https://raw.githubusercontent.com/vaiseys/dav-course/main/Data/world_records.csv")

Question 1

Whole Race:

three_laps <- mario_kart |> filter(type == "Three Lap")

Dataset without “Rainbow Road”:

no_rainbow <- three_laps |> filter(track != "Rainbow Road")

Dataset with only “Rainbow Road”:

only_rainbow <- three_laps |> filter(track == "Rainbow Road")
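As a quick sanity check on the split above, the two filtered datasets should together account for every three-lap row. The sketch below uses a small made-up tibble (`toy`, with invented values) rather than the real world-records file, and all object names in it are hypothetical.

```r
library(dplyr)

# Toy data with made-up values (NOT the real world-records file)
toy <- tibble(
  track = c("Rainbow Road", "Luigi Raceway", "Rainbow Road", "Moo Moo Farm"),
  type  = c("Three Lap", "Three Lap", "Single Lap", "Three Lap"),
  time  = c(300, 120, 100, 90)
)

# Same three filters as above, applied to the toy data
toy_three_laps   <- toy |> filter(type == "Three Lap")
toy_no_rainbow   <- toy_three_laps |> filter(track != "Rainbow Road")
toy_only_rainbow <- toy_three_laps |> filter(track == "Rainbow Road")

# The two subsets partition the three-lap rows, provided `track` has no
# missing values (filter() drops rows where the condition is NA)
split_ok <- nrow(toy_no_rainbow) + nrow(toy_only_rainbow) == nrow(toy_three_laps)
split_ok
```

The same comparison of `nrow()` values could be run on `no_rainbow`, `only_rainbow`, and `three_laps`.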

Question 2

Rainbow Road average and standard deviation:

only_rainbow |> 
  summarize(
    avg_time = mean(time),
    sd_time  = sd(time)
  )
# A tibble: 1 × 2
  avg_time sd_time
     <dbl>   <dbl>
1     276.    91.8

All other tracks, average and standard deviation:

no_rainbow |> 
  summarize(
    avg_time = mean(time),
    sd_time  = sd(time)
  )
# A tibble: 1 × 2
  avg_time sd_time
     <dbl>   <dbl>
1     114.    53.0

Differences: Both the average time and the standard deviation of time are substantially lower for the dataset of all other tracks (about 114 and 53.0) than for the dataset containing only Rainbow Road (about 276 and 91.8).
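The two summaries can also be put side by side in one table by grouping on a logical indicator instead of building two separate datasets. This is only a sketch on made-up data; the `toy` tibble and the `is_rainbow` column are assumptions for illustration, not part of the assignment.

```r
library(dplyr)

# Toy data with made-up times (NOT the real world-records file)
toy <- tibble(
  track = c("Rainbow Road", "Rainbow Road",
            "Luigi Raceway", "Luigi Raceway", "Moo Moo Farm"),
  time  = c(300, 200, 120, 100, 80)
)

# One row per group: Rainbow Road vs. everything else
comparison <- toy |>
  mutate(is_rainbow = track == "Rainbow Road") |>  # hypothetical helper column
  group_by(is_rainbow) |>
  summarize(
    avg_time = mean(time),
    sd_time  = sd(time)
  )

comparison
```

Having both rows in one tibble makes the difference in mean and spread easier to read at a glance.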

Question 3

three_laps |> 
  group_by(track) |> 
  summarize(num_records = n()) |> 
  arrange(desc(num_records))
# A tibble: 16 × 2
   track                 num_records
   <chr>                       <int>
 1 Toad's Turnpike               124
 2 Rainbow Road                   99
 3 Frappe Snowland                92
 4 D.K.'s Jungle Parkway          86
 5 Choco Mountain                 84
 6 Mario Raceway                  82
 7 Luigi Raceway                  81
 8 Royal Raceway                  77
 9 Yoshi Valley                   74
10 Kalimari Desert                73
11 Sherbet Land                   73
12 Wario Stadium                  71
13 Koopa Troopa Beach             56
14 Banshee Boardwalk              55
15 Moo Moo Farm                   44
16 Bowser's Castle                40

Track with the most records: Toad’s Turnpike
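The group, count, and sort steps used here can be written more compactly with `count()`. A minimal sketch on made-up data (the `toy` tibble is an assumption, not the real dataset):

```r
library(dplyr)

# Toy data with made-up rows (NOT the real world-records file)
toy <- tibble(
  track = c("Toad's Turnpike", "Toad's Turnpike", "Toad's Turnpike",
            "Rainbow Road", "Rainbow Road", "Moo Moo Farm")
)

# count() groups by track, tallies rows, and (with sort = TRUE)
# orders from most to fewest; `name` sets the count column's name
records_per_track <- toy |>
  count(track, sort = TRUE, name = "num_records")

records_per_track
```

On the toy data this should give the same kind of table as the `group_by()` / `summarize(n())` / `arrange(desc())` pipeline.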

Question 4

three_laps |> 
  group_by(player, track) |> 
  summarize(num_records = n()) |> 
  arrange(desc(num_records)) 
`summarise()` has grouped output by 'player'. You can override using the
`.groups` argument.
# A tibble: 306 × 3
# Groups:   player [60]
   player   track                 num_records
   <chr>    <chr>                       <int>
 1 Penev    Choco Mountain                 26
 2 Lacey    D.K.'s Jungle Parkway          24
 3 abney317 Rainbow Road                   21
 4 MR       Toad's Turnpike                20
 5 MR       Frappe Snowland                18
 6 Penev    Toad's Turnpike                18
 7 MR       Sherbet Land                   16
 8 abney317 Kalimari Desert                16
 9 MR       Banshee Boardwalk              15
10 Penev    Rainbow Road                   15
# ℹ 296 more rows

Player with the most records on a single track: Penev, on Choco Mountain (26 records).
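The `summarise()` message in the output above appears because, after summarizing over two grouping variables, the result is still grouped by `player`. One way to make the intent explicit is to pass `.groups = "drop"`. The sketch below uses made-up players and tracks (`toy` is an assumption for illustration).

```r
library(dplyr)

# Toy data with made-up players and tracks (NOT the real dataset)
toy <- tibble(
  player = c("A", "A", "A", "A", "B", "B"),
  track  = c("X", "X", "X", "Y", "X", "Y")
)

per_pair <- toy |>
  group_by(player, track) |>
  # .groups = "drop" returns an ungrouped tibble and silences the message
  summarize(num_records = n(), .groups = "drop") |>
  arrange(desc(num_records))

per_pair
```

Dropping the grouping does not change the counts; it only removes the leftover `player` grouping from the result.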

Question 5

three_laps |> 
  group_by(track) |> 
  summarize(
    avg_time = mean(time)
  ) |> 
  arrange(desc(avg_time))
# A tibble: 16 × 2
   track                 avg_time
   <chr>                    <dbl>
 1 Rainbow Road             276. 
 2 Wario Stadium            214. 
 3 Royal Raceway            158. 
 4 Bowser's Castle          134. 
 5 Kalimari Desert          126. 
 6 Banshee Boardwalk        126. 
 7 Toad's Turnpike          122. 
 8 Sherbet Land             116. 
 9 Luigi Raceway            104. 
10 D.K.'s Jungle Parkway    101. 
11 Koopa Troopa Beach        96.6
12 Choco Mountain            95.2
13 Moo Moo Farm              88.4
14 Yoshi Valley              82.7
15 Mario Raceway             79.1
16 Frappe Snowland           77.1

Track with the highest average time: Rainbow Road

Fastest (lowest) record time for each track:

three_laps |> 
  group_by(track) |> 
  arrange(time) |> 
  slice(1) |> 
  select(track, time)
# A tibble: 16 × 2
# Groups:   track [16]
   track                  time
   <chr>                 <dbl>
 1 Banshee Boardwalk     124. 
 2 Bowser's Castle       132  
 3 Choco Mountain         17.3
 4 D.K.'s Jungle Parkway  21.4
 5 Frappe Snowland        23.6
 6 Kalimari Desert       122. 
 7 Koopa Troopa Beach     95.2
 8 Luigi Raceway          25.3
 9 Mario Raceway          58.5
10 Moo Moo Farm           85.9
11 Rainbow Road           50.4
12 Royal Raceway         119. 
13 Sherbet Land           91.6
14 Toad's Turnpike        30.3
15 Wario Stadium          14.6
16 Yoshi Valley           33.4
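The `arrange()` plus `slice(1)` approach above picks the first row per track after sorting by time. An alternative that states the goal more directly is to take the minimum within each group. A sketch on made-up data (the `toy` tibble and the `best_time` column name are assumptions):

```r
library(dplyr)

# Toy data with made-up times (NOT the real world-records file)
toy <- tibble(
  track = c("Rainbow Road", "Rainbow Road", "Moo Moo Farm", "Moo Moo Farm"),
  time  = c(300, 250, 90, 95)
)

# Minimum time per track, one row per track
fastest <- toy |>
  group_by(track) |>
  summarize(best_time = min(time))

fastest
```

If the other columns of the fastest record are also needed (for example the player), `slice_min(time, n = 1, with_ties = FALSE)` on the grouped data keeps the whole row instead.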

Question 6

three_laps |> 
  mutate(long_record = if_else(record_duration > 100, 1, 0)) |> 
  group_by(player) |> 
  summarize(total_long_records = sum(long_record)) |> 
  arrange(desc(total_long_records))
# A tibble: 60 × 2
   player   total_long_records
   <chr>                 <dbl>
 1 MR                       81
 2 MJ                       50
 3 Penev                    27
 4 VAJ                      26
 5 abney317                 26
 6 Zwartjes                 24
 7 Lacey                    23
 8 Dan                      21
 9 Karlo                    18
10 Booth                    17
# ℹ 50 more rows

Player with the most long records (record_duration above 100): MR, with 81.
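The 0/1 indicator can be skipped, since summing a logical condition counts the rows where it is `TRUE`. A minimal sketch on made-up data (the `toy` tibble and its values are assumptions for illustration):

```r
library(dplyr)

# Toy data with made-up durations (NOT the real world-records file)
toy <- tibble(
  player          = c("A", "A", "B", "B", "B"),
  record_duration = c(150, 20, 101, 100, 300)
)

long_counts <- toy |>
  group_by(player) |>
  # TRUE counts as 1 and FALSE as 0, so sum() gives the number of long records;
  # a duration of exactly 100 is not counted because the test is strictly > 100
  summarize(total_long_records = sum(record_duration > 100)) |>
  arrange(desc(total_long_records))

long_counts
```

The one visible difference from the `if_else()` version is that the count column is an integer rather than a double.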