HW03
Packages
library(tidyverse)
mario_kart <- read_csv("https://raw.githubusercontent.com/vaiseys/dav-course/main/Data/world_records.csv")
Question 1
Whole Race:
three_laps <- mario_kart |> filter(type == "Three Lap")
Dataset Without “Rainbow Road”:
no_rainbow <- three_laps |> filter(track != "Rainbow Road")
Dataset with only “Rainbow Road”:
only_rainbow <- three_laps |> filter(track == "Rainbow Road")
Question 2
Rainbow Road average and standard deviation:
only_rainbow |>
  summarize(
    avg_time = mean(time),
    sd_time = sd(time)
  )
# A tibble: 1 × 2
  avg_time sd_time
     <dbl>   <dbl>
1     276.    91.8
All other tracks average and standard deviation:
no_rainbow |>
  summarize(
    avg_time = mean(time),
    sd_time = sd(time)
  )
# A tibble: 1 × 2
  avg_time sd_time
     <dbl>   <dbl>
1     114.    53.0
Differences: Both the average time and the standard deviation of time are much lower for the dataset containing all other tracks (mean about 114, standard deviation about 53.0) than for the dataset containing only Rainbow Road (mean about 276, standard deviation about 91.8).
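The two summaries above can also be computed in a single pipeline, which puts the numbers side by side. This is a minimal sketch on a small made-up tibble: `toy`, its values, and the `is_rainbow` column are hypothetical stand-ins for `three_laps`, not part of the original analysis.

```r
library(dplyr)

# Hypothetical toy data standing in for three_laps (values are made up)
toy <- tibble(
  track = c("Rainbow Road", "Rainbow Road", "Luigi Raceway", "Luigi Raceway"),
  time  = c(300, 200, 110, 90)
)

# Group on a logical flag so both subsets are summarized in one table
comparison <- toy |>
  group_by(is_rainbow = track == "Rainbow Road") |>
  summarize(
    avg_time = mean(time),
    sd_time  = sd(time)
  )

comparison
```

Replacing `toy` with `three_laps` should give one row for Rainbow Road and one for all other tracks.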
Question 3
three_laps |>
  group_by(track) |>
  summarize(num_records = n()) |>
  arrange(desc(num_records))
# A tibble: 16 × 2
   track                 num_records
   <chr>                       <int>
 1 Toad's Turnpike               124
 2 Rainbow Road                   99
 3 Frappe Snowland                92
 4 D.K.'s Jungle Parkway          86
 5 Choco Mountain                 84
 6 Mario Raceway                  82
 7 Luigi Raceway                  81
 8 Royal Raceway                  77
 9 Yoshi Valley                   74
10 Kalimari Desert                73
11 Sherbet Land                   73
12 Wario Stadium                  71
13 Koopa Troopa Beach             56
14 Banshee Boardwalk              55
15 Moo Moo Farm                   44
16 Bowser's Castle                40
Track with the most records: Toad’s Turnpike
Question 4
three_laps |>
  group_by(player, track) |>
  summarize(num_records = n()) |>
  arrange(desc(num_records))
`summarise()` has grouped output by 'player'. You can override using the
`.groups` argument.
# A tibble: 306 × 3
# Groups:   player [60]
   player   track                 num_records
   <chr>    <chr>                       <int>
 1 Penev    Choco Mountain                 26
 2 Lacey    D.K.'s Jungle Parkway          24
 3 abney317 Rainbow Road                   21
 4 MR       Toad's Turnpike                20
 5 MR       Frappe Snowland                18
 6 Penev    Toad's Turnpike                18
 7 MR       Sherbet Land                   16
 8 abney317 Kalimari Desert                16
 9 MR       Banshee Boardwalk              15
10 Penev    Rainbow Road                   15
# ℹ 296 more rows
Player with the most records on a single track: Penev, on Choco Mountain (26 records).
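The `summarise()` message in the output above appears because the result is still grouped by `player`. One way to avoid it is `count()`, which does the grouping, counting, and sorting in one step and returns an ungrouped tibble. This is a sketch on hypothetical toy data; the players `A` and `B` and tracks `X` and `Y` are made up.

```r
library(dplyr)

# Hypothetical toy data standing in for three_laps (one row per record)
toy <- tibble(
  player = c("A", "A", "A", "B", "B", "A"),
  track  = c("X", "X", "Y", "X", "X", "X")
)

# count() tallies rows per player/track pair; sort = TRUE orders by count, descending
records_per_pair <- toy |>
  count(player, track, name = "num_records", sort = TRUE)

records_per_pair
```

Passing `.groups = "drop"` to `summarize()` in the original pipeline should also silence the message while keeping the same counts.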
Question 5
three_laps |>
  group_by(track) |>
  summarize(
    avg_time = mean(time)
  ) |>
  arrange(desc(avg_time))
# A tibble: 16 × 2
   track                 avg_time
   <chr>                    <dbl>
 1 Rainbow Road             276.
 2 Wario Stadium            214.
 3 Royal Raceway            158.
 4 Bowser's Castle          134.
 5 Kalimari Desert          126.
 6 Banshee Boardwalk        126.
 7 Toad's Turnpike          122.
 8 Sherbet Land             116.
 9 Luigi Raceway            104.
10 D.K.'s Jungle Parkway    101.
11 Koopa Troopa Beach        96.6
12 Choco Mountain            95.2
13 Moo Moo Farm              88.4
14 Yoshi Valley              82.7
15 Mario Raceway             79.1
16 Frappe Snowland           77.1
Track with the highest average time: Rainbow Road
Lowest record time for each track:
three_laps |>
  group_by(track) |>
  arrange(time) |>
  slice(1) |>
  select(track, time)
# A tibble: 16 × 2
# Groups:   track [16]
   track                  time
   <chr>                 <dbl>
 1 Banshee Boardwalk     124.
 2 Bowser's Castle       132
 3 Choco Mountain         17.3
 4 D.K.'s Jungle Parkway  21.4
 5 Frappe Snowland        23.6
 6 Kalimari Desert       122.
 7 Koopa Troopa Beach     95.2
 8 Luigi Raceway          25.3
 9 Mario Raceway          58.5
10 Moo Moo Farm           85.9
11 Rainbow Road           50.4
12 Royal Raceway         119.
13 Sherbet Land           91.6
14 Toad's Turnpike        30.3
15 Wario Stadium          14.6
16 Yoshi Valley           33.4
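The same "fastest record per track" result can be expressed with `slice_min()`, which states the intent more directly than sorting and taking the first row. This is a sketch on hypothetical toy data (tracks `X` and `Y` and their times are made up), not the author's method.

```r
library(dplyr)

# Hypothetical toy data standing in for three_laps
toy <- tibble(
  track = c("X", "X", "Y", "Y", "Y"),
  time  = c(30.5, 28.1, 90.2, 95.0, 88.7)
)

# Keep the single smallest time within each track, then drop the grouping
fastest <- toy |>
  group_by(track) |>
  slice_min(time, n = 1, with_ties = FALSE) |>
  ungroup() |>
  select(track, time)

fastest
```

Setting `with_ties = FALSE` keeps exactly one row per track even when two records share the same time, which matches what `slice(1)` does in the original pipeline.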
Question 6
three_laps |>
  mutate(long_record = if_else(record_duration > 100, 1, 0)) |>
  group_by(player) |>
  summarize(total_long_records = sum(long_record)) |>
  arrange(desc(total_long_records))
# A tibble: 60 × 2
   player   total_long_records
   <chr>                 <dbl>
 1 MR                       81
 2 MJ                       50
 3 Penev                    27
 4 VAJ                      26
 5 abney317                 26
 6 Zwartjes                 24
 7 Lacey                    23
 8 Dan                      21
 9 Karlo                    18
10 Booth                    17
# ℹ 50 more rows
Player with the most long-duration records (record_duration > 100): MR, with 81.
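Because R treats `TRUE` as 1 and `FALSE` as 0 when summing, the 0/1 helper column is optional: the condition can be summed directly. This is a sketch on hypothetical toy data; players `A` and `B` and their durations are made up.

```r
library(dplyr)

# Hypothetical toy data standing in for three_laps
toy <- tibble(
  player = c("A", "A", "B", "B", "B"),
  record_duration = c(150, 20, 101, 100, 300)
)

# sum() over a logical vector counts the TRUE values, i.e. durations strictly above 100
long_counts <- toy |>
  group_by(player) |>
  summarize(total_long_records = sum(record_duration > 100)) |>
  arrange(desc(total_long_records))

long_counts
```

Note that a duration of exactly 100 is not counted, since the comparison is strict, and that the resulting column is an integer rather than a double.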