Practicum Assignment 01, STA581 Data Science
Datasets
This practicum assignment uses the football data from the Countr package. The data contain English Premier League match results from the 2009/2010 season through the 2016/2017 season.
library(Countr)
library(tidyverse)
library(rmarkdown)
data("football")
football <- tibble::as_tibble(football)
The contents of the football data are shown below:
head(football)
## # A tibble: 6 x 6
## seasonId gameDate homeTeam awayTeam homeTeamGoals awayTeamGoals
## <int> <dttm> <chr> <chr> <int> <int>
## 1 2009 2009-08-15 18:45:00 Chelsea Hull 2 1
## 2 2009 2009-08-15 21:00:00 Aston Villa Wigan 0 2
## 3 2009 2009-08-15 21:00:00 Wolves West Ham 0 2
## 4 2009 2009-08-15 21:00:00 Blackburn Man City 0 2
## 5 2009 2009-08-15 21:00:00 Bolton Sunderland 0 1
## 6 2009 2009-08-15 21:00:00 Portsmouth Fulham 0 1
glimpse(football)
## Rows: 3,040
## Columns: 6
## $ seasonId <int> 2009, 2009, 2009, 2009, 2009, 2009, 2009, 2009, 2009, 20~
## $ gameDate <dttm> 2009-08-15 18:45:00, 2009-08-15 21:00:00, 2009-08-15 21~
## $ homeTeam <chr> "Chelsea", "Aston Villa", "Wolves", "Blackburn", "Bolton~
## $ awayTeam <chr> "Hull", "Wigan", "West Ham", "Man City", "Sunderland", "~
## $ homeTeamGoals <int> 2, 0, 0, 0, 0, 0, 2, 1, 1, 2, 1, 0, 1, 1, 1, 4, 4, 2, 1,~
## $ awayTeamGoals <int> 1, 2, 2, 2, 1, 1, 0, 6, 0, 1, 3, 1, 0, 5, 0, 0, 1, 1, 0,~
Summarise Function
Using summarise() to compute the total goals scored by home teams and by away teams across all seasons from 2009/2010 through 2016/2017:
football %>%
  summarise(totalhomeTeamGoals = sum(homeTeamGoals),
            totalawayTeamGoals = sum(awayTeamGoals),
            .groups = 'drop')
## # A tibble: 1 x 2
## totalhomeTeamGoals totalawayTeamGoals
## <int> <int>
## 1 4790 3572
Using summarise() together with group_by() to compute the total home-team and away-team goals for each season from 2009/2010 through 2016/2017 (seasonId 2009 denotes the 2009/2010 season):
football %>%
  group_by(seasonId) %>%
  summarise(totalhomeTeamGoals = sum(homeTeamGoals),
            totalawayTeamGoals = sum(awayTeamGoals),
            .groups = 'drop')
## # A tibble: 8 x 3
## seasonId totalhomeTeamGoals totalawayTeamGoals
## <int> <int> <int>
## 1 2009 645 408
## 2 2010 617 446
## 3 2011 604 462
## 4 2012 592 471
## 5 2013 598 454
## 6 2014 560 415
## 7 2015 567 459
## 8 2016 607 457
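The same group_by() and summarise() pattern extends to other aggregates, such as the number of matches and the mean goals per match. The following is a minimal sketch on a small made-up tibble (the `toy` data and the `matches`, `meanHomeGoals`, and `meanAwayGoals` names are illustrative assumptions, not part of the football data):

```r
library(dplyr)

# Hypothetical toy data; only the column names match `football`.
toy <- tibble(
  seasonId      = c(2009L, 2009L, 2010L, 2010L),
  homeTeamGoals = c(2L, 0L, 1L, 3L),
  awayTeamGoals = c(1L, 2L, 1L, 0L)
)

# One row per season: match count and mean goals per match.
per_season <- toy %>%
  group_by(seasonId) %>%
  summarise(
    matches       = n(),
    meanHomeGoals = mean(homeTeamGoals),
    meanAwayGoals = mean(awayTeamGoals),
    .groups = "drop"
  )

print(per_season)
```

Replacing `toy` with `football` should give the per-season averages for the real data.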
Arrange Function
Using arrange() with desc() to sort the data from the most recent match to the oldest:
football %>% arrange(desc(gameDate))
## # A tibble: 3,040 x 6
## seasonId gameDate homeTeam awayTeam homeTeamGoals awayTeamGoals
## <int> <dttm> <chr> <chr> <int> <int>
## 1 2016 2017-05-21 21:00:00 Arsenal Everton 3 1
## 2 2016 2017-05-21 21:00:00 Burnley West Ham 1 2
## 3 2016 2017-05-21 21:00:00 Chelsea Sunderl~ 5 1
## 4 2016 2017-05-21 21:00:00 Hull Tottenh~ 1 7
## 5 2016 2017-05-21 21:00:00 Leicester Bournem~ 1 1
## 6 2016 2017-05-21 21:00:00 Liverpool Middles~ 3 0
## 7 2016 2017-05-21 21:00:00 Man Utd Crystal~ 2 0
## 8 2016 2017-05-21 21:00:00 Southampton Stoke 0 1
## 9 2016 2017-05-21 21:00:00 Swansea West Br~ 2 1
## 10 2016 2017-05-21 21:00:00 Watford Man City 0 5
## # ... with 3,030 more rows
Using arrange() to sort the data in ascending order of home-team goals, with ties broken by ascending away-team goals:
football %>% arrange(homeTeamGoals, awayTeamGoals)
## # A tibble: 3,040 x 6
## seasonId gameDate homeTeam awayTeam homeTeamGoals awayTeamGoals
## <int> <dttm> <chr> <chr> <int> <int>
## 1 2009 2009-08-22 21:00:00 Birmingham Stoke 0 0
## 2 2009 2009-08-29 21:00:00 Blackburn West Ham 0 0
## 3 2009 2009-10-24 21:00:00 Hull Portsmouth 0 0
## 4 2009 2009-11-01 22:00:00 Birmingham Man City 0 0
## 5 2009 2009-11-28 22:00:00 Blackburn Stoke 0 0
## 6 2009 2009-12-05 22:00:00 Blackburn Liverpool 0 0
## 7 2009 2009-12-12 21:00:00 Hull Blackburn 0 0
## 8 2009 2009-12-26 18:45:00 Birmingham Chelsea 0 0
## 9 2009 2009-12-26 20:00:00 Fulham Tottenham 0 0
## 10 2009 2010-01-16 21:00:00 Tottenham Hull 0 0
## # ... with 3,030 more rows
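Sort directions can also be mixed across keys. As a small sketch on made-up data (the `toy` tibble is a hypothetical stand-in for football), the call below sorts by home-team goals in descending order and breaks ties by away-team goals in ascending order:

```r
library(dplyr)

# Hypothetical toy data; only the column names match `football`.
toy <- tibble(
  homeTeam      = c("A", "B", "C", "D"),
  homeTeamGoals = c(2L, 2L, 0L, 3L),
  awayTeamGoals = c(1L, 0L, 2L, 3L)
)

# Highest home score first; among equal home scores,
# the match with fewer away goals comes first.
sorted <- toy %>% arrange(desc(homeTeamGoals), awayTeamGoals)

print(sorted)
```

Teams A and B both scored two home goals, so B (which conceded none) is placed ahead of A.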
Filter Function
Using filter() to show only the matches won by the away team:
football %>% filter(awayTeamGoals > homeTeamGoals)
## # A tibble: 866 x 6
## seasonId gameDate homeTeam awayTeam homeTeamGoals awayTeamGoals
## <int> <dttm> <chr> <chr> <int> <int>
## 1 2009 2009-08-15 21:00:00 Aston Villa Wigan 0 2
## 2 2009 2009-08-15 21:00:00 Wolves West Ham 0 2
## 3 2009 2009-08-15 21:00:00 Blackburn Man City 0 2
## 4 2009 2009-08-15 21:00:00 Bolton Sunderland 0 1
## 5 2009 2009-08-15 21:00:00 Portsmouth Fulham 0 1
## 6 2009 2009-08-15 23:30:00 Everton Arsenal 1 6
## 7 2009 2009-08-19 01:45:00 Sunderland Chelsea 1 3
## 8 2009 2009-08-19 01:45:00 Wigan Wolves 0 1
## 9 2009 2009-08-20 01:45:00 Hull Tottenham 1 5
## 10 2009 2009-08-22 21:00:00 Wigan Man Utd 0 5
## # ... with 856 more rows
Using filter() to show the matches played by Blackburn, either at home or away:
football %>% filter(homeTeam == 'Blackburn' | awayTeam == 'Blackburn')
## # A tibble: 114 x 6
## seasonId gameDate homeTeam awayTeam homeTeamGoals awayTeamGoals
## <int> <dttm> <chr> <chr> <int> <int>
## 1 2009 2009-08-15 21:00:00 Blackburn Man City 0 2
## 2 2009 2009-08-22 21:00:00 Sunderland Blackburn 2 1
## 3 2009 2009-08-29 21:00:00 Blackburn West Ham 0 0
## 4 2009 2009-09-12 21:00:00 Blackburn Wolves 3 1
## 5 2009 2009-09-20 21:00:00 Everton Blackburn 3 0
## 6 2009 2009-09-26 21:00:00 Blackburn Aston Villa 2 1
## 7 2009 2009-10-04 19:30:00 Arsenal Blackburn 6 2
## 8 2009 2009-10-18 19:00:00 Blackburn Burnley 3 2
## 9 2009 2009-10-24 23:30:00 Chelsea Blackburn 5 0
## 10 2009 2009-11-01 00:30:00 Man Utd Blackburn 2 0
## # ... with 104 more rows
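Conditions passed to filter() as separate arguments are combined with a logical AND, and %in% is a convenient way to match against several team names. A minimal sketch on made-up data (the `toy` tibble and the team labels are hypothetical):

```r
library(dplyr)

# Hypothetical toy data; only the column names match `football`.
toy <- tibble(
  homeTeam      = c("A", "B", "A", "C"),
  awayTeam      = c("B", "C", "C", "B"),
  homeTeamGoals = c(2L, 0L, 1L, 3L),
  awayTeamGoals = c(1L, 2L, 1L, 0L)
)

# Home wins by team A or team B: both conditions must hold.
home_wins_ab <- toy %>%
  filter(homeTeam %in% c("A", "B"), homeTeamGoals > awayTeamGoals)

# Drawn matches only.
draws <- toy %>% filter(homeTeamGoals == awayTeamGoals)

print(home_wins_ab)
print(draws)
```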
Mutate Function
Using mutate() to add a new column GoalsDif containing the home team's goals minus the away team's goals:
paged_table(football %>% mutate(GoalsDif = homeTeamGoals - awayTeamGoals))
Using mutate() to add a new column homeTeamWin that is 1 if the home team won and 0 otherwise:
paged_table(football %>% mutate(homeTeamWin = ifelse(homeTeamGoals > awayTeamGoals, 1, 0)))
Select Function
Using select() to show only the match date and the two teams playing:
football %>% select(gameDate, homeTeam, awayTeam)
## # A tibble: 3,040 x 3
## gameDate homeTeam awayTeam
## <dttm> <chr> <chr>
## 1 2009-08-15 18:45:00 Chelsea Hull
## 2 2009-08-15 21:00:00 Aston Villa Wigan
## 3 2009-08-15 21:00:00 Wolves West Ham
## 4 2009-08-15 21:00:00 Blackburn Man City
## 5 2009-08-15 21:00:00 Bolton Sunderland
## 6 2009-08-15 21:00:00 Portsmouth Fulham
## 7 2009-08-15 21:00:00 Stoke Burnley
## 8 2009-08-15 23:30:00 Everton Arsenal
## 9 2009-08-16 19:30:00 Man Utd Birmingham
## 10 2009-08-16 22:00:00 Tottenham Liverpool
## # ... with 3,030 more rows
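Besides listing columns by name, select() accepts selection helpers, negative selection, and renaming. The sketch below illustrates these on made-up data (the `toy` tibble and the `Home` and `Away` names are hypothetical):

```r
library(dplyr)

# Hypothetical toy data; only the column names match `football`.
toy <- tibble(
  seasonId      = c(2009L, 2010L),
  homeTeam      = c("A", "B"),
  awayTeam      = c("B", "A"),
  homeTeamGoals = c(2L, 0L),
  awayTeamGoals = c(1L, 2L)
)

# Keep every column whose name ends in "Goals".
goals_only <- toy %>% select(ends_with("Goals"))

# Drop a single column and keep the rest.
no_season <- toy %>% select(-seasonId)

# Select and rename in one step using new_name = old_name.
renamed <- toy %>% select(Home = homeTeam, Away = awayTeam)

print(names(goals_only))
print(names(no_season))
print(names(renamed))
```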
Combining Functions
Displaying the final standings of the 2016/2017 season based on home matches only. A win earns 3 points, a draw 1, and a loss 0; teams level on points are ordered by matches played and then by goal difference:
football %>%
  # Flag each match result from the home team's point of view
  mutate(homeTeamWin = ifelse(homeTeamGoals > awayTeamGoals, 1, 0),
         homeTeamDraw = ifelse(homeTeamGoals == awayTeamGoals, 1, 0),
         homeTeamLose = ifelse(homeTeamGoals < awayTeamGoals, 1, 0),
         GoalsDif = homeTeamGoals - awayTeamGoals) %>%
  mutate(Points = ifelse(homeTeamWin == 1, 3, ifelse(homeTeamDraw == 1, 1, 0))) %>%
  group_by(seasonId, homeTeam) %>%
  summarise(totalGoals = sum(homeTeamGoals), Win = sum(homeTeamWin),
            Draw = sum(homeTeamDraw), Lose = sum(homeTeamLose),
            GoalsDif = sum(GoalsDif), Points = sum(Points),
            TotalMatch = n(), .groups = "drop") %>%
  # seasonId 2016 is the 2016/2017 season
  filter(seasonId == 2016) %>%
  arrange(desc(Points), TotalMatch, desc(GoalsDif)) %>%
  mutate(No = row_number()) %>%
  select(No, Team = homeTeam, TotalMatch, Win, Draw, Lose, GoalsDif, Points)
## # A tibble: 20 x 8
## No Team TotalMatch Win Draw Lose GoalsDif Points
## <int> <chr> <int> <dbl> <dbl> <dbl> <int> <dbl>
## 1 1 Tottenham 19 17 2 0 38 53
## 2 2 Chelsea 19 17 0 2 38 51
## 3 3 Arsenal 19 14 3 2 23 45
## 4 4 Everton 19 13 4 2 26 43
## 5 5 Liverpool 19 12 5 2 27 41
## 6 6 Man City 19 11 7 1 20 40
## 7 7 Man Utd 19 8 10 1 14 34
## 8 8 Leicester 19 10 4 5 6 34
## 9 9 Burnley 19 10 3 6 6 33
## 10 10 Bournemouth 19 9 4 6 6 31
## 11 11 West Brom 19 9 2 8 5 29
## 12 12 Watford 19 8 4 7 -4 28
## 13 13 Hull 19 8 4 7 -7 28
## 14 14 Stoke 19 7 6 6 0 27
## 15 15 Swansea 19 8 3 8 -7 27
## 16 16 West Ham 19 7 4 8 -12 25
## 17 17 Southampton 19 6 6 7 -4 24
## 18 18 Crystal Palace 19 6 2 11 -1 20
## 19 19 Middlesbrough 19 4 6 9 -6 18
## 20 20 Sunderland 19 3 5 11 -18 14
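The nested ifelse() calls used for Points above can also be written with case_when(), which some readers find easier to follow. This is an alternative sketch on made-up data, not the method used in the assignment (the `toy` tibble and the `home_table` name are hypothetical):

```r
library(dplyr)

# Hypothetical toy data; only the column names match `football`.
toy <- tibble(
  homeTeam      = c("A", "B", "A", "C"),
  homeTeamGoals = c(2L, 0L, 1L, 3L),
  awayTeamGoals = c(1L, 2L, 1L, 0L)
)

home_table <- toy %>%
  # 3 points for a home win, 1 for a draw, 0 otherwise
  mutate(Points = case_when(
    homeTeamGoals > awayTeamGoals  ~ 3,
    homeTeamGoals == awayTeamGoals ~ 1,
    TRUE                           ~ 0
  )) %>%
  group_by(homeTeam) %>%
  summarise(Points = sum(Points), TotalMatch = n(), .groups = "drop") %>%
  arrange(desc(Points))

print(home_table)
```

Conditions in case_when() are evaluated in order, so the final `TRUE ~ 0` branch acts as the fallback for home losses.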