Assignment 01, STA581 Data Science Practicum


Datasets

This practicum assignment uses the football data from the Countr package. The data contains match results from the English Premier League from the 2009/2010 season through the 2016/2017 season.

library(Countr)
library(tidyverse)
library(rmarkdown)

data("football")
football <- tibble::as_tibble(football)

The contents of the football data are shown below:

head(football)
## # A tibble: 6 x 6
##   seasonId gameDate            homeTeam    awayTeam   homeTeamGoals awayTeamGoals
##      <int> <dttm>              <chr>       <chr>              <int>         <int>
## 1     2009 2009-08-15 18:45:00 Chelsea     Hull                   2             1
## 2     2009 2009-08-15 21:00:00 Aston Villa Wigan                  0             2
## 3     2009 2009-08-15 21:00:00 Wolves      West Ham               0             2
## 4     2009 2009-08-15 21:00:00 Blackburn   Man City               0             2
## 5     2009 2009-08-15 21:00:00 Bolton      Sunderland             0             1
## 6     2009 2009-08-15 21:00:00 Portsmouth  Fulham                 0             1
glimpse(football)
## Rows: 3,040
## Columns: 6
## $ seasonId      <int> 2009, 2009, 2009, 2009, 2009, 2009, 2009, 2009, 2009, 20~
## $ gameDate      <dttm> 2009-08-15 18:45:00, 2009-08-15 21:00:00, 2009-08-15 21~
## $ homeTeam      <chr> "Chelsea", "Aston Villa", "Wolves", "Blackburn", "Bolton~
## $ awayTeam      <chr> "Hull", "Wigan", "West Ham", "Man City", "Sunderland", "~
## $ homeTeamGoals <int> 2, 0, 0, 0, 0, 0, 2, 1, 1, 2, 1, 0, 1, 1, 1, 4, 4, 2, 1,~
## $ awayTeamGoals <int> 1, 2, 2, 2, 1, 1, 0, 6, 0, 1, 3, 1, 0, 5, 0, 0, 1, 1, 0,~


Summarise Function

Using summarise() to compute the total goals scored by home teams and by away teams across all seasons from 2009/2010 through 2016/2017:

football %>% summarise(totalhomeTeamGoals = sum(homeTeamGoals), totalawayTeamGoals = sum(awayTeamGoals))
## # A tibble: 1 x 2
##   totalhomeTeamGoals totalawayTeamGoals
##                <int>              <int>
## 1               4790               3572

Using summarise() together with group_by() to compute the total home and away goals for each season from 2009/2010 through 2016/2017:

football %>% group_by(seasonId) %>%
  summarise(totalhomeTeamGoals = sum(homeTeamGoals), totalawayTeamGoals = sum(awayTeamGoals), .groups = "drop")
## # A tibble: 8 x 3
##   seasonId totalhomeTeamGoals totalawayTeamGoals
##      <int>              <int>              <int>
## 1     2009                645                408
## 2     2010                617                446
## 3     2011                604                462
## 4     2012                592                471
## 5     2013                598                454
## 6     2014                560                415
## 7     2015                567                459
## 8     2016                607                457
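
The difference between the two calls above can be seen on a smaller table. The following is a minimal sketch on hypothetical toy data (the `matches` tibble and its values are made up for illustration and are not taken from football); it assumes only that dplyr is installed.

```r
library(dplyr)

# Hypothetical toy data: four matches across two seasons (not from football).
matches <- tibble(
  seasonId      = c(2009L, 2009L, 2010L, 2010L),
  homeTeamGoals = c(2L, 0L, 1L, 3L),
  awayTeamGoals = c(1L, 2L, 1L, 0L)
)

# Without group_by(): a single row summarising the whole table.
overall <- matches %>%
  summarise(totalHome = sum(homeTeamGoals), totalAway = sum(awayTeamGoals))

# With group_by(): one row per season.
# .groups = "drop" returns an ungrouped tibble after summarising.
by_season <- matches %>%
  group_by(seasonId) %>%
  summarise(totalHome = sum(homeTeamGoals), totalAway = sum(awayTeamGoals), .groups = "drop")

print(overall)
print(by_season)
```

On ungrouped data, summarise() always returns one row, so the `.groups` argument only matters after group_by().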


Arrange Function

Using arrange() with desc() to sort the data from the most recent match to the oldest:

football %>% arrange(desc(gameDate))
## # A tibble: 3,040 x 6
##    seasonId gameDate            homeTeam    awayTeam homeTeamGoals awayTeamGoals
##       <int> <dttm>              <chr>       <chr>            <int>         <int>
##  1     2016 2017-05-21 21:00:00 Arsenal     Everton              3             1
##  2     2016 2017-05-21 21:00:00 Burnley     West Ham             1             2
##  3     2016 2017-05-21 21:00:00 Chelsea     Sunderl~             5             1
##  4     2016 2017-05-21 21:00:00 Hull        Tottenh~             1             7
##  5     2016 2017-05-21 21:00:00 Leicester   Bournem~             1             1
##  6     2016 2017-05-21 21:00:00 Liverpool   Middles~             3             0
##  7     2016 2017-05-21 21:00:00 Man Utd     Crystal~             2             0
##  8     2016 2017-05-21 21:00:00 Southampton Stoke                0             1
##  9     2016 2017-05-21 21:00:00 Swansea     West Br~             2             1
## 10     2016 2017-05-21 21:00:00 Watford     Man City             0             5
## # ... with 3,030 more rows

Using arrange() to sort the data in ascending order of home goals, with ties broken by ascending away goals:

football %>% arrange(homeTeamGoals, awayTeamGoals)
## # A tibble: 3,040 x 6
##    seasonId gameDate            homeTeam   awayTeam   homeTeamGoals awayTeamGoals
##       <int> <dttm>              <chr>      <chr>              <int>         <int>
##  1     2009 2009-08-22 21:00:00 Birmingham Stoke                  0             0
##  2     2009 2009-08-29 21:00:00 Blackburn  West Ham               0             0
##  3     2009 2009-10-24 21:00:00 Hull       Portsmouth             0             0
##  4     2009 2009-11-01 22:00:00 Birmingham Man City               0             0
##  5     2009 2009-11-28 22:00:00 Blackburn  Stoke                  0             0
##  6     2009 2009-12-05 22:00:00 Blackburn  Liverpool              0             0
##  7     2009 2009-12-12 21:00:00 Hull       Blackburn              0             0
##  8     2009 2009-12-26 18:45:00 Birmingham Chelsea                0             0
##  9     2009 2009-12-26 20:00:00 Fulham     Tottenham              0             0
## 10     2009 2010-01-16 21:00:00 Tottenham  Hull                   0             0
## # ... with 3,030 more rows
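
How the second sort key breaks ties is easier to see on a few rows. Below is a minimal sketch on hypothetical toy data (the `scores` tibble is made up for illustration), assuming only dplyr; it also shows that desc() can be applied to a single key.

```r
library(dplyr)

# Hypothetical toy data (not from football): two pairs of tied home scores.
scores <- tibble(
  homeTeam      = c("A", "B", "C", "D"),
  homeTeamGoals = c(1L, 0L, 1L, 0L),
  awayTeamGoals = c(0L, 2L, 3L, 1L)
)

# Ascending home goals; ties broken by ascending away goals.
asc <- scores %>% arrange(homeTeamGoals, awayTeamGoals)

# Ascending home goals; ties broken by descending away goals.
mixed <- scores %>% arrange(homeTeamGoals, desc(awayTeamGoals))

print(asc$homeTeam)    # "D" "B" "A" "C"
print(mixed$homeTeam)  # "B" "D" "C" "A"
```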


Filter Function

Using filter() to show only the matches won by the away team:

football %>% filter(awayTeamGoals > homeTeamGoals)
## # A tibble: 866 x 6
##    seasonId gameDate            homeTeam    awayTeam   homeTeamGoals awayTeamGoals
##       <int> <dttm>              <chr>       <chr>              <int>         <int>
##  1     2009 2009-08-15 21:00:00 Aston Villa Wigan                  0             2
##  2     2009 2009-08-15 21:00:00 Wolves      West Ham               0             2
##  3     2009 2009-08-15 21:00:00 Blackburn   Man City               0             2
##  4     2009 2009-08-15 21:00:00 Bolton      Sunderland             0             1
##  5     2009 2009-08-15 21:00:00 Portsmouth  Fulham                 0             1
##  6     2009 2009-08-15 23:30:00 Everton     Arsenal                1             6
##  7     2009 2009-08-19 01:45:00 Sunderland  Chelsea                1             3
##  8     2009 2009-08-19 01:45:00 Wigan       Wolves                 0             1
##  9     2009 2009-08-20 01:45:00 Hull        Tottenham              1             5
## 10     2009 2009-08-22 21:00:00 Wigan       Man Utd                0             5
## # ... with 856 more rows

Using filter() to show the matches played by Blackburn, whether at home or away:

football %>% filter(homeTeam == 'Blackburn' | awayTeam == 'Blackburn')
## # A tibble: 114 x 6
##    seasonId gameDate            homeTeam   awayTeam    homeTeamGoals awayTeamGoals
##       <int> <dttm>              <chr>      <chr>               <int>         <int>
##  1     2009 2009-08-15 21:00:00 Blackburn  Man City                0             2
##  2     2009 2009-08-22 21:00:00 Sunderland Blackburn               2             1
##  3     2009 2009-08-29 21:00:00 Blackburn  West Ham                0             0
##  4     2009 2009-09-12 21:00:00 Blackburn  Wolves                  3             1
##  5     2009 2009-09-20 21:00:00 Everton    Blackburn               3             0
##  6     2009 2009-09-26 21:00:00 Blackburn  Aston Villa             2             1
##  7     2009 2009-10-04 19:30:00 Arsenal    Blackburn               6             2
##  8     2009 2009-10-18 19:00:00 Blackburn  Burnley                 3             2
##  9     2009 2009-10-24 23:30:00 Chelsea    Blackburn               5             0
## 10     2009 2009-11-01 00:30:00 Man Utd    Blackburn               2             0
## # ... with 104 more rows
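
The two filters above combine conditions in different ways. The following minimal sketch, on hypothetical toy data (the `games` tibble is made up for illustration) and assuming only dplyr, contrasts `|` (OR) with comma-separated conditions, which filter() combines with AND.

```r
library(dplyr)

# Hypothetical toy data (not from football).
games <- tibble(
  homeTeam      = c("Blackburn", "Chelsea", "Arsenal", "Blackburn"),
  awayTeam      = c("Wigan", "Blackburn", "Hull", "Arsenal"),
  homeTeamGoals = c(0L, 5L, 1L, 2L),
  awayTeamGoals = c(2L, 0L, 1L, 1L)
)

# OR: matches in which Blackburn appears on either side.
played <- games %>%
  filter(homeTeam == "Blackburn" | awayTeam == "Blackburn")

# Comma-separated conditions are combined with AND:
# home matches that Blackburn won.
home_wins <- games %>%
  filter(homeTeam == "Blackburn", homeTeamGoals > awayTeamGoals)

print(nrow(played))     # 3
print(nrow(home_wins))  # 1
```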


Mutate Function

Using mutate() to add a new column GoalsDif containing the home team's goals minus the away team's goals:

paged_table(football %>% mutate(GoalsDif = homeTeamGoals - awayTeamGoals))

Using mutate() to add a new column homeTeamWin that is 1 if the home team won and 0 otherwise:

paged_table(football %>% mutate(homeTeamWin = ifelse(homeTeamGoals > awayTeamGoals, 1, 0)))
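
Since paged_table() output is interactive, the new columns are easier to inspect on a small table. Below is a minimal sketch on hypothetical toy data (the `games` tibble and the `result` column are made up for illustration), assuming only dplyr. It uses case_when() as one possible alternative to nested ifelse() calls, and relies on the fact that columns created earlier in a mutate() call can be used later in the same call.

```r
library(dplyr)

# Hypothetical toy data (not from football): a home win, an away win, a draw.
games <- tibble(
  homeTeamGoals = c(2L, 0L, 1L),
  awayTeamGoals = c(1L, 2L, 1L)
)

results <- games %>%
  mutate(
    GoalsDif    = homeTeamGoals - awayTeamGoals,
    # A logical comparison converted to 1/0 is equivalent to ifelse(cond, 1, 0),
    # except that the column is stored as integer rather than double.
    homeTeamWin = as.integer(homeTeamGoals > awayTeamGoals),
    # `result` is a hypothetical label column; GoalsDif was created just above.
    result      = case_when(
      GoalsDif > 0  ~ "home win",
      GoalsDif == 0 ~ "draw",
      TRUE          ~ "away win"
    )
  )

print(results)
```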


Select Function

Using select() to show only the match date and the two teams playing:

football %>% select(gameDate, homeTeam, awayTeam)
## # A tibble: 3,040 x 3
##    gameDate            homeTeam    awayTeam  
##    <dttm>              <chr>       <chr>     
##  1 2009-08-15 18:45:00 Chelsea     Hull      
##  2 2009-08-15 21:00:00 Aston Villa Wigan     
##  3 2009-08-15 21:00:00 Wolves      West Ham  
##  4 2009-08-15 21:00:00 Blackburn   Man City  
##  5 2009-08-15 21:00:00 Bolton      Sunderland
##  6 2009-08-15 21:00:00 Portsmouth  Fulham    
##  7 2009-08-15 21:00:00 Stoke       Burnley   
##  8 2009-08-15 23:30:00 Everton     Arsenal   
##  9 2009-08-16 19:30:00 Man Utd     Birmingham
## 10 2009-08-16 22:00:00 Tottenham   Liverpool 
## # ... with 3,030 more rows


Combining Functions

Displaying the final standings of the 2016/2017 season (seasonId 2016) when only home matches are counted:

football %>%
  mutate(
    homeTeamWin  = ifelse(homeTeamGoals > awayTeamGoals, 1, 0),
    homeTeamDraw = ifelse(homeTeamGoals == awayTeamGoals, 1, 0),
    homeTeamLose = ifelse(homeTeamGoals < awayTeamGoals, 1, 0),
    GoalsDif     = homeTeamGoals - awayTeamGoals
  ) %>%
  mutate(Points = ifelse(homeTeamWin == 1, 3, ifelse(homeTeamDraw == 1, 1, 0))) %>%
  group_by(seasonId, homeTeam) %>%
  summarise(totalGoals = sum(homeTeamGoals), Win = sum(homeTeamWin), Draw = sum(homeTeamDraw),
            Lose = sum(homeTeamLose), GoalsDif = sum(GoalsDif), Points = sum(Points),
            TotalMatch = n(), .groups = "drop") %>%
  filter(seasonId == 2016) %>%
  arrange(desc(Points), TotalMatch, desc(GoalsDif)) %>%
  mutate(No = row_number()) %>%
  select(No, Team = homeTeam, TotalMatch, Win, Draw, Lose, GoalsDif, Points)
## # A tibble: 20 x 8
##       No Team           TotalMatch   Win  Draw  Lose GoalsDif Points
##    <int> <chr>               <int> <dbl> <dbl> <dbl>    <int>  <dbl>
##  1     1 Tottenham              19    17     2     0       38     53
##  2     2 Chelsea                19    17     0     2       38     51
##  3     3 Arsenal                19    14     3     2       23     45
##  4     4 Everton                19    13     4     2       26     43
##  5     5 Liverpool              19    12     5     2       27     41
##  6     6 Man City               19    11     7     1       20     40
##  7     7 Man Utd                19     8    10     1       14     34
##  8     8 Leicester              19    10     4     5        6     34
##  9     9 Burnley                19    10     3     6        6     33
## 10    10 Bournemouth            19     9     4     6        6     31
## 11    11 West Brom              19     9     2     8        5     29
## 12    12 Watford                19     8     4     7       -4     28
## 13    13 Hull                   19     8     4     7       -7     28
## 14    14 Stoke                  19     7     6     6        0     27
## 15    15 Swansea                19     8     3     8       -7     27
## 16    16 West Ham               19     7     4     8      -12     25
## 17    17 Southampton            19     6     6     7       -4     24
## 18    18 Crystal Palace         19     6     2    11       -1     20
## 19    19 Middlesbrough          19     4     6     9       -6     18
## 20    20 Sunderland             19     3     5    11      -18     14
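
The same combination of functions could be extended to count away matches as well. The following is only a sketch under stated assumptions, not part of the original assignment: it uses hypothetical toy data (a made-up three-team round robin in `games`), the hypothetical helper columns GoalsFor and GoalsAgainst, and dplyr's bind_rows() and transmute() to stack each match once from the home side and once from the away side.

```r
library(dplyr)

# Hypothetical toy results: a three-team double round robin (not from football).
games <- tibble(
  homeTeam      = c("A", "B", "C", "B", "C", "A"),
  awayTeam      = c("B", "C", "A", "A", "B", "C"),
  homeTeamGoals = c(2L, 1L, 0L, 0L, 3L, 1L),
  awayTeamGoals = c(0L, 1L, 1L, 2L, 1L, 1L)
)

# One row per team per match: first from the home side, then from the away side.
home <- games %>%
  transmute(Team = homeTeam, GoalsFor = homeTeamGoals, GoalsAgainst = awayTeamGoals)
away <- games %>%
  transmute(Team = awayTeam, GoalsFor = awayTeamGoals, GoalsAgainst = homeTeamGoals)

standings <- bind_rows(home, away) %>%
  mutate(
    Diff   = GoalsFor - GoalsAgainst,
    Points = case_when(Diff > 0 ~ 3L, Diff == 0 ~ 1L, TRUE ~ 0L)
  ) %>%
  group_by(Team) %>%
  summarise(
    TotalMatch = n(),
    Win        = sum(Diff > 0),
    Draw       = sum(Diff == 0),
    Lose       = sum(Diff < 0),
    GoalsDif   = sum(Diff),
    Points     = sum(Points),
    .groups    = "drop"
  ) %>%
  arrange(desc(Points), desc(GoalsDif)) %>%
  mutate(No = row_number()) %>%
  select(No, Team, TotalMatch, Win, Draw, Lose, GoalsDif, Points)

print(standings)
```

In this toy table, team A finishes first with 10 points, followed by C with 5 and B with 1. Applying the same idea to football would additionally require grouping by seasonId, as in the home-only table above.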