Below is the latest European Championship standings table, together with visualizations of it.
Source: https://www.bbc.com/sport/football/european-championship/table Last updated 16th June 2021 at 18:54
Loading the libraries
library(rvest)
library(stringr)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.3 ✓ purrr 0.3.4
## ✓ tibble 3.1.2 ✓ dplyr 1.0.6
## ✓ tidyr 1.1.3 ✓ forcats 0.5.1
## ✓ readr 1.4.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x readr::guess_encoding() masks rvest::guess_encoding()
## x dplyr::lag() masks stats::lag()
library(data.table)
##
## Attaching package: 'data.table'
## The following objects are masked from 'package:dplyr':
##
## between, first, last
## The following object is masked from 'package:purrr':
##
## transpose
Reading the HTML from the web page
url <- "https://www.bbc.com/sport/football/european-championship/table"
html <- url %>% read_html
Creating the data frame and displaying its structure
epl_table <- html %>%
html_node(".gs-o-table") %>%
html_table
str(epl_table)
## tibble [5 × 12] (S3: tbl_df/tbl/data.frame)
## $ : chr [1:5] "1" "2" "3" "4" ...
## $ : chr [1:5] "team has moved up" "team has moved down" "team hasn't moved" "team hasn't moved" ...
## $ Team: chr [1:5] "Wales" "Italy" "Switzerland" "Turkey" ...
## $ P : chr [1:5] "2" "1" "1" "2" ...
## $ W : chr [1:5] "1" "1" "0" "0" ...
## $ D : chr [1:5] "1" "0" "1" "0" ...
## $ L : chr [1:5] "0" "0" "0" "2" ...
## $ F : chr [1:5] "3" "3" "1" "0" ...
## $ A : chr [1:5] "1" "0" "1" "5" ...
## $ GD : chr [1:5] "2" "3" "0" "-5" ...
## $ Pts : chr [1:5] "4" "3" "1" "0" ...
## $ Form: chr [1:5] "DDrew 1 - 1 against Switzerland on June 12th 2021.WWon 2 - 0 against Turkey on June 16th 2021." "WWon 3 - 0 against Turkey on June 11th 2021." "DDrew 1 - 1 against Wales on June 12th 2021." "LLost 0 - 3 against Italy on June 11th 2021.LLost 0 - 2 against Wales on June 16th 2021." ...
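Every column above comes back as `chr`, even the numeric ones. A likely reason is the trailing "Last updated" row, which puts text into every column. The following is a minimal sketch of that effect on a small inline table; the HTML is a hypothetical miniature written for illustration, not the real BBC markup, and `demo` is an illustrative name.

```r
library(rvest)

# Hypothetical miniature of the scraped table: two team rows plus a footer row
# that spans all columns (the real page structure is an assumption here)
page <- minimal_html('
  <table class="gs-o-table">
    <tr><th>Team</th><th>P</th><th>Pts</th></tr>
    <tr><td>Wales</td><td>2</td><td>4</td></tr>
    <tr><td>Italy</td><td>1</td><td>3</td></tr>
    <tr><td colspan="3">Last updated (placeholder text)</td></tr>
  </table>')

# Same selector and parsing calls as in the post, applied to the inline HTML
demo <- page %>%
  html_node(".gs-o-table") %>%
  html_table()

# The footer text lands in every column, so no column can be converted to numbers
str(demo)
```

Because one text cell is enough to keep a whole column as character, the footer row has to be removed before the numeric columns can be converted.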
Tidying the data frame
epl_table
## # A tibble: 5 x 12
## `` `` Team P W D L F A GD Pts Form
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 1 team … Wales 2 1 1 0 3 1 2 4 DDrew…
## 2 2 team … Italy 1 1 0 0 3 0 3 3 WWon …
## 3 3 team … Switz… 1 0 1 0 1 1 0 1 DDrew…
## 4 4 team … Turkey 2 0 0 2 0 5 -5 0 LLost…
## 5 Last … Last … Last … Last … Last … Last … Last… Last… Last… Last… Last… Last …
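# Drop the two unnamed leading columns (position and movement indicator)
epl_table[1:2] <- list(NULL)
# Caution: the table has only 5 rows, so index -21 removes nothing and the "Last updated" footer row stays; epl_table[-nrow(epl_table), ] would drop it
epl_table <- epl_table[-21,]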
epl_table
## # A tibble: 5 x 10
## Team P W D L F A GD Pts Form
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 Wales 2 1 1 0 3 1 2 4 DDrew 1…
## 2 Italy 1 1 0 0 3 0 3 3 WWon 3 …
## 3 Switze… 1 0 1 0 1 1 0 1 DDrew 1…
## 4 Turkey 2 0 0 2 0 5 -5 0 LLost 0…
## 5 Last u… Last u… Last u… Last u… Last u… Last u… Last u… Last … Last … Last up…
str(epl_table)
## tibble [5 × 10] (S3: tbl_df/tbl/data.frame)
## $ Team: chr [1:5] "Wales" "Italy" "Switzerland" "Turkey" ...
## $ P : chr [1:5] "2" "1" "1" "2" ...
## $ W : chr [1:5] "1" "1" "0" "0" ...
## $ D : chr [1:5] "1" "0" "1" "0" ...
## $ L : chr [1:5] "0" "0" "0" "2" ...
## $ F : chr [1:5] "3" "3" "1" "0" ...
## $ A : chr [1:5] "1" "0" "1" "5" ...
## $ GD : chr [1:5] "2" "3" "0" "-5" ...
## $ Pts : chr [1:5] "4" "3" "1" "0" ...
## $ Form: chr [1:5] "DDrew 1 - 1 against Switzerland on June 12th 2021.WWon 2 - 0 against Turkey on June 16th 2021." "WWon 3 - 0 against Turkey on June 11th 2021." "DDrew 1 - 1 against Wales on June 12th 2021." "LLost 0 - 3 against Italy on June 11th 2021.LLost 0 - 2 against Wales on June 16th 2021." ...
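The structure above still contains the "Last updated" footer row, and all the statistics are stored as text. One possible way to handle both issues is sketched below in base R. The data frame `standings` is a hand-typed copy of a few columns from the output above, and the footer text is a placeholder, so treat the names and values as illustrative assumptions.

```r
# Hand-typed copy of a few columns from the printed table (illustrative only)
standings <- data.frame(
  Team = c("Wales", "Italy", "Switzerland", "Turkey", "Last updated (placeholder)"),
  P    = c("2", "1", "1", "2", "Last updated (placeholder)"),
  GD   = c("2", "3", "0", "-5", "Last updated (placeholder)"),
  Pts  = c("4", "3", "1", "0", "Last updated (placeholder)"),
  stringsAsFactors = FALSE
)

# Keep only rows whose Pts value is a whole number, which drops the footer row
is_team_row <- grepl("^-?[0-9]+$", standings$Pts)
clean <- standings[is_team_row, ]

# Convert every column except Team from text to integer
num_cols <- setdiff(names(clean), "Team")
clean[num_cols] <- lapply(clean[num_cols], as.integer)

str(clean)
```

Filtering on the content of `Pts` rather than on a fixed row index keeps the step valid for tables with a different number of teams.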
Reformatting the Form column
epl_table$Form[1]
## [1] "DDrew 1 - 1 against Switzerland on June 12th 2021.WWon 2 - 0 against Turkey on June 16th 2021."
extract_form <- function(form){
str_extract_all(form, "WWon|DDrew|LLost")
}
form <- sapply(epl_table$Form, extract_form, USE.NAMES = FALSE)
str(form)
## List of 5
## $ : chr [1:2] "DDrew" "WWon"
## $ : chr "WWon"
## $ : chr "DDrew"
## $ : chr [1:2] "LLost" "LLost"
## $ : chr(0)
Tidying the Form column
simply_form <- function(form){
form %>%
str_extract("W|D|L") %>%
paste(collapse = ",")
}
form <- sapply(form, simply_form)
str(form)
## chr [1:5] "D,W" "W" "D" "L,L" ""
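The two steps above (pull out the `WWon`/`DDrew`/`LLost` tokens, then keep only their first letter) can also be sketched in base R without stringr. The vector `form_raw` below is a hand-typed sample of the Form strings shown earlier, and `form_short` is an illustrative name.

```r
# Sample Form strings copied from the output above, plus an empty footer-like entry
form_raw <- c(
  "DDrew 1 - 1 against Switzerland on June 12th 2021.WWon 2 - 0 against Turkey on June 16th 2021.",
  "WWon 3 - 0 against Turkey on June 11th 2021.",
  ""
)

# Step 1: extract every result token from each string
tokens <- regmatches(form_raw, gregexpr("WWon|DDrew|LLost", form_raw))

# Step 2: keep the first letter of each token and join the letters with commas
form_short <- vapply(
  tokens,
  function(x) paste(substr(x, 1, 1), collapse = ","),
  character(1)
)

form_short
```

An entry with no matches yields an empty string, which is the same behavior as the last element of the `form` vector above.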
Updating the Form column with the form vector
epl_table$Form <- form
print(epl_table)
## # A tibble: 5 x 10
## Team P W D L F A GD Pts Form
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 Wales 2 1 1 0 3 1 2 4 "D,W"
## 2 Italy 1 1 0 0 3 0 3 3 "W"
## 3 Switzer… 1 0 1 0 1 1 0 1 "D"
## 4 Turkey 2 0 0 2 0 5 -5 0 "L,L"
## 5 Last up… Last u… Last u… Last u… Last u… Last u… Last u… Last u… Last u… ""
Legend
P: Played
W: Won
D: Drawn
L: Lost
F: Goals for
A: Goals against
GD: Goal difference
Pts: Points
Form: Results of recent matches
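Visualizing the points earned by each team
# Keep only team rows (numeric Pts) and plot Pts as a number, not text
ggplot(data = filter(epl_table, str_detect(Pts, "^-?\\d+$"))) +
geom_bar(mapping = aes(x = Team, y = as.numeric(Pts)), stat = "identity") +
labs(y = "Pts")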
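Visualizing goal difference (goals scored minus goals conceded)
# Keep only team rows (numeric GD) and plot GD as a number, not text
ggplot(data = filter(epl_table, str_detect(GD, "^-?\\d+$"))) +
geom_bar(mapping = aes(x = Team, y = as.numeric(GD)), stat = "identity") +
labs(y = "GD")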
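Goal difference collapses goals scored and goals conceded into a single number. To compare the two directly, one option is to reshape the `F` and `A` columns into long format and draw them side by side. This is only a sketch: the data frame `goals` is a hand-typed copy of the values in the table above, and the names `goals_long`, `Type`, and `Goals` are illustrative.

```r
library(ggplot2)
library(tidyr)

# Goals for (F) and against (A), hand-typed from the table above
goals <- data.frame(
  Team = c("Wales", "Italy", "Switzerland", "Turkey"),
  F    = c(3, 3, 1, 0),
  A    = c(1, 0, 1, 5)
)

# One row per team and goal type, so that ggplot can map the type to fill colour
goals_long <- pivot_longer(
  goals,
  cols = c("F", "A"),
  names_to = "Type",
  values_to = "Goals"
)

# Side-by-side bars: goals scored (F) next to goals conceded (A) for each team
p <- ggplot(goals_long, aes(x = Team, y = Goals, fill = Type)) +
  geom_col(position = "dodge")
print(p)
```

The dodged bars show both quantities at once, which the single GD bar cannot do.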
References:
https://www.bbc.com/sport/football/european-championship/table
https://www.nurandi.id/blog/web-scraping-dengan-r-dan-rvest-parsing-tabel-html/
https://www.netlify.com/
https://gohugo.io/
https://www.netlify.com/authors/david-calavera/
https://bookdown.org/moh_rosidi2610/Metode_Numerik/dataviz.html#plotfunc