# Question 5
# install.packages("robotstxt")
library(robotstxt)
## Warning: package 'robotstxt' was built under R version 4.3.3
paths_allowed("https://www.imdb.com/title/tt7235466/fullcredits?ref_=tt_cl_sm")
## www.imdb.com
## [1] TRUE
# Question 6
# install.packages("tidyverse")  # rvest is installed as part of the tidyverse
library(rvest)
imdb <- read_html("https://www.imdb.com/title/tt7235466/fullcredits?ref_=tt_cl_sm")
tables <- html_elements(imdb,"table")
tibble_list <- html_table(tables[3])
Tibble <- tibble_list[[1]]
str(Tibble)
## tibble [3,152 × 4] (S3: tbl_df/tbl/data.frame)
## $ X1: logi [1:3152] NA NA NA NA NA NA ...
## $ X2: chr [1:3152] "" "Angela Bassett" "" "Peter Krause" ...
## $ X3: chr [1:3152] "" "..." "" "..." ...
## $ X4: chr [1:3152] "" "Athena Grant\n / ... \n 115 episodes, 2018-2025" "" "Bobby Nash\n 115 episodes, 2018-2025" ...
nrow(Tibble)
## [1] 3152
ncol(Tibble)
## [1] 4
# Question 7
Tibble <- tibble_list[[1]]
Tibble_subset <- Tibble[, c(2, 4), drop = FALSE]
str(Tibble_subset)
## tibble [3,152 × 2] (S3: tbl_df/tbl/data.frame)
## $ X2: chr [1:3152] "" "Angela Bassett" "" "Peter Krause" ...
## $ X4: chr [1:3152] "" "Athena Grant\n / ... \n 115 episodes, 2018-2025" "" "Bobby Nash\n 115 episodes, 2018-2025" ...
# After looking through the data set, I found 1576 obs. of 2 variables,
# half of the 3,152 rows that str() reports (the remaining rows appear to be blank)
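The `str()` output shows empty strings alternating with cast names in `X2`, so a count of real observations depends on dropping the empty rows first. The following is a minimal sketch of that step, assuming the blank rows are only spacers; `cast` and `cast_clean` are hypothetical stand-ins built by hand, not the scraped table.

```r
# Hypothetical stand-in for Tibble_subset: every other row is a blank spacer.
cast <- data.frame(
  X2 = c("", "Angela Bassett", "", "Peter Krause"),
  X4 = c("", "Athena Grant", "", "Bobby Nash"),
  stringsAsFactors = FALSE
)

# Keep only the rows whose name column is non-empty after trimming whitespace.
cast_clean <- cast[trimws(cast$X2) != "", , drop = FALSE]

nrow(cast)        # counts the spacer rows as well
nrow(cast_clean)  # counts only the rows that carry a name
```

Applied to `Tibble_subset`, the same filter on `X2` would show whether the non-blank rows really number 1576.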
# Question 8
# Rename the two columns of the subset from Question 7 (one name per column).
# Assigning two names to the four-column Tibble instead triggers tibble's
# `names<-()` length warning.
colnames(Tibble_subset) <- c("v1","v2")
# ncol(Tibble_subset) <- c("v1","v2") DIDN'T WORK: ncol() has no assignment form
# rownames(Tibble_subset) <- c("v1","v2") DIDN'T WORK: that sets row names, not column names
# names() is an equivalent alternative to colnames() here
names(Tibble_subset) <- c("v1","v2")
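As of tibble 3.0.0, `names<-()` expects exactly one name per column, so it can help to check the lengths before renaming. This is a minimal sketch of that check using base R only; `cast` and `new_names` are hypothetical names introduced for illustration.

```r
# Hypothetical two-column stand-in for the subset from Question 7.
cast <- data.frame(
  X2 = c("Angela Bassett", "Peter Krause"),
  X4 = c("Athena Grant", "Bobby Nash"),
  stringsAsFactors = FALSE
)

new_names <- c("v1", "v2")

# Stop early unless there is exactly one new name per column.
stopifnot(length(new_names) == ncol(cast))

names(cast) <- new_names
names(cast)
```

The `stopifnot()` guard fails immediately if the object still has four columns, which points at the wrong object rather than leaving partially named columns behind.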
# Question 9
# Inspect webpage
CSS_scraping <- html_element(imdb, "#fullcredits_content > table:nth-child(38)")
Table1 <- html_table(CSS_scraping)
nrow(Table1)
## [1] 196
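A selector such as `table:nth-child(38)` matches a table that is the 38th child element of its parent, not the 38th table on the page. The sketch below illustrates this on a small hand-written page, assuming rvest 1.0.0 or later for `minimal_html()` and `html_element()`; the HTML, the `#content` id, and the object names are hypothetical and unrelated to the actual IMDb markup.

```r
library(rvest)

# Hypothetical page: headings and tables alternate inside one container.
page <- minimal_html('
  <div id="content">
    <h4>Cast</h4>
    <table><tr><td>Angela Bassett</td><td>Athena Grant</td></tr></table>
    <h4>Crew</h4>
    <table>
      <tr><td>Crew A</td><td>Role A</td></tr>
      <tr><td>Crew B</td><td>Role B</td></tr>
    </table>
  </div>
')

# The second table is the 4th child of #content (h4, table, h4, table).
crew <- html_table(html_element(page, "#content > table:nth-child(4)"))
nrow(crew)

# The 3rd child is an <h4>, so no table matches and a missing node is returned.
no_match <- html_element(page, "#content > table:nth-child(3)")
inherits(no_match, "xml_missing")
```

Because the index counts sibling elements of every kind, a selector copied from the browser can stop matching if the page adds or removes a heading before the target table.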