Packages Needed
library(tuber)
library(magrittr)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.4.0 ✔ purrr 1.0.1
## ✔ tibble 3.1.8 ✔ dplyr 1.0.10
## ✔ tidyr 1.2.1 ✔ stringr 1.5.0
## ✔ readr 2.1.3 ✔ forcats 0.5.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ tidyr::extract() masks magrittr::extract()
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ✖ purrr::set_names() masks magrittr::set_names()
library(purrr)
YouTube API key setup instructions
Log in to Google’s developer console. On the dashboard, click “Enable
APIs and Services”, scroll through the list of APIs until you reach
YouTube Data API v3, and enable it.
To create credentials, an OAuth client ID is required. Once it is
created, you will be given a client ID and a client secret. Copy
and paste both of these into RStudio to complete the following
steps:
RStudio Setup
Save the client ID and client secret as two objects in RStudio:
client_id <- "454647064650-djq2l0tjvhusctmdhmc7uqfa48b060cd.apps.googleusercontent.com"
client_secret <- "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"
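Hard-coding credentials in a script makes them easy to leak when the file is shared. A minimal sketch of one alternative, assuming the values are stored in environment variables instead. The variable names YT_CLIENT_ID and YT_CLIENT_SECRET and the helper read_credential() are hypothetical and not part of tuber:

```r
# Hypothetical helper: read a credential from an environment variable
# and fail early with a clear message if it is not set.
read_credential <- function(var_name) {
  value <- Sys.getenv(var_name, unset = "")
  if (!nzchar(value)) {
    stop("Environment variable ", var_name, " is not set", call. = FALSE)
  }
  value
}

# Usage (assumes both variables were defined beforehand, e.g. in ~/.Renviron):
# client_id     <- read_credential("YT_CLIENT_ID")
# client_secret <- read_credential("YT_CLIENT_SECRET")
```

The objects client_id and client_secret can then be passed to yt_oauth() exactly as before, without the values ever appearing in the script.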
Authenticate the application. Running this command opens a browser
window where you sign in to the Google account you used in the
developer console.
yt_oauth(app_id = client_id,
app_secret = client_secret,
token = '')
After signing in, make sure you click “Allow” when asked whether
you trust the application.
The following code first extracts the playlist ID from a playlist URL
and then uses tuber's get_playlist_items() function to download the
items of that YouTube playlist.
cody_ko_playlist_id <- stringr::str_split(
string = "https://youtube.com/playlist?list=PL2ifFb9uR3cLlyW4OcCAeOJtiPjjHZb7_",
pattern = "=",
n = 2,
simplify = TRUE)[ , 2]
cody_ko_playlist_id
## [1] "PL2ifFb9uR3cLlyW4OcCAeOJtiPjjHZb7_"
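Splitting on the first "=" works for this URL, but it would keep any extra query parameters that follow the ID. A minimal sketch of a more tolerant approach in base R is shown here. The helper name extract_playlist_id() and the second example URL are hypothetical:

```r
# Hypothetical helper: pull the value of the "list" query parameter out of a
# YouTube URL, ignoring any parameters that come before or after it.
extract_playlist_id <- function(url) {
  match <- regmatches(url, regexec("[?&]list=([^&#]+)", url))[[1]]
  if (length(match) < 2) {
    return(NA_character_)  # no "list" parameter found
  }
  match[2]  # first capture group: the playlist ID
}

# The playlist URL used above
extract_playlist_id("https://youtube.com/playlist?list=PL2ifFb9uR3cLlyW4OcCAeOJtiPjjHZb7_")
# A made-up URL with parameters on both sides of the ID
extract_playlist_id("https://www.youtube.com/watch?v=abc&list=PL123&index=2")
```

Returning NA for a URL without a "list" parameter makes the failure visible before the ID is handed to the API.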
playlist_info <- tuber::get_playlist_items(
  filter = c(playlist_id = cody_ko_playlist_id),
  part = "contentDetails",
  max_results = 16)
Once the playlist items are downloaded, the API can also return
per-video statistics such as views and likes, which can then be
organised into a data frame.
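# With the default simplify = TRUE, get_playlist_items() returns a data frame;
# the video IDs are in its contentDetails.videoId column
video_ids <- as.character(playlist_info$contentDetails.videoId)
# Titles belong to the "snippet" part, not "contentDetails", so fetch them per video
# (assumes get_video_details() returns the raw list response)
get_title <- function(id) {
  tuber::get_video_details(video_id = id, part = "snippet")$items[[1]]$snippet$title
}
video_titles <- sapply(video_ids, get_title, USE.NAMES = FALSE)
# get_stats() returns a flat list per video (viewCount, likeCount, ...)
get_all_stats <- function(id) {
  tuber::get_stats(video_id = id)
}
video_stats <- lapply(video_ids, get_all_stats)
# Public dislike counts are no longer returned by the YouTube Data API,
# so the dislikes column is dropped
video_data <- data.frame(video_title = video_titles,
  views = as.numeric(sapply(video_stats, function(x) x$viewCount)),
  likes = as.numeric(sapply(video_stats, function(x) x$likeCount)))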
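A video's statistics can omit a field, for example when its like count is hidden. In that case sapply() returns a list instead of a vector and data.frame() fails. The sketch below shows one way to assemble the table so that missing counts become NA. It runs on made-up data and assumes each element is a flat list of counts stored as strings. The helpers count_or_na() and stats_to_df() are hypothetical:

```r
# Hypothetical helper: extract one count from a stats list, returning NA
# when the field is absent instead of NULL.
count_or_na <- function(stats, field) {
  value <- stats[[field]]
  if (is.null(value)) NA_real_ else as.numeric(value)
}

# Hypothetical helper: combine titles and a list of stats lists into a data frame.
stats_to_df <- function(titles, stats_list) {
  data.frame(
    video_title = titles,
    views = vapply(stats_list, count_or_na, numeric(1), field = "viewCount"),
    likes = vapply(stats_list, count_or_na, numeric(1), field = "likeCount"),
    stringsAsFactors = FALSE
  )
}

# Made-up example data; the second video has no likeCount field.
mock_stats <- list(
  list(id = "a1", viewCount = "1500", likeCount = "120"),
  list(id = "b2", viewCount = "300")
)
stats_to_df(c("First video", "Second video"), mock_stats)
```

Using vapply() with numeric(1) also guarantees that each column is a plain numeric vector, so later summaries or plots do not need an extra conversion step.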