This project uses the YouTube Data API to study the profiles and engagement statistics of COVID-19-related videos on YouTube.

First, let’s load the three packages we need: jsonlite, RCurl, and plotly.

require(jsonlite)
## Loading required package: jsonlite
require(RCurl)
## Loading required package: RCurl
require(plotly)
## Loading required package: plotly
## Loading required package: ggplot2
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout

Set up your API key.

API <- "PUT YOUR YOUTUBE TOKEN HERE"
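As an alternative to pasting the key into the script, it can be read from an environment variable. The sketch below assumes the key was stored beforehand under the hypothetical name YOUTUBE_API_KEY (for example in ~/.Renviron); both that name and the helper get_api_key are illustrative and not part of the original workflow.

```r
# Minimal sketch: read the API key from an environment variable.
# Assumption: the variable name YOUTUBE_API_KEY is hypothetical; use any name you like.
get_api_key <- function(var = "YOUTUBE_API_KEY") {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("Environment variable ", var, " is not set.")
  }
  key
}

# Usage (uncomment once the variable is set):
# API <- get_api_key()
```

Keeping the key out of the source file makes it less likely to be shared by accident along with the script.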

Read the YouTube Search documentation (https://developers.google.com/youtube/v3/docs/search), then define the search function getvideo_search.

getvideo_search <- function(sterm,API_key,nextp=NA){
  nextpage <- ifelse(is.na(nextp),"",paste0("&pageToken=",nextp))
  url <- paste0("https://www.googleapis.com/youtube/v3/search?part=snippet&maxResults=50&q=",URLencode(sterm),"&key=",API_key,nextpage)
  result <- fromJSON(txt=url, flatten = TRUE)  
  return(result)
}
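To see what the request URL looks like without calling the API, the URL construction can be pulled into its own function. This is a sketch under assumptions: build_search_url is a hypothetical helper, not part of the original code; it adds the search endpoint's type parameter to restrict results to videos, and it encodes reserved characters such as "&" in the search term.

```r
# Minimal sketch: build the search URL only (no network call).
# Assumption: build_search_url is a hypothetical helper for illustration.
build_search_url <- function(sterm, API_key, nextp = NA, type = "video") {
  base <- "https://www.googleapis.com/youtube/v3/search"
  query <- paste0(
    "?part=snippet&maxResults=50",
    "&type=", type,
    # reserved = TRUE also escapes characters such as "&" and "?"
    "&q=", utils::URLencode(sterm, reserved = TRUE),
    "&key=", API_key
  )
  # Append the page token only when one was supplied
  if (!is.na(nextp)) query <- paste0(query, "&pageToken=", nextp)
  paste0(base, query)
}

build_search_url("covid vaccine & side effect", "KEY")
```

Restricting the search to videos may help later steps, since channel and playlist results carry no videoId.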

Let’s search for “covid vaccine side effect”. The main part of the returned list is “items”, a data.frame that holds the 50 results on the page. “nextPageToken” is the token used to request the next page.

covid_vaccine <- getvideo_search('covid vaccine side effect',API)
dim(covid_vaccine[["items"]])
## [1] 50 20
class(covid_vaccine[["items"]])
## [1] "data.frame"
covid_vaccine[["nextPageToken"]]
## [1] "CDIQAA"

Next, a data.frame called video_data records all returned items, 50 at a time. Its id.videoId column is used in the next step to collect engagement figures. A for loop then fetches pages 2 through 5, passing each page’s nextPageToken to the following request. (Remark: “thumbnails” would otherwise be a data.frame nested inside a data.frame, which breaks rbind; because fromJSON is called with flatten = TRUE, it is already expanded into ordinary columns.)

video_data <- covid_vaccine[["items"]]
covid_vaccine_prepage <- covid_vaccine

for (page in 2:5){
  covid_vaccine_newpage <- getvideo_search('covid vaccine side effect',API,covid_vaccine_prepage[["nextPageToken"]])
  new_data <- covid_vaccine_newpage[["items"]]
  video_data <- rbind(video_data,new_data)
  covid_vaccine_prepage <- covid_vaccine_newpage
  Sys.sleep(5)
}
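The paging pattern above can be sketched in a form that also stops early when the API returns no further nextPageToken. The helper collect_pages and the stand-in fake_fetch are hypothetical, and the toy data are made up purely so the example runs offline.

```r
# Minimal sketch: follow nextPageToken until it is missing or max_pages is reached.
# Assumption: fetch_page is any function that takes a token and returns a list
# with "items" (a data.frame) and, optionally, "nextPageToken".
collect_pages <- function(fetch_page, max_pages = 5) {
  pages <- list()
  token <- NA
  for (page in seq_len(max_pages)) {
    result <- fetch_page(token)
    pages[[page]] <- result[["items"]]
    token <- result[["nextPageToken"]]
    if (is.null(token)) break  # no further pages
  }
  do.call(rbind, pages)
}

# Stand-in for the API: three pages of two made-up rows each
fake_fetch <- function(token) {
  page <- if (is.na(token)) 1 else as.integer(token)
  list(
    items = data.frame(id.videoId = paste0("v", page, c("a", "b")),
                       stringsAsFactors = FALSE),
    nextPageToken = if (page < 3) as.character(page + 1) else NULL
  )
}

demo <- collect_pages(fake_fetch)
nrow(demo)
```

With the real search function, the call would look like collect_pages(function(tok) getvideo_search('covid vaccine side effect', API, tok)), adding a Sys.sleep inside the wrapper if a pause between requests is wanted.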

Now it’s time to collect the engagement statistics. Let’s define another function that returns the engagement statistics for a given videoId.

getstats_video<-function(video_id,API_key){
  url <- paste0("https://www.googleapis.com/youtube/v3/videos?part=snippet,statistics&id=",video_id,"&key=",API_key)
  result <- fromJSON(txt=url, flatten = TRUE)  
  return(result$items)
}

Write a for loop to get the engagement statistics (views, likes, dislikes, comments, and favorites) of every collected video. The loop stores each value in a new column of video_data. Because the returned data may lack some fields, for example commentCount or favoriteCount, each field is checked with is.null and recorded as NA when it is absent.

# Create the statistics columns up front so each row is filled independently
video_data[c("viewCount","likeCount","dislikeCount","favoriteCount","commentCount")] <- NA

for (i in seq_len(nrow(video_data))){
  stats_result <- getstats_video(video_data$id.videoId[i],API)
  # Record NA whenever the API omits a field
  video_data$viewCount[i] <- ifelse(is.null(stats_result$statistics.viewCount),NA,stats_result$statistics.viewCount)
  video_data$likeCount[i] <- ifelse(is.null(stats_result$statistics.likeCount),NA,stats_result$statistics.likeCount)
  video_data$dislikeCount[i] <- ifelse(is.null(stats_result$statistics.dislikeCount),NA,stats_result$statistics.dislikeCount)
  video_data$favoriteCount[i] <- ifelse(is.null(stats_result$statistics.favoriteCount),NA,stats_result$statistics.favoriteCount)
  video_data$commentCount[i] <- ifelse(is.null(stats_result$statistics.commentCount),NA,stats_result$statistics.commentCount)
  Sys.sleep(5)  # pause between requests
}
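One request per video is simple but slow. The videos endpoint accepts a comma-separated list of IDs, so the IDs can be grouped into batches first. The sketch below assumes the commonly cited limit of 50 IDs per request; chunk_ids and build_stats_url are hypothetical helpers, and the IDs are made up.

```r
# Minimal sketch: group video IDs into batches and build one URL per batch.
# Assumption: at most 50 IDs per request; check the API reference before relying on it.
chunk_ids <- function(ids, size = 50) {
  ids <- ids[!is.na(ids)]  # drop results that have no videoId
  split(ids, ceiling(seq_along(ids) / size))
}

build_stats_url <- function(id_chunk, API_key) {
  paste0("https://www.googleapis.com/youtube/v3/videos?part=statistics&id=",
         paste(id_chunk, collapse = ","), "&key=", API_key)
}

ids <- c(paste0("id", 1:120), NA)  # made-up IDs plus one missing value
chunks <- chunk_ids(ids)
lengths(chunks)
build_stats_url(chunks[[3]][1:2], "KEY")
```

Each URL could then be passed to fromJSON as in getstats_video, with the returned rows matched back to video_data by video ID.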

Show the most viewed YouTube videos (head() returns the first six rows by default).

video_data_sorted <- video_data[order(as.integer(video_data$viewCount),decreasing=T),]

head(video_data_sorted[,c("snippet.title","viewCount")])
##                                                                                            snippet.title
## 45                                                              What The COVID Vaccine Does To Your Body
## 126                            Doctor Dies After Getting COVID 19 Vaccine? || Florida Doctor&#39;s Death
## 49            COVID-19 survivor has rare side effect of COVID-19 treatment -- a massively swollen tongue
## 41                                             What Did Bill Gates Say About COVID Vaccine Side Effects?
## 66    Vaccine Side Effect? Norway Sounds Alarm As 23 Elderly Patients Die After Receiving Pfizer Vaccine
## 181 COVID 19 Vaccine Deep Dive: Safety, Immunity, RNA Production, w Shane Crotty, PhD (Pfizer / Moderna)
##     viewCount
## 45    3562398
## 126   3190677
## 49    3190313
## 41    2269531
## 66    1438944
## 181   1252357

Show the most viewed YouTube channels by total views.

video_totalview <- aggregate(as.integer(viewCount) ~ snippet.channelTitle,data=video_data,sum,na.rm=T)

video_totalview <- video_totalview[order(video_totalview$`as.integer(viewCount)`,decreasing=T),]

head(video_totalview)
##                             snippet.channelTitle as.integer(viewCount)
## 46                            Doctor Mike Hansen               3598909
## 18                                   AsapSCIENCE               3562398
## 85                                       KHOU 11               3203115
## 40                                          CRUX               2799363
## 123                                  PowerfulJRE               2269531
## 100 MedCram - Medical Lectures Explained CLEARLY               1252357

Show the most viewed YouTube channels by average views per video.

video_averageview <- aggregate(as.integer(viewCount) ~ snippet.channelTitle,data=video_data,mean,na.rm=T)

video_averageview <- video_averageview[order(video_averageview$`as.integer(viewCount)`,decreasing=T),]

head(video_averageview)
##                             snippet.channelTitle as.integer(viewCount)
## 18                                   AsapSCIENCE               3562398
## 123                                  PowerfulJRE               2269531
## 100 MedCram - Medical Lectures Explained CLEARLY               1252357
## 46                            Doctor Mike Hansen               1199636
## 40                                          CRUX                933121
## 112                                        NBCLA                821423
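The difference between the total and the average ranking is easier to see on a tiny example. The sketch below uses made-up numbers and converts the counts to a numeric column named views (a hypothetical name) before aggregating, which also gives the result a plain column name instead of `as.integer(viewCount)`.

```r
# Minimal sketch with made-up data: total vs. average views per channel.
toy <- data.frame(
  snippet.channelTitle = c("A", "A", "B"),
  viewCount = c("100", "300", "250"),  # the API returns counts as strings
  stringsAsFactors = FALSE
)
toy$views <- as.numeric(toy$viewCount)

total   <- aggregate(views ~ snippet.channelTitle, data = toy, FUN = sum)
average <- aggregate(views ~ snippet.channelTitle, data = toy, FUN = mean)

total[order(total$views, decreasing = TRUE), ]      # A leads on total views
average[order(average$views, decreasing = TRUE), ]  # B leads on average views
```

Channel A ranks first by total because it has two videos, while channel B ranks first by average with a single, more viewed video.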

Last, we create a horizontal bar chart of the top channels by average views and upload it to the Plotly site. (Sign up for a free Plotly account at https://chart-studio.plotly.com/.)

p1 <- plot_ly(head(video_averageview, 5), y = ~snippet.channelTitle, x = ~`as.integer(viewCount)`, type = 'bar', orientation = 'h', name = "Top 5 Most Viewed YouTube Channels Related to COVID (average views per video)")
p1
Sys.setenv("plotly_username"="YOUR USERNAME")
Sys.setenv("plotly_api_key"="YOUR API KEY")
api_create(p1, filename = "lecture4_2021")
## Found a grid already named: 'lecture4_2021 Grid'. Since fileopt='overwrite', I'll try to update it
## Found a plot already named: 'lecture4_2021'. Since fileopt='overwrite', I'll try to update it

Read the YouTube Data API documentation (https://developers.google.com/youtube/v3/docs) to find more API calls that can support your work. Here is one that reads the comments of a video (ID = ‘R6reyiSpKuw’) page by page.

getvideo_comments <- function(video_id,API_key,nextp=NA){
  nextpage <- ifelse(is.na(nextp),"",paste0("&pageToken=",nextp))
  url <- paste0("https://youtube.googleapis.com/youtube/v3/commentThreads?part=id,replies,snippet&maxResults=100&videoId=",video_id,"&key=",API_key,nextpage)
  result <- fromJSON(txt=url, flatten = TRUE)  
  return(result)
}

First100comments <- getvideo_comments('R6reyiSpKuw',API)
Second100comments <- getvideo_comments('R6reyiSpKuw',API,First100comments$nextPageToken)

head(First100comments$items$snippet.topLevelComment.snippet.textDisplay)
## [1] "AOC is a horrible human being"                                                                                                                                                                                                                                                                            
## [2] "More. MORE"                                                                                                                                                                                                                                                                                               
## [3] "JOHN 3:16 “For God so loved the world, that he gave his only begotten Son, that whosoever believeth in him should not perish, but have everlasting life.”"                                                                                                                                                
## [4] "List of things she do all day:<br />#1. Be incredibly beautiful"                                                                                                                                                                                                                                          
## [5] "AOC is so cool and amazing"                                                                                                                                                                                                                                                                               
## [6] "Alexandria you really can talk crap and think you are safe. You need to keep using that black makeup to hide your green Lizard scales.<br />Some people really dont like you at all including homeless alies that might thing to roll you for a few bucks in your purse and whatever jewelry you wearing."
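As the output shows, textDisplay contains HTML markup such as <br />. Before any text analysis it may help to strip it. The sketch below is a rough cleanup with a hypothetical helper, clean_comment; it handles only line breaks, leftover tags, and a few common entities, and is not a full HTML parser.

```r
# Minimal sketch: turn <br> tags into newlines and decode a few HTML entities.
# Assumption: clean_comment is a hypothetical helper; it covers common cases only.
clean_comment <- function(x) {
  x <- gsub("<br\\s*/?>", "\n", x, perl = TRUE)  # line breaks
  x <- gsub("<[^>]+>", "", x)                    # any remaining tags
  x <- gsub("&#39;", "'", x, fixed = TRUE)
  x <- gsub("&quot;", "\"", x, fixed = TRUE)
  x <- gsub("&amp;", "&", x, fixed = TRUE)       # decode last to avoid double decoding
  x
}

clean_comment("Line one<br />Line two &amp; more")
```

It could be applied to the whole column, for example clean_comment(First100comments$items$snippet.topLevelComment.snippet.textDisplay).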