Load the necessary libraries

library(vosonSML)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(igraph)
## 
## Attaching package: 'igraph'
## 
## The following objects are masked from 'package:lubridate':
## 
##     %--%, union
## 
## The following objects are masked from 'package:dplyr':
## 
##     as_data_frame, groups, union
## 
## The following objects are masked from 'package:purrr':
## 
##     compose, simplify
## 
## The following object is masked from 'package:tidyr':
## 
##     crossing
## 
## The following object is masked from 'package:tibble':
## 
##     as_data_frame
## 
## The following objects are masked from 'package:stats':
## 
##     decompose, spectrum
## 
## The following object is masked from 'package:base':
## 
##     union

1. DO EITHER YOUTUBE or REDDIT analysis

Through the Google API console, I generated an API key and passed it to the program

myAPIKey <- "AIzaSyCAfFMwXaCOYhp19wj_oTCa9iyCVXWaPAU" # my YouTube Data API key
youtubeAuth <- Authenticate("youtube", apiKey = myAPIKey)
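
Because an API key is a credential, it may be safer not to hard-code it in the script. A minimal sketch, assuming the key has been stored in an environment variable beforehand (the variable name YOUTUBE_API_KEY and the helper get_api_key are hypothetical and not part of vosonSML):

```r
# Read the API key from an environment variable instead of hard-coding it.
# ASSUMPTION: the variable name YOUTUBE_API_KEY is hypothetical; set it
# beforehand, e.g. in ~/.Renviron or with Sys.setenv().
get_api_key <- function(var = "YOUTUBE_API_KEY") {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("Environment variable ", var, " is not set", call. = FALSE)
  }
  key
}

# Usage with the call from this lab (commented out: it needs a real key):
# youtubeAuth <- Authenticate("youtube", apiKey = get_api_key())
```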

For this homework, I chose a YouTube video that gives a brief overview of the 2022 FIFA World Cup match between the national football teams of Argentina and France. I copied the link to this video and assigned it to the following variable

videoIDs <- ("https://www.youtube.com/watch?v=zhEWqfP6V_w")
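
The part of the link that identifies the video is the value of its v parameter; the same ID appears further down in the graph output as VIDEOID:zhEWqfP6V_w. A small base R sketch for pulling it out of a standard watch link (extract_video_id is a hypothetical helper written for illustration, and it only handles links of the form shown above):

```r
# Hypothetical helper: return the value of the "v" query parameter, or NA
extract_video_id <- function(url) {
  # Capture everything after "v=" up to the next "&" or "#" (or end of string)
  m <- regmatches(url, regexec("[?&]v=([^&#]+)", url))
  unname(vapply(m,
                function(x) if (length(x) >= 2) x[2] else NA_character_,
                character(1)))
}

extract_video_id("https://www.youtube.com/watch?v=zhEWqfP6V_w")
```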

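Next, using the Collect function, I create a table of comments with maxComments set to 100 and print it to make sure that the data was saved correctly. The table below has 130 rows rather than 100, most likely because maxComments limits only the top-level comments and the replies to them are collected in addition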

youtubeData <- youtubeAuth %>%
  Collect(videoIDs = videoIDs,
          maxComments = 100,
          writeToFile = TRUE)

youtubeData
## # A tibble: 130 × 12
##    Comment              AuthorDisplayName AuthorProfileImageUrl AuthorChannelUrl
##    <chr>                <chr>             <chr>                 <chr>           
##  1 😢😢😢😢             @74-hfc           https://yt3.ggpht.co… http://www.yout…
##  2 Without a doubt the… @zeinz            https://yt3.ggpht.co… http://www.yout…
##  3 Cuando el debate ac… @Lalo_18          https://yt3.ggpht.co… http://www.yout…
##  4 No one should disre… @hasanatbaher6332 https://yt3.ggpht.co… http://www.yout…
##  5 October 19 2024      @ArfatinNurrahmah https://yt3.ggpht.co… http://www.yout…
##  6 the best of all tim… @victorfarcas8905 https://yt3.ggpht.co… http://www.yout…
##  7 هرجع للتعليق ده في … @Shakir_Mo.       https://yt3.ggpht.co… http://www.yout…
##  8 الله على الذكريات    @Shakir_Mo.       https://yt3.ggpht.co… http://www.yout…
##  9 Messi Fans click th… @kelvinakuneto15… https://yt3.ggpht.co… http://www.yout…
## 10 একটি ম্যাচ একটি পেনা… @mdseezan4451     https://yt3.ggpht.co… http://www.yout…
## # ℹ 120 more rows
## # ℹ 8 more variables: AuthorChannelID <chr>, ReplyCount <chr>, LikeCount <chr>,
## #   PublishedAt <chr>, UpdatedAt <chr>, CommentID <chr>, ParentID <chr>,
## #   VideoID <chr>
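
The table has 130 rows even though maxComments was 100, which is consistent with replies being collected on top of the top-level comments. A sketch of how this could be checked, using a small made-up table in place of youtubeData; it assumes that top-level comments have NA in ParentID and that the count columns are stored as text, as the chr column types above suggest:

```r
# Toy stand-in for youtubeData (all values are invented for illustration)
toy <- data.frame(
  CommentID = c("c1", "c2", "c3", "c4"),
  ParentID  = c(NA, NA, "c1", "c1"),   # ASSUMPTION: NA marks a top-level comment
  LikeCount = c("12", "0", "3", "1"),  # counts arrive as character strings
  stringsAsFactors = FALSE
)

# Split the rows into top-level comments and replies
n_top_level <- sum(is.na(toy$ParentID))
n_replies   <- sum(!is.na(toy$ParentID))

# Convert the text counts to integers before doing arithmetic on them
toy$LikeCount <- as.integer(toy$LikeCount)

c(top_level = n_top_level, replies = n_replies, total_likes = sum(toy$LikeCount))
```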

Next, I save this table to a CSV file using the following command

write.csv(youtubeData,
          "youtubeDataLab3.csv",
          row.names = TRUE)
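
With row.names = TRUE, write.csv adds the row numbers as an extra first column, which then shows up when the file is read back. A short sketch on a made-up data frame and temporary files (the real youtubeDataLab3.csv is not touched):

```r
# Made-up data frame standing in for youtubeData
toy <- data.frame(Comment = c("great match", "what a final"),
                  LikeCount = c("3", "7"),
                  stringsAsFactors = FALSE)

with_rn    <- tempfile(fileext = ".csv")
without_rn <- tempfile(fileext = ".csv")

write.csv(toy, with_rn, row.names = TRUE)
write.csv(toy, without_rn, row.names = FALSE)

names(read.csv(with_rn))     # extra leading column "X" holding the row numbers
names(read.csv(without_rn))  # only the original columns
```

If the extra column is not wanted, row.names = FALSE avoids it.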

2. Make activity Graph:

Next, I create an activity graph from the collected data. I also want to mark in blue the comments that mention France or its leader Mbappe, and in yellow the comments that mention Argentina or its leader Messi

activityNetwork <- youtubeData %>% Create("activity") %>% AddText(youtubeData)
activityGraph <- activityNetwork %>% Graph(writeToFile = TRUE)

V(activityGraph)$color <- "grey"
V(activityGraph)$color[which(V(activityGraph)$node_type=="video")] <- "red"
indFr <- grep("france|mbappe",tolower(V(activityGraph)$vosonTxt_comment))
V(activityGraph)$color[indFr] <- "blue"
indArg <- grep("messi|argentina",tolower(V(activityGraph)$vosonTxt_comment))
V(activityGraph)$color[indArg] <- "yellow"
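
Because the yellow color is assigned after the blue one, a comment that mentions both teams ends up yellow. A minimal sketch of the same logic on a plain character vector (the comments are made up; NA stands in for a node that presumably has no comment text, such as the video):

```r
# Made-up comment texts; NA represents a node without text
txt <- c("Messi is the GOAT", "Mbappe hat-trick", "France vs Argentina", NA, "great match")

col <- rep("grey", length(txt))
col[grep("france|mbappe", tolower(txt))] <- "blue"
col[grep("messi|argentina", tolower(txt))] <- "yellow"  # overrides blue on overlap

data.frame(txt, col)
table(col)
```

Swapping the two assignment lines would make blue win for comments that mention both sides.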

plot(activityGraph,
     vertex.label="",# removes vertex labels from the plot
     vertex.size=4,
     edge.arrow.size=0.5)

In this activity graph, each point is a separate comment. The red dot is the video itself; a blue dot is a comment that mentions “France” or “Mbappe”; a yellow dot is a comment that mentions “Argentina” or “Messi” (a comment that mentions both sides ends up yellow, because yellow is assigned last). Most of the comments relate directly to the video, but there are also replies attached to top-level comments, and two comments received more than three replies each. Next, we can check how many nodes the graph has. To do this, simply print activityGraph

activityGraph
## IGRAPH fa29f3d DN-- 131 130 -- 
## + attr: type (g/c), name (v/c), video_id (v/c), published_at (v/c),
## | updated_at (v/c), author_id (v/c), screen_name (v/c), node_type
## | (v/c), vosonTxt_comment (v/c), color (v/c), edge_type (e/c)
## + edges from fa29f3d (vertex names):
## [1] Ugxp_F6k9GLojS2XTqN4AaABAg->VIDEOID:zhEWqfP6V_w
## [2] UgxVUM3PDX_pB_NYmtd4AaABAg->VIDEOID:zhEWqfP6V_w
## [3] Ugx1zi5UDK8gItvh9Hl4AaABAg->VIDEOID:zhEWqfP6V_w
## [4] UgzghtvM4v7xrUncrZR4AaABAg->VIDEOID:zhEWqfP6V_w
## [5] Ugwnho--Lhg4CgpKVS54AaABAg->VIDEOID:zhEWqfP6V_w
## [6] Ugx0AtInkkXjJdxioyR4AaABAg->VIDEOID:zhEWqfP6V_w
## + ... omitted several edges
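
The node and edge counts follow from how an activity network is put together. A base R sketch on made-up data, assuming each comment contributes one edge that points to its parent comment or, when it has none, to the video (this mirrors the edges printed above, but it is not vosonSML's actual code):

```r
# Toy stand-in for the collected comments (values invented for illustration)
toy <- data.frame(
  CommentID = c("c1", "c2", "c3"),
  ParentID  = c(NA, NA, "c1"),   # ASSUMPTION: NA marks a top-level comment
  VideoID   = "zhEWqfP6V_w",
  stringsAsFactors = FALSE
)

# One edge per comment: to the parent comment if there is one, else to the video
edges <- data.frame(
  from = toy$CommentID,
  to   = ifelse(is.na(toy$ParentID), paste0("VIDEOID:", toy$VideoID), toy$ParentID),
  stringsAsFactors = FALSE
)

n_edges <- nrow(edges)                              # one edge per comment
n_nodes <- length(unique(c(edges$from, edges$to)))  # comments plus the video

c(nodes = n_nodes, edges = n_edges)
```

With 130 comments, the same reasoning gives 131 nodes and 130 edges, which matches the summary line of the igraph output.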

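In total, the graph has 131 nodes and 130 edges. This means that 130 comments were collected, plus one node for the video itself, and that each comment contributes exactly one edge, pointing either to the video or to the comment it replies to.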

3. Make actor Graph

Next, we will create an actor graph, in which each node is a user

actorNetwork <- youtubeData %>% Create("actor") %>% AddText(youtubeData)
actorGraph <- actorNetwork %>% Graph(writeToFile = TRUE)

V(actorGraph)$color <- ifelse(V(actorGraph)$node_type=="video", # red for the video, grey for users
                              "red",
                              "grey")

#plotting YouTube actor network (red node is video)
plot(actorGraph,
     vertex.size=4,
     vertex.label="",
     edge.arrow.size=0.5)

This is the actor graph. It shows how users interact with the video and with each other. Most of the actors commented directly on the video, but interaction between users is also noticeable. When users interact, they usually communicate one-on-one. We can also see closed loops, as well as several arrows leaving a single node, which indicates that some users replied to each other several times.
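
The patterns described above (loops, repeated arrows, users answering each other) can also be counted rather than read off the picture. A base R sketch on a made-up edge list; the user names are invented, and for the real actorGraph the igraph helpers which_loop(), which_multiple() and which_mutual() could serve a similar purpose:

```r
# Made-up actor edge list: who replied to whom
edges <- data.frame(
  from = c("userA", "userB", "userA", "userB", "userC"),
  to   = c("VIDEO", "userA", "userB", "userA", "userC"),
  stringsAsFactors = FALSE
)

pair     <- paste(edges$from, edges$to, sep = "->")
rev_pair <- paste(edges$to, edges$from, sep = "->")

# An edge that starts and ends at the same user
self_loop <- edges$from == edges$to
# The same arrow appears more than once
repeated <- duplicated(pair) | duplicated(pair, fromLast = TRUE)
# An arrow also exists in the opposite direction
reciprocal <- !self_loop & rev_pair %in% pair

cbind(edges, self_loop, repeated, reciprocal)
```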

4. Make sentiment analysis barplot and provide comments.

Next, we will analyze the sentiment of the comments. To do this, first load the necessary library and run the following commands

library(syuzhet)
comments <- iconv(youtubeData$Comment, to = 'UTF-8')# converting data to use in package
comments %>% str
##  chr [1:130] "😢😢😢😢" "Without a doubt the best football match ever." ...
# Obtain sentiment scores
s <- get_nrc_sentiment(comments)

s$neutral <- ifelse(s$negative + s$positive == 0, 1, 0) # 1 if the comment has no positive or negative words

barplot(100*colSums(s)/sum(s),
        las = 2,
        col = rainbow(ncol(s)), # one color per bar
        ylab = "Percentage",
        main = "Sentiment scores for comments on the Argentina vs France video")

On the resulting bar plot, we can see that most of the comments were classified as neutral. There are also more negative scores than positive ones. This result was unexpected for me, because I assumed that there would be more positive comments. It is also worth considering that some of the comments are written in languages other than English; these are likely to be scored as neutral, because get_nrc_sentiment is called here with its default English lexicon.
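
The bar plot shows each column's share of all scores added together, so the bars mix word-level emotion counts with the per-comment neutral flag. To check a claim about comments directly, one option is to classify each comment by the sign of positive minus negative. A sketch on a made-up score table that uses the same positive and negative column names as the output of get_nrc_sentiment:

```r
# Made-up scores for five comments (not the real results)
toy_s <- data.frame(positive = c(0, 2, 1, 0, 0),
                    negative = c(0, 0, 3, 1, 0))

# Net score per comment and the class it falls into
net   <- toy_s$positive - toy_s$negative
label <- ifelse(net > 0, "positive", ifelse(net < 0, "negative", "neutral"))

# Share of comments in each class; the three shares sum to 100
share <- 100 * table(factor(label, levels = c("negative", "neutral", "positive"))) / length(label)
share
```

Note that neutral here means a net score of zero, which is slightly broader than the flag used above (no positive and no negative words at all).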