This document performs sentiment analysis on reviews that were scraped from Amazon's site. We use the syuzhet package and treat it as a black box: its lower-level functions stay hidden, and we rely only on the functions it exports for our analysis.
library(syuzhet)
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5 v purrr 0.3.4
## v tibble 3.1.6 v dplyr 1.0.7
## v tidyr 1.2.0 v stringr 1.4.0
## v readr 2.1.2 v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
Unfortunately, the way we scraped the data cannot be reproduced from within this document. We used the extension Amazon Review Export to obtain the reviews, then moved the resulting CSV file into the working directory.
sentiment_data = read.csv("Amazon-reivew-export-necklace.csv")
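Since the export itself cannot be reproduced here, a quick check of the loaded data can help. The sketch below uses a made-up two-row CSV (the header names and review texts are assumptions, not the real export) to show how read.csv() turns a header such as "Review Content" into the column name Review.Content, and how to confirm the column holds character data.

```r
# Hypothetical stand-in for the exported file; the real columns may differ.
csv_text <- 'Review Content,Rating
"Lovely necklace, arrived quickly.",5
"The clasp broke after a week.",2'

# read.csv() passes headers through make.names(), so spaces become dots.
toy_data <- read.csv(text = csv_text, stringsAsFactors = FALSE)

names(toy_data)                        # column names after conversion
is.character(toy_data$Review.Content)  # the review text should be character
nrow(toy_data)                         # number of reviews read
```

Running the same three checks on sentiment_data should show whether the file was read as expected before any analysis is attempted.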
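As mentioned, the inner workings of the package's functions are abstracted away; we only need to call the appropriate function to get the result we require. Here, get_nrc_sentiment() counts, for each review, the words that the NRC lexicon associates with eight emotions and two sentiments (negative and positive).
# Score each review; as.character() guards against factor input from read.csv()
analysis_result = sentiment_data$Review.Content %>%
  as.character() %>%
  get_nrc_sentiment()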
## Warning: `spread_()` was deprecated in tidyr 1.2.0.
## Please use `spread()` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was generated.
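To make the shape of the result concrete, here is a minimal sketch that runs get_nrc_sentiment() on two invented sentences (they are illustrative only and not taken from the scraped data). It assumes the syuzhet package is installed.

```r
library(syuzhet)

# Two made-up reviews, used only to inspect the structure of the output.
toy_reviews <- c(
  "I love this necklace, it is beautiful.",
  "Terrible quality, I am angry and disappointed."
)

toy_result <- get_nrc_sentiment(toy_reviews)

dim(toy_result)    # one row per review, one column per emotion or sentiment
names(toy_result)  # the emotion and sentiment column names
```

Each row corresponds to one input string, so summing the columns over all rows gives the overall totals for the review set.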
Plotting the column totals as a bar chart makes the result easier to interpret than the raw table of counts.
barplot(colSums(analysis_result), col = rainbow(10),
        ylab = "Counts", main = "Amazon Reviews Sentiment Analysis",
        las = 2  # rotate labels so they are perpendicular to the axis
)
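The heights of the bars are simply the column sums of the per-review scores. The sketch below illustrates this with a hypothetical score table (the numbers are invented, not results from the Amazon data) and sorts the totals before plotting, which is one possible way to make the dominant categories easier to spot.

```r
# Hypothetical per-review scores using the ten NRC column names; not real results.
toy_scores <- data.frame(
  anger = c(0, 2), anticipation = c(1, 0), disgust = c(0, 1),
  fear = c(0, 0), joy = c(2, 0), sadness = c(0, 1),
  surprise = c(1, 0), trust = c(1, 0), negative = c(0, 3), positive = c(3, 0)
)

# One total per column; these totals are the bar heights.
totals <- colSums(toy_scores)

# Sorting is optional; it orders the bars from largest to smallest.
barplot(sort(totals, decreasing = TRUE), col = rainbow(length(totals)),
        ylab = "Counts", main = "Toy example", las = 2)
```

Using rainbow(length(totals)) rather than a fixed 10 keeps the colour count in step with the number of columns if the table ever changes.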