Abstract

This document performs sentiment analysis on reviews that were scraped from Amazon's site. We use the syuzhet package. Its lower-level functions are hidden, so we rely only on the functions it exports for our analysis.

Dependencies

library(syuzhet)
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5     v purrr   0.3.4
## v tibble  3.1.6     v dplyr   1.0.7
## v tidyr   1.2.0     v stringr 1.4.0
## v readr   2.1.2     v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

Loading Dataset

Unfortunately, the scraping method we used is not reproducible. We obtained the data with the Amazon Review Export browser extension and then moved the exported CSV file into the working directory.

sentiment_data = read.csv("Amazon-reivew-export-necklace.csv")
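Because the export cannot be regenerated on demand, it may be worth checking the file right after loading it. The sketch below is a minimal, hypothetical check rather than part of the original workflow. It writes a two-row stand-in CSV to a temporary file (the rows are placeholders, not real reviews), reads it the same way, and confirms that the `Review.Content` column used later is present and holds text. The names `csv_path` and `check_data`, and the assumption that the exported header contains a space, are illustrative only.

```r
# Stand-in for the exported file: two placeholder rows, not real reviews.
# Assumption: the export's header is "Review Content" (with a space).
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c("Review Content",
    "\"Placeholder review one\"",
    "\"Placeholder review two\""),
  csv_path
)

# read.csv() makes column names syntactically valid by default
# (check.names = TRUE), so a space in the header becomes a dot.
check_data <- read.csv(csv_path, stringsAsFactors = FALSE)

# Fail early if the column the analysis depends on is missing or not text.
stopifnot(
  "Review.Content" %in% names(check_data),
  is.character(check_data$Review.Content)
)

nrow(check_data)  # number of reviews loaded
```

With the real export, `csv_path` would instead point at the CSV file in the working directory.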

Analysis

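As mentioned, the inner workings of the package's functions are abstracted away, so we only need to call the appropriate function. Here, get_nrc_sentiment() scores each review against the NRC emotion lexicon and returns a data frame with one row per review and ten columns: eight emotions plus negative and positive polarity.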

analysis_result = sentiment_data$Review.Content %>%
                      get_nrc_sentiment
## Warning: `spread_()` was deprecated in tidyr 1.2.0.
## Please use `spread()` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was generated.
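To make the shape of the result concrete, here is a small sketch that runs `get_nrc_sentiment()` on two made-up sentences. The sentences and the names `toy_reviews` and `toy_result` are illustrative assumptions, not part of the scraped data. The counts themselves depend on the lexicon, so none are shown here.

```r
library(syuzhet)

# Two made-up sentences, used only to show the shape of the output.
toy_reviews <- c(
  "I love this necklace, it is beautiful.",
  "The clasp broke and I am disappointed."
)

toy_result <- get_nrc_sentiment(toy_reviews)

# One row per input string.
nrow(toy_result)

# Eight emotion columns plus negative and positive polarity.
names(toy_result)

# Column totals are what a bar chart of the result summarises.
colSums(toy_result)
```

The same inspection applied to `analysis_result` would show one row per scraped review.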

Visualization

Plotting the result makes it easier to interpret. The bar chart shows, for each of the ten categories, the total count across all reviews.

barplot(colSums(analysis_result), col = rainbow(10),
        ylab = "Counts", main = "Amazon Reviews Sentiment Analysis",
        las = 2  # draw axis labels perpendicular to the axis
)
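One possible variation, sketched here under stated assumptions, is to plot the eight emotion totals separately from the two polarity totals, since positive and negative are broader categories than the individual emotions. The example uses made-up sentences in place of `sentiment_data$Review.Content`, and the names `totals`, `polarity`, `emotions`, and `positive_share` are hypothetical.

```r
library(syuzhet)

# Made-up sentences standing in for sentiment_data$Review.Content.
toy_result <- get_nrc_sentiment(c(
  "I love this necklace, it is beautiful.",
  "The clasp broke and I am disappointed."
))

totals <- colSums(toy_result)

# Split the two polarity totals from the eight emotion totals.
polarity <- totals[c("negative", "positive")]
emotions <- totals[setdiff(names(totals), names(polarity))]

# Emotions only, largest first; las = 2 keeps labels perpendicular to the axis.
barplot(sort(emotions, decreasing = TRUE), col = rainbow(length(emotions)),
        ylab = "Counts", main = "Emotion totals (toy data)", las = 2)

# Share of polarity matches that are positive; NA if nothing was matched.
positive_share <- if (sum(polarity) > 0) {
  polarity[["positive"]] / sum(polarity)
} else {
  NA_real_
}
positive_share
```

Sorting the bars is a presentational choice. The original chart, which keeps all ten categories in their default order, remains a valid summary.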