1 Summary

This report provides an analysis of fitness tracker reviews on Amazon US. The target product is Amazon Halo, the competitors are Fitbit Inspire 2, Fitbit Charge 4, and Apple Watch 6. Fitbit is considered the leader in fitness trackers, while Apple is a leader in producing smart-watches. Goal is to apply unstructured data analysis using customer reviews to evaluate strengths and weaknesses of each product. The results can be used to improve future generations of the Amazon Halo.

Amazon Halo, a fitness tracker in form of a wristband, was launched by Amazon in December 2020. At the beginning of March 2021, the product is in high demand (sold out) but overall star ratings (3.7/5) suggest that customers are not very satisfied. To get a better idea of what customers think, a look beyond star ratings is required.

To understand the reviews, text mining is the applied method of analysis. Text mining combines machine learning and natural language processing (NLP) to draw meaning from unstructured text. After the reviews are cleaned, the analysis gives a rough overview of frequent terms and ratios. Different sentiment dictionaries score reviews over time and extract customer emotions. Topic Models group high frequency terms, and networks group pairs of common words together. Detailed calculations can be found under each product. In addition, Tweets (from August 2020 to March 2021) were scraped under a keywords constraint (amazon, halo, fitness). Tweets were then analyzed following a similar method compared to the reviews.

Results

Score: A weighted score (considering star ratings, review stars, sentiments) was applied to rank the products, and scaled to a star rating. All products perform worse compared to their original rating.

library("readxl")
table3 <- read_excel("Score.xlsx", sheet = 3)

library(gt)
table3 %>% gt(rowname_col = "Product") %>%
  tab_header(title = md("**Final Scores**")) %>%
  fmt_number(columns = vars(Score), decimals = 2) %>%
  fmt_number(columns = vars(AdjustedStarRating), decimals = 2) %>%
  fmt_number(columns = vars(OriginalStarRating), decimals = 2) %>%
  fmt_percent(columns = vars(Difference), decimals = 1) %>%
  tab_options(
    column_labels.font.size = "smaller",
    table.font.size = "smaller",
    data_row.padding = px(3))

Final Scores

	Score	AdjustedStarRating	OriginalStarRating	Difference
Apple Watch 6	0.71	3.53	4.80	−26.4%
Fitbit Inspire 2	0.67	3.35	4.60	−27.2%
Amazon Halo	0.66	3.30	3.70	−10.9%
Fitbit Charge 4	0.61	3.05	4.50	−32.2%

Amazon Reviews Sentiment: Comparing sentiment scores for Reviews, the Apple Watch 6 scores the highest over time (80% range). The Fitbit Inspire 2 scores in the 75% range, showing a slight upward trend. Amazon´s fitness tracker shows a sideways trend scoring in the 66% range.

Twitter Sentiment: The sentiment scored in the 68% to 76% range, increasing slightly after the product announcement in August but then decreasing until the product launch date in December. In 2021 a slight increase is observed.

Amazon Reviews - Topics, Words, Networks: Customers appreciate Halo´s new features such as voice recognition and are satisfied with accurate tracking results. Opinions over the fabric-band are split, some like it, other think it´s uncomfortable, causing skin irritations. The body of the tracker is described as bulky. Many of the reviews indicate that people returned their tracker connected to problems with poor customer support and monthly charges.

Twitter - Topics, Words, Networks: Tweets cover Amazon Halo´s new features (voice recording and the emotions tracker) as well as Amazon´s entry in the wearables market. The topics that stand out all lead to privacy and data tracking concerns. The privacy concerns come from news-outlets and politicians.

Stars from Ratings vs Stars from Reviews: For the four analyzed products, written reviews resulted in an on average 10% lower rating compared to star ratings only. This implies that customers who write reviews are more critical compared to the ones who just leave a star rating.

Recommendations

Privacy: Features like voice recording challenge the trackers privacy. Increase data safety features or lobbying.

Sensor Size: Customer complaints about tracker´s housing (bulky). Achieve smoother fit.

Wristband: Increase wristband selection, offering various materials (elastomer, high-end silicone) to increase comfort & introduce new premium colors.

Battery Life: Tone analysis drains battery. Increase tracker´s battery life.

Customer Support: Expand customer support for fitness devices at Amazon US.

1.1 Sentiment Plots

1.1.1 NRC

library(miceadds)
library(reshape2)

emHaloSum <- load.Rdata2("emHaloSum.RData", path=getwd())
emFit2Sum <- load.Rdata2("emFit2Sum.RData", path=getwd())
emFit4Sum <- load.Rdata2("emFit4Sum.RData", path=getwd())
emawSum <- load.Rdata2("emawSum.RData", path=getwd())

library(plotly)
p1 <- plot_ly(emHaloSum, y=~emotion, x=~Perc, type="bar", colors =~"cornflowerblue",   
        orientation = 'h') %>%
         layout(showlegend=FALSE, yaxis=list(title=""), xaxis=list(title=""),
          title="NRC Sentiment - Halo")

p2 <- plot_ly(emFit2Sum, y=~emotion, x=~Perc, type="bar", ccolors =~"cornflowerblue",   
        orientation = 'h') %>%
         layout(showlegend=FALSE, yaxis=list(title=""), xaxis=list(title=""),
          title="NRC Sentiment - Halo")

p3 <- plot_ly(emFit4Sum, y=~emotion, x=~Perc, type="bar", colors =~"cornflowerblue",   
        orientation = 'h') %>%
         layout(showlegend=FALSE, yaxis=list(title=""), xaxis=list(title=""),
          title="NRC Sentiment - Halo")

p4 <- plot_ly(emawSum, y=~emotion, x=~Perc, type="bar", colors =~"cornflowerblue",   
        orientation = 'h') %>%
         layout(showlegend=FALSE, yaxis=list(title=""), xaxis=list(title=""),
          title="")

p <- subplot(p1, p2, p3, p4, nrows = 2, titleX = TRUE, shareX = TRUE)


p %>% layout(annotations = list(
 list(x = .2 , y = 1.05, text = "Amazon Halo", showarrow = F, xref='paper', yref='paper'),
 list(x = .8 , y = 1.05, text = "Fitbit Inspire 2", showarrow = F, xref='paper', yref='paper'),
 list(x = .2 , y = .5, text = "Fitbit Charge 4", showarrow = F, xref='paper', yref='paper'),
 list(x = .8 , y = .5, text = "Apple Watch 6", showarrow = F, xref='paper', yref='paper'))
)

1.1.2 Huliu

library(miceadds)
library(reshape2)

ha2 <- load.Rdata2("ha.RData", path=getwd())
fb22 <- load.Rdata2("fb2.RData", path=getwd())
fb42 <- load.Rdata2("fb4.RData", path=getwd())
aw62 <- load.Rdata2("aw6.RData", path=getwd())

# split dfs
ha2 <-split(ha2, ha2$series)

ha2Huliu <- ha2[["scoreHuliu"]]
ha2JR <- ha2[["scoreJR"]]


fb22 <-split(fb22, fb22$series)

fb22Huliu <- fb22[["scoreHuliu"]]
fb22JR <- fb22[["scoreJR"]]


fb42 <-split(fb42, fb42$series)

fb42Huliu <- fb42[["scoreHuliu"]]
fb42JR <- fb42[["scoreJR"]]


aw62 <-split(aw62, aw62$series)

aw62Huliu <- aw62[["scoreHuliu"]]
aw62JR <- aw62[["scoreJR"]]


library(plyr)
newHuliu <- join_all(list(aw62Huliu,fb22Huliu,ha2Huliu,fb42Huliu), by='date', type='left')

newHuliu <- newHuliu[-c(2,4,6,8)]

colnames(newHuliu)[2] <- "Apple Watch 6"
colnames(newHuliu)[3] <- "Fitbit Inspire 2"
colnames(newHuliu)[4] <- "Amazon Halo"
colnames(newHuliu)[5] <- "Fitbit Charge 4"

plotDataHuliu <- melt(newHuliu,  id.vars = 'date', variable.name = 'Product')

library(reshape2)
library(ggplot2)
library(lubridate)
library(tidyquant)
library(scales)
library(plotly)
library(hrbrthemes)

gHuliu <- ggplot(plotDataHuliu, aes(date, value)) +
  geom_smooth(aes(colour = Product)) +
  scale_x_date(date_labels = "%m", date_breaks = "1 month") +
  labs(title = "Sentiment Scores Huliu",
       x = "Month (2020/21)",
       y = "") +
  theme_ipsum() 

gHuliu <- ggplotly(gHuliu)
gHuliu

1.1.3 JR

# JR Sentiment
library(plyr)
newJR <- join_all(list(aw62JR,fb22JR,ha2JR,fb42JR), by='date', type='left')

newJR <- newJR[-c(2,4,6,8)]

colnames(newJR)[2] <- "Apple Watch 6"
colnames(newJR)[3] <- "Fitbit Inspire 2"
colnames(newJR)[4] <- "Amazon Halo"
colnames(newJR)[5] <- "Fitbit Charge 4"

plotDataJR <- melt(newJR,  id.vars = 'date', variable.name = 'Product')

library(reshape2)
library(ggplot2)
library(lubridate)
library(tidyquant)
library(scales)
library(plotly)
library(hrbrthemes)

gJR <- ggplot(plotDataJR, aes(date, value)) +
  geom_smooth(aes(colour = Product)) +
  scale_x_date(date_labels = "%m", date_breaks = "1 month") +
  labs(title = "Sentiment Scores Jockers Rinker",
       x = "Month (2020/21)",
       y = "") +
  theme_ipsum() 

gJR <- ggplotly(gJR)
gJR

1.1.4 Halo Reviews vs Tweets: Sentiment

library(miceadds)
library(reshape2)

ha2 <- load.Rdata2("ha.RData", path=getwd())
scTw <- load.Rdata2("scoresTwitter.Rda", path=getwd())

# split dfs
ha2 <-split(ha2, ha2$series)
ha2JR <- ha2[["scoreJR"]]

ha2JR <- data.frame(ha2JR)
scTw <- data.frame(scTw)


library(plyr)
newJR <- join_all(list(scTw, ha2JR), by='date', type='left')
newJR <- newJR[-c(3)]

colnames(newJR)[2] <- "Tweets"
colnames(newJR)[3] <- "Reviews"

plotDataComp <- melt(newJR,  id.vars = 'date', variable.name = 'Sentiment')

library(reshape2)
library(ggplot2)
library(lubridate)
library(tidyquant)
library(scales)
library(plotly)
library(hrbrthemes)

gComp <- ggplot(plotDataComp, aes(date, value)) +
  geom_smooth(aes(colour = Sentiment)) +
  scale_x_date(date_labels = "%m", date_breaks = "1 month") +
  labs(title = "Sentiment Tweets vs Reviews",
       x = "Month (2020/21)",
       y = "") +
  theme_ipsum() 

gComp <- ggplotly(gComp)
gComp

1.2 Networks & Scores

1.2.1 Network: Halo Reviews

By seeing how often word X is followed by word Y, we can then build a model of the relationships between them. Below, bigram tokens for the reviews of Amazon Halo are displayed.

library(miceadds)
library(reshape2)
# load data
networkHalo <- load.Rdata2("networkHalo.RData", path=getwd())

library(networkD3)
# Plot
netHalo <- simpleNetwork(networkHalo, fontSize = 12, zoom = TRUE, linkColour = "black", nodeColour = "navy", fontFamily = "serif", charge = -10) 
netHalo

1.2.2 Network: Halo Tweets

library(miceadds)
library(reshape2)
# Load data
networkTwitter <- load.Rdata2("networkTwitter.RData", path=getwd())

# Plot
library(networkD3)
netHaloTw <- simpleNetwork(networkTwitter, fontSize = 12, zoom = TRUE, linkColour = "black", nodeColour = "navy", fontFamily = "serif", charge = -10) 
netHaloTw

1.2.3 Scores

Ratings, Reviews, and Words were divided by the column sum. All sentiment scores were converted to a [0:1] scale and multiplied by the respective weight. The final score is a number from 0 to 1.

library("readxl")
table2 <- read_excel("Score.xlsx", sheet = 2)

library(gt)
table2 %>% gt(rowname_col = "Product") %>%
  tab_header(title = md("**Final Scores Breakdown**")) %>%
  fmt_number(columns = vars(Ratings), decimals = 2) %>%
  fmt_number(columns = vars(Reviews), decimals = 2) %>%
  fmt_number(columns = vars(Words), decimals = 2) %>%
  fmt_number(columns = vars(Days), decimals = 2) %>%
  fmt_number(columns = vars(StarRating), decimals = 2) %>%
  fmt_number(columns = vars(StarReviews), decimals = 2) %>%
  fmt_number(columns = vars(AFINN), decimals = 2) %>%
  fmt_number(columns = vars(Huliu), decimals = 2) %>%
  fmt_number(columns = vars(JockersRinker), decimals = 2) %>%
  fmt_number(columns = vars(Score), decimals = 2) %>%
  fmt_number(columns = vars(AdjRating), decimals = 1) %>%
  fmt_number(columns = vars(OrStarRating), decimals = 1) %>%
  tab_options(
    column_labels.font.size = "smaller",
    table.font.size = "smaller",
    data_row.padding = px(3)) %>%
  tab_style(
    style = list(cell_text(weight = "bold")),
    locations = cells_body(columns = vars(Score))) %>%
  fmt_missing(
    columns = 1:12,
    missing_text = "")

Final Scores Breakdown

	Ratings	Reviews	Words	Days	StarRating	StarReviews	AFINN	Huliu	JockersRinker	Score	AdjRating	OrStarRating
Apple Watch 6	0.27	0.17	0.09	0.22	0.96	0.90	0.67	0.87	0.79	0.71	3.5	4.8
Fitbit Inspire 2	0.15	0.18	0.11	0.21	0.92	0.82	0.64	0.82	0.77	0.67	3.4	4.6
Amazon Halo	0.43	0.29	0.28	0.45	0.90	0.62	0.59	0.74	0.72	0.66	3.3	3.7
Fitbit Charge 4	0.15	0.36	0.51	0.11	0.74	0.61	0.60	0.76	0.74	0.61	3.1	4.5
x												NA
Weights (%)	5.00	5.00	5.00	5.00	15.00	15.00	20.00	15.00	15.00	0.66	3.3	4.4

2 Amazon Halo

2.1 Data & Cleaning

In this section the data is prepared for the analysis. All text is transformed to lower case, unnecessary space, punctuation, stopwords, smileys, and numbers are removed. The reviews are then lemmatized. That means that a group of words that form the same idea (am, be, are are all within the same lexeme) are grouped together and can be analyzed as a single item.

scrapeDate <- as.Date("2020-03-04")

productLink <- "https://www.amazon.com/Amazon-Halo-Fitness-And-Health-Band/dp/B07QK955LS/ref=cm_cr_arp_d_product_top?ie=UTF8"

# Load in scraped data
library(dplyr)
library(tidyverse)
library(stringr)
library(tm)
library(textstem)
library(miceadds)
library(lubridate)

scrapeHaloB07QK955LS <- load.Rdata2("scrapeHaloB07QK955LS.RData", path=getwd())

# Create new df for scraped data
haloReviews <- scrapeHaloB07QK955LS

# Clean df
haloReviews <- haloReviews %>%
  dplyr::select(review_title, review_text, review_star, review_date, product_specs, page) %>%
  mutate(review_title = as.character(review_title), 
         review_title = str_replace_all(review_title, "\n", " "),   
         review_title = str_replace_all(review_title, "(\\[.*?\\])", ""),
         review_title = gsub("([a-z])([A-Z])", "\\1 \\2", review_title), 
         review_title = gsub('[[:punct:] ]+',' ', review_title),
         review_title = gsub("[^\x01-\x7F]", "", review_title), # remove smileys
         review_title = tolower(review_title), 
         review_title = tm::removeWords(review_title, c("'", stopwords(kind = "en"))), 
         review_title = removePunctuation(review_title), 
         review_title = removeNumbers(review_title),
         review_title = textstem::lemmatize_strings(review_title),
         review_title = textclean::replace_contraction(review_title),
         review_title = str_squish(review_title),
         review_text = as.character(review_text), 
         review_text = str_replace_all(review_text, "\n", " "),   
         review_text = str_replace_all(review_text, "(\\[.*?\\])", ""),
         review_text = gsub("([a-z])([A-Z])", "\\1 \\2", review_text), 
         review_text = gsub('[[:punct:] ]+',' ', review_text),
         review_text = gsub("[^\x01-\x7F]", "", review_text),
         review_text = tolower(review_text), 
         review_text = tm::removeWords(review_text, c("'", stopwords(kind = "en"))), 
         review_text = removePunctuation(review_text), 
         review_text = removeNumbers(review_text),
         review_text = textstem::lemmatize_strings(review_text),
         review_text = textclean::replace_contraction(review_text),
         review_text = str_squish(review_text),
         review_star = gsub(".0 out of 5 stars", "", review_star),
         review_date = gsub("Reviewed in the United States on ", "", review_date),
         review_date = mdy(review_date),
         review_text = paste(review_title, review_text)) %>% 
  select(-c(review_title)) # drop column review_title because it was merged into review_text

rmarkdown::paged_table(haloReviews,
                       list(rows.print = 3))

2.2 Initial Text Analysis

There are a total of 2919 reviews over a time frame of about 2.5 months. Splitting the reviews up gives us about 180k words - the words are later used for the analysis and grouped by date. Before we jump into the analysis it makes sense to remove product specific words. These are custom defined, in this case removing amazon makes sense.

# Total Reviews
totR <- count(haloReviews)
totR

## # A tibble: 1 x 1
##       n
##   <int>
## 1  2919

# Time frame
FirstR <- min(haloReviews$review_date)
LastR <- max(haloReviews$review_date)

LengthRD <- difftime(LastR, FirstR)
LengthRD

## Time difference of 80 days

LengthRM <- interval(FirstR, LastR) %/% months(1)

# Word Count
library(tidyr)
library(tidytext)

words <- data_frame(text = haloReviews$review_text) %>% 
  unnest_tokens(tbl = ., output = word, input = text)
count(words)

## # A tibble: 1 x 1
##        n
##    <int>
## 1 182121

# Remove common words - custom defined
library(tm)
haloReviews$review_text <-  tm::removeWords(haloReviews$review_text, words = c("amazon", "halo", "fitness", "tracker"))
haloReviews$review_text <- str_squish(haloReviews$review_text)

2.2.1 Document Term Matrix

library(dplyr)
library(stringr)
library(tm)
library(textstem)

documents <- haloReviews$review_text

documentsCorp <- tm::SimpleCorpus(VectorSource(documents))
documentsDTM <- DocumentTermMatrix(documentsCorp)
inspect(documentsDTM)

## <<DocumentTermMatrix (documents: 2919, terms: 6197)>>
## Non-/sparse entries: 128896/17960147
## Sparsity           : 99%
## Maximal term length: 20
## Weighting          : term frequency (tf)
## Sample             :
##      Terms
## Docs  app band get good like sleep time track use work
##   12    3   12   5    3    6     8    8     3  14    5
##   157   8    0  11   12    3     7    8    15   9    0
##   177  25   11   8    2    4     5    4     1   7    8
##   2     6   10  13   10   14     4    1     6   4    9
##   388   7    3   9    6    4     8    3    12   5    2
##   433   4   10   7    4   15     5   14    22   9    1
##   55   20    1   4   10   14     2    5     6   7    8
##   61    8    6   6   14    7     7   10    18  15    6
##   776  24    1   8    6    8    10    9     5  14    0
##   777   7    6   0    0    0     1    0     0   0    0

2.2.2 Frequent Words

Frequent words help getting a rough idea of the reviews. The first plot is filtered for words with at least 3 letters, while the words in the second plot have to have at least 7 words.

library(rJava)
library(qdap)

ft1 <- freq_terms(haloReviews$review_text, 25, 
                   at.least=3, 
                   stopwords=qdapDictionaries::Top25Words)
plot(ft1)

ft2 <- freq_terms(haloReviews$review_text, 25, 
                   at.least=7, 
                   stopwords=qdapDictionaries::Top25Words)
plot(ft2)

2.3 Sentiment Analysis

When human readers approach a text, we use our understanding of the emotional intent of words to infer whether a section of text is positive or negative, or perhaps characterized by some other more nuanced emotion like surprise or disgust. We can use the tools of text mining to approach the emotional content of text programmatically.

2.3.1 Afinn

The AFINN lexicon assigns words with a score that runs between -5 and 5, with negative scores indicating negative sentiment and positive scores indicating positive sentiment. A sentiment of 1.04 results slightly positive considering the range [-5:5]. This is an overall sentiment for the whole time frame of the reviews.

quickSent <- words %>%
  inner_join(get_sentiments("afinn")) %>% 
  summarize(n = nrow(.), sentSum = sum(value)) %>% 
  mutate(sentiment = sentSum / n)
quickSent # -5 to 5, score of 1.035 is slightly positive, compare to rating

## # A tibble: 1 x 3
##       n sentSum sentiment
##   <int>   <dbl>     <dbl>
## 1 24811   25686      1.04

2.3.2 Bing

The bing lexicon categorizes words in a binary fashion into positive and negative categories. The sentiment is 65% positive & 35% negative. In the plot most common positive and negative words are displayed.

# Positive Negative using bing
quickPN <- words %>%
  inner_join(get_sentiments("bing")) %>% 
  count(sentiment) %>% 
  pivot_wider(values_from = n, names_from = sentiment) %>% 
  mutate(sentiment = positive - negative)
quickPN

## # A tibble: 1 x 3
##   negative positive sentiment
##      <int>    <int>     <int>
## 1     9987    18112      8125

# Frequent Words by sentiment
bingPN <- words %>%
  inner_join(get_sentiments("bing")) %>%
  count(word, sentiment, sort = TRUE) %>%
  ungroup()

# Plot
bingPN %>%
  group_by(sentiment) %>%
  top_n(20) %>%
  ggplot(aes(reorder(word, n), n, fill = sentiment)) +
  geom_bar(alpha = 0.8, stat = "identity", show.legend = FALSE) +
  facet_wrap(~sentiment, scales = "free_y") +
  labs(y = "Contribution to sentiment", x = NULL) +
  coord_flip()

2.3.3 NRC

The NRC Emotion Lexicon is a list of English words and their associations with eight basic emotions (anger, fear, anticipation, trust, surprise, sadness, joy, and disgust) and two sentiments (negative and positive). The annotations were manually done by crowdsourcing.

# NRC Sentiment
library(syuzhet)
nrcSent <- iconv(haloReviews$review_text)

# get the emotions using the NRC dictionary
emotionsHalo <- get_nrc_sentiment(nrcSent)
emHalo = colSums(emotionsHalo)
emHaloSum = data.frame(count=emHalo, emotion=names(emHalo))
# add percentage
emHaloSum$Perc <- (emHaloSum$count / sum(emHaloSum$count) * 100)
# factor and sort
emHaloSum$emotion = factor(emHaloSum$emotion, levels=emHaloSum$emotion[order(emHaloSum$Perc, decreasing = F)])

# Visualize the emotions from NRC sentiments
library(plotly)
plot_ly(emHaloSum, y=~emotion, x=~Perc, type="bar", color=~emotion, colors = "Set1",   
        orientation = 'h') %>%
         layout(showlegend=FALSE, yaxis=list(title=""), xaxis=list(title=""),
          title="NRC Sentiment - Halo")

2.3.4 Customer Review Sentiment

In this section the lexicon hash_sentiment_huliu (Hu Liu, 2004) and hash_sentiment_jockers_rinker are used to calculate sentiment scores. Both lexicons were specifically developed for analyzing customer sentiment in reviews.

Hu Liu: A dataset containing an augmented version of Hu & Liu’s (2004) positive/negative word list as sentiment lookup values. Sentiment values (+1, 0, -1.05, -1, -2)

Jocker Rinker: A dataset containing a combined and augmented version of Jockers (2017) & Rinker’s augmented Hu & Liu (2004) positive/negative word list as sentiment lookup values. Values and scores are the same as in hash_sentiment_huliu.

# Score Range: [-2 to 1]
library(syuzhet)
library(sentimentr)
library(lexicon)
library(magrittr)

# Split into words and group by the date
sentimentHalo <- data_frame(text = haloReviews$review_text) %>% 
  group_by(haloReviews$review_date) %>%
  unnest_tokens(tbl = ., output = word, input = text) 

colnames(sentimentHalo)[1] <- "date"

2.3.4.1 huliu

# Load lexicon huliu
sentimentHuliu <- hash_sentiment_huliu
colnames(sentimentHuliu)[1] <- "word"
colnames(sentimentHuliu)[2] <- "score"

# Get Overall sentiment
quickSent <- words %>%
  inner_join(sentimentHuliu) %>% 
  summarize(n = nrow(.), sentSum = sum(score)) %>% 
  mutate(sentiment = sentSum / n)
quickSent

## # A tibble: 1 x 3
##       n sentSum sentiment
##   <int>   <dbl>     <dbl>
## 1 28745    8471     0.295

# Merge sentiment and call words by date
sentimentHuliu <- merge(sentimentHalo, sentimentHuliu, by = 'word') 

# Group by date and sum scores
scoresHuliu <- sentimentHuliu %>% 
  group_by(date) %>% 
  na.omit() %>% 
  summarize(score = mean(score))
DT::datatable(scoresHuliu)

2.3.4.2 jockers_rinker

# Load lexicon jockers_rinker
sentimentJR <- hash_sentiment_jockers_rinker
colnames(sentimentJR)[1] <- "word"
colnames(sentimentJR)[2] <- "score"

# Get Overall sentiment
quickSent <- words %>%
  inner_join(sentimentJR) %>% 
  summarize(n = nrow(.), sentSum = sum(score)) %>% 
  mutate(sentiment = sentSum / n)
quickSent

## # A tibble: 1 x 3
##       n sentSum sentiment
##   <int>   <dbl>     <dbl>
## 1 41596   8568.     0.206

# Merge sentiment and call words by date
sentimentJR <- merge(sentimentHalo, sentimentJR, by = 'word') 

# Group by date and sum scores
scoresJR <- sentimentJR %>% 
  group_by(date) %>% 
  na.omit() %>% 
  summarize(score = mean(score))
DT::datatable(scoresJR)

2.3.4.3 Sentiment Plots

# Plot Sentiment over time
library(dplyr)
plotData <- scoresHuliu %>%
  select(date,score) %>%
  distinct() %>%
  right_join(scoresJR, by = 'date')

colnames(plotData)[2] <- "Huliu"
colnames(plotData)[3] <- "Jockers_Rinker"

library(reshape2)
library(ggplot2)
library(lubridate)
library(tidyquant)
library(scales)
library(plotly)
library(hrbrthemes)

plotData <- melt(plotData ,  id.vars = 'date', variable.name = 'score')

ts1 <- ggplot(plotData, aes(date, value)) +
  geom_line(aes(colour = score)) +
  geom_smooth(aes(colour = score)) +
  scale_x_date(date_labels = "%m", date_breaks = "1 month") +
  labs(title = "Sentiment Scores - Halo",
       x = "Month",
       y = "") +
  theme_ipsum() 

ts1 <- ggplotly(ts1)
ts1

2.3.5 Twitter Sentiment Halo

To get a bigger picture I scraped Twitter from July 2020 to March 2021 to explore how the sentiment evolved over time.

library(dplyr)
library(stringr)
library(tm)
library(textstem)
library(sentimentr)
library(lexicon)
library(magrittr)
library(tidytext)
library(reshape2)
library(ggplot2)
library(lubridate)
library(tidyquant)
library(scales)
library(fpp)
library(tibbletime)
library(xts)
library(tidyverse)
library(textcat)
library(plotly)
library(hrbrthemes)

# load in scraped tweets
haloTweets <- read.csv(file = "tweets.csv")
haloTweets <- filter(haloTweets, lang == 'en')

# keep columns tweet, date, and retweet count
haloTweets <- haloTweets[c(3,2,10)]

# Rename columns
colnames(haloTweets)[1] <- "text"
colnames(haloTweets)[2] <- "created"

# Clean text
haloTweets <- haloTweets %>%
  dplyr::select(text, created, retweetCount) %>%
  mutate(text = as.character(text), 
         text = str_replace_all(text, "\n", " "),   
         text = str_replace_all(text, "(\\[.*?\\])", ""),
         text = gsub("([a-z])([A-Z])", "\\1 \\2", text), 
         text = gsub('[[:punct:] ]+',' ', text),
         text = gsub("[^\x01-\x7F]", "", text),
         text = tolower(text), 
         text = removeWords(text, c("'", stopwords(kind = "en"))), 
         text = removePunctuation(text), 
         text = removeNumbers(text),
         text = textstem::lemmatize_strings(text),
         text = str_squish(text))


# Remove common words - custom defined
library(tm)
haloTweets$text <-  tm::removeWords(haloTweets$text, words = c("amazon", "halo", "fitness", "tracker", "prime", "https", "t"))
haloTweets$text <- str_squish(haloTweets$text)

# Convert to fitting date format
haloTweets$created <- as.Date(haloTweets$created)

# Rearrange columns
haloTweets <- haloTweets[,c(2,1,3)]

# Split text into words but keep the date
tweetSentHalo <- data_frame(text = haloTweets$text) %>% 
  group_by(haloTweets$created, haloTweets$retweetCountPercent) %>%
  unnest_tokens(tbl = ., output = word, input = text) 

# Rename col
colnames(tweetSentHalo)[1] <- "date"

# Load lexicon jockers_rinker
sentimentJR <- hash_sentiment_jockers_rinker
colnames(sentimentJR)[1] <- "word"
colnames(sentimentJR)[2] <- "score"

# Merge sentiment and call words by date
sentimentJR <- merge(tweetSentHalo, sentimentJR, by = 'word') 

# Group by date and sum scores
scoresJR <- sentimentJR %>% 
  group_by(date) %>% 
  na.omit() %>% 
  summarize(score = mean(score))
DT::datatable(scoresJR)

# plot data
ts1 <- ggplot(scoresJR, aes(date, score)) +
  #geom_line() +
  geom_smooth() +
  scale_x_date(date_labels = "%m", date_breaks = "1 month") +
  labs(title = "Twitter Sentiment - Halo",
       x = "Month",
       y = "") +
  theme_ipsum() 

ts1 <- ggplotly(ts1)
ts1

2.4 Topic Models

Topic modeling is an unsupervised machine learning technique that’s capable of scanning a set of documents, detecting word and phrase patterns within them, and automatically clustering word groups and similar expressions that best characterize a set of documents. Topic Models look at the proportion of each topic, displaying some of the highest probability words. For a full version of words please refer to the table below. In the table, frex words (occur frequently within the topic and are exclusive to that topic) and the highest probability words (words that have the highest probability of occurring within that topic) are important indicators. The Lift (calculated by dividing by frequencies in other topics) and Score (uses log frequencies) can be useful and are here to give an idea about how common the words are in comparison to how common they are in other topics.

### Topic Models
library(stm)

set.seed(1001)
holdoutRows <- sample(1:nrow(haloReviews), 100, replace = FALSE) # normally leave out
haloText <- textProcessor(documents = haloReviews$review_text[-c(holdoutRows)],
                           metadata = haloReviews[-c(holdoutRows), ], 
                           stem = FALSE) # not stemming this time

## Building corpus... 
## Converting to Lower Case... 
## Removing punctuation... 
## Removing stopwords... 
## Removing numbers... 
## Creating Output...

haloPrep <- prepDocuments(documents = haloText$documents,
                           vocab = haloText$vocab,
                           meta = haloText$meta)

## Removing 2479 of 6115 terms (2479 of 124096 tokens) due to frequency 
## Your corpus now has 2819 documents, 3636 terms and 121617 tokens.

# Determine how many topics is best
kTest <- searchK(documents = haloPrep$documents,
                 vocab = haloPrep$vocab,
                 K = c(3, 4, 5, 10), verbose = FALSE)
plot(kTest)

# Plot topics
topics5 <- stm(documents = haloPrep$documents,
               vocab = haloPrep$vocab, seed = 1001,
               K = 5, verbose = FALSE)

plot(topics5)

labelTopics(topics5)

## Topic 1 Top Words:
##       Highest Prob: charge, work, device, product, return, month, sync 
##       FREX: return, stop, money, support, reset, customer, clip 
##       Lift: advancement, american, anger, annual, approximately, arrogant, assistance 
##       Score: que, customer, stop, charge, replacement, return, refund 
## Topic 2 Top Words:
##       Highest Prob: can, activity, time, see, will, step, get 
##       FREX: asleep, watcher, total, calculate, match, expectation, consistently 
##       Lift: alas, biking, bio, bookmark, chair, cheerful, class 
##       Score: activity, step, can, body, see, fat, weight 
## Topic 3 Top Words:
##       Highest Prob: app, point, rate, heart, workout, give, datum 
##       FREX: rate, heart, workout, sound, live, value, intense 
##       Lift: affair, anticipate, api, appointment, asmr, athletic, beforehand 
##       Score: point, rate, heart, workout, score, sound, intense 
## Topic 4 Top Words:
##       Highest Prob: band, like, wear, just, get, app, wrist 
##       FREX: wrist, uncomfortable, skin, bulky, functionality, shape, clean 
##       Lift: adjustable, advertisement, audience, bandage, bland, bone, bothersome 
##       Score: wear, wrist, thank, uncomfortable, comfortable, skin, like 
## Topic 5 Top Words:
##       Highest Prob: sleep, track, good, use, feature, tone, watch 
##       FREX: sleep, wake, isnt, good, accurate, great, glad 
##       Lift: apnea, assist, awareness, baby, beef, beneficial, bet 
##       Score: sleep, track, good, accurate, love, great, watch

2.5 Relationships between Words

2.5.1 Bigram Reviews

By seeing how often word X is followed by word Y, we can then build a model of the relationships between them. We do this by adding the token = “ngrams” option to unnest_tokens(), and setting n to the number of words we wish to capture in each n-gram. When we set n to 2, we are examining pairs of two consecutive words, often called “bigrams”.

As one might expect, a lot of the most common bigrams are pairs of common (uninteresting) words, such as of the and to be: what we call “stop-words”. This is a useful time to use tidyr’s separate(), which splits a column into multiple based on a delimiter. This lets us separate it into two columns, “word1” and “word2”, at which point we can remove cases where either is a stop-word.

library(dplyr)
library(tidytext)
library(janeaustenr)

# group words together
halo_bigrams <- data_frame(text = haloReviews$review_text) %>%
  unnest_tokens(bigram, text, token = "ngrams", n = 2)

# count and sort grouped words
halo_bigrams %>%
  count(bigram, sort = TRUE)

library(tidyr)
# separate a column into mutliple based on a delimiter
bigrams_separated <- halo_bigrams %>%
  separate(bigram, c("word1", "word2"), sep = " ")

# remove stopwords
bigrams_filtered <- bigrams_separated %>%
  filter(!word1 %in% stop_words$word) %>%
  filter(!word2 %in% stop_words$word)

# now we get a new & different bigram count
bigram_counts <- bigrams_filtered %>% 
  count(word1, word2, sort = TRUE)

# create new df to unite them too
bigrams_united <- bigrams_filtered %>%
  unite(bigram, word1, word2, sep = " ")

# Network graph
bigram_graph <- bigram_counts %>%
  filter(n > 40)

library(networkD3)
src <- bigram_graph$word1
target <- bigram_graph$word2
networkData <- data.frame(src, target)

# Plot
netHalo <- simpleNetwork(networkData, fontSize = 12, zoom = TRUE, linkColour = "black", nodeColour = "navy", fontFamily = "serif", charge = -10)
netHalo

2.5.2 Bigram Twitter

The Twitter Word Network looks a bit different considering that we are looking at a more broad target of individuals writing about the fitness tracker. The main categories identified were: Online Magazines writing a product review, actual individual users, and potential users and outside observers such as Politicians or simply people stating what they think.

library(dplyr)
library(tidytext)
library(janeaustenr)

# group words together
halo_bigrams <- data_frame(text = haloTweets$text) %>%
  unnest_tokens(bigram, text, token = "ngrams", n = 2)

# count and sort grouped words
halo_bigrams %>%
  count(bigram, sort = TRUE)

library(tidyr)
# separate a column into mutliple based on a delimiter
bigrams_separated <- halo_bigrams %>%
  separate(bigram, c("word1", "word2"), sep = " ")

# remove stopwords
bigrams_filtered <- bigrams_separated %>%
  filter(!word1 %in% stop_words$word) %>%
  filter(!word2 %in% stop_words$word)

# now we get a new & different bigram count
bigram_counts <- bigrams_filtered %>% 
  count(word1, word2, sort = TRUE)

# create new df to unite them too
bigrams_united <- bigrams_filtered %>%
  unite(bigram, word1, word2, sep = " ")

# Network graph
bigram_graph <- bigram_counts %>%
  filter(n > 25)

library(networkD3)
src <- bigram_graph$word1
target <- bigram_graph$word2
networkData <- data.frame(src, target)

# Plot
netHaloTwitter <- simpleNetwork(networkData, fontSize = 12, zoom = TRUE, linkColour = "black", nodeColour = "navy", fontFamily = "serif", charge = -10)
netHaloTwitter

2.6 Ratings vs Reviews (Stars)

Amazon Halo´s total ratings indicate a score of 3.7/5. The ratings for the reviews are on average 13% worse (3.05/5). Please click to expand code and see the calculations.

### Reviews vs Stars

# Identify Reviews
table(haloReviews$review_star)

## 
##   1   2   3   4   5 
## 605 553 576 464 721

# Amazon´s statement on how ratings are calculated: "To calculate the overall star rating and percentage breakdown by star, we don't use a simple average. Instead, our system considers things like how recent a review is and if the reviewer bought the item on Amazon. It also analyzes reviews to verify trustworthiness." You can see below that for this specific product (and all of the following in this analysis I was able to get the rating by applying a simple weight)

# Released in December 2020. 

# Stars for ratings only 

# Total Ratings: 13,617. Score: 3.7/5
# 5 star - 46% - 6262 - 5 points: 2.3
# 4 star - 16% - 2179 - 4 points: 0.64
# 3 star - 15% - 2043 - 3 points: 0.45
# 2 star - 10% - 1362 - 2 points: 0.2
# 1 star - 13% - 1770 - 1 points: 0.13


# Stars for reviews only

# Total Review Ratings: 2919. Score: 3.05/5
# 5 star - 24.7% - 721 - 5 points: 1.235
# 4 star - 15.9% - 464 - 4 points: 0.636
# 3 star - 19.7% - 576 - 3 points: 0.591
# 2 star - 18.9% - 553 - 2 points: 0.378
# 1 star - 20.7% - 605 - 1 Points: 0.207

3 Fitbit Inspire 2

3.1 Data & Cleaning

scrapeDate <- as.Date("2020-02-26")

productLink <- "https://www.amazon.com/Fitbit-Inspire-Fitness-Tracker-Included/dp/B08DFGPTSK"

library(dplyr)
library(stringr)
library(tm)
library(textstem)

### Load in scraped data
scrapeB08DFGPTSK <- load.Rdata2("scrapeFitbitInspire2B08DFGPTSK.RData", path=getwd())

### Clean & prepare data for analysis

# Create new df for scraped data
FitbitIns2 <- scrapeB08DFGPTSK

# Clean df
FitbitIns2 <- FitbitIns2 %>%
  dplyr::select(review_title, review_text, review_star, review_date, product_specs, page) %>%
  mutate(review_title = as.character(review_title), 
         review_title = str_replace_all(review_title, "\n", " "),   
         review_title = str_replace_all(review_title, "(\\[.*?\\])", ""),
         review_title = gsub("([a-z])([A-Z])", "\\1 \\2", review_title), 
         review_title = gsub('[[:punct:] ]+',' ', review_title),
         review_title = gsub("[^\x01-\x7F]", "", review_title), # remove smileys
         review_title = tolower(review_title), 
         review_title = tm::removeWords(review_title, c("'", stopwords(kind = "en"))), 
         review_title = removePunctuation(review_title), 
         review_title = removeNumbers(review_title),
         review_title = textstem::lemmatize_strings(review_title),
         review_title = textclean::replace_contraction(review_title),
         review_title = str_squish(review_title),
         review_text = as.character(review_text), 
         review_text = str_replace_all(review_text, "\n", " "),   
         review_text = str_replace_all(review_text, "(\\[.*?\\])", ""),
         review_text = gsub("([a-z])([A-Z])", "\\1 \\2", review_text), 
         review_text = gsub('[[:punct:] ]+',' ', review_text),
         review_text = gsub("[^\x01-\x7F]", "", review_text),
         review_text = tolower(review_text), 
         review_text = tm::removeWords(review_text, c("'", stopwords(kind = "en"))), 
         review_text = removePunctuation(review_text), 
         review_text = removeNumbers(review_text),
         review_text = textstem::lemmatize_strings(review_text),
         review_text = textclean::replace_contraction(review_text),
         review_text = str_squish(review_text),
         review_star = gsub(".0 out of 5 stars", "", review_star),
         review_date = gsub("Reviewed in the United States on ", "", review_date),
         review_date = mdy(review_date),
         review_text = paste(review_title, review_text)) %>% 
  select(-c(review_title)) # drop column review_title because it was merged into review_text

rmarkdown::paged_table(FitbitIns2,
                       list(rows.print = 5))

3.2 Initial Text Analysis

There are a total of 1470 reviews over a time frame of about 5 months. Splitting the reviews up gives us 37865 words - the words are later used for the analysis and grouped by date. Before we jump into the analysis it makes sense to remove product specific words.

# Total Reviews
count(FitbitIns2)

## # A tibble: 1 x 1
##       n
##   <int>
## 1  1470

# Time frame
FirstR2 <- min(FitbitIns2$review_date)
LastR2 <- max(FitbitIns2$review_date)

LengthRD2 <- difftime(LastR2, FirstR2)
LengthRD2

## Time difference of 151 days

# Word Count
library(tidyr)
library(tidytext)

words2 <- data_frame(text = FitbitIns2$review_text) %>% 
  unnest_tokens(tbl = ., output = word, input = text)
count(words2)

## # A tibble: 1 x 1
##       n
##   <int>
## 1 39590

# Remove common words - custom defined
library(tm)
FitbitIns2$review_text <-  tm::removeWords(FitbitIns2$review_text, words = c("amazon", "fitbit", "fitness", "tracker", "inspire", "fitbits"))
FitbitIns2$review_text <- str_squish(FitbitIns2$review_text)

3.2.1 Document Term Matrix

# Document Term Matrix
library(dplyr)
library(stringr)
library(tm)
library(textstem)

documents2 <- FitbitIns2$review_text

documentsCorp2 <- tm::SimpleCorpus(VectorSource(documents2))
documentsDTM2 <- DocumentTermMatrix(documentsCorp2)
inspect(documentsDTM2)

## <<DocumentTermMatrix (documents: 1470, terms: 3049)>>
## Non-/sparse entries: 29864/4452166
## Sparsity           : 99%
## Maximal term length: 16
## Weighting          : term frequency (tf)
## Sample             :
##       Terms
## Docs   get good great like love sleep step track use work
##   1152   3    3     0    4    0     0    0     1   1    5
##   1156   3    5     0    1    1     1    3     1   4    3
##   1157   1    0     0    2    0     2    6     0   0    0
##   13     2    4     1    3    0     0    4     3   0    6
##   1393   6    5     1    5    7     1    2     0   6    0
##   1395   4   10     1    4    0     5    1     4   3    5
##   1457   7    0     2    2    0     1    2     4   2    1
##   38     2    2     0    2    3     1    3     0   1    0
##   82     5    1     0    2    3     3    0     7   6    2
##   9      3    4     4    4    2     1    3     1   1    1

3.2.2 Frequent Words

Frequent words help getting a rough idea of the reviews. The first plot is filtered for words with at least 3 letters, while the words in the second plot have to have at least 7 words.

# Frequent Words
library(rJava)
library(qdap)

ft5 <- freq_terms(FitbitIns2$review_text, 25, 
                  at.least=3, 
                  stopwords=qdapDictionaries::Top25Words)
plot(ft5)

ft6 <- freq_terms(FitbitIns2$review_text, 25, 
                  at.least=7, 
                  stopwords=qdapDictionaries::Top25Words)
plot(ft6)

3.3 Sentiment Analysis

3.3.1 Afinn

The AFINN lexicon assigns words with a score that runs between -5 and 5, with negative scores indicating negative sentiment and positive scores indicating positive sentiment. A sentiment of 1.37 results slightly positive considering the range [-5:5]. This is an overall sentiment for the whole time frame of the reviews.

# Overall Sentiment Score with afinn
quickSent2 <- words2 %>%
  inner_join(get_sentiments("afinn")) %>% 
  summarize(n = nrow(.), sentSum = sum(value)) %>% 
  mutate(sentiment = sentSum / n)
quickSent2

## # A tibble: 1 x 3
##       n sentSum sentiment
##   <int>   <dbl>     <dbl>
## 1  6765    9533      1.41

3.3.2 Bing

The bing lexicon categorizes words in a binary fashion into positive and negative categories. The sentiment is 72% positive & 28% negative. In the plot the most common positive and negative words are displayed.

# Positive Negative using bing
quickPN2 <- words2 %>%
  inner_join(get_sentiments("bing")) %>% 
  count(sentiment) %>% 
  pivot_wider(values_from = n, names_from = sentiment) %>% 
  mutate(sentiment = positive - negative)
quickPN2

## # A tibble: 1 x 3
##   negative positive sentiment
##      <int>    <int>     <int>
## 1     1958     5497      3539

# Frequent Words by sentiment
bingPN2 <- words2 %>%
  inner_join(get_sentiments("bing")) %>%
  count(word, sentiment, sort = TRUE) %>%
  ungroup()

# Plot
library(ggplot2)
bingPN2 %>%
  group_by(sentiment) %>%
  top_n(20) %>%
  ggplot(aes(reorder(word, n), n, fill = sentiment)) +
  geom_bar(alpha = 0.8, stat = "identity", show.legend = FALSE) +
  facet_wrap(~sentiment, scales = "free_y") +
  labs(y = "Contribution to sentiment", x = NULL) +
  coord_flip()

3.3.3 NRC

# NRC Sentiment
library(syuzhet)
nrcSent <- iconv(FitbitIns2$review_text)

# get the emotions using the NRC dictionary
emotionsFit2 <- get_nrc_sentiment(nrcSent)
emFit2 = colSums(emotionsFit2)
emFit2Sum = data.frame(count=emFit2, emotion=names(emFit2))
# add percentage
emFit2Sum$Perc <- (emFit2Sum$count / sum(emFit2Sum$count) * 100)
# factor and sort
emFit2Sum$emotion = factor(emFit2Sum$emotion, levels=emFit2Sum$emotion[order(emFit2Sum$Perc, decreasing = F)])

# Visualize the emotions from NRC sentiments
library(plotly)
plot_ly(emFit2Sum, y=~emotion, x=~Perc, type="bar", color=~emotion, colors = "Set1",   
        orientation = 'h') %>%
         layout(showlegend=FALSE, yaxis=list(title=""), xaxis=list(title=""),
          title="NRC Sentiment - Fitbit Inspire 2")

3.3.4 Customer Review Sentiment

In this section the lexicon hash_sentiment_huliu (Hu Liu, 2004) and hash_sentiment_jockers_rinker (2017) are used to calculate sentiment scores. Both lexicons were specifically developed for analyzing customer sentiment in reviews and have the following scores: [1, 0, -1.05, -1, -2]

library(sentimentr)
library(lexicon)
library(magrittr)

# Split into words but keep the date
sentimentFitbitIns2 <- data_frame(text = FitbitIns2$review_text) %>% 
  group_by(FitbitIns2$review_date) %>%
  unnest_tokens(tbl = ., output = word, input = text) 

colnames(sentimentFitbitIns2)[1] <- "date"

3.3.4.1 huliu

# Load lexicon huliu
sentimentHuliu2 <- hash_sentiment_huliu
colnames(sentimentHuliu2)[1] <- "word"
colnames(sentimentHuliu2)[2] <- "score"

# Get Overall sentiment
quickSent <- words2 %>%
  inner_join(sentimentHuliu2) %>% 
  summarize(n = nrow(.), sentSum = sum(score)) %>% 
  mutate(sentiment = sentSum / n)
quickSent

## # A tibble: 1 x 3
##       n sentSum sentiment
##   <int>   <dbl>     <dbl>
## 1  7545    3598     0.477

# Merge sentiment and call words by date
sentimentHuliu2 <- merge(sentimentFitbitIns2, sentimentHuliu2, by = 'word') 

# Group by date and sum scores
scoresHuliu2 <- sentimentHuliu2 %>% 
  group_by(date) %>% 
  na.omit() %>% 
  summarize(score = mean(score))
DT::datatable(scoresHuliu2)

3.3.4.2 jockers_rinker

# Load lexicon jockers_rinker
sentimentJR2 <- hash_sentiment_jockers_rinker
colnames(sentimentJR2)[1] <- "word"
colnames(sentimentJR2)[2] <- "score"

# Get Overall sentiment
quickSent <- words2 %>%
  inner_join(sentimentJR2) %>% 
  summarize(n = nrow(.), sentSum = sum(score)) %>% 
  mutate(sentiment = sentSum / n)
quickSent

## # A tibble: 1 x 3
##       n sentSum sentiment
##   <int>   <dbl>     <dbl>
## 1 10521   3114.     0.296

# Merge sentiment and call words by date
sentimentJR2 <- merge(sentimentFitbitIns2, sentimentJR2, by = 'word') 

# Group by date and sum scores
scoresJR2 <- sentimentJR2 %>% 
  group_by(date) %>% 
  na.omit() %>% 
  summarize(score = mean(score))
DT::datatable(scoresJR2)

3.3.4.3 Sentiments Plot

# Plot Sentiment over time
library(dplyr)
plotData2 <- scoresHuliu2 %>%
  select(date,score) %>%
  distinct() %>%
  right_join(scoresJR2, by = 'date')

colnames(plotData2)[2] <- "Huliu"
colnames(plotData2)[3] <- "JR"


library(reshape2)
library(ggplot2)
library(lubridate)
library(tidyquant)
library(scales)
library(plotly)
library(hrbrthemes)

plotData2 <- melt(plotData2 ,  id.vars = 'date', variable.name = 'score')

ts3 <- ggplot(plotData2, aes(date, value)) +
  geom_line(aes(colour = score)) +
  geom_smooth(aes(colour = score)) +
  scale_x_date(date_labels = "%m", date_breaks = "1 month") +
  labs(title = "Sentiment Scores - Fitbit Inspire 2",
       x = "Month",
       y = "") +
  theme_ipsum() 

ts3 <- ggplotly(ts3)
ts3

3.4 Topic Models

library(stm)

set.seed(1001)
holdoutRows <- sample(1:nrow(FitbitIns2), 100, replace = FALSE)
FitbitInspire2Text <- textProcessor(documents = FitbitIns2$review_text[-c(holdoutRows)],
                            metadata = FitbitIns2[-c(holdoutRows), ],
                            stem = FALSE) # not stemming this time

## Building corpus... 
## Converting to Lower Case... 
## Removing punctuation... 
## Removing stopwords... 
## Removing numbers... 
## Creating Output...

FitbitInspire2Prep <- prepDocuments(documents = FitbitInspire2Text$documents,
                            vocab = FitbitInspire2Text$vocab,
                            meta = FitbitInspire2Text$meta)

## Removing 1345 of 2962 terms (1345 of 27775 tokens) due to frequency 
## Your corpus now has 1367 documents, 1617 terms and 26430 tokens.

# Determine how many topics is best
kTest <- searchK(documents = FitbitInspire2Prep$documents,
                 vocab = FitbitInspire2Prep$vocab,
                 K = c(3, 4, 5, 10), verbose = FALSE)
plot(kTest)

# Plot topics
topics5 <- stm(documents = FitbitInspire2Prep$documents,
               vocab = FitbitInspire2Prep$vocab, seed = 1001,
               K = 5, verbose = FALSE)

plot(topics5)

# full table
labelTopics(topics5)

## Topic 1 Top Words:
##       Highest Prob: good, watch, see, screen, small, turn, come 
##       FREX: good, watch, turn, minute, big, apple, dnd 
##       Lift: although, always, apparently, appear, auto, average, big 
##       Score: good, watch, minute, turn, small, spot, tap 
## Topic 2 Top Words:
##       Highest Prob: track, love, use, rate, keep, also, need 
##       FREX: track, love, read, happy, accuracy, workout, slim 
##       Lift: attractive, certain, circuit, clearly, climb, competitive, cute 
##       Score: love, track, use, rate, easy, gift, help 
## Topic 3 Top Words:
##       Highest Prob: sleep, step, one, heart, buy, feature, much 
##       FREX: sleep, step, fit, bite, exercise, monitor, far 
##       Lift: android, awareness, class, continuous, far, faulty, grayish 
##       Score: sleep, step, heart, one, bite, fit, exercise 
## Topic 4 Top Words:
##       Highest Prob: work, time, device, wrist, battery, charge, day 
##       FREX: work, device, wrist, battery, charge, life, premium 
##       Lift: accord, accountable, advantage, advertise, altimeter, annoyance, assistance 
##       Score: work, device, battery, life, charge, soon, wrist 
## Topic 5 Top Words:
##       Highest Prob: great, like, get, app, just, can, want 
##       FREX: great, phone, issue, nice, notification, sync, review 
##       Lift: watcher, adjust, ankle, anytime, believe, blaze, bluetooth 
##       Score: great, app, like, phone, get, just, product

3.5 Relationships between Words

3.5.1 Bigram (n-gram token)

library(dplyr)
library(tidytext)
library(janeaustenr)

# group words together
fb2_bigrams <- data_frame(text = FitbitIns2$review_text) %>%
  unnest_tokens(bigram, text, token = "ngrams", n = 2)

# count and sort grouped words
fb2_bigrams %>%
  count(bigram, sort = TRUE)

library(tidyr)
# separate a column into mutliple based on a delimiter
bigrams_separated <- fb2_bigrams %>%
  separate(bigram, c("word1", "word2"), sep = " ")

# remove stopwords
bigrams_filtered <- bigrams_separated %>%
  filter(!word1 %in% stop_words$word) %>%
  filter(!word2 %in% stop_words$word)

# now we get a new & different bigram count
bigram_counts <- bigrams_filtered %>% 
  count(word1, word2, sort = TRUE)

# create new df to unite them too
bigrams_united <- bigrams_filtered %>%
  unite(bigram, word1, word2, sep = " ")

# Network graph
bigram_graph <- bigram_counts %>%
  filter(n > 10)

library(networkD3)
src <- bigram_graph$word1
target <- bigram_graph$word2
networkData <- data.frame(src, target)

# Plot
netfb2 <- simpleNetwork(networkData, fontSize = 12, zoom = TRUE, linkColour = "black", nodeColour = "navy", fontFamily = "serif", charge = -10)
netfb2

3.6 Ratings vs Reviews (Stars)

Fitbit Inspire 2´s ratings indicate a rating of 4.6/5. The ratings for the reviews are on average 10% worse (4.1/5). Please click on expand code to see calculations.

### Reviews vs Stars

# Identify Reviews
table(FitbitIns2$review_star)

## 
##   1   2   3   4   5 
## 155  99 103 200 913

# Stars for ratings only 
# Total 13369 - 4.6/5

# 5 stars - 78% - 10428 - 5 points: 3.9
# 4 stars - 12% - 1604 - 4 points: 0.48
# 3 stars - 4% - 535 - 3 points: 0.12
# 2 stars - 2% - 267 - 2 points: 0.04
# 1 star - 3% - 401 - 1 points: 0.03

# Stars for reviews only 
# Total 1470 - 4.1/5

# 5 -   913 - 62.1% - 3.105442177
# 4 - 200   - 13.6% - 0.544217687
# 3 - 103   - 7%      - 0.210204082
# 2 - 99    - 6.7%  - 0.134693878
# 1 - 155   - 10.5% - 0.105442177

4 Fitbit Charge 4

4.1 Data & Cleaning

scrapeDate <- as.Date("2020-02-27")

ProductLink <- "https://www.amazon.com/Fitbit-Activity-Tracking-Rosewood-Included/dp/B084CQSL3Q"

library(dplyr)
library(stringr)
library(tm)
library(textstem)
library(miceadds)

### Load in scraped data
scrapeFitbitCharge4B084CQSL3Q <- load.Rdata2("scrapeFitbitCharge4B084CQSL3Q.RData", path=getwd())

### Clean & prepare data for analysis

# Create new df for scraped data
FitbitCh4 <- scrapeFitbitCharge4B084CQSL3Q


# Clean df
FitbitCh4 <- FitbitCh4 %>%
  dplyr::select(review_title, review_text, review_star, review_date, product_specs, page) %>%
  mutate(review_title = as.character(review_title), 
         review_title = str_replace_all(review_title, "\n", " "),   
         review_title = str_replace_all(review_title, "(\\[.*?\\])", ""),
         review_title = gsub("([a-z])([A-Z])", "\\1 \\2", review_title), 
         review_title = gsub('[[:punct:] ]+',' ', review_title),
         review_title = gsub("[^\x01-\x7F]", "", review_title), # remove smileys
         review_title = tolower(review_title), 
         review_title = tm::removeWords(review_title, c("'", stopwords(kind = "en"))), 
         review_title = removePunctuation(review_title), 
         review_title = removeNumbers(review_title),
         review_title = textstem::lemmatize_strings(review_title),
         review_title = textclean::replace_contraction(review_title),
         review_title = str_squish(review_title),
         review_text = as.character(review_text), 
         review_text = str_replace_all(review_text, "\n", " "),   
         review_text = str_replace_all(review_text, "(\\[.*?\\])", ""),
         review_text = gsub("([a-z])([A-Z])", "\\1 \\2", review_text), 
         review_text = gsub('[[:punct:] ]+',' ', review_text),
         review_text = gsub("[^\x01-\x7F]", "", review_text),
         review_text = tolower(review_text), 
         review_text = tm::removeWords(review_text, c("'", stopwords(kind = "en"))), 
         review_text = removePunctuation(review_text), 
         review_text = removeNumbers(review_text),
         review_text = textstem::lemmatize_strings(review_text),
         review_text = textclean::replace_contraction(review_text),
         review_text = str_squish(review_text),
         review_star = gsub(".0 out of 5 stars", "", review_star),
         review_date = gsub("Reviewed in the United States on ", "", review_date),
         review_date = mdy(review_date),
         review_text = paste(review_title, review_text)) %>% 
  select(-c(review_title)) # drop column review_title because it was merged into review_text

rmarkdown::paged_table(FitbitCh4,
                       list(rows.print = 5))

4.2 Initial Text Analysis

There are a total of 2390 reviews over a time frame of about 10 months. Splitting the reviews up gives us 99491 words - the words are later used for the analysis and grouped by date. Before we jump into the analysis it makes sense to remove product specific words.

### Text Analysis

# Total Reviews
count(FitbitCh4)

## # A tibble: 1 x 1
##       n
##   <int>
## 1  2390

# Time frame
FirstR4 <- min(FitbitCh4$review_date)
LastR4 <- max(FitbitCh4$review_date)

LengthRD4 <- difftime(LastR4, FirstR4)
LengthRD4

## Time difference of 317 days

# Word Count
library(tidyr)
library(tidytext)

words4 <- data_frame(text = FitbitCh4$review_text) %>% 
  unnest_tokens(tbl = ., output = word, input = text)
count(words4)

## # A tibble: 1 x 1
##       n
##   <int>
## 1 99491

# Remove common words - custom defined
library(tm)
FitbitCh4$review_text <-  tm::removeWords(FitbitCh4$review_text, words = c("amazon", "fitbit", "fitness", "tracker", "charge", "fitbits"))
FitbitCh4$review_text <- str_squish(FitbitCh4$review_text)

4.2.1 Document Term Matrix

# Document Term Matrix
library(dplyr)
library(stringr)
library(tm)
library(textstem)

documents4 <- FitbitCh4$review_text

documentsCorp4 <- tm::SimpleCorpus(VectorSource(documents4))
documentsDTM4 <- DocumentTermMatrix(documentsCorp4)
inspect(documentsDTM4)

## <<DocumentTermMatrix (documents: 2390, terms: 4754)>>
## Non-/sparse entries: 70033/11292027
## Sparsity           : 99%
## Maximal term length: 16
## Weighting          : term frequency (tf)
## Sample             :
##       Terms
## Docs   app get good like one time track use watch work
##   1620   8   6    7   10   7    4     8   7     2    8
##   1663   1   2    5    2   1    1     1   2     0    1
##   1794   3  10    2    5   1    0     0   4     2    1
##   1797   4   6   14    2   1    6     2  10    22    4
##   1803   6   1    6    1   1    6     5   1    16    2
##   1808  10   6    3    7   2    2     5   2     3    5
##   1821   9  15    9    9   6   11    17  15    19    8
##   2269   2   7    1    3   2    5     0   4     0    4
##   2280   2   4    6    5   7    4     3   8    13    2
##   98     6  10    5    8   1   12     6   4     7    2

4.2.2 Frequent Words

Frequent words help getting a rough idea of the reviews. The first plot is filtered for words with at least 3 letters, while the words in the second plot have to have at least 7 words.

# Frequent Words
library(rJava)
library(qdap)

ft7 <- freq_terms(FitbitCh4$review_text, 25, 
                  at.least=3, 
                  stopwords=qdapDictionaries::Top25Words)
plot(ft7)

ft8 <- freq_terms(FitbitCh4$review_text, 25, 
                  at.least=7, 
                  stopwords=qdapDictionaries::Top25Words)
plot(ft8)

4.3 Sentiment Analysis

4.3.1 Afinn

The AFINN lexicon assigns words with a score that runs between -5 and 5, with negative scores indicating negative sentiment and positive scores indicating positive sentiment. A sentiment of 0.9 results slightly negative considering the range [-5:5]. This is an overall sentiment for the whole time frame of the reviews.

quickSent4 <- words4 %>%
  inner_join(get_sentiments("afinn")) %>% 
  summarize(n = nrow(.), sentSum = sum(value)) %>% 
  mutate(sentiment = sentSum / n)
quickSent4

## # A tibble: 1 x 3
##       n sentSum sentiment
##   <int>   <dbl>     <dbl>
## 1 12914   11653     0.902

4.3.2 Bing

The bing lexicon categorizes words in a binary fashion into positive and negative categories. The sentiment is 60% positive & 40% negative. In the plot the most common positive and negative words are displayed.

# Positive Negative using bing
quickPN4 <- words4 %>%
  inner_join(get_sentiments("bing")) %>% 
  count(sentiment) %>% 
  pivot_wider(values_from = n, names_from = sentiment) %>% 
  mutate(sentiment = positive - negative)
quickPN4

## # A tibble: 1 x 3
##   negative positive sentiment
##      <int>    <int>     <int>
## 1     6104     9403      3299

# Frequent Words by sentiment
bingPN4 <- words4 %>%
  inner_join(get_sentiments("bing")) %>%
  count(word, sentiment, sort = TRUE) %>%
  ungroup()

# Plot
library(ggplot2)
bingPN4 %>%
  group_by(sentiment) %>%
  top_n(20) %>%
  ggplot(aes(reorder(word, n), n, fill = sentiment)) +
  geom_bar(alpha = 0.8, stat = "identity", show.legend = FALSE) +
  facet_wrap(~sentiment, scales = "free_y") +
  labs(y = "Contribution to sentiment", x = NULL) +
  coord_flip()

4.3.3 NRC

# NRC Sentiment
library(syuzhet)
nrcSent <- iconv(FitbitCh4$review_text)

# get the emotions using the NRC dictionary
emotionsFit4 <- get_nrc_sentiment(nrcSent)
emFit4 = colSums(emotionsFit4)
emFit4Sum = data.frame(count=emFit4, emotion=names(emFit4))
# add percentage
emFit4Sum$Perc <- (emFit4Sum$count / sum(emFit4Sum$count) * 100)
# factor and sort
emFit4Sum$emotion = factor(emFit4Sum$emotion, levels=emFit4Sum$emotion[order(emFit4Sum$Perc, decreasing = F)])

# Visualize the emotions from NRC sentiments
library(plotly)
plot_ly(emFit4Sum, y=~emotion, x=~Perc, type="bar", color=~emotion, colors = "Set1",   
        orientation = 'h') %>%
         layout(showlegend=FALSE, yaxis=list(title=""), xaxis=list(title=""),
          title="NRC Sentiment - Fitbit Charge 4")

4.3.4 Customer Review Sentiment

library(sentimentr)
library(lexicon)
library(magrittr)

# Split into words but keep the date
sentimentFitbitCh4 <- data_frame(text = FitbitCh4$review_text) %>% 
  group_by(FitbitCh4$review_date) %>%
  unnest_tokens(tbl = ., output = word, input = text) 

colnames(sentimentFitbitCh4)[1] <- "date"

4.3.4.1 huliu

# Load lexicon huliu
sentimentHuliu4 <- hash_sentiment_huliu
colnames(sentimentHuliu4)[1] <- "word"
colnames(sentimentHuliu4)[2] <- "score"

# Get Overall sentiment
quickSent <- words4 %>%
  inner_join(sentimentHuliu4) %>% 
  summarize(n = nrow(.), sentSum = sum(score)) %>% 
  mutate(sentiment = sentSum / n)
quickSent

## # A tibble: 1 x 3
##       n sentSum sentiment
##   <int>   <dbl>     <dbl>
## 1 15757    3503     0.222

# Merge sentiment and call words by date
sentimentHuliu4 <- merge(sentimentFitbitCh4, sentimentHuliu4, by = 'word') 

# Group by date and sum scores
scoresHuliu4 <- sentimentHuliu4 %>% 
  group_by(date) %>% 
  na.omit() %>% 
  summarize(score = mean(score))
DT::datatable(scoresHuliu4)

4.3.4.2 jockers_rinker

# Load lexicon jockers_rinker
sentimentJR4 <- hash_sentiment_jockers_rinker
colnames(sentimentJR4)[1] <- "word"
colnames(sentimentJR4)[2] <- "score"

# Get Overall sentiment
quickSent <- words4 %>%
  inner_join(sentimentJR4) %>% 
  summarize(n = nrow(.), sentSum = sum(score)) %>% 
  mutate(sentiment = sentSum / n)
quickSent

## # A tibble: 1 x 3
##       n sentSum sentiment
##   <int>   <dbl>     <dbl>
## 1 23127   3590.     0.155

# Merge sentiment and call words by date
sentimentJR4 <- merge(sentimentFitbitCh4, sentimentJR4, by = 'word') 

# Group by date and sum scores
scoresJR4 <- sentimentJR4 %>% 
  group_by(date) %>% 
  na.omit() %>% 
  summarize(score = mean(score))
DT::datatable(scoresJR4)

4.3.4.3 Sentiments Plot

# Plot Sentiment over time
library(dplyr)
plotData4 <- scoresHuliu4 %>%
  select(date,score) %>%
  distinct() %>%
  right_join(scoresJR4, by = 'date')

colnames(plotData4)[2] <- "Huliu"
colnames(plotData4)[3] <- "JR"


library(reshape2)
library(ggplot2)
library(lubridate)
library(tidyquant)
library(scales)
library(plotly)
library(hrbrthemes)

plotData4 <- melt(plotData4,  id.vars = 'date', variable.name = 'score')

ts4 <- ggplot(plotData4, aes(date, value)) +
  geom_line(aes(colour = score)) +
  geom_smooth(aes(colour = score)) +
  scale_x_date(date_labels = "%m", date_breaks = "1 month") +
  labs(title = "Sentiment Scores - Fitbit Charge 4",
       x = "Month",
       y = "") +
  theme_ipsum() 

ts4 <- ggplotly(ts4)
ts4

4.4 Topic Models

### Topic Models
library(stm)

set.seed(1001)
holdoutRows <- sample(1:nrow(FitbitCh4), 100, replace = FALSE) # normally leave out
FitbitCh4Text <- textProcessor(documents = FitbitCh4$review_text[-c(holdoutRows)],
                          metadata = FitbitCh4[-c(holdoutRows), ], 
                          stem = FALSE) # not stemming this time

## Building corpus... 
## Converting to Lower Case... 
## Removing punctuation... 
## Removing stopwords... 
## Removing numbers... 
## Creating Output...

FitbitCh4Prep <- prepDocuments(documents = FitbitCh4Text$documents, 
                          vocab = FitbitCh4Text$vocab,
                          meta = FitbitCh4Text$meta)

## Removing 2033 of 4680 terms (2033 of 67111 tokens) due to frequency 
## Your corpus now has 2288 documents, 2647 terms and 65078 tokens.

# Determine how many topics is best
kTest <- searchK(documents = FitbitCh4Prep$documents,
                 vocab = FitbitCh4Prep$vocab,
                 K = c(3, 4, 5, 10), verbose = FALSE)
plot(kTest)

# Plot topics
topics5 <- stm(documents = FitbitCh4Prep$documents,
               vocab = FitbitCh4Prep$vocab, seed = 1001,
               K = 5, verbose = FALSE)

plot(topics5)

# full table
labelTopics(topics5)

## Topic 1 Top Words:
##       Highest Prob: work, buy, device, get, return, will, sync 
##       FREX: return, customer, support, service, replacement, send, nothing 
##       Lift: acknowledge, adapter, aforementioned, agent, ancient, anybody, apparent 
##       Score: customer, support, return, send, service, window, replacement 
## Topic 2 Top Words:
##       Highest Prob: track, sleep, great, love, good, step, rate 
##       FREX: track, love, step, rate, heart, count, calorie 
##       Lift: adjustment, amaze, amazingly, annual, anyways, aria, asin 
##       Score: love, heart, track, sleep, step, rate, accurate 
## Topic 3 Top Words:
##       Highest Prob: use, band, battery, feature, like, really, life 
##       FREX: band, life, nice, spotify, premium, large, useful 
##       Lift: allergic, aspect, clutter, comparable, contactless, cpap, doubt 
##       Score: band, battery, life, feature, spotify, nice, special 
## Topic 4 Top Words:
##       Highest Prob: watch, get, just, can, phone, need, good 
##       FREX: watch, run, map, versa, google, outside, dim 
##       Lift: acquisition, adventure, brighten, cellular, daylight, dial, differently 
##       Score: watch, run, break, phone, read, apple, plastic 
## Topic 5 Top Words:
##       Highest Prob: app, time, day, want, screen, set, take 
##       FREX: information, figure, difficult, ride, swipe, usually, inspire 
##       Lift: activation, apnea, approach, approve, bedtime, category, combination 
##       Score: app, screen, time, minute, set, day, show

4.5 Relationships between Words

4.5.1 Bigram (n-gram token)

library(dplyr)
library(tidytext)
library(janeaustenr)

# group words together
fb4_bigrams <- data_frame(text = FitbitCh4$review_text) %>%
  unnest_tokens(bigram, text, token = "ngrams", n = 2)

# count and sort grouped words
fb4_bigrams %>%
  count(bigram, sort = TRUE)

library(tidyr)
# separate a column into mutliple based on a delimiter
bigrams_separated <- fb4_bigrams %>%
  separate(bigram, c("word1", "word2"), sep = " ")

# remove stopwords
bigrams_filtered <- bigrams_separated %>%
  filter(!word1 %in% stop_words$word) %>%
  filter(!word2 %in% stop_words$word)

# now we get a new & different bigram count
bigram_counts <- bigrams_filtered %>% 
  count(word1, word2, sort = TRUE)

# create new df to unite them too
bigrams_united <- bigrams_filtered %>%
  unite(bigram, word1, word2, sep = " ")

# Network graph
bigram_graph <- bigram_counts %>%
  filter(n > 20)

library(networkD3)
src <- bigram_graph$word1
target <- bigram_graph$word2
networkData <- data.frame(src, target)

# Plot
netfb4 <- simpleNetwork(networkData, fontSize = 12, zoom = TRUE, linkColour = "black", nodeColour = "navy", fontFamily = "serif", charge = -10)
netfb4

4.6 Ratings vs Reviews (Stars)

Fitbit Charge 4´s ratings indicate a score of 4.5/5. The ratings for the reviews are on average 28% worse (3.1/5). For this product I only scraped 50% of the written reviews so this number might be scewed. Please click expand the code to see calculations.

### Reviews vs Stars

# Identify Reviews
table(FitbitCh4$review_star)

## 
##   1   2   3   4   5 
## 738 283 247 244 878

# Stars for ratings only 
# Total 39318 - 4.5/5

# 5 stars - 76% - 5 points: 3.8
# 4 stars - 13% - 4 points: 0.52
# 3 stars - 4%  - 3 points: 0.16
# 2 stars - 2% - 2 points: 0.04
# 1 star - 5% - 1 points: 0.05

# Stars for reviews only 
# Total 2390 - 3.1/5
    
# 5 878 0.367364017 1.836820084
# 4 244 0.10209205  0.408368201
# 3 247 0.10334728  0.310041841
# 2 283 0.118410042 0.236820084
# 1 738 0.308786611 0.308786611

5 Apple Watch 6

5.1 Data & Cleaning

scrapeDate <- as.Date("2020-02-27")

ProductLink <- "https://www.amazon.com/New-Apple-Watch-GPS-40mm-Aluminum/dp/B08J5QC43Q"

library(dplyr)
library(stringr)
library(tm)
library(textstem)
library(miceadds)

scrapeAppleWatch6B08J5QC43Q <- load.Rdata2("scrapeAppleWatch6B08J5QC43Q.RData", path=getwd())


### Clean & prepare data for analysis

# Create new df for scraped data
awReviews <- scrapeAppleWatch6B08J5QC43Q

# Clean df
awReviews <- awReviews %>%
  dplyr::select(review_title, review_text, review_star, review_date, product_specs, page) %>%
  mutate(review_title = as.character(review_title), 
         review_title = str_replace_all(review_title, "\n", " "),   
         review_title = str_replace_all(review_title, "(\\[.*?\\])", ""),
         review_title = gsub("([a-z])([A-Z])", "\\1 \\2", review_title), 
         review_title = gsub('[[:punct:] ]+',' ', review_title),
         review_title = gsub("[^\x01-\x7F]", "", review_title), # remove smileys
         review_title = tolower(review_title), 
         review_title = tm::removeWords(review_title, c("'", stopwords(kind = "en"))), 
         review_title = removePunctuation(review_title), 
         review_title = removeNumbers(review_title),
         review_title = textstem::lemmatize_strings(review_title),
         review_title = textclean::replace_contraction(review_title),
         review_title = str_squish(review_title),
         review_text = as.character(review_text), 
         review_text = str_replace_all(review_text, "\n", " "),   
         review_text = str_replace_all(review_text, "(\\[.*?\\])", ""),
         review_text = gsub("([a-z])([A-Z])", "\\1 \\2", review_text), 
         review_text = gsub('[[:punct:] ]+',' ', review_text),
         review_text = gsub("[^\x01-\x7F]", "", review_text),
         review_text = tolower(review_text), 
         review_text = tm::removeWords(review_text, c("'", stopwords(kind = "en"))), 
         review_text = removePunctuation(review_text), 
         review_text = removeNumbers(review_text),
         review_text = textstem::lemmatize_strings(review_text),
         review_text = textclean::replace_contraction(review_text),
         review_text = str_squish(review_text),
         review_star = gsub(".0 out of 5 stars", "", review_star),
         review_date = gsub("Reviewed in the United States on ", "", review_date),
         review_date = mdy(review_date),
         review_text = paste(review_title, review_text)) %>% 
  select(-c(review_title)) # drop column review_title because it was merged into review_text

rmarkdown::paged_table(awReviews,
                       list(rows.print = 5))

5.2 Initial Text Analysis

There are a total of 1370 reviews over a time frame of about 5 months. Splitting the reviews up gives us 33341 words - the words are later used for the analysis and grouped by date. Before we jump into the analysis it makes sense to remove product specific words.

### Text Analysis

# Total Reviews
count(awReviews)

## # A tibble: 1 x 1
##       n
##   <int>
## 1  1370

# Time frame
FirstR1 <- min(awReviews$review_date)
LastR1 <- max(awReviews$review_date)

LengthRD1 <- difftime(LastR1, FirstR1)
LengthRD1

## Time difference of 155 days

# Word Count
library(tidyr)
library(tidytext)

words1 <- data_frame(text = awReviews$review_text) %>% 
  unnest_tokens(tbl = ., output = word, input = text)
count(words1)

## # A tibble: 1 x 1
##       n
##   <int>
## 1 33341

# Remove common words - custom defined
library(tm)
awReviews$review_text <-  tm::removeWords(awReviews$review_text, words = c("amazon", "apple", "fitness", "tracker", "charge", "fitbit", "watch", "apples"))
awReviews$review_text <- str_squish(awReviews$review_text)

5.2.1 Document Term Matrix

# Document Term Matrix
library(dplyr)
library(stringr)
library(tm)
library(textstem)

documents1 <- awReviews$review_text

documentsCorp1 <- tm::SimpleCorpus(VectorSource(documents1))
documentsDTM1 <- DocumentTermMatrix(documentsCorp1)
inspect(documentsDTM1)

## <<DocumentTermMatrix (documents: 1370, terms: 3201)>>
## Non-/sparse entries: 23967/4361403
## Sparsity           : 99%
## Maximal term length: 17
## Weighting          : term frequency (tf)
## Sample             :
##       Terms
## Docs   battery buy feature get good great love phone use work
##   1033       2   4       0   6    4     1    0    25   7    5
##   1034       2   1       1   5    4     2    0     2   2    1
##   1085       2   3       1   1    1     0    0     2   0    1
##   1092       0   0       2   0    5     2    1     1   7    1
##   1093       0   0       2   7    1     4    1     5   1    3
##   1111       1   1       4   6    4     1    2     0   3    1
##   12         0   0       4   3    1     0    3     1   2    0
##   1321       2   0       2   2    3     1    1     1   2    1
##   32         0   0       6   3    1     1    3     7   3    0
##   87         0   0       1   3    0     3    4     0   1    5

5.2.2 Frequent Words

Frequent words help getting a rough idea of the reviews. The first plot is filtered for words with at least 3 letters, while the words in the second plot have to have at least 7 words.

# Frequent Words
library(rJava)
library(qdap)

ft3 <- freq_terms(awReviews$review_text, 25, 
                  at.least=3, 
                  stopwords=qdapDictionaries::Top25Words)
plot(ft3)

ft4 <- freq_terms(awReviews$review_text, 25, 
                  at.least=7, 
                  stopwords=qdapDictionaries::Top25Words)
plot(ft4)

5.3 Sentiment Analysis

5.3.1 Afinn

A sentiment of 1.74 results positive considering the range [-5:5]. This is an overall sentiment for the whole time frame of the reviews.

quickSent1 <- words1 %>%
  inner_join(get_sentiments("afinn")) %>% 
  summarize(n = nrow(.), sentSum = sum(value)) %>% 
  mutate(sentiment = sentSum / n)
quickSent1

## # A tibble: 1 x 3
##       n sentSum sentiment
##   <int>   <dbl>     <dbl>
## 1  5486    9557      1.74

5.3.2 Bing

The bing lexicon categorizes words in a binary fashion into positive and negative categories. The sentiment is 64% positive & 36% negative. In the plot the most common positive and negative words are displayed.

# Positive Negative using bing
quickPN1 <- words1 %>%
  inner_join(get_sentiments("bing")) %>% 
  count(sentiment) %>% 
  pivot_wider(values_from = n, names_from = sentiment) %>% 
  mutate(sentiment = positive - negative)
quickPN

## # A tibble: 1 x 3
##   negative positive sentiment
##      <int>    <int>     <int>
## 1     9987    18112      8125

# Frequent Words by sentiment
bingPN1 <- words1 %>%
  inner_join(get_sentiments("bing")) %>%
  count(word, sentiment, sort = TRUE) %>%
  ungroup()

# Plot
library(ggplot2)
bingPN1 %>%
  group_by(sentiment) %>%
  top_n(20) %>%
  ggplot(aes(reorder(word, n), n, fill = sentiment)) +
  geom_bar(alpha = 0.8, stat = "identity", show.legend = FALSE) +
  facet_wrap(~sentiment, scales = "free_y") +
  labs(y = "Contribution to sentiment", x = NULL) +
  coord_flip()

5.3.3 NRC

# NRC Sentiment
library(syuzhet)
nrcSent <- iconv(awReviews$review_text)

# get the emotions using the NRC dictionary
emotionsaw <- get_nrc_sentiment(nrcSent)
emaw = colSums(emotionsaw)
emawSum = data.frame(count=emaw, emotion=names(emaw))
# add percentage
emawSum$Perc <- (emawSum$count / sum(emawSum$count) * 100)
# factor and sort
emawSum$emotion = factor(emawSum$emotion, levels=emawSum$emotion[order(emawSum$Perc, decreasing = F)])

# Visualize the emotions from NRC sentiments
library(plotly)
plot_ly(emawSum, y=~emotion, x=~Perc, type="bar", color=~emotion, colors = "Set1",   
        orientation = 'h') %>%
         layout(showlegend=FALSE, yaxis=list(title=""), xaxis=list(title=""),
          title="NRC Sentiment - Apple Watch 6")

5.3.4 Customer Review Sentiment

In this section the lexicon hash_sentiment_huliu (Hu Liu, 2004) and hash_sentiment_jockers_rinker(2017) are used to calculate sentiment scores. Both lexicons were specifically developed for analyzing customer sentiment in reviews.

library(sentimentr)
library(lexicon)
library(magrittr)

# Split into words but keep the date
sentimentappleW <- data_frame(text = awReviews$review_text) %>% 
  group_by(awReviews$review_date) %>%
  unnest_tokens(tbl = ., output = word, input = text) 

colnames(sentimentappleW)[1] <- "date"

5.3.4.1 huliu

# Load lexicon huliu
sentimentHuliu1 <- hash_sentiment_huliu
colnames(sentimentHuliu1)[1] <- "word"
colnames(sentimentHuliu1)[2] <- "score"

# Get Overall sentiment
quickSent <- words1 %>%
  inner_join(sentimentHuliu1) %>% 
  summarize(n = nrow(.), sentSum = sum(score)) %>% 
  mutate(sentiment = sentSum / n)
quickSent

## # A tibble: 1 x 3
##       n sentSum sentiment
##   <int>   <dbl>     <dbl>
## 1  6030    3603     0.598

# Merge sentiment and call words by date
sentimentHuliu1 <- merge(sentimentappleW, sentimentHuliu1, by = 'word') 

# Group by date and sum scores
scoresHuliu1 <- sentimentHuliu1 %>% 
  group_by(date) %>% 
  na.omit() %>% 
  summarize(score = mean(score))
DT::datatable(scoresHuliu1)

5.3.4.2 jockers_rinker

# Load lexicon jockers_rinker
sentimentJR1 <- hash_sentiment_jockers_rinker
colnames(sentimentJR1)[1] <- "word"
colnames(sentimentJR1)[2] <- "score"

# Get Overall sentiment
quickSent <- words1 %>%
  inner_join(sentimentJR1) %>% 
  summarize(n = nrow(.), sentSum = sum(score)) %>% 
  mutate(sentiment = sentSum / n)
quickSent

## # A tibble: 1 x 3
##       n sentSum sentiment
##   <int>   <dbl>     <dbl>
## 1  8539   3141.     0.368

# Merge sentiment and call words by date
sentimentJR1 <- merge(sentimentappleW, sentimentJR1, by = 'word') 

# Group by date and sum scores
scoresJR1 <- sentimentJR1 %>% 
  group_by(date) %>% 
  na.omit() %>% 
  summarize(score = mean(score))
DT::datatable(scoresJR1)

5.3.4.3 Sentiments Plot

# Plot Sentiment over time
library(dplyr)
plotData1 <- scoresHuliu1 %>%
  select(date,score) %>%
  distinct() %>%
  right_join(scoresJR1, by = 'date')

colnames(plotData1)[2] <- "Huliu"
colnames(plotData1)[3] <- "JR"


library(reshape2)
library(ggplot2)
library(lubridate)
library(tidyquant)
library(scales)
library(plotly)
library(hrbrthemes)

plotData1 <- melt(plotData1 ,  id.vars = 'date', variable.name = 'score')

ts2 <- ggplot(plotData1, aes(date, value)) +
  geom_line(aes(colour = score)) +
  geom_smooth(aes(colour = score)) +
  scale_x_date(date_labels = "%m", date_breaks = "1 month") +
  labs(title = "Sentiment Scores - Apple Watch 6",
       x = "Month",
       y = "") +
  theme_ipsum() 

ts2 <- ggplotly(ts2)
ts2

5.4 Topic Models

### Topic Models
library(stm)

set.seed(1001)
holdoutRows1 <- sample(1:nrow(awReviews), 100, replace = FALSE)
appleWText <- textProcessor(documents = awReviews$review_text[-c(holdoutRows1)],
                          metadata = awReviews[-c(holdoutRows1), ], 
                          stem = FALSE) # not stemming this time

## Building corpus... 
## Converting to Lower Case... 
## Removing punctuation... 
## Removing stopwords... 
## Removing numbers... 
## Creating Output...

appleWPrep <- prepDocuments(documents = appleWText$documents, 
                          vocab = appleWText$vocab,
                          meta = appleWText$meta)

## Removing 1452 of 3009 terms (1452 of 21740 tokens) due to frequency 
## Removing 2 Documents with No Words 
## Your corpus now has 1262 documents, 1557 terms and 20288 tokens.

# Determine how many topics is best
kTest <- searchK(documents = appleWPrep$documents,
                 vocab = appleWPrep$vocab,
                 K = c(3, 4, 5, 10), verbose = FALSE)
plot(kTest)

# Plot topics
topics5 <- stm(documents = appleWPrep$documents,
               vocab = appleWPrep$vocab, seed = 1001,
               K = 5, verbose = FALSE)

plot(topics5)

# full table
labelTopics(topics5)

## Topic 1 Top Words:
##       Highest Prob: love, phone, buy, app, band, easy, issue 
##       FREX: love, buy, band, easy, read, receive, box 
##       Lift: pixel, accessible, accessory, agent, allergic, answer, apart 
##       Score: love, phone, band, buy, easy, app, issue 
## Topic 2 Top Words:
##       Highest Prob: get, feature, time, sleep, work, will, heart 
##       FREX: never, add, item, iphone, plus, unless, choose 
##       Lift: analysis, associate, buggy, careful, comprehensive, equipment, feminine 
##       Score: get, feature, sleep, time, heart, first, health 
## Topic 3 Top Words:
##       Highest Prob: battery, like, life, day, track, upgrade, oxygen 
##       FREX: battery, life, upgrade, oxygen, monitor, last, blood 
##       Lift: adjust, aps, around, auto, base, batter, behind 
##       Score: battery, life, blood, oxygen, fast, upgrade, like 
## Topic 4 Top Words:
##       Highest Prob: good, great, use, product, series, purchase, work 
##       FREX: good, great, product, gift, worth, price, perfect 
##       Lift: advantage, brother, decent, gray, interface, meet, tire 
##       Score: great, good, product, perfect, wife, price, series 
## Topic 5 Top Words:
##       Highest Prob: new, can, one, just, nice, need, really 
##       FREX: thing, think, now, expect, red, try, cant 
##       Lift: actually, advertisement, appear, appointment, asleep, aspect, assume 
##       Score: can, think, one, nice, thing, new, awesome

5.5 Relationships between Words

5.5.1 Bigram (n-gram token)

library(dplyr)
library(tidytext)
library(janeaustenr)

# group words together
aw_bigrams <- data_frame(text = awReviews$review_text) %>%
  unnest_tokens(bigram, text, token = "ngrams", n = 2)

# count and sort grouped words
aw_bigrams %>%
  count(bigram, sort = TRUE)

library(tidyr)
# separate a column into mutliple based on a delimiter
bigrams_separated <- aw_bigrams %>%
  separate(bigram, c("word1", "word2"), sep = " ")

# remove stopwords
bigrams_filtered <- bigrams_separated %>%
  filter(!word1 %in% stop_words$word) %>%
  filter(!word2 %in% stop_words$word)

# now we get a new & different bigram count
bigram_counts <- bigrams_filtered %>% 
  count(word1, word2, sort = TRUE)

# create new df to unite them too
bigrams_united <- bigrams_filtered %>%
  unite(bigram, word1, word2, sep = " ")

# Network graph
bigram_graph <- bigram_counts %>%
  filter(n > 15)

library(networkD3)
src <- bigram_graph$word1
target <- bigram_graph$word2
networkData <- data.frame(src, target)

# Plot
netApple <- simpleNetwork(networkData, fontSize = 12, zoom = TRUE, linkColour = "black", nodeColour = "navy", fontFamily = "serif", charge = -10)
netApple

5.6 Ratings vs Reviews (Stars)

Apple Watch 6´s ratings indicate a score of 4.8/5. The ratings for the reviews are on average 6% worse (4.5/5). Please click to expand the code and see the calculations.

### Reviews vs Stars

# Identify Reviews
table(awReviews$review_star)

## 
##    1    2    3    4    5 
##   99   33   60   96 1082

# Stars for ratings only 
# Total 39318 - 4.8/5

# 5 stars - 91% - 5 points: 4.55
# 4 stars - 6% - 4 points: 0.24
# 3 stars - 1%  - 3 points: 0.03
# 2 stars - 0% - 2 points: 0.00
# 1 star - 1% - 1 points: 0.01

# Stars for reviews only 
# Total 1370 - 4.48/5

# 5 1082    0.789781022 3.948905109
# 4 96  0.070072993 0.280291971
# 3 60  0.04379562  0.131386861
# 2 33  0.024087591 0.048175182
# 1 99  0.072262774 0.072262774

6 Weighted Scores

All scores were calculated in Excel. In the first table all inputs were gathered with weights applied to each. In the second table all scores are transformed to the same scale (0:1) and then getting multiplied by weights and added together.

library("readxl")
table1 <- read_excel("Score.xlsx")
table2 <- read_excel("Score.xlsx", sheet = 2)

library(gt)
table1 %>% gt() %>%
  tab_options(
    column_labels.font.size = "smaller",
    table.font.size = "smaller",
    data_row.padding = px(3)
  )

Product	Ratings	Reviews	Words	Dates	Days	StarRating (0:5)	StarReviews (0:5)	AFINN (-5:5)	Huliu (-2:1)	JockersRinker (-2:1)
Apple Watch 6	24338.00	1370.00	33780.00	9/25/2020 - 2/27/2021	155.00	4.80	4.4800	1.7420	0.59800	0.36800
Fitbit Inspire 2	13369.00	1470.00	39939.00	9/27/2020 - 2/25/2021	151.00	4.60	4.1000	1.4100	0.47700	0.29600
Amazon Halo	13617.00	2919.00	183377.00	12/14/2020 - 03/01/2021	80.00	3.70	3.0500	1.0350	0.29400	0.20600
Fitbit Charge 4	39318.00	2390.00	99770.00	4/15/2020 - 2/26/2021	317.00	4.50	3.1000	0.9030	0.22200	0.15500
Sum/Avg	90642.00	8149.00	356866.00	x	703.00	4.40	3.6825	1.2725	0.39775	0.25625
Weights	0.05	0.05	0.05	x	0.05	0.25	0.1000	0.2000	0.10000	0.15000

table2 %>% gt() %>%
  tab_header(
    title = "Final Scores") %>%
  fmt_number(columns = vars(Ratings), decimals = 2) %>%
  fmt_number(columns = vars(Reviews), decimals = 2) %>%
  fmt_number(columns = vars(Words), decimals = 2) %>%
  fmt_number(columns = vars(Days), decimals = 2) %>%
  fmt_number(columns = vars(StarRating), decimals = 2) %>%
  fmt_number(columns = vars(AFINN), decimals = 2) %>%
  fmt_number(columns = vars(Huliu), decimals = 2) %>%
  fmt_number(columns = vars(JockersRinker), decimals = 2) %>%
  fmt_number(columns = vars(Score), decimals = 2) %>%
  tab_options(
    column_labels.font.size = "smaller",
    table.font.size = "smaller",
    data_row.padding = px(3))

Final Scores

Product	Ratings	Reviews	Words	Days	StarRating	StarReviews	AFINN	Huliu	JockersRinker	Score	AdjRating	OrStarRating
Apple Watch 6	0.27	0.17	0.09	0.22	0.96	0.896	0.67	0.87	0.79	0.71	3.534649	4.8
Fitbit Inspire 2	0.15	0.18	0.11	0.21	0.92	0.820	0.64	0.82	0.77	0.67	3.351011	4.6
Amazon Halo	0.43	0.29	0.28	0.45	0.90	0.620	0.59	0.74	0.72	0.66	3.298348	3.7
Fitbit Charge 4	0.15	0.36	0.51	0.11	0.74	0.610	0.60	0.76	0.74	0.61	3.050917	4.5
x	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA
Weights (%)	5.00	5.00	5.00	5.00	15.00	15.000	20.00	15.00	15.00	0.66	3.308731	4.4

Text Analysis: Fitness Trackers

David Springhetti

1 Summary

1.1 Sentiment Plots

1.1.1 NRC

1.1.2 Huliu

1.1.3 JR

1.1.4 Halo Reviews vs Tweets: Sentiment

1.2 Networks & Scores

1.2.1 Network: Halo Reviews

1.2.2 Network: Halo Tweets

1.2.3 Scores

2 Amazon Halo

2.1 Data & Cleaning

2.2 Initial Text Analysis

2.2.1 Document Term Matrix

2.2.2 Frequent Words

2.3 Sentiment Analysis

2.3.1 Afinn

2.3.2 Bing

2.3.3 NRC

2.3.4 Customer Review Sentiment

2.3.4.1 huliu

2.3.4.2 jockers_rinker

2.3.4.3 Sentiment Plots

2.3.5 Twitter Sentiment Halo

2.4 Topic Models

2.5 Relationships between Words

2.5.1 Bigram Reviews

2.5.2 Bigram Twitter

2.6 Ratings vs Reviews (Stars)

3 Fitbit Inspire 2

3.1 Data & Cleaning

3.2 Initial Text Analysis

3.2.1 Document Term Matrix

3.2.2 Frequent Words

3.3 Sentiment Analysis

3.3.1 Afinn

3.3.2 Bing

3.3.3 NRC

3.3.4 Customer Review Sentiment

3.3.4.1 huliu

3.3.4.2 jockers_rinker

3.3.4.3 Sentiments Plot

3.4 Topic Models

3.5 Relationships between Words

3.5.1 Bigram (n-gram token)

3.6 Ratings vs Reviews (Stars)

4 Fitbit Charge 4

4.1 Data & Cleaning

4.2 Initial Text Analysis

4.2.1 Document Term Matrix

4.2.2 Frequent Words

4.3 Sentiment Analysis

4.3.1 Afinn

4.3.2 Bing

4.3.3 NRC

4.3.4 Customer Review Sentiment

4.3.4.1 huliu

4.3.4.2 jockers_rinker

4.3.4.3 Sentiments Plot

4.4 Topic Models

4.5 Relationships between Words

4.5.1 Bigram (n-gram token)

4.6 Ratings vs Reviews (Stars)

5 Apple Watch 6

5.1 Data & Cleaning

5.2 Initial Text Analysis

5.2.1 Document Term Matrix

5.2.2 Frequent Words

5.3 Sentiment Analysis

5.3.1 Afinn

5.3.2 Bing

5.3.3 NRC

5.3.4 Customer Review Sentiment

5.3.4.1 huliu

5.3.4.2 jockers_rinker

5.3.4.3 Sentiments Plot

5.4 Topic Models

5.5 Relationships between Words