WQD7001 Principle of Data Science
Group Assignment: Develop ShinyApps
| Member's Name | Student ID |
|---|---|
| Liow Wei Jie | S2016012 |
| Muhammad Umair | S2001767 |
| Teo Boon Long | 17198093 |
| Kong Mun Yeen | 17055182 |
Most common rating systems show only an average rating on a scale from 1 to 5. This summary is too general: a buyer who wants to know what reviewers actually said about a product has to browse through the comment section one review at a time.
The purpose of this app is therefore to help buyers quickly browse a summary of the reviews and gain a more detailed understanding of the product through word clouds. The sentiment analysis generates two word clouds, one for positive reviews and one for negative reviews.
Positive and negative reviews are differentiated by the product rating. A rating of 3 or higher is considered a good review, while a rating below 3 is considered a bad review. Each review is then reduced to the keywords that best represent it.
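The rating rule above can be sketched in base R as follows. The data frame `reviews` and its rows are made up for illustration; only the column names `overall` and `summary` are taken from the app's code.

```r
# Toy data; the rows are illustrative and not taken from the Amazon dataset.
reviews <- data.frame(
  overall = c(5, 4, 3, 2, 1),
  summary = c("works great", "good value", "does the job",
              "stopped working", "broke quickly"),
  stringsAsFactors = FALSE
)

# Ratings of 3 or higher count as positive, ratings below 3 as negative.
reviews$sentiment <- ifelse(reviews$overall >= 3, "positive", "negative")

good_reviews <- reviews[reviews$sentiment == "positive", ]
bad_reviews  <- reviews[reviews$sentiment == "negative", ]
```

Note that under this rule a neutral 3-star review lands in the positive word cloud.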
Below is sample code that retrieves the keywords from the reviews as the data source for the word cloud.
library(dplyr)  # %>%, filter(), select()
library(tm)     # Corpus(), tm_map(), stopwords()
library(RWeka)  # NGramTokenizer(), Weka_control()
good_t_w <- t %>% filter(title == input$ProductSearch) %>% filter(overall >= 3) %>% select(summary)
mycorpus <- Corpus(VectorSource(good_t_w$summary))  # corpus of the positive review summaries
mycorpus <- tm_map(mycorpus, content_transformer(tolower))
mycorpus <- tm_map(mycorpus, removeNumbers)
mycorpus <- tm_map(mycorpus, removeWords, stopwords("english"))
mycorpus <- tm_map(mycorpus, removePunctuation)
mycorpus <- tm_map(mycorpus, stripWhitespace)
mycorpus <- tm_map(mycorpus, removeWords, c("one", "two", "three", "four", "five", "star", "stars"))
token_delim <- " \\t\\r\\n.!?,;\"()"
bitoken <- NGramTokenizer(mycorpus, Weka_control(min = 2, max = 2, delimiters = token_delim))  # two-word phrases
two_word <- data.frame(table(bitoken))
sort_two <- two_word[order(two_word$Freq, decreasing = TRUE), ]  # most frequent phrases first
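To make the bigram counting step easier to follow, here is a minimal sketch that uses base R only in place of `tm` and `RWeka`. It is not the app's implementation: the helper `count_bigrams` and the sample summaries are hypothetical, and the sketch skips stop-word removal.

```r
# Count two-word phrases (bigrams) in a character vector using base R only.
count_bigrams <- function(texts) {
  # Lower-case, then replace anything that is not a letter or a space.
  cleaned <- gsub("[^a-z ]+", " ", tolower(texts))
  bigrams <- unlist(lapply(strsplit(cleaned, " +"), function(words) {
    words <- words[nzchar(words)]
    if (length(words) < 2) return(character(0))
    # Pair each word with the word that follows it.
    paste(head(words, -1), tail(words, -1))
  }))
  counts <- as.data.frame(table(bitoken = bigrams), stringsAsFactors = FALSE)
  # Most frequent phrases first, matching the sort used for the word cloud.
  counts[order(counts$Freq, decreasing = TRUE), ]
}

# Illustrative input, not taken from the dataset.
sample_summaries <- c("Works great", "works great so far", "Great value")
sort_two <- count_bigrams(sample_summaries)
```

The result has the same two columns, `bitoken` and `Freq`, that the word cloud code reads.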
The filtered keywords are then fed, as the data source, into a render plot that generates the word cloud.
output$good_review <- renderPlot({
  x <- datasource()$bitoken
  y <- datasource()$Freq
  minfreq_bigram <- 2
  wordcloud(x, y, random.order = FALSE, scale = c(3, 0.5), min.freq = minfreq_bigram, max.words = 50, colors = brewer.pal(8, "Dark2"))
})
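The `datasource()` reactive used above is not shown in this write-up. The sketch below shows one way such a reactive could connect a keyword table to the plot. The data frame `bigram_counts` and its values are hypothetical stand-ins, and the app may build its data source differently.

```r
library(shiny)
library(wordcloud)     # wordcloud()
library(RColorBrewer)  # brewer.pal()

# Hypothetical pre-computed bigram counts; the values are illustrative only.
bigram_counts <- data.frame(
  title   = c("Product A", "Product A", "Product B"),
  bitoken = c("works great", "great value", "stopped working"),
  Freq    = c(5, 3, 4),
  stringsAsFactors = FALSE
)

server <- function(input, output, session) {
  # Re-evaluated only when the selected product changes.
  datasource <- reactive({
    req(input$ProductSearch)
    bigram_counts[bigram_counts$title == input$ProductSearch,
                  c("bitoken", "Freq")]
  })

  # Redrawn whenever datasource() returns a new table.
  output$good_review <- renderPlot({
    d <- datasource()
    wordcloud(d$bitoken, d$Freq, random.order = FALSE, scale = c(3, 0.5),
              min.freq = 2, max.words = 50, colors = brewer.pal(8, "Dark2"))
  })
}
```

Keeping the filtering in a reactive means the plot code only reads the result, so the same pattern can serve both the positive and the negative word cloud.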
Because of limited processing power, only product data in the Appliances category was selected for this app. Future work could include all product categories in the Amazon dataset.
Link to shinyApps: https://group-a-pds.shinyapps.io/shiny-ui/#!/
Link to GitHub:link{target=“_blank”}