This analysis applies sentiment analysis, a natural language processing (NLP) technique, to iPhone 14 customer reviews in order to gauge overall customer satisfaction and identify potential areas for improvement.
The dataset comes from Kaggle: iPhone 14 Customer Reviews.
This dataset contains reviews of the iPhone 14 from various customers. Each entry in the dataset includes information such as the title of the review, the rating given by the customer, the detailed review text, the name of the customer, the date when the review was posted, and the location of the customer. These reviews offer insights into customer opinions, satisfaction levels, and preferences regarding the iPhone 14.
Column Descriptions:
Title: Title or headline of the review provided by the customer. It summarizes the main point or sentiment expressed in the review.
Rating: Numerical rating given by the customer for the iPhone 14. Ratings typically range from 1 to 5, with 5 indicating the highest satisfaction level.
Review: Detailed text of the review where the customer shares their experiences, opinions, and feedback about the iPhone 14.
Customer Name: Name or identifier of the customer who wrote the review. It helps in tracking individual reviewers and analyzing their feedback.
Dates: Date when the review was posted by the customer. It provides temporal information for analyzing trends and changes in customer sentiment over time.
Customer Location: Location of the customer who wrote the review.
iphone_14 <- read.csv(file.choose())
# Take a look at the dataset
library(dplyr)
library(knitr)
# str() prints the structure itself and returns NULL, so it is not piped into kable()
str(iphone_14)
## 'data.frame': 1024 obs. of 6 variables:
## $ title : chr "Terrific" "Fabulous!" "Great product" "Just wow!" ...
## $ rating : num 5 5 5 5 4 5 5 5 4 5 ...
## $ review : chr "I bought iPhone 14 in big billion days. Very happy. Excellent Product deliveryExcellent hapticsExcellent Perfor"| __truncated__ "Best smart phone under this price range compare to other phones in 2023 if you see overall build quality, perfo"| __truncated__ "Nice camera but battery drain fast specially on video recordingREAD MORE" "GoodREAD MORE" ...
## $ customer_name : chr "Sathvick Kumaran" "Rahul Prasad " "Tara singh mehra" "Avi Nash" ...
## $ dates : chr "4 months ago" "Jan, 2023" "11 months ago" "Feb, 2023" ...
## $ customer_location: chr " The Nilgiris District" " Debipur" " Ramnagar" " Bengaluru" ...
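The table that follows reads like a per-column count of missing values: `rating` shows 72, which matches the 1024 rows minus the 952 ratings tabulated later. The code that produced it is not shown, so the following is only a sketch of one way to get such counts. It uses a small hypothetical data frame (`toy_reviews`) in place of `iphone_14`.

```r
# Hypothetical stand-in for iphone_14; the values are made up for illustration
toy_reviews <- data.frame(
  title  = c("Terrific", "Fabulous!", "Just wow!"),
  rating = c(5, NA, 4),
  stringsAsFactors = FALSE
)

# is.na() marks missing cells; colSums() counts them per column
missing_counts <- colSums(is.na(toy_reviews))
missing_counts
```

Applied to the real data, the same call would be `colSums(is.na(iphone_14))`.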
Column | Missing values |
---|---|
title | 0 |
rating | 72 |
review | 0 |
customer_name | 0 |
dates | 0 |
customer_location | 0 |
iphone_14$customer_location %>%
table() %>%
as.data.frame() %>%
rename(Location = ".") %>%
dplyr::arrange(desc(Freq)) %>%
head(20) %>%
kable()
Location | Freq |
---|---|
New Delhi | 50 |
Bengaluru | 47 |
Hyderabad | 21 |
Kolkata | 20 |
Mumbai | 19 |
Ahmedabad | 16 |
Bhubaneswar | 14 |
Lucknow | 13 |
Chennai | 11 |
Patna | 11 |
Pune | 11 |
Indore | 10 |
Jaipur | 10 |
Nagpur | 10 |
Ghaziabad | 9 |
Bangalore | 8 |
Bharuch | 8 |
Gaya | 8 |
Gurugram | 8 |
Guwahati | 8 |
New Delhi (50) and Bengaluru (47) account for the most reviews in this dataset.
Based on these counts, Apple could consider expanding its retail presence in these locations to reach more customers and increase sales.
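The `str()` output shows location strings with a leading space, and the frequency table lists both Bengaluru and Bangalore, which are two names for the same city. A minimal sketch of cleaning the labels before counting, using hypothetical values rather than the real column:

```r
# Hypothetical location values mimicking the leading spaces and the
# Bangalore/Bengaluru split seen in the output above
locations <- c(" Bengaluru", " Bangalore", " New Delhi", "Bengaluru ")

# trimws() drops leading and trailing whitespace
clean <- trimws(locations)

# Map the older spelling onto the current one (a choice made for this sketch)
clean[clean == "Bangalore"] <- "Bengaluru"

# Count the cleaned labels, most frequent first
location_counts <- sort(table(clean), decreasing = TRUE)
location_counts
```

With the counts in the table above, merging the two spellings would give Bengaluru 47 + 8 = 55 reviews, ahead of New Delhi.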
iphone_14$rating %>%
table() %>%
as.data.frame() %>%
rename(`Rating Score` = ".", Frequency = "Freq") %>%
dplyr::arrange(desc(Frequency)) %>%
kable()
Rating Score | Frequency |
---|---|
5 | 748 |
4 | 168 |
3 | 36 |
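To put these counts in proportion, the shares can be computed directly from the table. This is a small worked example using the three counts shown, not part of the original analysis.

```r
# Counts copied from the rating table above
rating_counts <- c(`5` = 748, `4` = 168, `3` = 36)

# Share of each score among the non-missing ratings, in percent
rating_share <- round(100 * rating_counts / sum(rating_counts), 1)
rating_share
```

The three counts sum to 952, so roughly 79% of the rated reviews give the top score.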
library(tm)
# Create a corpus
corpus <- VCorpus(VectorSource(iphone_14$title))
# Text cleaning and preprocessing
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
corpus <- tm_map(corpus, removeWords, stopwords("english"))
# Create a document-term matrix (documents as rows, terms as columns)
dtm <- DocumentTermMatrix(corpus)
library(wordcloud)
invisible(wordcloud(corpus))
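The document-term matrix is built above but not used further in this section. One common use, shown here as a sketch on hypothetical titles rather than the real data, is summing its columns to get the term frequencies that a word cloud visualizes.

```r
library(tm)

# Hypothetical review titles standing in for iphone_14$title
titles <- c("Great product", "Terrific", "Great product", "Just wow!")

# Same preprocessing idea as above: lowercase and strip punctuation
corpus <- VCorpus(VectorSource(titles))
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)

dtm <- DocumentTermMatrix(corpus)

# Column sums give how often each term occurs across all titles
term_freq <- sort(colSums(as.matrix(dtm)), decreasing = TRUE)
head(term_freq)
```

Converting to a dense matrix with `as.matrix()` is fine for about a thousand short titles; a much larger corpus would call for a sparse-aware approach.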
library(conflicted)
library(tidytext)
library(tidyverse)
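# Split titles into words, keep words found in the Bing lexicon, and count by sentiment
sentiment_scores <- iphone_14 %>%
unnest_tokens(word, title) %>%
inner_join(get_sentiments("bing"), by = "word") %>%
count(sentiment) %>%
pivot_wider(names_from = sentiment, values_from = n, values_fill = 0)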
sentiment_scores %>%
kable()
negative | positive |
---|---|
4 | 862 |
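The counts above come from matching individual title words against a sentiment lexicon, so words that are not in the lexicon contribute nothing. The sketch below illustrates that mechanism with hypothetical titles and a hypothetical two-word lexicon (`toy_lexicon`), not the Bing lexicon itself.

```r
library(dplyr)
library(tidyr)
library(tidytext)

# Hypothetical titles and lexicon, made up for illustration
titles <- tibble(id = 1:3,
                 title = c("Great product", "Worst purchase", "Great value"))
toy_lexicon <- tibble(word = c("great", "worst"),
                      sentiment = c("positive", "negative"))

toy_scores <- titles %>%
  unnest_tokens(word, title) %>%            # one lowercase token per row
  inner_join(toy_lexicon, by = "word") %>%  # keep only tokens found in the lexicon
  count(sentiment) %>%                      # tally matched tokens by label
  pivot_wider(names_from = sentiment, values_from = n, values_fill = 0)
toy_scores
```

Here "product", "purchase", and "value" are dropped by the join, so only three of the six tokens are scored. The same effect applies to the real titles.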