2/25/2020
iPhone 11 Sentiment Analysis
Business Context:
- Before buying new gadgets, prospective customers read product reviews and feedback posted by experts and other users on social media.
Problem Description:
- This is an analysis of tech reviews and customer feedback on Twitter, intended to reveal what customers seek from Apple’s iPhone series.
- It also aims to identify the areas of concern that need more focus.
Data Collection
Recap Of Proposed Data Analytics Plan
- Use topic modelling techniques to create topics based on the co-occurrence of words in the documents.
- Use text mining and sentiment analysis.
- Use visualization techniques to demonstrate distribution of sentiments, frequency of the tweets, frequent terms in the tweets and geography-wise tweet analysis.
Summary Of Peer Comments
Peer Comments(Contd.)
Data Summary
- The data has been uploaded at https://drive.google.com/uc?export=download&id=1kOlxzs8Bq9WzXl8UlZDD8mOrTE9pyq6s
- The data consists of the following columns: Sno, user_id, created_at, screen_name, text, source, is_retweet, favorite_count, retweet_count, reply_count, hashtags, place_full_name, place_type, country, country_description, followers_count, account_lang
Data Exploration
- Frequency of tweets based on the source attributes
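A frequency count like this can be sketched with the standard library. The rows below are hypothetical samples, not values from the data set; only the column names `source` and `country` come from the Data Summary, and `frequency_by` is an illustrative helper name.

```python
from collections import Counter

# Hypothetical sample rows; the real data set has many more columns and rows.
tweets = [
    {"source": "Twitter for iPhone", "country": "United States"},
    {"source": "Twitter for Android", "country": "India"},
    {"source": "Twitter for iPhone", "country": "India"},
    {"source": "Twitter Web App", "country": ""},
]


def frequency_by(rows, column):
    """Count rows per distinct non-empty value of `column`, most frequent first."""
    counts = Counter(row[column] for row in rows if row.get(column))
    return counts.most_common()


for value, count in frequency_by(tweets, "source"):
    print(f"{value}: {count}")
```

The same helper could be pointed at `country` or `place_full_name` for a location-based count, skipping rows where the field is empty.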

Data Exploration
- Frequency of tweets based on user location

Data Exploration
- Day-to-day trend of tweets regarding the iPhone 11 series
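A daily trend amounts to grouping tweets by the date part of `created_at`. The sketch below assumes timestamps shaped like `2019-09-20 14:03:11`; the actual format in the data set is not stated here, so the `fmt` argument would need adjusting. The sample timestamps and the name `tweets_per_day` are illustrative.

```python
from collections import Counter
from datetime import datetime


def tweets_per_day(timestamps, fmt="%Y-%m-%d %H:%M:%S"):
    """Return (ISO date, tweet count) pairs in chronological order.

    `fmt` is an assumed timestamp format, not one confirmed by the data set.
    """
    days = Counter(datetime.strptime(ts, fmt).date() for ts in timestamps)
    return [(day.isoformat(), count) for day, count in sorted(days.items())]


# Hypothetical created_at values.
sample = [
    "2019-09-21 08:15:00",
    "2019-09-20 14:03:11",
    "2019-09-20 23:59:59",
]

for day, count in tweets_per_day(sample):
    print(day, count)
```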

NLP Procedure
- Because the data is extracted from Twitter, the tweets carry no predefined sentiment labels, so sentiment analysis is performed to derive them.
- Without labels, supervised models built on training and validation data sets are not an option, so topic modelling techniques are applied to the tweets.
- The Latent Dirichlet Allocation (LDA) topic modelling technique has been used in the analysis.
- Steps involved: creating a corpus, tokenization, removing stop words, removing numbers, removing rare words, finding unigrams and bigrams, finding correlation between words, generating a document-term matrix, and computing tf-idf.
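Most of the steps listed above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the pipeline used in the analysis: the stop-word list is a tiny hypothetical one, the sample tweets are invented, word correlation is left out, and the tf-idf weighting shown (term share of the document times log(N / document frequency)) is only one common variant.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not the list used in the analysis).
STOP_WORDS = {"the", "a", "is", "and", "of", "to", "my", "on", "it", "but"}


def tokenize(text):
    """Lowercase, keep runs of letters only (digits are dropped), remove stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def bigrams(tokens):
    """Pairs of adjacent tokens."""
    return list(zip(tokens, tokens[1:]))


def build_dtm(docs_tokens, min_doc_freq=1):
    """Document-term matrix of raw counts; terms in fewer than min_doc_freq docs are dropped."""
    doc_freq = Counter(term for tokens in docs_tokens for term in set(tokens))
    vocab = sorted(t for t, df in doc_freq.items() if df >= min_doc_freq)
    index = {term: i for i, term in enumerate(vocab)}
    matrix = []
    for tokens in docs_tokens:
        row = [0] * len(vocab)
        for term in tokens:
            if term in index:
                row[index[term]] += 1
        matrix.append(row)
    return vocab, matrix


def tf_idf(matrix):
    """Weight each count by (count / row total) * log(n_docs / document frequency)."""
    n_docs = len(matrix)
    n_terms = len(matrix[0]) if matrix else 0
    doc_freq = [sum(1 for row in matrix if row[j] > 0) for j in range(n_terms)]
    weighted = []
    for row in matrix:
        total = sum(row) or 1
        weighted.append([
            (c / total) * math.log(n_docs / doc_freq[j]) if c else 0.0
            for j, c in enumerate(row)
        ])
    return weighted


# Hypothetical tweets.
docs = [
    "The battery life of my iPhone 11 is great",
    "iPhone 11 camera is great but the battery drains",
    "Love the new camera on the iPhone 11 Pro",
]
tokens = [tokenize(d) for d in docs]
vocab, dtm = build_dtm(tokens, min_doc_freq=2)  # drop words seen in only one tweet
weights = tf_idf(dtm)
print(vocab)
print(dtm)
```

A term that appears in every document, such as "iphone" here, gets a tf-idf weight of zero under this variant, which is why tf-idf helps surface the more distinctive words.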
Word Cloud of negative and positive words

Frequency of various sentiments gathered through these tweets
Top 10 frequent terms for each sentiment
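One simple way to obtain per-sentiment term counts is a lexicon lookup: each token is matched against a word list that maps words to sentiments. The sketch below uses a tiny hypothetical lexicon and invented tokens; the lexicon and tooling actually used in the analysis are not stated in these slides, so treat this as an illustration of the idea only.

```python
from collections import Counter, defaultdict

# Hypothetical mini-lexicon; a real analysis would use a published sentiment lexicon.
LEXICON = {
    "great": "positive",
    "love": "positive",
    "amazing": "positive",
    "bad": "negative",
    "drains": "negative",
    "expensive": "negative",
}


def term_counts_by_sentiment(docs_tokens, lexicon):
    """Map each sentiment to a Counter of the lexicon terms found in the tokens."""
    counts = defaultdict(Counter)
    for tokens in docs_tokens:
        for token in tokens:
            sentiment = lexicon.get(token)
            if sentiment is not None:
                counts[sentiment][token] += 1
    return counts


def top_terms(counts, n=10):
    """Top-n terms per sentiment, most frequent first."""
    return {sentiment: c.most_common(n) for sentiment, c in counts.items()}


def sentiment_frequency(counts):
    """Total number of matched terms per sentiment."""
    return {sentiment: sum(c.values()) for sentiment, c in counts.items()}


# Hypothetical tokenized tweets.
tokenized = [
    ["love", "camera", "great"],
    ["battery", "drains", "expensive"],
    ["great", "screen"],
]
counts = term_counts_by_sentiment(tokenized, LEXICON)
print(top_terms(counts, n=10))
print(sentiment_frequency(counts))
```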

5 topic LDA model
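To show what fitting an LDA model involves, here is a minimal pure-Python sketch using collapsed Gibbs sampling. This is an assumption for illustration: the slides do not say which inference method or library produced the 5-topic model, and the hyperparameters `alpha` and `beta`, the iteration count, and the sample documents are all hypothetical. A real analysis would normally rely on an established library.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling on tokenized documents.

    Returns (doc_topic, topic_word): per-document topic counts and
    per-topic word counts after the final sweep.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Random initial topic for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic, topic_word


def top_topic_terms(topic_word, n=5):
    """Most frequent words assigned to each topic (zero counts skipped)."""
    return [[w for w, c in tw.most_common(n) if c > 0] for tw in topic_word]


# Hypothetical tokenized tweets.
docs = [
    ["battery", "life", "battery", "charge"],
    ["camera", "photo", "camera", "lens"],
    ["battery", "charge", "life"],
    ["photo", "lens", "camera"],
]
doc_topic, topic_word = lda_gibbs(docs, num_topics=2, iterations=50)
print(top_topic_terms(topic_word, n=3))
```

For the 5-topic model, `num_topics=5` would be passed instead; on a corpus this small the topics are not meaningful, so the example only demonstrates the mechanics.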

Key Take-aways
- NLP has become increasingly popular over the past few years, and NLP researchers have produced valuable insights
- The Natural Language Toolkit (NLTK) is one of the most popular Python libraries for NLP
- Regular Expressions are an important part of NLP, which can be used for pattern matching and filtering
- Common feature engineering techniques are removing stop words, stemming, lemmatization, and n-grams
- How you clean and preprocess your data will have a major effect on the conclusions you’ll be able to draw in your NLP classification problems
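As an illustration of the regular-expression point above, tweet text can be cleaned with a handful of patterns before tokenization. The patterns and the sample tweet are assumptions chosen for this sketch, not the cleaning rules used in the analysis.

```python
import re

URL_RE = re.compile(r"https?://\S+")      # links
MENTION_RE = re.compile(r"@\w+")          # @user mentions
RETWEET_RE = re.compile(r"^RT\s+")        # leading retweet marker
NON_ALNUM_RE = re.compile(r"[^A-Za-z0-9\s]")
SPACES_RE = re.compile(r"\s+")


def clean_tweet(text):
    """Strip retweet markers, URLs, mentions and punctuation; keep hashtag words."""
    text = RETWEET_RE.sub("", text)
    text = URL_RE.sub("", text)
    text = MENTION_RE.sub("", text)
    text = text.replace("#", "")           # keep the hashtag word, drop the symbol
    text = NON_ALNUM_RE.sub(" ", text)
    return SPACES_RE.sub(" ", text).strip().lower()


# Hypothetical tweet.
print(clean_tweet("RT @apple: Loving the #iPhone11 camera! https://t.co/abc123"))
```

Whether to keep hashtag words, digits, or emoji is a design choice that changes the resulting vocabulary, which is one concrete way preprocessing affects the conclusions.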