class: center, middle, inverse, title-slide

.title[
# ANALYZING SENTIMENTS ON AVOCADO 🥑 USING TEXT MINING - 2023 WDSI conference
]
.author[
### Dr. Zhenning ‘Jimmy’ Xu; Dr. Di Wu, CSUB
]
.date[
### 2023/04/04
]

---
background-image: url(https://upload.wikimedia.org/wikipedia/commons/b/be/Sharingan_triple.svg)

???

Image credit: [Wikimedia Commons](https://commons.wikimedia.org/wiki/File:Sharingan_triple.svg)

---
class: center, middle

# xaringan

### /ʃaː.'riŋ.ɡan/

---
class: inverse, center, middle

# Get Started

---

# Outline

- A simple search for "twitter data" returns 80,000+ results on Google Scholar.
- Avocado 🥑 is a popular fruit known for its creamy texture and its use in a wide variety of dishes.
- Social media, especially Twitter, has become a rich source of information and opinions on many topics, including avocados 🥑.
- Objective
  - Explore people's opinions, emotions, and attitudes toward avocados.

---
background-image: url(https://github.com/yihui/xaringan/releases/download/v0.0.2/karl-moustache.jpg)
background-position: 50% 50%
class: center, bottom, inverse

# You only live once!

---

# Introduction

- Social media data has been applied in a variety of contexts.
- 80% of data will be unstructured in five years (King, 2019).
- Businesses in the food industry, policymakers, and researchers are interested in understanding consumer perceptions of and sentiments toward avocados 🥑.
- This study may offer insights for food marketing promotions or innovations in avocado 🥑 retailing.

---

# Literature review

- Some elements of the Twitter conversation might be useful for improving product (or service) design (Wu et al., 2019).
- Social media provides companies with four advantages: sharing their expertise and knowledge, gaining customer insights, enabling customers to help one another, and engaging prospective customers (Xu et al., 2020).
- When it comes to text data analysis, we are often not sure what we are looking for.
- Text Mining with R: A Tidy Approach (Silge and Robinson, 2017)

---

# Method & Data

- R allows us to scrape unstructured data using different APIs.
- R offers a variety of open-source libraries and packages for analyzing unstructured data.
- Procedure
  - data scraping -> data cleaning -> data analysis
- Beware of the restrictions and ethical issues of the Twitter API.
- Two types of Twitter analytics are commonly used to analyze collected tweets: descriptive analytics and content analytics.
- Using these two methods, we collected and analyzed about 110,000 tweets that contain the search query “avocado” 🥑.
- LDA (Latent Dirichlet Allocation) is an unsupervised machine learning method that models each document as a mixture of topics (Blei, Ng, and Jordan, 2003).

---

# Co-occurrence analysis - an example

.center[<img src="https://raw.githubusercontent.com/utjimmyx/resources/master/frequency.PNG" width='100%' align="middle"/>]

---

# Co-occurrence analysis of Tweets on the search query “avocado”

.center[<img src="https://raw.githubusercontent.com/utjimmyx/resources/master/Cooccurence.PNG" width='100%' align="middle"/>]

---

# Topic modeling using LDA

.center[<img src="https://raw.githubusercontent.com/utjimmyx/resources/master/Topics.PNG" width='100%' align="middle"/>]

---

# Conclusions & limitations

- A word co-occurrence analysis shows that “avocado” frequently appears alongside words such as “toast”, “cheese”, and “sandwich”.
- The results show two distinct clusters in the network. The first cluster is primarily about the most popular recipes for avocado dishes, and the second is primarily about how the food is prepared.
- The research provides useful findings about which recipes people choose when incorporating avocados into food preparation.
- The LDA analysis identifies nine topics. People primarily discuss avocado recipes in topics 2-5.
  The remaining topics cover other subjects, such as holiday celebrations.
- It might be interesting to compare consumer opinions toward avocados 🥑 vs. other fruits and vegetables using Twitter data.

---

## Thank you all for your participation!

### Questions

### References

- Text Mining with R: A Tidy Approach (Silge and Robinson, 2017). Open access at tidytextmining.com
- R for Data Science (Wickham and Grolemund, 2021). Open access at https://r4ds.had.co.nz
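
---

# Appendix: a co-occurrence sketch in R

The cleaning and co-occurrence counting steps described in the Method & Data slide can be approximated in a few lines of base R. This is a minimal sketch under stated assumptions, not the code used in the study: the three toy tweets, the `stop_words` vector, and the `tokenize()` helper are all hypothetical and exist only for illustration.

```r
# Hypothetical toy tweets (not data from the study)
tweets <- c(
  "Avocado toast with cheese for breakfast",
  "Best avocado sandwich ever",
  "Avocado toast and a sandwich for lunch"
)

# Hypothetical, deliberately tiny stop-word list
stop_words <- c("with", "for", "and", "the")

# Data cleaning: lowercase, keep letters only, then drop
# very short words, stop words, and within-tweet duplicates
tokenize <- function(text) {
  text <- gsub("[^a-z]+", " ", tolower(text))
  words <- strsplit(trimws(text), " ", fixed = TRUE)[[1]]
  unique(words[nchar(words) > 2 & !(words %in% stop_words)])
}

# Data analysis: list every unordered word pair within each tweet
pairs <- unlist(lapply(tweets, function(tweet) {
  words <- sort(tokenize(tweet))
  if (length(words) < 2) return(character(0))
  combn(words, 2, FUN = paste, collapse = " + ")
}))

# Count how many tweets contain each pair, most frequent first
pair_counts <- sort(table(pairs), decreasing = TRUE)
print(head(pair_counts, 5))
```

Pairs with the highest counts would become the heaviest edges in a co-occurrence network. Sorting the words before pairing makes "avocado + toast" and "toast + avocado" count as the same pair.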