Sentiment Analysis 1

Author

Lindsey Utterback

General Question

Overall, I hope to explore the sentiment of reviews for the movies Forrest Gump and The Matrix. I have watched both movies and collected review data on them from IMDb. Both appear on IMDb's Top 250 movies list, so I am interested in exploring whether their reviews match that ranking.

Data Collection

I intend to collect review data from the IMDb pages of Forrest Gump and The Matrix to determine how viewers feel about these two movies. This data may provide evidence on how IMDb scores its Top 250 movies and whether that score relates at all to the reviews the movies receive.

Question: Do The Matrix reviews or Forrest Gump reviews tend to use more positive words according to the Bing lexicon, and which words should be excluded from the analysis?

Interpretation: Four words were excluded from the analysis: "fiction", which is neither positive nor negative; "funny", which is not a negative sentiment; "miss", which is not used in an “I miss you” context; and "plot", which does not necessarily carry a negative sentiment. With these exclusions, The Matrix has a higher positive sentiment count and Forrest Gump has a higher negative sentiment count. In terms of Bing sentiment, The Matrix reviews are more positive overall.
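The counting step described above can be illustrated with a minimal sketch. It is written in Python for illustration and is not the code used for this report. The lexicon `TOY_LEXICON`, the helper names `tokenize` and `sentiment_counts`, and the example reviews are all hypothetical; the real Bing lexicon is much larger, and the labels given here to the four excluded words are assumptions based on the interpretation above.

```python
import re
from collections import Counter

# Toy stand-in for the Bing lexicon (assumption: a handful of illustrative
# entries; the real lexicon is far larger).
TOY_LEXICON = {
    "amazing": "positive",
    "brilliant": "positive",
    "love": "positive",
    "boring": "negative",
    "slow": "negative",
    "funny": "negative",
    "miss": "negative",
    "plot": "negative",
    "fiction": "negative",
}

# Words the report excludes before counting.
EXCLUDED = {"fiction", "funny", "miss", "plot"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment_counts(reviews, lexicon=TOY_LEXICON, excluded=EXCLUDED):
    """Count positive and negative lexicon words, skipping excluded words."""
    counts = Counter({"positive": 0, "negative": 0})
    for review in reviews:
        for word in tokenize(review):
            if word in excluded:
                continue
            label = lexicon.get(word)
            if label is not None:
                counts[label] += 1
    return counts


# Made-up example reviews, not taken from the IMDb data.
reviews_by_movie = {
    "The Matrix": ["Brilliant science fiction with an amazing plot."],
    "Forrest Gump": ["Funny and slow in parts, but I love it."],
}

for movie, reviews in reviews_by_movie.items():
    print(movie, dict(sentiment_counts(reviews)))
```

Passing `excluded=set()` to `sentiment_counts` shows how much the excluded words would otherwise shift the negative count.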

Question: How does the emotional sentiment of Forrest Gump reviews compare to that of The Matrix reviews?

What the Bar Chart Shows: The plot is a grouped bar chart that compares the distribution of emotional sentiment across the two movies. Each bar represents one combination of sentiment category and movie, and its height shows the total count of emotive words, according to the NRC lexicon, for that combination. The colors distinguish the two movies.
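The bar heights described above are counts per (sentiment, movie) pair. A minimal sketch of that tally follows, written in Python for illustration and not taken from the report's own code. `TOY_NRC`, `emotion_counts`, and the token lists are hypothetical; the one property carried over from the real NRC lexicon is that a single word can belong to several categories at once.

```python
from collections import Counter

# Toy stand-in for the NRC lexicon (assumption: illustrative entries only).
# Each word maps to every category it belongs to.
TOY_NRC = {
    "happy": {"joy", "positive", "trust"},
    "hope": {"anticipation", "joy", "positive"},
    "loss": {"sadness", "negative"},
    "war": {"fear", "negative"},
    "threat": {"anger", "fear", "negative"},
}


def emotion_counts(tokens_by_movie, lexicon=TOY_NRC):
    """Count words per (sentiment, movie) pair; a word adds one to each of its categories."""
    counts = Counter()
    for movie, tokens in tokens_by_movie.items():
        for word in tokens:
            for emotion in lexicon.get(word, ()):
                counts[(emotion, movie)] += 1
    return counts


# Made-up tokens, not taken from the IMDb reviews.
tokens_by_movie = {
    "Forrest Gump": ["happy", "hope", "loss", "war"],
    "The Matrix": ["war", "hope", "threat"],
}

counts = emotion_counts(tokens_by_movie)

# One row per bar in the grouped bar chart.
for (emotion, movie), n in sorted(counts.items()):
    print(f"{emotion:<13}{movie:<14}{n}")
```

Because one word can add to several categories, the category totals can exceed the number of matched words, which is worth keeping in mind when comparing bar heights.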

Interpretation: Overall, Forrest Gump reviews use slightly more positive and slightly more negative words than The Matrix reviews. That may look like a contradiction, but it may simply indicate that Forrest Gump reviews use more emotional words overall. Looking at the other NRC emotion categories, anger and disgust words are used about equally for both movies. Forrest Gump reviews contain more anticipation, joy, sadness, surprise, and trust words. Forrest Gump has many happy moments but is also sad in places, which may be why the reviews reflect both. The Matrix is a science fiction movie that may induce fear of the future, so its high count of fear words is not surprising.

Question: Over time, has the sentiment of Forrest Gump and The Matrix reviews shifted toward positive or negative according to Bing?

What the Bar Plot Shows: This graph shows, for every year in which a movie received at least one review, that movie's positivity score for the year based on the Bing lexicon. To create this graph, I added a new column called new_date that holds only the year rather than the full date.
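The yearly aggregation can be sketched as follows, in Python for illustration rather than as the report's actual code. The report does not define the positivity score, so this sketch assumes it is the number of positive words minus the number of negative words. The `rows` data and the function `yearly_scores` are hypothetical; only the column names `movie_name` and `new_date` come from the analysis.

```python
from collections import defaultdict
from datetime import date

# Made-up rows: (movie_name, review_date, positive_words, negative_words).
rows = [
    ("The Matrix", date(2018, 5, 1), 12, 3),
    ("The Matrix", date(2018, 11, 20), 8, 2),
    ("Forrest Gump", date(2012, 3, 9), 2, 9),
    ("Forrest Gump", date(2023, 7, 4), 10, 1),
]


def yearly_scores(rows):
    """Sum (positive - negative) per movie and year.

    Assumption: positivity score = positive words minus negative words.
    """
    totals = defaultdict(int)
    for movie_name, review_date, positive, negative in rows:
        new_date = review_date.year  # keep only the year, as in the new_date column
        totals[(movie_name, new_date)] += positive - negative
    return dict(totals)


scores = yearly_scores(rows)
for (movie_name, new_date), score in sorted(scores.items()):
    print(movie_name, new_date, score)
```

Years with no reviews simply have no entry, which is why some movie and year combinations are missing from the plot.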

Interpretation: In earlier years the positivity score for The Matrix was much lower, but it appears to increase by 2005, and in 2018 and 2019 the score was very positive, indicating a possible revival of interest in the movie during that time. In recent years its positivity score has gone down, which may be related to the fear sentiment seen in the NRC data. Forrest Gump had its most positive scores in 2002 and 2023, with varying scores in the years between, including an extreme decline in positive sentiment in 2012. Because scores are not available for each movie in every year, it is difficult to compare the two directly. Overall, The Matrix reached a higher peak positive Bing score than Forrest Gump ever did, although Forrest Gump's positivity score in 2023 is considerably higher than The Matrix's. So there has been a shift over time in how positive the reviews of these movies are.