We collected articles from eight newspapers across the country. We focused on a few major cities, since they would likely have the most articles about climate change, and we made sure to include newspapers from each major region of the US so that we could look for regional differences. Because the issue of climate change gained momentum starting in 2006, we limited our analysis to the period from 2006 to 2020. From each paper we selected 100 articles, making sure to cover a variety of dates across that time frame. After downloading the articles, we stripped everything except each article’s title and body text. The items removed included the publication type classification, the language, and the main topics, as well as other noise in the text file. In all cases, the file downloaded from NexisUniversity was a Rich Text Format (.rtf) file, which we converted into a plain text .txt file in order to read it into R. Once the .txt file was read into R, we removed any remaining extraneous parts and performed the sentiment analysis.
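The counting and lexicon-matching step described above can be sketched as follows. This is only an illustration under stated assumptions: the printed tibbles suggest a tidyverse-style workflow, but the report does not show its code, so this sketch uses base R instead. The helper `count_words()`, the stop-word list, the sample text, and `toy_lexicon` are all hypothetical stand-ins (the real analysis used the Bing, NRC, and AFINN lexicons). The report also does not say whether its sentiment totals count distinct words or word occurrences, so both are computed.

```r
# Hypothetical helper: lower-case the text, split on anything that is not a
# letter or apostrophe, drop stop words, and count each remaining word.
count_words <- function(text, stop_words = character()) {
  tokens <- unlist(strsplit(tolower(text), "[^a-z']+"))
  tokens <- tokens[nzchar(tokens) & !(tokens %in% stop_words)]
  counts <- sort(table(tokens), decreasing = TRUE)
  data.frame(word = names(counts), n = as.integer(counts),
             stringsAsFactors = FALSE)
}

# Toy lexicon for illustration only; it is NOT the Bing lexicon.
toy_lexicon <- data.frame(
  word      = c("crisis", "threat", "risk", "clean", "support"),
  sentiment = c("negative", "negative", "negative", "positive", "positive"),
  stringsAsFactors = FALSE
)

# Made-up title and body standing in for one cleaned article.
raw_text <- c(
  "Climate Change Is a Growing Threat",
  "Scientists warn that the crisis poses a risk to coastal cities. Another risk is fire, but clean energy has broad support."
)

word_counts <- count_words(
  raw_text,
  stop_words = c("a", "is", "the", "that", "to", "but", "has")
)

# Keep only the words that appear in the lexicon.
matched <- merge(word_counts, toy_lexicon, by = "word")

# Two possible readings of "number of negative/positive words":
distinct_words <- table(matched$sentiment)                     # each word once
occurrences    <- tapply(matched$n, matched$sentiment, sum)    # weighted by n

print(distinct_words)
print(occurrences)
```

The two totals can differ noticeably when a few scored words are very frequent, which matters when interpreting the effect of a single common word such as “united”.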
## # A tibble: 12,477 x 2
## word n
## <chr> <int>
## 1 climate 2139
## 2 change 1408
## 3 global 535
## 4 warming 422
## 5 carbon 404
## 6 times 376
## 7 york 369
## 8 emissions 315
## 9 energy 307
## 10 people 305
## # … with 12,467 more rows
##
## negative positive
## 1105 599
##
## anger anticipation disgust fear joy negative
## 394 367 279 491 256 987
## positive sadness surprise trust
## 955 376 199 558
## # A tibble: 1,036 x 3
## word n value
## <chr> <int> <dbl>
## 1 united 299 1
## 2 fire 156 -2
## 3 risk 105 -2
## 4 natural 89 1
## 5 threat 69 -2
## 6 risks 65 -2
## 7 clean 64 2
## 8 crisis 61 -3
## 9 agreement 59 1
## 10 increase 56 1
## # … with 1,026 more rows
Overall, the sentiment found in the New York Times was very negative. Among words that can be classified as “negative” or “positive”, 1105 were negative while only 599 were positive, so about 65% of the classifiable words have a negative connotation. Looking beyond just positive or negative, the most common emotion category was, interestingly, “trust”, with 558 associated words. This is fairly surprising given that in reality there seems to be a lack of trust on this issue. The likely cause is that the word “united” shows up often as part of “United States”, so it is probably being taken out of context. Another common emotion was “fear”, which makes sense since many people are worried about the effects of climate change over the next few years. The sentiment may also be somewhat more negative than the analysis shows, since the most common “positive” and “trust” word is “united”; in reality it should be treated as a neutral word and excluded. One word that stood out in the AFINN analysis was “fire”, which appeared 156 times across all the articles and is clearly a major climate change concern. This also shows that the New York Times is concerned with climate change outside of its own region, since many of the fires are occurring in California. The words “threat”, “risk”, and “crisis” are 3 of the top 10 words in the AFINN analysis, and each scores -2 or -3 on the scale from -5 to 5.
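As a small check on the emotion ranking discussed above, the NRC counts printed for the New York Times can be sorted after setting aside the two polarity categories. The numbers are copied from the output; the variable names are illustrative only, and the shares are relative to the eight emotion categories rather than to all words.

```r
# NRC category counts as printed for the New York Times articles.
nrc_nyt <- c(anger = 394, anticipation = 367, disgust = 279, fear = 491,
             joy = 256, negative = 987, positive = 955, sadness = 376,
             surprise = 199, trust = 558)

# Drop the two polarity categories so only the eight emotions remain.
emotions <- nrc_nyt[!(names(nrc_nyt) %in% c("negative", "positive"))]

# Share of each emotion among all emotion-tagged counts, largest first.
emotion_share <- round(100 * sort(emotions, decreasing = TRUE) / sum(emotions), 1)
print(emotion_share)
```

Because a word can carry several NRC tags at once, these shares describe tag counts, not mutually exclusive groups of words.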
## # A tibble: 9,850 x 2
## word n
## <chr> <int>
## 1 climate 1652
## 2 change 1124
## 3 global 297
## 4 warming 288
## 5 washington 271
## 6 post 255
## 7 scientists 214
## 8 people 208
## 9 report 201
## 10 science 200
## # … with 9,840 more rows
##
## negative positive
## 780 433
##
## anger anticipation disgust fear joy negative
## 290 312 171 371 212 703
## positive sadness surprise trust
## 782 291 170 465
## # A tibble: 842 x 3
## word n value
## <chr> <int> <dbl>
## 1 united 130 1
## 2 natural 68 1
## 3 increase 62 1
## 4 risk 57 -2
## 5 risks 52 -2
## 6 threat 49 -2
## 7 growing 44 1
## 8 support 43 2
## 9 clean 38 2
## 10 disasters 37 -2
## # … with 832 more rows
The sentiment analysis of Washington Post articles related to climate change produced very similar results to the New York Times. Of the words that could be classified, 780 were negative and 433 were positive, so roughly 64% were negative. It is interesting that fewer words from these articles can be classified as either positive or negative compared to the New York Times articles. This likely means that the New York Times articles on climate change were either longer on average or more opinionated. Looking at sentiment on a spectrum from -5 to 5, words with a score of -2 were the most common. As with the New York Times, going beyond positive and negative, the next two most common classifications were “trust” and “fear”. Looking closer at the AFINN analysis, it was interesting that we see words like “increase” or “growing” more often, while the word “crisis” is not nearly as common as it was in the New York Times. Because of this, it appears that the issue is a growing concern in DC but maybe not quite at the forefront of attention as it is in New York.
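One way to make the comparison between the two papers more concrete is a frequency-weighted mean AFINN value over the ten rows printed for each paper, computed with and without “united”. This is a rough sketch, not part of the original analysis: it uses only the top-10 rows shown in the output (not the full 1,036 and 842 scored words), and `weighted_afinn()` is a hypothetical helper.

```r
# Top-10 AFINN rows as printed for the two papers.
afinn_top10 <- data.frame(
  paper = rep(c("New York Times", "Washington Post"), each = 10),
  word  = c("united", "fire", "risk", "natural", "threat",
            "risks", "clean", "crisis", "agreement", "increase",
            "united", "natural", "increase", "risk", "risks",
            "threat", "growing", "support", "clean", "disasters"),
  n     = c(299, 156, 105, 89, 69, 65, 64, 61, 59, 56,
            130, 68, 62, 57, 52, 49, 44, 43, 38, 37),
  value = c(1, -2, -2, 1, -2, -2, 2, -3, 1, 1,
            1, 1, 1, -2, -2, -2, 1, 2, 2, -2),
  stringsAsFactors = FALSE
)

# Hypothetical helper: mean AFINN value per paper, weighted by word frequency,
# optionally excluding words that are likely scored out of context.
weighted_afinn <- function(df, exclude = character()) {
  kept <- df[!(df$word %in% exclude), ]
  sapply(split(kept, kept$paper), function(p) sum(p$n * p$value) / sum(p$n))
}

with_united    <- weighted_afinn(afinn_top10)
without_united <- weighted_afinn(afinn_top10, exclude = "united")

print(round(with_united, 3))
print(round(without_united, 3))
```

Running the same calculation on the full scored word lists would be the more reliable version of this comparison; the top-10 rows only indicate the direction of the effect.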
## # A tibble: 7,364 x 2
## word n
## <chr> <int>
## 1 climate 1018
## 2 change 741
## 3 global 230
## 4 warming 202
## 5 chicago 180
## 6 people 142
## 7 daily 124
## 8 herald 120
## 9 carbon 119
## 10 u.s 118
## # … with 7,354 more rows
##
## negative positive
## 564 345
##
## anger anticipation disgust fear joy negative
## 196 250 137 252 181 508
## positive sadness surprise trust
## 620 207 127 358
## # A tibble: 687 x 3
## word n value
## <chr> <int> <dbl>
## 1 united 51 1
## 2 clean 46 2
## 3 agreement 44 1
## 4 increase 39 1
## 5 support 35 2
## 6 natural 32 1
## 7 care 27 2
## 8 risk 27 -2
## 9 poor 23 -2
## 10 deniers 22 -2
## # … with 677 more rows
The Chicago results are extremely similar to those for New York, DC and Philadelphia. Classifying words as negative or positive, we see 564 negative words and just 345 positive words, which means roughly 62% of the classified words are negative. As with the Washington Post, we see far fewer negative and positive words overall, so Chicago may have a more neutral perspective than New York despite the similar percentages. One word that shows up often is “lake”, which appears 117 times, so it appears to be one of Chicago’s largest concerns with regard to climate change given the city’s proximity to the Great Lakes. To back this up, in the AFINN analysis the word “clean” is the 2nd most frequent categorized word, right behind “united”, which can likely be mostly ignored. Two other words in the top 10 were “deniers” and “agreement”, which suggests that Chicago is trying to get everybody behind action on climate change and to convince the people who still doubt the seriousness of the issue.
## # A tibble: 8,374 x 2
## word n
## <chr> <int>
## 1 climate 1199
## 2 change 901
## 3 san 439
## 4 diego 387
## 5 global 293
## 6 warming 288
## 7 union 223
## 8 tribune 213
## 9 carbon 197
## 10 report 193
## # … with 8,364 more rows
##
## negative positive
## 651 398
##
## anger anticipation disgust fear joy negative
## 246 263 154 305 177 586
## positive sadness surprise trust
## 684 241 141 413
## # A tibble: 726 x 3
## word n value
## <chr> <int> <dbl>
## 1 united 73 1
## 2 risk 58 -2
## 3 threat 45 -2
## 4 natural 43 1
## 5 increase 41 1
## 6 increased 36 1
## 7 support 36 2
## 8 growing 34 1
## 9 clean 32 2
## 10 cut 30 -1
## # … with 716 more rows
The sentiment in San Diego was very similar to the previous newspapers: 651 words could be classified as negative and 398 as positive, giving about 62% negative. Looking at the AFINN analysis and ignoring the word “united”, the two most commonly used words were “risk” and “threat”, showing up 58 and 45 times respectively. Although these only score -2 on the spectrum from -5 to 5, they clearly show that there is a lot of concern in this area. Furthermore, it was surprising that “fire” was not one of the most common categorized words in the AFINN analysis, as it appeared only 24 times and was much further down the list than it was in New York.
## # A tibble: 5,584 x 2
## word n
## <chr> <int>
## 1 climate 927
## 2 change 671
## 3 oklahoma 486
## 4 city 239
## 5 oklahoman 225
## 6 global 174
## 7 water 161
## 8 report 150
## 9 warming 149
## 10 weather 143
## # … with 5,574 more rows
##
## negative positive
## 380 278
##
## anger anticipation disgust fear joy negative
## 166 199 115 212 138 384
## positive sadness surprise trust
## 502 163 94 300
## # A tibble: 502 x 3
## word n value
## <chr> <int> <dbl>
## 1 severe 55 -2
## 2 united 48 1
## 3 natural 44 1
## 4 increase 30 1
## 5 fear 22 -2
## 6 hoax 22 -2
## 7 intense 21 1
## 8 risk 21 -2
## 9 disasters 19 -2
## 10 increased 19 1
## # … with 492 more rows
The results in Oklahoma City are slightly more positive than those of the newspapers examined so far. Running the basic Bing analysis, we see that 380 words can be classified as negative and 278 as positive, giving about 58% negative, which is 4 percentage points lower than any of the other newspapers already examined. It is also noteworthy that the number of words that can be classified as negative or positive is relatively small, so these articles are likely shorter than the articles in the New York Times and Washington Post. This could be either because articles in this newspaper are shorter in general or because this is not as big an issue in Oklahoma City. Furthermore, one of the most common words was “hoax”, appearing 22 times, which was not seen in the top 10 categorized words of the AFINN analysis for New York, Philadelphia, San Diego, Chicago, or DC.
## # A tibble: 9,653 x 2
## word n
## <chr> <int>
## 1 climate 1621
## 2 change 1266
## 3 global 401
## 4 energy 351
## 5 florida 318
## 6 times 315
## 7 warming 306
## 8 trump 234
## 9 tampa 229
## 10 national 207
## # … with 9,643 more rows
##
## negative positive
## 755 420
##
## anger anticipation disgust fear joy negative
## 294 268 192 356 171 700
## positive sadness surprise trust
## 705 285 134 428
## # A tibble: 801 x 3
## word n value
## <chr> <int> <dbl>
## 1 united 117 1
## 2 natural 79 1
## 3 clean 73 2
## 4 hoax 71 -2
## 5 increase 59 1
## 6 threat 58 -2
## 7 support 53 2
## 8 true 46 2
## 9 free 41 1
## 10 agreement 37 1
## # … with 791 more rows
The sentiment in Tampa Bay is similar to that of Chicago: a basic analysis of the words used in the articles found that 755 words were negative and just 420 were positive, giving about 64% negative. Furthermore, Tampa Bay has many more words that can be classified as either negative or positive, suggesting that this is likely a bigger issue in Tampa Bay than in places like Oklahoma City. However, it is interesting that the 4th most common classified word was “hoax”, which is more in line with the Oklahoma point of view. The words “natural” and “clean” were 2nd and 3rd. Their context is hard to judge, but it seems that, as in Chicago, climate change may not be seen as a current “crisis” but rather as a growing issue.
## # A tibble: 9,366 x 2
## word n
## <chr> <int>
## 1 climate 292
## 2 change 214
## 3 people 189
## 4 president 146
## 5 u.s 142
## 6 world 117
## 7 global 106
## 8 time 99
## 9 house 97
## 10 obama 94
## # … with 9,356 more rows
##
## negative positive
## 820 459
##
## anger anticipation disgust fear joy negative
## 309 330 204 394 246 747
## positive sadness surprise trust
## 793 322 176 464
## # A tibble: 840 x 3
## word n value
## <chr> <int> <dbl>
## 1 fire 90 -2
## 2 united 62 1
## 3 risk 37 -2
## 4 care 35 2
## 5 growing 35 1
## 6 natural 33 1
## 7 increase 28 1
## 8 war 28 -2
## 9 paradise 26 3
## 10 support 25 2
## # … with 830 more rows
Overall, the sentiment found in the examined articles published by USA Today was fairly negative. Out of 1,279 words classified as negative or positive, our Bing analysis found 820 to be negative, meaning that approximately 64.11% of the words were found to have a negative connotation. The NRC analysis seems to dispute this, with positive being the single category with the highest score. However, we feel this may be deceiving, as terms such as “U.S.,” “President,” “World,” and “Change,” all words which generally and historically may have a positive connotation, were likely used in a negative context in many of these pieces. As with previously analyzed papers, a similar caveat applies to the high score trust received in the NRC analysis, since our analysis tools are likely unable to understand the context in which words are used. Other emotions with strong representation include fear, sadness, and anticipation, which is unsurprising given the general attitude toward climate change in the U.S. today. Finally, our AFINN analysis of USA Today revealed a negatively skewed distribution of sentiment scores, with “fire” being the scored word receiving the most mentions. Additional negative sentiment words included “risk” and “war.” It should be noted, as in the analysis of previous papers, that the word “united” appears as the second most frequent word, earning a positive sentiment score. However, it is likely that a significant percentage of the instances of “united” refer to the United States, which should probably be scored as neutral rather than positive. Overall, the analyses indicate a decidedly negative sentiment across the USA Today articles we reviewed, which is in line with papers published in major East Coast markets such as the New York Times, Washington Post and Philadelphia Inquirer.
## # A tibble: 11,788 x 2
## word n
## <chr> <int>
## 1 climate 277
## 2 change 239
## 3 people 172
## 4 time 144
## 5 city 135
## 6 energy 132
## 7 philadelphia 131
## 8 environmental 103
## 9 johnson 102
## 10 epa 99
## # … with 11,778 more rows
##
## negative positive
## 878 579
##
## anger anticipation disgust fear joy negative
## 360 387 240 435 300 843
## positive sadness surprise trust
## 959 365 203 544
## # A tibble: 943 x 3
## word n value
## <chr> <int> <dbl>
## 1 natural 46 1
## 2 united 45 1
## 3 support 43 2
## 4 top 40 2
## 5 care 34 2
## 6 pay 29 -1
## 7 hard 27 -1
## 8 crisis 26 -3
## 9 clean 23 2
## 10 benefits 22 2
## # … with 933 more rows
Based on the Bing sentiment analysis, the sentiment of climate change pieces written by the Philadelphia Inquirer over the past 14 years has been mostly negative, with 878 out of 1,457 scored words being categorized as negative. These 878 words represent approximately 60.26% of the words rated positive or negative. Of the most commonly used words in the articles, none stand out as being overly positive or negative, with all being fairly neutral words such as “climate,” “time,” “city,” etc. NRC analysis reveals a very high rating for trust, which as previously mentioned could be the result of a lack of contextual interpretation of words like “united” and “trump,” both of which the sentiment analysis scores positively despite the fact that they can have very different contextual usages. Also scoring high in the NRC sentiment analysis were fear, anticipation, sadness and anger. All of these emotions are unsurprising given the amount of public concern and uncertainty over the implications of climate change. In line with the other sentiment analyses, the distribution of AFINN scores for the Inquirer articles analyzed is bimodal with a decided negative skew. However, many of the most prevalent scored words were positive, including “natural,” “support,” “clean,” etc. Top negative words included “crisis,” “pay,” and “hard.” As with the other papers, the caveat must be made that some of the positively scored words, such as “united,” may be taken out of context, but it is certainly possible that the same can be said for certain negative words. Overall, it appears that the sentiment around climate change in the Philadelphia Inquirer is similar to that expressed in other papers in similar East Coast markets such as the New York Times, Washington Post, and USA Today.
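The negative shares quoted for each paper can be recomputed directly from the Bing counts printed in the output. The sketch below does that for all eight papers; the counts are copied from the output, while the paper labels and column names are our own shorthand rather than names used in the original analysis.

```r
# Bing negative/positive counts as printed for each paper.
bing_counts <- data.frame(
  paper    = c("New York Times", "Washington Post", "Chicago", "San Diego",
               "Oklahoma City", "Tampa Bay", "USA Today",
               "Philadelphia Inquirer"),
  negative = c(1105, 780, 564, 651, 380, 755, 820, 878),
  positive = c(599, 433, 345, 398, 278, 420, 459, 579),
  stringsAsFactors = FALSE
)

# Total classified words and the share of them that are negative.
bing_counts$total        <- bing_counts$negative + bing_counts$positive
bing_counts$pct_negative <- round(100 * bing_counts$negative / bing_counts$total, 1)

# Order papers from most to least negative.
bing_counts <- bing_counts[order(-bing_counts$pct_negative), ]
print(bing_counts, row.names = FALSE)
```

Keeping the `total` column alongside the percentage also makes it easy to compare how many classifiable words each paper contributed, which the analysis uses as a rough indicator of article length or emphasis.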
Given that the articles from the New York Times have the highest percentage of words categorized as negative, at 65%, the Northeast is where we recommend focusing a significant amount of our effort. Furthermore, the word “crisis” was in the top 10 most commonly used scored words and was categorized by the AFINN analysis as a -3, whereas in other cities we saw words such as “growing” and “increase”. This tells us that while the issue is rising on the radar of those cities, it may not be the primary issue that it is in New York. One area we can focus on in the Northeast is policy to help reduce emissions from cities like New York, Boston and Philadelphia, all of which cause a lot of air pollution. This is clearly a central issue for New York in particular, since the words “emissions” and “carbon” showed up 315 and 404 times respectively across the 100 articles from the New York Times. Changing laws in this area to reduce air pollution should be one of the primary goals of our efforts.
Since “fire” was the 2nd most common categorized word in the AFINN analysis of the New York Times, it is a good idea to provide support not just for the Northeast but also for the regions in California that are being hit hardest by the fires. It appears that the New York Times cares not only about what is happening in its own region but also about climate change across the United States as a whole. Since the California fires seem to be among its top priorities, this is an issue where its readers would likely support our efforts to help. Although the San Diego newspaper did not use the word “fire” as often, this is likely a big issue there too, and cities such as LA and San Francisco, which we did not examine in our analysis, would likely see it as their top priority.
After investing in reducing air pollution in the Northeast and reducing the forest fires in California, our 3rd recommendation and 3rd priority would be helping with water pollution in places like Tampa Bay and Chicago. Chicago and Tampa Bay had negativity percentages of 62% and 64% respectively in our Bing analysis, meaning that this is an important issue for them. Although it may not be seen as the crisis that it is in New York, it is a growing issue in both cities, as words such as “increasing” and “growing” are commonly used in the articles. Furthermore, the word “lake” appears 117 times in the 100 Chicago articles, while the word “sea” appears 141 times in the Tampa newspaper. Given how often these words appear and how clearly climate change is a growing issue in the two cities, it will be worthwhile to reduce pollution in the nearby bodies of water, which can be done through ad campaigns or by encouraging new community organizations focused on trash pickup in these waterways.
It appears that Oklahoma City, and likely places nearby, do not care as much about these environmental issues as the other cities we examined. This hypothesis is based on our Bing analysis, in which only 58% of the categorized words were negative, which is lower than for any other newspaper, and on the word “hoax”, which was the 6th most common word in our AFINN analysis and was used 22 times over the 100 articles. While investing in anything specific would probably not get much support in these regions, it may be worthwhile to raise awareness of environmental issues as a whole, for example through TV ads that show the fires in California or how polluted specific waterways are, to try to gain more support for environmental issues in this region.