This truth discovery analysis focuses on the role of the media in covering the current administration in today’s society. In particular, it looks for trustworthy information by detecting unreliable sources. The scenario analyzes how much coverage the current administration receives, and whether that coverage is positive or negative, from sources such as the big media corporations and the Big Tech content producers (Google, YouTube). Coverage is counted as negative when it uses words with a negative connotation or context, and as positive when it uses words with a positive connotation or anything other than a negative connotation. The media outlets are gathered by running the same search words on both Google and YouTube and taking the first 10 search results from each. The sample size is based on the 5-to-25 sample size recommendation in the research paper “Sample Size Planning for Classification Models”: https://arxiv.org/pdf/1211.1323.pdf.
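The labeling rule described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the word list `negative_terms` and the helper `label_coverage()` are hypothetical, since the report does not list the terms it treated as negative, and the coding of 0 for negative and 1 for not negative is inferred from the confusion matrices later in the report.

```r
# Hypothetical word list; the report does not publish its own list of negative terms.
negative_terms <- c("botched", "downplaying", "blame", "mercurial", "failed")

# Label a text 0 (negative) if it contains any listed term, otherwise 1 (not negative).
# The 0/1 coding is an assumption that mirrors the vectors used later in the report.
label_coverage <- function(text, lexicon = negative_terms) {
  # Lowercase, turn every run of non-letters into a space, and split into words.
  tokens <- strsplit(gsub("[^a-z]+", " ", tolower(text)), " ", fixed = TRUE)[[1]]
  tokens <- tokens[nzchar(tokens)]
  hits <- intersect(lexicon, tokens)
  list(label = if (length(hits) > 0) 0L else 1L, hits = hits)
}

# Two made-up sentences, one containing a listed term and one not.
negative_example <- label_coverage("The administration botched its response to the outbreak.")
neutral_example  <- label_coverage("WHO defends coronavirus response at regional headquarters.")
print(negative_example$label)
print(neutral_example$label)
```

A fixed word list makes the rule repeatable, but the result depends entirely on which terms are included, so the list itself would need to be reported alongside the labels.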

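# Assumed packages: Corpus(), VectorSource() and tm_map() come from tm,
# and term_stats() comes from corpus.
library(tm)
library(corpus)

# source0 is assumed to hold the raw text of the article being analyzed.
source1Corpus <- Corpus(VectorSource(source0))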
docs <- tm_map(source1Corpus, removePunctuation)
## Warning in tm_map.SimpleCorpus(source1Corpus, removePunctuation): transformation
## drops documents
docs <- tm_map(docs, removeNumbers)
## Warning in tm_map.SimpleCorpus(docs, removeNumbers): transformation drops
## documents
docs <- tm_map(docs, tolower)
## Warning in tm_map.SimpleCorpus(docs, tolower): transformation drops documents
docs <- tm_map(docs, stripWhitespace)
## Warning in tm_map.SimpleCorpus(docs, stripWhitespace): transformation drops
## documents
# List the 10-word phrases (10-grams) whose first word is "warning".
tdm <- term_stats(docs, ngrams = 10, types = TRUE, subset = type1 == "warning")

tdm
##   term                                                              type1  
## 1 warning signs with some of trumps public comments interspersed in warning
## 2 warning that apparently wasnt heeded in the early days of         warning
##   type2 type3      type4 type5  type6  type7  type8    type9        type10 count
## 1 signs with       some  of     trumps public comments interspersed in         1
## 2 that  apparently wasnt heeded in     the    early    days         of         1
##   support
## 1       1
## 2       1

The first link from the Google search “trump coronavirus response” is a Vox article titled “The Trump administration’s botched coronavirus response, explained.” The document search shows various negative words and connotations in the results. The prediction was negative and, as displayed above, the actual result is a negative bias.
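The document search used here (clean the text, then list the phrases that begin with a chosen keyword) can be approximated in base R without the tm and corpus packages. The sketch below is not the report's pipeline; `clean_text()`, `ngrams_starting_with()` and `sample_text` are hypothetical names and data introduced only for illustration.

```r
# Clean a text in the same spirit as the tm steps: drop punctuation and digits,
# lowercase everything, and collapse repeated whitespace.
clean_text <- function(text) {
  text <- gsub("[[:punct:]]+", "", text)
  text <- gsub("[[:digit:]]+", "", text)
  text <- tolower(text)
  trimws(gsub("[[:space:]]+", " ", text))
}

# Return every n-word phrase that begins with `keyword`.
ngrams_starting_with <- function(text, keyword, n = 5) {
  tokens <- strsplit(clean_text(text), " ", fixed = TRUE)[[1]]
  starts <- which(tokens == keyword)
  # Keep only the start positions that leave room for a full n-word phrase.
  starts <- starts[starts + n - 1 <= length(tokens)]
  vapply(starts,
         function(i) paste(tokens[i:(i + n - 1)], collapse = " "),
         character(1))
}

# Made-up sentence standing in for an article body.
sample_text <- "Critics say the administration botched its response to the outbreak in 2020."
phrases <- ngrams_starting_with(sample_text, "botched", n = 5)
print(phrases)
```

Reading the words that follow a keyword gives the context needed to judge its connotation, which a bare word count would not provide.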

##   term                        type1   type2   type3    type4 type5 count support
## 1 botched its response to the botched its     response to    the       1       1
## 2 botched testing process â € botched testing process  â     €         1       1

The second link from the Google search “trump coronavirus response” is a Washington Post article titled “The Trump administration’s botched coronavirus response, explained.” The document search shows various negative words and connotations in the results. The prediction was for a negative response and the actual result is negative.

##   term                                      type1       type2 type3     
## 1 downplaying the escalating outbreak trump downplaying the   escalating
##   type4    type5 count support
## 1 outbreak trump     1       1

The third link from the Google search is a Politico article titled “Trump reworks to write narrative on Coronavirus response.” There are a couple of negative connotations here, both in the title and in the results of the document search. The prediction was for a negative response and the actual result is negative.

##   term                                           type1    type2 type3       
## 1 response at headquarters regional headquarters response at    headquarters
## 2 response to the coronavirus pandemic           response to    the         
## 3 response to the outbreak has                   response to    the         
##   type4       type5        count support
## 1 regional    headquarters     1       1
## 2 coronavirus pandemic         1       1
## 3 outbreak    has              1       1

The fourth result is a CNN article titled “WHO defends coronavirus response after Trump criticism.” The search shows no negative connotation or negative wording; the article concerns only the World Health Organization’s response. The prediction was for a negative bias; however, the actual result is not negative.

##   term                                                type1     type2 type3
## 1 continued to try to shift blame for his response to continued to    try  
##   type4 type5 type6 type7 type8 type9    type10 count support
## 1 to    shift blame for   his   response to         1       1

The fifth article is from the news outlet ABC and covers the government’s response to the coronavirus outbreak. The word search shows some negative connotation in how the President is trying to shift blame. The prediction was for a negative bias and the actual result is negative.

##   term                                                    type1 type2   type3
## 1 cast himself as the wise leader who rejected the advice cast  himself as   
##   type4 type5 type6  type7 type8    type9 type10 count support
## 1 the   wise  leader who   rejected the   advice     1       1

In this CNN article, titled “Fact-checking Trump’s attempt to erase his previous coronavirus response,” the negative connotation is immediately visible in the title and in some of the article’s content. The prediction was for a negative bias and the actual result shows a negative bias.

##   term                                                 type1  type2 type3 type4
## 1 burden is on the governor and her team to distribute burden is    on    the  
##   type5    type6 type7 type8 type9 type10     count support
## 1 governor and   her   team  to    distribute     1       1

The search of this Fox News article shows no negative connotations. The prediction was for a positive article, and the result is recorded as a false positive: while the article is not positive, it is not negative either.

##   term                                                               type1    
## 1 concerned about infection has risen steadily since the virus began concerned
## 2 concerned about the economy grew dramatically in the second half   concerned
## 3 concerned americans are and what they think about the governmentâ  concerned
## 4 concerned americans say they are about the coronavirusâ € ™        concerned
## 5 concerned americans say they are that they someone in their        concerned
## 6 concerned that you or someone youâ € ™ re close                    concerned
##   type2     type3     type4   type5   type6        type7 type8        type9 
## 1 about     infection has     risen   steadily     since the          virus 
## 2 about     the       economy grew    dramatically in    the          second
## 3 americans are       and     what    they         think about        the   
## 4 americans say       they    are     about        the   coronavirusâ €     
## 5 americans say       they    are     that         they  someone      in    
## 6 that      you       or      someone youâ         €     ™            re    
##   type10      count support
## 1 began           1       1
## 2 half            1       1
## 3 governmentâ     1       1
## 4 ™               1       1
## 5 their           1       1
## 6 close           1       1

This article is from the site FiveThirtyEight and is titled “How Americans View The Coronavirus Crisis And Trump’s Response.” The search of the article gives no indication of a negative or dishonest message. The prediction was for a negative article; however, the actual result is a false positive: while the article is not positive, it is not negative either.

##   term                                                            type1    type2
## 1 messages from the mercurial president have left state and local messages from 
##   type3 type4     type5     type6 type7 type8 type9 type10 count support
## 1 the   mercurial president have  left  state and   local      1       1

The word search of this article shows a negative message. The prediction was for a negative article and the actual coverage is negative.

##   term                                                        type1   type2
## 1 calling on congress to increase funding for this program by calling on   
##   type3    type4 type5    type6   type7 type8 type9   type10 count support
## 1 congress to    increase funding for   this  program by         1       1

This article is from the official White House website. The document word search shows no overt bias. The prediction was negative; however, the actual result is a false positive.

library(readr)
## 
## Attaching package: 'readr'
## The following object is masked from 'package:tau':
## 
##     tokenize
library(caTools)
library(caret)
## Warning: package 'caret' was built under R version 3.6.3
## Loading required package: lattice
## Loading required package: ggplot2
## 
## Attaching package: 'ggplot2'
## The following object is masked from 'package:NLP':
## 
##     annotate
print("Google Search")
## [1] "Google Search"
# Sample Data
predicted <- c(0,0,0,0,0,0,0,0,1,0) # first 10 elements
actual <-    c(0,0,0,1,0,0,1,0,1,0) # first 10 elements

u <- union(predicted, actual)
t <- table(factor(predicted, u), factor(actual, u))
print(confusionMatrix(t))
## Confusion Matrix and Statistics
## 
##    
##     0 1
##   0 7 2
##   1 0 1
##                                           
##                Accuracy : 0.8             
##                  95% CI : (0.4439, 0.9748)
##     No Information Rate : 0.7             
##     P-Value [Acc > NIR] : 0.3828          
##                                           
##                   Kappa : 0.4118          
##                                           
##  Mcnemar's Test P-Value : 0.4795          
##                                           
##             Sensitivity : 1.0000          
##             Specificity : 0.3333          
##          Pos Pred Value : 0.7778          
##          Neg Pred Value : 1.0000          
##              Prevalence : 0.7000          
##          Detection Rate : 0.7000          
##    Detection Prevalence : 0.9000          
##       Balanced Accuracy : 0.6667          
##                                           
##        'Positive' Class : 0               
## 
total = 10
googlescore = 8 / total
print("Google Score")
## [1] "Google Score"
print(googlescore)
## [1] 0.8
print("Coverage either positive or negative")
## [1] "Coverage either positive or negative"
faircoverage = .5
print(faircoverage)
## [1] 0.5

The confusion matrix above covers the Google search “Trump Coronavirus response.” Its rows are the predicted labels and its columns the actual labels, where 0 marks negative coverage and 1 marks coverage with no negative connotation. The accuracy of 80% means that the prediction of how a link would cover the administration matched the observed coverage for 8 of the 10 links; the Google score of 0.8 computed above is the same figure. Of the first 10 links that Google provided, the matrix records 7 as showing the current administration in a negative light and 3 as showing no negative connotation. There were no indications of positive coverage from any of the news organizations in Google’s results. Seven of the results were major media or newspaper outlets. One of them is considered conservative, while the others are mainly viewed as left leaning. Six of the seven major media outlets were correctly predicted to report negatively, while one was a false positive. The positive predictive value (Pos Pred Value) shows that 78% of the links predicted to give an unfavorable view of the current administration actually did so (7 of 9). The negative predictive value (Neg Pred Value) of 1.0 shows that the single link predicted to be not negative was in fact not negative. The specificity of 33% shows that only 1 of the 3 links that were not negative was predicted correctly; the other 2 were false positives, meaning they were predicted to be negative but turned out to be neutral. The balanced accuracy, which is the average of sensitivity and specificity, is 67%. Compared with an assumed even split of 50% positive and 50% negative coverage of the administration, the 70% share of negative links (the prevalence) is 20 percentage points higher. Lastly, these results suggest that Google’s algorithm tends to favor articles with more leftward or liberal leanings. Below is the link for the Google search results.

https://www.google.com/search?sxsrf=ALeKk02B1-MIZPH-EeZSLkkPlpKI8FukPg%3A1586421191645&ei=x92OXqmFJ4HL0PEP4NOKsA8&q=tumps+coronavirus+response&oq=tumps+coronavirus+response&gs_lcp=CgZwc3ktYWIQAzoECCMQJzoCCAA6BAgAEEM6BAgAEAM6BwgjELACECc6BAgAEA1KCggXEgYxMi0xMjlKCAgYEgQxMi03UL2dAVjVqAFgvK8BaABwAHgAgAGwAYgBwweSAQMyLjaYAQCgAQGqAQdnd3Mtd2l6&sclient=psy-ab&ved=0ahUKEwjphN6899roAhWBJTQIHeCpAvYQ4dUDCAs&uact=5
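The statistics that caret reports for the Google search can be recomputed by hand from the same two vectors, which makes their definitions explicit. This is a base R sketch rather than caret's own code; the variable names `tp`, `fp`, `fn`, `tn` and `metrics` are introduced here for illustration, and class 0 is treated as the "positive" class to match the caret output.

```r
# The same vectors used for the Google confusion matrix.
predicted <- c(0, 0, 0, 0, 0, 0, 0, 0, 1, 0)
actual    <- c(0, 0, 0, 1, 0, 0, 1, 0, 1, 0)

# Class 0 is the "positive" class, as in the caret output.
tp <- sum(predicted == 0 & actual == 0)  # predicted 0, actually 0
fp <- sum(predicted == 0 & actual == 1)  # predicted 0, actually 1
fn <- sum(predicted == 1 & actual == 0)  # predicted 1, actually 0
tn <- sum(predicted == 1 & actual == 1)  # predicted 1, actually 1
n  <- length(actual)

sensitivity <- tp / (tp + fn)
specificity <- tn / (tn + fp)

metrics <- c(
  accuracy          = (tp + tn) / n,
  sensitivity       = sensitivity,
  specificity       = specificity,
  pos_pred_value    = tp / (tp + fp),
  neg_pred_value    = tn / (tn + fn),
  prevalence        = (tp + fn) / n,
  balanced_accuracy = (sensitivity + specificity) / 2
)
print(round(metrics, 4))
```

Accuracy and balanced accuracy differ here because the classes are unbalanced: the two errors fall on the small class 1 group, which lowers specificity much more than it lowers overall accuracy.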

library(readr)
library(caTools)
library(caret)
print("YouTube")
## [1] "YouTube"
# Sample Data
predicted <- c(0,0,0,0,0,0,0,0,0,0) # elements 1 - 10
actual <-    c(1,1,1,0,0,0,0,1,0,0) # elements 1 - 10

u <- union(predicted, actual)
t <- table(factor(predicted, u), factor(actual, u))
print(confusionMatrix(t))
## Confusion Matrix and Statistics
## 
##    
##     0 1
##   0 6 4
##   1 0 0
##                                           
##                Accuracy : 0.6             
##                  95% CI : (0.2624, 0.8784)
##     No Information Rate : 0.6             
##     P-Value [Acc > NIR] : 0.6331          
##                                           
##                   Kappa : 0               
##                                           
##  Mcnemar's Test P-Value : 0.1336          
##                                           
##             Sensitivity : 1.0             
##             Specificity : 0.0             
##          Pos Pred Value : 0.6             
##          Neg Pred Value : NaN             
##              Prevalence : 0.6             
##          Detection Rate : 0.6             
##    Detection Prevalence : 1.0             
##       Balanced Accuracy : 0.5             
##                                           
##        'Positive' Class : 0               
## 
total = 10
YouTubeScore = 6 / total
print("YouTube Score")
## [1] "YouTube Score"
print(YouTubeScore)
## [1] 0.6
print("Coverage either positive or negative")
## [1] "Coverage either positive or negative"
faircoverage = .5
print(faircoverage) 
## [1] 0.5

The confusion matrix above covers the YouTube search “Trump Coronavirus response”. The accuracy of 60% means that the prediction of how a link would cover the administration matched the observed coverage for 6 of the 10 links. Of the first 10 links that YouTube provided, 6 displayed the current administration negatively and 4 had no positive or negative connotation. There were no indications of positive coverage in any of YouTube’s search results. All 10 results, however, were from major media outlets. The positive predictive value (Pos Pred Value) shows that 60% of the links predicted to give an unfavorable view of the current administration actually did so. The negative predictive value (Neg Pred Value) is NaN because no link was predicted to be anything other than negative, so the statistic divides zero by zero. The prevalence of 60% is the share of links that were actually negative; the remaining 4 of the 10 were false positives, predicted to be negative but found to be neutral. The balanced accuracy is 50%, since the sensitivity is 1.0 and the specificity is 0.0. Compared with an assumed even split of 50% positive and 50% negative coverage of the administration, the 60% share of negative links shows that YouTube provided links to negative coverage 10 percentage points more often. Lastly, looking at YouTube itself as a search engine, all 10 results come from left-leaning outlets and none come from outlets deemed conservative: 3 are MSNBC, 2 are NBC, 2 are CBS, 1 is CNN, 1 is Global News, and 1 is The Late Show with Stephen Colbert. What we can speculate from this is that YouTube’s algorithm tends to favor links with more leftward or liberal leanings. Below is the link to the first ten YouTube search results.

https://www.youtube.com/results?search_query=Trump+coronavirus+response
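The NaN in the YouTube output follows directly from the prediction vector containing only zeros. The base R sketch below reproduces that arithmetic and compares the accuracy with the no-information rate; the variable names are introduced here for illustration and are not part of the report's code.

```r
# The same vectors used for the YouTube confusion matrix.
predicted <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
actual    <- c(1, 1, 1, 0, 0, 0, 0, 1, 0, 0)

# Negative predictive value: of the links predicted as class 1, how many were class 1?
predicted_class1 <- sum(predicted == 1)                # no link was predicted as class 1
correct_class1   <- sum(predicted == 1 & actual == 1)
neg_pred_value   <- correct_class1 / predicted_class1  # 0 / 0 evaluates to NaN in R

# Accuracy versus the no-information rate (always guessing the most common class).
accuracy            <- mean(predicted == actual)
no_information_rate <- max(table(actual)) / length(actual)

print(neg_pred_value)
print(c(accuracy = accuracy, no_information_rate = no_information_rate))
```

Because every link was predicted to be negative, the accuracy can only equal the share of links that really were negative, which is why the accuracy matches the no-information rate and Kappa is 0.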

library(readr)
library(caTools)
library(caret)
print("Both Google and YouTube")
## [1] "Both Google and YouTube"
# Sample Data
predicted <- c(0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0) # elements 1 - 20
actual <-    c(0,0,0,1,0,0,1,0,1,0,1,1,1,0,0,0,0,1,0,0) # elements 1 - 20

u <- union(predicted, actual)
t <- table(factor(predicted, u), factor(actual, u))
print(confusionMatrix(t))
## Confusion Matrix and Statistics
## 
##    
##      0  1
##   0 13  6
##   1  0  1
##                                           
##                Accuracy : 0.7             
##                  95% CI : (0.4572, 0.8811)
##     No Information Rate : 0.65            
##     P-Value [Acc > NIR] : 0.41663         
##                                           
##                   Kappa : 0.1781          
##                                           
##  Mcnemar's Test P-Value : 0.04123         
##                                           
##             Sensitivity : 1.0000          
##             Specificity : 0.1429          
##          Pos Pred Value : 0.6842          
##          Neg Pred Value : 1.0000          
##              Prevalence : 0.6500          
##          Detection Rate : 0.6500          
##    Detection Prevalence : 0.9500          
##       Balanced Accuracy : 0.5714          
##                                           
##        'Positive' Class : 0               
## 
total = 20
Both = 14 / total
print("Both Search Engines Together")
## [1] "Both Search Engines Together"
print(Both)
## [1] 0.7
print("Coverage either positive or negative")
## [1] "Coverage either positive or negative"
faircoverage = .5
print(faircoverage) 
## [1] 0.5
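With only 20 links, the uncertainty around these proportions is large. The sketch below uses `binom.test()` from base R's stats package to compute an exact binomial confidence interval for the combined accuracy, a one-sided test of the accuracy against the no-information rate, and a test of the share of negative links against the assumed 50% fair-coverage baseline. The counts are taken from the combined confusion matrix; treating the 20 links as independent draws is an assumption.

```r
correct  <- 14  # predictions that matched the observed label (13 + 1)
negative <- 13  # links actually labeled 0 (negative)
n        <- 20

# Exact (Clopper-Pearson) 95% confidence interval for the accuracy.
ci <- binom.test(correct, n)$conf.int

# One-sided test: is the accuracy greater than the no-information rate of 0.65?
p_vs_nir <- binom.test(correct, n, p = 0.65, alternative = "greater")$p.value

# Two-sided test: does the share of negative links differ from a 50% baseline?
p_vs_fair <- binom.test(negative, n, p = 0.5)$p.value

print(round(c(lower = ci[1], upper = ci[2]), 4))
print(round(c(p_vs_nir = p_vs_nir, p_vs_fair = p_vs_fair), 4))
```

An interval this wide means that a difference between the observed share and the 50% baseline should be read as suggestive rather than established.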

Above is the confusion matrix for the combined results of the Google and YouTube searches. It shows that the model has an accuracy of 70%: the prediction matched the observed coverage for 14 of the 20 links. The positive predictive value (Pos Pred Value) shows that 68% of the links predicted to give an unfavorable view of the current administration actually did so (13 of 19). The negative predictive value (Neg Pred Value) of 1.0 shows that the one link predicted to be not negative was in fact not negative. The prevalence of 65% is the share of links that were actually negative, 13 of the 20; the predictions were wrong for 6 of the 20 links, all of them false positives. The balanced accuracy across both search engines is 57%. Compared with an assumed even split of 50% positive and 50% negative coverage of the administration, the 65% share of negative links shows that the two companies together presented negative coverage 15 percentage points more often than an even split would.
This shows that both sets of search results tend to favor articles and videos with a more leftward view, and therefore tend to present content that is more critical of the current administration. What we can derive from these results is that if these companies are considered places of truth and sources of reliability, we have to look at the nature of truth itself and consider that truth is relative and that how one views the world will influence one’s actions. If we look beyond these companies we see only individuals, all of whom have their own ideologies and beliefs, which can influence how they write code, build programs, and apply mathematical models. While it may seem that mathematics is objective, the way in which it is implemented, or the way code is written, will always reflect the individual or individuals who created it.