I analyzed how much content from junk news websites is spread by public Facebook pages and groups that support US presidential candidates Donald Trump and Joe Biden. For the purposes of this analysis, “junk news” websites are domains that, according to FactCheck.org, have posted false or misleading content in the past.

To identify pages and groups that support the candidates, I used the CrowdTangle API to search for pages and groups that shared posts from the candidates' official Facebook pages at any time in the past year (October 2018 to October 2019). Below are the numbers of pages and groups found, which make up the pro-Trump and pro-Biden support “ecosystems” used in this analysis:

Of the nearly one million posts in the data, the following numbers shared links of any kind (excluding links to Facebook posts):

The number of links to “junk news” websites found in each ecosystem was as follows:

In the pro-Trump Pages Ecosystem, there were 213 unique links shared from 10 “junk news” websites. Of the 213 links, 163 of them came from the Geller Report, which according to FactCheck.org “is run by Pamela Geller, an anti-Islamic activist.” Other sources of “junk news” that were shared were:

The links from “junk news” websites that were most shared within the pro-Trump Pages ecosystem were as follows:

A text analysis of the Descriptions accompanying the links highlights the dominant topics of the “junk news” shared: borders and Islam. Mentions of Islam are correlated with prominent Democratic congresswomen Ilhan Omar and Rashida Tlaib as well as sharia and jihad.

Of the 100 pages in the pro-Trump Pages ecosystem, 34 of them shared a “junk news” link. The most prolific sharer is “Wake Up America, the Original”, which was responsible for nearly one-third of the “junk news” links and has more than 500,000 followers.

Only nine junk news links were shared by more than one page, so it is difficult to assess whether coordinated behavior among pages is widespread. However, a social network analysis of the “junk news” links that were shared suggests coordination among four pro-Trump pages to amplify content:

A time-series analysis indicates that the “Stand With [blank] Against the Illegal Alien Invasion” pages are clearly coordinating to share junk news links related to immigration. The x-axis represents the time elapsed between posts (in seconds), so the more vertical the line, the shorter the time between posts of the same link. A less substantial trend (given so few data points) appears for junk news links that reference Jeffrey Epstein and Elijah Cummings, suggesting that the “Trump ‘The People’s President’” page may be an influencer of at least two other pages: “Republicans For Trump” and “Trump, American Patriot.”

In the pro-Trump Groups Ecosystem, there were 506 unique links shared from 16 different “junk news” websites. Of the 506 links, 129 were from “neonnettle.com.” Multiple “junk news” websites found in the pro-Trump Pages ecosystem (e.g. “puppetstringnews.com”, “newspunch.com”, “gellerreport.com”) were also widely shared in pro-Trump Groups. Three other “junk news” websites also featured prominently:

The links from “junk news” websites that were most shared within the pro-Trump Groups ecosystem were as follows:

A text analysis of the Descriptions accompanying the links highlights Ukraine as a key topic of conversation, specifically in relation to the Bidens. Other “junk news” topics relate to the ongoing impeachment inquiry and two prominent House Democrats: Speaker Nancy Pelosi and Intelligence Committee Chairman Adam Schiff. The Clintons are less prominent topics of the “junk news” circulating in pro-Trump groups.

Of the 226 groups in the pro-Trump Groups ecosystem, at least one “junk news” link was posted in 162 of them. Two large groups with the name “Drain The Swamp” are particularly rife with junk news links.

A social network analysis of “junk news” links posted within the pro-Trump Groups ecosystem indicates substantial overlap in “junk news” content from group to group, although it is unclear how much coordination, if any, exists.

Note: To protect the privacy of Facebook users, CrowdTangle does not show which specific user accounts posted to groups or whether a link was shared from a particular page. It is therefore not possible to determine definitively whether specific accounts are engaging in coordinated behavior to amplify content across groups. However, it is possible to see what content is shared within groups by timestamp, which provides some insight into whether users may be coordinating across groups.

A further analysis of nine of the top 10 junk news links shared in pro-Trump Groups suggests that there is little coordination. Instead, the sharing of junk news links, while pervasive, appears to be more or less organic. However, it is not possible to say definitively whether specific group members are coordinating, because members’ identities are protected.

In the pro-Biden Groups Ecosystem, there were 26 unique links shared from six different “junk news” websites - all of which appear in the pro-Trump Pages or pro-Trump Groups ecosystems. Of the 26 links, 13 were from “neonnettle.com”, seven were from “newspunch.com”, two from “yournewswire.com” and one each from “conservativepost.com”, “gellerreport.com”, “infowars.com”, and “puppetstringnews.com”.

The links from “junk news” websites that were most shared within the pro-Biden Groups ecosystem were as follows:

Of the 77 groups in the pro-Biden Groups ecosystem, at least one “junk news” link was posted in 18 of them. Most links were only posted one time in a single group. In six groups, two “junk news” links were found. The outlier is a group named “Biden for president”, in which 13 different “junk news” links were found.

The “junk news” links in the “Biden for president” group point to anti-Biden, anti-Democrat, anti-immigrant and anti-Islam content, which suggests that the group is actually an anti-Biden group.

---
title: "Junk News Sharing in Pro-Trump & Pro-Biden Facebook Ecosystems"
output: html_notebook
---

I analyzed how much content from junk news websites is spread by public Facebook pages and groups that support US presidential candidates Donald Trump and Joe Biden. For the purposes of this analysis, "junk news" websites are domains that, according to [FactCheck.org](https://www.factcheck.org/2017/07/websites-post-fake-satirical-stories/), have posted false or misleading content in the past.

To identify pages and groups that support the candidates, I used the CrowdTangle API to search for pages and groups that shared posts from the candidates' official Facebook pages at any time in the past year (October 2018 to October 2019). Below are the numbers of pages and groups found, which make up the pro-Trump and pro-Biden support "ecosystems" used in this analysis:

* 100 pro-Trump Pages (299,996 Posts)
* 216 pro-Trump Groups (299,974 Posts)
* 48 pro-Biden Pages (86,387 Posts)
* 77 pro-Biden Groups (299,995 Posts)

```{r, echo=FALSE, eval=FALSE}
#0_LOAD PACKAGES
require("dplyr")
require("stringr")
require("ggplot2")
require("gridExtra")
require("scales")
require("DT")
require("igraph")
require("ggraph")
#require("rsconnect")
```

```{r, echo=FALSE, eval=FALSE, warning=FALSE}
#1_READ IN DATA AND LOAD NEEDED FUNCTIONS
dtpages <- readRDS(file="dtpages_oct18-oct19.rda")
dtgroups <- readRDS(file="dtgroups_oct18-oct19.rda")
jbpages <- readRDS(file="jbpages_oct18-oct19.rda")
jbgroups <- readRDS(file="jbgroups_oct18-oct19.rda")

#Function for Domain Extraction
domain <- function(x) strsplit(gsub("http://|https://|www\\.", "", x), "/")[[c(1, 1)]]

#Domains to exclude through filtering
exclude <- c("facebook.com", "youtube.com", "youtu.be", "twitter.com", "reuters.com", "trib.al", "bit.ly", "m.youtube.com", "dlvr.it", "t.co", "ow.ly", "buff.ly", "newsteam.ro", "reverbnation.com")

#Domains Known To Peddle "Fake News"
#source = https://www.factcheck.org/2017/07/websites-post-fake-satirical-stories/
fakenews <- c("americasfreedomfighters.com", "americaslastlineofdefense.org", "trumpbetrayed.us", "worstpot.us", "nofakenewsonline.us", "theamericanews.co", "americanjournalreview.com", "americatalks.com", "thepedogate.com", "bannedinformation.com", "blingnews.com", "cbinfo24.com","channel23news.com","conservativeangle.com", "consnation.com", "conservativepost.com","theconservativetreehouse.com", "daily-vine.com", "empirenews.net", "en-volve.com", "fbnewscycle.com", "fellowshipoftheminds.com", "gellerreport.com", "infowars.com", "kagfeed.com", "thelibertyraise.com", "londonwebnews.com", "mminfo24.com", "neonnettle.com", "newspunch.com", "yournewswire.com", "politicops.com", "politicot.com", "nyeveningnews.com", "patriotswalk.us", "policetask.com", "politicsfocus.com", "the-postillon.com", "puppetstringnews.com", "realnewsrightnow.com", "rightwingtribune.com", "rwnofficial.com", "therightists.com", "specialnewsuse.com", "7newspolitical.site", "stgeorgegazette.com", "teddystick.com", "topalertnews.com", "truthfeednews.com", "universaleinfo.com", "ussanews.com", "viralcords.com", "viralnewspbs.site", "viralitythings.us", "webviners.com", "worldnewsdailyreport.com")

text_clean <- function(data, text) {
  require("dplyr")
  text <- enquo(text)
  data <- data %>% select(!!text) %>% filter(!!text != "") #remove empty messages
  data <- data %>% mutate(text=as.character(!!text))
  data <- data %>% mutate(text=gsub("http[^[:space:]]*", '', text))
  data <- data %>% mutate(text=gsub("#\\w+ *", '', text))
  data <- data %>% mutate(text=gsub("@\\w+ *", '', text))
  data <- data %>% mutate(text=gsub("RT\\w+ *:", '', text)) 
  data <- data %>% mutate(text=gsub("[[:digit:]]+", '', text))
  data <- data %>% mutate(text=gsub("[[:punct:]]+", '', text)) 
  data <- data %>% mutate(text=gsub("[^[:alnum:]]",' ', text))
  data <- data %>% filter(text != "") %>% mutate(id = row_number()) %>% transmute(id=id, text=text)  
  return(data)
}

text_tokenize <- function(data, text, stopwords=NULL, n=NULL) {
  require("dplyr")
  require("tidyr")
  require("tidytext")
  text=enquo(text)
  if(is.null(stopwords)) { stopwords <- tidytext::stop_words }
  if(is.null(n)) { n <- 1 }
  tokens <- data %>% unnest_tokens(word, !!text)
  tokens <- tokens %>% filter(!(nchar(word)==1)) %>% anti_join(stopwords) 
  tokens <- tokens %>% mutate(ind=row_number())
  tokens <- tokens %>% group_by(id) %>% mutate(ind=row_number()) %>% spread(key=ind, value=word) 
  tokens[is.na(tokens)] <- ""
  tokens <- unite(tokens, text, -id, sep=" ")
  tokens$text <- trimws(tokens$text)
  tokens <- tokens %>% unnest_tokens(ngram, text, token="ngrams", n=n)
  return(tokens)
}

text_correlation <- function(data, ngram, top=NULL, mentions=NULL) {
  require("dplyr")
  require("widyr")
  ngram <- enquo(ngram)
  if(is.null(top)) { top <- 20 }
  if(is.null(mentions)) { mentions <- 1 }
  top_ngrams <- data %>% group_by(!!ngram) %>% count() %>% arrange(desc(n)) %>% head(top)
  top_ngrams <- top_ngrams$ngram
  correlation <- data %>% group_by(!!ngram) %>% filter(n() >= mentions) %>% pairwise_cor(ngram, id, sort=T)
  top_topics <- correlation %>% filter(item1 %in% top_ngrams) %>% transmute(topic=item1, mention=item2, correlation=correlation)
  return(top_topics)
}

topic_reduce <- function(data, topic, mentions=NULL) {
  require("dplyr")
  topic <- enquo(topic)
  #sort <- enquo(sort)
  #if(is.null(sort)) { sort <- correlation }
  if(is.null(mentions)) { mentions <- 10 }
  clean <- data %>% group_by(!!topic) %>% top_n(mentions, wt=correlation) %>% ungroup() %>% arrange(!!topic, correlation) %>% mutate(.r = row_number())
  return(clean)
}
```
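
To make the domain-extraction step above more concrete, here is a minimal, vectorized sketch in base R. The name `extract_domain` and the example URLs are hypothetical; it is not the notebook's own `domain()` helper, only an illustration of the same idea (strip the scheme and a leading `www.`, then keep the host).

```r
# Hypothetical vectorized helper (not the notebook's own `domain()` function).
extract_domain <- function(urls) {
  # Lower-case, then drop the scheme and an optional leading "www."
  stripped <- sub("^https?://(www\\.)?", "", tolower(urls))
  # Keep everything before the first path, query, fragment, or port separator
  sub("[/?#:].*$", "", stripped)
}

# Made-up example URLs
example_urls <- c(
  "https://www.gellerreport.com/2019/10/example.html/",
  "http://NeonNettle.com/news/123",
  "https://m.youtube.com/watch?v=abc"
)
example_domains <- extract_domain(example_urls)
print(example_domains)
```

Because the helper works on a whole character vector, it could be applied to a link column directly rather than through `sapply()`; whether that matters depends on the size of the data.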

Of the nearly one million posts in the data, the following numbers shared links of any kind (excluding links to Facebook posts):

* pro-Trump Pages: 50,222 Links
* pro-Trump Groups: 114,032 Links
* pro-Biden Pages: 12,680 Links
* pro-Biden Groups: 128,008 Links

```{r, echo=FALSE, eval=FALSE, warning=FALSE}
#Trump Pages
dtpages_junk_dom <- dtpages %>% filter(Final.Link!="")
dtpages_junk_dom$domain <- sapply(dtpages_junk_dom$Final.Link, domain)
dtpages_junk_dom$domain <- tolower(dtpages_junk_dom$domain)
rm(dtpages)

#Trump Groups
dtgroups_junk_dom <- dtgroups %>% filter(Link!="")
dtgroups_junk_dom$domain <- sapply(dtgroups_junk_dom$Link, domain)
dtgroups_junk_dom$domain <- tolower(dtgroups_junk_dom$domain)
dtgroups_junk_dom_nofb <- dtgroups_junk_dom %>% filter(domain != "facebook.com")
rm(dtgroups)

#Biden Pages
jbpages_junk_dom <- jbpages %>% filter(Final.Link!="")
jbpages_junk_dom$domain <- sapply(jbpages_junk_dom$Final.Link, domain)
jbpages_junk_dom$domain <- tolower(jbpages_junk_dom$domain) 
rm(jbpages)

#Biden Groups
jbgroups_junk_dom <- jbgroups %>% filter(Link!="")
jbgroups_junk_dom$domain <- sapply(jbgroups_junk_dom$Link, domain)
jbgroups_junk_dom$domain <- tolower(jbgroups_junk_dom$domain)
jbgroups_junk_dom_nofb <- jbgroups_junk_dom %>% filter(domain != "facebook.com")
rm(jbgroups)
```

The number of links to "junk news" websites found in each ecosystem was as follows:

* pro-Trump Pages: 248 Total Links (0.5% of all links)
* pro-Trump Groups: 1,343 Total Links (1.1% of all links)
* pro-Biden Pages: 0 Total Links (0% of all Links)
* pro-Biden Groups: 38 Total Links (0.02% of all links) 
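
The counts and percentages above come from flagging each link's domain against the junk news list. A minimal sketch of that calculation follows, using a made-up data frame and a two-domain subset of the list; the object names are assumptions for illustration only.

```r
# Two domains taken from the junk news list, for illustration
junk_domains <- c("gellerreport.com", "neonnettle.com")

# Made-up link data: one row per shared link
links <- data.frame(
  domain = c("gellerreport.com", "nytimes.com", "neonnettle.com", "cnn.com"),
  stringsAsFactors = FALSE
)

# Flag junk links, then count them and express them as a share of all links
is_junk <- links$domain %in% junk_domains
junk_links <- sum(is_junk)
junk_share <- round(100 * mean(is_junk), 1)

# The same arithmetic applied to the reported pro-Trump Pages figures
reported_share <- round(100 * 248 / 50222, 1)
print(c(junk_links = junk_links, junk_share = junk_share, reported_share = reported_share))
```

In the toy data, two of four links are flagged. The last line reproduces the roughly 0.5% share reported for pro-Trump Pages.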

```{r, echo=FALSE, eval=FALSE, warning=FALSE}
#All Junk News Links 
dtpages_junk_dom %>% select(Page.Name, Created, domain, Link.Text, Page.Likes.at.Posting) %>% filter(domain %in% fakenews) %>% group_by(Link.Text) %>% arrange(Created) %>% mutate(reach=cumsum(Page.Likes.at.Posting), index=seq_along(Link.Text)) %>% arrange(Link.Text, index)

jbpages_junk_dom %>% select(Page.Name, Created, domain, Link.Text, Page.Likes.at.Posting) %>% filter(domain %in% fakenews) %>% group_by(Link.Text) %>% arrange(Created) %>% mutate(reach=cumsum(Page.Likes.at.Posting), index=seq_along(Link.Text)) %>% arrange(Link.Text, index)
```

In the **pro-Trump Pages Ecosystem**, there were 213 unique links shared from 10 "junk news" websites. Of the 213 links, 163 came from the Geller Report, which according to FactCheck.org "is run by Pamela Geller, an anti-Islamic activist." Other "junk news" sources that were shared included:

* "puppetstringnews.com", which per its homepage was "created by a US NAVY vet, who served four years in the military, got out because he was tired of the world and it’s current state. Decided to get into alternative news to tell the truth to best of ability."
* "neonnettle.com", which is a British website with an "About Us" page that states: "we believe the mainstream has become less valid as it continues its ongoing practices of censorship and engineered narratives."
* "newspunch.com", which FactCheck.org says "is the new site for Yournewswire.com, which has been a prolific poster of misinformation and conspiracy theories."

The links from "junk news" websites that were most shared within the pro-Trump Pages ecosystem were as follows:

```{r, echo=FALSE, eval=TRUE, warning=FALSE}
#Most Shared "Junk News" Links (n > 1)

#dtpages_junk_dom %>% filter(domain %in% fakenews) %>% group_by(domain) %>% count() %>% arrange(desc(n)) %>% head(25)
oddball <- c("Pamela Geller", "| Neon Nettle", 
"Conservative Post", "Angel", "Dennis", "Larry", "Lee", "Show Archives", "Brian Kolfage")
dtp_dt <- dtpages_junk_dom %>% select(Page.Name, Created, domain, Link.Text, Page.Likes.at.Posting) %>% filter(domain %in% fakenews) %>% filter(!Link.Text %in% oddball) %>% group_by(Link.Text, domain) %>% count() %>% arrange(desc(n))

#dtp_dt %>% group_by(domain) %>% count() %>% arrange(desc(n))

datatable(dtp_dt, options=list(order=list(list(3, 'desc'))))
```

A text analysis of the Descriptions accompanying the links highlights the dominant topics of the "junk news" shared: borders and Islam. Mentions of Islam are correlated with prominent Democratic congresswomen Ilhan Omar and Rashida Tlaib as well as sharia and jihad. 

```{r, echo=FALSE, eval=FALSE, warning=FALSE}
dtpages_junk_dom %>% filter(domain %in% fakenews) %>% select(Link.Text, Description) %>% head(100)

dtpages_junk_linktext <- dtpages_junk_dom %>% filter(domain %in% fakenews) %>% group_by(Page.Name) %>% ungroup() %>% mutate(id=row_number(), Description=as.character(Description)) %>% select(id, Description)
dtpages_junk_linktext <- text_clean(data=dtpages_junk_linktext, text=Description)
dtpages_junk_linktokens <- text_tokenize(data=dtpages_junk_linktext, text=text, stopwords=NULL, n=1) 
#dtpages_linktokens2 <- text_tokenize(data=dtpages_linktext, text=text, stopwords=NULL, n=2) 
dtpages_junk_linkcorrelation <- text_correlation(data=dtpages_junk_linktokens, ngram=ngram, top=20, mentions=5)

#Clean Text for Words That Are Not Relevant
dtpages_junk_linkcorrelation %>% group_by(topic) %>% top_n(n=10, wt=correlation) %>% arrange(topic)
dtpages_junk_linktopics_clean <- dtpages_junk_linkcorrelation %>% filter(!topic %in% c("nettle", "neon", "true", "pundit") | !mention %in%  c("nettle", "neon", "true", "pundit")) 
#dtpages_linkcorrelation2 %>% group_by(topic) %>% top_n(n=10, wt=correlation) %>% arrange(topic)
#Reduce Topics & Prep For Plotting
dtpages_junk_linktopics <- topic_reduce(dtpages_junk_linktopics_clean, topic=topic, mentions=10)
```
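
The word correlations used in this text analysis measure how often two words appear in the same Description relative to how often they appear apart. For presence/absence indicators, the Pearson correlation is the phi coefficient. The sketch below computes it in base R on made-up token lists. It illustrates the idea only and is not the `pairwise_cor()` pipeline used in the notebook.

```r
# Made-up tokenized descriptions (one character vector per description)
docs <- list(
  c("omar", "sharia", "jihad"),
  c("omar", "sharia"),
  c("border", "wall"),
  c("border", "wall", "omar")
)

# Build a description-by-word presence matrix (1 = word appears)
vocab <- sort(unique(unlist(docs)))
presence <- t(sapply(docs, function(words) as.integer(vocab %in% words)))
colnames(presence) <- vocab

# Pearson correlation between word columns (phi coefficient for 0/1 data)
word_cor <- cor(presence)
print(round(word_cor, 2))
```

Words that always appear together (here "border" and "wall") get a correlation of 1, while words that tend to appear in different descriptions get negative values.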

```{r, echo=FALSE, warning=FALSE, eval=TRUE}
set.seed(2019)
dtpages_junk_linktopics %>%
  filter(correlation > .13) %>%
  graph_from_data_frame() %>%
  ggraph(layout = "kk") +
  geom_edge_link(aes(edge_alpha = correlation), show.legend = FALSE) +
  geom_node_point(color = "red", size = 2) +
  geom_node_text(aes(label = name), repel = TRUE) +
  theme_void()

```

Of the 100 pages in the pro-Trump Pages ecosystem, 34 of them shared a "junk news" link. The most prolific sharer is "Wake Up America, the Original", which was responsible for nearly one-third of the "junk news" links and has more than 500,000 followers.     

```{r, echo=FALSE, eval=TRUE, warning=FALSE}
#Biggest Sharers of "Junk News Domains" - Arranged by number of posts shared then followers 
dtp_dt_leaders <- dtpages_junk_dom %>% select(Page.Name, Created, domain, Link.Text, Page.Likes.at.Posting) %>% filter(domain %in% fakenews) %>% group_by(Page.Name) %>% mutate(Posts=length(Page.Name), Followers=max(Page.Likes.at.Posting)) %>% select(Page.Name, Followers, Posts) %>% distinct() %>% arrange(desc(Posts), desc(Followers))

datatable(dtp_dt_leaders, options=list(order=list(list(3, 'desc'))))


```
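
For readers who want to see how a "most prolific sharer" ranking of this kind can be derived, here is a small base R sketch on made-up data. The page names, follower counts, and column names are hypothetical; the sketch counts junk news posts per page, takes the maximum follower count, and reports each page's share of all junk news posts.

```r
# Made-up junk news posts: one row per post
junk_posts <- data.frame(
  page = c("Page A", "Page A", "Page A", "Page B", "Page C", "Page C"),
  followers = c(500, 510, 520, 90, 40, 45),
  stringsAsFactors = FALSE
)

# Largest follower count observed for each page
leaders <- aggregate(followers ~ page, data = junk_posts, FUN = max)

# Number of junk news posts per page and share of all junk news posts
post_counts <- table(junk_posts$page)
leaders$posts <- as.integer(post_counts[as.character(leaders$page)])
leaders$share <- round(100 * leaders$posts / sum(leaders$posts), 1)

# Rank by posts, then by followers
leaders <- leaders[order(-leaders$posts, -leaders$followers), ]
print(leaders)
```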

Only nine junk news links were shared by more than one page, so it is difficult to assess whether coordinated behavior among pages is widespread. However, a social network analysis of the "junk news" links that were shared suggests coordination among four pro-Trump pages to amplify content:

+ Silence is Consent - US Border Invasion
+ Stand With Illinois Against The Illegal Alien Invasion
+ Stand With North Dakota Against The Illegal Alien Invasion
+ Stand With South Dakota Against The Illegal Alien Invasion

```{r, echo=FALSE, warning=FALSE, eval=FALSE, include=FALSE}
dtpages_junk_facebook <- dtpages_junk_dom %>% filter(domain %in% fakenews) %>% ungroup() %>% select(Link.Text, Page.Name, Created)
dtpages_junk_facebook_network <- dtpages_junk_facebook %>% group_by(Link.Text) %>% arrange(Created) %>% mutate(first=Page.Name, second=lead(Page.Name,1), index=row_number()) %>% filter(second != "" & first != second) %>% ungroup(Link.Text) %>% select(first, second) %>% mutate(second=as.character(second))
#dtpages_junk_facebook_network %>% head(20); #dim(dtgroups_junk_facebook_network)
dtpages_junk_counts <- dtpages_junk_facebook_network %>% count(first) %>% arrange(desc(n))
#dtpages_junk_counts %>% head(20)
dtpages_junk_correlations <- dtpages_junk_facebook_network %>% semi_join(dtpages_junk_counts)
#dtpages_junk_correlations %>% head(20)
dtpages_junk_correlations1 <- dtpages_junk_correlations %>% pairwise_cor(first, second, sort=T, upper=F)
dtpages_junk_correlations1
```
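
The network step above pairs each page that shared a link with the next page to share the same link. As a simplified illustration, the base R sketch below builds those "first, second" pairs from made-up data and counts how often each pair occurs. It stops at pair counts and does not reproduce the correlation step; all names and values are hypothetical.

```r
# Made-up shares: which page posted which link, and when
shares <- data.frame(
  link = c("A", "A", "A", "B", "B"),
  page = c("Page 1", "Page 2", "Page 3", "Page 1", "Page 2"),
  created = as.POSIXct(c("2019-10-01 10:00:00", "2019-10-01 10:00:30",
                         "2019-10-01 10:05:00", "2019-10-02 09:00:00",
                         "2019-10-02 09:00:20"), tz = "UTC"),
  stringsAsFactors = FALSE
)
shares <- shares[order(shares$link, shares$created), ]

# Within each link, pair every sharer with the next page to share it
edges <- do.call(rbind, lapply(split(shares, shares$link), function(g) {
  if (nrow(g) < 2) return(NULL)
  data.frame(first = head(g$page, -1), second = tail(g$page, -1),
             stringsAsFactors = FALSE)
}))

# Drop self-pairs and count how often each ordered pair occurs
edges <- edges[edges$first != edges$second, ]
edge_counts <- aggregate(n ~ first + second, data = transform(edges, n = 1L), FUN = sum)
print(edge_counts)
```

Pairs that recur across many links are the ones worth inspecting for possible coordination; a single co-occurrence says little on its own.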

```{r, echo=FALSE, warning=FALSE, eval=FALSE}
set.seed(2019)
dtpages_junk_correlations1 %>%
  filter(correlation > .15) %>%
  graph_from_data_frame(directed=T) %>%
  ggraph(layout = "kk") +
  geom_edge_link(aes(edge_alpha = correlation), show.legend = FALSE) +
  geom_node_point(color = "red", size = 2) +
  geom_node_text(aes(label = name), repel = TRUE) +
  theme_void()

#dtpages_junk_correlations1 %>%
#  filter(correlation > .2) %>%
#  graph_from_data_frame(directed=T) %>%
#  ggraph(layout = "linear") +
#  geom_edge_arc(aes(alpha=correlation), color="gray", width=0.5) +
  #geom_edge_arc(edge_colour="black", edge_alpha=0.2, edge_width=0.3, fold=T) +#..index..)) +
  #geom_edge_link(aes(edge_alpha = correlation), show.legend = FALSE) +
#  geom_node_point(color = "red", size = 1) +
#  geom_node_text(aes(label=name), angle=90, hjust=0.1, nudge_y = -0.5, size=1, color="red") +
#  theme(axis.title.x = element_blank(),
#          axis.text.x = element_blank(),
#          axis.title.y = element_blank(),
#          axis.text.y = element_blank(),
#          legend.title= element_blank(),
#          legend.text = element_blank(),
#          legend.position = "none",
#          plot.title = element_text(face = "bold", size=8),
#          plot.subtitle = element_text(face = "bold", size=8),
#          plot.caption = element_text(face = "italic", size=8),
#          panel.grid.minor=element_blank(),
#          panel.grid.major=element_blank(),
#          panel.background = element_blank(), 
#          axis.line = element_blank(),
#          axis.ticks = element_blank())
```

A time-series analysis indicates that the "Stand With [blank] Against the Illegal Alien Invasion" pages are clearly coordinating to share junk news links related to immigration. The x-axis represents the time elapsed between posts (in seconds), so the more vertical the line, the shorter the time between posts of the same link. A less substantial trend (given so few data points) appears for junk news links that reference Jeffrey Epstein and Elijah Cummings, suggesting that the "Trump 'The People's President'" page may be an influencer of at least two other pages: "Republicans For Trump" and "Trump, American Patriot."

```{r, echo=FALSE, warning=FALSE, eval=TRUE}
#, elapsed=ifelse(is.na(elapsed), 0, elapsed)) %>% arrange(Link.Text, index)  
dtpages_junk_plot <- dtpages_junk_dom %>% select(Page.Name, Created, domain, Link.Text, Page.Likes.at.Posting) %>% filter(domain %in% fakenews & !Link.Text %in% c("Pamela Geller", "Conservative Post")) %>% group_by(Link.Text, domain) %>% arrange(Created) %>% mutate(reach=cumsum(Page.Likes.at.Posting), index=seq_along(reach), n=max(index)) %>% filter(n > 1) %>% mutate(Article=ifelse(index==n, as.character(Link.Text), "")) 

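#Use difftime() with explicit units so elapsed time is always in seconds (plain subtraction lets R choose the units, which can differ between links); assumes Created is a date-time
dtpages_junk_plot <- dtpages_junk_plot %>% group_by(Link.Text) %>% arrange(Created) %>% mutate(previous=lag(Created, 1), elapsed=as.numeric(difftime(Created, previous, units="secs")), elapsed=ifelse(is.na(elapsed), 0, elapsed), time=cumsum(elapsed), Name=Page.Name)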
#dtpages_junk_plot

dtpages_junk_plot %>% ggplot(aes(x=time, y=reach)) +
  geom_jitter(aes(color=domain)) + 
  geom_path(aes(colour=domain), size=0.5, alpha=0.5) +
  geom_text(aes(x=time, y=reach, label=Name), size=1, hjust="inward") +
  #xlim(0, 5000000) +
  labs(x="Time Elapsed (in Seconds)", y="Followers", colour="Domain") +  
  facet_wrap(~ Link.Text, scales="free_y", labeller = label_wrap_gen(width = 75, multi_line =T)) +
  scale_x_continuous(labels = comma) +
  scale_y_continuous(labels = comma) +
  theme(axis.title.x = element_text(face = "bold", size=4),
          axis.text.x = element_text(face = "bold", size=4),
          axis.title.y = element_text(face = "bold", size=4),
          axis.text.y = element_text(face = "bold", size=4),
          strip.text = element_text(size = 3),
          #panel.margin.y = unit(-3, 'cm'),
          legend.title=element_text(face = "bold", size=4),
          legend.text = element_text(face = "bold", size=4),
          legend.position = "none",
          plot.title = element_blank(),
          plot.subtitle = element_text(face = "bold", size=4),
          plot.caption = element_text(face = "italic", size=4),
          #panel.grid.minor=element_blank(),
          #panel.grid.major=element_blank(),
          panel.background = element_blank(), 
          #axis.line = element_blank(),
          #plot.margin = margin(-0.25, -0.25, -0.25, -0.25, "cm"),
          axis.ticks.y=element_blank()) 

```
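
The two quantities plotted in this time-series view are the seconds elapsed since a link's first share and the cumulative follower count ("reach") of the pages that have shared it so far. The base R sketch below computes both on made-up data; the column names and values are hypothetical, and timestamps are assumed to be `POSIXct`.

```r
# Made-up shares of two links by several pages
shares <- data.frame(
  link = c("A", "A", "A", "B", "B"),
  page = c("Page 1", "Page 2", "Page 3", "Page 1", "Page 2"),
  created = as.POSIXct(c("2019-10-01 10:00:00", "2019-10-01 10:00:30",
                         "2019-10-01 10:05:00", "2019-10-02 09:00:00",
                         "2019-10-02 09:00:20"), tz = "UTC"),
  followers = c(1000, 250, 4000, 1000, 250),
  stringsAsFactors = FALSE
)
shares <- shares[order(shares$link, shares$created), ]

# as.numeric() on POSIXct gives seconds since the epoch, so differences are in seconds
secs <- as.numeric(shares$created)
shares$elapsed <- ave(secs, shares$link, FUN = function(x) c(0, diff(x)))

# Running totals within each link: time since first share and cumulative reach
shares$time <- ave(shares$elapsed, shares$link, FUN = cumsum)
shares$reach <- ave(shares$followers, shares$link, FUN = cumsum)
print(shares[, c("link", "page", "elapsed", "time", "reach")])
```

A link whose reach climbs while `time` barely moves is one that several pages posted almost simultaneously, which corresponds to the near-vertical lines described above.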

In the **pro-Trump Groups Ecosystem**, there were 506 unique links shared from 16 different "junk news" websites. Of the 506 links, 129 were from "neonnettle.com." Multiple "junk news" websites found in the pro-Trump Pages ecosystem (e.g. "puppetstringnews.com", "newspunch.com", "gellerreport.com") were also widely shared in pro-Trump Groups. Three other "junk news" websites also featured prominently:

* "theconservativetreehouse.com", which notes on its "About Us" page that "fear is at the core of liberalism, and love/trust is at the core of conservatism. Liberalism is about control. Conservatism is about self-empowerment."
* "infowars.com", which is the website for the popular conspiracy theory radio show by the same name hosted by Alex Jones. 
* "rightwingtribune.com", which FactCheck.org says "doesn’t have a disclaimer, but it does say on one of at least five associated Facebook pages: 'If You Are A Right Winger And PISSED You’re In The Right Place. We Provide High Quality Journalism With Viral Memes and Video In Between! So Give Us A Like! You Won’t Be Dissapointed.'"

The links from "junk news" websites that were most shared within the pro-Trump Groups ecosystem were as follows:

```{r, echo=FALSE, eval=TRUE, warning=FALSE}
#Most Shared "Junk News" Links (n > 1)
#dtgroups_junk_dom_nofb %>% select(Group.Name, Created, domain, Link.Text, Members.at.Posting) %>% filter(domain %in% fakenews) %>% group_by(Link.Text) %>% arrange(Created) %>% mutate(reach=cumsum(Members.at.Posting), index=seq_along(Link.Text)) %>% arrange(Link.Text, index)

#dtgroups_junk_dom_nofb %>% select(Group.Name, Created, domain, Link.Text, Members.at.Posting) %>% filter(domain %in% fakenews) %>% group_by(Link.Text) %>% count() %>% arrange(desc(n))

#dtgroups_junk_dom %>% filter(domain %in% fakenews) %>% group_by(domain) %>% count() %>% arrange(desc(n)) %>% head(25)
oddball <- c("Pamela Geller", "| Neon Nettle", "Conservative Post", "Angel", "Dennis", "Larry", "Lee", "Show Archives", "Brian Kolfage", "Dina", "Dino", "Lisa", "Chon", "Merrill", "Pat", "Sarah", "Bill", "Daniel", "Elizabeth", "Micky", "Kimberley", "Ron", "Darlene", "Jerry", "John", "Mark", "Dan", "Frank", "Gayle", "Gene", "Gregg", "Karen",  "Linda", "Michealina", "Roxanne", "Theresa", "Alice", "Alicia", "Astrid", "Cathy", "Cher", "Clint", "David", "Daniel", "Denise", "Diane", "Donna", "Doug", "Gary", "Gennady", "Helen", "Holton", "James-Christina", "Jane", "Jennifer", "Jeri", "Jewel", "Jose",  "Katie", "Ken", "Kenneth", "Kevin", "Marie", "Mary", "Mel", "Michael",  "Monty", "Patrick", "Patty", "Ralph William", "Randy", "Senceria", "Salman", "Sharon", "Stacie", "Shankar", "Stephen", "Steven", "Steve", "Theresa", "Tommy", "Treg", "Walter")
dtg_dt <- dtgroups_junk_dom %>% select(Group.Name, Created, domain, Link.Text, Members.at.Posting) %>% filter(domain %in% fakenews) %>% filter(!Link.Text %in% oddball) %>% group_by(Link.Text, domain) %>% count() %>% arrange(desc(n))

#dtg_dt %>% group_by(domain) %>% count() %>% arrange(desc(n))

datatable(dtg_dt, options=list(order=list(list(3, 'desc'))))
```

A text analysis of the Descriptions accompanying the links highlights Ukraine as a key topic of conversation, specifically in relation to the Bidens. Other "junk news" topics relate to the ongoing impeachment inquiry and two prominent House Democrats: Speaker Nancy Pelosi and Intelligence Committee Chairman Adam Schiff. The Clintons are less prominent topics of the "junk news" circulating in pro-Trump groups.

```{r, include=FALSE}
#dtgroups_junk_dom %>% filter(domain %in% fakenews) %>% select(Link.Text, Description) %>% head(100)

dtgroups_junk_linktext <- dtgroups_junk_dom %>% filter(domain %in% fakenews & Link.Text !="Petition to ‘Impeach Nancy Pelosi for Treason’ Goes Viral") %>% group_by(Group.Name) %>% ungroup() %>% mutate(id=row_number(), Description=as.character(Description)) %>% select(id, Description)
dtgroups_junk_linktext <- text_clean(data=dtgroups_junk_linktext, text=Description)
dtgroups_junk_linktokens <- text_tokenize(data=dtgroups_junk_linktext, text=text, stopwords=NULL, n=1) 
#dtgroups_linktokens2 <- text_tokenize(data=dtgroups_linktext, text=text, stopwords=NULL, n=2) 
dtgroups_junk_linkcorrelation <- text_correlation(data=dtgroups_junk_linktokens, ngram=ngram, top=20, mentions=10)

#Clean Text for Words That Are Not Relevant
#dtgroups_junk_linkcorrelation %>% group_by(topic) %>% top_n(n=10, wt=correlation) %>% arrange(topic)
dtgroups_junk_linktopics_clean <- dtgroups_junk_linkcorrelation %>% filter(!topic %in% c("nettle", "neon", "news", "report") | !mention %in%  c("gateway", "pundit", "nettle", "neon", "true", "pundit")) 
#dtgroups_linkcorrelation2 %>% group_by(topic) %>% top_n(n=10, wt=correlation) %>% arrange(topic)
#Reduce Topics & Prep For Plotting
dtgroups_junk_linktopics <- topic_reduce(dtgroups_junk_linktopics_clean, topic=topic, mentions=10)

```

```{r, echo=FALSE, warning=FALSE, eval=TRUE}
set.seed(2019)
dtgroups_junk_linktopics %>%
  filter(correlation > .35) %>%
  graph_from_data_frame() %>%
  ggraph(layout = "kk") +
  geom_edge_link(aes(edge_alpha = correlation), show.legend = FALSE) +
  geom_node_point(color = "red", size = 2) +
  geom_node_text(aes(label = name), repel = TRUE) +
  theme_void()

```

Of the 226 groups in the pro-Trump Groups ecosystem, at least one "junk news" link was posted in 162 of them. Two large groups with the name "Drain The Swamp" are particularly rife with junk news links.

```{r, echo=FALSE, eval=TRUE, warning=FALSE}
#Biggest Sharers of "Junk News Domains" - Arranged by number of posts shared then followers 
dtg_dt_leaders <- dtgroups_junk_dom %>% select(Group.Name, Created, domain, Link.Text, Members.at.Posting) %>% filter(domain %in% fakenews) %>% group_by(Group.Name) %>% mutate(Posts=length(Group.Name), Members=max(Members.at.Posting)) %>% select(Group.Name, Members, Posts) %>% distinct() %>% arrange(desc(Posts), desc(Members))

datatable(dtg_dt_leaders, options=list(order=list(list(3, 'desc'))))

```

A social network analysis of "junk news" links posted within the pro-Trump Groups ecosystem indicates substantial overlap in "junk news" content from group to group, although it is unclear how much coordination, if any, exists.

Note: To protect the privacy of Facebook users, CrowdTangle does not show which specific user accounts posted to groups or whether a link was shared from a particular page. It is therefore not possible to determine definitively whether specific accounts are engaging in coordinated behavior to amplify content across groups. However, it is possible to see what content is shared within groups by timestamp, which provides some insight into whether users may be coordinating across groups.
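
One hedged way to use timestamps for this purpose is to flag shares of a link that follow the previous share of the same link within a short window. The sketch below does this in base R on made-up data. The 60-second window (`window_secs`) is an arbitrary assumption, not a threshold used in this analysis, and a flagged share is only a prompt for closer inspection, not evidence of coordination.

```r
# Made-up group shares of two links
group_shares <- data.frame(
  link = c("A", "A", "A", "B", "B"),
  group = c("Group 1", "Group 2", "Group 3", "Group 1", "Group 2"),
  created = as.POSIXct(c("2019-10-01 10:00:00", "2019-10-01 10:00:30",
                         "2019-10-01 10:05:00", "2019-10-02 09:00:00",
                         "2019-10-02 11:00:00"), tz = "UTC"),
  stringsAsFactors = FALSE
)

window_secs <- 60  # hypothetical threshold, chosen only for illustration

# Seconds since the previous share of the same link (NA for the first share)
ordered <- group_shares[order(group_shares$link, group_shares$created), ]
gap <- ave(as.numeric(ordered$created), ordered$link,
           FUN = function(x) c(NA, diff(x)))

# Flag shares that follow the previous one within the window
ordered$rapid_reshare <- !is.na(gap) & gap <= window_secs
rapid_by_link <- tapply(ordered$rapid_reshare, ordered$link, sum)
print(rapid_by_link)
```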

```{r, include=FALSE}
dtgroups_junk_facebook <- dtgroups_junk_dom %>% filter(domain %in% fakenews) %>% ungroup() %>% select(Link.Text, Group.Name, Created)
dtgroups_junk_facebook_network <- dtgroups_junk_facebook %>% group_by(Link.Text) %>% arrange(Created) %>% mutate(first=Group.Name, second=lead(Group.Name,1), index=row_number()) %>% filter(second != "" & first != second) %>% ungroup(Link.Text) %>% select(first, second) %>% mutate(second=as.character(second))
#dtpages_junk_facebook_network %>% head(20); #dim(dtgroups_junk_facebook_network)
dtgroups_junk_counts <- dtgroups_junk_facebook_network %>% count(first) %>% arrange(desc(n))
#dtpages_junk_counts %>% head(20)
dtgroups_junk_correlations <- dtgroups_junk_facebook_network %>% semi_join(dtgroups_junk_counts)
dtgroups_junk_correlations %>% head(20)
dtgroups_junk_correlations1 <- dtgroups_junk_correlations %>% pairwise_cor(first, second, sort=T, upper=F)
```

```{r, echo=FALSE, warning=FALSE, eval=TRUE}
set.seed(2019)
dtgroups_junk_correlations1 %>%
  filter(correlation > .2) %>%
  graph_from_data_frame(directed=T) %>%
  ggraph(layout = "linear") +
  geom_edge_arc(aes(alpha=correlation), color="gray", width=0.5) +
  #geom_edge_arc(edge_colour="black", edge_alpha=0.2, edge_width=0.3, fold=T) +#..index..)) +
  #geom_edge_link(aes(edge_alpha = correlation), show.legend = FALSE) +
  geom_node_point(color = "red", size = 1) +
  geom_node_text(aes(label=name), angle=90, hjust=1, nudge_y = -0.5, size=1, color="red") +
  theme(axis.title.x = element_blank(),
          axis.text.x = element_blank(),
          axis.title.y = element_blank(),
          axis.text.y = element_blank(),
          legend.title= element_blank(),
          legend.text = element_blank(),
          legend.position = "none",
          plot.title = element_text(face = "bold", size=8),
          plot.subtitle = element_text(face = "bold", size=8),
          plot.caption = element_text(face = "italic", size=8),
          panel.grid.minor=element_blank(),
          panel.grid.major=element_blank(),
          panel.background = element_blank(), 
          axis.line = element_blank(),
          axis.ticks = element_blank())

```
```{r, include=FALSE}
dtgroups_junk_plot <- dtgroups_junk_dom %>% select(Group.Name, Created, domain, Link.Text, Members.at.Posting) %>% filter(domain %in% fakenews & !Link.Text %in% c("Sarah", "Dino", "Pat", "Lisa")) %>% group_by(Link.Text, domain) %>% arrange(Created) %>% mutate(reach=cumsum(Members.at.Posting), index=seq_along(reach), n=max(index)) %>% filter(n > 1) %>% mutate(Article=ifelse(index==n, as.character(Link.Text), "")) 

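# Request seconds explicitly: subtracting date-times returns a difftime whose units are chosen automatically
dtgroups_junk_plot <- dtgroups_junk_plot %>% group_by(Link.Text) %>% arrange(Created) %>% mutate(previous=lag(Created, 1), elapsed=as.integer(difftime(Created, lag(Created, 1), units="secs")), elapsed=ifelse(is.na(elapsed), 0L, elapsed), time=cumsum(elapsed), Name=Group.Name)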
```
A further analysis of nine of the top 10 junk news links shared in pro-Trump Groups suggests little coordination. Instead, the sharing of junk news links, while pervasive, appears to be largely organic. However, because the identities of group members are protected, it is not possible to say definitively whether specific group members are coordinating.
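
As a rough sketch of the quantities behind this kind of analysis, the example below computes, for a single link, the cumulative membership of the groups it was posted in and the seconds elapsed since its first post. The column names follow the CrowdTangle export used in this report; the rows and the names `posts` and `spread` are invented. Subtracting date-times in R yields a `difftime` whose units are chosen automatically, so the sketch requests seconds explicitly:

```r
library(dplyr)

# Hypothetical toy posts for a single link; all values are invented.
posts <- tibble(
  Link.Text          = "Story A",
  Group.Name         = c("Group 1", "Group 2", "Group 3"),
  Created            = as.POSIXct(c("2019-06-01 10:00:00", "2019-06-01 10:00:30",
                                    "2019-06-03 10:00:00"), tz = "UTC"),
  Members.at.Posting = c(1000, 5000, 250)
)

spread <- posts %>%
  group_by(Link.Text) %>%
  arrange(Created, .by_group = TRUE) %>%
  mutate(
    reach   = cumsum(Members.at.Posting),                      # running total of group members
    elapsed = as.numeric(difftime(Created, lag(Created), units = "secs")),
    elapsed = coalesce(elapsed, 0),                            # first post has no predecessor
    time    = cumsum(elapsed)                                  # seconds since the first post
  ) %>%
  ungroup()

spread %>% select(Group.Name, reach, time)
```

Tight clusters of posts on the `time` axis would be one possible sign of coordination, while posts spread over hours or days look more organic. Note that `reach` is an upper bound, since a person who belongs to several groups is counted once per group.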

```{r, echo=FALSE, warning=FALSE, eval=TRUE}
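# Rank links by number of posts and keep the 10 most-posted
topdtjunk <- dtgroups_junk_plot %>% group_by(Link.Text) %>% count() %>% arrange(desc(n)) %>% head(10)
# Drop the first-ranked row, leaving nine of the top 10 links
topdtjunk <- topdtjunk[-1,]
topdtjunk <- topdtjunk$Link.Text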

dtgroups_junk_plot %>% filter(Link.Text %in% topdtjunk) %>% ggplot(aes(x=time, y=reach)) +
  geom_jitter(aes(color=domain)) + 
  geom_path(aes(colour=domain), size=0.5, alpha=0.5) +
  geom_text(aes(x=time, y=reach, label=Name), size=1, hjust="inward") +
  #xlim(0, 5000000) +
  labs(x="Time Elapsed (in Seconds)", y="Cumulative Group Members", colour="Domain") +  
  facet_wrap(~ Link.Text, scales="free_y", labeller = label_wrap_gen(width = 75, multi_line =T)) +
  scale_x_continuous(labels = comma) +
  scale_y_continuous(labels = comma) +
  theme(axis.title.x = element_text(face = "bold", size=4),
          axis.text.x = element_text(face = "bold", size=4),
          axis.title.y = element_text(face = "bold", size=4),
          axis.text.y = element_text(face = "bold", size=4),
          strip.text = element_text(size = 3),
          #panel.margin.y = unit(-3, 'cm'),
          legend.title=element_text(face = "bold", size=4),
          legend.text = element_text(face = "bold", size=4),
          legend.position = "none",
          plot.title = element_blank(),
          plot.subtitle = element_text(face = "bold", size=4),
          plot.caption = element_text(face = "italic", size=4),
          #panel.grid.minor=element_blank(),
          #panel.grid.major=element_blank(),
          panel.background = element_blank(), 
          #axis.line = element_blank(),
          #plot.margin = margin(-0.25, -0.25, -0.25, -0.25, "cm"),
          axis.ticks.y=element_blank()) 

```

In the **pro-Biden Groups Ecosystem**, there were 26 unique links shared from seven different "junk news" domains, all of which appear in the pro-Trump Pages or pro-Trump Groups ecosystems. Of the 26 links, 13 were from "neonnettle.com", seven were from "newspunch.com", two were from "yournewswire.com", and one each came from "conservativepost.com", "gellerreport.com", "infowars.com", and "puppetstringnews.com".
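
The per-domain counts listed above can be tallied as a quick consistency check. The counts are taken directly from the paragraph; the variable names are illustrative only:

```r
# Per-domain link counts as stated in the paragraph above.
links_by_domain <- c(
  "neonnettle.com"       = 13,
  "newspunch.com"        = 7,
  "yournewswire.com"     = 2,
  "conservativepost.com" = 1,
  "gellerreport.com"     = 1,
  "infowars.com"         = 1,
  "puppetstringnews.com" = 1
)

total_links <- sum(links_by_domain)     # 26
n_domains   <- length(links_by_domain)  # 7

# Share of all links contributed by each domain, in percent
round(100 * links_by_domain / total_links, 1)
```

Half of the links come from a single domain, "neonnettle.com".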

The links from "junk news" websites that were most shared within the pro-Biden Groups ecosystem were as follows:

```{r, echo=FALSE, warning=FALSE, eval=TRUE}
#All Junk News Links 
#jbgroups_junk_dom_nofb %>% select(Group.Name, Created, domain, Link.Text, Members.at.Posting) %>% filter(domain %in% fakenews) %>% group_by(Link.Text) %>% filter(!Group.Name %in% c("We Are Q", "UNITED DEMOCRATS, REPUBLICANS & LIBERTARIANS 4 TRUMP (taking back the USA)")) %>% arrange(Created) %>% mutate(reach=cumsum(Members.at.Posting), index=seq_along(Link.Text)) %>% arrange(Link.Text, index)

oddball <- c("Pamela Geller", "| Neon Nettle", "Conservative Post", "Angel", "Dennis", "Larry", "Lee", "Show Archives", "Brian Kolfage", "Dina", "Dino", "Lisa", "Chon", "Merrill", "Pat", "Sarah", "Bill", "Daniel", "Elizabeth", "Micky", "Kimberley", "Ron", "Darlene", "Jerry", "John", "Mark", "Dan", "Frank", "Gayle", "Gene", "Gregg", "Karen",  "Linda", "Michealina", "Roxanne", "Theresa", "Alice", "Alicia", "Astrid", "Cathy", "Cher", "Clint", "David", "Daniel", "Denise", "Diane", "Donna", "Doug", "Gary", "Gennady", "Helen", "Holton", "James-Christina", "Jane", "Jennifer", "Jeri", "Jewel", "Jose",  "Katie", "Ken", "Kenneth", "Kevin", "Marie", "Mary", "Mel", "Michael",  "Monty", "Patrick", "Patty", "Ralph William", "Randy", "Senceria", "Salman", "Sharon", "Stacie", "Shankar", "Stephen", "Steven", "Steve", "Theresa", "Tommy", "Treg", "Walter", "Willis")
jbg_dt <- jbgroups_junk_dom %>% select(Group.Name, Created, domain, Link.Text, Members.at.Posting) %>% filter(domain %in% fakenews) %>% filter(!Link.Text %in% oddball & !Group.Name %in% c("We Are Q", "UNITED DEMOCRATS, REPUBLICANS & LIBERTARIANS 4 TRUMP (taking back the USA)")) %>% group_by(Link.Text, domain) %>% count() %>% arrange(desc(n))

#jbgroups_junk_dom %>% select(Group.Name, Created, domain, Link.Text, Members.at.Posting) %>% filter(domain %in% fakenews) %>% filter(!Link.Text %in% oddball & !Group.Name %in% c("We Are Q", "UNITED DEMOCRATS, REPUBLICANS & LIBERTARIANS 4 TRUMP (taking back the USA)")) %>% ungroup() %>% select(domain, Link.Text) %>% distinct() %>% group_by(domain) %>% count() %>% arrange(desc(n))

datatable(jbg_dt, options=list(order=list(list(3, 'desc'))))
```

Of the 77 groups in the pro-Biden Groups ecosystem, at least one "junk news" link was posted in 18 of them. Most links were posted only once, in a single group. In six groups, two "junk news" links were found. The outlier is a group named "Biden for president", in which 13 different "junk news" links were found.

```{r, echo=FALSE, eval=TRUE, warning=FALSE}
#Biggest Sharers of "Junk News Domains" - Arranged by number of posts shared then followers 
jbg_dt_leaders <- jbgroups_junk_dom %>% select(Group.Name, Created, domain, Link.Text, Members.at.Posting) %>% filter(domain %in% fakenews & !Link.Text %in% oddball & !Group.Name %in% c("We Are Q", "UNITED DEMOCRATS, REPUBLICANS & LIBERTARIANS 4 TRUMP (taking back the USA)")) %>% group_by(Group.Name) %>% mutate(Posts=length(Group.Name), Members=max(Members.at.Posting)) %>% select(Group.Name, Members, Posts) %>% distinct() %>% arrange(desc(Posts), desc(Members))

datatable(jbg_dt_leaders, options=list(order=list(list(3, 'desc'))))

```

The "junk news" links in the "Biden for president" group point to anti-Biden, anti-Democrat, anti-immigrant and anti-Islam content, which suggests that "Biden for president" is actually an anti-Biden group.

```{r, include=FALSE}
jbg_dt_b4p <- jbgroups_junk_dom %>% filter(domain %in% fakenews & Group.Name=="Biden for president" & !Link.Text=="Ralph William") %>% arrange(Created) %>% select(Link.Text, Link) 
```

```{r, echo=FALSE, eval=TRUE, warning=FALSE}
datatable(jbg_dt_b4p, options=list(order=list(list(1, 'desc'))))

```
