library(tidyverse)
library(rtweet)
library(gridExtra)

1 Introduction

In this document, I will scrape and analyse data from Twitter accounts which threatened British politicians. This document draws on publicly available Twitter data which were valid as of 25-26/10/2018, when the data collection was completed.

2 Data I/O

In this section, we will import and skim data sets that will be used in this document.

2.1 Threat-maker Accounts

Below is a list of accounts that threatened British politicians, provided by Mark. The table has 50 rows, but some accounts appear more than once because they threatened more than one politician. Overall, there are 45 unique accounts.

abusers_list <- read_csv("data/abusers_list.csv")
abusers_list

2.2 Timelines of Threat-makers

We scraped the timelines of threat-makers (referred to as abusers in the code, as Twitter's T&Cs classify such behaviour as ‘abuse’), which means we captured up to 3,200 statuses posted to the timeline of each specified Twitter user. Four things are worth noting here:

1. Retweets are included in this data set.
2. The number of tweets captured differs for each account, as some use Twitter more actively than others. For instance, if an account has posted 2000 tweets to date, only 2000 tweets will be captured for that account. For accounts which have more than 3200 tweets, our method captures the last 3200 tweets.
3. In relation to the second point, how far back the tweet collection goes for each account depends on a) the account’s creation date and b) the account’s tweet and retweet rate. For example, if an account was created in 2011 and has posted 3000 tweets to date, our collection method captures all publicly available tweets published by that account from 2011 to date. However, if another account was created in 2016 and has published 37000 tweets to date, our method returns only the last 3200 publicly available tweets. Those 3200 tweets may span a month, a year, or even ten years, depending on how actively the account tweets.
4. It is not possible to scrape the timelines of all accounts. Some users delete their accounts, some are suspended or removed by Twitter, and some go protected.
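The effect of the 3,200-status cap described above can be sketched in a few lines. The helper name `captured_tweets()` is hypothetical and only illustrates the arithmetic; the collection itself would have been done with an API call (shown as a comment, with `n = 3200` assumed from the text).

```r
# Minimal sketch of the timeline cap, assuming the limit of 3,200 statuses
# per account that is described in the text.
timeline_cap <- 3200

# Hypothetical helper: number of tweets that can be captured for accounts
# that have posted `total_tweets` statuses to date.
captured_tweets <- function(total_tweets, cap = timeline_cap) {
  pmin(total_tweets, cap)
}

# The three cases used as examples in the text.
captured_tweets(c(2000, 3000, 37000))

# Not run (needs API credentials); assumed shape of the collection call:
# rtweet::get_timeline(user = "some_screen_name", n = 3200)
```

Accounts that are deleted, suspended, or protected return nothing at all, which is why the number of accounts in the data set is smaller than the number of accounts in the list.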

abusers_timelines <- read_twitter_csv("data/abusers_timelines.csv") %>% 
  distinct(status_id, .keep_all=T)
abusers_timelines %>% nrow()

## [1] 79232

Our collection returned 79232 tweets in total, which can be found in the ‘abusers_timelines.csv’ file. Let’s briefly inspect this data set. The number of captured tweets and the dates of the oldest and newest tweets in the data set are presented below for each account.

abusers_timelines %>% 
  group_by(screen_name) %>% 
  arrange(created_at) %>%
  summarise(number_of_tweets=n(), first_tweet=first(created_at), last_tweet=last(created_at)) %>% 
  arrange(desc(number_of_tweets)) # tweets from 29 accounts only (30 active-1 protected)

As one can see above, we managed to collect tweets from only 29 accounts. Out of 45 unique threat-makers, 29 are still active on Twitter, one is protected (hence their data are not publicly available), and 15 are no longer active on Twitter. As the recent dates in the last_tweet column show, most accounts are still using Twitter actively, although some seem to have stopped tweeting a few months ago. The values in the first_tweet column show that most accounts post very frequently (their 3200 tweets cover a period of only a month or less). However, some accounts tweet less frequently, as their first tweets in the data set date from 2010, 2013, 2015, etc.
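The reasoning about tweeting frequency can be made explicit by dividing the number of captured tweets by the number of days they cover. The sketch below uses invented accounts and dates; the columns `days_covered` and `tweets_per_day` are assumptions added for illustration and are not part of the original data set.

```r
library(dplyr)
library(tibble)

# Toy stand-in for abusers_timelines; accounts and dates are invented.
toy_timelines <- tibble(
  screen_name = c("account_a", "account_a", "account_a", "account_b", "account_b"),
  created_at = as.POSIXct(
    c("2018-10-01 09:00:00", "2018-10-11 09:00:00", "2018-10-21 09:00:00",
      "2013-10-21 09:00:00", "2018-10-21 09:00:00"),
    tz = "UTC"
  )
)

tweet_rates <- toy_timelines %>%
  group_by(screen_name) %>%
  summarise(
    number_of_tweets = n(),
    first_tweet = min(created_at),
    last_tweet = max(created_at)
  ) %>%
  mutate(
    # Time span covered by the captured tweets, in days.
    days_covered = as.numeric(difftime(last_tweet, first_tweet, units = "days")),
    # Average posting rate; Inf when all tweets share one timestamp.
    tweets_per_day = number_of_tweets / days_covered
  )

tweet_rates
```

A high `tweets_per_day` with a short `days_covered` corresponds to the very active accounts described above, while a long `days_covered` indicates an account whose whole history fits under the cap.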

2.3 Threat-makers Account Info

Let’s check the account information of threat-makers. Below is a data set of publicly available Twitter account information belonging to the threat-makers. The same information is also available in the csv file ‘abusers_list_info.csv’.

abusers_list_info  <- read_csv("data/abusers_list_info.csv")
abusers_list_info

As one can see, most accounts express their political opinions explicitly on their profiles (see the name and description columns; the hashtags are especially informative). Most accounts self-report their location as England.

2.4 Followers and Friends of Threat-makers

Using the Twitter API, we scraped the followers and friends (i.e. accounts followed by the threat-makers) of threat-makers.

The total number of followers was 75009. This data set is available as a csv file, ‘abusers_followers_with_info.csv’.

abusers_followers_with_info <- read_twitter_csv("data/abusers_followers_with_info.csv") %>% 
  filter(!is.na(abuser))
abusers_followers_with_info %>% nrow()

## [1] 75009

On the other hand, the total number of friends (i.e. accounts followed by threat-maker accounts) is 76234. This data set is available as a csv file, ‘abusers_friends_with_info.csv’.

abusers_friends_with_info <- read_twitter_csv("data/abusers_friends_with_info.csv") %>% 
  filter(!is.na(user))
abusers_friends_with_info %>% nrow()

## [1] 76234

3 Analysis

3.1 Hashtags of Interest

Using the timelines data set (see the Timelines of Threat-makers section above), we matched tweet texts against a pattern built from the following hashtags.

“#BrexitLordsBetrayal”
“#LordsOfBrexit”
“#BrexitMeansBrexit”
“#StandUp4Brexit”
“#ChequersmeansCorbyn”

hashtag_pattern <- c("BrexitLordsBetrayal","LordsOfBrexit","BrexitMeansBrexit","StandUp4Brexit","ChequersmeansCorbyn")

hashtag_subset <- abusers_timelines %>% 
  filter(str_detect(string = text, regex( paste(hashtag_pattern,collapse = "|"),ignore_case = T)))
hashtag_subset %>% nrow() # 1147

## [1] 1147

This pattern matched 1147 tweets in the timelines. We can also check which accounts pushed these hashtags more frequently in their timelines.

hashtag_subset %>% 
  add_count(screen_name) %>% 
  select(screen_name, n) %>% 
  arrange(desc(n)) %>%
  distinct(screen_name, .keep_all = T) %>% 
  mutate(screen_name = fct_reorder(screen_name, n, .desc = F)) %>%
  ggplot(aes(screen_name, n)) +
  # geom_bar(stat = 'identity')+
  geom_col()+
  coord_flip()+
  hrbrthemes::theme_ipsum_rc()+
  labs(title="Who is Pushing the Hashtags of Interest More?",
       subtitle="Hashtags of interest: #BrexitLordsBetrayal, #LordsOfBrexit, #BrexitMeansBrexit,\n#StandUp4Brexit,#ChequersmeansCorbyn",
       caption="Social Data Science Lab, Cardiff University",
       x="Twitter Handle",
       y="Number of Hashtags Used")+
  geom_text(aes(label =n, hjust=-0.3)) +
  theme(plot.caption = element_text(size = 12))

ggsave(filename = "viz/hashtag_abuser_plot.pdf",device = cairo_pdf, width =9 , height =6 ,dpi = 600,scale =1.5)
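The per-account tally in the plotting code above (add_count(), select(), arrange(), distinct()) can also be expressed with a single dplyr::count() call. The sketch below checks that the two give the same table on a made-up data set; the screen names are hypothetical.

```r
library(dplyr)
library(tibble)

# Toy stand-in for hashtag_subset; screen names are invented.
toy_subset <- tibble(
  screen_name = c("user_a", "user_b", "user_a", "user_c", "user_a", "user_b")
)

# Tally as written in the plotting pipeline above.
tally_long <- toy_subset %>%
  add_count(screen_name) %>%
  select(screen_name, n) %>%
  arrange(desc(n)) %>%
  distinct(screen_name, .keep_all = TRUE)

# Equivalent tally with count(); sort = TRUE orders by descending n.
tally_short <- toy_subset %>%
  count(screen_name, sort = TRUE)

all.equal(as.data.frame(tally_long), as.data.frame(tally_short))
```

When several accounts share the same count, the two versions may list them in a different order, but this does not affect the plot because fct_reorder() sets the bar order.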

We can also see the frequency of each hashtag. As the plot below indicates, #StandUp4Brexit is much more frequent than the other hashtags of interest.

hashtag_subset %>% 
  mutate( freq_BrexitLordsBetrayal=str_count(text, "BrexitLordsBetrayal"),
          freq_LordsOfBrexit=str_count(text, "LordsOfBrexit"),
          freq_BrexitMeansBrexit=str_count(text, "BrexitMeansBrexit"),
          freq_StandUp4Brexit=str_count(text, "StandUp4Brexit"),
          freq_ChequersmeansCorbyn=str_count(text, "ChequersmeansCorbyn")
  ) %>% 
  select(screen_name, text, starts_with("freq_")) %>%
  summarise(sum_BrexitLordsBetrayal=sum(freq_BrexitLordsBetrayal), 
            sum_LordsOfBrexit=sum(freq_LordsOfBrexit),
            sum_BrexitMeansBrexit=sum(freq_BrexitMeansBrexit),
            sum_StandUp4Brexit=sum(freq_StandUp4Brexit),
            sum_ChequersmeansCorbyn=sum(freq_ChequersmeansCorbyn)
            ) %>% 
  reshape2::melt() %>% 
  select(hashtag=variable, n_occurance=value) %>% 
  mutate(hashtag=str_replace(hashtag, "sum_", "#")) %>% 
  arrange(desc(n_occurance)) %>%
  mutate(hashtag = fct_reorder(hashtag, n_occurance, .desc = F)) %>%
  ggplot(aes(hashtag, n_occurance)) +
  # geom_bar(stat = 'identity')+
  geom_col()+
  coord_flip()+
  hrbrthemes::theme_ipsum_rc()+
  labs(title="Which Hashtag Was Used More Frequently?",
       subtitle="Hashtags of interest: #BrexitLordsBetrayal, #LordsOfBrexit, #BrexitMeansBrexit,\n#StandUp4Brexit,#ChequersmeansCorbyn",
       caption="Social Data Science Lab, Cardiff University",
       x="Hashtag",
       y="Number of Occurrences in Timelines")+
  geom_text(aes(label =n_occurance, hjust=-0.3)) +
  theme(plot.caption = element_text(size = 12))

ggsave(filename = "viz/hashtag_freq_plot.pdf",device = cairo_pdf, width =9 , height =6 ,dpi = 600,scale =1.5)
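One detail worth checking in the frequency counts above: the filter that builds hashtag_subset ignores case, whereas str_count() with a plain string is case-sensitive, so differently capitalised variants (e.g. #standup4brexit) are selected but not counted. The sketch below shows the difference on invented tweets; it is an illustration under that assumption, not a recalculation of the reported figures.

```r
library(stringr)
library(tibble)

hashtags <- c("BrexitLordsBetrayal", "LordsOfBrexit", "BrexitMeansBrexit",
              "StandUp4Brexit", "ChequersmeansCorbyn")

# Invented tweet texts with mixed capitalisation.
toy_text <- c("#StandUp4Brexit now! #standup4brexit",
              "#BrexitMeansBrexit",
              "#CHEQUERSMEANSCORBYN and #StandUp4Brexit")

# Total occurrences of one hashtag across all texts.
count_hashtag <- function(hashtag, ignore_case) {
  sum(str_count(toy_text, fixed(hashtag, ignore_case = ignore_case)))
}

hashtag_counts <- tibble(
  hashtag = paste0("#", hashtags),
  case_sensitive = vapply(hashtags, count_hashtag, integer(1),
                          ignore_case = FALSE, USE.NAMES = FALSE),
  case_insensitive = vapply(hashtags, count_hashtag, integer(1),
                            ignore_case = TRUE, USE.NAMES = FALSE)
)

hashtag_counts
```

If the two columns differ on the real data, the case-insensitive count is the one that is consistent with how the subset was selected.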

3.2 Companies of Interest

Another aim of the analysis was to find out whether threat-makers are following and/or engaging with specific companies and their websites. To do this, we created a pattern for each company or website. Using these patterns, we looked for matches in the tweet text and/or the URLs attached to the tweet. For the URLs, we matched against the column named urls_expanded_url because it contains the fully expanded URL destination, which would match the pattern if threat-makers shared URLs from the companies of interest.
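Since the same two checks (text and expanded URL) are repeated for every company, they can be wrapped in a small function. The sketch below is one possible way to do that; `count_pattern_matches()` is a hypothetical helper, and the toy tweets are invented. It builds the pattern the same way as the code in this section (alternatives joined with "|", case ignored).

```r
library(stringr)
library(tibble)

# Hypothetical helper: count rows whose text or expanded URL matches any of
# the given patterns. Rows with a missing value count as non-matches, which
# mirrors what filter() does with NA.
count_pattern_matches <- function(data, patterns) {
  rx <- regex(paste(patterns, collapse = "|"), ignore_case = TRUE)
  tibble(
    column = c("text", "urls_expanded_url"),
    matches = c(
      sum(str_detect(data$text, rx), na.rm = TRUE),
      sum(str_detect(data$urls_expanded_url, rx), na.rm = TRUE)
    )
  )
}

# Invented example tweets.
toy_timelines <- tibble(
  text = c("Join us! #StandUp4Brexit", "Reading https://t.co/abc", "Nothing to see"),
  urls_expanded_url = c(NA, "https://standup4brexit.com/", NA)
)

match_summary <- count_pattern_matches(toy_timelines,
                                       c("Stand up 4 Brexit", "Standup4Brexit"))
match_summary
```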

3.2.1 Voter Consultancy LTD

https://www.vc-l.co.uk https://www.facebook.com/VoterConsultancy/

pattern_voter_consultancy <- c("Voter Consultancy", "vc-l.co.uk", "VoterConsultancy",
                               "https://www.facebook.com/VoterConsultancy/")  
#text
abusers_timelines %>% 
  filter(str_detect(string = text, regex( paste(pattern_voter_consultancy,collapse = "|"),ignore_case = T))) %>% 
  nrow() # 0

## [1] 0

#url
abusers_timelines %>% 
  filter(str_detect(string = urls_expanded_url, regex( paste(pattern_voter_consultancy,collapse = "|"),ignore_case = T))) %>% 
  nrow() # 0

## [1] 0

No match!

3.2.2 Disruptive Communications

Disruptive Analytica https://www.disruptiveanalytica.com

pattern_distruptive_communications <- c("Disruptive Communications", "Disruptive Analytica", "disruptiveanalytica")

#text
abusers_timelines %>% 
  filter(str_detect(string = text, regex( paste(pattern_distruptive_communications,collapse = "|"),ignore_case = T))) %>% 
  nrow() # 0

## [1] 0

#url
abusers_timelines %>% 
  filter(str_detect(string = urls_expanded_url, regex( paste(pattern_distruptive_communications,collapse = "|"),ignore_case = T))) %>% 
  nrow() # 0

## [1] 0

No match!

3.2.3 Kanto Systems AND Kanto Elects

https://www.kan.to https://www.facebook.com/KantoSystems/

pattern_kanto <- c("Kanto Systems", "Kanto Elects", "kan.to","KantoSystems","KantoElects")

#text
abusers_timelines %>% 
  filter(str_detect(string = text, regex( paste(pattern_kanto,collapse = "|"),ignore_case = T))) %>% 
  nrow() # 0

## [1] 0

#url
abusers_timelines %>% 
  filter(str_detect(string = urls_expanded_url, regex( paste(pattern_kanto,collapse = "|"),ignore_case = T))) %>% 
  nrow() # 0

## [1] 0

No match!
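A caveat on patterns that contain a dot, such as "kan.to" or "vc-l.co.uk": inside regex() the dot matches any character, so unrelated text can match. Here the result is zero either way, but the sketch below shows how literal matching with fixed() avoids such false positives. The helper `detect_any_literal()` and the example texts are hypothetical.

```r
library(stringr)

patterns <- c("kan.to", "KantoSystems")

# Invented texts: a real link, a false positive for the regex, a missing
# value, and a differently capitalised handle.
texts <- c("More at https://www.kan.to/app",
           "Balkan tour dates announced",
           NA,
           "KANTOSYSTEMS rocks")

# As in the section above: "." is a regex wildcard here.
regex_hits <- str_detect(texts, regex(paste(patterns, collapse = "|"),
                                      ignore_case = TRUE))

# Hypothetical helper: TRUE where x contains any pattern, compared literally
# and ignoring case; missing values count as non-matches.
detect_any_literal <- function(x, patterns) {
  hits <- lapply(patterns, function(p) str_detect(x, fixed(p, ignore_case = TRUE)))
  out <- Reduce(`|`, hits)
  out[is.na(out)] <- FALSE
  out
}

literal_hits <- detect_any_literal(texts, patterns)

regex_hits
literal_hits
```

The second text matches the regex because "kan to" fits "kan.to", but it does not match the literal comparison.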

3.2.4 Conservative Support LTD

pattern_conservative_support <- c("bconservative support ", "ConservativeSupport")

#text
abusers_timelines %>% 
  filter(str_detect(string = text, regex( paste(pattern_conservative_support,collapse = "|"),ignore_case = T))) %>% 
  nrow() # 0

## [1] 0

#url
abusers_timelines %>% 
  filter(str_detect(string = urls_expanded_url, regex( paste(pattern_conservative_support,collapse = "|"),ignore_case = T))) %>% 
  nrow() # 0

## [1] 0

abusers_timelines %>% 
  filter(str_detect(string = text, regex( paste(pattern_conservative_support,collapse = "|"),ignore_case = T))) %>% 
  select(text, everything())

No match!
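Note that the first pattern above starts with a literal "b", so it can only match text that contains "bconservative support ". The source does not say whether a word boundary was intended; if it was (an assumption), it would be written "\\b" in an R string. The sketch below compares the two on invented texts.

```r
library(stringr)

# Invented example texts.
texts <- c("Conservative support is falling",
           "neoconservative support grew",
           "ConservativeSupport")

# Pattern as it appears above: a literal "b" before "conservative".
literal_b <- regex("bconservative support ", ignore_case = TRUE)

# Assumed alternative: "\\b" is a word boundary, not a character.
word_boundary <- regex("\\bconservative support ", ignore_case = TRUE)

literal_hits <- str_detect(texts, literal_b)
boundary_hits <- str_detect(texts, word_boundary)

literal_hits
boundary_hits
```

With the word boundary, ordinary phrases such as "Conservative support is falling" would match, so any hits would need manual inspection before being attributed to the company.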

3.2.5 Brexitrealities.com

#text
abusers_timelines %>% 
  filter(str_detect(string = text, regex( paste("brexitrealities",collapse = "|"),ignore_case = T))) %>% 
  nrow() # 0

## [1] 0

#url
abusers_timelines %>% 
  filter(str_detect(string = urls_expanded_url, regex( paste("brexitrealities",collapse = "|"),ignore_case = T))) %>% 
  nrow() # 0

## [1] 0

No match!

3.2.6 Logical Campaign

AKA Campaign Logic - www.logicalcampaign.com. (NB no longer active)

pattern_logical_campaign <- c("Logical Campaign", "LogicalCampaign","Campaign Logic", "CampaignLogic" )

#text
abusers_timelines %>% 
  filter(str_detect(string = text, regex( paste(pattern_logical_campaign,collapse = "|"),ignore_case = T))) %>% 
  nrow() # 0

## [1] 0

#url
abusers_timelines %>% 
  filter(str_detect(string = urls_expanded_url, regex( paste(pattern_logical_campaign,collapse = "|"),ignore_case = T))) %>% 
  nrow() # 0

## [1] 0

No match!

3.2.7 Stand up 4 Brexit

Standup4Brexit.com

pattern_Standup4Brexit <- c("Stand up 4 Brexit", "Stand up 4 Brexit", "Standup4Brexit", "StandupforBrexit")

#text
abusers_timelines %>% 
  filter(str_detect(string = text, regex( paste(pattern_Standup4Brexit,collapse = "|"),ignore_case = T))) %>% 
  nrow() # 1057

## [1] 1057

#url
abusers_timelines %>% 
  filter(str_detect(string = urls_expanded_url, regex( paste(pattern_Standup4Brexit,collapse = "|"),ignore_case = T))) %>%
  nrow() # 21

## [1] 21

There are 1057 matches in the text column and 21 matches in the URL column. Many occurrences in the text column are expected, as the pattern used here is also one of the most popular hashtags of interest in this analysis. Unlike the previous companies of interest, the Standup4Brexit pattern also matched 21 URLs. The latter subset is presented below.

abusers_timelines %>% 
  filter(str_detect(string = urls_expanded_url, regex( paste(pattern_Standup4Brexit,collapse = "|"),ignore_case = T))) %>% 
  select(created_at, screen_name, urls_expanded_url, text, everything())

As can be seen, almost all URL matches are references to a Twitter user called “@StandUp4Brexit”. On the other hand, https://standup4brexit.com/ occurred 3 times in the data set.

3.2.8 Fake Fixed

https://www.fakefixed.com https://www.facebook.com/FixingFakeNews/ https://www.facebook.com/fakefixed/

pattern_fake_fixed <- c("fakefixed", "FixingFakeNews")

#text
abusers_timelines %>% 
  filter(str_detect(string = text, regex( paste(pattern_fake_fixed,collapse = "|"),ignore_case = T))) %>% 
  nrow() # 0

## [1] 0

#url
abusers_timelines %>% 
  filter(str_detect(string = urls_expanded_url, regex( paste(pattern_fake_fixed,collapse = "|"),ignore_case = T))) %>% 
  nrow() # 0

## [1] 0

No match!

3.3 Social Media Pages of Interest

One of the aims of the analysis was to find out whether threat-makers are engaging with specific social media groups and websites. As in the Companies of Interest section above, we created a pattern for each social media group or website. Using these patterns, we looked for matches in the tweet text and/or the URLs attached to the tweet. For the URLs, we matched against the column named urls_expanded_url because it contains the fully expanded URL destination, which would match the pattern if threat-makers shared URLs from the social media groups and websites of interest.

3.3.1 Leave Means Leave

https://www.facebook.com/LeaveMnsLeave/

As Leave Means Leave has a Twitter account (i.e. @LeaveMnsLeave), in addition to looking for matches in text and expanded URLs, we will also look for matches in retweets, quotes, and mentions.

pattern_leave_means_leave <- c("@LeaveMnsLeave", "leavemeansleave", "LeaveMnsLeave")

#text
abusers_timelines %>% 
  filter(str_detect(string = text, regex( paste(pattern_leave_means_leave,collapse = "|"),ignore_case = T))) %>% 
  nrow() # 124

## [1] 124

#url
abusers_timelines %>% 
  filter(str_detect(string = urls_expanded_url, regex( paste(pattern_leave_means_leave,collapse = "|"),ignore_case = T))) %>% 
  nrow() # 17

## [1] 17

#retweets
abusers_timelines %>% 
  filter(retweet_screen_name %in% "LeaveMnsLeave") %>% 
  nrow() # 168

## [1] 168

#quotes
abusers_timelines %>% 
  filter(quoted_screen_name %in% "LeaveMnsLeave") %>% 
  nrow() # 2

## [1] 2

#mentions
abusers_timelines %>% 
  filter(str_detect(mentions_screen_name, "LeaveMnsLeave")) %>% 
  nrow() # 252

## [1] 252

There are 124 matches in the text column, 17 matches in the URL column, 168 retweets, 2 quotes, and 252 mentions. Leave Means Leave is one of the most popular SM pages among threat-makers. This is not surprising, as @LeaveMnsLeave has a Twitter presence and organised offline events which threat-makers promoted and/or engaged with on Twitter.

Two things to note here. First, the text-match subset and the mention-match subset overlap. Second, the mention match returns a larger subset than the text match. Both observations are expected. When Twitter first introduced the @mention feature, users could only mention others by typing ‘@screen_name’ inside the tweet text, which used up part of the 140-character limit. In 2017, however, Twitter changed how replies work, moving the @usernames of the accounts being replied to outside the tweet text. Since then, up to 50 users can be tagged in a reply without using up the character limit (which is now 280). Twitter did not force this on users, and some still prefer to @mention others by typing @screen_names inside the tweet text.
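The relationship between the two subsets can be checked by cross-tabulating the two conditions. The sketch below uses invented tweets and assumes that mentions_screen_name holds the mentioned handles as a space-separated string, as it appears to after reading the csv file.

```r
library(dplyr)
library(stringr)
library(tibble)

# Invented tweets: the handle typed in the text, the handle present only in
# the mention metadata, and an unrelated tweet.
toy_timelines <- tibble(
  text = c("@LeaveMnsLeave see you at the rally",
           "See you at the rally",
           "Unrelated tweet"),
  mentions_screen_name = c("LeaveMnsLeave", "LeaveMnsLeave SomeOtherUser", NA)
)

overlap <- toy_timelines %>%
  mutate(
    in_text = str_detect(text, regex("LeaveMnsLeave", ignore_case = TRUE)),
    # A missing mentions field means nobody was mentioned.
    in_mentions = coalesce(str_detect(mentions_screen_name, "LeaveMnsLeave"), FALSE)
  ) %>%
  count(in_text, in_mentions)

overlap
```

Rows where both flags are TRUE form the overlap, and rows where only in_mentions is TRUE explain why the mention subset is larger.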

On the other hand, it is important to note that @LeaveMnsLeave was retweeted 168 times by threat-makers. Furthermore, 17 Leave Means Leave URLs were present in the data set, which is more than for most other companies/SM groups of interest. All five subsets are provided below.

Subset for Matches in Text:

abusers_timelines %>% 
  filter(str_detect(string = text, regex( paste(pattern_leave_means_leave,collapse = "|"),ignore_case = T))) %>% 
    select(created_at, screen_name, urls_expanded_url, text, everything())

Subset for matches in URL:

abusers_timelines %>% 
  filter(str_detect(string = urls_expanded_url, regex( paste(pattern_leave_means_leave,collapse = "|"),ignore_case = T))) %>% 
    select(created_at, screen_name, urls_expanded_url, text, everything())

Out of the 17 URLs that matched the LML pattern, 9 were links to tweets from the @LeaveMnsLeave account. A further 7 were links to the events section of the Leave Means Leave (LML) webpage (i.e. https://www.leavemeansleave.eu/events/). Interestingly, 3 different threat-makers attached LML’s events page to their tweets. When the corresponding tweets are inspected, one can see that these URLs were used to promote LML’s rallies across the country and aimed to increase attendance by using inflammatory and provocative language. Even the PM was targeted (see the hashtag #MayMustGoNow). This is an important observation, which indicates that threat-makers actively aimed to mobilise their Twitter followers to attend LML’s offline events. This type of explicit support was not observed for the other companies/SM groups of interest, which is also interesting.

Subset for retweets of LML:

abusers_timelines %>% 
  filter(retweet_screen_name %in% "LeaveMnsLeave") %>%  
    select(created_at, screen_name, urls_expanded_url, text, everything())

According to the above subset, 19 (out of 29 active) threat-makers retweeted LML’s tweets a total of 168 times. This means most threat-makers in our data set retweeted and/or engaged with LML’s tweets. The content of LML’s tweets is predominantly promotion of its offline rallies.

This pattern of behaviour is corroborated by the URLs attached to LML’s tweets. Four different types of URLs were observed in LML’s tweets:

Eventbrite links (invites to LML rallies)
Periscope links (broadcasts from LML rallies)
http://leavemeansleave.eu/events (info about date-time and location of rallies)
Brexit related newspaper articles

As one can see, 3 out of 4 URL types relate to LML’s offline events, which is another interesting observation indicating that LML was heavily focused on promoting its offline agenda on Twitter.

Subset for quotes of LML:

abusers_timelines %>% 
  filter(quoted_screen_name %in% "LeaveMnsLeave") %>%  
    select(created_at, screen_name, urls_expanded_url, text, everything())

Subset for mentions of LML:

abusers_timelines %>% 
  filter(str_detect(string = mentions_screen_name, regex( paste(pattern_leave_means_leave,collapse = "|"),ignore_case = T))) %>% 
    select(created_at, screen_name, urls_expanded_url, text, everything())

Similar to retweets, 21 out of 29 threat-makers mentioned LML in their tweets.

lml_rt_plot <- abusers_timelines %>% 
  filter(retweet_screen_name %in% "LeaveMnsLeave") %>%  
  select(screen_name) %>% 
  add_count(screen_name) %>% 
  select(screen_name, n) %>% 
  arrange(desc(n)) %>%
  distinct(screen_name, .keep_all = T) %>% 
  mutate(screen_name = fct_reorder(screen_name, n, .desc = F)) %>%
  ggplot(aes(screen_name, n)) +
  # geom_bar(stat = 'identity')+
  geom_col()+
  coord_flip()+
  hrbrthemes::theme_ipsum_rc()+
  labs(title="Who Retweets @LeaveMnsLeave?",
       caption="    ",
       x="Twitter Handle",
       y="Number of RTs")+
  geom_text(aes(label =n, hjust=-0.3)) +
  theme(plot.caption = element_text(size = 12))+
  ylim(0,120)




lml_mentions_plot <- abusers_timelines %>% 
  filter(str_detect(string = mentions_screen_name, regex( paste(pattern_leave_means_leave,collapse = "|"),ignore_case = T))) %>% 
    select(screen_name) %>% 
  add_count(screen_name) %>% 
  select(screen_name, n) %>% 
  arrange(desc(n)) %>%
  distinct(screen_name, .keep_all = T) %>% 
  mutate(screen_name = fct_reorder(screen_name, n, .desc = F)) %>%
  ggplot(aes(screen_name, n)) +
  # geom_bar(stat = 'identity')+
  geom_col()+
  coord_flip()+
  hrbrthemes::theme_ipsum_rc()+
  labs(title="Who Mentions @LeaveMnsLeave?",
       caption="Social Data Science Lab, Cardiff University",
       x="Twitter Handle",
       y="Number of Mentions")+
  geom_text(aes(label =n, hjust=-0.3)) +
  theme(plot.caption = element_text(size = 12))+
  ylim(0,150)

lml_plot_grid <- grid.arrange(lml_rt_plot,lml_mentions_plot,nrow =1)

ggsave(plot=lml_plot_grid,filename = "viz/lml_plot_grid.pdf",device = cairo_pdf, width =12 , height =9 ,dpi = 600,scale =1.5)

3.3.2 StandUp4Brexit

https://www.facebook.com/StandUp4Brexit/

This was covered in the Stand up 4 Brexit section.

3.3.3 Get Britain Out

https://www.facebook.com/GetBritainOut/

We observed no engagement with Get Britain Out’s Facebook page. This is expected, as cross-platform engagement between Twitter and Facebook is very rare. However, as Get Britain Out has a Twitter account (@GetBritainOut), we observed engagement with that Twitter account.

# text
abusers_timelines %>% 
  filter(str_detect(string = text, regex( paste("GetBritainOut",collapse = "|"),ignore_case = T))) %>% 
  nrow() # 27

## [1] 27

#url
abusers_timelines %>% 
  filter(str_detect(string = urls_expanded_url, regex( paste("GetBritainOut",collapse = "|"),ignore_case = T))) %>% 
  nrow() # 5

## [1] 5

# retweets
abusers_timelines %>% 
  filter(str_detect(string = retweet_screen_name, regex( paste("GetBritainOut",collapse = "|"),ignore_case = T))) %>% 
  nrow() # 25 retweets

## [1] 25

# mentions
abusers_timelines %>% 
  filter(str_detect(string = mentions_screen_name, regex( paste("GetBritainOut",collapse = "|"),ignore_case = T))) %>% 
  nrow() # 47 mentions

## [1] 47

27 matches in text, 5 matches in URLs, 25 matches in retweets, and 47 matches in mentions.

Subset when pattern is matched with text:

# text
abusers_timelines %>% 
  filter(str_detect(string = text, regex( paste("GetBritainOut",collapse = "|"),ignore_case = T)))

Subset when pattern is matched with URLs:

#url
abusers_timelines %>% 
  filter(str_detect(string = urls_expanded_url, regex( paste("GetBritainOut",collapse = "|"),ignore_case = T)))

Subset of @GetBritainOut Retweets

abusers_timelines %>% 
  filter(str_detect(string = retweet_screen_name, regex( paste("GetBritainOut",collapse = "|"),ignore_case = T)))

Subset when pattern matched with mentioned accounts:

# mentions
abusers_timelines %>% 
  filter(str_detect(string = mentions_screen_name, regex( paste("GetBritainOut",collapse = "|"),ignore_case = T)))

3.3.4 StrongerOut

https://www.facebook.com/groups/208041792997959/

# text
abusers_timelines %>% 
  filter(str_detect(string = text, regex( paste("StrongerOut",collapse = "|"),ignore_case = T))) %>% 
  nrow() # 5

## [1] 5

#url
abusers_timelines %>% 
  filter(str_detect(string = urls_expanded_url, regex( paste("StrongerOut",collapse = "|"),ignore_case = T))) %>% 
  nrow() # 0

## [1] 0

5 matches in the text column, no matches in the URL column.

# text
abusers_timelines %>% 
  filter(str_detect(string = text, regex( paste("StrongerOut",collapse = "|"),ignore_case = T)))

3.3.5 Museum of Communist Terror

https://www.facebook.com/museumofcommunistterror/

pattern_museum_of_communist_terror <- c("museumofcommunistterror","Museum of Communist Terror")
# text
abusers_timelines %>% 
  filter(str_detect(string = text, regex( paste(pattern_museum_of_communist_terror,collapse = "|"),ignore_case = T))) %>% 
  nrow() # 0

## [1] 0

#url
abusers_timelines %>% 
  filter(str_detect(string = urls_expanded_url, regex( paste(pattern_museum_of_communist_terror,collapse = "|"),ignore_case = T))) %>% 
  nrow() # 0

## [1] 0

No match!

3.4 Twitter Accounts of interest (AOI)

Below is a table of AOIs. We will create a pattern using all AOIs and look for matches in the text, replies, mentions, URLs, retweets, and quotes in the threat-makers’ timelines data set.

# create a tibble
twitter_accounts_of_interest <- c("@citizenshayler", "@tborwick", "@PeterWard09", "@freespirited_p", "@standup4Brexit", "@Tory4Liberty", "@SophiaGreenblat", "@EssexCanning", "@AshleyJFraser", "@CommunistTerror", "@SteveBakerHW", "@kantosystems", "@kantoelect", "@conservativeSPT", "@opinionated_european", "@kantoapp", "@ZuzannaWMroz", "@logicalcampaign")
twitter_accounts_of_interest_tibble <-  twitter_accounts_of_interest %>% 
  as_tibble() %>% 
  select(aoi_screenname=value) %>% 
  mutate(aoi_screenname_wo_at=str_remove(aoi_screenname,"@"))

twitter_accounts_of_interest_tibble

3.4.1 AOI Matches in Text

Looking for all AOIs in text.

# look for all AOI in text 
abusers_timelines %>% 
  filter(str_detect(string = text, regex( paste(twitter_accounts_of_interest_tibble$aoi_screenname,collapse  = "|"),ignore_case = T))) %>% 
  nrow() # 322

## [1] 322

322 matches in text column

Data:

# look for all accounts in text 
abusers_timelines %>% 
  filter(str_detect(string = text, regex( paste(twitter_accounts_of_interest_tibble$aoi_screenname,collapse  = "|"),ignore_case = T)))

3.4.2 AOI Matches in Replies

Looking for all AOIs in reply_to_screen_name column.

abusers_timelines %>% 
  filter(str_detect(string = reply_to_screen_name, regex( paste(twitter_accounts_of_interest_tibble$aoi_screenname_wo_at,collapse  = "|"),ignore_case = T))) %>% 
  nrow() #72

## [1] 72

72 matches in replies

Data:

abusers_timelines %>% 
  filter(str_detect(string = reply_to_screen_name, regex( paste(twitter_accounts_of_interest_tibble$aoi_screenname_wo_at,collapse  = "|"),ignore_case = T)))

3.4.3 AOI Matches in Mentions

Looking for threat-makers mentioning AOIs.

abusers_timelines %>% 
  filter(str_detect(mentions_screen_name,pattern = regex(paste(twitter_accounts_of_interest_tibble$aoi_screenname_wo_at,collapse  = "|")))) %>% 
  nrow() #471

## [1] 471

471 matches in mentions

Data:

abusers_timelines %>% 
  filter(str_detect(mentions_screen_name,pattern = regex(paste(twitter_accounts_of_interest_tibble$aoi_screenname_wo_at,collapse  = "|"))))
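Two properties of the mention filter above are worth keeping in mind: unlike the text and reply filters, it does not set ignore_case, and it matches substrings rather than whole handles. The sketch below contrasts it with an exact, case-insensitive comparison on invented data. `mentions_any_handle()` is a hypothetical helper, and the space-separated format of mentions_screen_name is an assumption.

```r
library(stringr)

# Two handles written as in the AOI list (without "@").
aoi_handles <- c("standup4Brexit", "SteveBakerHW")

# Invented mention fields (assumed space-separated handles).
mentions <- c("StandUp4Brexit BBCNews",  # same handle, different capitalisation
              "SteveBakerHWfan",         # a different account with a longer name
              NA,
              "stevebakerhw")

# As in the filter above: case-sensitive substring match.
substring_hits <- str_detect(mentions, regex(paste(aoi_handles, collapse = "|")))

# Hypothetical helper: TRUE when any mentioned handle equals one of the
# handles of interest, ignoring case.
mentions_any_handle <- function(mentions, handles) {
  handles <- tolower(handles)
  vapply(str_split(tolower(mentions), "\\s+"),
         function(mentioned) any(mentioned %in% handles),
         logical(1))
}

exact_hits <- mentions_any_handle(mentions, aoi_handles)

substring_hits
exact_hits
```

On the real data the two approaches may give different counts, so the reported figures should be read as case-sensitive substring matches.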

3.4.4 AOI Matches in Retweets

Looking for threat-makers retweeting AOIs.

abusers_timelines %>% 
  filter(str_detect(retweet_screen_name,pattern = regex(paste(twitter_accounts_of_interest_tibble$aoi_screenname_wo_at,collapse  = "|")))) %>% 
  nrow() #247

## [1] 247

247 matches in retweets

Data:

abusers_timelines %>% 
  filter(str_detect(retweet_screen_name,pattern = regex(paste(twitter_accounts_of_interest_tibble$aoi_screenname_wo_at,collapse  = "|"))))

3.4.5 AOI Matches in Quotes

Looking for threat-makers quoting AOIs.

abusers_timelines %>% 
  filter(str_detect(quoted_screen_name,pattern = regex(paste(twitter_accounts_of_interest_tibble$aoi_screenname_wo_at,collapse  = "|")))) %>% 
  nrow() #7

## [1] 7

7 matches in quotes

Data:

abusers_timelines %>% 
  filter(str_detect(quoted_screen_name,pattern = regex(paste(twitter_accounts_of_interest_tibble$aoi_screenname_wo_at,collapse  = "|"))))

3.4.6 AOI Retweet and Mention Visualisations

First, retweet plots:

aoi_rted_accounts <- abusers_timelines %>% 
  filter(str_detect(retweet_screen_name,pattern = regex(paste(twitter_accounts_of_interest_tibble$aoi_screenname_wo_at,collapse  = "|"))))  %>% 
  select(retweet_screen_name) %>% 
  add_count(retweet_screen_name) %>% 
  select(retweet_screen_name, n) %>% 
  arrange(desc(n)) %>%
  distinct(retweet_screen_name, .keep_all = T) %>% 
  mutate(retweet_screen_name = fct_reorder(retweet_screen_name, n, .desc = F)) %>%
  ggplot(aes(retweet_screen_name, n)) +
  # geom_bar(stat = 'identity')+
  geom_col()+
  coord_flip()+
  hrbrthemes::theme_ipsum_rc()+
  labs(title="Which AOIs Get Retweeted More?",
       caption="    ",
       x="Twitter Handle",
       y="Number of RTs")+
  geom_text(aes(label =n, hjust=-0.3)) +
  theme(plot.caption = element_text(size = 12))+
  ylim(0,120)

aoi_rting_abusers <- abusers_timelines %>% 
  filter(str_detect(retweet_screen_name,pattern = regex(paste(twitter_accounts_of_interest_tibble$aoi_screenname_wo_at,collapse  = "|"))))  %>% 
  select(screen_name) %>% 
  add_count(screen_name) %>% 
  select(screen_name, n) %>% 
  arrange(desc(n)) %>%
  distinct(screen_name, .keep_all = T) %>% 
  mutate(screen_name = fct_reorder(screen_name, n, .desc = F)) %>%
  ggplot(aes(screen_name, n)) +
  # geom_bar(stat = 'identity')+
  geom_col()+
  coord_flip()+
  hrbrthemes::theme_ipsum_rc()+
  labs(title="Who Retweets AOIs?",
       caption="Social Data Science Lab, Cardiff University",
       x="Twitter Handle",
       y="Number of RTs")+
  geom_text(aes(label =n, hjust=-0.3)) +
  theme(plot.caption = element_text(size = 12))+
  ylim(0,120)


aoi_rt_plot <- grid.arrange(aoi_rted_accounts,aoi_rting_abusers,nrow =1)

ggsave(plot=aoi_rt_plot,filename = "viz/aoi_rt_plot.pdf",device = cairo_pdf, width =12 , height =9 ,dpi = 600,scale =1.5)

aoi_mentions_plot <- abusers_timelines %>% 
  filter(str_detect(mentions_screen_name,pattern = regex(paste(twitter_accounts_of_interest_tibble$aoi_screenname_wo_at,collapse  = "|")))) %>% 
  select(screen_name) %>% 
  add_count(screen_name) %>% 
  select(screen_name, n) %>% 
  arrange(desc(n)) %>%
  distinct(screen_name, .keep_all = T) %>% 
  mutate(screen_name = fct_reorder(screen_name, n, .desc = F)) %>%
  ggplot(aes(screen_name, n)) +
  # geom_bar(stat = 'identity')+
  geom_col()+
  coord_flip()+
  hrbrthemes::theme_ipsum_rc()+
  labs(title="Who Mentions AOIs?",
       caption=" ",
       x="Twitter Handle",
       y="Number of Mentions")+
  geom_text(aes(label =n, hjust=-0.3)) +
  theme(plot.caption = element_text(size = 12))+
  ylim(0,130)


aoi_mentioned_plot <- twitter_accounts_of_interest_tibble %>% 
  select(aoi_screenname_wo_at) %>% 
  mutate(mentioned_count=str_count(string =  paste(abusers_timelines$mentions_screen_name,collapse = " "), pattern = aoi_screenname_wo_at))%>% 
  arrange(desc(mentioned_count)) %>% 
  mutate(aoi_screenname_wo_at = fct_reorder(aoi_screenname_wo_at, mentioned_count, .desc = F)) %>%
  ggplot(aes(aoi_screenname_wo_at, mentioned_count)) +
  # geom_bar(stat = 'identity')+
  geom_col()+
  coord_flip()+
  hrbrthemes::theme_ipsum_rc()+
  labs(title="Which AOIs Are Mentioned?",
       caption="Social Data Science Lab, Cardiff University",
       x="Twitter Handle",
       y="Number of Mentions")+
  geom_text(aes(label =mentioned_count, hjust=-0.3)) +
  theme(plot.caption = element_text(size = 12))+
  ylim(0,200)

aoi_mention_plot <- grid.arrange(aoi_mentions_plot,aoi_mentioned_plot,nrow =1)

aoi_mention_plot

## TableGrob (1 x 2) "arrange": 2 grobs
##   z     cells    name           grob
## 1 1 (1-1,1-1) arrange gtable[layout]
## 2 2 (1-1,2-2) arrange gtable[layout]

ggsave(plot=aoi_mention_plot,filename = "viz/aoi_mention_plot.pdf",device = cairo_pdf, width =12 , height =9 ,dpi = 600,scale =1.5)

aoi_replied_plot <- abusers_timelines %>% 
  filter(str_detect(string = reply_to_screen_name, regex( paste(twitter_accounts_of_interest_tibble$aoi_screenname_wo_at,collapse  = "|"),ignore_case = T))) %>% 
  select(reply_to_screen_name) %>% 
  add_count(reply_to_screen_name) %>% 
  distinct(reply_to_screen_name, .keep_all = T) %>% 
  arrange(desc(n)) %>% 
  mutate(reply_to_screen_name= fct_reorder(reply_to_screen_name, n, .desc = F))%>%
  ggplot(aes(reply_to_screen_name, n)) +
  # geom_bar(stat = 'identity')+
  geom_col()+
  coord_flip()+
  hrbrthemes::theme_ipsum_rc()+
  labs(title="Which AOIs Are Replied At?",
       caption="    ",
       x="Twitter Handle",
       y="Number of Replies")+
  geom_text(aes(label =n, hjust=-0.3)) +
  theme(plot.caption = element_text(size = 12))+
  ylim(0,20)

aoi_replying_abusers <- abusers_timelines %>% 
  filter(str_detect(string = reply_to_screen_name, regex( paste(twitter_accounts_of_interest_tibble$aoi_screenname_wo_at,collapse  = "|"),ignore_case = T))) %>%
  select(screen_name) %>% 
  add_count(screen_name) %>% 
  select(screen_name, n) %>% 
  arrange(desc(n)) %>%
  distinct(screen_name, .keep_all = T) %>% 
  mutate(screen_name = fct_reorder(screen_name, n, .desc = F)) %>%
  ggplot(aes(screen_name, n)) +
  # geom_bar(stat = 'identity')+
  geom_col()+
  coord_flip()+
  hrbrthemes::theme_ipsum_rc()+
  labs(title="Who Replies AOIs?",
       caption="Social Data Science Lab, Cardiff University",
       x="Twitter Handle",
       y="Number of Replies")+
  geom_text(aes(label =n, hjust=-0.3)) +
  theme(plot.caption = element_text(size = 12))+
  ylim(0,20)


aoi_replies_plot <- grid.arrange(aoi_replied_plot,aoi_replying_abusers,nrow =1)

aoi_replies_plot

## TableGrob (1 x 2) "arrange": 2 grobs
##   z     cells    name           grob
## 1 1 (1-1,1-1) arrange gtable[layout]
## 2 2 (1-1,2-2) arrange gtable[layout]

ggsave(plot=aoi_replies_plot,filename = "viz/aoi_replies_plot.pdf",device = cairo_pdf, width =12 , height =9 ,dpi = 600,scale =1.5)

4 Social Network Analysis

4.1 SNA Plot

TLDR: Here is the retweet network plot built from account timelines. Circles are Twitter accounts and orange lines are retweets. Larger circles and greener hues indicate accounts retweeted more often by threat-makers, so the largest and greenest nodes are the most retweeted accounts. Red colour and small node size indicate accounts that mostly retweet others; in this plot these are mostly threat-makers.

Retweet Network

The rest of this section covers the data wrangling and network creation, so it is not suited to quick consumption. It can be skipped, as its result is the network graph presented above.

4.2 SNA Data Wrangling Work

The first step of the SNA is to create an edgelist and a nodelist, which will enable us to visualise the retweeter and retweeted nodes in the network. Jesse Sadler explains what edgelists and nodelists are quite well here.

An edge list is a data frame that contains a minimum of two columns, one column of nodes that are the source of a connection and another column of nodes that are the target of the connection. The nodes in the data are identified by unique IDs. If the distinction between source and target is meaningful, the network is directed. If the distinction is not meaningful, the network is undirected. With the example of letters sent between cities, the distinction between source and target is clearly meaningful, and so the network is directed. For the examples below, I will name the source column as “from” and the target column as “to”. I will use integers beginning with one as node IDs. An edge list can also contain additional columns that describe attributes of the edges such as a magnitude aspect for an edge. If the edges have a magnitude attribute the graph is considered weighted.

Edge lists contain all of the information necessary to create network objects, but sometimes it is preferable to also create a separate node list. At its simplest, a node list is a data frame with a single column — which I will label as “id” — that lists the node IDs found in the edge list.
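To make the quoted definitions concrete, here is a minimal sketch of a weighted edge list and its node list. The account names (`user_a`, `user_b`, `user_c`) and the weights are made up for illustration and do not come from the data set.

```r
library(dplyr)
library(tibble)

# Hypothetical toy edge list: source, target, and a magnitude (weight) per edge
toy_edges <- tribble(
  ~from,    ~to,      ~weight,
  "user_a", "user_c", 3,
  "user_b", "user_c", 1,
  "user_a", "user_b", 2
)

# Node list: every account that appears as a source or a target,
# each given an integer ID starting from one
toy_nodes <- tibble(label = unique(c(toy_edges$from, toy_edges$to))) %>%
  rowid_to_column("id")

toy_nodes
```

Because source and target are distinct roles here (retweeter versus retweeted), the toy network is directed, and the `weight` column makes it weighted.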

4.3 Retweet Network (from threat-makers’ timelines)

In the context of edgelists, the from column holds the threat-maker accounts’ screen_names and the to column holds the retweet_screen_names. The edgelists will enable us to visualise the retweet network of threat-makers using the most recent retweets each account posted. Ultimately, this gives us an idea of which actors play an important role in this network. We will present 3 data sets:

Edgelist without weights: A tibble with two columns, from_retweeter and to_retweeted.
Edgelist with weights: A tibble with three columns, from_retweeter, to_retweeted, and rt_count.
Nodelist: A tibble with two columns, ID and label. label holds all retweeted account screen_names plus threat-maker accounts’ screen_names.

RT edgelist W/O weights:

timelines_RT_edgelist <- abusers_timelines %>% 
  mutate(created_at= as.POSIXct(created_at)) %>% 
  filter(!is.na(retweet_screen_name)) %>% 
  select(from_retweeter=screen_name, to_retweeted=retweet_screen_name)

# timelines_RT_edgelist %>% 
#   write_csv("data/timelines_RT_edgelist.csv")

timelines_RT_edgelist

RT edgelist with weights (weights are RT counts here):

timelines_RT_edgelist_weights <- abusers_timelines %>% 
  mutate(created_at= as.POSIXct(created_at)) %>% 
  filter(!is.na(retweet_screen_name)) %>% 
  select(from_retweeter=screen_name, to_retweeted=retweet_screen_name) %>% 
  group_by(from_retweeter,to_retweeted ) %>% 
  summarise(rt_count=n()) %>% 
  ungroup()
  
# timelines_RT_edgelist_weights %>% 
  # write_csv("data/timelines_RT_edgelist_weights.csv")
timelines_RT_edgelist_weights

Note that row counts decreased here as duplicate connections (i.e. RTs) are mapped to rt_count column.
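The collapse of duplicate connections into a weight column can be sketched on toy data as follows. The rows below are hypothetical; `count()` is used here as a shorthand for the `group_by()`, `summarise(n())` and `ungroup()` steps above, not as a replacement for the original code.

```r
library(dplyr)
library(tibble)

# Hypothetical unweighted edge list: one row per retweet
toy_rt <- tribble(
  ~from_retweeter, ~to_retweeted,
  "user_a",        "user_c",
  "user_a",        "user_c",
  "user_b",        "user_c",
  "user_a",        "user_c"
)

# One row per distinct retweeter-retweeted pair, with the number of retweets
toy_rt_weights <- toy_rt %>%
  count(from_retweeter, to_retweeted, name = "rt_count")

nrow(toy_rt)          # one row per retweet
nrow(toy_rt_weights)  # one row per distinct pair
toy_rt_weights
```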

RT nodelists:

original_accounts <- tibble(node_name=unique(timelines_RT_edgelist$from_retweeter))
timelines_RT_nodelist <- timelines_RT_edgelist %>% 
  distinct(to_retweeted) %>% 
  select(node_name=to_retweeted) %>% 
  bind_rows(original_accounts) %>% 
  distinct(node_name, .keep_all = T) %>% # crucial, forgot once and corrupted whole data below!!!
  rowid_to_column("ID") %>% 
  select(ID, label=node_name)
timelines_RT_nodelist

# timelines_RT_nodelist %>% 
#   write_csv("data/timelines_RT_nodelist.csv")

Edgelist with integer IDs (rather than string screen_name values):

# timelines_RT_edgelist_weights %>% colnames()
# timelines_RT_nodelist %>% colnames()
rt_edges <- timelines_RT_edgelist_weights %>% 
  left_join(timelines_RT_nodelist, c("from_retweeter"="label")) %>% 
  rename(from_RTer=ID) %>% 
  left_join(timelines_RT_nodelist, c("to_retweeted"="label")) %>% 
  rename(to_RTed=ID) %>% 
  select(from_RTer, to_RTed, rt_count)
rt_edges
# rt_edges %>%
#   write_csv("data/rt_edges.csv")
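Since a duplicated label in the nodelist would silently multiply rows in these joins, a few sanity checks after the join may be worthwhile. The sketch below repeats the same join pattern on hypothetical toy tables and then asserts that the labels are unique, that no rows were added and that every edge received an ID.

```r
library(dplyr)
library(tibble)

# Hypothetical toy nodelist and weighted edgelist
toy_nodes <- tibble(ID = 1:3, label = c("user_a", "user_b", "user_c"))
toy_edges <- tibble(
  from_retweeter = c("user_a", "user_b"),
  to_retweeted   = c("user_c", "user_c"),
  rt_count       = c(3L, 1L)
)

# Same join pattern as above: replace screen names with integer IDs
toy_int_edges <- toy_edges %>%
  left_join(toy_nodes, by = c("from_retweeter" = "label")) %>%
  rename(from_RTer = ID) %>%
  left_join(toy_nodes, by = c("to_retweeted" = "label")) %>%
  rename(to_RTed = ID) %>%
  select(from_RTer, to_RTed, rt_count)

# Sanity checks: unique labels, no extra rows, no unmatched IDs
stopifnot(
  !any(duplicated(toy_nodes$label)),
  nrow(toy_int_edges) == nrow(toy_edges),
  !anyNA(toy_int_edges$from_RTer),
  !anyNA(toy_int_edges$to_RTed)
)
```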

4.3.1 Network using `network` package

Create the network and look at basic network attributes:

library(network)
RT_network <- as.network(rt_edges, vertex_attrnames = timelines_RT_nodelist,
                         directed=T,
                         matrix.type="edgelist",ignore.eval=FALSE)

class(RT_network)
RT_network

4.3.2 Very Crude Network Plot

plot(RT_network, vertex.cex=3)

4.3.3 Network using igraph package

detach(package:network)
library(igraph)

rt_graph <- graph_from_data_frame(rt_edges,vertices = timelines_RT_nodelist, directed = T) %>% 
  set_edge_attr( "weight", value= rt_edges$rt_count)
rt_graph
is_weighted(rt_graph)

4.3.4 Very Crude Network Plot-2

plot(rt_graph, edge.arrow.size=0.2, layout=layout_with_graphopt)

4.3.5 Enter `tidygraph` and `ggraph`

library(tidygraph)
library(ggraph)

rt_network_tidy <- tbl_graph(nodes = timelines_RT_nodelist, edges =rt_edges, directed = T )
rt_network_tidy

a <- rt_network_tidy %>% 
  activate(edges) %>% 
  filter(rt_count>50) %>% 
  activate(nodes)


a%>% 
  ggraph()+
  geom_edge_link(aes(width=rt_count), alpha=0.3)+
  scale_edge_width(range = c(0.1,2))+
  geom_node_point()+
  labs(edge_width="Retweet Count")+
  theme_graph()

a %>%
  mutate(centrality = centrality_authority()) %>% 
    ggraph(layout = 'kk') + 
    geom_edge_link(aes(width=rt_count), alpha=0.5) + 
    scale_edge_width(range = c(0.1,2))+
    geom_node_point(aes(size = centrality, colour = centrality)) + 
    scale_color_continuous(guide = 'legend') + 
    theme_graph()
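As a simple cross-check on centrality-based node sizing, the number of times each account was retweeted (its weighted in-degree) can be computed directly from a weighted edgelist. This is a minimal sketch on made-up data. It is not the authority score used in the plot above, which is a different measure, and the column names `times_retweeted` and `n_retweeters` are introduced here for illustration only.

```r
library(dplyr)
library(tibble)

# Hypothetical weighted edgelist in the same shape as timelines_RT_edgelist_weights
toy_edges <- tribble(
  ~from_retweeter, ~to_retweeted, ~rt_count,
  "user_a",        "user_c",      3,
  "user_b",        "user_c",      1,
  "user_a",        "user_b",      2
)

# Weighted in-degree: total retweets received, plus number of distinct retweeters
toy_in_degree <- toy_edges %>%
  group_by(to_retweeted) %>%
  summarise(
    times_retweeted = sum(rt_count),
    n_retweeters    = n_distinct(from_retweeter)
  ) %>%
  arrange(desc(times_retweeted))

toy_in_degree
```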

5 First Round of Questions

5.1 Link between EUVoteLeave23rd and StandUp4Brexit

5.1.1 Retweets

Our dataset has 3197 tweets from the EUVoteLeave23rd account.

euvl23_data <- abusers_timelines %>% 
  filter(screen_name %in% "EUVoteLeave23rd")
euvl23_data

Out of 3197 tweets, 2570 are retweets. It would be fair to say that, during the study period, this account was used to promote and push previously posted messages rather than to create original content. Below are the 50 accounts EUVoteLeave23rd retweeted most frequently.

euvl23_data %>% 
  filter(!is.na(retweet_screen_name)) %>% 
  group_by(retweet_screen_name) %>% 
  summarise(retweet_count=n()) %>% 
  arrange(desc(retweet_count)) %>% 
  head(50)

The top account is EUVoteLeave23rd itself, so there is a self-propagation attempt. Second is StandUp4Brexit, which is an AOI. LeaveMnsLeave, realDonaldTrump, brexitcountdow1 and brexit_clock were frequently retweeted as well.

5.1.2 Replies

Out of 3197 tweets in EUVoteLeave23rd’s timeline, 195 were replies. The mechanics of replies are a bit different from other interaction types on Twitter. While users can reply to someone to express agreement, they can also reply to confront them about their ‘unfavourable’ opinions (as in counter-hate speech). My personal observation is that users must care about an opinion to a certain degree to reply to a particular tweet. Otherwise, they would not have seen it on their timelines, or would have ignored or actively avoided the source. But, as I mentioned, the reason to care is not necessarily confirmatory. Below I present EUVoteLeave23rd’s replies subset and the screen_names EUVoteLeave23rd replied to most frequently.

Replies by EUVoteLeave23rd

euvl23_data %>% 
  filter(!is.na(reply_to_screen_name))

Twitter accounts EUVoteLeave23rd replied to the most:

euvl23_data %>% 
  filter(!is.na(reply_to_screen_name)) %>%  
  group_by(reply_to_screen_name) %>% 
  summarise(reply_count=n()) %>% 
  arrange(desc(reply_count)) %>% 
  head(50)

Apart from self-replies (threads), we see that EUVoteLeave23rd replied most often to multiple Conservative Party accounts and the PM. As mentioned in previous sections of this document, their replies to the PM and the Conservative Party are inflammatory. Note the treason narrative and the near-insults aimed at the PM.

euvl23_data %>% 
  filter(!is.na(reply_to_screen_name)) %>%  
  filter(reply_to_screen_name %in% c("Conservatives", "theresa_may", "CCHQPress", "10DowningStreet", "eastantrimmp", "andreajenkyns", "BrexitCentral", "SkyNews", "young_tories", "JohnnyMercerUK", "LeaveMnsLeave", "NigelDoddsDUP")) %>% 
  select(text,reply_to_screen_name)

5.1.3 Mentions

EUVoteLeave23rd mentions another Twitter account in 3044 out of 3197 tweets. Mentioning can be used to alert other accounts about the engagement (which can be positive or negative; confirmatory or provocative), to give other accounts credit or to ping supporters (i.e. to alert like-minded accounts and call on them to promote a message). As I explained above, up to 50 accounts can be mentioned in one tweet. It seems that EUVoteLeave23rd mentioned 604 distinct accounts in its 3044 tweets with mentions.

Accounts EUVoteLeave23rd mentioned the most:

unique_mentioned_accounts <- euvl23_data %>% 
  filter(!is.na(mentions_screen_name)) %>% 
  select(mentions_screen_name) %>% 
  mutate(mentions_single = strsplit(mentions_screen_name, " ")) %>% 
  unnest(mentions_single) %>% 
  distinct(mentions_single)

unique_mentioned_accounts %>%
  mutate(mentioned_count= str_count(string =  paste(euvl23_data$mentions_screen_name,collapse = " "), pattern = mentions_single)) %>% 
  arrange(desc(mentioned_count))
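One caveat with counting via `str_count()` on the pasted string is that it matches substrings, so a short handle is also counted inside any longer handle that contains it. A minimal alternative sketch, assuming `mentions_screen_name` holds space-separated handles as in the code above, splits the column into one handle per row and counts exact handles. The toy rows are made up, and `ConservativesNW` is a hypothetical handle used only to show the effect.

```r
library(dplyr)
library(tidyr)
library(stringr)
library(tibble)

# Hypothetical timeline with space-separated mentions
toy_timeline <- tibble(
  mentions_screen_name = c("Conservatives theresa_may", "ConservativesNW", NA, "theresa_may")
)

# Exact counts: one row per mentioned handle, then count identical handles
toy_mention_counts <- toy_timeline %>%
  filter(!is.na(mentions_screen_name)) %>%
  separate_rows(mentions_screen_name, sep = " ") %>%
  count(mentions_screen_name, name = "mentioned_count", sort = TRUE)

toy_mention_counts

# Substring count for comparison: also matches inside "ConservativesNW"
substring_count <- str_count(
  paste(toy_timeline$mentions_screen_name, collapse = " "),
  "Conservatives"
)
substring_count
```

In this toy example the exact count for `Conservatives` is lower than the substring count, which is the kind of inflation to watch for when handles share a prefix.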

Besides self-engagement, here we observe that StandUp4Brexit is among the accounts most mentioned by EUVoteLeave23rd. However, it must be noted that multiple Conservative Party accounts and Conservative Party politicians were mentioned more than StandUp4Brexit.

5.2 “#BrexitLordsBetrayal” activity in Q1 2018

Data collection complete. This hashtag does not seem to have been used in Q1 2018 at all. Manual search on Twitter corroborates this observation.

No Twitter activity for #BrexitLordsBetrayal in Q1 2018

5.3 Number of Threat Makers Engaging with Get Britain Out

As can be seen in the Get Britain Out section, there were 27 matches in the text column, 5 matches in the URL column, 47 matches in mentions and 25 retweets of tweets from the @GetBritainOut account in threat-makers’ timelines. When the four types of engagement are combined, the number of unique threat-makers who engaged with Get Britain Out is 14. See below:

# text
gbo_1 <- abusers_timelines %>% 
  filter(str_detect(string = text, regex( paste("GetBritainOut",collapse = "|"),ignore_case = T))) %>% 
  distinct(screen_name) %>% 
  mutate(match_type="text")

#url
gbo_2 <- abusers_timelines %>% 
  filter(str_detect(string = urls_expanded_url, regex( paste("GetBritainOut",collapse = "|"),ignore_case = T))) %>% 
  distinct(screen_name) %>% 
  mutate(match_type="url")

# retweets
gbo_3 <- abusers_timelines %>% 
  filter(str_detect(string = retweet_screen_name, regex( paste("GetBritainOut",collapse = "|"),ignore_case = T))) %>% 
  distinct(screen_name) %>% 
  mutate(match_type="retweet")

# mentions
gbo_4 <- abusers_timelines %>% 
  filter(str_detect(string = mentions_screen_name, regex( paste("GetBritainOut",collapse = "|"),ignore_case = T))) %>% 
  distinct(screen_name) %>% 
  mutate(match_type="mentions")

gbo_unique_abusers <- rbind(gbo_1, gbo_2, gbo_3,gbo_4) %>% 
  distinct(screen_name,.keep_all = T)
gbo_unique_abusers
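Note that `distinct(screen_name, .keep_all = T)` keeps only the first match_type seen for each account. If the breakdown by engagement type is also of interest, a sketch along the following lines could summarise it. The toy rows and the column names `match_types` and `n_types` are hypothetical and only illustrate the shape of the result.

```r
library(dplyr)
library(tibble)

# Hypothetical stacked matches, in the same shape as rbind(gbo_1, ..., gbo_4)
toy_matches <- bind_rows(
  tibble(screen_name = c("user_a", "user_b"), match_type = "text"),
  tibble(screen_name = "user_a",              match_type = "url"),
  tibble(screen_name = c("user_b", "user_c"), match_type = "mentions")
)

# Number of distinct accounts per engagement type
toy_per_type <- toy_matches %>%
  count(match_type, name = "n_accounts")

# All engagement types used by each account
toy_by_account <- toy_matches %>%
  group_by(screen_name) %>%
  summarise(
    match_types = paste(sort(unique(match_type)), collapse = ", "),
    n_types     = n_distinct(match_type)
  )

toy_per_type
toy_by_account
```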

5.4 Engagement with Leave Means Leave pages

Some unexpected results here, it seems. It is hard to tell definitively what type of engagement threat-makers had with LeaveMeansLeave without in-depth content analysis. Even though we can look at tweet texts here, our analysis would not be classified as qualitative (as we are not trying to deconstruct the underlying message, the power relations or the reasons for the language used). Rather, it is closer to a quantitative and descriptive analysis in the sociological sense. With that in mind, when inspecting the semantics of each tweet in isolation, context matters a lot (sarcasm being the main cause of misinterpretation). Therefore, the interpretation of the data would depend on the theory being tested or the questions asked. That said, we can definitely look at the language used by the threat-makers when engaging with Leave Means Leave sites.

This document has already provided these data in the Leave Means Leave section above. The text-match, URL-match and mention subsets are provided again below for convenience. The retweet subset is not included, as retweets are essentially LML’s own messages.

Subset for Matches in Text:

abusers_timelines %>% 
  filter(str_detect(string = text, regex( paste(pattern_leave_means_leave,collapse = "|"),ignore_case = T))) %>% 
    select(text, created_at, screen_name, urls_expanded_url, everything())

Subset for matches in URL:

abusers_timelines %>% 
  filter(str_detect(string = urls_expanded_url, regex( paste(pattern_leave_means_leave,collapse = "|"),ignore_case = T))) %>% 
    select(text,created_at, screen_name, urls_expanded_url,  everything())

Subset for mentions at LML:

abusers_timelines %>% 
  filter(str_detect(string = mentions_screen_name, regex( paste(pattern_leave_means_leave,collapse = "|"),ignore_case = T))) %>% 
    select(text,created_at, screen_name, urls_expanded_url,  everything())

5.5 Engagement with AOIs

I have updated the Twitter Accounts of interest (AOI) section in response to this question. I would agree that threat-makers engaged extensively with AOIs; however, some accounts interact, or are interacted with, more than others. The updated visualisations of retweets, replies and mentions between threat-makers and AOIs should help the reader distinguish which threat-makers and AOIs were more active than others.

5.6 Edgelists

Added a TLDR to Section 4 with the SNA Plot, which should be the main take-home. The rest is data wrangling and experimentation, which the reader can ignore.

6 AOIs Interacting with Threat-Makers

6.1 AOIs account Information

In this section, we will import data from AOIs. Below is a data set containing profile information of AOIs queried from the Twitter API. We could get data from only 14 accounts. There are 18 AOIs but only 16 of them seemed to be active at the time of collection. In particular, no data were returned for @kantoapp and @ZuzannaWMroz. In addition, @Tory4Liberty is protected, so we could not capture any data from this account. Finally, the account @logicalcampaign had no tweets (possibly because tweets were deleted and/or the original Logical account changed its screen name).

Account Info:

aoi_list <- read_csv("data/aois_list_info.csv")
aoi_list

6.2 AOI Timelines

aoi_timelines <- read_twitter_csv("data/aois_timelines.csv") %>% 
  distinct(status_id, .keep_all=T)
aoi_timelines %>% nrow()

## [1] 21189

Our collection returned 21189 tweets in total, which can be found in the ‘aois_timelines.csv’ file. Let’s briefly check out this data set. The number of captured tweets and the dates of the oldest and newest tweets scraped from each account are presented below.

aoi_timelines %>% 
  group_by(screen_name) %>% 
  arrange(created_at) %>%
  summarise(number_of_tweets=n(), first_tweet=first(created_at), last_tweet=last(created_at)) %>% 
  arrange(desc(number_of_tweets))

6.3 AOI Account Info

aoi_list_info <- read_csv("data/aois_list_info.csv")
aoi_list_info

The pattern of tweet counts is quite interesting. There seems to be a clear dichotomy. As can be inferred from the above tables, some accounts are avid tweeters. In particular, @PeterWard09, @freespirited_p, @EssexCanning and @SteveBakerHW have more than 30000 tweets. In contrast, @tborwick, @Tory4Liberty, @SophiaGreenblat, @AshleyJFraser, @CommunistTerror, @kantosystems, @kantoelect, @conservativeSPT, @ZuzannaWMroz and @logicalcampaign have posted fewer than 1000 tweets to date. @citizenshayler had ~4600 tweets.

6.4 AOI interaction with Threat-makers

Below, I will look at whether AOIs interacted with threat-makers on Twitter. In particular, four interaction types (mentions, retweets, quotes and replies) will be investigated.

6.4.1 Replies

unique_abusers <- abusers_list_info %>% 
  select(screen_name) %>% 
  distinct() %>% 
  na.omit()

aoi_timelines %>% 
  filter(str_detect(string = reply_to_screen_name, regex( paste(unique_abusers$screen_name,collapse  = "|"),ignore_case = T))) %>% 
  nrow()

## [1] 4

4 matches in replies

Data:

aoi_timelines %>% 
  filter(str_detect(string = reply_to_screen_name, regex( paste(unique_abusers$screen_name,collapse  = "|"),ignore_case = T)))

As seen above, only 4 replies to threat-makers are found in AOI timelines. freespirited_p, StandUp4Brexit and EssexCanning replied to tweets from Torysoldier, bernerlap and AmpersUK.
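The patterns in this section are built by collapsing screen names with `|`, which also matches a handle when it appears inside a longer handle. A minimal sketch of a stricter pattern, assuming handles consist only of letters, digits and underscores, wraps the alternation in word boundaries and ignores case. The handles below are hypothetical.

```r
library(stringr)

# Hypothetical handles to look for
toy_handles <- c("user_a", "user_b")

# Whole-handle, case-insensitive pattern: \b(user_a|user_b)\b
toy_pattern <- regex(
  paste0("\\b(", paste(toy_handles, collapse = "|"), ")\\b"),
  ignore_case = TRUE
)

# Hypothetical column values: an exact match alongside another handle,
# a match differing in case, a longer handle containing "user_a", and NA
toy_values <- c("user_a other", "USER_B", "user_abc", NA)

toy_hits <- str_detect(toy_values, toy_pattern)
toy_hits
```

Inside `filter()`, rows where the result is `NA` are dropped, so missing values do not need separate handling.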

6.4.2 Mentions

Looking for AOIs mentioning threat-makers.

aoi_timelines %>% 
  filter(str_detect(mentions_screen_name,pattern = regex(paste(unique_abusers$screen_name,collapse  = "|")))) %>% 
  nrow()

## [1] 32

32 matches in mentions

Data:

aoi_mentions_abusers <- aoi_timelines %>% 
  filter(str_detect(mentions_screen_name,pattern = regex(paste(unique_abusers$screen_name,collapse  = "|"))))
aoi_mentions_abusers

32 mentions were made by only 5 AOIs:

aoi_mentions_abusers %>% 
  select(screen_name) %>% 
  add_count(screen_name) %>% 
  distinct(screen_name,.keep_all = T)

Mentioned threat-makers can be found below:

aoi_mentions_abusers %>% 
  select(screen_name,mentions_screen_name)

6.4.3 Retweets

Frankly, I’d be very surprised to see any retweets from AOIs, but let’s see.

Looking for AOIs retweeting threat-makers.

aoi_timelines %>% 
  filter(str_detect(retweet_screen_name,pattern = regex(paste(unique_abusers$screen_name,collapse  = "|")))) %>% 
  nrow() #4

## [1] 4

4 matches in retweets

Data:

aoi_timelines %>% 
  filter(str_detect(retweet_screen_name,pattern = regex(paste(unique_abusers$screen_name,collapse  = "|"))))

It seems that freespirited_p and StandUp4Brexit were the two AOIs retweeting threat-makers. Interestingly, they retweeted tweets from bernerlap only.

6.4.4 Quotes

Looking for AOIs quoting threat-makers

aoi_timelines %>% 
  filter(str_detect(quoted_screen_name,pattern = regex(paste(unique_abusers$screen_name,collapse  = "|")))) %>% 
  nrow()

## [1] 0

No match in quotes

7 AOIs Interacting with @AmyMek

Finally, the last task was to look at whether AOIs interacted with @AmyMek. As above, I will look for replies, mentions, retweets and quotes matching @AmyMek in the AOI timeline dataset.

7.1 Replies to @AmyMek

aoi_timelines %>% 
  filter(str_detect(string = reply_to_screen_name,pattern =regex( "AmyMek",ignore_case = T)))

No match in replies

7.2 Mentions to @AmyMek

aoi_timelines %>% 
  filter(str_detect(string = mentions_screen_name,pattern =regex( "AmyMek",ignore_case = T)))

No match in mentions

7.3 Retweets of @AmyMek

aoi_timelines %>% 
  filter(str_detect(string =retweet_screen_name ,pattern =regex( "AmyMek",ignore_case = T)))

No match in retweets

7.4 Quotes of @AmyMek

aoi_timelines %>% 
  filter(str_detect(string =quoted_screen_name ,pattern =regex( "AmyMek",ignore_case = T)))

No match in quotes

It seems that AOIs did not interact with @AmyMek at all, at least during the data collection period.

An Analysis of Twitter Accounts which Threatened British Politicians on Twitter

Sefa Ozalp

28/10/2018
