This work studies ISIS-related activity on Twitter connected to the Iberian Peninsula (Portugal and Spain) since the November 2015 Paris attacks. All the work is based on a dataset released under a CC0 Public Domain license, available on Kaggle at https://www.kaggle.com/kzaman/how-isis-uses-twitter
# Dataset downloaded manually on May 21 from
# https://www.kaggle.com/kzaman/how-isis-uses-twitter/downloads/how-isis-uses-twitter.zip
# Load the dataset (ITdf = ISIS Twitter dataframe)
ITdf <- read.csv("./data/tweets.csv", stringsAsFactors = FALSE)
# Inspect the dataset
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
glimpse(ITdf)
## Observations: 17,410
## Variables: 8
## $ name (chr) "GunsandCoffee", "GunsandCoffee", "GunsandCoffe...
## $ username (chr) "GunsandCoffee70", "GunsandCoffee70", "GunsandC...
## $ description (chr) "ENGLISH TRANSLATIONS: http://t.co/QLdJ0ftews",...
## $ location (chr) "", "", "", "", "", "", "", "", "", "", "", "",...
## $ followers (int) 640, 640, 640, 640, 640, 640, 640, 640, 640, 64...
## $ numberstatuses (int) 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49,...
## $ time (chr) "1/6/2015 21:07", "1/6/2015 21:27", "1/6/2015 2...
## $ tweets (chr) "ENGLISH TRANSLATION: 'A MESSAGE TO THE TRUTHFU...
We have 17410 tweets with 8 variables.
# Determine the range of dates of the tweets
head(ITdf$time)
## [1] "1/6/2015 21:07" "1/6/2015 21:27" "1/6/2015 21:29" "1/6/2015 21:37"
## [5] "1/6/2015 21:45" "1/6/2015 21:51"
tail(ITdf$time)
## [1] "5/13/2016 15:45" "5/13/2016 15:46" "5/13/2016 15:50" "5/13/2016 15:51"
## [5] "5/13/2016 16:07" "5/13/2016 16:10"
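The first and last rows are dated 6 January 2015 and 13 May 2016 (the timestamps use the month/day/year format). These are the bounds of the dataset only if the rows are sorted chronologically, which head() and tail() alone do not confirm.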
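Since head() and tail() only show the first and last rows, the range is safer to compute from parsed timestamps. The following is a minimal sketch, assuming every value in the time column follows the month/day/year hour:minute layout seen above. `sample_times` is a hypothetical stand-in built from a few of the printed values, and the UTC time zone is an assumption, since the dataset does not state one. On the real data one would pass `ITdf$time` instead.

```r
# Hypothetical sample: a few of the timestamps printed above, deliberately unsorted
sample_times <- c("1/6/2015 21:27", "5/13/2016 16:10", "1/6/2015 21:07")

# Parse as month/day/year; the UTC time zone is an assumption
parsed <- as.POSIXct(sample_times, format = "%m/%d/%Y %H:%M", tz = "UTC")

# Earliest and latest timestamp, regardless of row order
time_range <- range(parsed)
format(time_range, "%Y-%m-%d", tz = "UTC")
```

Any value that does not fit the assumed layout is parsed as NA, so checking `sum(is.na(parsed))` is a quick way to validate the format assumption.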
# Count the tweets whose location field is not empty
sum(ITdf$location != "")
## [1] 11432
We have 11432 tweets with a location specified, about 66% of the total, which is a decent percentage.
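The share of tweets with a location can be checked with a short calculation. This sketch uses only the two counts reported above; the variable names are illustrative.

```r
n_total   <- 17410  # tweets in the dataset, from glimpse() above
n_located <- 11432  # tweets with a non-empty location, from sum() above

# Percentage of tweets that carry a location, rounded to one decimal
share_pct <- round(100 * n_located / n_total, 1)
share_pct
```

On the data frame itself, `mean(ITdf$location != "")` should give the same proportion in a single step.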
# Subset the data to the tweets with a non-empty location
ITdf_loc <- filter(ITdf, location != "")
glimpse(ITdf_loc)
## Observations: 11,432
## Variables: 8
## $ name (chr) "Abu Layth Al Hindi", "Abu Layth Al Hindi", "Wa...
## $ username (chr) "AbuLaythAlHindi", "AbuLaythAlHindi", "warrepor...
## $ description (chr) "Kik: abulayth2014. Ex South African. Currently...
## $ location (chr) "Islamic State", "Islamic State", "München, Deu...
## $ followers (int) 68, 68, 130, 130, 130, 130, 130, 130, 130, 130,...
## $ numberstatuses (int) 18, 18, 656, 656, 656, 656, 656, 656, 656, 656,...
## $ time (chr) "7/7/2015 6:02", "8/17/2015 20:20", "9/29/2015 ...
## $ tweets (chr) "\U0001f53b[ الإثنين 1436/9/19 هـ ]...
# Subset the tweets with attributed locations in Iberia
# Build the pattern with paste0() so that no line break ends up inside the regex
ibe <- grep(paste0("[Pp]ortugal|[Ee]spanha|[Ss]pain|[Ll]isbo[an]|[Mm]adrid|",
                   "[Aa]l-andalus|[Aa]l andalus|[Cc]ordoba|[Tt]oledo|[Ss]evilla|",
                   "[XxJj][aàá]tiva|[Pp]orto|[Cc]oimbra|[Ii]beri[ac]"),
            ITdf_loc$location, value = TRUE)
ibe
## character(0)
None of the locations matched the Iberian keywords.
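Keyword searches like this one are sensitive to letter case and to how the pattern is assembled. The sketch below is an alternative formulation, not the method used in this analysis: it builds the pattern from a keyword vector and matches with `ignore.case = TRUE`. The `sample_loc` strings are hypothetical, and the keyword list is a shortened version of the one above.

```r
# Hypothetical location strings, for illustration only
sample_loc <- c("Islamic State", "madrid, SPAIN", "Read my bio", "Lisboa")

# Shortened keyword list; collapsing with "|" keeps stray line breaks out of the regex
keywords <- c("portugal", "espanha", "spain", "lisbo[an]", "madrid",
              "al[- ]andalus", "cordoba", "toledo", "sevilla",
              "porto", "coimbra", "iberi[ac]")
pattern <- paste(keywords, collapse = "|")

# Case-insensitive match, returning the matching strings
hits <- grep(pattern, sample_loc, ignore.case = TRUE, value = TRUE)
hits
```

Keeping the keywords in a vector also makes it easier to reuse exactly the same list when searching another column.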
# Return to ITdf and search the text of the tweets instead
# Build the pattern with paste0() so that no line break ends up inside the regex
ibe <- grep(paste0("[Pp]ortugal|[Ee]spanha|[Ss]pain|[Ll]isbo[an]|[Mm]adrid|",
                   "[Aa]l-andalus|[Aa]l andalus|[Cc]ordoba|[Tt]oledo|[Ss]evilla|",
                   "[XxJj][aàá]tiva|[Bb]arcelona|[Pp]orto|[Cc]oimbra|[Ii]beri[acn]"),
            ITdf$tweets, value = TRUE)
ibe
## [1] "@CyanideMadrid @BBCWorld They will take ramadi again just wait and see"
## [2] "@CyanideMadrid @BBCWorld You need to understand that we don't want that Shi'a Iraqi army take ramadi because they will kill all Sunni ppl"
## [3] "RT @PresosMusulmane: Make du3a for sister Maryam, 22, a revert of Spain. https://t.co/laslWCxnuJ"
## [4] "RT @PresosMusulmane: Make du3a for brother Hamed Abderrahaman. He has been arrested in Guantanamo and recently has been arrested by Spain h…"
## [5] "RT @AJENews: Turkish religious authority blocks 'rocking imam' from singing in Portugal. @Um_uras reports https://t.co/OKyOCiuFLB https://t…"
## [6] "One of the Umayad rulers (perhaps) Waleed conquered Spain(Andhlusia) , Hindh, Africa, Turkmenistan at the same time."
## [7] "@xanasore Ah I see. I wonder if Musa Bin Nusayr lost Spain after 3 days like the FSA did with Al-Rai?"
Only seven tweets were detected. The first two are addressed to a username that includes “Madrid” and refer to the Iraqi city of Ramadi, which was recaptured by the Iraqi army and coalition forces in December 2015. The next two refer to prisoners with connections to Spain. The fifth is a retweet of a news agency reporting that Turkish authorities blocked a “rocking imam” from singing in Portugal. The last two are more interesting, as they discuss Al-Andalus, the former Arab state in the Iberian Peninsula from 711 to 1492 AD, which ISIS has declared as an objective to reconquer.
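The first two hits match only because the keyword appears inside a Twitter handle. One way to avoid this kind of false positive is to strip the @mentions before searching. This is a minimal sketch on two of the tweets listed above, with a shortened keyword pattern chosen for illustration; it is not part of the original analysis.

```r
# Two of the tweets returned above
sample_tweets <- c(
  "@CyanideMadrid @BBCWorld They will take ramadi again just wait and see",
  "RT @PresosMusulmane: Make du3a for sister Maryam, 22, a revert of Spain. https://t.co/laslWCxnuJ"
)

# Remove @handles so that keywords embedded in usernames are ignored
no_mentions <- gsub("@\\w+", "", sample_tweets)

# Shortened, illustrative keyword pattern, matched case-insensitively
hits <- grepl("spain|madrid|portugal", no_mentions, ignore.case = TRUE)
hits
```

With the handles removed, only the second tweet, which mentions Spain in its text, is still flagged.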