"Catalonia Independence sentiment analysis- Alvaro Bueno"
"12/5/2017"

Catalonia Independence sentiment analysis

The project consists of performing sentiment analysis on news content about Catalonia from diverse sites. The sources are articles from December, November and October, when an independence referendum and other notable events took place.

Catalonia Independence sentiment analysis (2)

Since 2010, when some regional autonomy laws were stripped from the Spanish constitution, there has been a power struggle between Catalonia and Spain (the central government).

Catalonia Independence sentiment analysis (3)

2016 and 2017 have been very intense.

  • Multiple cancellations of independence ballots.
  • A funding scandal involving a president (Artur Mas).
  • Another president ousted after the independence vote (Carles Puigdemont).

Catalonia Independence sentiment analysis (4)

Catalonia's wish for independence predates Spain itself.

It is home to an ancient community whose traditions and language are still thriving.

Media

Catalan News, BBC, The Guardian, The Independent, ABC News, NPR, La Vanguardia, El Periódico, RTE.ie, Al Jazeera and Bloomberg

Project Drawbacks

  • Restricted access to a Spanish corpus; no Catalan corpus.
  • Reformatting dates after site mining was very difficult.
  • Twitter restricts access to the last 7 days, so it was not useful for the time span this project needed.
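One way to approach the date problem is to try several formats in turn and keep the first one that parses. The sketch below uses base R only; the helper name `parse_mixed_dates` and the list of formats are assumptions for illustration, since the slides do not show which formats the mined sites actually used.

```r
# Hypothetical helper: try each format in turn and keep the first that parses.
# The formats listed here are assumptions, not the ones found on the mined sites.
parse_mixed_dates <- function(x, formats = c("%Y-%m-%d", "%d/%m/%Y")) {
  out <- rep(as.Date(NA), length(x))
  for (fmt in formats) {
    todo <- is.na(out)            # only retry strings that have not parsed yet
    if (!any(todo)) break
    out[todo] <- as.Date(x[todo], format = fmt)
  }
  out                             # strings matching no format stay NA
}

parsed <- parse_mixed_dates(c("2017-10-01", "21/12/2017", "not a date"))
```

Formats with month names (such as `%B`) depend on the session locale, so they would need extra care for Spanish-language sites.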

Data, Sources

  • The data frame is saved filtered by language and news company.
  • The content variable is used as the source for the sentiment analysis.
  • The package methods take care of whitespace, punctuation and general clean-up of the data.
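The filtering and clean-up steps above can be sketched roughly as follows. The column names `lang`, `newscompany` and `content` follow the slides; the rows are invented, and `clean_text` is a hypothetical helper that only illustrates the kind of clean-up involved, not the package's actual preprocessing.

```r
# Toy data frame: column names follow the slides, rows are invented.
df <- data.frame(
  lang = c("EN", "ES", "EN"),
  newscompany = c("bbc", "periodico", "guardian"),
  content = c("  Police   clash with voters!! ", "El voto continua.", "Talks resume."),
  stringsAsFactors = FALSE
)

# Keep only English-language rows from one outlet.
bbc_en <- df[df$lang == "EN" & df$newscompany == "bbc", ]

# Hypothetical clean-up helper: lower-case, replace punctuation, collapse whitespace.
clean_text <- function(x) {
  x <- tolower(x)
  x <- gsub("[[:punct:]]+", " ", x)
  x <- gsub("\\s+", " ", x)
  trimws(x)
}

cleaned <- clean_text(bbc_en$content)
```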

Assumptions, Methodology

The data was gathered in descending order.

  • Most recent news first (ousted Catalan government officials are starting to be released from prison on bail).
  • The articles from October are on the right side of the plot (when voting in the independence referendum began).
  • No positivity/negativity analysis was performed on the articles because of the nature of the source (news).
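As a small illustration of the ordering described above, the sketch below sorts a toy data frame so the newest article comes first, which places October at the right-hand end of a plot drawn in row order. The `date` column and the rows are assumptions; the slides do not name the date variable.

```r
# Toy example: the `date` column name and the rows are assumptions.
articles <- data.frame(
  date = as.Date(c("2017-10-01", "2017-12-04", "2017-11-02")),
  content = c("article A", "article B", "article C"),
  stringsAsFactors = FALSE
)

# Descending order: most recent article first, October last.
newest_first <- articles[order(articles$date, decreasing = TRUE), ]

# Reversing the rows gives chronological order if a left-to-right timeline is preferred.
oldest_first <- newest_first[rev(seq_len(nrow(newest_first))), ]
```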

Assumptions, Methodology (2)

Using the analyzeSentiment function from the SentimentAnalysis package, we plot the variability in sentiment across the mined documents.

library(SentimentAnalysis)

# Sentiment by language; the Spanish subset passes language='spanish'
sent_english <- analyzeSentiment(as.character(df[df$lang=='EN',]$content))
sent_spanish <- analyzeSentiment(as.character(df[df$lang=='ES',]$content), language='spanish')

# Sentiment by news company
sent_abc <- analyzeSentiment(as.character(df[df$newscompany=='abcnews',]$content))
sent_periodico <- analyzeSentiment(as.character(df[df$newscompany=='periodico',]$content), language='spanish')
sent_ctn <- analyzeSentiment(as.character(df[df$newscompany=='CTN',]$content))
sent_jaz <- analyzeSentiment(as.character(df[df$newscompany=='aljazeera',]$content))
sent_bbc <- analyzeSentiment(as.character(df[df$newscompany=='bbc',]$content))
sent_gua <- analyzeSentiment(as.character(df[df$newscompany=='guardian',]$content))
sent_bbg <- analyzeSentiment(as.character(df[df$newscompany=='bberg',]$content))
sent_ind <- analyzeSentiment(as.character(df[df$newscompany=='indep',]$content))
sent_npr <- analyzeSentiment(as.character(df[df$newscompany=='npr',]$content))
sent_rte <- analyzeSentiment(as.character(df[df$newscompany=='rte.ie',]$content))
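The per-outlet calls above all follow the same pattern, so they could also be written as one grouped call. The sketch below shows the idea with base R's `split` and `lapply`. To keep it runnable without the package, it uses `toy_score`, a hypothetical stand-in that computes (positive words - negative words) / total words from tiny invented word lists; it is a simplified illustration of a dictionary-based score, not the package's implementation. A wrapper around `analyzeSentiment` could presumably be passed as `scorer` instead, though that adaptation is an assumption.

```r
# Hypothetical stand-in scorer with invented word lists (not the package's dictionaries).
toy_score <- function(text,
                      positive = c("calm", "agreement", "peaceful"),
                      negative = c("violence", "illegal", "clash")) {
  words <- strsplit(tolower(gsub("[[:punct:]]+", " ", text)), "\\s+")[[1]]
  words <- words[nzchar(words)]
  if (length(words) == 0) return(NA_real_)
  (sum(words %in% positive) - sum(words %in% negative)) / length(words)
}

# Apply a scoring function to every article, grouped by news company.
score_by_outlet <- function(df, scorer = toy_score) {
  groups <- split(df$content, df$newscompany)
  lapply(groups, function(texts) vapply(texts, scorer, numeric(1), USE.NAMES = FALSE))
}

# Invented rows; column names follow the slides.
toy_df <- data.frame(
  newscompany = c("bbc", "bbc", "npr"),
  content = c("Police violence at illegal vote", "Calm talks", "Peaceful agreement reached today"),
  stringsAsFactors = FALSE
)
scores <- score_by_outlet(toy_df)
```

The result is a named list with one numeric vector per outlet, which avoids maintaining a separate variable for each news company.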

Other Plots

plotSentiment(sent_english) 

[Plot: sentiment across the English-language articles]

Other Plots

plotSentiment(sent_spanish)

[Plot: sentiment across the Spanish-language articles]

Other Plots

plotSentiment(sent_ctn) 

[Plot: sentiment across the Catalan News articles]

Other Plots

plotSentiment(sent_jaz) 

[Plot: sentiment across the Al Jazeera articles]

Other Plots

plotSentiment(sent_gua) 

[Plot: sentiment across the Guardian articles]

Other Plots

plotSentiment(sent_npr) 

[Plot: sentiment across the NPR articles]

Conclusions

The graph for Catalan News (third graphic, the most-mined news source) peaks at the end, which corresponds to the dates in October when the polls had just been declared illegal by the central government in Spain and the vote went ahead as promised.

Those days were particularly eventful in the press because of the violence used by police against voters.

Conclusions (2)

The low values on the left side of the English graph (first graphic) show that the press keeps a calm or moderate tone when the days pass uneventfully, especially when the turn of events is not positive for Spain.

We can expect a similar increase in internet reactions in the days close to the new vote on December 21 if events turn violent, as they did in October.