Climate change and climate displacement are set to become some of the most important issues of the coming decades. In this regard, the Kerala floods of August 2018 and of 8 August 2019 represent an exemplary case study of these much wider phenomena. The relief strategies implemented to address the situation of the 5.4 million people affected and the 1.4 million displaced also revealed some profound problems, arising from social stratification patterns such as the caste-based hierarchies around which Indian society is structured (UNDP). In this regard, social movements like the National Dalit Watch have highlighted the exclusion of the dalit (a term referring to those commonly called the “outcasted” or “untouchables”) and adivasi (“tribal” populations in India) communities from the relocation and housing programmes promoted by the Keralite government in the aftermath of these subsequent crises (National Dalit Watch, 2019): policies such as house relocation were tied to the formal ownership of a house (Kerala Government, 2019), which neglected marginalised populations who do not formally own their houses or the land they work on (Mathrubuni, 2019).
Legally and politically, the question of how to deal with populations displaced as a result of climate change remains highly debated, and no protection framework yet exists.
For this reason, with this project we intend to explore these controversies through a Natural Language Processing analysis of a Factiva corpus of 994 articles concerning the Kerala floods.
Firstly, we wish to conduct a sentiment analysis of our corpus to observe variations across time. We hypothesise that in the period when the floods took place, as well as in their immediate aftermath, the language used in articles was more negatively charged, whereas later, when relief and reconstruction programmes were implemented, it evolved into a more positive one, associated with hope and resilience.
Secondly, we wish to explore the recurrent words and bigrams in our corpus, looking specifically at those which reflect the issue of climate change and climate displacement/migration/refugees. We hypothesise that, due to a lack of legal framing and political acknowledgement of these issues, their recurrence will be equally limited in our media corpus.
To construct our corpus we resorted to Factiva, a repository of newspaper articles that allows users to search for and download digitised news. Our search query was designed to find articles published in English, dealing with India, that included the words “Kerala” and “floods” (-Kerala and floods-) together with a series of terms related to migration, displacement, diaspora and remittances, detailed below (e.g. -migration-). To capture the terms actually deployed by the media, we created this “dictionary” after reading 10 articles on the Kerala floods that talked about migration in 2018 and 2019. We designed our query to make sure that these latter words could appear within a distance of 30 words either to the left or to the right of “Kerala” and “floods” (-/N30/-). Since none of us speaks Hindi or Malayalam (the regional language spoken in Kerala), we resorted to using anglophone documents. Moreover, we decided to search for news that mentioned India (or a geography within India) as our topos and set a time range from 16/08/2018 to 17/08/2019, thus allowing us to capture the anniversary of the event.
Here is the search query we used on Factiva: Kerala and floods/N30/migration or Kerala and floods/N30/emigration or Kerala and floods/N30/immigration or Kerala and floods/N30/migrant or Kerala and floods/N30/migrants or Kerala and floods/N30/emigrant or Kerala and floods/N30/emigrants or Kerala and floods/N30/immigrant or Kerala and floods/N30/immigrants or Kerala and floods/N30/Non-resident Indian or Kerala and floods/N30/NRI or Kerala and floods/N30/displacement or Kerala and floods/N30/displaced or Kerala and floods/N30/IDP or Kerala and floods/N30/Internally Displaced Persons or Kerala and floods/N30/diaspora or Kerala and floods/N30/diasporic or Kerala and floods/N30/remittances or Kerala and floods/N30/remittance
Once we gathered our corpus, after encountering some difficulties in merging the different html files, we decided to do this with the macOS Terminal app. We used cat, a system command that reads the content of a plain-text file, in our case an html file (but also csv, txt, htm, xml: basically any file we can open with the TextEdit app on macOS). In general terms, with “>” we directed the output of the cat command towards a new file. Here is a more precise explanation of how we did it. We first created a folder in which we put all the files to be merged. We then renamed each of them to create a sequence: 1.html, 2.html, etc. Nota bene: if there are more than 9 files, it is important to name them 01.html, 02.html, etc., to make sure that the 10th file is read after the 9th, as in many ordering systems 10 would otherwise fall between 1 and 2. We launched the Terminal app and used the command cd [pathway to get to files], where “pathway to get to files” is the sequence of folders that leads from the “highest” one down to the one where the files we want to work on are actually placed. At that point we press Enter and the prompt changes, indicating that we are in the chosen directory. We verify the content of the folder by launching ls and then direct the output of cat towards a new file (e.g. cat *.html > merged.html). Each of the original files carried its own opening and closing html markers, plus a “doctype” declaration. We therefore removed these markers from each file, leaving them only at the head of the first one and at the bottom of the last one: parsing functions need a single set of these indicators, otherwise the parsing restarts in a continuous loop. By doing so we obtained the merged_pruned.html file that we used for our analysis.
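For completeness, the same merge could also be sketched directly in R instead of the Terminal (a minimal, hedged alternative; the folder name factiva_html is hypothetical):
files <- sort(list.files("factiva_html", pattern = "\\.html$", full.names = TRUE)) #list the renamed files in order
merged <- unlist(lapply(files, readLines)) #read every file line by line
writeLines(merged, "merged.html") #concatenate them into a single file
#the duplicated html markers and doctype declarations would still need pruning, exactly as described above for the Terminal approach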
Having described our corpus and merging strategy, we think it is important to devote a few lines to critically assessing it. Given that we are only dealing with English-language media, it is important to understand who gets to write and read said corpus. It is indeed the socio-economic elite who has access to English in India and who uses it on an everyday basis. Thus, our analysis is made from a corpus that can only reflect the discursive patterns of said elite and should not be taken as a global analysis of discourses in Kerala surrounding migration and the Kerala floods. Nonetheless, we think this does not mean the media discourse of the elite is not interesting and suitable for our research methods. Indeed, it is composed of a set of speech acts that were originally meant to be read (hence we are circumventing what Saussure termed the “tyranny of the text”: all of our texts were meant to be read rather than spoken), and we believe they can provide an insight into how a dominant group (the socio-economic elite) portrays and talks about dominated groups (marginalised communities).
As a first operation, we parsed our dataset into a form that can be analyzed with RStudio text-analysis functions. Secondly, we conducted a very basic analysis of the most recurrent words and bigrams. For the first part of the analysis we chose to carry out a sentiment analysis of our corpus. This type of analysis was adapted from Julia Silge and David Robinson's manual Text Mining with R. We thought it was particularly appropriate to explore how the vocabulary used in the corpus would translate into sentiments about the crisis.
We start by loading the following libraries:
library(tidyverse) #data wrangling and plotting (dplyr, ggplot2, ...)
library(rvest) #parsing the html corpus
library(tidytext) #tokenisation and sentiment lexicons
library(textdata) #needed to download the AFINN lexicon
library(tidyr) #reshaping functions such as spread() and unnest()
The next step is to parse our 994 articles, which had previously been merged:
articles = read_html("merged_pruned.2.html") %>%
html_nodes(".enArticle") #We take all the elements with the class enArticle
Next, to be able to use the rich metadata indexed by Factiva, we extract it (NB: this takes a while):
metadata = articles %>%
html_node("table") %>%
html_table() %>%
bind_rows(.id = "article_id") %>%
as_tibble() %>%
spread(X1, X2)
Now, we extract our paragraphs:
paragraphs = map(articles, html_nodes, ".articleParagraph")
paragraphs = map(paragraphs, html_text)
We then merge our paragraphs with our metadata; in doing so, we also reorder the article_ids:
complete_factiva = metadata %>%
mutate(article_id = as.numeric(article_id)) %>%
arrange(article_id) %>%
mutate(paragraphs = paragraphs)
Lastly, since the resulting table has more columns than we need, we reformat it:
kerala_factiva = complete_factiva %>%
select(article_name = AN, title = HD, date = PD, publication = PUB, place = RE, paragraphs) %>%
unnest(paragraphs) %>%
mutate(date = lubridate::dmy(date))
To obtain a dataset that is easier to work with, we begin by unnesting our paragraphs into words (bringing us closer to a “long” dataset rather than a “wide” one). We store the result in the variable -kerala_tidy_factiva-.
kerala_tidy_factiva <- kerala_factiva %>%
unnest_tokens(word, paragraphs)
The result can be viewed like so:
#view(kerala_tidy_factiva)
Now that we can visualize one word per row, we can proceed with some basic textual exploration
kerala_tidy_factiva %>%
group_by(word) %>% #group
summarise(occurrences = n()) %>% #count
arrange(-occurrences) #order our dataset
Unsurprisingly, the most frequent words are relatively empty of meaning on their own (the, of, in, and, ...). These are commonly referred to as “stop words”. To take care of this, we are going to remove them by using the anti_join function and a dictionary of stop words.
We start by visualizing said dictionary:
stop_words
view(stop_words)
And we remove the words included in the dictionary from our corpus. Yet, after obtaining the first result of this operation, we notice that https shows up as the 9th most recurrent word. We therefore chose to customize our stopwords so that only fully signifying lemmas (and numbers) appear among the first 10 results. This operation could have been extended further, but for the sake of this notebook we only wished to give a single useful example of how customized stopwords can be used. Here is how we customized them:
custom_stop_words <- bind_rows(tibble(word = c("https"),
lexicon = c("custom")),
stop_words)
custom_stop_words #we can visualize them
We can therefore visualize our results:
kerala_tidy_factiva %>%
anti_join (custom_stop_words) %>%
group_by(word) %>%
summarise(occurrences = n()) %>%
arrange(-occurrences) #%>% view()
The most frequently used words are “Kerala”, “people”, “floods”, “relief” and “India”. At this stage there is nothing particularly striking about the first results in terms of their wider interpretative meaning: they simply reflect the topic of the corpus. Yet it is interesting to notice how they help us summarize the content of the whole corpus in only a few words and how they give us a description of the issue at stake. With the extra contextual knowledge we have, we can reconstruct that heavy rains hit the Kerala region in India and had an impact on the population (people), urging the need for relief (operations). Some interesting hypotheses could be made by looking at the words further down the occurrence ranking. These expand the scope of the issue and become much more “sentimentally” charged, with a prevalence of the semantic fields of “death” and “politics”, including some economic references (rs, crore, etc.).
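As a small sketch of how one could inspect those lower-ranking words (reusing the objects defined above):
kerala_tidy_factiva %>%
  anti_join(custom_stop_words, by = "word") %>%
  count(word, sort = TRUE) %>%
  slice(10:40) #skip the ten most frequent words and inspect the next ones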
Some questions can then be set out before proceeding with the analysis: A) is the narrative around the Kerala floods a highly negative one? Is there space for positive discourse? If yes, in which regard? B) is there an evolution towards a more positive discourse when discussing the aftermath of the crisis? We could indeed expect that, it being a crisis, the language is mainly negative (hypothesis A) and that the discourse gradually evolves towards a more positive one (hypothesis B).
In order to conduct a sentiment analysis we will make use of a ready-made dictionary. In this case, we chose AFINN by Finn Årup Nielsen since, unlike other dictionaries, it does not use a binary classification: it ranks a series of words on a -5 to +5 scale. This is clearly a simplification, but qualifying terms like “abhorrent” (-3) as carrying a larger negative load than “inconvenient” (-2) seems a reasonable choice. Nonetheless, it is important to keep in mind that these methods do not take into account qualifiers before a word, such as in “no good” or “not true”, because they are based on unigrams only. However, since we can reasonably believe that there is no prevalence of sarcasm or negated text in our Factiva corpus, given its journalistic style (active, plain language), we consider sentiment analysis an appropriate technique to get a hint of how the media framed its discourse on human mobility during the Kerala floods.
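To make this unigram limitation concrete, here is a tiny illustrative sketch (run after loading the AFINN lexicon below; the sentence is invented for the example):
tibble(text = "the relief effort was not good") %>%
  unnest_tokens(word, text) %>%
  inner_join(get_sentiments("afinn"), by = "word")
#only "good" matches the lexicon and scores +3: the negation "not" is invisible to a unigram-based method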
We start by loading the data with the AFINN sentiment analysis:
get_sentiments("afinn")
Now we will examine how the sentiment changes across the narrative arc of our Factiva corpus on Kerala. We therefore launch our sentiment analysis.
Firstly, we need to group our articles in some way (for this example we group them by their title) and mutate their row_number() (which requires a numeric value) into a line number that can operate with our next script.
kerala_for_sentiment_grouped_by_title<-kerala_tidy_factiva %>%
group_by(title) %>% #grouping articles by title
mutate(linenumber = row_number())
And here is our sentiment analysis. Nota bene: we create chunks of text composed of 40 lines each, because our shortest article is composed of 40 lines:
afinn_analysis <- kerala_for_sentiment_grouped_by_title %>%
inner_join(get_sentiments("afinn")) %>%
group_by(index = linenumber %/% 40) %>% #here we create chunks of 40 lines each: we did so because our shortest article was composed of 40 lines
summarise(sentiment = sum(value)) %>%
mutate(method = "AFINN")
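As a quick standalone illustration of how the integer division creates those chunks:
c(0, 39, 40, 79, 80) %/% 40 #returns 0 0 1 1 2: rows 0-39 fall into chunk 0, rows 40-79 into chunk 1, and so on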
And now we create a graph showing the sentiment analysis of each article:
bind_rows(afinn_analysis) %>%
ggplot(aes(index, sentiment, fill = method)) +
geom_col(show.legend = FALSE)
As a way of responding to our hypotheses, we can right away observe that the general framing of the event was indeed a negative one, insofar as it used negatively loaded terms. Nonetheless, our hypothesis that the framing would become less negative as time passed seems to be mistaken.
Given that the articles in Factiva are numbered in “backwards chronological order” (meaning that the article coded as 0 in the index is the most recent one, published close to the anniversary in August 2019, while the last article is the earliest one, published back in August 2018), we could say that the media's discourse became particularly prone to pessimism in more recent times. We would venture, lacking a more substantial explanation that would require methods like ethnography, that this pessimism is due to the “social remembering” of the floods as a traumatic experience. Moreover, the fact that, following disasters and crises, the media can carry out a complete evaluation of the registered damages, losses and deaths only a posteriori, partly thanks to data received from the government and NGOs, might also contextualise this finding. These later evaluations will likely translate into a wider use of negative lemmas in accounts of the crisis.
So far we have grouped articles by title, which means that each sentiment analysis takes individual news pieces into account. This might be interesting, but we think that accounting for dates of publication might be a more fruitful heuristic for answering our research question. By doing so, we add up the sentiment scores of a given day and get a more accurate depiction: if a given day has three articles loaded with negative terms and one article loaded with positive terms, the aggregated sentiment score will reflect it. We are trading off grasping particularities for a more dynamic overview of the media's framing of the discourse.
We start by grouping our articles by publication date:
kerala_for_sentiment_grouped_by_date<-kerala_tidy_factiva %>%
group_by(date) %>% #grouping by date of publication
mutate(linenumber = row_number())
And conduct our sentiment analysis:
afinn_analysis2 <- kerala_for_sentiment_grouped_by_date %>%
inner_join(get_sentiments("afinn")) %>%
group_by(index = linenumber %/% 40) %>%
summarise(sentiment = sum(value)) %>%
mutate(method = "AFINN")
Now let’s visualize the aforementioned sentiment analysis:
bind_rows(afinn_analysis2) %>%
ggplot(aes(index, sentiment, fill = method)) +
geom_col(show.legend = FALSE)
Now, this sentiment analysis seems to be slightly more nuanced. It still shows a general pattern where more recent articles (those with an index closer to 0) contain more negative terms. However, compared with the previous bar chart, it adds a nuance around the early articles, which tended to use positive terms (we would venture that this is due to references to relief programmes at the early stages of media coverage). As already explained, we believe that the rising use of negative terms is probably due to the workings of the media and their access to evaluations of the damage caused by the floods. Moreover, given that articles with a low index were close to the anniversary of the floods, we consider this another confirmation that the social remembering tended to focus on grief rather than hope and resilience.
We then decided to add some more immediate sentiment analysis visualizations, for the reader to have a simple overview of the corpus and for us to compare the results of sentiment analysis with AFINN against a binary lexicon like Bing, or a categorical one like NRC.
We therefore first scanned our full corpus with the Bing vocabulary, with the aim of obtaining a clear-cut identification of positive and negative words in the corpus. Here is how we did it and visualized it with ggplot:
bing_word_counts <- kerala_tidy_factiva %>%
inner_join(get_sentiments("bing")) %>%
count(word, sentiment, sort = TRUE) %>%
ungroup()
bing_word_counts %>%
group_by(sentiment) %>%
top_n(10) %>%
ungroup() %>%
mutate(word = reorder(word, n)) %>%
ggplot(aes(word, n, fill = sentiment)) +
geom_col(show.legend = FALSE) +
facet_wrap(~sentiment, scales = "free_y") +
labs(y = "Contribution to sentiment",
x = NULL) +
coord_flip()
These are the 10 most used negative words of our corpus, compared with the 10 most used positive ones. It is rather clear that we are again dealing with a corpus that opposes the semantic sphere of death and loss to that of relief work. Yet we find it interesting to observe the limits not only of this approach (once we visualize this, we cannot say much more), but also of the Bing vocabulary: while the lemmas in the negative chart give us a substantive idea of the topic we are dealing with, the positive chart shows a prevalence of not-so-telling positive adjectives. Again, we could repeat the operation of customizing stopwords to keep only clear-cut negative or positive words. For example, “issue”, shown among the most negative words, does not necessarily carry such a clear-cut negative charge, just as the word “well” among the positive ones does not tell us much in terms of information, although it is indeed positively charged. We choose not to continue here, but it could be an option if we wished to expand our analysis further, as the sketch below suggests.
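A minimal sketch of what that extension could look like, assuming we did want to drop ambiguous words such as “issue” and “well”:
domain_stop_words <- bind_rows(tibble(word = c("issue", "well"), #hypothetical additions
  lexicon = c("custom")),
  custom_stop_words)
bing_word_counts_filtered <- kerala_tidy_factiva %>%
  anti_join(domain_stop_words, by = "word") %>% #remove the ambiguous words first
  inner_join(get_sentiments("bing"), by = "word") %>%
  count(word, sentiment, sort = TRUE)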
This time we chose instead to proceed with the same analysis with NRC, to get an overview of our different vocabularies' functions and results. NRC scans the corpus looking for different sentiments which share (overlapping) vocabularies: it looks for lemmas that are signifiers of trust, joy, etc. Here is the same operation, done with NRC:
nrc_word_counts <- kerala_tidy_factiva %>%
inner_join(get_sentiments("nrc")) %>%
count(word, sentiment, sort = TRUE) %>%
ungroup()
nrc_word_counts %>%
group_by(sentiment) %>%
top_n(10) %>%
ungroup() %>%
mutate(word = reorder(word, n)) %>%
ggplot(aes(word, n, fill = sentiment)) +
geom_col(show.legend = FALSE) +
facet_wrap(~sentiment, scales = "free_y") +
labs(y = "Contribution to sentiment",
x = NULL) +
coord_flip()
We find this visualization particularly interesting. First of all, if we observe sentiments such as “joy”, we could even say that the NRC vocabulary not only allows us to regroup lemmas by sentiment, but somehow also gives us a cluster view of topics in our corpus. Obviously, we do not want to suggest that this is an appropriate way to look at themes within the corpus, as cluster analysis and topic modeling are much more adequate means to do so, but we still believe it is an interesting observation. Indeed, a beginner might want a quick and rough thematic overview of the corpus, and this can be done by reading NRC sentiments as “themes”. Nota bene: once again we need to be careful, as “ministry” and “money”, for example, are qualified here as positive lemmas of “joy”, when in the articles they could also stand for failures and damages.
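To pursue this “themes” reading a little further, one could isolate the words driving a single NRC sentiment (a small sketch reusing nrc_word_counts):
nrc_word_counts %>%
  filter(sentiment == "joy") %>%
  top_n(10) #the ten lemmas contributing most to "joy"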
As a last thing, and mostly to have fun, we tried to visualize word clouds.
install.packages("wordcloud") #the first time
library(wordcloud)
kerala_tidy_factiva %>%
anti_join(stop_words) %>%
count(word) %>%
with(wordcloud(word, n, max.words = 100))
It would be interesting to investigate why this cloud does not display the first 100 words as indicated; there could be a problem with the visualization that would merit further analysis.
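A hedged guess at a fix, assuming the missing words are simply dropped because they do not fit in the plotting area: shrinking the scale parameter usually lets more of them in.
kerala_tidy_factiva %>%
  anti_join(stop_words, by = "word") %>%
  count(word) %>%
  with(wordcloud(word, n, max.words = 100, scale = c(3, 0.3))) #smaller font range than the default c(4, 0.5)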
install.packages("reshape2") #the first time
library(reshape2)
kerala_tidy_factiva %>%
inner_join(get_sentiments("bing")) %>%
count(word, sentiment, sort = TRUE) %>%
acast(word ~ sentiment, value.var = "n", fill = 0) %>%
comparison.cloud(colors = c("gray20", "gray80"),
max.words = 100)
We can make the same observations as for the single cloud. Interpretatively, there is not much to add to what has been said previously: we believe that word clouds can be interesting tools for disseminating content, but they are not highly valid tools for analysis.
In the introduction we discussed the way subaltern groups like the dalit and adivasi communities suffered from (the lack of) relief policies coming from a legal apparatus that did not recognize their livelihoods within its infrastructure. Based on this material discrimination, we would like to examine the extent to which they are represented in the media's discourse.
To do so, we will use the grepl function to search for instances of what we consider to be underprivileged communities: chiefly the adivasi, the dalits, workers and people with differently abled bodies. Our query is structured to be rather generous: it will capture both an actual mention of a certain community (e.g. dalit) and a mention of the general field (e.g. casteism).
signifiers_of_marginality<-kerala_tidy_factiva %>%
select(date, word) %>%
mutate(word = tolower(word)) %>%
filter(grepl("adivasi|casteist|casteism|dalit|undercaste|tribe|tribal|untouchable|worker|marginal|poor|poverty|vulnerable|vulnerability|sick|diseased|handicapped", word)) %>%
select(date, word)
Let us plot this information according to the publication date of our articles:
ggplot(signifiers_of_marginality)+
aes(date) +
geom_bar(colour = 'deeppink3', alpha=0.2)
In total, our search query yielded 856 observations of what we have defined as language about underprivileged communities. If we recall our earlier script at lines 257-264, this is in sharp contrast with the frequency of terms like Kerala (4815 occurrences), people (3638 occurrences), floods (2990 occurrences) and relief (1962 occurrences). While we understand that words like Kerala and floods are emphasized by our own search query in Factiva, we also think that the media's preference for talking about “people” in general rather than about specific communities (and marginalised communities in particular) shows a remarkable pattern indeed. Chronologically speaking, the first major entrance of signifiers of marginality into the media discourse took place in October, two months after the first coverage of the Kerala floods; it rapidly decreased afterwards and had a minor prominence in December. Later on, it peaked in April and slowly decreased as the anniversary of the floods approached.
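A quick way to check which individual signifiers drive these 856 observations would be to tabulate them (a simple extension we did not pursue in the main analysis):
signifiers_of_marginality %>%
  count(word, sort = TRUE) #e.g. how many hits come from "worker" versus "dalit"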
We want to examine the occurrences of the bigram “climate change” in our corpus. This will tell us to what extent the floods were related to the issue of climate change.
We therefore extract bigrams from our tidy dataset:
kerala_bigrams<-kerala_factiva %>%
unnest_tokens(word, paragraphs, token = "ngrams", n = 2)
#view(kerala_bigrams)
kerala_bigrams %>%
group_by(word) %>% #group
summarise(occurrences = n()) %>% #count
arrange(-occurrences) %>% #order our dataset
view()
And we then launch the grepl function on this new bigrams dataset:
kerala_climatechange<-kerala_bigrams %>%
select(date, word) %>%
mutate(word = tolower(word)) %>%
filter(grepl("climate change", word)) %>%
select(date, word)
view(kerala_climatechange)
The occurrences (53) are minimal if we consider the length of our corpus. In this regard, we could say that the floods were mostly not linked to climate change. Yet we wanted to proof-check this result, and we therefore performed a simple manual operation: we went back to our full view of the corpus' bigrams and typed “climate” into the search function, in order to see whether something was escaping our query. And, indeed, we saw that among the first results were bigrams such as “climate apartheid”, “climate emergency”, “climate action”, etc. This was impressive, and it made us reflect on how multi-faceted language is: the kind of queries we can formulate can hardly encapsulate this linguistic complexity. In fact, many more terms could have been used to refer to the issue of climate change, most of which we would not have thought of when designing the simple search. This was also a strong warning for us: when doing text analysis we need to be extremely cautious before jumping to conclusions, and always try to proof-check or nuance them.
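The manual check described above could also be scripted, for instance by listing every bigram that starts with “climate” (a sketch reusing our bigrams dataset):
kerala_bigrams %>%
  mutate(word = tolower(word)) %>%
  filter(grepl("^climate ", word)) %>%
  count(word, sort = TRUE) #surfaces variants such as "climate apartheid" or "climate emergency"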
For further study, it would be interesting to also explore the number of occurrences of the bigram “climate change” in articles' titles, in order to then identify which specific articles, from which specific journals, addressed the issue with this particular framing and draw further conclusions. However, for the sake of this specific research, we will not go into this depth of analysis, beyond the starting point sketched below.
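A minimal sketch of how that title-level check could start (reusing the metadata columns extracted earlier; the object name titles_climatechange is ours):
titles_climatechange <- kerala_factiva %>%
  distinct(article_name, title, publication, date) %>% #one row per article
  filter(grepl("climate change", tolower(title)))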
As we mentioned at the very beginning, although issues and terms such as “climate displacement”, “climate migration” or “climate refugee” are increasingly used in migration and asylum studies, they still lack a clear-cut legal definition. In fact, there is a strong political reticence to frame these issues as such, as the option of including climate-related displacement among the legitimate grounds for claiming asylum is perceived as a threat by unwilling governments. In particular, they tend to claim that formalizing climate-based asylum would push the door wide open to massive inflows of displaced people in the coming decades. In this regard, the issue is strictly related to climate change. Here is the search we conducted with some of the terms increasingly used in academia:
signifiers_climatemigration<-kerala_bigrams %>%
select(date, word) %>%
mutate(word = tolower(word)) %>%
filter(grepl("climate migration|climate migrants|climate refugee|climate asylum|climate displacement|climate displaced|internally displaced", word)) %>%
select(date, word)
view(signifiers_climatemigration)
However, the search yields only 3 results (2x “climate refugees” and 1x “climate displacement”). In this regard, we can say that the issue is largely disregarded by the media. Yet we can also consider that this might not be a wilful choice: it might also result from a lack of familiarity with these terms, which remain mainly confined within the “walls” of academia.
In conclusion, this research led us to different considerations concerning both the corpus itself and text analysis as a research tool. As we approached it with a beginner's eye, we carried out different analyses to compare and proof-check our results, and what we found was deeply interesting. We therefore:
a) explored the limits and potential of the diverse sentiment-analysis vocabularies;
b) explored (from afar) the concept of Distant Reading of a corpus;
c) understood the incredible potential of matching quantitative text analysis with qualitative interpretation. The latter helped us to make sense of the limitations and the margins of error, as well as to extract overarching conclusions by comparing different numerical results.
In terms of the corpus itself we concluded that:
1) there is indeed a prevalence of negative discourse, linked to the semantic fields of death, loss and disaster. Yet we also witnessed the presence of positive lemmas, mainly linked to the field of relief and reconstruction. It would be interesting, in the future, to explore whether other crises/disasters have been narrated in the same way (perhaps also by comparing different types of crisis);
2) this negative discourse is (surprisingly) stronger in the later stages of the crisis. We attributed this to a pattern of traumatic social remembering, as well as to a wider availability of clear post-crisis “damage assessments”. Yet we cannot be sure of this, and it would be very interesting to conduct more qualitative research on the government's crisis management, as its (in)effectiveness could also have been among the determinants of this negative trend;
3) and lastly: a) there are indeed invisibility patterns with regard to certain castes and segments of society within our corpus, reflecting wider social dynamics; b) references to “climate change” as such are very limited, but this should not lead us to hasty conclusions, as the issue could have been (and seems to have been) represented through other expressions and idioms across the corpus; c) there is no reference to climate displacement in the corpus, which also reflects a wider lack of legal and political consensus on this issue at large.
Bibliography:
Mathrubuni, Leaving No One Behind: Lessons from the Kerala Disasters, 2019.
National Dalit Watch, The Extent of Inclusion of Dalit and Adivasi Communities in the Post Disaster Response in Kerala 2019, 2019.
Silge, Julia and Robinson, David, Text Mining with R, 2020, https://www.tidytextmining.com/
UNDP, Post Disaster Needs Assessment, 2018, https://www.undp.org/content/undp/en/home/librarypage/crisis-prevention-and-recovery/post-disaster-needs-assessment---kerala.html