Text as Data: Blog #6

Final Blog Post

Kalimah Muhammad
2022-04-25

In this final blog post, I attempt a comparative analysis across subreddit groups and between the pre-pandemic year (2019) and the pandemic years (2020-2021). As a reminder, my research questions are:

Research Questions
* Did conversations of faith and religion strengthen, weaken, or remain the same among the faith subreddit groups: r/Christianity, r/Islam, and r/Judaism, during the pandemic years of 2020-2021 as compared to 2019?
* How did faith and religion permeate discussions of covid-19?

To see how documents are distributed across groups and years, I looked at the following tables.


CHRISTIANITY      COVID19        ISLAM      JUDAISM 
         791          946          866          878 

2019 2020 2021 
 448 1759 1274 

The first table shows the distribution of top posts by subreddit group across all three years, 2019-2021. The second table shows the total thread distribution by year. Both tables account for the same 3,481 documents. Finally, the plot shows the distribution by subreddit and by year.
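Because the raw counts differ so much from year to year, it can help to read them as shares of the whole corpus. The following is a minimal sketch in Python, not the code used for this post: the counts are copied from the two tables above, and the helper name `shares` is hypothetical.

```python
# Counts copied from the two tables above.
by_group = {"CHRISTIANITY": 791, "COVID19": 946, "ISLAM": 866, "JUDAISM": 878}
by_year = {2019: 448, 2020: 1759, 2021: 1274}


def shares(counts):
    """Return each category's share of the total as a fraction (hypothetical helper)."""
    total = sum(counts.values())
    return {key: value / total for key, value in counts.items()}


# Both tables describe the same set of documents, so the totals should match.
total_docs = sum(by_group.values())
assert total_docs == sum(by_year.values())

group_shares = shares(by_group)
year_shares = shares(by_year)

for group, share in group_shares.items():
    print(f"{group:<14}{share:.1%}")
for year, share in year_shares.items():
    print(f"{year:<14}{share:.1%}")
```

Read this way, the four groups are fairly balanced, while 2019 contributes far fewer documents than either pandemic year, which is worth keeping in mind when comparing topic proportions across years.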

Correlated Topic Model

Topic 1 Top Words:
     Highest Prob: just, pray, please, thank, time, can, people 
     FREX: asking, surgery, tomorrow, shahada, process, dog, pornography 
     Lift: -1, -become, -built, -chanted, -finally, -gone, -had 
     Score: thank, pray, prayers, please, porn, just, post 
Topic 2 Top Words:
     Highest Prob: muslim, one, day, first, today, mosque, us 
     FREX: eid, art, mum, concentration, painted, risen, easter 
     Lift: -22, -6, -niv, -strength-love_, #3, #4, #5 
     Score: mum, muslim, mosque, forty, day, today, killed 
Topic 3 Top Words:
     Highest Prob: jewish, jews, people, us, can, like, israel 
     FREX: racism, shalom, racist, antisemitic, rabbi, me_, anti-semitism 
     Lift: legislation, mankind, onset, racial, represent, -dr, #_ 
     Score: jews, jewish, israel, nbsp, racism, shalom, https://ncov2019.live/data 
Topic 4 Top Words:
     Highest Prob: god, love, jesus, like, life, just, church 
     FREX: atheism, lgbt, loves, internet, drew, jesus, started 
     Lift: -kobe, -matthew, -with, #1-, _and, _awhile, _couldnt 
     Score: god, love, felt, jesus, started, church, really 
Topic 5 Top Words:
     Highest Prob: covid-19, sars-cov-2, vaccine, coronavirus, patients, covid-19_, study 
     FREX: covid-19, sars-cov-2, vaccine, covid-19_, infection, clinical, vitamin 
     Lift: 2019_, 2020_, activity, admission, agency, agreement, analysis_ 
     Score: covid-19, sars-cov-2, vaccine, covid-19_, vitamin, clinical, antibody 

Here we see many of the same terms that appeared in the previous post's word clouds and among the key terms used when applying dictionary approaches in context.

Plot Examples

Plot of Topic Model by Prevalence

Plot of Labels

Plot of Two Perspectives: Topics 1 and 5

In the plot above, “covid” is by far the dominant term across the two topics, followed by other covid-related terms. Interestingly, the remaining terms in Topics 1 and 5 cluster fairly close together, while “covid” sits apart from those groupings.

Topic Names and Labels

The following is a list of topic names based on the Score values from the topics above. Each name joins the topic's four highest-scoring words with underscores.

[1] "thank_pray_prayers_please"            
[2] "mum_muslim_mosque_forty"              
[3] "jews_jewish_israel_nbsp"              
[4] "god_love_felt_jesus"                  
[5] "covid-19_sars-cov-2_vaccine_covid-19_"
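The naming rule behind this list can be sketched in a few lines. This is an illustration in Python rather than the original code: the words are copied from the Score rows above, and `topic_name` is a hypothetical helper that assumes the rule is "join the first four Score words with underscores".

```python
# Score words copied from the topic model output above.
score_words = {
    1: ["thank", "pray", "prayers", "please", "porn", "just", "post"],
    2: ["mum", "muslim", "mosque", "forty", "day", "today", "killed"],
    3: ["jews", "jewish", "israel", "nbsp", "racism", "shalom",
        "https://ncov2019.live/data"],
    4: ["god", "love", "felt", "jesus", "started", "church", "really"],
    5: ["covid-19", "sars-cov-2", "vaccine", "covid-19_", "vitamin",
        "clinical", "antibody"],
}


def topic_name(words, n=4):
    """Join the first n words with underscores (hypothetical helper)."""
    return "_".join(words[:n])


topic_names = [topic_name(score_words[k]) for k in sorted(score_words)]

for index, name in enumerate(topic_names, start=1):
    print(f"[{index}] {name}")
```

Because the names are built directly from the Score words, leftover tokens such as "nbsp" and "covid-19_" carry straight through into the labels.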

Here we can see that there is possibly one topic for each of the subreddit groups: #2 for r/Islam, #3 for r/Judaism, #4 for r/Christianity, and #5 for r/Covid19. The first topic, whose top words center on prayer and thanks, could capture religious discussion shared across the three faith groups.

Plot Topic Loess

Here I assume the pink lines arching upward are topics related to covid-19, since threads on that subject increased sharply during 2020 and 2021. The faith-based topics, by contrast, may have had a continuous presence in the threads, which would explain why their topic proportions appear somewhat flat. A legend would easily resolve this uncertainty; however, I had difficulty adding one to this plot.

Plotting Estimate of Effect

The following plots show the estimated topic prevalence by year for each named topic.

Lastly, the graphs above show the expected topic proportion over time. Most topics decline over time; the exceptions are topic #5 on covid and topic #3 on Judaism.
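As a rough intuition for what "expected topic proportion by year" means, one can average each topic's share over the documents from a given year. The sketch below is in Python and makes strong assumptions: the document-topic proportions are made-up toy values, not results from this analysis, and a simple per-year mean ignores the uncertainty that a proper effect estimate would report. The function name `mean_topic_proportion_by_year` is hypothetical.

```python
# Hypothetical document-topic proportions (each row sums to 1).
# These are toy values for illustration only, NOT the data from this post.
docs = [
    {"year": 2019, "theta": [0.50, 0.30, 0.20]},
    {"year": 2019, "theta": [0.30, 0.50, 0.20]},
    {"year": 2020, "theta": [0.20, 0.20, 0.60]},
    {"year": 2020, "theta": [0.20, 0.40, 0.40]},
    {"year": 2021, "theta": [0.10, 0.30, 0.60]},
]


def mean_topic_proportion_by_year(docs):
    """Average each topic's proportion over the documents of each year."""
    sums = {}
    counts = {}
    for doc in docs:
        year = doc["year"]
        if year not in sums:
            sums[year] = [0.0] * len(doc["theta"])
            counts[year] = 0
        for k, proportion in enumerate(doc["theta"]):
            sums[year][k] += proportion
        counts[year] += 1
    return {
        year: [total / counts[year] for total in sums[year]]
        for year in sorted(sums)
    }


prevalence = mean_topic_proportion_by_year(docs)
for year, proportions in prevalence.items():
    print(year, [round(p, 2) for p in proportions])
```

In this toy example the third topic's average share rises from 2019 to 2021 while the first falls, which is the kind of pattern the plots above describe for the covid topic versus most of the others.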