1. PREPARE

    1. Context:

      1. Working parents in the United States experience limited support systems due to a lack of affordable childcare, the absence of paid family leave, and abbreviated family leave (Ogrysko, 2019). Mothers disproportionately manage children’s needs, especially during their children’s early years, due to cultural expectations, gender pay disparity, and likely other factors as well (Kim, 2021; Ogrysko, 2019). As a result, mothers of young children face barriers to career advancement and personal wellbeing: mothering duties constrain their schedules and leave less time for sleep, exercise, social engagement, and other self-care. While the United States guarantees legal protection and accommodation for individuals with disabilities through legislation such as the Americans with Disabilities Act (ADA) and the Individuals with Disabilities Education Act (IDEA), and offers subsidized health insurance (Medicaid) to lower-income families with disabled children, “on the ground” these programs can be time-consuming and confusing to navigate. Additionally, children with disabilities typically require frequent medical and therapy appointments and may have unpredictable episodes of illness or even hospitalization. Thus, mothers of children with disabilities face significant constraints on their time and are asked to balance many roles (Kim, 2021; Stewart, 2020).

         To better understand the current dialogue surrounding mothers of children with disabilities, I investigated recent tweets with keywords such as “mom” or “mother” and “disability” or “special needs” and analyzed the sentiment of the tweets using the afinn, loughran, bing, and nrc lexicons. Lexicons are pre-existing collections of words, each tagged with one or more sentiments. While some lexicons characterize words on many axes (trust, etc.), all of those used here also offer a basic positive and negative characterization.

         Lexicons are created using available texts, generally from online sources, and therefore may not be accurate or valid in all contexts. Validity is enhanced with human review of words and sentiment values to ensure accuracy.

      2. My guiding questions for this report are:

        1. What is the overall sentiment of recent tweets on the topic of mothers parenting children with disabilities?

        2. Does sentiment vary based on keyword (disability vs. special needs)?

        3. Does sentiment vary by lexicon?

        Another question that interests me, which I won’t address in this project but may take up in my final project for this course, is whether sentiment varies with the location of the Twitter poster. More specifically, does sentiment differ in states with expansive free or low-cost pre-K programs?

        Some evidence suggests that greater access to low-cost early childhood education improves lifelong developmental and educational trajectories for children with disabilities (as well as for students from low income families and English language learners). It would also be interesting to see if such programs offer a “spillover effect” to mothers of kids with disabilities. Are mothers’ experiences different (as viewed by sentiment of tweets) in states with expansive pre-K programs versus those without?

    2. Set up: To begin, I’ll load the required packages into the library (each one must already be installed).

Sys.setlocale("LC_MESSAGES", "en_US.utf8")
## Warning in Sys.setlocale("LC_MESSAGES", "en_US.utf8"): LC_MESSAGES exists on
## Windows but is not operational
## Warning in Sys.setlocale("LC_MESSAGES", "en_US.utf8"): OS reports request to set
## locale to "en_US.utf8" cannot be honored
## [1] ""
library(dplyr)
## Warning: package 'dplyr' was built under R version 4.0.5
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
        library(readr)
## Warning: package 'readr' was built under R version 4.0.5
        library(tidyr)
## Warning: package 'tidyr' was built under R version 4.0.5
        library(rtweet)
## Warning: package 'rtweet' was built under R version 4.0.5
        library(writexl)
## Warning: package 'writexl' was built under R version 4.0.5
        library(readxl)
## Warning: package 'readxl' was built under R version 4.0.5
        library(tidytext)
## Warning: package 'tidytext' was built under R version 4.0.5
        library(textdata)
## Warning: package 'textdata' was built under R version 4.0.5
        library(ggplot2)
## Warning: package 'ggplot2' was built under R version 4.0.5
        library(scales)
## Warning: package 'scales' was built under R version 4.0.5
## 
## Attaching package: 'scales'
## The following object is masked from 'package:readr':
## 
##     col_factor
        library(wordcloud2)
## Warning: package 'wordcloud2' was built under R version 4.0.5
3.  Next, I'll store the API keys, authenticate with them, and check that the token is loaded. Note: secret keys are hidden.
## authenticate using the stored API keys and access tokens
        token <- create_token(
          app = app_name,
          consumer_key = api_key,
          consumer_secret = api_secret_key,
          access_token = access_token,
          access_secret = access_token_secret)
        ## check to see if the token is loaded
        get_token()
## <Token>
## <oauth_endpoint>
##  request:   https://api.twitter.com/oauth/request_token
##  authorize: https://api.twitter.com/oauth/authenticate
##  access:    https://api.twitter.com/oauth/access_token
## <oauth_app> SNAfoo
##   key:    clxr0mEFqC3FwV5ESDd8d3yNF
##   secret: <hidden>
## <credentials> oauth_token, oauth_token_secret
## ---
2. WRANGLE

    2a. Importing Tweets

    1. First, I’ll import the tweets and view the resulting data frame.
# disability_tweets <- search_tweets(q = "#disability", n = 5000)

        # specialneeds_tweets <- search_tweets(q = "#specialneeds", n = 5000)

        # mom_disspecneeds_tweets <- search_tweets(q = "#disability OR #specialneeds AND mom",
        #                                          n = 5000,
        #                                          include_rts = FALSE)

        # kid_disspecneeds_tweets <- search_tweets(q = "#disabledchild OR #specialneeds AND mom",
        #                                          n = 5000,
        #                                          include_rts = FALSE)

        # There's actually a lot of overlap between the findings of the previous
        # two searches, so the first set of terms seems to be sufficient.
2.  Next I'll create two dictionaries, one for the keyword "disability" and the other for the keyword phrase "special needs".
# Next I'll create the dictionaries for 'special needs mom' and 'disabled
        # child mom'

        specneedsmom_dictionary <- c("#specialneeds AND mom",
                             '"#specialneeds AND mother"',
                             '"special needs mom"',
                             '"special needs kid"',
                             '"special needs child"')

        snm_tweets <- search_tweets2(specneedsmom_dictionary,
                                      n=5000,
                                      include_rts = FALSE)

        diskidmom_dictionary <- c("#disabledchild AND mom",
                              '"#disabledchild AND mother"',
                              '"disabled child mom"',
                              '"disabled kid mom"',
                              '"disabled kid"',
                              '"disabled child"')

        dm_tweets <- search_tweets2(diskidmom_dictionary,
                                     n=5000,
                                     include_rts = FALSE)
3.  Next, I'll save the tweet files to Excel. This allows me to have a stable set of data, since Twitter and tweets are constantly changing. This would be useful if I want to do further analysis on this same set of tweets.
## Saving tweet files to Excel (need to create data folder first)
        #write_xlsx(snm_tweets, "data/snm_tweets.xlsx")
        #write_xlsx(dm_tweets, "data/dm_tweets.xlsx")
        dm_tweets <- read_xlsx("data/dm_tweets.xlsx")
        snm_tweets <- read_xlsx("data/snm_tweets.xlsx")
    2b. Tidying the text

4.  Here I'll filter tweets by language, select relevant columns, add a column for keyword ("disability" vs. "special needs"), and relocate that column to the first position.
#for disability


dm_text <- dm_tweets %>%
          filter(lang == "en") %>%
          select(screen_name, created_at, text) %>%
          mutate(keyword = "disability") %>%
          relocate(keyword)
 #for special needs
        snm_text <- snm_tweets %>%
          filter(lang == "en") %>%
          select(screen_name, created_at, text) %>%
          mutate(keyword = "special needs") %>%
          relocate(keyword)
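
The two pipelines above differ only in the input data frame and the keyword label, so they could be wrapped in one small helper. The sketch below is illustrative only: the function name `prep_tweets` and the toy data frame are assumptions, standing in for the real data frames returned by the tweet search.

```r
library(dplyr)

# Hypothetical helper: keep English tweets, keep the columns of interest,
# and tag every row with the search keyword it came from.
prep_tweets <- function(df, keyword_label) {
  df %>%
    filter(lang == "en") %>%
    select(screen_name, created_at, text) %>%
    mutate(keyword = keyword_label) %>%
    relocate(keyword)
}

# Toy stand-in for a data frame of imported tweets (made-up rows)
toy_tweets <- tibble(
  screen_name = c("user_a", "user_b", "user_c"),
  created_at  = as.POSIXct(c("2022-02-01 10:00:00",
                             "2022-02-01 11:00:00",
                             "2022-02-02 09:30:00"), tz = "UTC"),
  text        = c("first tweet", "segundo tweet", "third tweet"),
  lang        = c("en", "es", "en")
)

toy_text <- prep_tweets(toy_tweets, "disability")
toy_text
```

With a helper like this, the same call could be repeated with a different data frame and label for the second keyword.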
5.  Combining the data frames and looking at the head and tail of the result
tweets <- bind_rows(dm_text, snm_text)
        head(tweets)
## # A tibble: 6 x 4
##   keyword    screen_name    created_at          text                            
##   <chr>      <chr>          <dttm>              <chr>                           
## 1 disability dropoutninja   2022-02-02 15:05:18 "you know what's even more deva~
## 2 disability dhargenerator  2022-02-02 11:37:11 "Disabled Kid threatens to fire~
## 3 disability dhargenerator  2022-01-29 11:57:25 "Disabled Kid witch hunts Famou~
## 4 disability dhargenerator  2022-01-30 21:46:59 "Gay Democrats threatens to fir~
## 5 disability RyanLavender94 2022-02-02 06:51:06 "@charlieINTEL Bragging about b~
## 6 disability Minagelina     2022-02-02 03:35:19 "@kaptanobveus I wonder what we~
        tail(tweets)
## # A tibble: 6 x 4
##   keyword       screen_name     created_at          text                        
##   <chr>         <chr>           <dttm>              <chr>                       
## 1 special needs walrozt         2022-01-25 14:43:59 "@WataAce1 @marybaphomet No~
## 2 special needs MrProPHessional 2022-01-25 14:22:53 "@queenglitter4 As a father~
## 3 special needs KinsG8R         2022-01-25 14:11:38 "@ddespairmusic @RollingSto~
## 4 special needs Rasuberri       2022-01-25 14:01:57 "@pulte Single mom of a spe~
## 5 special needs HeatherBayne6   2022-01-25 13:58:46 "@Jayecane Yes I got into a~
## 6 special needs RoseDaddyMike   2022-01-25 13:28:38 "@Neilyoung  was a pile of ~
6.  Tokenizing the text
tweet_tokens <- 
          tweets %>%
          unnest_tokens(output = word, 
                        input = text, 
                        token = "tweets")
## Using `to_lower = TRUE` with `token = 'tweets'` may not preserve URLs.
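
To show what tokenizing does to the data frame, here is a minimal sketch on two made-up tweets. It assumes the default word tokenizer of `unnest_tokens()` rather than `token = "tweets"`, since the tweet-specific tokenizer may not be available in newer tidytext releases; the default does not give hashtags or mentions any special treatment.

```r
library(dplyr)
library(tidytext)

# Made-up tweets for illustration
toy_tweets <- tibble(
  keyword = c("special needs", "disability"),
  text    = c("Single mom of a special needs child",
              "So proud of my kid today")
)

# One row per word; unnest_tokens() lowercases by default and keeps
# the other columns (here, keyword) alongside each token.
toy_tokens <- toy_tweets %>%
  unnest_tokens(output = word, input = text)

toy_tokens
```

Each tweet becomes as many rows as it has words, which is the shape the later lexicon joins rely on.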
7.  Removing stop words, looking at top word counts, and filtering out nonsense words.
## Removing stop words
        tidy_tweets <-
          tweet_tokens %>%
          anti_join(stop_words, by = "word")

        ## Looking at top word counts
        count(tidy_tweets, word, sort = T)
## # A tibble: 6,283 x 2
##    word         n
##    <chr>    <int>
##  1 child      751
##  2 disabled   656
##  3 special    372
##  4 kid        258
##  5 amp        111
##  6 parent      99
##  7 addition    96
##  8 im          96
##  9 dont        94
## 10 share       94
## # ... with 6,273 more rows
        #Making the code more orderly and removing nonsense words
        tidy_tweets <-
          tweet_tokens %>%
          anti_join(stop_words, by = "word") %>%
          filter(!word == "amp" & !word == "im" & !word == "special" 
                 & !word == "disabled" & !word == "child" 
                 & !word == "kid" & !word == "#etsy" & !word == "kids"
                 & !word == "1" & !word == "2"  & !word == "3"  
                 & !word == "4")
                 
        count(tidy_tweets, word, sort = T)
## # A tibble: 6,271 x 2
##    word                             n
##    <chr>                        <int>
##  1 parent                          99
##  2 addition                        96
##  3 dont                            94
##  4 share                           94
##  5 bite                            93
##  6 shop                            93
##  7 childautismcerebral             91
##  8 excited                         91
##  9 palsyarthritisautoaggression    91
## 10 school                          86
## # ... with 6,261 more rows
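
The chain of `!word == "..."` conditions above can be written more compactly by keeping the unwanted words in a vector and testing membership with `%in%`. A minimal sketch on made-up tokens (the vector name `customwords` is just a label chosen for this example):

```r
library(dplyr)

# Words to drop: search terms, HTML residue ("amp"), and stray digits
customwords <- c("amp", "im", "special", "disabled", "child", "kid",
                 "kids", "#etsy", "1", "2", "3", "4")

# Made-up tokens for illustration
toy_tokens <- tibble(
  word = c("amp", "parent", "kid", "school", "3", "excited")
)

# %in% tests each word for membership in the vector, so one condition
# replaces the whole chain of !word == "..." comparisons.
toy_clean <- toy_tokens %>%
  filter(!word %in% customwords)

toy_clean
```

Adding or removing a word then only means editing the vector, not the filter expression.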
    2c. Sentiment Values

8.  Next I'll add sentiment values. For every lexicon but bing, I have to select '1' in the console to confirm the download.
afinn <- get_sentiments("afinn")

        bing <- get_sentiments("bing")

        nrc <- get_sentiments("nrc")

        loughran <- get_sentiments("loughran")
9.  Next, I will join each lexicon with the tidy_tweets data frame. Each inner join keeps only the words that appear in that lexicon and adds the lexicon's score column (value for afinn, sentiment for the others).
sentiment_afinn <- inner_join(tidy_tweets, afinn, by = "word")

        sentiment_afinn
## # A tibble: 1,930 x 5
##    keyword    screen_name   created_at          word        value
##    <chr>      <chr>         <dttm>              <chr>       <dbl>
##  1 disability dropoutninja  2022-02-02 15:05:18 devastating    -2
##  2 disability dropoutninja  2022-02-02 15:05:18 support         2
##  3 disability dropoutninja  2022-02-02 15:05:18 support         2
##  4 disability dhargenerator 2022-02-02 11:37:11 threatens      -2
##  5 disability dhargenerator 2022-02-02 11:37:11 fire           -2
##  6 disability dhargenerator 2022-02-02 11:37:11 shock          -2
##  7 disability dhargenerator 2022-01-29 11:57:25 shock          -2
##  8 disability dhargenerator 2022-01-30 21:46:59 threatens      -2
##  9 disability dhargenerator 2022-01-30 21:46:59 fire           -2
## 10 disability dhargenerator 2022-01-30 21:46:59 regrets        -2
## # ... with 1,920 more rows
        sentiment_bing <- inner_join(tidy_tweets, bing, by = "word")

        sentiment_bing
## # A tibble: 1,801 x 5
##    keyword    screen_name   created_at          word        sentiment
##    <chr>      <chr>         <dttm>              <chr>       <chr>    
##  1 disability dropoutninja  2022-02-02 15:05:18 devastating negative 
##  2 disability dropoutninja  2022-02-02 15:05:18 support     positive 
##  3 disability dropoutninja  2022-02-02 15:05:18 support     positive 
##  4 disability dhargenerator 2022-02-02 11:37:11 spoiled     negative 
##  5 disability dhargenerator 2022-02-02 11:37:11 shock       negative 
##  6 disability dhargenerator 2022-01-29 11:57:25 famous      positive 
##  7 disability dhargenerator 2022-01-29 11:57:25 shock       negative 
##  8 disability dhargenerator 2022-01-30 21:46:59 instantly   positive 
##  9 disability dhargenerator 2022-01-30 21:46:59 regrets     negative 
## 10 disability ZeffieXD      2022-02-02 01:36:57 bad         negative 
## # ... with 1,791 more rows
        sentiment_nrc <- inner_join(tidy_tweets, nrc, by = "word")

        sentiment_nrc
## # A tibble: 7,744 x 5
##    keyword    screen_name   created_at          word        sentiment
##    <chr>      <chr>         <dttm>              <chr>       <chr>    
##  1 disability dropoutninja  2022-02-02 15:05:18 devastating anger    
##  2 disability dropoutninja  2022-02-02 15:05:18 devastating disgust  
##  3 disability dropoutninja  2022-02-02 15:05:18 devastating fear     
##  4 disability dropoutninja  2022-02-02 15:05:18 devastating negative 
##  5 disability dropoutninja  2022-02-02 15:05:18 devastating sadness  
##  6 disability dropoutninja  2022-02-02 15:05:18 devastating trust    
##  7 disability dropoutninja  2022-02-02 15:05:18 disability  negative 
##  8 disability dropoutninja  2022-02-02 15:05:18 disability  sadness  
##  9 disability dhargenerator 2022-02-02 11:37:11 fire        fear     
## 10 disability dhargenerator 2022-02-02 11:37:11 shock       anger    
## # ... with 7,734 more rows
        sentiment_loughran <- inner_join(tidy_tweets, loughran, by = "word")

        sentiment_loughran
## # A tibble: 959 x 5
##    keyword    screen_name     created_at          word        sentiment
##    <chr>      <chr>           <dttm>              <chr>       <chr>    
##  1 disability dropoutninja    2022-02-02 15:05:18 devastating negative 
##  2 disability dhargenerator   2022-02-02 11:37:11 threatens   negative 
##  3 disability dhargenerator   2022-01-30 21:46:59 threatens   negative 
##  4 disability ZeffieXD        2022-02-02 01:36:57 bad         negative 
##  5 disability ZeffieXD        2022-02-02 01:36:57 fired       negative 
##  6 disability SolidCes        2022-02-01 20:36:05 successful  positive 
##  7 disability K_BallantyneArt 2022-02-01 19:29:57 frustrated  negative 
##  8 disability Randy_Bobandy88 2022-02-01 19:12:26 bad         negative 
##  9 disability Randy_Bobandy88 2022-02-01 19:12:26 criticize   negative 
## 10 disability ReitzPiper      2022-02-01 12:50:14 hurt        negative 
## # ... with 949 more rows
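
The row counts above differ by lexicon because of how an inner join behaves: words missing from a lexicon are dropped, and a word listed under several categories (as "devastating" is in nrc) produces one row per category. A minimal sketch with a made-up lexicon; the entries below are for illustration only and are not real afinn, bing, nrc, or loughran values.

```r
library(dplyr)

# Made-up tokens for illustration
toy_tokens <- tibble(
  status_id = c("t1", "t1", "t1", "t2"),
  word      = c("devastating", "support", "school", "hurt")
)

# Made-up lexicon: "devastating" is listed under two categories,
# and "school" is not listed at all.
toy_lexicon <- tibble(
  word      = c("devastating", "devastating", "support", "hurt"),
  sentiment = c("negative", "sadness", "positive", "negative")
)

# Unlisted words disappear; multiply-listed words are repeated.
toy_joined <- inner_join(toy_tokens, toy_lexicon, by = "word")
toy_joined
```

This is why a lexicon with many categories per word can return more rows than there are scored words.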
3. EXPLORE

    1. Create a time series visualization
 ts_plot(tweets, by = "days") ##plot by days

        ts_plot(tweets, by = "hours") ## plot by hours

2.  Plot by groups of keywords
#plot by keyword (disability vs. special needs)
        ts_plot(dplyr::group_by(tweets, keyword), "hours")

        ts_plot(dplyr::group_by(tweets, keyword), "days")
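
Roughly speaking, a time series plot of tweets bins them by time interval and plots the count in each bin. The daily counts behind such a plot can be sketched directly with dplyr; the timestamps below are made up, and UTC is assumed for the day boundaries.

```r
library(dplyr)

# Made-up timestamps for illustration
toy_tweets <- tibble(
  keyword    = c("disability", "disability", "special needs", "special needs"),
  created_at = as.POSIXct(c("2022-02-01 10:00:00", "2022-02-01 23:00:00",
                            "2022-02-01 08:00:00", "2022-02-02 12:00:00"),
                          tz = "UTC")
)

# Truncate each timestamp to its calendar day, then count per keyword and day
daily_counts <- toy_tweets %>%
  mutate(day = as.Date(created_at, tz = "UTC")) %>%
  count(keyword, day)

daily_counts
```

Having the counts as a table makes it easy to check a spike in the plot against the underlying numbers.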

3.  Analyzing sentiment, grouping by keyword, and creating a sentiment score for the **bing** lexicon, and adding a lexicon variable (column) to the data frame.
# Bing: Creating a single sentiment score and adding a lexicon variable
        # (the spread function from the tidyr package transforms our sentiment
        # column into separate columns for negative and positive
        # that contain the n counts for each)
        summary_bing <- sentiment_bing %>% 
          group_by(keyword) %>% 
          count(sentiment, sort = TRUE) %>% 
          spread(sentiment, n) %>%
          mutate(sentiment = positive - negative) %>%
          mutate(lexicon = "bing") %>%
          relocate(lexicon)

        summary_bing
## # A tibble: 2 x 5
## # Groups:   keyword [2]
##   lexicon keyword       negative positive sentiment
##   <chr>   <chr>            <int>    <int>     <int>
## 1 bing    disability         727      356      -371
## 2 bing    special needs      342      376        34
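
In current tidyr, `spread()` is superseded by `pivot_wider()`. The sketch below repeats the count-then-reshape pattern on made-up rows and uses `values_fill = 0` so that a keyword with no words in one category gets a 0 instead of NA, which keeps the subtraction defined.

```r
library(dplyr)
library(tidyr)

# Made-up word-level sentiment labels for illustration
toy_sentiment <- tibble(
  keyword   = c("disability", "disability", "disability",
                "special needs", "special needs"),
  sentiment = c("negative", "negative", "positive",
                "positive", "positive")
)

toy_summary <- toy_sentiment %>%
  count(keyword, sentiment) %>%
  # values_fill = 0 turns a missing keyword/sentiment pair into 0
  # rather than NA, so positive - negative is always a number.
  pivot_wider(names_from = sentiment, values_from = n, values_fill = 0) %>%
  mutate(sentiment = positive - negative)

toy_summary
```

In the toy data, "special needs" has no negative words, so without the fill its score would be NA.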
4.  Repeating the steps above for the remaining lexicons:

    1.  afinn
 # repeating the step above but for afinn lexicon
            summary_afinn <- sentiment_afinn %>% 
              group_by(keyword) %>% 
              summarise(sentiment = sum(value)) %>% 
              mutate(lexicon = "afinn") %>%
              relocate(lexicon)

            summary_afinn
## # A tibble: 2 x 3
##   lexicon keyword       sentiment
##   <chr>   <chr>             <dbl>
## 1 afinn   disability         -622
## 2 afinn   special needs       309
    2.  loughran
# repeating bing steps above for loughran lexicon and filtering summary
            # to only see positive and negative values

            summary_loughran <- sentiment_loughran %>% 
              group_by(keyword) %>% 
              count(sentiment, sort = TRUE) %>% 
              spread(sentiment, n) %>%
              mutate(sentiment = positive - negative) %>%
              mutate(lexicon = "loughran") %>%
              relocate(lexicon)

            summary_loughran
## # A tibble: 2 x 9
## # Groups:   keyword [2]
##   lexicon  keyword       constraining litigious negative positive superfluous
##   <chr>    <chr>                <int>     <int>    <int>    <int>       <int>
## 1 loughran disability              24        56      416       63           1
## 2 loughran special needs           18        24      177      122          NA
## # ... with 2 more variables: uncertainty <int>, sentiment <int>
            summary_loughran_2 <- summary_loughran %>%
              select(lexicon, keyword, negative, positive, sentiment)

            summary_loughran_2
## # A tibble: 2 x 5
## # Groups:   keyword [2]
##   lexicon  keyword       negative positive sentiment
##   <chr>    <chr>            <int>    <int>     <int>
## 1 loughran disability         416       63      -353
## 2 loughran special needs      177      122       -55
    3.  nrc
# repeating above steps for nrc lexicon; the nrc lexicon contains
            # other categories like "trust" and "sadness", so a second step
            # keeps only the "negative" and "positive" columns

            summary_nrc <- sentiment_nrc %>% 
              group_by(keyword) %>% 
              count(sentiment, sort = TRUE) %>%
              spread(sentiment, n) %>%
              mutate(sentiment = positive - negative) %>%
              mutate(lexicon = "nrc") %>%
              relocate(lexicon)

            summary_nrc
## # A tibble: 2 x 13
## # Groups:   keyword [2]
##   lexicon keyword       anger anticipation disgust  fear   joy negative positive
##   <chr>   <chr>         <int>        <int>   <int> <int> <int>    <int>    <int>
## 1 nrc     disability      378          379     272   469   282      784      750
## 2 nrc     special needs   156          388     106   185   399      435      701
## # ... with 4 more variables: sadness <int>, surprise <int>, trust <int>,
## #   sentiment <int>
            summary_nrc_2 <- summary_nrc %>%
              select(lexicon, keyword, negative, positive, sentiment)

            summary_nrc_2
## # A tibble: 2 x 5
## # Groups:   keyword [2]
##   lexicon keyword       negative positive sentiment
##   <chr>   <chr>            <int>    <int>     <int>
## 1 nrc     disability         784      750       -34
## 2 nrc     special needs      435      701       266
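
For lexicons with extra categories, the net score can also be computed without reshaping at all, by counting only the two labels of interest inside `summarise()`. A minimal sketch on made-up labels:

```r
library(dplyr)

# Made-up nrc-style labels for illustration
toy_nrc <- tibble(
  keyword   = c("disability", "disability", "disability", "disability",
                "special needs", "special needs", "special needs"),
  sentiment = c("negative", "fear", "positive", "negative",
                "positive", "trust", "positive")
)

# Labels such as "fear" or "trust" are ignored because only the
# "negative" and "positive" labels are counted.
toy_score <- toy_nrc %>%
  group_by(keyword) %>%
  summarise(
    negative  = sum(sentiment == "negative"),
    positive  = sum(sentiment == "positive"),
    sentiment = positive - negative
  )

toy_score
```

This yields the same five-column shape as the selected summaries above once a lexicon column is added.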
4. MODEL

    1. The “Modeling” step of the learning analytics workflow involves using statistical models to analyze data and, where possible, make predictions. I’ll also use this section to visualize how sentiment varies by lexicon and what the nature of word choice is in tweets about mothering disabled or special needs children. Because I’m changing the order slightly from the Walkthrough, I’m going to do some “polishing” steps here…

    2. Polishing

      1. Step 1: Organizing filters, etc. a bit better
dm_text <-
              dm_tweets %>%
              filter(lang == "en") %>%
              select(status_id, text) %>%
              mutate(keyword = "disability") %>%
              relocate(keyword)

            snm_text <-
              snm_tweets %>%
              filter(lang == "en") %>%
              select(status_id, text) %>%
              mutate(keyword = "special needs") %>%
              relocate(keyword)
    2.  Step 2: merging data frames
 #merging the two data frames and taking a look
            tweets <- bind_rows(dm_text, snm_text)

            tweets
## # A tibble: 1,072 x 3
##    keyword    status_id           text                                          
##    <chr>      <chr>               <chr>                                         
##  1 disability 1488891280469991425 "you know what's even more devastating than n~
##  2 disability 1488838905637969920 "Disabled Kid threatens to fire Spoiled Impos~
##  3 disability 1487394447344091139 "Disabled Kid witch hunts Famous Mechanic, wh~
##  4 disability 1487905204007702528 "Gay Democrats threatens to fire Disabled Kid~
##  5 disability 1488766912565850113 "@charlieINTEL Bragging about beating this ga~
##  6 disability 1488717642504589319 "@kaptanobveus I wonder what we can do from h~
##  7 disability 1488712251443843075 "also i am attracted to the disabled kid"     
##  8 disability 1488115941556715520 "also i am attracted to the disabled kid"     
##  9 disability 1487813895284658177 "also i am attracted to the disabled kid"     
## 10 disability 1487051331751710721 "also i am attracted to the disabled kid"     
## # ... with 1,062 more rows
            head(tweets)
## # A tibble: 6 x 3
##   keyword    status_id           text                                           
##   <chr>      <chr>               <chr>                                          
## 1 disability 1488891280469991425 "you know what's even more devastating than no~
## 2 disability 1488838905637969920 "Disabled Kid threatens to fire Spoiled Impost~
## 3 disability 1487394447344091139 "Disabled Kid witch hunts Famous Mechanic, wha~
## 4 disability 1487905204007702528 "Gay Democrats threatens to fire Disabled Kid,~
## 5 disability 1488766912565850113 "@charlieINTEL Bragging about beating this gam~
## 6 disability 1488717642504589319 "@kaptanobveus I wonder what we can do from he~
            tail(tweets)
## # A tibble: 6 x 3
##   keyword       status_id           text                                        
##   <chr>         <chr>               <chr>                                       
## 1 special needs 1485986814200528902 "@WataAce1 @marybaphomet No, you used it as~
## 2 special needs 1485981505226776577 "@queenglitter4 As a father of a brilliant ~
## 3 special needs 1485978671315886082 "@ddespairmusic @RollingStone Principles? L~
## 4 special needs 1485976238149754882 "@pulte Single mom of a special needs child~
## 5 special needs 1485975433132879874 "@Jayecane Yes I got into a bad accident my~
## 6 special needs 1485967849881575424 "@Neilyoung  was a pile of Sh#t that abando~
    3.  Step 3: analyzing sentiment from the afinn lexicon
#Cleaning up code for analyzing sentiment from each lexicon
            #afinn
            customwords <- c("amp" , "im" , "child" , "disabled" ,
                             "special" , "kid", "1" , "2" , "3" , "4")

            sentiment_afinn <- tweets %>%
              unnest_tokens(output = word, 
                            input = text, 
                            token = "tweets")  %>% 
              anti_join(stop_words, by = "word") %>%
              filter(!word == "amp" & !word == "im" & !word == "special" 
                     & !word == "disabled" & !word == "child" 
                     & !word == "kid" & !word == "#etsy" & !word == "kids"
                     & !word == "1" & !word == "2"  & !word == "3"  
                     & !word == "4") %>%
              inner_join(afinn, by = "word")
## Using `to_lower = TRUE` with `token = 'tweets'` may not preserve URLs.
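            # Note: filter(!word == "customwords") would only drop the literal
            # token "customwords"; filter(!word %in% customwords) does the job,
            # as long as "#etsy" and "kids" are added to the vector first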

            sentiment_afinn
## # A tibble: 1,930 x 4
##    keyword    status_id           word        value
##    <chr>      <chr>               <chr>       <dbl>
##  1 disability 1488891280469991425 devastating    -2
##  2 disability 1488891280469991425 support         2
##  3 disability 1488891280469991425 support         2
##  4 disability 1488838905637969920 threatens      -2
##  5 disability 1488838905637969920 fire           -2
##  6 disability 1488838905637969920 shock          -2
##  7 disability 1487394447344091139 shock          -2
##  8 disability 1487905204007702528 threatens      -2
##  9 disability 1487905204007702528 fire           -2
## 10 disability 1487905204007702528 regrets        -2
## # ... with 1,920 more rows
            afinn_score <- sentiment_afinn %>% 
              group_by(keyword, status_id) %>% 
              summarise(value = sum(value))
## `summarise()` has grouped output by 'keyword'. You can override using the
## `.groups` argument.
            afinn_score
## # A tibble: 879 x 3
## # Groups:   keyword [2]
##    keyword    status_id           value
##    <chr>      <chr>               <dbl>
##  1 disability 1485965986058715137    -2
##  2 disability 1485970391323799553     5
##  3 disability 1485973110335627268    -3
##  4 disability 1485980376552165376    -3
##  5 disability 1485981992185438212    -2
##  6 disability 1485983403367411717    -2
##  7 disability 1485988403309162499    -4
##  8 disability 1485992900513124356    -7
##  9 disability 1485994909274460161    -1
## 10 disability 1485997459696504845    -1
## # ... with 869 more rows
            afinn_sentiment <- afinn_score %>%
              filter(value != 0) %>%
              mutate(sentiment = if_else(value < 0, "negative", "positive"))

            afinn_sentiment
## # A tibble: 836 x 4
## # Groups:   keyword [2]
##    keyword    status_id           value sentiment
##    <chr>      <chr>               <dbl> <chr>    
##  1 disability 1485965986058715137    -2 negative 
##  2 disability 1485970391323799553     5 positive 
##  3 disability 1485973110335627268    -3 negative 
##  4 disability 1485980376552165376    -3 negative 
##  5 disability 1485981992185438212    -2 negative 
##  6 disability 1485983403367411717    -2 negative 
##  7 disability 1485988403309162499    -4 negative 
##  8 disability 1485992900513124356    -7 negative 
##  9 disability 1485994909274460161    -1 negative 
## 10 disability 1485997459696504845    -1 negative 
## # ... with 826 more rows
            afinn_ratio <- afinn_sentiment %>% 
              group_by(keyword) %>% 
              count(sentiment) %>% 
              spread(sentiment, n) %>%
              mutate(ratio = negative/positive)

            afinn_ratio
## # A tibble: 2 x 4
## # Groups:   keyword [2]
##   keyword       negative positive ratio
##   <chr>            <int>    <int> <dbl>
## 1 disability         290      170 1.71 
## 2 special needs      143      233 0.614
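
The steps above move from word-level values to one label per tweet and then to a negative-to-positive ratio per keyword. The same chain on a handful of made-up values makes the arithmetic easy to follow; a ratio above 1 means more negative than positive tweets.

```r
library(dplyr)

# Made-up afinn-style word values for illustration
toy_afinn <- tibble(
  keyword   = c("disability", "disability", "disability", "disability",
                "special needs", "special needs", "special needs"),
  status_id = c("t1", "t1", "t2", "t3", "t4", "t5", "t5"),
  value     = c(-2, -1, 3, -3, 2, 2, -2)
)

# Sum word values within each tweet, drop tweets that net out to zero,
# and label the rest by the sign of their total.
tweet_scores <- toy_afinn %>%
  group_by(keyword, status_id) %>%
  summarise(value = sum(value), .groups = "drop") %>%
  filter(value != 0) %>%
  mutate(sentiment = if_else(value < 0, "negative", "positive"))

# Count negative and positive tweets per keyword and take their ratio
toy_ratio <- tweet_scores %>%
  group_by(keyword) %>%
  summarise(
    negative = sum(sentiment == "negative"),
    positive = sum(sentiment == "positive"),
    ratio    = negative / positive
  )

toy_ratio
```

Note that tweet t5 is dropped because its two words cancel out, which is why the tweet-level table above has fewer rows than the score table.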
3.  Keyword Differences: graphing positive versus negative tweets for the two keywords
 #For keyword 'disability'
        afinn_counts_dis <- afinn_sentiment %>%
          group_by(keyword) %>% 
          count(sentiment) %>%
          filter(keyword == "disability")

        afinn_counts_dis %>%
          ggplot(aes(x="", y=n, fill=sentiment)) +
          geom_bar(width = .6, stat = "identity") +
          labs(title = "Disability, Disabled Child, & Mom",
               subtitle = "Proportion of Positive & Negative Tweets") +
          coord_polar(theta = "y") +
          theme_void()

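    Sentiment is decidedly more negative with the "disability" keyword (290 negative vs. 170 positive tweets, a ratio of 1.71) than with the "special needs" keyword phrase (143 negative vs. 233 positive, a ratio of 0.61), which is plotted below.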
#Repeat for special needs

        afinn_counts_sn <- afinn_sentiment %>%
          group_by(keyword) %>% 
          count(sentiment) %>%
          filter(keyword == "special needs")

        afinn_counts_sn
## # A tibble: 2 x 3
## # Groups:   keyword [1]
##   keyword       sentiment     n
##   <chr>         <chr>     <int>
## 1 special needs negative    143
## 2 special needs positive    233
        afinn_counts_sn %>%
          ggplot(aes(x="", y=n, fill=sentiment)) +
          geom_bar(width = .6, stat = "identity") +
          labs(title = "Special Needs and Mom",
               subtitle = "Proportion of Positive & Negative Tweets") +
          coord_polar(theta = "y") +
          theme_void()
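
As an alternative to two separate pie charts, both keywords can be placed side by side in one stacked bar chart, which makes the proportions directly comparable. This is a sketch of one possible presentation, not the approach used above; the counts are the tweet-level afinn counts reported in this section.

```r
library(ggplot2)

# Tweet-level afinn counts reported above
afinn_counts <- data.frame(
  keyword   = c("disability", "disability", "special needs", "special needs"),
  sentiment = c("negative", "positive", "negative", "positive"),
  n         = c(290, 170, 143, 233)
)

# position = "fill" rescales each bar to 100%, so the bars show shares
p <- ggplot(afinn_counts, aes(x = keyword, y = n, fill = sentiment)) +
  geom_col(position = "fill") +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Positive & Negative Tweets by Keyword (afinn)",
       x = NULL, y = "Share of scored tweets")

p
```

Because both bars share one axis, the difference between the keywords can be read without comparing angles across charts.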

    1.  Calculating sentiment scores for each lexicon and then comparing positive and negative sentiment for each lexicon visually.
# Creating "summary" data frames for each sentiment, parsing out summary scores of positive and negative sentiment.

            summary_afinn3 <- sentiment_afinn %>% 
              group_by(keyword) %>% 
              filter(value != 0) %>%
              mutate(sentiment = if_else(value < 0, "negative", "positive")) %>% 
              count(sentiment, sort = TRUE) %>% 
              mutate(method = "afinn")

            summary_bing3 <- sentiment_bing %>% 
              group_by(keyword) %>% 
              count(sentiment, sort = TRUE) %>% 
              mutate(method = "bing")

            summary_nrc3 <- sentiment_nrc %>% 
              filter(sentiment %in% c("positive", "negative")) %>%
              group_by(keyword) %>% 
              count(sentiment, sort = TRUE) %>% 
              mutate(method = "nrc") 

            summary_loughran3 <- sentiment_loughran %>% 
              filter(sentiment %in% c("positive", "negative")) %>%
              group_by(keyword) %>% 
              count(sentiment, sort = TRUE) %>% 
              mutate(method = "loughran") 
        Next, I'll combine the lexicon summaries into one overall summary of sentiment and visualize it in a graph.
#Combining lexicon summaries to compare positive and negative sentiment scores in each lexicon.
            summary_sentiment <- bind_rows(summary_afinn3,
                                           summary_bing3,
                                           summary_nrc3,
                                           summary_loughran3) %>%
              arrange(method, keyword) %>%
              relocate(method)

            total_counts <- summary_sentiment %>%
              group_by(method, keyword) %>%
              summarise(total = sum(n))
## `summarise()` has grouped output by 'method'. You can override using the
## `.groups` argument.
            sentiment_counts <- left_join(summary_sentiment, total_counts)
## Joining, by = c("method", "keyword")
            sentiment_counts
## # A tibble: 16 x 5
## # Groups:   keyword [2]
##    method   keyword       sentiment     n total
##    <chr>    <chr>         <chr>     <int> <int>
##  1 afinn    disability    negative    676  1126
##  2 afinn    disability    positive    450  1126
##  3 afinn    special needs positive    489   804
##  4 afinn    special needs negative    315   804
##  5 bing     disability    negative    727  1083
##  6 bing     disability    positive    356  1083
##  7 bing     special needs positive    376   718
##  8 bing     special needs negative    342   718
##  9 loughran disability    negative    416   479
## 10 loughran disability    positive     63   479
## 11 loughran special needs negative    177   299
## 12 loughran special needs positive    122   299
## 13 nrc      disability    negative    784  1534
## 14 nrc      disability    positive    750  1534
## 15 nrc      special needs positive    701  1136
## 16 nrc      special needs negative    435  1136
        [***Positive and Negative Sentiment by Lexicon***]{.smallcaps}

        | Method   | Keyword       | Sentiment | n   |
        |----------|---------------|-----------|-----|
        | afinn    | disability    | negative  | 758 |
        | afinn    | disability    | positive  | 471 |
        | afinn    | special needs | positive  | 509 |
        | afinn    | special needs | negative  | 358 |
        | bing     | disability    | negative  | 819 |
        | bing     | disability    | positive  | 398 |
        | bing     | special needs | negative  | 407 |
        | bing     | special needs | positive  | 394 |
        | loughran | disability    | negative  | 476 |
        | loughran | disability    | positive  | 62  |
        | loughran | special needs | negative  | 210 |
        | loughran | special needs | positive  | 134 |
        | nrc      | disability    | negative  | 901 |
        | nrc      | disability    | positive  | 830 |
        | nrc      | special needs | positive  | 713 |
        | nrc      | special needs | negative  | 481 |
 #converting the sentiment scores to percentages for easier visualization
        sentiment_percents <- sentiment_counts %>%
          mutate(percent = n/total * 100)

        sentiment_percents
## # A tibble: 16 x 6
## # Groups:   keyword [2]
##    method   keyword       sentiment     n total percent
##    <chr>    <chr>         <chr>     <int> <int>   <dbl>
##  1 afinn    disability    negative    676  1126    60.0
##  2 afinn    disability    positive    450  1126    40.0
##  3 afinn    special needs positive    489   804    60.8
##  4 afinn    special needs negative    315   804    39.2
##  5 bing     disability    negative    727  1083    67.1
##  6 bing     disability    positive    356  1083    32.9
##  7 bing     special needs positive    376   718    52.4
##  8 bing     special needs negative    342   718    47.6
##  9 loughran disability    negative    416   479    86.8
## 10 loughran disability    positive     63   479    13.2
## 11 loughran special needs negative    177   299    59.2
## 12 loughran special needs positive    122   299    40.8
## 13 nrc      disability    negative    784  1534    51.1
## 14 nrc      disability    positive    750  1534    48.9
## 15 nrc      special needs positive    701  1136    61.7
## 16 nrc      special needs negative    435  1136    38.3
        sentiment_percents %>%
          ggplot(aes(x = keyword, y = percent, fill=sentiment)) +
          geom_bar(width = .8, stat = "identity") +
          facet_wrap(~method, ncol = 1) +
          coord_flip() +
          labs(title = "Public Sentiment on Twitter", 
               subtitle = "Disability vs. Special Needs and Mom",
               x = "Keyword", 
               y = "Percentage of Words")

        summary_sentiment
## # A tibble: 16 x 4
## # Groups:   keyword [2]
##    method   keyword       sentiment     n
##    <chr>    <chr>         <chr>     <int>
##  1 afinn    disability    negative    676
##  2 afinn    disability    positive    450
##  3 afinn    special needs positive    489
##  4 afinn    special needs negative    315
##  5 bing     disability    negative    727
##  6 bing     disability    positive    356
##  7 bing     special needs positive    376
##  8 bing     special needs negative    342
##  9 loughran disability    negative    416
## 10 loughran disability    positive     63
## 11 loughran special needs negative    177
## 12 loughran special needs positive    122
## 13 nrc      disability    negative    784
## 14 nrc      disability    positive    750
## 15 nrc      special needs positive    701
## 16 nrc      special needs negative    435
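The count-to-percentage conversion above builds a separate totals table and joins it back. As a minimal sketch, assuming the same column names (`method`, `keyword`, `sentiment`, `n`), the same percentages can be computed in a single grouped `mutate()`. The toy tibble reuses the four afinn counts from the output above; `toy_counts` and `toy_percents` are illustrative names, not objects from the original analysis.

```r
library(dplyr)

# Four rows copied from the afinn section of the sentiment_counts output
toy_counts <- tibble(
  method    = "afinn",
  keyword   = c("disability", "disability", "special needs", "special needs"),
  sentiment = c("negative", "positive", "positive", "negative"),
  n         = c(676, 450, 489, 315)
)

# Total and percentage within each method/keyword pair, without a join
toy_percents <- toy_counts %>%
  group_by(method, keyword) %>%
  mutate(total   = sum(n),
         percent = n / total * 100) %>%
  ungroup()

toy_percents
```

Computing the total inside the grouped `mutate()` also avoids the `summarise()` grouping message and the implicit join keys reported above.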
4.  Visualizing Word Choice Through Word Clouds

    1.  Overall wordcloud
#Now I want to create a wordcloud of the tweets. However, there are
            #too many tweets to visualize. I'll choose the top 50.
            top_tokens_all <- tidy_tweets %>%
              count(word, sort = TRUE) %>%
              top_n(50)
## Selecting by n
            wordcloud2(top_tokens_all)
            #Some words are kind of irrelevant (1, 2, hes), but it does give a glimpse
            #overall of the top words. I'm interested to see that "support" is in the top 50
            #since that is an aspect of the phenomenon of interest to me.
# Below I'm going to try to filter out "customwords" to see if it looks any
            #different

            top_tokens_all <- tidy_tweets %>%
              filter(!word == "amp" & !word == "im" & !word == "special" 
                     & !word == "disabled" & !word == "child" 
                     & !word == "kid" & !word == "#etsy" & !word == "kids"
                     & !word == "1" & !word == "2"  & !word == "3"  
                     & !word == "4" & !word == "hes") %>%
              count(word, sort = TRUE) %>%
              top_n(50)
## Selecting by n
            wordcloud2(top_tokens_all)
    2.  "Disability" wordcloud
##Tokenizing text - Disability and Mom
            tweet_tokens_dm <- 
              dm_tweets %>%
              unnest_tokens(output = word, 
                            input = text, 
                            token = "tweets")
## Using `to_lower = TRUE` with `token = 'tweets'` may not preserve URLs.
            tidy_tweets_dm <-
              tweet_tokens_dm %>%
              anti_join(stop_words, by = "word") %>%
              filter(!word == "amp" & !word == "im" & !word == "special" 
                     & !word == "disabled" & !word == "child" 
                     & !word == "kid" & !word == "#etsy" & !word == "kids"
                     & !word == "1" & !word == "2"  & !word == "3"  
                     & !word == "4" & !word == "hes")

            count(tidy_tweets_dm, word, sort = T)
## # A tibble: 4,298 x 2
##    word           n
##    <chr>      <int>
##  1 parent        69
##  2 dont          62
##  3 school        56
##  4 people        47
##  5 children      42
##  6 care          37
##  7 support       37
##  8 disability    34
##  9 time          34
## 10 parents       31
## # ... with 4,288 more rows
            #selecting top 50 disability & mom tokens
            top_tokens_dm <- tidy_tweets_dm %>%
              count(word, sort = TRUE) %>%
              top_n(50)
## Selecting by n
            wordcloud2(top_tokens_dm)
    3.  "Special Needs" wordcloud
##Tokenizing text - Special Needs and Mom
            tweet_tokens_snm <- 
              snm_tweets %>%
              unnest_tokens(output = word, 
                            input = text, 
                            token = "tweets")
## Using `to_lower = TRUE` with `token = 'tweets'` may not preserve URLs.
            tidy_tweets_snm <-
              tweet_tokens_snm %>%
              anti_join(stop_words, by = "word") %>%
              filter(!word == "amp" & !word == "im" & !word == "special" 
                     & !word == "disabled" & !word == "child" 
                     & !word == "kid" & !word == "#etsy" & !word == "kids"
                     & !word == "1" & !word == "2"  & !word == "3"  
                     & !word == "4" & !word == "hes")
            #Ask Dr. J re: more efficient way to filter this

            count(tidy_tweets_snm, word, sort = T)
## # A tibble: 2,939 x 2
##    word                             n
##    <chr>                        <int>
##  1 addition                        93
##  2 share                           92
##  3 bite                            91
##  4 childautismcerebral             91
##  5 excited                         91
##  6 palsyarthritisautoaggression    91
##  7 shop                            91
##  8 #protectivegloves               77
##  9 #cerebralpalsytoys              72
## 10 #bitingmittens                  64
## # ... with 2,929 more rows
            #selecting top 50 special needs & mom tokens
            top_tokens_snm <- tidy_tweets_snm %>%
              count(word, sort = TRUE) %>%
              top_n(50)
## Selecting by n
            wordcloud2(top_tokens_snm)
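The three word cloud chunks above repeat the same long chain of `!word == ...` conditions. One way to shorten it, sketched here under the assumption that the goal is simply to drop a fixed set of tokens, is to store them in a vector and filter with `%in%`. `custom_words`, `toy_tokens`, and `top_tokens_toy` are illustrative names, not objects from the original analysis.

```r
library(dplyr)

# Tokens to drop, copied from the filters above
custom_words <- c("amp", "im", "special", "disabled", "child", "kid",
                  "#etsy", "kids", "1", "2", "3", "4", "hes")

# Hypothetical stand-in for tidy_tweets (one token per row)
toy_tokens <- tibble(word = c("support", "amp", "school", "kids", "support", "hes"))

# Keep only tokens that are not in the custom list, then count them
top_tokens_toy <- toy_tokens %>%
  filter(!word %in% custom_words) %>%
  count(word, sort = TRUE)

top_tokens_toy
```

Because the list lives in one vector, the overall, "disability", and "special needs" chunks could all reuse it, and adding a new token would be a one-line change.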
1. COMMUNICATE

    1. Select Research Questions of Interest. As noted above, my primary research questions for this study are:

      1. What is the overall sentiment of recent tweets on the topic of mothers parenting children with disabilities?

      2. Does sentiment vary based on keyword (disability vs. special needs)?

      3. Does sentiment vary by lexicon?

    2. Polish. I did most of my “polishing” in the “Model” section, so not much to report here…

      1. Narrate

        Purpose. Mothers experience disproportionate role stress when caring for young children, potentially limiting time and resources for self-care and career advancement. These demands increase when mothers care for children with special needs. This study analyzed recent Twitter posts to investigate sentiment surrounding motherhood while caring for children with special needs.

        Methods. Using the Twitter API, I searched recent tweets (from the past six to nine days) for the following keywords:

        | Keyword       | Search Term               |
        |---------------|---------------------------|
        | special needs | #specialneeds AND mom     |
        |               | #specialneeds AND mother  |
        |               | special needs mom         |
        |               | special needs kid         |
        |               | special needs child       |
        | disabled      | #disabledchild AND mom    |
        |               | #disabledchild AND mother |
        |               | disabled child mom        |
        |               | disabled kid mom          |
        |               | disabled kid              |
        |               | disabled child            |

        I then tokenized the text of the tweets, loaded lexicons (afinn, bing, nrc, loughran), created dictionaries, and analyzed sentiment of the tokenized tweets. Next, I visualized sentiment overall and between the keywords “disability” and “special needs”. Lastly, I visualized the top 50 words overall, and those associated with each keyword, in wordclouds.

        Findings. Across all four lexicons, recent tweets containing “special needs” and “mom” are more positive than those containing “disability” and “mom”. Top words overall include: don’t, parent, shop, share, kids, and bite. Top words for disability and mom tweets include: parent, don’t, kids, school, people, care and (position 8) support. Top words for special needs and mom tweets include: share, addition, bite, excited, childautismcerebral, and palsyarthritisautoaggression.

        Discussion. The word “disability” in tweets seems to be associated with more negative sentiment, which may indicate pejorative connotations of this word versus “special needs”. Some research has shown that mothers who themselves have disabilities are a highly stressed group (Lee et al., 2004), and it is unclear whether some tweets in this category reflect disabled mothers rather than mothers of disabled children. However, the difference in sentiment may also reflect inaccuracies in the lexicons themselves, which may ascribe positive sentiment to a word like “special” but negative sentiment to “disabled”. Terms such as “school” and “bite”, along with other references to self-injurious behavior, indicate that day-to-day concerns of safety and the educational environment dominate the recent Twitter discourse on mothering children with special needs; the focus is pragmatic. The frequency of the words “support” and “share” merits further investigation, in my opinion, as both may address support systems (familial, community, national) that are either present or absent for mothers raising children with special needs.
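The concern that the lexicons may score the search terms themselves can be checked directly. The sketch below is a hedged illustration, not part of the original analysis: it looks up a few keyword-related terms in the bing lexicon, chosen because, to my knowledge, it ships with `tidytext` and needs no separate download. `keyword_terms` and `lexicon_hits` are illustrative names. Any rows returned are words that contribute to the sentiment counts simply because they appear in the search query.

```r
library(dplyr)
library(tidytext)

# Terms taken from the search keywords (illustrative selection)
keyword_terms <- c("special", "needs", "disabled", "disability")

# Rows returned here are search terms that the lexicon itself scores
lexicon_hits <- get_sentiments("bing") %>%
  filter(word %in% keyword_terms)

lexicon_hits
```

If any of these terms are scored, one option would be to remove them from the tokenized tweets before joining with the lexicon, so that the comparison between keywords reflects the surrounding words rather than the query itself.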

REFERENCES

Kim, J. (February 2, 2021). The mothers who already left. New York Magazine. https://www.thecut.com/2021/02/i-always-thought-id-be-a-working-mom.html

Lee, S., Oh, G., Hartmann, H., Gault, B. (February, 2004). The impact of disabilities on mothers’ work participation: Examining differences between single and married mothers. Institute for Women’s Policy Research, Washington, DC.

Ogrysko, N. (December 9, 2019). Lawmakers unveil details of ‘historic’ federal paid parental leave benefits. Federal News Network. Accessed from https://federalnewsnetwork.com/workforce/2019/12/lawmakers-unveil-details-of-historic-federal-paid-parental-leave-benefits/ on November 3, 2021.

Stewart, N. (July 28, 2020). When caring for your child’s needs becomes a job all on its own. The New York Times. https://www.nytimes.com/2020/07/24/us/children-disabilities-parenting-poverty-assistance.html