Word Count of TN Bills Since 2023

Below is the code written for the Week 14 Lab of MTSU’s JOUR 3841 course - Data Skills for Media Professionals - taught by Dr. Ken Blake.

After removing standard stop words and the terms “title” and “tca”, some of the most easily identifiable words in the “tidy_text” data frame are “board”, “school”, “department”, and “public”. These words appear 275, 245, 219, and 201 times, respectively. Based on these counts, I believe the most common topic in these bills is education, and that many of them focus on schools and school boards.

if (!require("tidyverse")) install.packages("tidyverse")
if (!require("tidytext")) install.packages("tidytext")

library(tidyverse)
library(tidytext)

mydata <- read.csv("https://raw.githubusercontent.com/drkblake/Data/main/TNBills22_23.csv")
view(mydata)

# Split each bill description into single words, then count each word
tidy_text <- mydata %>% 
  unnest_tokens(word,description) %>% 
  count(word, sort = TRUE)

# Deleting standard stop words
data("stop_words")
tidy_text <- tidy_text %>%
  anti_join(stop_words, by = "word")

# Deleting custom stop words specific to this data set
my_stopwords <- tibble(word = c("title",
                                "tca"))
tidy_text <- tidy_text %>% 
  anti_join(my_stopwords, by = "word")
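
The word counts quoted in the write-up come from looking at the top rows of the filtered data frame. The lines below are a minimal sketch of that check, not part of the original lab code. The `toy_bills` data frame and its descriptions are made up for illustration so the example runs without downloading anything, and `slice_max()` is just one way to pull out the most frequent words.

# Hypothetical toy data standing in for mydata (descriptions are invented)
library(dplyr)
library(tidytext)

toy_bills <- tibble(
  description = c(
    "School board elections and the department of education",
    "Public school funding for the department",
    "Board of public health"
  )
)

# Same steps as above: tokenize, count, then drop standard stop words
toy_counts <- toy_bills %>%
  unnest_tokens(word, description) %>%
  count(word, sort = TRUE) %>%
  anti_join(stop_words, by = "word")

# Keep the five most frequent remaining words (tied words are all kept)
top_words <- toy_counts %>%
  slice_max(n, n = 5)

print(top_words)

On the real data, running the same `slice_max()` step on `tidy_text` should list the most frequent words with their counts.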