Part #1-This map shows the percentage of renter-occupied households by county, using county-level ACS data for Tennessee filtered to the Nashville MSA. The highest percentage of renter-occupied households in this data set can be seen in Davidson County, which can be attributed to the fact that it contains Nashville, a major metropolitan city that likely has more renters than homeowners.
##################################################################
# County level, Nashville MSA
##################################################################
# Installing and loading required packages
if (!require("tidyverse")) install.packages("tidyverse")
## Loading required package: tidyverse
## Warning: package 'tidyverse' was built under R version 4.3.3
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.4.4 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
if (!require("tidycensus")) install.packages("tidycensus")
## Loading required package: tidycensus
if (!require("sf")) install.packages("sf")
## Loading required package: sf
## Linking to GEOS 3.11.2, GDAL 3.7.2, PROJ 9.3.0; sf_use_s2() is TRUE
if (!require("mapview")) install.packages("mapview")
## Loading required package: mapview
## Warning: package 'mapview' was built under R version 4.3.3
library(tidyverse)
library(tidycensus)
library(sf)
library(mapview)
# Transmitting API key
census_api_key("YOUR_CENSUS_API_KEY")  # use your own Census API key here
## To install your API key for use in future sessions, run this function with `install = TRUE`.
# Fetching ACS codebooks
DetailedTables <- load_variables(2022, "acs5", cache = TRUE)
SubjectTables <- load_variables(2022, "acs5/subject", cache = TRUE)
ProfileTables <- load_variables(2022, "acs5/profile", cache = TRUE)
All_ACS_Variables <- bind_rows(DetailedTables, ProfileTables)
All_ACS_Variables <- bind_rows(All_ACS_Variables, SubjectTables)
rm(DetailedTables, SubjectTables, ProfileTables)
# Specify a variable to estimate
# DP04_0047P = percent of occupied housing units that are renter-occupied
VariableList <- c(Estimate_ = "DP04_0047P")
# Fetching data
mydata <- get_acs(
  geography = "county",
  state = "TN",
  variables = VariableList,
  year = 2022,
  survey = "acs5",
  output = "wide",
  geometry = TRUE)
## Getting data from the 2018-2022 5-year ACS
## Downloading feature geometry from the Census website. To cache shapefiles for use in future sessions, set `options(tigris_use_cache = TRUE)`.
## Using the ACS Data Profile
# Reformatting data
mydata <- separate_wider_delim(mydata,
                               NAME,
                               delim = ", ",
                               names = c("County", "State"))
# Filtering data
mydata <- mydata %>%
  filter(County %in% c("Cheatham County",
                       "Davidson County",
                       "Dickson County",
                       "Robertson County",
                       "Rutherford County",
                       "Sumner County",
                       "Williamson County",
                       "Wilson County"))
# Mapping data
mapdata <- mydata %>%
  rename(Estimate = Estimate_E, Estimate_MOE = Estimate_M)
mapdata <- st_as_sf(mapdata)
mapviewOptions(basemaps.color.shuffle = FALSE)
mapview(mapdata, zcol = "Estimate",
        layer.name = "Estimate",
        popup = TRUE)
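# Optional sketch (not part of the original template): save the interactive
# map as a standalone HTML file with mapview's mapshot(); the file name
# below is just an example.
m <- mapview(mapdata, zcol = "Estimate", layer.name = "Estimate")
mapshot(m, url = "renter_map.html")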
# Exporting data in .csv format
CSVdata <- st_drop_geometry(mapdata)
write.csv(CSVdata, "mydata.csv", row.names = FALSE)
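# Quick check on the claim that Davidson County has the highest renter
# percentage among the counties kept above (a small sketch using the
# CSVdata frame just created):
CSVdata %>%
  arrange(desc(Estimate)) %>%
  select(County, Estimate) %>%
  head(3)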
Part #2
# Install and load tidyverse
if (!require("tidyverse")) install.packages("tidyverse")
library(tidyverse)
# Read the data
# NOTE: You may edit the URL to load a different dataset
mydata <- read.csv("https://raw.githubusercontent.com/drkblake/Data/main/SocialData.csv")
head(mydata,10)
## ID Type Impressions
## 1 1 Photo 695
## 2 2 Text 940
## 3 3 Photo 1196
## 4 4 Photo 936
## 5 5 Photo 1389
## 6 6 Photo 857
## 7 7 Text 797
## 8 8 Photo 1810
## 9 9 Photo 1086
## 10 10 Video 1416
# Specify the DV and IV
# NOTE: You may edit the Impressions and Type variable names
mydata$DV <- mydata$Impressions
mydata$IV <- mydata$Type
# Graph the group distributions and averages
averages <- group_by(mydata, IV) %>%
  summarise(mean = mean(DV, na.rm = TRUE))
ggplot(mydata, aes(x = DV)) +
  geom_histogram(color = "black", fill = "#1f78b4") +
  facet_grid(IV ~ .) +
  geom_vline(data = averages, aes(xintercept = mean))
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
# Calculate and show the group counts, means, standard
# deviations, minimums, and maximums
group_by(mydata, IV) %>%
  summarise(
    count = n(),
    mean = mean(DV, na.rm = TRUE),
    sd = sd(DV, na.rm = TRUE),
    min = min(DV, na.rm = TRUE),
    max = max(DV, na.rm = TRUE))
## # A tibble: 3 × 6
## IV count mean sd min max
## <chr> <int> <dbl> <dbl> <int> <int>
## 1 Photo 58 1035. 297. 397 1810
## 2 Text 43 999. 278. 515 1746
## 3 Video 39 1370. 307. 829 1952
options(scipen = 999)
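# Welch's one-way test: var.equal = FALSE avoids assuming equal variances
# across the three post-type groups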
oneway.test(mydata$DV ~ mydata$IV,
var.equal = FALSE)
##
## One-way analysis of means (not assuming equal variances)
##
## data: mydata$DV and mydata$IV
## F = 19.119, num df = 2.000, denom df = 85.525, p-value = 0.000000137
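# Optional sketch (not in the original template): eta-squared as a rough
# effect size for post type, fit with the same DV and IV columns.
fit <- aov(DV ~ IV, data = mydata)
ss <- summary(fit)[[1]][["Sum Sq"]]
ss[1] / sum(ss)  # share of variance in impressions explained by post type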
# If the ANOVA detects significant difference, run
# this post-hoc procedure to learn which
# group pairs differed significantly.
anova_1 <- aov(mydata$DV ~ mydata$IV)
TukeyHSD(anova_1)
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = mydata$DV ~ mydata$IV)
##
## $`mydata$IV`
## diff lwr upr p adj
## Text-Photo -36.35605 -176.6202 103.9081 0.8126345
## Video-Photo 334.87710 190.5414 479.2128 0.0000005
## Video-Text 371.23315 217.1076 525.3587 0.0000002
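-I was able to conclude that there were 39 posts with video in them. Judging from the group means and the Tukey results above, video posts averaged about 1,370 impressions, compared with roughly 1,035 for photos and 999 for text, and the video-photo and video-text differences are statistically significant while photo and text do not differ significantly from each other. Overall, video appears to have made a noticeable difference in the reach of the team’s social media posts.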
Part #3
# Load packages
if (!require("tidyverse")) install.packages("tidyverse")
if (!require("tidytext")) install.packages("tidytext")
## Loading required package: tidytext
## Warning: package 'tidytext' was built under R version 4.3.3
library(tidyverse)
library(tidytext)
# Read the data
mydata <- read.csv("https://raw.githubusercontent.com/drkblake/Data/main/WhiteHouse.csv")
# Extract individual words to a "tidytext" data frame
# (single-word tokens, so the stop-word removal below can match them)
tidy_text <- mydata %>%
  unnest_tokens(word, Full.Text) %>%
  count(word, sort = TRUE)
# Delete standard stop words
data("stop_words")
tidy_text <- tidy_text %>%
  anti_join(stop_words)
## Joining with `by = join_by(word)`
# Delete custom stop words
my_stopwords <- tibble(word = c("https",
                                "t.co",
                                "rt"))
tidy_text <- tidy_text %>%
  anti_join(my_stopwords)
## Joining with `by = join_by(word)`
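# Optional sketch: inspect the most frequent remaining terms
# (output omitted; the counts depend on the data)
head(tidy_text, 10)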
# Define search terms and count items that include them
# Health-related terms are used here
searchterms <- "health|care|debt"
mydata$HealthTerms <- ifelse(grepl(searchterms,
                                   mydata$Full.Text,
                                   ignore.case = TRUE), 1, 0)
sum(mydata$HealthTerms)
## [1] 857
sum(mydata$HealthTerms)/5508  # 5508 = total number of posts in the data set
## [1] 0.1555919
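# Note: the pattern above also matches words that merely contain a search
# term (e.g., "careful" contains "care"). A stricter count could use word
# boundaries (a sketch; the resulting count would differ):
strictterms <- "\\bhealth\\b|\\bcare\\b|\\bdebt\\b"
sum(grepl(strictterms, mydata$Full.Text, ignore.case = TRUE))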
-For Part #3, I decided to focus on the relevance of healthcare discussion in White House posts. A total of 857 posts mentioned health, care, or debt, which works out to about 15.6% of the 5,508 posts overall. This could reflect recent initiatives to improve health care or legislative updates, but it shows how health and overall wellness are being discussed.