Project

Simple plot explorations

# Combine "Black" and "Black or African American"
drugs_combined <- drug_deaths %>%
  mutate(Race = fct_recode(as.factor(Race), "Black" = "Black or African American"))
# Combine "Native American, Other" and "American Indian or Alaska Native"
drugs_combined <- drugs_combined %>%
  mutate(Race = fct_recode(as.factor(Race), "Native American" = "American Indian or Alaska Native", "Native American" = "Native American, Other"))
# Simple Preliminary Visualization
ggplot(drugs_combined, aes(x = Race)) +
  geom_bar(stat = "count") +
  labs(title = "Drug Deaths by Race in Connecticut (2012-2020)", x = "Race", y = "Count") +
  # Rotate and shrink the axis labels so the long race names fit
  theme(axis.text = element_text(angle = 90, size = 5))

Statistical Analysis: Histogram and Box-plot

ggplot(drugs_combined, aes(x = Sex, y = Age, fill = Sex)) +
  geom_boxplot() +
  labs(title = "Spread of Overdoses by Age and Sex", x = "Sex", y = "Age") +
  theme_minimal()
Warning: Removed 2 rows containing non-finite values (`stat_boxplot()`).

drugs_combined |>
  drop_na(Sex) |>
  ggplot(aes(x = Sex, y = Age, fill = Sex)) +
  geom_boxplot() +
  labs(title = "Spread of Drug Deaths by Age and Sex", x = "Sex", y = "Age") +
  theme_minimal() +
  scale_y_continuous(breaks = seq(0, 100, by = 10))  # Adjust the breaks as needed
Warning: Removed 1 rows containing non-finite values (`stat_boxplot()`).

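ggplot(drugs_combined, aes(x = Age)) +
  # boundary and closed align the bins to decades ([40, 50) and so on) so they match the axis labels
  geom_histogram(binwidth = 10, boundary = 0, closed = "left", fill = "steelblue", color = "black", alpha = 0.9) +
  labs(title = "Histogram of Age", x = "Age", y = "Frequency") +
  # Place each label at the midpoint of its bin
  scale_x_continuous(breaks = seq(5, 95, by = 10), labels = c("0-9", "10-19", "20-29", "30-39", "40-49", "50-59", "60-69", "70-79", "80-89", "90-99")) +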
  theme_minimal()
Warning: Removed 2 rows containing non-finite values (`stat_bin()`).

Explanation of Visualization: For my statistical visualizations I created a box plot and a histogram to see the spread in the data set. The box plot shows the age distribution separately for each sex. The initial visualization included an N/A category for 8 observations with no recorded sex. I removed these and created a second visualization. Because only 8 observations were affected and the data set is large, I do not believe this compromised the integrity of my exploration. The second statistical visualization is a histogram showing the spread of age across all drug deaths, without splitting by sex. Combined, these graphs show that the spread in age is very similar for each sex. The histogram shows that the highest number of drug deaths falls in the 40-49 and 50-59 age ranges, while, surprisingly, deaths are lower in the 20-29 range.
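The two checks described above (how many rows lack a recorded sex or age, and which ten-year age band has the most deaths) can also be done numerically rather than read off the plots. The sketch below is only an illustration: `toy_deaths` is an invented stand-in for `drugs_combined`, and its values are made up, so the counts it produces say nothing about the real data.

```r
library(dplyr)

# Hypothetical toy data standing in for drugs_combined (values are invented)
toy_deaths <- data.frame(
  Sex = c("Male", "Female", NA, "Male", "Female", "Male"),
  Age = c(45, 52, 38, NA, 27, 58)
)

# Count missing values per column before deciding whether to drop rows
missing_counts <- toy_deaths %>%
  summarise(across(c(Sex, Age), ~ sum(is.na(.x))))

# Tabulate deaths by ten-year age band; right = FALSE makes bands like [40, 50)
age_bands <- toy_deaths %>%
  filter(!is.na(Age)) %>%
  mutate(AgeBand = cut(Age,
                       breaks = seq(0, 100, by = 10),
                       right = FALSE,
                       labels = paste0(seq(0, 90, by = 10), "-", seq(9, 99, by = 10)))) %>%
  count(AgeBand, sort = TRUE)

missing_counts
age_bands
```

Running the same two pipelines on `drugs_combined` would give the exact number of rows affected by `drop_na(Sex)` and a table to compare against the histogram.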

Primary Visualizations

Based on what I discovered, I wanted to explore the causes of death among those who passed away. I created an interactive bar chart.

mainvis <- ggplot(drugs_combined, aes(x = `Injury County`)) +
  geom_bar(fill = "orange", color = "black") +
  labs(title = "County Injury Rates",
       x = "County of Injury",
       y = "Count") +
  theme_minimal() +
  # Angle the county names so they do not overlap
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
print(mainvis)
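The chart printed above is a static ggplot object. One common way to make such a chart interactive is to pass it to `ggplotly()` from the plotly package. That is an assumption here, since the report does not show which tool was used. The sketch below uses an invented `toy_counties` data frame rather than the real data.

```r
library(ggplot2)
library(plotly)

# Hypothetical toy data; the county values are invented for illustration
toy_counties <- data.frame(
  `Injury County` = c("Hartford", "Hartford", "New Haven",
                      "Fairfield", "New Haven", "Hartford"),
  check.names = FALSE
)

# Build the static bar chart in the same style as mainvis
static_plot <- ggplot(toy_counties, aes(x = `Injury County`)) +
  geom_bar(fill = "orange", color = "black") +
  labs(x = "County of Injury", y = "Count") +
  theme_minimal()

# Convert the ggplot object into an interactive plotly widget with hover tooltips
interactive_plot <- ggplotly(static_plot)
```

Printing `interactive_plot` in an interactive session or an R Markdown chunk displays the widget. The same call on `mainvis` would be `ggplotly(mainvis)`.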

After seeing that the age distribution of drug deaths in Connecticut differed little by sex, I decided to explore location to see whether drug deaths were concentrated in certain areas.

To understand the issue, I looked at factors such as location. When I first developed a map in Tableau, I noticed that a substantial portion of the drug deaths involved people from out of state. A fair number of them were from up and down the East Coast, with a high concentration in Florida. Of the out-of-state residents who passed away in Florida, many of the incidents occurred in hotels or motels. This does not negate the fact that a substantial portion of drug deaths are still experienced by Connecticut residents. Through Tableau I was also able to conclude that these deaths were happening in lower-income areas, where budgets are smaller. My final visualization looked at the counts for each county.
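The per-county counts behind the final visualization can also be produced as a sorted table, which makes the ranking easier to read than bar heights. This is a minimal sketch on invented data: `toy_counties` and its values are hypothetical, and only the `Injury County` column name comes from the report.

```r
library(dplyr)

# Hypothetical toy data; the county values are invented for illustration
toy_counties <- data.frame(
  `Injury County` = c("Hartford", "Hartford", "New Haven",
                      "Fairfield", "New Haven", "Hartford"),
  check.names = FALSE
)

# Count deaths per county and sort from highest to lowest
county_counts <- toy_counties %>%
  count(`Injury County`, sort = TRUE, name = "Deaths")

county_counts
```

Applied to `drugs_combined`, the same `count()` call would list the counties in descending order of deaths.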

Another interesting trend also emerged: drug deaths involving opiates have been on the rise over the past years. With this data in mind, several legislative responses are possible, such as increasing funding in certain areas and supporting outreach programs and public health initiatives that target older populations, who seem to be suffering the most in this epidemic.

“How Rich Is Each US State? | Chamber of Commerce.” Chamber of Commerce, www.chamberofcommerce.org/how-rich-is-each-us-state/.