Load required libraries

library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.4.1     ✔ purrr   1.0.1
## ✔ tibble  3.1.8     ✔ dplyr   1.1.0
## ✔ tidyr   1.3.0     ✔ stringr 1.5.0
## ✔ readr   2.1.4     ✔ forcats 1.0.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(dplyr)
library(readxl)

Question: Where in the US Are Bees Most Endangered?

Explanation of Data:

The data used in the charts describes bee colonies throughout the US. The data set covers states across the US over several years and contains both numeric and categorical variables. The numeric variables include colony_max, colony_lost, colony_lost_pct, colony_added, colony_reno, colony_reno_pct, and Growth of colonies. The categorical variables include state and Months. The overall goal is to use RStudio to manipulate the data so that an understanding can be reached about where bees are the most endangered.

variable          description
year              Year
months            Month
state             State name (note: includes "United States" and "Other States")
colony_n          Number of colonies
colony_max        Maximum colonies
colony_lost       Colonies lost
colony_lost_pct   Percent of total colonies lost
colony_added      Colonies added
colony_reno       Colonies renovated
colony_reno_pct   Percent of colonies renovated

Imported Data

datasimp <- read_excel("../00_data/Datasimpler.xlsx")

data <- read_excel("../00_data/MyData-Charts.xlsx")

Relevant R Code

datasimp %>% select(States, `Colony_Gain/Loss`)
## # A tibble: 45 × 2
##    States      `Colony_Gain/Loss`
##    <chr>                    <dbl>
##  1 Alabama                    100
##  2 Arizona                   4300
##  3 Arkansas                  7400
##  4 California              136000
##  5 Colorado                  -700
##  6 Connecticut                900
##  7 Florida                  19000
##  8 Georgia                   5000
##  9 Hawaii                    -170
## 10 Idaho                    20500
## # … with 35 more rows
plot_data <- data %>%
  group_by(year, state) %>%
  summarise(avg_pct = mean(colony_lost_pct, na.rm = TRUE)) %>%
  filter(year == 2021) %>%
  arrange(-avg_pct) %>%
  slice(1:10)
## `summarise()` has grouped output by 'year'. You can override using the
## `.groups` argument.
plot_data %>% ggplot(aes(x = avg_pct, y = fct_reorder(state, avg_pct))) + geom_col()

Analyse

This bar graph shows the average percentage of colonies lost per state, computed by grouping the data by year and state, and displays the ten states with the highest averages. I focused on 2021 so that the data would be the most recent and relevant. The graph shows that Missouri had the highest average percentage of colonies lost in 2021. This suggests that it is the most endangered state, since it is losing the largest share of the colonies it has.
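The ranking step described above can be sketched on a small made-up table. This is only an illustration: the `toy` tibble, its state labels, and its values are invented, and the variant below filters to one year before grouping and passes `.groups = "drop"` so that `summarise()` returns an ungrouped result without the grouping message.

```r
library(dplyr)

# Hypothetical toy data (values invented for illustration only)
toy <- tibble(
  year            = c(2021, 2021, 2021, 2021, 2020, 2020),
  state           = c("A", "A", "B", "B", "A", "B"),
  colony_lost_pct = c(10, 20, 5, NA, 50, 50)
)

top_loss <- toy %>%
  filter(year == 2021) %>%                        # keep a single year first
  group_by(state) %>%
  summarise(
    avg_pct = mean(colony_lost_pct, na.rm = TRUE), # NA rows are ignored
    .groups = "drop"                               # return an ungrouped tibble
  ) %>%
  arrange(desc(avg_pct)) %>%                       # highest average loss first
  slice_head(n = 10)                               # keep at most ten rows

top_loss
```

Filtering before grouping gives the same ranking for a single year while summarising fewer rows.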

plot_data <- data %>%
  group_by(year, state) %>%
  summarise(colony_lost = max(colony_lost, na.rm = TRUE)) %>%
  filter(year == 2021) %>%
  arrange(-colony_lost) %>%
  slice(1:10)
## `summarise()` has grouped output by 'year'. You can override using the
## `.groups` argument.
plot_data %>% ggplot(aes(x = colony_lost, y = fct_reorder(state, colony_lost))) + geom_col()

Analyse

This bar graph groups the data by year and state and shows the ten states with the largest number of colonies lost in 2021, using the highest value reported for each state during that year. The graph shows that California lost more colonies than any other state. I don’t think this alone determines the level of endangerment, since raw counts do not show how many colonies a state has to begin with.
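One way to see more of the story is to set colonies lost against colonies added for each state. The sketch below does this on invented numbers. The `toy` tibble, the state labels "Big" and "Small", and the `net` column are hypothetical and are not taken from the data set.

```r
library(dplyr)

# Hypothetical toy data (values invented for illustration only)
toy <- tibble(
  state        = c("Big", "Big", "Small", "Small"),
  colony_lost  = c(1000, 2000, 30, 50),
  colony_added = c(2500, 1500, 10, 20)
)

net_change <- toy %>%
  group_by(state) %>%
  summarise(
    total_lost  = sum(colony_lost, na.rm = TRUE),
    total_added = sum(colony_added, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  mutate(net = total_added - total_lost) %>%  # negative means a net decline
  arrange(net)                                # largest net decline first

net_change
```

In this toy example the state with far more colonies lost still ends up with a net gain, while the smaller state shows a net decline, which is the kind of difference a raw count of losses can hide.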

Conclusion

In conclusion, the data shows that this question doesn’t necessarily have one answer, since “most endangered” is fairly vague. Looking at the number of colonies lost, California has the worst numbers, as seen in the second graph. However, when the percentage of colonies lost is examined, Missouri has the worst numbers as of 2021. I believe this is the more important data point, because California also has the highest number of colonies gained. California should therefore expect to see more bees dying, since its high colony growth makes it one of the bee capitals of the world.

Missouri, on the other hand, is in my view the most endangered state for bees, since the first graph shows that, on average, the highest percentage of colonies was lost there in 2021. The Missouri data suggests that the threat of bees going extinct there is very high because of this percentage decline, which results from a lack of colony growth combined with a high percentage of bees dying. I believe Missouri may be a bad location for bees environmentally, since there are so few there in general. Overall, if you are a bee, I would avoid heading to Missouri over any other state, since you will face the highest percentage of death.