Import data

# Load the packages used below, then read the Excel file
library(readxl)
library(ggplot2)
data <- read_excel("../02_module5/beesdata.xlsx")
data

State one question

Do observations with more total colonies lose a larger percentage of their colonies than observations with fewer total colonies?

(Hypothesis) Observations with more colonies will lose a larger percentage of their colonies.

Plot data

# Smoothed trend of colonies lost against number of colonies
ggplot(data = data) +
  geom_smooth(mapping = aes(x = colony_n, y = colony_lost)) +
  labs(title = "How the Number of Colonies Affects Colony Losses", x = "Number of Colonies", y = "Colonies Lost")

# Scatter plot of the same variables with a fitted linear trend
ggplot(data = data, aes(x = colony_n, y = colony_lost)) +
  geom_point(color = "red") +
  geom_smooth(method = "lm") +
  labs(title = "How the Number of Colonies Affects Colony Losses", x = "Number of Colonies", y = "Colonies Lost")

# Distribution of the percentage of colonies lost
ggplot(data = data, aes(x = colony_lost_pct)) +
  geom_density() +
  labs(title = "Density of Colony Loss Percentages", x = "Colonies Lost (%)", y = "Density")
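
The question is about the percentage of colonies lost, so a plot of colony_lost_pct against colony_n may address it more directly than the count-based plots. The following is a minimal sketch, not part of the original analysis; the helper name plot_pct_vs_colonies is hypothetical, and it assumes the data frame has the colony_n and colony_lost_pct columns used above.

```r
library(ggplot2)

# Hypothetical helper (an assumption, not from the original notebook):
# scatter plot of percentage lost against number of colonies,
# with a fitted linear trend.
plot_pct_vs_colonies <- function(df) {
  ggplot(data = df, aes(x = colony_n, y = colony_lost_pct)) +
    geom_point(alpha = 0.4) +
    geom_smooth(method = "lm") +
    labs(title = "Colony Loss Percentage by Number of Colonies",
         x = "Number of Colonies",
         y = "Colonies Lost (%)")
}

# Usage with the imported data:
# plot_pct_vs_colonies(data)
```

A roughly flat trend line in such a plot would suggest that the percentage lost does not depend much on the number of colonies, even if the count lost does.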

Interpret

The first and second visualizations show that as the number of colonies increases, so does the number of colonies lost. The second visualization also shows how the observations are spread: most points sit near the bottom of the graph, so small loss counts are far more common than large ones. Even so, the fitted line suggests that more colonies are associated with more colonies lost. Both plots use the count of colonies lost (colony_lost) rather than the percentage (colony_lost_pct), so on their own they do not answer the question about the percentage lost. The third visualization, a density plot of colony_lost_pct, shows that most of the data falls at lower loss percentages.
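
One way to check the percentage question numerically is a rank correlation between colony_n and colony_lost_pct. This is a sketch under assumptions: the helper name pct_loss_correlation is hypothetical, Spearman correlation is my choice rather than the original author's method, and rows with missing values are simply dropped.

```r
# Hypothetical helper (an assumption, not from the original notebook):
# Spearman rank correlation between number of colonies and percentage lost,
# using only rows where both values are present.
pct_loss_correlation <- function(df) {
  cor(df$colony_n, df$colony_lost_pct,
      use = "complete.obs", method = "spearman")
}

# Usage with the imported data:
# pct_loss_correlation(data)
```

A value near 0 would offer little support for the hypothesis, while a clearly positive value would be consistent with it. A correlation alone does not establish that colony count causes higher losses.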

There is evidence of a positive relationship between the number of colonies and the number of colonies lost.
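
The strength of this relationship could be quantified with a simple linear model, which is the same kind of fit that geom_smooth(method = "lm") draws in the second plot. The sketch below is illustrative only; the helper name fit_loss_model is hypothetical, and no results from the real data are shown here.

```r
# Hypothetical helper (an assumption, not from the original notebook):
# fit colonies lost as a linear function of number of colonies.
fit_loss_model <- function(df) {
  lm(colony_lost ~ colony_n, data = df)
}

# Usage with the imported data:
# fit <- fit_loss_model(data)
# coef(fit)["colony_n"]   # estimated change in colonies lost per extra colony
# summary(fit)            # standard errors and R-squared
```

The slope on colony_n estimates how many additional colonies are lost for each additional colony, which is a count-based measure rather than a percentage.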