library(tidyverse)
library(googlesheets4)
library(janitor)
# d <- read_sheet("https://docs.google.com/spreadsheets/d/1ixyEMHdDxlQ-0pzm3MGZo0l-HXRLUe944WI5Mffp4D0/edit#gid=0")
#
# write_csv(d, "snails.csv")
d <- read_csv("snails.csv")
I don’t know much about snails, and I’m pretty skeptical that they will tell us much about air quality. Numerically, I’d say I’m 20% confident that a measure of snail health correlates with a measure of air or water quality.
d <- d %>%
  clean_names()
# Count observations of each shell composition per year and plot the counts
d %>%
  mutate(year = lubridate::year(observation_date)) %>%
  count(shell_composition, year) %>%
  filter(!is.na(shell_composition)) %>%
  ggplot(aes(x = year, y = n, color = shell_composition, group = shell_composition)) +
  geom_point() +
  geom_line()
I’m not learning a ton from this. I don’t see clear differences in shell composition over time.
What if we look proportionally?
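# Same counts, expressed as a share of each year's observations.
# Note: the proportion is computed before the NA rows are dropped,
# so missing shell_composition values still count toward the yearly total.
d %>%
  mutate(year = lubridate::year(observation_date)) %>%
  count(shell_composition, year) %>%
  ungroup() %>%
  group_by(year) %>%
  mutate(proportion = n / sum(n)) %>%
  filter(!is.na(shell_composition)) %>%
  ggplot(aes(x = year, y = proportion, color = shell_composition, group = shell_composition)) +
  geom_point() +
  geom_line()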
Hmm. That actually moves the needle for me a bit.
I’m now up to a 40% confidence estimate that there’s a correlation between these things, but I have a lot more I want to know.
Checking this with the Confidence Updater (https://kubsch.shinyapps.io/Confidence_Updater/):
Prior: 20% confident, still, in my hypothesis.
Evidence: the evidence somewhat favors my hypothesis.
Result: I need more evidence and remain undecided about it (42.86% confidence).
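The 42.86% figure is what a simple odds update gives if "somewhat favors" is treated as a Bayes factor of 3. That mapping is my assumption, inferred from the numbers above rather than taken from the app's documentation, and `update_confidence` is a hypothetical helper name. A minimal sketch:

```r
# Hypothetical helper: update a prior probability using a Bayes factor.
update_confidence <- function(prior, bayes_factor) {
  prior_odds <- prior / (1 - prior)            # probability -> odds
  posterior_odds <- prior_odds * bayes_factor  # apply the evidence
  posterior_odds / (1 + posterior_odds)        # odds -> probability
}

# Assumption: "somewhat favors" corresponds to a Bayes factor of 3.
posterior <- update_confidence(prior = 0.20, bayes_factor = 3)
round(100 * posterior, 2)
```

Under that assumption the result lands close to my gut estimate of 40%.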