Harold Nelson
2024-07-14
I will walk through the exercises in the lab using a different dataset, OAW2309. It contains weather records from the Olympia airport.
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.1 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
## Loading required package: airports
## Loading required package: cherryblossom
## Loading required package: usdata
## Rows: 30,075
## Columns: 7
## $ DATE <date> 1941-05-13, 1941-05-14, 1941-05-15, 1941-05-16, 1941-05-17, 1941…
## $ PRCP <dbl> 0.00, 0.00, 0.30, 1.08, 0.06, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00,…
## $ TMAX <dbl> 66, 63, 58, 55, 57, 59, 58, 65, 68, 85, 84, 75, 72, 59, 61, 59, 6…
## $ TMIN <dbl> 50, 47, 44, 45, 46, 39, 40, 50, 42, 46, 46, 50, 41, 37, 48, 46, 4…
## $ mo <fct> 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6,…
## $ dy <int> 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 2…
## $ yr <dbl> 1941, 1941, 1941, 1941, 1941, 1941, 1941, 1941, 1941, 1941, 1941,…
Use the maximum temperature.
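# Sample mean and standard deviation of the daily maximum temperature
TMAXmean <- mean(OAW2309$TMAX)
TMAXsd <- sd(OAW2309$TMAX)
# Overlay a normal curve with the same mean and sd on the density estimate
ggplot(data = OAW2309, mapping = aes(x = TMAX)) +
  geom_blank() +
  geom_density() +
  # geom_histogram(aes(y = after_stat(density))) +
  stat_function(fun = dnorm, args = list(mean = TMAXmean, sd = TMAXsd), col = "tomato")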
There is a better way to do this in base R.
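The base R approach is not shown here. One possibility, offered as a minimal sketch rather than as the author's method, is to draw the kernel density with `plot(density(...))` and add the normal curve with `curve(..., add = TRUE)`. The vector `tmax` below is a hypothetical stand-in generated with `rnorm()`, since OAW2309 itself is not bundled with this example.

```r
# Hypothetical stand-in for OAW2309$TMAX (the real data are not available here)
set.seed(42)
tmax <- rnorm(5000, mean = 60, sd = 15)

tmax_mean <- mean(tmax)
tmax_sd <- sd(tmax)

# Kernel density estimate of the observed values
plot(density(tmax), main = "Density of TMAX (stand-in data)", xlab = "TMAX")

# Normal curve with the same mean and standard deviation, drawn on top
curve(dnorm(x, mean = tmax_mean, sd = tmax_sd),
      add = TRUE, col = "tomato", lwd = 2)
```

With the real data, replacing `tmax` with `OAW2309$TMAX` should give the base R counterpart of the ggplot2 figure.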
Use Precipitation.
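# Sample mean and standard deviation of daily precipitation
PRCPmean <- mean(OAW2309$PRCP)
PRCPsd <- sd(OAW2309$PRCP)
# Overlay a normal curve with the same mean and sd on the density estimate
ggplot(data = OAW2309, mapping = aes(x = PRCP)) +
  geom_blank() +
  geom_density() +
  # geom_histogram(aes(y = after_stat(density))) +
  stat_function(fun = dnorm, args = list(mean = PRCPmean, sd = PRCPsd), col = "tomato")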
There is a better way to do this in base R.
We’ll use TMAX.
## Warning: `data_frame()` was deprecated in tibble 1.1.0.
## ℹ Please use `tibble()` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## tibble [30,075 × 1] (S3: tbl_df/tbl/data.frame)
## $ sim_norm: num [1:30075] 73.7 54.2 54.6 74.6 79.1 ...
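The code that produced `sim_norm` is not shown above. A minimal sketch, assuming the column holds one `rnorm()` draw per observation using the sample mean and standard deviation of TMAX, might look like the following. The vector `tmax` is a hypothetical stand-in for `OAW2309$TMAX`, and the seed is arbitrary.

```r
# Hypothetical stand-in for OAW2309$TMAX (30,075 daily values in the real data)
set.seed(2309)
tmax <- round(rnorm(30075, mean = 60, sd = 15))

# Simulate as many normal values as there are observations,
# matching the observed mean and standard deviation
sim_norm <- rnorm(n = length(tmax), mean = mean(tmax), sd = sd(tmax))

# Store the draw in a one-column data frame and inspect its structure
sim <- data.frame(sim_norm = sim_norm)
str(sim)
```

In a tidyverse workflow, `tibble::tibble(sim_norm = sim_norm)` would yield a tibble like the one printed above and avoids the deprecated `data_frame()`.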
Look at this technique with PRCP, which is far from normal.
Look at this technique with TMAX, which is close to normal.
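The technique itself is not named in the text. Assuming it is a normal Q-Q plot (an assumption, not something the source confirms), the contrast between the two variables can be sketched in base R with stand-in data: `temp_like` imitates a roughly normal variable, and `rain_like` imitates a skewed variable with many zeros. Both vectors and the helper `qq_cor()` are hypothetical and serve only as illustration.

```r
set.seed(1)
n <- 1000

# Hypothetical stand-ins: a near-normal variable and a zero-heavy, skewed one
temp_like <- rnorm(n, mean = 60, sd = 15)
rain_like <- ifelse(runif(n) < 0.5, 0, rexp(n, rate = 4))

# Side-by-side normal Q-Q plots with reference lines
op <- par(mfrow = c(1, 2))
qqnorm(temp_like, main = "Near-normal stand-in")
qqline(temp_like, col = "tomato")
qqnorm(rain_like, main = "Skewed stand-in")
qqline(rain_like, col = "tomato")
par(op)

# Hypothetical helper: correlation between theoretical and sample quantiles.
# Values near 1 mean the points fall close to a straight line.
qq_cor <- function(x) {
  q <- qqnorm(x, plot.it = FALSE)
  cor(q$x, q$y)
}

qq_cor(temp_like)
qq_cor(rain_like)
```

Points that hug the reference line suggest approximate normality. A skewed variable such as precipitation typically bends away from the line, especially in the tails.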