Harold Nelson
2/15/2022
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5 ✓ purrr 0.3.4
## ✓ tibble 3.1.5 ✓ dplyr 1.0.7
## ✓ tidyr 1.1.4 ✓ stringr 1.4.0
## ✓ readr 2.0.2 ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
##
## Attaching package: 'lubridate'
## The following objects are masked from 'package:base':
##
## date, intersect, setdiff, union
Look up today's weather on the Weather Channel. What are TMAX, TMIN, and the probability of rain? How do these values compare with the historical data for this day and month?
As I write these notes, it is February 15, 2022. According to the Weather Channel, TMIN = 39, TMAX = 49, and the probability of rain is 3%. Start with summary(), using only PRCP, TMAX, and TMIN.
##       PRCP             TMAX            TMIN
##  Min.   :0.0000   Min.   :25.00   Min.   :10.00
##  1st Qu.:0.0000   1st Qu.:45.00   1st Qu.:29.00
##  Median :0.1050   Median :49.00   Median :33.50
##  Mean   :0.2516   Mean   :48.56   Mean   :33.42
##  3rd Qu.:0.3700   3rd Qu.:53.00   3rd Qu.:38.25
##  Max.   :2.1800   Max.   :62.00   Max.   :49.00
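The code that produced this summary is not shown in these notes. A minimal sketch of one way to get it follows. It assumes that `mo` and `dy` are factors derived from `DATE` with lubridate's `month()` and `day()`, and it uses a tiny invented data frame called `toy` in place of `oly_airport`, so the numbers it prints will not match the output above.

```r
library(dplyr)
library(lubridate)

# Invented rows standing in for oly_airport (assumption: same column names).
toy <- data.frame(
  DATE = as.Date(c("2019-02-15", "2020-02-15", "2021-02-15", "2021-03-15")),
  PRCP = c(0.00, 0.25, 0.10, 0.00),
  TMAX = c(45, 50, 49, 55),
  TMIN = c(30, 35, 33, 36)
) %>%
  # Assumed construction of the month and day factors.
  mutate(mo = factor(month(DATE)), dy = factor(day(DATE)))

# Keep only February 15 records and the three variables of interest.
feb15 <- toy %>%
  filter(mo == "2", dy == "15") %>%
  select(PRCP, TMAX, TMIN)

summary(feb15)
```

With the real data, replacing `toy` by `oly_airport` in the second pipeline should be enough, provided the columns are named as assumed here.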
We can see that TMAX (49) is right on the historical median (49) and mean (48.56). However, TMIN (39) is about five and a half degrees above its central values (median 33.5, mean 33.42). To compare the probability of rain, we need another computation to get the historical value. Do this.
## [1] 0.725
So there was measurable precipitation on 72.5% of the February 15ths in the record. Today's forecast probability of 3% is very low by comparison.
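The historical frequency of rain is the share of records for the day with `PRCP > 0`, the same `mean(PRCP > 0)` idiom used in the last exercise of these notes. A small sketch on invented data (the data frame `toy` is hypothetical and stands in for `oly_airport`):

```r
library(dplyr)

# Invented rows: four February 15 records and one March 15 record.
toy <- data.frame(
  mo   = factor(c(2, 2, 2, 2, 3)),
  dy   = factor(c(15, 15, 15, 15, 15)),
  PRCP = c(0.00, 0.25, 0.10, 0.40, 0.00)
)

# PRCP > 0 is TRUE/FALSE; its mean is the fraction of days with any rain.
p_rain <- toy %>%
  filter(mo == "2", dy == "15") %>%
  summarize(p_rain = mean(PRCP > 0)) %>%
  pull(p_rain)

p_rain
```

Here three of the four February 15 rows have rain, so the result is 0.75. Note that this is a frequency over past years, while the forecast figure is a probability for one specific day, so the two are comparable only loosely.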
What do you think the weather will be like one month from now? How will it compare with today? Start with summary().
##       PRCP             TMAX           TMIN
##  Min.   :0.0000   Min.   :43.0   Min.   :24.00
##  1st Qu.:0.0000   1st Qu.:49.0   1st Qu.:30.00
##  Median :0.0400   Median :53.0   Median :34.00
##  Mean   :0.1950   Mean   :53.4   Mean   :34.44
##  3rd Qu.:0.2225   3rd Qu.:58.0   3rd Qu.:38.25
##  Max.   :2.0800   Max.   :73.0   Max.   :47.00
## [1] 0.6375
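Since the same three numbers are wanted for two different days, one option is to wrap the filter and summary in a small function and stack the results. This is only a sketch: `day_profile()` is a hypothetical helper, not something from the original notes, and `toy` is invented data standing in for `oly_airport`.

```r
library(dplyr)

# Invented rows: three records each for February 15 and March 15.
toy <- data.frame(
  mo   = factor(c(2, 2, 2, 3, 3, 3)),
  dy   = factor(c(15, 15, 15, 15, 15, 15)),
  PRCP = c(0.00, 0.25, 0.10, 0.00, 0.00, 0.30),
  TMAX = c(45, 50, 49, 52, 55, 58),
  TMIN = c(30, 35, 33, 34, 36, 31)
)

# Hypothetical helper: typical high, typical low, and rain frequency for one day.
day_profile <- function(data, target_mo, target_dy) {
  data %>%
    filter(mo == as.character(target_mo), dy == as.character(target_dy)) %>%
    summarize(
      median_TMAX = median(TMAX),
      median_TMIN = median(TMIN),
      p_rain      = mean(PRCP > 0)
    )
}

# One row per day, labelled by the names given to bind_rows().
comparison <- bind_rows(
  "Feb 15" = day_profile(toy, 2, 15),
  "Mar 15" = day_profile(toy, 3, 15),
  .id = "day"
)

comparison
```

Putting both days in one table makes the side-by-side comparison easier than reading two separate summary() printouts.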
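Here’s what I see. One month later, the typical high is about five degrees warmer (mean TMAX 53.4 versus 48.56), the typical low is only about one degree warmer (mean TMIN 34.44 versus 33.42), and the historical frequency of measurable precipitation drops from 72.5% to 63.75%.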
On what days has the maximum daily temperature exceeded 100 degrees? How many? When?
## # A tibble: 13 × 2
## DATE TMAX
## <date> <dbl>
## 1 1941-07-15 103
## 2 1941-07-16 103
## 3 1942-06-30 101
## 4 1942-07-01 102
## 5 1981-08-09 104
## 6 1981-08-10 102
## 7 1994-07-20 102
## 8 2006-07-21 101
## 9 2009-07-28 101
## 10 2009-07-29 104
## 11 2021-06-26 102
## 12 2021-06-27 105
## 13 2021-06-28 110
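A table like this can come from a single filter on TMAX. The sketch below uses invented rows in a hypothetical data frame `toy`; with the real data, `oly_airport` would take its place.

```r
library(dplyr)

# Invented rows; only DATE and TMAX are needed for this question.
toy <- data.frame(
  DATE = as.Date(c("2001-07-10", "2002-08-03", "2002-08-04", "2003-07-15")),
  TMAX = c(88, 101, 104, 100)
)

hot_days <- toy %>%
  filter(TMAX > 100) %>%   # strictly above 100, so exactly 100 is excluded
  select(DATE, TMAX) %>%
  arrange(DATE)

nrow(hot_days)  # how many
hot_days        # when
```

The strict inequality matters: "exceeded 100" leaves out a day that reached exactly 100.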
Repeat the exercise above for days with a TMIN below zero.
## # A tibble: 12 × 2
## DATE TMIN
## <date> <dbl>
## 1 1950-02-01 -1
## 2 1955-11-15 -1
## 3 1972-01-27 -7
## 4 1972-01-28 -4
## 5 1972-02-02 -1
## 6 1972-12-08 -3
## 7 1972-12-10 -1
## 8 1978-12-31 -5
## 9 1979-01-01 -8
## 10 1983-12-22 -3
## 11 1983-12-23 -7
## 12 1998-12-22 -1
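The same filter pattern works for cold days, and counting the hits by year shows whether they cluster into a few cold snaps, as the 1972 rows in the table suggest. A sketch on invented data (`toy` is hypothetical; the per-year count is an addition, not part of the original exercise):

```r
library(dplyr)
library(lubridate)

# Invented rows; only DATE and TMIN are needed here.
toy <- data.frame(
  DATE = as.Date(c("2001-01-10", "2001-01-11", "2005-12-20", "2006-01-05")),
  TMIN = c(-7, -4, -8, 0)
)

cold_days <- toy %>%
  filter(TMIN < 0) %>%     # strictly below zero, so exactly 0 is excluded
  select(DATE, TMIN)

# Number of sub-zero days in each year that had at least one.
cold_by_year <- cold_days %>%
  count(yr = year(DATE), name = "n_days")

cold_by_year
```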
What were the ten heaviest days of rainfall in Olympia? Sort them by date.
## # A tibble: 10 × 2
## DATE PRCP
## <date> <dbl>
## 1 1951-02-09 3.64
## 2 1956-12-09 3.5
## 3 1962-11-19 4.33
## 4 1990-01-09 3.82
## 5 1990-11-24 4.08
## 6 2001-11-14 3.64
## 7 2003-10-20 4.12
## 8 2006-11-06 4.31
## 9 2009-01-07 4.82
## 10 2022-01-06 3.99
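This question has two steps: pick the rows with the largest PRCP, then reorder that subset by date. One way to sketch it uses dplyr's `slice_max()` followed by `arrange()`. The data frame `toy` is invented, and the example takes the top three rather than the top ten to keep it short.

```r
library(dplyr)

# Invented rows standing in for oly_airport.
toy <- data.frame(
  DATE = as.Date(c("2010-11-05", "2003-01-20", "2015-12-08",
                   "2008-02-14", "2012-06-01")),
  PRCP = c(3.10, 4.25, 2.90, 3.75, 0.10)
)

wettest <- toy %>%
  # with_ties = FALSE returns exactly n rows even if values are tied at the cutoff.
  slice_max(PRCP, n = 3, with_ties = FALSE) %>%
  select(DATE, PRCP) %>%
  arrange(DATE)            # chronological order, not order of rainfall

wettest
```

With the default `with_ties = TRUE`, a tie at the cutoff could return more than the requested number of rows, so the choice depends on whether exactly ten rows are wanted.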
How frequently do we see a minimum temperature below 32 combined with rain in January?
## # A tibble: 1 × 1
## `mean(bad)`
## <dbl>
## 1 0.169
## # A tibble: 1 × 1
## `mean(bad)`
## <dbl>
## 1 0.146
## # A tibble: 1 × 1
## `mean(bad)`
## <dbl>
## 1 0.115
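The column header `mean(bad)` suggests a logical variable named `bad` averaged inside summarize(). The code itself is not shown, and the notes do not say what distinguishes the three results above, so the following is only a sketch of one plausible version for January, on invented data in a hypothetical data frame `toy`.

```r
library(dplyr)

# Invented rows: four January records and one July record.
toy <- data.frame(
  mo   = factor(c(1, 1, 1, 1, 7)),
  TMIN = c(28, 35, 30, 25, 55),
  PRCP = c(0.20, 0.50, 0.00, 0.10, 0.00)
)

january_bad <- toy %>%
  filter(mo == "1") %>%
  # Assumed definition: a "bad" day is below freezing AND has measurable rain.
  mutate(bad = TMIN < 32 & PRCP > 0) %>%
  summarize(mean(bad))

january_bad
```

Two of the four January rows meet both conditions, so the result is 0.5. Leaving the summary unnamed reproduces the `mean(bad)` column header seen in the output.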
Which days of the year have the highest probability of rain? List them in order of probability.
oly_airport %>%
  group_by(mo, dy) %>%
  summarize(p_rain = mean(PRCP > 0)) %>%
  ungroup() %>%
  arrange(p_rain) %>%
  tail(10)
## `summarise()` has grouped output by 'mo'. You can override using the `.groups` argument.
## # A tibble: 10 × 3
## mo dy p_rain
## <fct> <fct> <dbl>
## 1 12 9 0.716
## 2 12 15 0.716
## 3 12 18 0.716
## 4 12 19 0.716
## 5 2 15 0.725
## 6 12 10 0.728
## 7 11 24 0.741
## 8 12 2 0.741
## 9 2 29 0.75
## 10 12 20 0.765
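Two things are worth noting about this ranking. The list runs from lowest to highest, and February 29 appears near the top even though it occurs only in leap years, so its estimate rests on roughly a quarter as many observations as the other days. A variant that sorts highest first and reports the number of observations per day might look like the sketch below. The data frame `toy` is invented, and `.groups = "drop"` is used in place of the separate ungroup() call.

```r
library(dplyr)

# Invented rows: a leap day has fewer records than the other days.
toy <- data.frame(
  mo   = factor(c(12, 12, 12, 12, 2, 2, 7, 7, 7, 7)),
  dy   = factor(c(20, 20, 20, 20, 29, 29, 4, 4, 4, 4)),
  PRCP = c(0.3, 0.1, 0.0, 0.2, 0.5, 0.1, 0.0, 0.0, 0.1, 0.0)
)

rainiest <- toy %>%
  group_by(mo, dy) %>%
  summarize(
    p_rain = mean(PRCP > 0),   # share of records with any rain
    n      = n(),              # how many records back the estimate
    .groups = "drop"           # also silences the grouping message
  ) %>%
  arrange(desc(p_rain)) %>%    # highest probability first
  head(2)

rainiest
```

In this toy example the leap day ranks first with a rain frequency of 1, but on only two records, which is the kind of small-sample result the `n` column is meant to flag.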