Feb 10 Weather

Harold Nelson

2/10/2021

How’s the Weather?

This little presentation looks at weather for the Olympia area. If you’re new, what should you expect over the next few months?

I thought that July 2017 was unusually hot. Was it really different?

Setup

Get the necessary packages and data.

Answer

library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.0     ✓ purrr   0.3.4
## ✓ tibble  3.0.5     ✓ dplyr   1.0.3
## ✓ tidyr   1.0.2     ✓ stringr 1.4.0
## ✓ readr   1.3.1     ✓ forcats 0.5.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(lubridate)
## 
## Attaching package: 'lubridate'
## The following objects are masked from 'package:base':
## 
##     date, intersect, setdiff, union
library(plotly)
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout
library(readr)
load("olyw1018.RData")

Look at the data and see what we have.

str(olyw1018)
## tibble [28,705 × 9] (S3: tbl_df/tbl/data.frame)
##  $ STATION_NAME: chr [1:28705] "OLYMPIA PRIEST PT PA WA US" "OLYMPIA PRIEST PT PA WA US" "OLYMPIA PRIEST PT PA WA US" "OLYMPIA PRIEST PT PA WA US" ...
##  $ DATE        : Date[1:28705], format: "1948-01-01" "1948-01-02" ...
##  $ PRCP        : num [1:28705] 0.82 0.15 0.62 0.53 0 0.39 0.51 0.68 0 1.32 ...
##  $ SNOW        : num [1:28705] 0 0 0 0 0 0 0 0 0 0 ...
##  $ TMAX        : num [1:28705] 50 43 48 45 46 45 52 49 48 43 ...
##  $ TMIN        : num [1:28705] 40 40 35 35 31 33 40 36 29 31 ...
##  $ yr          : num [1:28705] 1948 1948 1948 1948 1948 ...
##  $ mo          : num [1:28705] 1 1 1 1 1 1 1 1 1 1 ...
##  $ dy          : int [1:28705] 1 2 3 4 5 6 7 8 9 10 ...
summary(olyw1018)
##  STATION_NAME            DATE                 PRCP             SNOW         
##  Length:28705       Min.   :1948-01-01   Min.   :0.0000   Min.   : 0.00000  
##  Class :character   1st Qu.:1959-10-04   1st Qu.:0.0000   1st Qu.: 0.00000  
##  Mode  :character   Median :1979-05-28   Median :0.0000   Median : 0.00000  
##                     Mean   :1980-03-14   Mean   :0.1404   Mean   : 0.03357  
##                     3rd Qu.:1999-01-21   3rd Qu.:0.1400   3rd Qu.: 0.00000  
##                     Max.   :2018-09-21   Max.   :4.8200   Max.   :14.20000  
##                                                                             
##       TMAX             TMIN             yr             mo        
##  Min.   : 18.00   Min.   :-8.00   Min.   :1948   Min.   : 1.000  
##  1st Qu.: 50.00   1st Qu.:33.00   1st Qu.:1959   1st Qu.: 4.000  
##  Median : 59.00   Median :40.00   Median :1979   Median : 7.000  
##  Mean   : 60.45   Mean   :39.85   Mean   :1980   Mean   : 6.501  
##  3rd Qu.: 71.00   3rd Qu.:47.00   3rd Qu.:1999   3rd Qu.: 9.000  
##  Max.   :104.00   Max.   :69.00   Max.   :2018   Max.   :12.000  
##  NA's   :13       NA's   :14                                     
##        dy       
##  Min.   : 1.00  
##  1st Qu.: 8.00  
##  Median :16.00  
##  Mean   :15.72  
##  3rd Qu.:23.00  
##  Max.   :31.00  
## 

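The dataset contains daily weather observations from January 1, 1948 through September 21, 2018.

As a first summary, compute the share of years in which each calendar day had any recorded precipitation.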

olyw1018 %>% 
  group_by(mo,dy) %>%
  summarize(prain = mean(PRCP > 0,na.rm=TRUE)) %>% 
  ungroup() -> rainy
## `summarise()` has grouped output by 'mo'. You can override using the `.groups` argument.
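
The expression mean(PRCP > 0) works because R counts TRUE as 1 and FALSE as 0, so the mean of a logical vector is the share of TRUE values. Here is a minimal sketch on a made-up four-row table. The values and the names toy and toy_rainy are illustrative and are not taken from olyw1018.

```r
library(dplyr)

# Made-up precipitation readings: two calendar days, two years each.
toy <- tibble(
  mo   = c(1, 1, 1, 1),
  dy   = c(1, 1, 2, 2),
  PRCP = c(0.82, 0, 0, 0)
)

# PRCP > 0 is TRUE on wet days; its mean is the fraction of wet days.
toy_rainy <- toy %>%
  group_by(mo, dy) %>%
  summarize(prain = mean(PRCP > 0, na.rm = TRUE), .groups = "drop")

toy_rainy$prain  # 0.5 for January 1, 0 for January 2
```

Setting .groups = "drop" returns an ungrouped result, so the separate ungroup() step is not needed in this sketch.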

Prediction

Let’s build a standing forecast for any future year from the historical values for each calendar day: the share of years with rain, the mean and median daily high, and the mean daily low.

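olyw1018 %>% 
  group_by(mo,dy) %>% 
  # TMAX and TMIN contain NAs (see the summary above), so drop them when averaging
  summarize(prain = mean(PRCP > 0,na.rm=TRUE),
            dmax = mean(TMAX,na.rm=TRUE),
            midmax = median(TMAX,na.rm=TRUE),
            dmin = mean(TMIN,na.rm=TRUE)) %>% 
  ungroup() %>% 
  # 2020 is a leap year, so February 29 gets a valid date
  mutate(date = make_date(2020,mo,dy)) -> cal20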
## `summarise()` has grouped output by 'mo'. You can override using the `.groups` argument.
cal20 %>% ggplot(aes(x=date,y=dmax)) +
  geom_point(alpha=.5,size=.5) +
  ggtitle("Average Daily Maximum Temperature") +
  theme(plot.title = element_text(hjust = 0.5)) -> v0
ggplotly(v0)
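
The summary() output above reports 13 missing TMAX values and 14 missing TMIN values. mean() and median() return NA when any input is NA unless na.rm = TRUE is set, so a calendar day with one missing reading would otherwise have no average at all. A minimal sketch with made-up readings (the vector tmax is illustrative):

```r
# Made-up daily highs for one calendar day across four years; one is missing.
tmax <- c(50, 43, NA, 48)

with_na    <- mean(tmax)                  # NA: the missing value propagates
mean_obs   <- mean(tmax, na.rm = TRUE)    # 47: average of the three observed values
median_obs <- median(tmax, na.rm = TRUE)  # 48: middle of 43, 48, 50
```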

Fix the Date

Put your code here showing only the months without the year.
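
One possible answer, offered as a sketch rather than the intended solution: scale_x_date() accepts a date_labels format string, and "%b" prints the abbreviated month name without the year. The data frame toy_cal below is a hypothetical stand-in for cal20 with made-up values, so that the example runs on its own.

```r
library(ggplot2)

# Hypothetical stand-in for cal20: one made-up value per month of 2020.
toy_cal <- data.frame(
  date = seq(as.Date("2020-01-15"), by = "month", length.out = 12),
  dmax = c(45, 49, 54, 59, 66, 71, 77, 78, 72, 60, 50, 44)
)

v0_months <- ggplot(toy_cal, aes(x = date, y = dmax)) +
  geom_point(alpha = .5, size = .5) +
  # Show abbreviated month names only, with one axis break per month.
  scale_x_date(date_labels = "%b", date_breaks = "1 month") +
  ggtitle("Average Daily Maximum Temperature") +
  theme(plot.title = element_text(hjust = 0.5))
```

Printing v0_months draws the plot. With the real data, the same scale_x_date() line could be added to the cal20 plot before the title.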

All of the Data

Here’s a graph showing all of the historical values.

olyw1018 %>% 
   mutate(date =make_date(2020,mo,dy))-> cal20_all

cal20_all %>% ggplot(aes(x=date,y=TMAX)) +
  geom_point(alpha=.05,size=.5) +
  ggtitle("All Daily Maximum Temperatures") +
  theme(plot.title = element_text(hjust = 0.5)) -> v2
ggplotly(v2)
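
The plot above places every year on one shared calendar by rebuilding each date in 2020. A leap year is a convenient choice because it has a slot for February 29. A minimal sketch of the idea with made-up dates (the names obs and common are illustrative):

```r
library(lubridate)

# Made-up observation dates from three different years.
obs <- as.Date(c("1948-07-04", "1996-02-29", "2018-07-04"))

# Keep the month and day of each date, but place it in 2020.
common <- make_date(2020, month(obs), day(obs))

common           # "2020-07-04" "2020-02-29" "2020-07-04"
leap_year(2020)  # TRUE
```

The two July 4 observations now share the same x position, which is what lets 70 years of data stack into one seasonal cloud.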

2018 Enhanced

Here’s the same graph with the 2018 values highlighted in red. The 2018 data end on September 21.

cal18 = filter(olyw1018,yr == 2018) %>% 
        mutate(date =make_date(2020,mo,dy))
cal20_all %>% 
  ggplot(aes(x=date,y=TMAX)) +
  geom_point(alpha=.1,size=.5) +
  geom_point(data=cal18,aes(date,TMAX),alpha=.8,color="red",size=.6) +
  ggtitle("All Daily Maximum Temperatures\n Calendar 2018 is Red") +
  theme(plot.title = element_text(hjust = 0.5)) -> v1
ggplotly(v1)

Replace the cloud with a smoother.

cal18 = filter(olyw1018,yr == 2018) %>% 
        mutate(date =make_date(2020,mo,dy))
cal20_all %>% 
  ggplot(aes(x=date,y=TMAX)) +
  geom_smooth(size=.5,color="blue") +
  geom_point(data=cal18,aes(date,TMAX),alpha=.8,color="red",size=.6) +
  ggtitle("Smoothed Daily Maximum Temperatures\n Calendar 2018 is Red") +
  theme(plot.title = element_text(hjust = 0.5))
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
## Warning: Removed 13 rows containing non-finite values (stat_smooth).
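
The message above shows that ggplot2 chose the smoothing method and formula itself. Both can be stated explicitly in geom_smooth(), which records the choice in the code. The sketch below uses made-up data and swaps in loess rather than the gam that was selected above, purely to keep the toy example small. The names toy and p_smooth are illustrative.

```r
library(ggplot2)

# Made-up seasonal curve with noise, standing in for the TMAX cloud.
set.seed(1)
toy <- data.frame(day = 1:366)
toy$tmax <- 60 - 17 * cos(2 * pi * (toy$day - 25) / 366) + rnorm(366, sd = 6)

p_smooth <- ggplot(toy, aes(x = day, y = tmax)) +
  geom_point(alpha = .2, size = .5) +
  # Name the smoother and formula instead of relying on the defaults.
  geom_smooth(method = "loess", formula = y ~ x, color = "blue")
```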

Median Maximum Temperature

cal18 = filter(olyw1018,yr == 2018) %>% 
        mutate(date =make_date(2020,mo,dy))
cal20%>% 
  ggplot(aes(x=date,y=midmax)) +
  geom_point(size= 1,color="blue") +
  geom_point(data=cal18,aes(date,TMAX),alpha=.8,color="red",size=.6) +
  ggtitle("Daily Maximum Temperatures\n Calendar 2018 is Red\n Median is Blue") +
  theme(plot.title = element_text(hjust = 0.5)) -> v3
ggplotly(v3)

Rain?

cal20 %>% ggplot(aes(x=date,y=prain)) +
  geom_point(alpha=.5,size=.5) +
  ggtitle("Probability of Rain") +
  theme(plot.title = element_text(hjust = 0.5)) -> v4
ggplotly(v4)