Explore OAW23

Harold Nelson

2026-06-08

Task 1

Run the command to make the tidyverse available to your R session.

Solution

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.2.1     ✔ readr     2.2.0
## ✔ forcats   1.0.1     ✔ stringr   1.6.0
## ✔ ggplot2   4.0.3     ✔ tibble    3.3.1
## ✔ lubridate 1.9.5     ✔ tidyr     1.3.2
## ✔ purrr     1.2.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

Task 2

After downloading OAW2309.Rdata from Moodle, load it and use glimpse() to inspect it.

Solution

load("OAW2309.Rdata")
glimpse(OAW2309)
## Rows: 30,075
## Columns: 7
## $ DATE <date> 1941-05-13, 1941-05-14, 1941-05-15, 1941-05-16, 1941-05-17, 1941…
## $ PRCP <dbl> 0.00, 0.00, 0.30, 1.08, 0.06, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00,…
## $ TMAX <dbl> 66, 63, 58, 55, 57, 59, 58, 65, 68, 85, 84, 75, 72, 59, 61, 59, 6…
## $ TMIN <dbl> 50, 47, 44, 45, 46, 39, 40, 50, 42, 46, 46, 50, 41, 37, 48, 46, 4…
## $ mo   <fct> 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6,…
## $ dy   <int> 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 2…
## $ yr   <dbl> 1941, 1941, 1941, 1941, 1941, 1941, 1941, 1941, 1941, 1941, 1941,…
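
As a side note on how load() behaves: it restores objects under the names they were saved with, and it invisibly returns those names. The sketch below uses a hypothetical toy data frame and a temporary file (neither is part of the course materials) to show how to find out what a .Rdata file contained.

```r
# Hypothetical toy object; stands in for whatever a .Rdata file holds
toy <- data.frame(x = 1:3)

# Save it to a temporary file, then remove it from the session
path <- tempfile(fileext = ".Rdata")
save(toy, file = path)
rm(toy)

# load() returns (invisibly) the names of the objects it restored
loaded <- load(path)
print(loaded)

# The object is back under its original name
print(exists("toy"))
```

Capturing the return value this way is handy when the object name inside the file is not obvious from the file name.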

Task 3

Use filter to create a dataframe with just the observations from January 1. Call the new dataframe Jan1. Use head() to verify the result.

Solution

Jan1 = OAW2309 %>% 
  filter(mo == 1 & dy == 1)

head(Jan1)
## # A tibble: 6 × 7
##   DATE        PRCP  TMAX  TMIN mo       dy    yr
##   <date>     <dbl> <dbl> <dbl> <fct> <int> <dbl>
## 1 1942-01-01  0       35    11 1         1  1942
## 2 1943-01-01  0.05    42    34 1         1  1943
## 3 1944-01-01  0.61    48    35 1         1  1944
## 4 1945-01-01  0       51    40 1         1  1945
## 5 1946-01-01  0.35    52    43 1         1  1946
## 6 1947-01-01  0       41    25 1         1  1947
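
Note that glimpse() reports mo as a factor, not a number. The comparison mo == 1 still works because R compares the factor's labels with "1". The sketch below, on a small made-up tibble (not the real OAW2309 data), shows this and also the one place where a factor can surprise you: as.integer() returns the internal level codes rather than the labels.

```r
library(dplyr)

# Hypothetical toy data standing in for OAW2309
toy <- tibble(
  mo   = factor(c(12, 1, 1, 2)),  # levels are "1", "2", "12"
  dy   = c(1L, 1L, 2L, 1L),
  TMAX = c(40, 35, 38, 50)
)

# Comparing a factor with a number compares the labels, so this keeps January 1
jan1_toy <- toy %>%
  filter(mo == 1, dy == 1)
print(jan1_toy)

# Pitfall: as.integer() gives level codes, not the month labels
codes  <- as.integer(toy$mo)                # 12 is the third level, so its code is 3
months <- as.integer(as.character(toy$mo))  # recovers the labels as numbers
print(codes)
print(months)
```

Separating conditions with a comma inside filter() is equivalent to joining them with &.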

Task 4

Warming? Use this data to consider the possibility that weather in Olympia has been getting warmer. Create an appropriate graph using the variables yr and TMAX.

Solution

Jan1 %>% 
  ggplot(aes(x = yr, y = TMAX)) +
  geom_point() +
  geom_smooth()
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
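
One way to put a number on a possible trend is to fit a straight line and read off its slope. The sketch below is only an illustration on synthetic data: the data frame toy, its built-in drift of 0.05 degrees per year, and the sine-wave wiggle are all assumptions made up for this example, not results from Jan1.

```r
library(ggplot2)

# Synthetic stand-in for Jan1: an assumed drift of 0.05 degrees per year
# plus a deterministic wiggle so the example is reproducible
yr  <- 1942:2023
toy <- data.frame(yr = yr, TMAX = 40 + 0.05 * (yr - 1942) + 2 * sin(yr))

# Fit a straight line; the coefficient on yr is the estimated change per year
fit   <- lm(TMAX ~ yr, data = toy)
slope <- coef(fit)[["yr"]]
print(slope * 10)  # estimated change per decade

# The same straight-line fit drawn on the scatterplot
p <- ggplot(toy, aes(x = yr, y = TMAX)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x)
```

Replacing toy with Jan1 would apply the same steps to the real data. A single calendar day per year is a small and noisy sample, so any slope found this way should be read cautiously.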

Task 5

Create a new dataframe JFM15. It should contain data for the 15th of January, February, and March. Use head() to verify.

Solution

JFM15 = OAW2309 %>% 
  filter(mo %in% c(1,2,3),
         dy == 15)
head(JFM15)
## # A tibble: 6 × 7
##   DATE        PRCP  TMAX  TMIN mo       dy    yr
##   <date>     <dbl> <dbl> <dbl> <fct> <int> <dbl>
## 1 1942-01-15  0       41    29 1        15  1942
## 2 1942-02-15  0       50    27 2        15  1942
## 3 1942-03-15  0.07    45    27 3        15  1942
## 4 1943-01-15  0.05    47    29 1        15  1943
## 5 1943-02-15  0       52    28 2        15  1943
## 6 1943-03-15  0       51    31 3        15  1943

Task 6

Compare the values of TMAX for these three days with a boxplot.

Solution

JFM15 %>% 
  ggplot(aes(x = mo, y = TMAX)) +
  geom_boxplot() + 
  ggtitle("TMAX Values")

Task 7

A different comparison. Use faceting to show the differences among these three months. Make the basic plot a histogram.

Solution

JFM15 %>% 
  ggplot(aes(x = TMAX)) +
  geom_histogram() + 
  ggtitle("TMAX Values") +
  facet_wrap(~mo)
## `stat_bin()` using `bins = 30`. Pick better value `binwidth`.
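
The message above appears because geom_histogram() falls back to 30 bins when no bin size is given. A minimal sketch of setting the bin size explicitly follows. The toy data and the choice of a 5-degree binwidth are assumptions for illustration only.

```r
library(ggplot2)

# Hypothetical toy data standing in for JFM15
toy <- data.frame(
  mo   = factor(rep(1:3, each = 6)),
  TMAX = c(35, 41, 44, 47, 48, 52,
           43, 50, 52, 53, 55, 58,
           45, 51, 54, 56, 57, 62)
)

# binwidth is in the units of the x variable, here degrees
p <- ggplot(toy, aes(x = TMAX)) +
  geom_histogram(binwidth = 5) +
  ggtitle("TMAX Values") +
  facet_wrap(~mo)
p
```

With binwidth supplied, the message about bins is no longer printed. Trying a few different widths is a reasonable way to see which features of the distribution are stable.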

Task 8

A different geom. Repeat the previous exercise but use geom_density() instead of geom_histogram().

Solution

JFM15 %>% 
  ggplot(aes(x = TMAX)) +
  geom_density() + 
  ggtitle("TMAX Values") +
  facet_wrap(~mo)

Task 9

Use mutate() to create a new variable TDIFF in the OAW2309 dataframe. It is TMAX - TMIN. Use head() to verify your work.

Solution

OAW2309 = OAW2309 %>% 
  mutate(TDIFF = TMAX - TMIN)

head(OAW2309)
## # A tibble: 6 × 8
##   DATE        PRCP  TMAX  TMIN mo       dy    yr TDIFF
##   <date>     <dbl> <dbl> <dbl> <fct> <int> <dbl> <dbl>
## 1 1941-05-13  0       66    50 5        13  1941    16
## 2 1941-05-14  0       63    47 5        14  1941    16
## 3 1941-05-15  0.3     58    44 5        15  1941    14
## 4 1941-05-16  1.08    55    45 5        16  1941    10
## 5 1941-05-17  0.06    57    46 5        17  1941    11
## 6 1941-05-18  0       59    39 5        18  1941    20

Task 10

Create a new dataframe SUM_DIFF with one row for each of the 12 months. The new variables in this dataframe are mean_diff and sd_diff. Arrange the dataframe by mean_diff.

Solution

SUM_DIFF = OAW2309 %>% 
  group_by(mo) %>% 
  summarize(mean_diff = mean(TDIFF),
            sd_diff = sd(TDIFF)) %>% 
  arrange(mean_diff)

head(SUM_DIFF)
## # A tibble: 6 × 3
##   mo    mean_diff sd_diff
##   <fct>     <dbl>   <dbl>
## 1 12         12.2    5.42
## 2 1          13.0    6.25
## 3 11         15.0    6.80
## 4 2          16.5    7.74
## 5 3          19.6    8.58
## 6 10         20.5    8.73
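
The summary above returned numbers for every month shown, which suggests TDIFF has no missing values in those groups. If a variable does contain NA, mean() and sd() return NA unless told otherwise. The sketch below uses a small made-up tibble (not the real data) to show the na.rm argument, with n() added to count the rows in each group.

```r
library(dplyr)

# Hypothetical toy data with one missing value
toy <- tibble(
  mo    = factor(c(1, 1, 1, 2, 2)),
  TDIFF = c(10, 14, NA, 20, 22)
)

# na.rm = TRUE drops missing values before computing each statistic
sum_toy <- toy %>%
  group_by(mo) %>%
  summarize(mean_diff = mean(TDIFF, na.rm = TRUE),
            sd_diff   = sd(TDIFF, na.rm = TRUE),
            n_rows    = n()) %>%
  arrange(mean_diff)

print(sum_toy)
```

Also note that head() shows only the first six rows. To see all 12 months of SUM_DIFF, use print(SUM_DIFF, n = 12).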

Task 11

Create a scatterplot of mean_diff and sd_diff.

Solution

SUM_DIFF %>% 
  ggplot(aes(x = mean_diff,y = sd_diff)) +
  geom_point()

Task 12

Create a scatterplot of mean_diff and mo.

Solution

SUM_DIFF %>% 
  ggplot(aes(x = mo, y = mean_diff)) +
  geom_point()

Task 13

Use the basic dataframe OAW2309 to create a new dataframe SKINNY with just DATE, TMAX, and PRCP. Create a scatterplot with TMAX on the horizontal axis and PRCP on the vertical axis. Use the size parameter of geom_point to get a reasonable graph. Try sizes below .1.

Solution

SKINNY = OAW2309 %>% 
  select(DATE,TMAX,PRCP)
head(SKINNY)
## # A tibble: 6 × 3
##   DATE        TMAX  PRCP
##   <date>     <dbl> <dbl>
## 1 1941-05-13    66  0   
## 2 1941-05-14    63  0   
## 3 1941-05-15    58  0.3 
## 4 1941-05-16    55  1.08
## 5 1941-05-17    57  0.06
## 6 1941-05-18    59  0
SKINNY %>% 
  ggplot(aes(TMAX,PRCP)) +
  geom_point(size = .01)
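
Shrinking the points is one way to cope with overplotting. Another option, sketched below, is to make the points partly transparent with the alpha parameter of geom_point() so that dense regions appear darker. The toy data and the particular size and alpha values are assumptions for illustration.

```r
library(ggplot2)

# Hypothetical toy data: 2000 points with many exact overlaps
toy <- data.frame(
  TMAX = rep(40:89, times = 40),
  PRCP = (seq_len(2000) %% 7) / 10
)

# alpha = 0.1 means roughly ten overlapping points are needed for full opacity
p <- ggplot(toy, aes(TMAX, PRCP)) +
  geom_point(size = 0.5, alpha = 0.1)
p
```

Replacing toy with SKINNY would apply the same idea to the real data. Size and alpha can be combined and adjusted together until the dense regions are readable.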

Task 14

Rerun the graphic with a different geom. Use geom_smooth().

Solution

SKINNY %>% 
  ggplot(aes(TMAX,PRCP)) +
  geom_smooth()
## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'