This assignment is based on the Feb 10 Weather notes I discussed on that day.
Using as much of my code as you need, replicate my graph showing the mean value of the daily maximum temperature for every calendar day, with the following changes.
Use the median, the minimum, and the maximum instead of the mean. Give them different colors, but don’t bother with a legend.
Fix the date display so that no specific year appears.
Use an alternative to the default theme.
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.3.0 v purrr 0.3.4
## v tibble 3.0.0 v dplyr 0.8.5
## v tidyr 1.0.2 v stringr 1.4.0
## v readr 1.3.1 v forcats 0.5.0
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(plotly)
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
library(lubridate)
##
## Attaching package: 'lubridate'
## The following objects are masked from 'package:dplyr':
##
## intersect, setdiff, union
## The following objects are masked from 'package:base':
##
## date, intersect, setdiff, union
library(ggthemes)
load("olyw1018.RData")
# Put your code here.
weather <- olyw1018 %>%
  group_by(mo, dy) %>%
  summarize(maxtempmedian = median(TMAX),
            maxtempmax = max(TMAX),
            maxtempmin = min(TMAX)) %>%
  ungroup() %>%
  mutate(date = make_date(2020, mo, dy)) # puts every row in one shared year (2020, a leap year, so Feb 29 is valid) for the ggplot x-axis
weather <- weather %>%
  pivot_longer(cols = starts_with("max"),
               names_to = "measure",
               values_to = "temp",
               values_drop_na = TRUE) # pivot longer so color can be used in the visualization
vis1 <- ggplot(weather, aes(x = date, y = temp, color = measure)) +
  geom_point(alpha = .5, size = .5, show.legend = FALSE) +
  ggtitle("Median, Minimum, and Maximum of Daily Maximum Temperature") +
  scale_x_date(date_labels = "%b-%d") + # removes year from tick labels
  theme_minimal() +
  theme(plot.title = element_text(hjust = 0.5)) # added after theme_minimal() so the centered title is not overwritten
ggplotly(vis1)
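The summarize-then-pivot step above can be checked on a tiny example. This is only a sketch: the `toy` data frame and its values are made up for illustration and are not taken from `olyw1018`. It also shows that the `"%b-%d"` format string used in `scale_x_date()` produces labels with no year.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Hypothetical toy data: three readings for each of two calendar days.
toy <- tibble(
  mo   = c(1, 1, 1, 2, 2, 2),
  dy   = c(1, 1, 1, 29, 29, 29),
  TMAX = c(40, 45, 50, 48, 52, 60)
)

toy_long <- toy %>%
  group_by(mo, dy) %>%
  summarize(
    maxtempmedian = median(TMAX),
    maxtempmax    = max(TMAX),
    maxtempmin    = min(TMAX)
  ) %>%
  ungroup() %>%
  # 2020 is a leap year, so February 29 becomes a valid date.
  mutate(date = make_date(2020, mo, dy)) %>%
  # One row per (calendar day, measure) so measure can drive the color.
  pivot_longer(cols = starts_with("max"),
               names_to = "measure",
               values_to = "temp")

# The same format string used for the axis labels drops the year.
toy_long %>%
  mutate(label = format(date, "%b-%d"))
```

Each calendar day ends up with three rows, one per statistic, which is what lets a single `geom_point()` layer draw all three series in different colors.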
We have continuous variables TMAX and PRCP to describe the weather for a day. Add the following discrete variables to the dataframe.
The new variable warmth has the value “Cold” if the value of TMAX is below 40. It has the value “Warm” if TMAX is between 40 and 75. It is “Hot” otherwise.
The new variable wetness has the value “Dry” if PRCP is 0. It has the value “Damp” if PRCP is positive but below .2. Otherwise it has the value “Wet”.
Create these variables using the dplyr function case_when().
Use appropriate graphics to describe the relationships between the continuous variables and the discrete variables on which they were based to verify that your code worked. Pick a different theme from Problem 1.
# Place your code here.
weather2 <- olyw1018 %>%
  mutate(
    tempfeel = case_when(
      TMAX < 40 ~ "Cold",
      TMAX >= 40 & TMAX <= 75 ~ "Warm",
      TMAX > 75 ~ "Hot"
    ),
    wetfeel = case_when(
      PRCP == 0 ~ "Dry",
      PRCP > 0 & PRCP < 0.2 ~ "Damp",
      PRCP >= 0.2 ~ "Wet"
    )
  )
ggplot(weather2, aes(x = reorder(wetfeel, PRCP), y = PRCP, color = wetfeel)) +
  geom_point(alpha = 0.2) +
  labs(title = 'Precipitation as a Categorical Variable', x = 'Precipitation Feel', y = 'Precipitation Amount') +
  theme_economist_white()
ggplot(weather2, aes(x = reorder(tempfeel, TMAX), y = TMAX, color = tempfeel)) +
  geom_point(alpha = 0.2) +
  labs(title = 'Temperature as a Categorical Variable', x = 'Temp Feel', y = 'Temp Max') +
  theme_economist_white()
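A numeric check can back up the plots. The sketch below applies the same `case_when()` rules to made-up values that sit on and around each cutoff, then reports the range of `TMAX` inside each category. The `toy` data frame is hypothetical and is not part of `olyw1018`.

```r
library(dplyr)

# Hypothetical values placed on and next to the cutoffs (40, 75, 0, 0.2).
toy <- tibble(
  TMAX = c(39, 40, 75, 76),
  PRCP = c(0, 0.01, 0.19, 0.2)
)

toy_checked <- toy %>%
  mutate(
    tempfeel = case_when(
      TMAX < 40 ~ "Cold",
      TMAX >= 40 & TMAX <= 75 ~ "Warm",
      TMAX > 75 ~ "Hot"
    ),
    wetfeel = case_when(
      PRCP == 0 ~ "Dry",
      PRCP > 0 & PRCP < 0.2 ~ "Damp",
      PRCP >= 0.2 ~ "Wet"
    )
  )

# The lowest and highest TMAX in each category should respect the cutoffs.
toy_checked %>%
  group_by(tempfeel) %>%
  summarize(n = n(), low = min(TMAX), high = max(TMAX))
```

Running the same `group_by()` and `summarize()` on `weather2` would give a compact table to compare against the cutoffs in the problem statement.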
We saw a few different graphs useful in examining the relationship between two categorical variables. Try two of them here. Use different themes.
# Place your code here.
ggplot(weather2, aes(x = reorder(tempfeel, TMAX), fill = wetfeel)) +
  geom_bar(position = 'dodge') + # geom_bar() counts rows by default, same as stat_count()
  labs(title = 'Comparison of Temp Feel and Wet Feel', x = 'Temp Feel', y = 'Count') +
  theme_fivethirtyeight()
ggplot(weather2, aes(x = reorder(tempfeel, TMAX), y = TMAX, color = wetfeel)) +
  geom_boxplot() +
  labs(title = 'Comparison of Temp Feel and Wet Feel', x = 'Temp Feel', y = 'Temp Max') +
  theme_tufte()
There is no required code. Describe the relationship between warmth and wetness. Also comment on the relative effectiveness of the graphs you used.
On cold days, damp, dry, and wet days occur at roughly similar rates. On warm days, dry days are far more common, and damp and wet days are about equally frequent. Hot days are mostly dry, with more damp days than wet days.
On cold days, the median max temperature (the center line of the boxplot) is about 30 degrees. On warm days, it is higher for dry days, at around 65, and lower for damp and wet days, near 55. On hot days, it stays at around 80 for all levels of precipitation.
I think both graphs work well to show the relationship. The bar graph gives an overview of how many days fall into each combination: cold damp days, cold dry days, and so on. The boxplot gives more detail and shows how those categorical variables relate to the max temperature of the day.
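The rates described above can also be read from a table of row proportions, which removes the effect of warm days being far more numerous than cold or hot ones. This is a minimal sketch on made-up labels: the `toy` data frame is hypothetical, and the same two calls could be applied to `weather2$tempfeel` and `weather2$wetfeel`.

```r
# Hypothetical labels standing in for weather2$tempfeel and weather2$wetfeel.
toy <- data.frame(
  tempfeel = c("Cold", "Cold", "Cold", "Warm", "Warm", "Warm", "Warm", "Hot", "Hot"),
  wetfeel  = c("Dry", "Damp", "Wet", "Dry", "Dry", "Damp", "Wet", "Dry", "Dry")
)

# Counts for every warmth and wetness combination.
counts <- table(toy$tempfeel, toy$wetfeel)

# margin = 1 divides each row by its total, so each warmth level sums to 1.
row_share <- prop.table(counts, margin = 1)
round(row_share, 2)
```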