Let’s find out if the high temperature on February 13th, 2025 in your home town (or a town of your choice) is similar to the high temperature over the past 60 years. Find a data source of high temperatures in your city over the last 60 years. In R, create a data set (i.e., vector or data frame) for three consecutive 20-year increments.
#WEATHER DATA FROM MEDFORD, OR
weather <- read.csv("Medford weather data.csv")
View(weather) #opens the data viewer (interactive sessions only)
#split the record into three consecutive periods by year
groupOne <- subset(weather, Year < 1985)
groupTwo <- subset(weather, Year >= 1985 & Year < 2005)
groupThree <- subset(weather, Year >= 2005)
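As an alternative to three separate subsets, the same grouping can be expressed as a single labelled column with `cut()`. The sketch below runs on a made-up data frame (`toy`, with simulated temperatures), not the Medford file, and it assumes one row per year from 1965 to 2025; the column names simply mirror the ones used above.

```r
# Hypothetical stand-in for the real data: one row per year, 1965-2025.
set.seed(1)
toy <- data.frame(Year = 1965:2025,
                  High.Temperature = round(rnorm(61, mean = 52, sd = 8)))

# cut() assigns each year to a labelled interval; right = FALSE makes
# each interval include its lower bound and exclude its upper bound.
toy$Period <- cut(toy$Year,
                  breaks = c(1965, 1985, 2005, 2026),
                  labels = c("1965-1984", "1985-2004", "2005-2025"),
                  right = FALSE)

# Number of years that fall in each period
periodCounts <- table(toy$Period)
print(periodCounts)
```

Because 2025 is included, the last period holds 21 years rather than 20, which is also true of `groupThree` above.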
Find the z score for the high temperature for 02/13/2025 in comparison to each of these three twenty-year periods.
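#row 61 of the data holds the high temperature for 02/13/2025
feb13 <- weather$High.Temperature[61]
#mean high temperature of each period
meanOne <- mean(groupOne$High.Temperature)
meanTwo <- mean(groupTwo$High.Temperature)
meanThree <- mean(groupThree$High.Temperature)
#standard deviation of the high temperature in each period
sdOne <- sd(groupOne$High.Temperature)
sdTwo <- sd(groupTwo$High.Temperature)
sdThree <- sd(groupThree$High.Temperature)
#Z-SCORES
#z = (observed value - period mean) / period standard deviation
zScoreOne <- (feb13 - meanOne) / sdOne
zScoreTwo <- (feb13 - meanTwo) / sdTwo
zScoreThree <- (feb13 - meanThree) / sdThree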
cat("The z-score for Feb 13th, 2025 relative to the years 1965 to 1984 is", zScoreOne, "\n")
## The z-score for Feb 13th, 2025 relative to the years 1965 to 1984 is 0.6175386
cat("The z-score for Feb 13th, 2025 relative to the years 1985 to 2004 is", zScoreTwo, "\n")
## The z-score for Feb 13th, 2025 relative to the years 1985 to 2004 is 0.37278
cat("The z-score for Feb 13th, 2025 relative to the years 2005 to 2025 is", zScoreThree, "\n")
## The z-score for Feb 13th, 2025 relative to the years 2005 to 2025 is 0.3907323
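The three z-score calculations above all follow the same pattern, so they could be wrapped in one small helper. This is only a sketch: `z_score` is a hypothetical name, and the `reference` vector below is made-up data rather than the Medford temperatures.

```r
# z-score of a single value x relative to a reference sample:
# how many sample standard deviations x lies above (+) or below (-) the mean
z_score <- function(x, reference) {
  (x - mean(reference)) / sd(reference)
}

# Hypothetical reference temperatures (not the Medford data); their mean is 50
reference <- c(48, 52, 55, 50, 45)

zAbove <- z_score(56, reference)  # positive: 56 is above the mean
zAtMean <- z_score(50, reference) # exactly 0: 50 equals the mean
print(c(zAbove, zAtMean))
```

With the objects defined earlier, `z_score(feb13, groupOne$High.Temperature)` would give the same value as `zScoreOne`.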
Create a plot that illustrates your results. What does this result tell you about climate change? Describe how you might use z-scores to monitor for climate change.
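zScores <- c(zScoreOne, zScoreTwo, zScoreThree)
means <- c(meanOne, meanTwo, meanThree)
periods <- c("1965-1984", "1985-2004", "2005-2025")
plot(zScores, type = "b", col = "navy", xaxt = "n", xlab = "Period", ylab = "Z-score")
axis(1, at = 1:3, labels = periods)
plot(means, type = "b", col = "red", xaxt = "n", xlab = "Period", ylab = "Mean high temperature")
axis(1, at = 1:3, labels = periods)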
The first plot shows the progression of z-scores across the three groups of years. For reference, Group One is 1965-1984, Group Two is 1985-2004, and Group Three is 2005-2025. The data set contains temperature data collected in Medford, Oregon. The second plot shows the progression of the average high temperature across those three time periods. Over the past 60 years, the average high temperature has increased.
All three z-scores are positive, which means that the high temperature on 02/13/2025 is greater than the average high temperature of every group.
Taken together, these plots tell a story: as the high temperature on 02/13/2025 is compared with each successive time period, its z-score moves closer to 0 (the closer a z-score is to 0, the closer the raw score is to the mean). Relative to Group One, the 2025 high temperature sits furthest from the mean (z-score of about 0.62), so it would have been a warmer-than-usual day for that period, with a lower probability of occurring. Relative to Group Two and Group Three, the z-score is noticeably closer to 0 (about 0.37 and 0.39). That suggests that what used to be a less probable temperature is now much closer to the average high temperature.
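To put a rough number on "lower probability", a z-score can be converted into an upper-tail probability with `pnorm()`. This is a sketch that assumes the high temperatures in each period are approximately normally distributed, which has not been checked here; `reportedZ` and `upperTail` are names introduced only for this example.

```r
# z-scores reported in the output above
reportedZ <- c(groupOne = 0.6175386, groupTwo = 0.37278, groupThree = 0.3907323)

# Under a normal model, P(Z > z) is the chance of a high temperature
# at least as warm as Feb 13, 2025 within each period
upperTail <- pnorm(reportedZ, lower.tail = FALSE)
print(round(upperTail, 3))
```

A smaller upper-tail probability means the 2025 temperature would have been a less common event in that period.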
#GROUP ONE GRAPH
library(ggplot2)
ggplot(data = groupOne, aes(x = High.Temperature)) +
  labs(title = "Group One: 1965-1984") +
  geom_histogram(fill = "lightblue",
                 color = "navy",
                 binwidth = 1) +
  xlab("Temperature") +
  ylab("Frequency") +
  #red line shows where Feb 13, 2025's temperature falls compared to the other dates
  geom_vline(xintercept = feb13,
             color = "red",
             linetype = "solid") +
  #x and y position the annotation just to the right of the line
  annotate("text", x = feb13 + 2.5, y = 2,
           label = "<- Feb 13, 2025", color = "black")
#GROUP TWO GRAPH
ggplot(data = groupTwo, aes(x = High.Temperature)) +
  labs(title = "Group Two: 1985-2004") +
  geom_histogram(fill = "lightgreen",
                 color = "forestgreen",
                 binwidth = 1) +
  xlab("Temperature") +
  ylab("Frequency") +
  geom_vline(xintercept = feb13,
             color = "red",
             linetype = "solid") +
  annotate("text", x = feb13 + 3.25, y = 2,
           label = "<- Feb 13, 2025", color = "black")
#GROUP THREE GRAPH
ggplot(data = groupThree, aes(x = High.Temperature)) +
  labs(title = "Group Three: 2005-2025") +
  geom_histogram(fill = "lightpink",
                 color = "maroon",
                 binwidth = 1) +
  xlab("Temperature") +
  ylab("Frequency") +
  geom_vline(xintercept = feb13,
             color = "red",
             linetype = "solid") +
  annotate("text", x = feb13 + 2.5, y = 2.5,
           label = "<- Feb 13, 2025", color = "black")
To help visualize where the high temperature on Feb 13th, 2025 falls relative to the three time periods, I plotted the frequency of the temperatures in each group. In Group One, the red line (02/13/2025’s temperature) falls above the majority of the data and above the mean, which mirrors its z-score of 0.62 relative to the Group One temperatures. In Groups Two and Three, Feb 13th, 2025 is still above the average, but by a smaller margin than in Group One.
Describe how you might use z-scores to monitor for climate change:
I would use z-scores to monitor for climate change by continuously computing the z-score of the current year's temperature relative to each period of the past 100 years or so. As I did here, I would plot each z-score. If the current temperature is above the historical means and its z-scores continuously trend closer to 0 for more recent periods, I would take that as a sign that our earth is getting hotter. I would also plot the average high temperature of one day per year to see how the means are changing. If the means are increasing and the z-scores are getting close to 0 or staying close to 0, that is consistent with climate change.
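The monitoring idea can be sketched as follows. The temperatures here are simulated with a small built-in warming trend (they are not real observations), and the names `temps`, `window`, `zByWindow`, and `windowStart` are hypothetical. The current year is compared with each consecutive 20-year block of the preceding 100 years, and the resulting z-scores are plotted.

```r
set.seed(42)
years <- 1925:2025
# Simulated highs: slow upward trend plus random noise (made-up data)
temps <- 50 + 0.05 * (years - 1925) + rnorm(length(years), sd = 6)

window <- 20                      # length of each comparison period in years
current <- temps[length(temps)]   # the 2025 value being monitored
past <- temps[-length(temps)]     # the 100 preceding years, 1925-2024

# Each column of the matrix is one consecutive 20-year block
blocks <- matrix(past, nrow = window)
zByWindow <- apply(blocks, 2, function(w) (current - mean(w)) / sd(w))

# First year of each block, used as the x-axis
windowStart <- years[seq(1, length(past), by = window)]

plot(windowStart, zByWindow, type = "b", col = "navy",
     xlab = "First year of 20-year period",
     ylab = "z-score of current year")
abline(h = 0, lty = 2)  # z = 0 means the current value equals the period mean
```

Repeating this each year, and for more than one calendar day, would show whether the pattern holds beyond a single date.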