require(zoo)
require(moments)
require(lubridate)
Plot and decompose AirPassengers.
#?AirPassengers
#The classic Box & Jenkins airline data. Monthly totals of international airline passengers, 1949 to 1960.
airp<-AirPassengers
str(airp);tsp(airp);summary(airp);head(airp);length(airp)
## Time-Series [1:144] from 1949 to 1961: 112 118 132 129 121 135 148 148 136 119 ...
## [1] 1949.000 1960.917 12.000
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 104.0 180.0 265.5 280.3 360.5 622.0
## [1] 112 118 132 129 121 135
## [1] 144
plot(airp, xlab = "Year", ylab = "Thousands of Air Passengers", main = "Monthly International Air Passenger data")
This looks like it should. From 1949 to 1960 with a frequency of 12. This will be interesting to decompose since there is not as smooth of a seasonality to the data as there is with co2. There should be a clear trend but I wonder if a non-linear trend would be more applicable.
boxplot(airp~cycle(airp), xlab = "Year", ylab = "Thousands of Air Passengers", main = "Monthly International Air Passenger data")
You can see which months are the ‘peak’ months but also that the data has a lot of variability within months. I don’t really like the boxplot since there is clearly an increasing trend that makes comparing months across all years hollow.
LOST<-decompose(AirPassengers, type = "additive")
str(LOST);head(LOST$trend);tail(LOST$trend)
## List of 6
## $ x : Time-Series [1:144] from 1949 to 1961: 112 118 132 129 121 135 148 148 136 119 ...
## $ seasonal: Time-Series [1:144] from 1949 to 1961: -24.75 -36.19 -2.24 -8.04 -4.51 ...
## $ trend : Time-Series [1:144] from 1949 to 1961: NA NA NA NA NA ...
## $ random : Time-Series [1:144] from 1949 to 1961: NA NA NA NA NA ...
## $ figure : num [1:12] -24.75 -36.19 -2.24 -8.04 -4.51 ...
## $ type : chr "additive"
## - attr(*, "class")= chr "decomposed.ts"
## [1] NA NA NA NA NA NA
## [1] NA NA NA NA NA NA
plot(LOST)
My first thought is that the random is does not look random. Not that that is what random means, but still. My very rough guess is that since seasonal has to be even over the whole series the ‘random’ noise is actually showing the change is seasonality. Applied to this data, I would say that betwen 1949 and 1960 more people are international air passengers(trend), more people are air passengers during the peak (seasonal), and that the proportion of total travel that happens during the peaks has increased (random). On the seasonal plot, you can see the offseason, shoulder season, and peak season.
Also you can see that the first and last six months are lost for trend and random.
str, summary, and head before plotting kbli.kbli <- readRDS("airportZoo.rds")
str(kbli);summary(kbli);head(kbli)
## 'zoo' series from 1941-01-01 to 2019-03-31
## Data: num [1:28548, 1:3] 0 0 0 7.6 0.5 0 4.6 0 11.4 0 ...
## - attr(*, "dimnames")=List of 2
## ..$ : NULL
## ..$ : chr [1:3] "prcp" "tmax" "tmin"
## Index: Date[1:28548], format: "1941-01-01" "1941-01-02" "1941-01-03" "1941-01-04" "1941-01-05" ...
## Index prcp tmax tmin
## Min. :1941-01-01 Min. : 0.000 Min. :-12.80 Min. :-18.900
## 1st Qu.:1960-08-16 1st Qu.: 0.000 1st Qu.: 10.00 1st Qu.: 1.700
## Median :1980-03-01 Median : 0.000 Median : 14.40 Median : 5.600
## Mean :1980-02-29 Mean : 2.413 Mean : 14.43 Mean : 5.494
## 3rd Qu.:1999-09-15 3rd Qu.: 2.500 3rd Qu.: 19.40 3rd Qu.: 10.000
## Max. :2019-03-31 Max. :83.800 Max. : 35.60 Max. : 19.400
## prcp tmax tmin
## 1941-01-01 0.0 6.7 -5.6
## 1941-01-02 0.0 7.2 -6.7
## 1941-01-03 0.0 5.6 -4.4
## 1941-01-04 7.6 7.2 1.1
## 1941-01-05 0.5 5.0 -6.1
## 1941-01-06 0.0 6.1 1.1
Same data as A visit to the zoo by Andy Bunn. You see the that the object is class zoo with daily data points for prcp, tmax and tmin(precipitation, maximum temperature, and minimum temperature). I would add that the summary statistics for the variables leads me to believe that they are in millimeters and \(^\circ\)Celsius.
plot(kbli)
Plotting kbli shows a clear seasonality to tmax and tmin. prcp is harder to decipher so it will be interesting to see how this procedes.
Aggregate variablesprcpprcp<-kbli[,colnames(kbli) =="prcp"]
a_prcp<-aggregate(prcp, as.yearmon, sum)
str(a_prcp)
## 'zoo' series from Jan 1941 to Mar 2019
## Data: num [1:938] 88.2 53.5 30.5 46.3 93.3 24.9 12.6 68.5 95.1 78.1 ...
## Index: 'yearmon' num [1:938] Jan 1941 Feb 1941 Mar 1941 Apr 1941 ...
I created a zoo object with the Date index and prcp but without the two temperature variables. This was done so that the prcp is summed instead of averaged like tmax and tmin will be. prcp was aggregated to monthly resolution just as temp will be.
tempstemp<-kbli[,colnames(kbli) !="prcp"]
a_temp<-aggregate(temp, as.yearmon, mean)
str(a_temp)
## 'zoo' series from Jan 1941 to Mar 2019
## Data: num [1:938, 1:2] 10 12.2 15.8 16.9 17.6 ...
## - attr(*, "dimnames")=List of 2
## ..$ : chr [1:938] "1" "32" "60" "91" ...
## ..$ : chr [1:2] "tmax" "tmin"
## Index: 'yearmon' num [1:938] Jan 1941 Feb 1941 Mar 1941 Apr 1941 ...
Summed max and min temperature would be an interesting metric, but the average is much more applicable. a_prcp and a_temp are both the same length.
## ..$ : chr [1:938] "1" "32" "60" "91" ... is interesting as it shows the remaining rows or some sort of equivalent. This doesn’t show for a_prcp which leads me to believe it is a result of the mean fx opposed to the sum fx.
prcp and temp just to decomposea_kbli<-cbind(a_prcp,a_temp)
str(a_kbli)
## 'zoo' series from Jan 1941 to Mar 2019
## Data: num [1:938, 1:3] 88.2 53.5 30.5 46.3 93.3 24.9 12.6 68.5 95.1 78.1 ...
## - attr(*, "dimnames")=List of 2
## ..$ : NULL
## ..$ : chr [1:3] "a_prcp" "tmax" "tmin"
## Index: 'yearmon' num [1:938] Jan 1941 Feb 1941 Mar 1941 Apr 1941 ...
head(a_kbli);head(a_prcp);head(a_temp)
## a_prcp tmax tmin
## Jan 1941 88.2 10.00323 0.2967742
## Feb 1941 53.5 12.15357 -0.1571429
## Mar 1941 30.5 15.75484 1.6645161
## Apr 1941 46.3 16.85333 3.3466667
## May 1941 93.3 17.63226 5.8225806
## Jun 1941 24.9 19.92000 7.8833333
## Jan 1941 Feb 1941 Mar 1941 Apr 1941 May 1941 Jun 1941
## 88.2 53.5 30.5 46.3 93.3 24.9
## tmax tmin
## Jan 1941 10.00323 0.2967742
## Feb 1941 12.15357 -0.1571429
## Mar 1941 15.75484 1.6645161
## Apr 1941 16.85333 3.3466667
## May 1941 17.63226 5.8225806
## Jun 1941 19.92000 7.8833333
plot(a_kbli)
It’s alive! After being pulled apart and decimated (dodecimated?) seperately the variables fit back together and are ready for decompose
decompMaybe it was unnecessary to bind the variables back together as they will put through decomp seperately but it made for a good little plot.
prcpThis next step took a lot of work to get through. decompose didn’t seem to enjoy my objects and would spit back a variety of things from non-conformable rays and incompatible methods to the catchall Error in decompose(ts_prcp) : time series has no or less than 2 periods.
I figured that if I wasn’t sure what was going on I should just define my own ts object with the correct start and end that had the correct frequency.
ts_prcp<-ts(a_prcp, start = c(1941,1), end = c(2019,3), frequency=12)
tspd<-decompose(ts_prcp)
plot(tspd)
This looks about how I would expect it to. There is a non-random trend likely due to ENSO strengths, a highly regular seasonal pattern, and a good amount of noise thrown in there as well. In Bellingham we would expect higher precipitation during the winters and lower during the summers.
jan.pr<-as.vector(subset(a_prcp, months(time(a_prcp)) == "January"))
jul.pr<-as.vector(subset(a_prcp, months(time(a_prcp)) == "July"))
sub<-jan.pr-jul.pr
boxplot(jan.pr, jul.pr, sub, notch = T, ylab = "Monthly Precip (mm)", main = '1941-2019',names=c("Jan","July", "Jan-July"))
As expected, January has higher mean precip than July does. What surprises me is that the difference between Jan and July by year is not consistent. The third boxplot shows that some years January has ~230 mm more precipitation that in July of the same year while there are also years that July gets more precipitation!
At first I didn’t believe it but here it is:
head(jan.pr);head(jul.pr);head(sub)
## [1] 88.20 33.00 37.65 132.00 104.30 139.20
## [1] 12.6 64.2 43.0 0.3 4.0 21.2
## [1] 75.60 -31.20 -5.35 131.70 100.30 118.00
The second year (1942) has a higher cumulative precipitation in July than in January.
tempsa_tmax<-a_temp[,colnames(a_temp) =="tmax"]
ts_tmax<-ts(a_tmax, start = c(1941,1), end = c(2019,3), frequency=12)
tstmax<-decompose(ts_tmax)
plot(tstmax)
a_tmin<-a_temp[,colnames(a_temp) =="tmin"]
ts_tmin<-ts(a_tmin, start = c(1941,1), end = c(2019,3), frequency=12)
tstmin<-decompose(ts_tmin)
plot(tstmin)
I am going to aggregate each variable by year (sum for prcp, mean for tmax and tmin). Then a simple linear model will be calculated and plotted with the aggregate.
prcp annual trendsprcp_yr <- aggregate(ts_prcp,FUN = sum, nfrequency = 1)
lmprcp <- lm(prcp_yr~time(prcp_yr))
plot(prcp_yr,ylab= "Precipitation (mm)", xlab = "Year", main = "Sum Annual Precipitation at Bellingham Airport")
abline(lmprcp,col="darkred")
There is a trend over the years but there is a lot of variability in the model. Let’s look at it.
summary(lmprcp)
##
## Call:
## lm(formula = prcp_yr ~ time(prcp_yr))
##
## Residuals:
## Min 1Q Median 3Q Max
## -275.711 -93.912 -3.807 92.292 312.657
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 34.7135 1273.7723 0.027 0.978
## time(prcp_yr) 0.4281 0.6434 0.665 0.508
##
## Residual standard error: 127.9 on 76 degrees of freedom
## Multiple R-squared: 0.00579, Adjusted R-squared: -0.007291
## F-statistic: 0.4426 on 1 and 76 DF, p-value: 0.5079
I don’t think I have ever seen a negative adj R-squared before. High p-value, huge residuals, and such a low R-squared verifies that there is not a perceptible trend.
tmax annual trendstmax_yr <- aggregate(ts_tmax,FUN = mean, nfrequency = 1)
lmtmax <- lm(tmax_yr~time(tmax_yr))
plot(tmax_yr,ylab=expression(paste("Temperature, ",degree,"C")), xlab = "Year", main = "Annual Mean Max Temperature at Bellingham Airport")
abline(lmtmax,col="darkred")
This is, again, very irregular. Looking at the lm summary should be helpful. I’m realizing that the year is probably less of a meaningfull unit of time than maybe a cycle of ENSO effects.
summary(lmtmax)
##
## Call:
## lm(formula = tmax_yr ~ time(tmax_yr))
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.84354 -0.51508 -0.00901 0.43000 1.95690
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 7.539270 7.869100 0.958 0.341
## time(tmax_yr) 0.003472 0.003975 0.873 0.385
##
## Residual standard error: 0.7904 on 76 degrees of freedom
## Multiple R-squared: 0.00994, Adjusted R-squared: -0.003087
## F-statistic: 0.763 on 1 and 76 DF, p-value: 0.3851
Still not near a meaningful trend by year.
tmin annual trendstmin_yr <- aggregate(ts_tmin,FUN = mean, nfrequency = 1)
lmtmin <- lm(tmin_yr~time(tmin_yr))
plot(tmin_yr,ylab=expression(paste("Temperature, ",degree,"C")), xlab = "Year", main = "Annual Mean Min Temperature at Bellingham Airport")
abline(lmtmin,col="darkred")
Now this could be a more significant relationship.
summary(lmtmin)
##
## Call:
## lm(formula = tmin_yr ~ time(tmin_yr))
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.8478 -0.4926 0.1101 0.4801 1.4304
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -56.649841 7.064632 -8.019 1.01e-11 ***
## time(tmin_yr) 0.031389 0.003569 8.796 3.29e-13 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7096 on 76 degrees of freedom
## Multiple R-squared: 0.5045, Adjusted R-squared: 0.4979
## F-statistic: 77.37 on 1 and 76 DF, p-value: 3.285e-13
It’s even useful if you’re a frequentist. Very interesting, I wish I knew anything about this and why tmax would have a different trend than tmin.
I am going to show the code for prcp then will hide it for tmax and tmin because it is so extensive.
My first step to look at the trends in winter, spring, summer, and autumn was to take the rolling sum/mean over three months. I then took that value from the center month of each season (Jan=Winter, April=Spring, July=Summer, Oct=Autumn). I next ensured that the data was a ts object with the starting time and frequency that I wanted. I created a linear model for each season and also for the entire data set.
prcprollprcp<-rollsum(a_prcp, 3, align = "center")
str(rollprcp)
## 'zoo' series from Feb 1941 to Feb 2019
## Data: num [1:936] 172 130 170 164 131 ...
## Index: 'yearmon' num [1:936] Feb 1941 Mar 1941 Apr 1941 May 1941 ...
#Jan is the sum of winter; Apr is spring;Jul is summer; Oct is autumn.
win.pr<-(subset(rollprcp, months(time(rollprcp)) == "January"))
str(win.pr)
## 'zoo' series from Jan 1942 to Jan 2019
## Data: num [1:77] 178 253 251 271 308 ...
## Index: 'yearmon' num [1:77] Jan 1942 Jan 1943 Jan 1945 Jan 1946 ...
spr.pr<-(subset(rollprcp, months(time(rollprcp)) == "April"))
str(spr.pr)
## 'zoo' series from Apr 1941 to Apr 2018
## Data: num [1:78] 170 170 164 125 215 ...
## Index: 'yearmon' num [1:78] Apr 1941 Apr 1942 Apr 1943 Apr 1944 ...
sum.pr<-(subset(rollprcp, months(time(rollprcp)) == "July"))
str(sum.pr)
## 'zoo' series from Jul 1941 to Jul 2018
## Data: num [1:78] 106 182.5 128.8 62.1 27.1 ...
## Index: 'yearmon' num [1:78] Jul 1941 Jul 1942 Jul 1943 Jul 1944 ...
aut.pr<-(subset(rollprcp, months(time(rollprcp)) == "October"))
str(aut.pr)
## 'zoo' series from Oct 1941 to Oct 2018
## Data: num [1:78] 284 194 160 211 394 ...
## Index: 'yearmon' num [1:78] Oct 1941 Oct 1942 Oct 1943 Oct 1944 ...
ts.win.pr<-ts(win.pr, start = c(1942), end = c(2019), frequency=1)
ts.spr.pr<-ts(spr.pr, start = c(1941), end = c(2019), frequency=1)
ts.sum.pr<-ts(sum.pr, start = c(1941), end = c(2019), frequency=1)
ts.aut.pr<-ts(aut.pr, start = c(1941), end = c(2018), frequency=1)
lmwinpr<-lm(ts.win.pr~time(ts.win.pr))
lmsprpr<-lm(ts.spr.pr~time(ts.spr.pr))
lmsumpr<-lm(ts.sum.pr~time(ts.sum.pr))
lmautpr<-lm(ts.aut.pr~time(ts.aut.pr))
lmallpr<-lm(rollprcp~time(rollprcp))
summary(lmwinpr);summary(lmsprpr);summary(lmsumpr);summary(lmautpr);summary(lmallpr)
##
## Call:
## lm(formula = ts.win.pr ~ time(ts.win.pr))
##
## Residuals:
## Min 1Q Median 3Q Max
## -155.91 -67.03 -17.70 56.97 281.55
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1549.6730 910.6071 1.702 0.0929 .
## time(ts.win.pr) -0.6262 0.4598 -1.362 0.1772
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 91.42 on 76 degrees of freedom
## Multiple R-squared: 0.02383, Adjusted R-squared: 0.01098
## F-statistic: 1.855 on 1 and 76 DF, p-value: 0.1772
##
## Call:
## lm(formula = ts.spr.pr ~ time(ts.spr.pr))
##
## Residuals:
## Min 1Q Median 3Q Max
## -90.384 -36.491 -7.146 34.337 123.082
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1029.9395 482.4164 -2.135 0.0359 *
## time(ts.spr.pr) 0.6215 0.2436 2.551 0.0127 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 49.38 on 77 degrees of freedom
## Multiple R-squared: 0.07792, Adjusted R-squared: 0.06595
## F-statistic: 6.507 on 1 and 77 DF, p-value: 0.01273
##
## Call:
## lm(formula = ts.sum.pr ~ time(ts.sum.pr))
##
## Residuals:
## Min 1Q Median 3Q Max
## -95.524 -29.140 -5.219 25.496 85.123
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 757.9967 396.8600 1.910 0.0599 .
## time(ts.sum.pr) -0.3309 0.2004 -1.651 0.1028
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 40.62 on 77 degrees of freedom
## Multiple R-squared: 0.0342, Adjusted R-squared: 0.02166
## F-statistic: 2.727 on 1 and 77 DF, p-value: 0.1028
##
## Call:
## lm(formula = ts.aut.pr ~ time(ts.aut.pr))
##
## Residuals:
## Min 1Q Median 3Q Max
## -187.789 -35.111 6.645 48.146 162.566
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -619.3716 770.9856 -0.803 0.424
## time(ts.aut.pr) 0.4484 0.3895 1.151 0.253
##
## Residual standard error: 77.44 on 76 degrees of freedom
## Multiple R-squared: 0.01714, Adjusted R-squared: 0.00421
## F-statistic: 1.326 on 1 and 76 DF, p-value: 0.2532
##
## Call:
## lm(formula = rollprcp ~ time(rollprcp))
##
## Residuals:
## Min 1Q Median 3Q Max
## -200.00 -83.15 -15.51 65.05 404.23
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -29.0280 312.1915 -0.093 0.926
## time(rollprcp) 0.1260 0.1577 0.799 0.424
##
## Residual standard error: 108.6 on 934 degrees of freedom
## Multiple R-squared: 0.0006831, Adjusted R-squared: -0.0003869
## F-statistic: 0.6384 on 1 and 934 DF, p-value: 0.4245
plot.zoo(cbind(rollprcp),
plot.type = "single",
col = c("black"), ylim = c(0,750), ylab = "Precipitation (mm)", xlab = "Year", main = "3 month rolling sum of precipitation with seasonal trend lines")
legend("topleft",
legend=c("Winter", "Spring", "Summer", "Autumn", "All"),
fill = c("cyan", "green", "gold", "red", "navy"), cex=0.8, horiz = TRUE, title = "Season")
abline(lmwinpr, col = "cyan", lwd = 2.5)
abline(lmsprpr, col = "green", lwd = 2.5)
abline(lmsumpr, col = "gold", lwd = 2.5)
abline(lmautpr, col = "red", lwd = 2.5)
abline(lmallpr, col = "navy", lwd = 2.5)
This is more interesting than the annual trend lines. Just by looking at both the plot and the information from the linear models, you can see that there is an increase over the years for spring and autumn and a decrease for winter and summer. Spring is even significant. I think the most interesting aspect is how the seasons compare to each other and the total trend on the plot. Let’s just plot the seasonal information with better resolution.
plot(cbind(ts.win.pr, ts.spr.pr, ts.sum.pr, ts.aut.pr), plot.type = "single", col = c("cyan", "green", "gold", "red", "navy"), ylim = c(0,750), ylab= "Precipitation (mm)"
, xlab = "Year", main = "Sum of precipitation by season", lwd = 1.5)
legend("topleft",
legend=c("Winter", "Spring", "Summer", "Autumn"),
fill = c("cyan", "green", "gold", "red"), cex=0.8, horiz = TRUE, title = "Season")
This is very interesting but I know nothing about weather ¯\(0.0)/¯ I do think that this would be very useful for finding more patterns in the data.
tmaxplot.zoo(cbind(rolltmax),
plot.type = "single",
col = c("black"), ylim = c(0,30), ylab=expression(paste("Temperature, ",degree,"C"))
, xlab = "Year", main = "3 month rolling average of maximum temperature with seasonal trend ")
legend("topleft",
legend=c("Winter", "Spring", "Summer", "Autumn", "All"),
fill = c("cyan", "green", "gold", "red", "navy"), cex=0.8, horiz = TRUE, title = "Season")
abline(lmwintM, col = "cyan", lwd = 2.5)
abline(lmsprtM, col = "green", lwd = 2.5)
abline(lmsumtM, col = "gold", lwd = 2.5)
abline(lmauttM, col = "red", lwd = 2.5)
abline(lmalltM, col = "navy", lwd = 2.5)
The plot and the linear model statistics are less interesting than for precipitation. I hope that the tmin will be different since it was different on the annual trends above.
Let’s repeat what we did for prcp and look at just the seasonal data.
plot(cbind(ts.win.tM, ts.spr.tM, ts.sum.tM, ts.aut.tM), plot.type = "single", col = c("cyan", "green", "gold", "red", "navy"), ylim = c(3,27), ylab=expression(paste("Temperature, ",degree,"C"))
, xlab = "Year", main = "3 month rolling average of maximum temperature by season", lwd = 1.5)
legend("topleft",
legend=c("Winter", "Spring", "Summer", "Autumn"),
fill = c("cyan", "green", "gold", "red"), cex=0.8, horiz = TRUE, title = "Season")
Again, I wish I knew exactly how interesting this was. There are some years that the tmax dips for all seasons but there are also years where summer goes up and the other seasons go down.
tminplot.zoo(cbind(rolltmin),
plot.type = "single",
col = c("black"), ylim = c(-4,18), ylab=expression(paste("Temperature, ",degree,"C"))
, xlab = "Year", main = "3 month rolling average of minimum temperature with seasonal trends ")
legend("topleft",
legend=c("Winter", "Spring", "Summer", "Autumn", "All"),
fill = c("cyan", "green", "gold", "red", "navy"), cex=0.8, horiz = TRUE, title = "Season")
abline(lmwintm, col = "cyan", lwd = 2.5)
abline(lmsprtm, col = "green", lwd = 2.5)
abline(lmsumtm, col = "gold", lwd = 2.5)
abline(lmauttm, col = "red", lwd = 2.5)
abline(lmalltm, col = "navy", lwd = 2.5)
That is good. Pretty, pretty, pretty good. Shows the same trend line as the annual data and it is interestingly across all seasons.
Since that annual trend was significant I think it is worth looking at all of the linear models.
summary(lmwintm);summary(lmsprtm);summary(lmsumtm);summary(lmauttm);summary(lmalltm)
##
## Call:
## lm(formula = ts.win.tm ~ time(ts.win.tm))
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.1109 -0.6309 0.1554 0.9487 2.5955
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -40.336290 13.736139 -2.937 0.00439 **
## time(ts.win.tm) 0.020614 0.006935 2.972 0.00396 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.379 on 76 degrees of freedom
## Multiple R-squared: 0.1041, Adjusted R-squared: 0.09236
## F-statistic: 8.835 on 1 and 76 DF, p-value: 0.003956
##
## Call:
## lm(formula = ts.spr.tm ~ time(ts.spr.tm))
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.2490 -0.7047 0.1378 0.7128 1.9277
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -57.650001 9.600934 -6.005 5.92e-08 ***
## time(ts.spr.tm) 0.031456 0.004849 6.488 7.63e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9827 on 77 degrees of freedom
## Multiple R-squared: 0.3534, Adjusted R-squared: 0.345
## F-statistic: 42.09 on 1 and 77 DF, p-value: 7.626e-09
##
## Call:
## lm(formula = ts.sum.tm ~ time(ts.sum.tm))
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.10598 -0.55741 0.08196 0.57954 2.58280
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -55.004493 8.979301 -6.126 3.56e-08 ***
## time(ts.sum.tm) 0.033387 0.004535 7.363 1.70e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9191 on 77 degrees of freedom
## Multiple R-squared: 0.4131, Adjusted R-squared: 0.4055
## F-statistic: 54.21 on 1 and 77 DF, p-value: 1.699e-10
##
## Call:
## lm(formula = ts.aut.tm ~ time(ts.aut.tm))
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.12792 -0.61341 -0.01509 0.62617 1.90639
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -54.969949 9.523141 -5.772 1.61e-07 ***
## time(ts.aut.tm) 0.030630 0.004811 6.367 1.33e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9566 on 76 degrees of freedom
## Multiple R-squared: 0.3479, Adjusted R-squared: 0.3393
## F-statistic: 40.54 on 1 and 76 DF, p-value: 1.331e-08
##
## Call:
## lm(formula = rolltmin ~ time(rolltmin))
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.8398 -3.3761 -0.3026 3.6548 8.1167
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -53.948182 11.220176 -4.808 1.77e-06 ***
## time(rolltmin) 0.030014 0.005666 5.297 1.47e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.904 on 934 degrees of freedom
## Multiple R-squared: 0.02917, Adjusted R-squared: 0.02813
## F-statistic: 28.06 on 1 and 934 DF, p-value: 1.467e-07
All look to be significant but with adj R-squared at or below 0.4. That is probably still pretty good for data like this.
plot(cbind(ts.win.tm, ts.spr.tm, ts.sum.tm, ts.aut.tm), plot.type = "single", col = c("cyan", "green", "gold", "red", "navy"), ylim = c(-5,18), ylab=expression(paste("Temperature, ",degree,"C"))
, xlab = "Year", main = "3 month rolling average of minimum temperature by season", lwd = 1.5)
legend("topleft",
legend=c("Winter", "Spring", "Summer", "Autumn"),
fill = c("cyan", "green", "gold", "red"), cex=0.8, horiz = TRUE, title = "Season")
I have found it interesting how similar spring and autumn are for all of these plots. I think that this has been a great exercise for practicing the code but wish I knew more about the science so that I could understand it better and know what to pull out and what to leave behind.