On class last Thursday we reviewed topics in 6.3 and 6.4. 6.3 was specifically focused on seasonal variation. We compared data sets to observe two types of seasonal variation, constant and increasing. Ideally we would like to see data sets that show constant variation, which means that the magnitude of the observed variation stays constant over time. The increasing variation, as you could imagine, would show an increasing variation in magnitude over time. We can build models for both of these types, but the constant variation is much easier and more accurate to model. A common technique is to perform transformations that can make the data appear like a constant variation model.

Below is an example of a data set with an increasing seasonal variatin

library(faraway)
## Warning: package 'faraway' was built under R version 3.4.4
library(alr3)
data(airpass)
attach(airpass)
head(airpass)
##   pass     year
## 1  112 49.08333
## 2  118 49.16667
## 3  132 49.25000
## 4  129 49.33333
## 5  121 49.41667
## 6  135 49.50000
plot(year,pass, type ="l")

##Notice the increasing variance as time increases. We can classify it as a increasing seasonal variance. We perform a transformation to get a more constant plot
plot(log(year),log(pass),type="l")

##The log transformation appears to give it a better constant variance appearance

Creating seasonal models

year1 <- floor(airpass$year)
monthvar <- airpass$year - year1
monthfactor <-factor(round(monthvar*12)) 
cbind(airpass$year, monthvar)
##                   monthvar
##   [1,] 49.08333 0.08333333
##   [2,] 49.16667 0.16666667
##   [3,] 49.25000 0.25000000
##   [4,] 49.33333 0.33333333
##   [5,] 49.41667 0.41666667
##   [6,] 49.50000 0.50000000
##   [7,] 49.58333 0.58333333
##   [8,] 49.66667 0.66666667
##   [9,] 49.75000 0.75000000
##  [10,] 49.83333 0.83333333
##  [11,] 49.91667 0.91666667
##  [12,] 50.00000 0.00000000
##  [13,] 50.08333 0.08333333
##  [14,] 50.16667 0.16666667
##  [15,] 50.25000 0.25000000
##  [16,] 50.33333 0.33333333
##  [17,] 50.41667 0.41666667
##  [18,] 50.50000 0.50000000
##  [19,] 50.58333 0.58333333
##  [20,] 50.66667 0.66666667
##  [21,] 50.75000 0.75000000
##  [22,] 50.83333 0.83333333
##  [23,] 50.91667 0.91666667
##  [24,] 51.00000 0.00000000
##  [25,] 51.08333 0.08333333
##  [26,] 51.16667 0.16666667
##  [27,] 51.25000 0.25000000
##  [28,] 51.33333 0.33333333
##  [29,] 51.41667 0.41666667
##  [30,] 51.50000 0.50000000
##  [31,] 51.58333 0.58333333
##  [32,] 51.66667 0.66666667
##  [33,] 51.75000 0.75000000
##  [34,] 51.83333 0.83333333
##  [35,] 51.91667 0.91666667
##  [36,] 52.00000 0.00000000
##  [37,] 52.08333 0.08333333
##  [38,] 52.16667 0.16666667
##  [39,] 52.25000 0.25000000
##  [40,] 52.33333 0.33333333
##  [41,] 52.41667 0.41666667
##  [42,] 52.50000 0.50000000
##  [43,] 52.58333 0.58333333
##  [44,] 52.66667 0.66666667
##  [45,] 52.75000 0.75000000
##  [46,] 52.83333 0.83333333
##  [47,] 52.91667 0.91666667
##  [48,] 53.00000 0.00000000
##  [49,] 53.08333 0.08333333
##  [50,] 53.16667 0.16666667
##  [51,] 53.25000 0.25000000
##  [52,] 53.33333 0.33333333
##  [53,] 53.41667 0.41666667
##  [54,] 53.50000 0.50000000
##  [55,] 53.58333 0.58333333
##  [56,] 53.66667 0.66666667
##  [57,] 53.75000 0.75000000
##  [58,] 53.83333 0.83333333
##  [59,] 53.91667 0.91666667
##  [60,] 54.00000 0.00000000
##  [61,] 54.08333 0.08333333
##  [62,] 54.16667 0.16666667
##  [63,] 54.25000 0.25000000
##  [64,] 54.33333 0.33333333
##  [65,] 54.41667 0.41666667
##  [66,] 54.50000 0.50000000
##  [67,] 54.58333 0.58333333
##  [68,] 54.66667 0.66666667
##  [69,] 54.75000 0.75000000
##  [70,] 54.83333 0.83333333
##  [71,] 54.91667 0.91666667
##  [72,] 55.00000 0.00000000
##  [73,] 55.08333 0.08333333
##  [74,] 55.16667 0.16666667
##  [75,] 55.25000 0.25000000
##  [76,] 55.33333 0.33333333
##  [77,] 55.41667 0.41666667
##  [78,] 55.50000 0.50000000
##  [79,] 55.58333 0.58333333
##  [80,] 55.66667 0.66666667
##  [81,] 55.75000 0.75000000
##  [82,] 55.83333 0.83333333
##  [83,] 55.91667 0.91666667
##  [84,] 56.00000 0.00000000
##  [85,] 56.08333 0.08333333
##  [86,] 56.16667 0.16666667
##  [87,] 56.25000 0.25000000
##  [88,] 56.33333 0.33333333
##  [89,] 56.41667 0.41666667
##  [90,] 56.50000 0.50000000
##  [91,] 56.58333 0.58333333
##  [92,] 56.66667 0.66666667
##  [93,] 56.75000 0.75000000
##  [94,] 56.83333 0.83333333
##  [95,] 56.91667 0.91666667
##  [96,] 57.00000 0.00000000
##  [97,] 57.08333 0.08333333
##  [98,] 57.16667 0.16666667
##  [99,] 57.25000 0.25000000
## [100,] 57.33333 0.33333333
## [101,] 57.41667 0.41666667
## [102,] 57.50000 0.50000000
## [103,] 57.58333 0.58333333
## [104,] 57.66667 0.66666667
## [105,] 57.75000 0.75000000
## [106,] 57.83333 0.83333333
## [107,] 57.91667 0.91666667
## [108,] 58.00000 0.00000000
## [109,] 58.08333 0.08333333
## [110,] 58.16667 0.16666667
## [111,] 58.25000 0.25000000
## [112,] 58.33333 0.33333333
## [113,] 58.41667 0.41666667
## [114,] 58.50000 0.50000000
## [115,] 58.58333 0.58333333
## [116,] 58.66667 0.66666667
## [117,] 58.75000 0.75000000
## [118,] 58.83333 0.83333333
## [119,] 58.91667 0.91666667
## [120,] 59.00000 0.00000000
## [121,] 59.08333 0.08333333
## [122,] 59.16667 0.16666667
## [123,] 59.25000 0.25000000
## [124,] 59.33333 0.33333333
## [125,] 59.41667 0.41666667
## [126,] 59.50000 0.50000000
## [127,] 59.58333 0.58333333
## [128,] 59.66667 0.66666667
## [129,] 59.75000 0.75000000
## [130,] 59.83333 0.83333333
## [131,] 59.91667 0.91666667
## [132,] 60.00000 0.00000000
## [133,] 60.08333 0.08333333
## [134,] 60.16667 0.16666667
## [135,] 60.25000 0.25000000
## [136,] 60.33333 0.33333333
## [137,] 60.41667 0.41666667
## [138,] 60.50000 0.50000000
## [139,] 60.58333 0.58333333
## [140,] 60.66667 0.66666667
## [141,] 60.75000 0.75000000
## [142,] 60.83333 0.83333333
## [143,] 60.91667 0.91666667
## [144,] 61.00000 0.00000000
# Changes levels of factor to be month names 
levels(monthvar) <- c("Jan", "Feb", "Mar", "Apr", "May", 
                      "Jun", "Jul", "Aug", "Sep", "Oct",
                      "Nov", "Dec")  

summary(monthvar) 
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.0000  0.2292  0.4583  0.4583  0.6875  0.9167
mod <- lm(log(pass) ~ year1+monthfactor)
coef(mod)
##   (Intercept)         year1  monthfactor1  monthfactor2  monthfactor3 
##  -1.214997885   0.120825657   0.031389870   0.019403852   0.159699778 
##  monthfactor4  monthfactor5  monthfactor6  monthfactor7  monthfactor8 
##   0.138499729   0.146195892   0.278410898   0.392422029   0.393195995 
##  monthfactor9 monthfactor10 monthfactor11 
##   0.258630198   0.130540762  -0.003108143
summary(mod)
## 
## Call:
## lm(formula = log(pass) ~ year1 + monthfactor)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.156370 -0.041016  0.003677  0.044069  0.132324 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   -1.214998   0.081277 -14.949  < 2e-16 ***
## year1          0.120826   0.001432  84.399  < 2e-16 ***
## monthfactor1   0.031390   0.024253   1.294    0.198    
## monthfactor2   0.019404   0.024253   0.800    0.425    
## monthfactor3   0.159700   0.024253   6.585 1.00e-09 ***
## monthfactor4   0.138500   0.024253   5.711 7.19e-08 ***
## monthfactor5   0.146196   0.024253   6.028 1.58e-08 ***
## monthfactor6   0.278411   0.024253  11.480  < 2e-16 ***
## monthfactor7   0.392422   0.024253  16.180  < 2e-16 ***
## monthfactor8   0.393196   0.024253  16.212  < 2e-16 ***
## monthfactor9   0.258630   0.024253  10.664  < 2e-16 ***
## monthfactor10  0.130541   0.024253   5.382 3.28e-07 ***
## monthfactor11 -0.003108   0.024253  -0.128    0.898    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.0593 on 131 degrees of freedom
## Multiple R-squared:  0.9835, Adjusted R-squared:  0.982 
## F-statistic: 649.4 on 12 and 131 DF,  p-value: < 2.2e-16
plot(log(pass)~year,type="l")
lines(airpass$year, mod$fitted.values, col= "blue")

We can see that this new transformed model does a fairly well job at matching our data.