Q3.4: The data file hours contains monthly values of the average hours worked per week in the U.S. manufacturing sector for July 1982 through June 1987.
library(TSA)
data(hours)
hours
##       Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
## 1982                                38.9 39.0 38.9 39.0 39.3 39.7
## 1983 39.2 38.8 39.6 39.8 39.9 40.3 40.0 40.2 40.8 40.7 40.8 41.2
## 1984 40.6 40.7 40.7 40.9 40.6 40.8 40.3 40.4 40.7 40.5 40.7 41.2
## 1985 40.3 39.7 40.4 40.1 40.3 40.6 40.1 40.5 40.8 40.8 40.9 41.7
## 1986 40.7 40.3 40.7 40.5 40.6 40.8 40.2 40.7 41.0 40.7 41.0 41.5
## 1987 40.8 40.8 40.9 40.4 40.9 41.1
plot(hours, type='l', ylab='Average weekly hours')
q <- season(hours)
points(y=hours, x = time(hours), pch=as.vector(q))
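December is the highest month in every year that has a December observation (1982 through 1986). The low point is less consistent: February is the lowest month in 1983 and 1985, and July is the lowest in 1984 and 1986. The series also rises from about 39 hours in late 1982 to about 41 hours by 1987.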
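The seasonal pattern noted above can also be checked numerically rather than by eye. The sketch below rebuilds the series from the printed values using base R only and reports which month holds the maximum and minimum in each complete calendar year. The names `hours_ts`, `max_month` and `min_month` are assumptions for illustration, chosen so that the `hours` object from TSA is not overwritten.

```r
# Values transcribed from the printout above (July 1982 through June 1987).
vals <- c(38.9, 39.0, 38.9, 39.0, 39.3, 39.7,
          39.2, 38.8, 39.6, 39.8, 39.9, 40.3, 40.0, 40.2, 40.8, 40.7, 40.8, 41.2,
          40.6, 40.7, 40.7, 40.9, 40.6, 40.8, 40.3, 40.4, 40.7, 40.5, 40.7, 41.2,
          40.3, 39.7, 40.4, 40.1, 40.3, 40.6, 40.1, 40.5, 40.8, 40.8, 40.9, 41.7,
          40.7, 40.3, 40.7, 40.5, 40.6, 40.8, 40.2, 40.7, 41.0, 40.7, 41.0, 41.5,
          40.8, 40.8, 40.9, 40.4, 40.9, 41.1)
hours_ts <- ts(vals, start = c(1982, 7), frequency = 12)

# Calendar year and month index (1 = January) for every observation.
yr <- floor(as.numeric(time(hours_ts)) + 1e-8)
mo <- as.integer(cycle(hours_ts))

# Restrict to the complete calendar years so that each year has all 12 months.
full <- which(yr %in% 1983:1986)
max_month <- tapply(full, yr[full], function(i) month.abb[mo[i][which.max(vals[i])]])
min_month <- tapply(full, yr[full], function(i) month.abb[mo[i][which.min(vals[i])]])

print(max_month)
print(min_month)
```

Restricting to complete years avoids comparing the half-years 1982 and 1987 with full ones.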
Q3.5: The data file wages contains monthly values of the average hourly wages (in dollars) for workers in the U.S. apparel and textile products industry for July 1981 through June 1987.
(a) Display and interpret the time series plot for these data.
(b) Use least squares to fit a linear time trend to this time series. Interpret the regression output. Save the standardized residuals from the fit for further analysis.
(c) Construct and interpret the time series plot of the standardized residuals from part (b).
(d) Use least squares to fit a quadratic time trend to the wages time series. Interpret the regression output. Save the standardized residuals from the fit for further analysis.
(e) Construct and interpret the time series plot of the standardized residuals from part (d).
data(wages)
plot(wages, type='o', ylab='wages per hour')
#linear model
wages.lm = lm(wages~time(wages))
summary(wages.lm) # R-squared is 0.97: a strong linear trend, but the residual plot still needs checking
##
## Call:
## lm(formula = wages ~ time(wages))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.23828 -0.04981 0.01942 0.05845 0.13136
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -5.490e+02 1.115e+01 -49.24 <2e-16 ***
## time(wages) 2.811e-01 5.618e-03 50.03 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.08257 on 70 degrees of freedom
## Multiple R-squared: 0.9728, Adjusted R-squared: 0.9724
## F-statistic: 2503 on 1 and 70 DF, p-value: < 2.2e-16
plot(y=rstandard(wages.lm), x=as.vector(time(wages)), type = 'o')
#Quadratic model trend
wages.qm = lm(wages ~ time(wages) + I(time(wages)^2))
summary(wages.qm)
##
## Call:
## lm(formula = wages ~ time(wages) + I(time(wages)^2))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.148318 -0.041440 0.001563 0.050089 0.139839
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -8.495e+04 1.019e+04 -8.336 4.87e-12 ***
## time(wages) 8.534e+01 1.027e+01 8.309 5.44e-12 ***
## I(time(wages)^2) -2.143e-02 2.588e-03 -8.282 6.10e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.05889 on 69 degrees of freedom
## Multiple R-squared: 0.9864, Adjusted R-squared: 0.986
## F-statistic: 2494 on 2 and 69 DF, p-value: < 2.2e-16
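The very large coefficients in this fit (an intercept near -8.5e+04) are a side effect of squaring raw calendar time, which stays close to 1985 throughout the sample. One common remedy is to centre time before squaring. The sketch below shows this on a synthetic monthly series; the series `y` and the names `t_raw` and `t_c` are made up for illustration and are not the wages data. The fitted curve is the same in both parameterisations, and only the coefficient scale changes.

```r
set.seed(1)
# Synthetic monthly series over July 1981 to June 1987 (NOT the wages data).
yrs <- (1:72) / 12
y <- ts(7 + 0.3 * yrs - 0.02 * yrs^2 + rnorm(72, sd = 0.05),
        start = c(1981, 7), frequency = 12)

t_raw <- as.numeric(time(y))      # raw calendar time, about 1981.5 to 1987.4
t_c   <- t_raw - mean(t_raw)      # centred time, about -3 to 3 years

fit_raw <- lm(as.numeric(y) ~ t_raw + I(t_raw^2))
fit_c   <- lm(as.numeric(y) ~ t_c + I(t_c^2))

# The two fits describe the same curve.
print(all.equal(as.numeric(fitted(fit_raw)), as.numeric(fitted(fit_c)),
                tolerance = 1e-5))

# Coefficients on the centred scale are of ordinary magnitude.
print(coef(fit_c))
```

With centred time the intercept is the fitted level at the middle of the sample, which is easier to interpret than a value extrapolated back to year zero.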
#time series plot of the standardized residuals
plot(y=rstandard(wages.qm), x=as.vector(time(wages)), type = 'o')
Link: http://www.r-tutor.com/elementary-statistics/simple-linear-regression/standardized-residual
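As the linked page describes, a standardized residual is the raw residual divided by its estimated standard deviation: r_i = e_i / (s * sqrt(1 - h_ii)), where s is the residual standard error and h_ii is the leverage of observation i. The sketch below checks that formula against `rstandard()` on synthetic data; the variables `x`, `y` and `r_manual` are made up for illustration.

```r
set.seed(42)
# Synthetic data for illustration only.
x <- 1:50
y <- 2 + 0.5 * x + rnorm(50)
fit <- lm(y ~ x)

e <- residuals(fit)        # raw residuals
s <- summary(fit)$sigma    # residual standard error
h <- hatvalues(fit)        # leverages (diagonal of the hat matrix)

# Standardized residuals computed by hand.
r_manual <- e / (s * sqrt(1 - h))

# Should agree with the built-in function.
print(all.equal(r_manual, rstandard(fit)))
```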
Q3.6: The data file beersales contains monthly U.S. beer sales (in millions of barrels) for the period January 1975 through December 1990.
(a) Display and interpret the time series plot for these data.
(b) Now construct a time series plot that uses separate plotting symbols for the various months. Does your interpretation change from that in part (a)?
(c) Use least squares to fit a seasonal-means trend to this time series. Interpret the regression output. Save the standardized residuals from the fit for further analysis.
(d) Construct and interpret the time series plot of the standardized residuals from part (c). Be sure to use proper plotting symbols to check on seasonality in the standardized residuals.
(e) Use least squares to fit a seasonal-means plus quadratic time trend to the beer sales time series. Interpret the regression output. Save the standardized residuals from the fit for further analysis.
(f) Construct and interpret the time series plot of the standardized residuals from part (e). Again use proper plotting symbols to check for any remaining seasonality in the residuals.
data("beersales")
plot(beersales, type='o')
points(y = beersales, x= time(beersales), pch = as.vector(season(beersales)))
#Using least squares
month = season(beersales)
beersales.lm = lm(beersales ~ month)
summary(beersales.lm) # seasonal means give multiple R-squared 0.71 (adjusted 0.69), versus about 0.15 for a fit on year alone
##
## Call:
## lm(formula = beersales ~ month)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.5745 -0.4772 0.1759 0.7312 2.1023
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 12.48568 0.26392 47.309 < 2e-16 ***
## monthFebruary -0.14259 0.37324 -0.382 0.702879
## monthMarch 2.08219 0.37324 5.579 8.77e-08 ***
## monthApril 2.39760 0.37324 6.424 1.15e-09 ***
## monthMay 3.59896 0.37324 9.643 < 2e-16 ***
## monthJune 3.84976 0.37324 10.314 < 2e-16 ***
## monthJuly 3.76866 0.37324 10.097 < 2e-16 ***
## monthAugust 3.60877 0.37324 9.669 < 2e-16 ***
## monthSeptember 1.57282 0.37324 4.214 3.96e-05 ***
## monthOctober 1.25444 0.37324 3.361 0.000948 ***
## monthNovember -0.04797 0.37324 -0.129 0.897881
## monthDecember -0.42309 0.37324 -1.134 0.258487
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.056 on 180 degrees of freedom
## Multiple R-squared: 0.7103, Adjusted R-squared: 0.6926
## F-statistic: 40.12 on 11 and 180 DF, p-value: < 2.2e-16
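Because the model includes an intercept, the coefficients above are contrasts against January: the intercept (12.48568) is the January mean, and the June mean, for example, is 12.48568 + 3.84976 = 16.33544. The sketch below illustrates the same relationship in base R on a synthetic monthly series. The series `x` and the names `mon` and `monthly_means` are made up for illustration, and `factor(cycle(x))` stands in for the role that `season()` plays above.

```r
set.seed(7)
# Synthetic monthly series with a seasonal pattern (NOT the beersales data).
x <- ts(10 + 3 * sin(2 * pi * (1:96) / 12) + rnorm(96, sd = 0.5),
        start = c(1975, 1), frequency = 12)
mon <- factor(cycle(x), labels = month.name)

# Seasonal-means model with January as the baseline level.
fit <- lm(as.numeric(x) ~ mon)
monthly_means <- tapply(as.numeric(x), mon, mean)

# Intercept equals the January mean (difference should be numerically zero).
print(coef(fit)[["(Intercept)"]] - monthly_means[["January"]])

# Each other coefficient is that month's mean minus the January mean.
print(coef(fit)[["monJune"]] -
        (monthly_means[["June"]] - monthly_means[["January"]]))
```

A non-significant month coefficient therefore means only that the month is not distinguishable from January, not that the month has no seasonal effect.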
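plot(y=rstandard(beersales.lm), x=as.vector(time(beersales)), type = 'o')
# month symbols, as part (d) asks, to check for seasonality left in the residuals
points(y = rstandard(beersales.lm), x = as.vector(time(beersales)), pch = as.vector(season(beersales)))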
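#Seasonal means plus quadratic time trend (a quadratic trend needs both the linear and the squared term)
beersales.qm = lm(beersales ~ month + time(beersales) + I(time(beersales)^2))
#summary(beersales.qm) #without the linear term, month + I(time(beersales)^2) gave an R-squared of about 0.86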
#time series plot of the standardized residuals
plot(y=rstandard(beersales.qm), x= as.vector(time(beersales)), type='o')
points(y = rstandard(beersales.qm), x= as.vector(time(beersales)), pch = as.vector(season(beersales)))
Q3.7: The data file winnebago contains monthly unit sales of recreational vehicles from Winnebago, Inc., from November 1966 through February 1972.
(a) Display and interpret the time series plot for these data.
(b) Use least squares to fit a line to these data. Interpret the regression output. Plot the standardized residuals from the fit as a time series. Interpret the plot.
(c) Now take natural logarithms of the monthly sales figures and display and interpret the time series plot of the transformed values.
(d) Use least squares to fit a line to the logged data. Display and interpret the time series plot of the standardized residuals from this fit.
(e) Now use least squares to fit a seasonal-means plus linear time trend to the logged sales time series and save the standardized residuals for further analysis. Check the statistical significance of each of the regression coefficients in the model.
(f) Display the time series plot of the standardized residuals obtained in part (e). Interpret the plot.
data("winnebago")
plot(winnebago, type='l', ylab="sales-monthly")
points(y = winnebago, x = time(winnebago), pch = as.vector(season(winnebago)))
#Using least squares
winnebago.lm = lm(winnebago ~ time(winnebago))
summary(winnebago.lm)
##
## Call:
## lm(formula = winnebago ~ time(winnebago))
##
## Residuals:
## Min 1Q Median 3Q Max
## -419.58 -93.13 -12.78 94.96 759.21
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -394885.68 33539.77 -11.77 <2e-16 ***
## time(winnebago) 200.74 17.03 11.79 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 209.7 on 62 degrees of freedom
## Multiple R-squared: 0.6915, Adjusted R-squared: 0.6865
## F-statistic: 138.9 on 1 and 62 DF, p-value: < 2.2e-16
#plotting the residuals
plot(y = rstandard(winnebago.lm), x = as.vector(time(winnebago)), type = 'o')
points(y = rstandard(winnebago.lm), x = as.vector(time(winnebago)), pch = as.vector(season(winnebago)))
#take natural logarithms of the monthly sales
plot(log(winnebago), type = 'l')
points(y = log(winnebago), x = time(winnebago), pch = as.vector(season(winnebago)))
#Least squares to fit a line to the logged data
winnebago.log.lm = lm(log(winnebago) ~ time(log(winnebago)))
summary(winnebago.log.lm)
##
## Call:
## lm(formula = log(winnebago) ~ time(log(winnebago)))
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.03669 -0.20823 0.04995 0.25662 0.86223
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -984.93878 62.99472 -15.63 <2e-16 ***
## time(log(winnebago)) 0.50306 0.03199 15.73 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3939 on 62 degrees of freedom
## Multiple R-squared: 0.7996, Adjusted R-squared: 0.7964
## F-statistic: 247.4 on 1 and 62 DF, p-value: < 2.2e-16
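Because the response is on the log scale, the slope has a multiplicative reading: a slope of 0.50306 per year means sales are multiplied by exp(0.50306) each year, which is growth of roughly 65% per year. The short calculation below uses the estimate printed above; the names `slope`, `annual_growth` and `monthly_growth` are for illustration only.

```r
# Time coefficient transcribed from the summary above (log sales per year).
slope <- 0.50306

# Proportional change implied per year and per month.
annual_growth  <- exp(slope) - 1
monthly_growth <- exp(slope / 12) - 1

# Express both as percentages.
print(round(100 * c(annual = annual_growth, monthly = monthly_growth), 1))
```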
#plotting the residuals
plot(y = rstandard(winnebago.log.lm), x = as.vector(time(winnebago)), type='o')
points(y = rstandard(winnebago.log.lm), x = as.vector(time(winnebago)), pch = as.vector(season(winnebago)))
# least squares to fit a seasonal-means plus linear time trend to the logged sales
month = season(winnebago)
winnebago.log.lm2 = lm( log(winnebago) ~ month + time(log(winnebago)))
summary(winnebago.log.lm2) # multiple R-squared 0.89, adjusted 0.87
##
## Call:
## lm(formula = log(winnebago) ~ month + time(log(winnebago)))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.92501 -0.16328 0.03344 0.20757 0.57388
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -997.33061 50.63995 -19.695 < 2e-16 ***
## monthFebruary 0.62445 0.18182 3.434 0.001188 **
## monthMarch 0.68220 0.19088 3.574 0.000779 ***
## monthApril 0.80959 0.19079 4.243 9.30e-05 ***
## monthMay 0.86953 0.19073 4.559 3.25e-05 ***
## monthJune 0.86309 0.19070 4.526 3.63e-05 ***
## monthJuly 0.55392 0.19069 2.905 0.005420 **
## monthAugust 0.56989 0.19070 2.988 0.004305 **
## monthSeptember 0.57572 0.19073 3.018 0.003960 **
## monthOctober 0.26349 0.19079 1.381 0.173300
## monthNovember 0.28682 0.18186 1.577 0.120946
## monthDecember 0.24802 0.18182 1.364 0.178532
## time(log(winnebago)) 0.50909 0.02571 19.800 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3149 on 51 degrees of freedom
## Multiple R-squared: 0.8946, Adjusted R-squared: 0.8699
## F-statistic: 36.09 on 12 and 51 DF, p-value: < 2.2e-16
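Part (e) asks for the significance of each coefficient. One way to read that off is to compare the p-values with a chosen level. The sketch below does so for the month terms, with the p-values transcribed from the summary above; the 5% level and the names `p_month`, `alpha` and `not_sig` are assumptions for illustration. With the fitted model in the workspace, `coef(summary(winnebago.log.lm2))[, "Pr(>|t|)"]` returns the same column directly.

```r
# p-values for the month coefficients, copied from the printed summary.
p_month <- c(February = 0.001188, March = 0.000779, April = 9.30e-05,
             May = 3.25e-05, June = 3.63e-05, July = 0.005420,
             August = 0.004305, September = 0.003960, October = 0.173300,
             November = 0.120946, December = 0.178532)

alpha <- 0.05  # assumed significance level

# Months whose coefficient is not significant at the chosen level.
not_sig <- names(p_month)[p_month > alpha]
print(not_sig)
```

October, November and December are the months flagged. Since January is the baseline, this says that log sales in those three months are not distinguishable from January once the time trend is accounted for; the time trend itself and the February through September effects are significant.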
plot(y = rstandard(winnebago.log.lm2), x = as.vector(time(winnebago)), type='o')
points(y = rstandard(winnebago.log.lm2), x = as.vector(time(winnebago)), pch = as.vector(season(winnebago)))