Problem 2.10
Consider the two weekly time series oil and gas. The oil series is in dollars per barrel, while the gas series is in cents per gallon.
Solution:
The two weekly time series, oil and gas, are plotted in Figure 1 using the R code below.
library(astsa)
par(mfrow=c(2,1),mar=c(5,4,2,4))
plot(oil, col="#10710a", main="Weekly Time Series of Oil", ylab="Dollars per Barrel")
plot(gas, col="#003399", main="Weekly Time Series of Gas", ylab="Cents per Gallon")
Recall from the previous lesson that a stationary time series has a constant mean value function \(\small \mu_t\) (one that does not depend on time) and an autocovariance function \(\small \gamma(s,t)\) that depends on \(\small s\) and \(\small t\) only through their difference \(\small |s-t|\).
Considering the plots in Figure 1, we can observe that both oil and gas prices trend upward until 2008, fall abruptly between 2008 and 2010, and then increase again. The mean of each series is therefore not constant over time. Hence, these series are not stationary.
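As a rough numerical companion to the visual check, one can compare the mean over the first and second halves of a series. The sketch below uses simulated data (a random walk with drift, which is an assumption made purely for illustration and not a model of the oil or gas series); `half_means` is a hypothetical helper.

```r
# Simulated example: a trending level series versus its week-to-week changes
set.seed(123)
n <- 500
level  <- cumsum(0.5 + rnorm(n))  # random walk with drift 0.5 (simulated, not real prices)
change <- diff(level)             # first differences of the level series

# Hypothetical helper: mean of the first half and of the second half
half_means <- function(z) {
  half <- floor(length(z) / 2)
  c(first = mean(z[1:half]), second = mean(z[(half + 1):length(z)]))
}

level_means  <- half_means(level)
change_means <- half_means(change)

# The level means differ widely between halves; the change means are close
print(rbind(level = level_means, change = change_means))
```

A large gap between the two half-sample means, as in the level series here, is the same symptom of a non-constant mean that the plots in Figure 1 show.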
Solution:
Let \(\small x_t\) be the oil or gas price series. Suppose \(\small y_t\) is the percentage change in prices from one week to the next. That is, we can write it as \[\small y_t=\frac{x_t-x_{t-1}}{x_{t-1}}\]
This can be simplified into \[\small \begin{align} x_t&=x_{t-1}y_t+x_{t-1} \\ x_t&=x_{t-1}(1+y_t) \\ 1+y_t&=\frac{x_t}{x_{t-1}} \end{align}\]
Taking the log of both sides of the equation, \[\small \begin{align} \log (1+y_t)&=\log \bigg( \frac{x_t}{x_{t-1}} \bigg) \\ \log (1+y_t)&=\log x_t - \log x_{t-1} \end{align}\]
But \(\small \log (1+y_t)=y_t-\frac{y^2_t}{2}+\frac{y^3_t}{3}-...\) for \(\small -1<y_t{\le}1\). If \(\small y_t\) is near zero, the higher-order terms in the expansion are negligible. That is, \[\small \log (1+y_t) \approx y_t\]
So, the percentage change is approximately the first difference of the logged series, and the transformed series can be written as \[\small \begin{align} y_t&\approx\log x_t - \log x_{t-1} \\ &=\nabla \log x_t \end{align}\]
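The approximation above can be checked numerically. The sketch below uses a short vector of made-up prices (hypothetical values, not taken from the oil or gas data) and compares the exact percentage change with the difference of logs.

```r
# Hypothetical weekly prices (illustrative values only)
x <- c(50.0, 51.0, 50.5, 52.0, 51.2)

pct_change <- diff(x) / head(x, -1)  # (x_t - x_{t-1}) / x_{t-1}
log_diff   <- diff(log(x))           # log x_t - log x_{t-1}

# The error is roughly y_t^2 / 2, so it is tiny when changes are small
comparison <- cbind(pct_change, log_diff, error = pct_change - log_diff)
print(round(comparison, 5))
```

Because \(\small \log(1+y) \le y\), the error column is never negative, and it grows as the week-to-week changes get larger.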
Solution:
The plots of the transformed data in part (b) and its corresponding autocorrelation function (ACF) as shown in Figure 2 can be obtained using the R code below.
par(mfrow=c(2,2))
# Growth rates (differenced logs) and their sample ACFs up to lag 208 weeks (4 years)
plot(diff(log(oil)), main="Oil", col="#10710a", ylab="diff(log(oil))")
acf(diff(log(oil)), 208, main="ACF Diff(Log(Oil))")
plot(diff(log(gas)), main="Gas", col="#003399", ylab="diff(log(gas))")
acf(diff(log(gas)), 208, main="ACF Diff(Log(Gas))")
As shown in the plots above, the transformed series appear to be stationary, with the exception of bursts of volatility in oil prices in 2009 and in gas prices near 2006. Moreover, the corresponding ACF plots decay quickly and, in their correlation structure, behave much like white noise. This behavior is an indication of a stationary series.
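To make "behaves like white noise" a little more concrete, one can count how many sample autocorrelations fall outside the approximate 95% bounds \(\small \pm 1.96/\sqrt{n}\). The sketch below does this for simulated Gaussian white noise; the series length and number of lags are arbitrary assumptions, and the same check could be applied to `diff(log(oil))` or `diff(log(gas))`.

```r
# Simulated white noise (an assumption for illustration, not the oil/gas data)
set.seed(42)
n  <- 500
wn <- rnorm(n)

# Sample autocorrelations at lags 1..50 (drop lag 0, which is always 1)
r <- acf(wn, lag.max = 50, plot = FALSE)$acf[-1]

bound        <- 1.96 / sqrt(n)       # approximate 95% bounds under white noise
frac_outside <- mean(abs(r) > bound) # share of lags outside the bounds

print(c(bound = bound, frac_outside = frac_outside))
```

For white noise, roughly 5% of the lags are expected to fall outside the bounds by chance, so a handful of small exceedances is not by itself evidence against it.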
Solution:
The cross-correlation function (CCF) plot of the transformed data as shown in Figure 3 can be generated using the R code below.
ccf(diff(log(oil)),diff(log(gas)), 208, ylab="CCF")
As shown in the CCF plot above, the two growth rate series are strongly correlated. The dominant cross-correlation, 0.665, occurs at lag zero. It is positive, so an above-average oil growth rate tends to coincide with an above-average gas growth rate in the same week. Likewise, a below-average oil growth rate is associated with a likely below-average gas growth rate in the same week.
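When reading a CCF plot it helps to know the lag convention: in R, the lag k value of `ccf(x, y)` estimates the correlation between `x[t+k]` and `y[t]`, so a peak at a negative lag means x leads y. The sketch below illustrates this on simulated data in which y follows x with a two-step delay (the data and the coefficient 0.8 are assumptions for illustration only), and also confirms that the lag-zero value equals the ordinary sample correlation.

```r
# Simulated pair of series where y_t = 0.8 * x_{t-2} + noise (illustrative only)
set.seed(1)
n      <- 500
x_full <- rnorm(n + 2)
x      <- x_full[3:(n + 2)]                    # x_t
y      <- 0.8 * x_full[1:n] + rnorm(n, sd = 0.5)  # depends on x two steps earlier

cc <- ccf(x, y, lag.max = 10, plot = FALSE)

peak_lag <- cc$lag[which.max(cc$acf)]  # lag with the largest cross-correlation
r0       <- cc$acf[cc$lag == 0]        # cross-correlation at lag zero

print(peak_lag)            # negative lag: x leads y
print(c(r0, cor(x, y)))    # lag-zero CCF matches the ordinary correlation
```

In the oil and gas CCF the peak sits at lag zero, which is why the interpretation above is about values in the same week.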
Solution:
The lagged scatterplot matrix can be generated to check nonlinearity of the gas growth rate series against the oil growth rate series for up to three weeks of lead time. We can obtain its scatterplot matrix as shown in Figure 4 using the R code below.
lag2.plot(diff(log(oil)), diff(log(gas)), 3)
The resulting figure plots the gas growth rate series on the vertical axis against the lagged oil growth rate series on the horizontal axis. It displays the sample cross-correlations as well as the lowess fits. It indicates a strong linear relationship between gas and oil at lag zero, with a cross-correlation of 0.66. We also see positive but weak linear relationships of 0.18, 0.01, and 0.1 at lags 1, 2, and 3, respectively.
Solution:
If we consider adding a dummy variable to account for the asymmetric response of gasoline prices to changes in oil prices, we can fit the regression model
\[\small G_t = \alpha_1 + \alpha_2 I_t + \beta_1 O_t + \beta_2 O_{t-1} + w_t = \begin{cases} \alpha_1 + \alpha_2 + \beta_1 O_t + \beta_2 O_{t-1} + w_t & \text{if } O_t \ge 0 \\ \alpha_1 + \beta_1 O_t + \beta_2 O_{t-1} + w_t & \text{if } O_t < 0 \end{cases}\]
where \(\small I_t\) is the indicator of no growth or positive growth in oil price (\(\small I_t=1\) if \(\small O_t \ge 0\) and 0 otherwise), \(\small O_t\) is the oil growth rate at time t, and \(\small G_t\) is the gas growth rate at time t.
Using the R code below, we get this regression result as follows:
poil = diff(log(oil))
pgas = diff(log(gas))
indi = ifelse (poil < 0, 0, 1)
mess = ts.intersect(pgas, poil, poilL = lag(poil,-1), indi)
summary(fit <- lm(pgas~poil + poilL + indi, data=mess))
##
## Call:
## lm(formula = pgas ~ poil + poilL + indi, data = mess)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.18451 -0.02161 -0.00038  0.02176  0.34342
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.006445   0.003464  -1.860  0.06338 .
## poil         0.683127   0.058369  11.704  < 2e-16 ***
## poilL        0.111927   0.038554   2.903  0.00385 **
## indi         0.012368   0.005516   2.242  0.02534 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.04169 on 539 degrees of freedom
## Multiple R-squared: 0.4563, Adjusted R-squared: 0.4532
## F-statistic: 150.8 on 3 and 539 DF, p-value: < 2.2e-16
Based on this result, the overall regression is significant: the F-statistic is 150.8 on 3 and 539 degrees of freedom, with a p-value below 2.2e-16 (well under 0.05).
Solution:
Based on the result obtained in part (i), the following are the fitted models:
when there is positive or no growth in oil price at time t \[\small \begin{align} \hat{G}_t&=-0.006445+0.012368+0.683127O_t+0.111927O_{t-1} \\ &=0.005923+0.683127O_t+0.111927O_{t-1} \end{align}\]
when there is negative growth in oil price at time t \[\small \hat{G}_t=-0.006445+0.683127O_t+0.111927O_{t-1}\]
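Comparing these two fitted models, the slopes are identical and only the intercept differs: it is 0.005923 when oil prices are rising or flat and -0.006445 when oil prices are falling. The estimated shift, \(\small \hat{\alpha}_2=0.012368\), is positive and significant at the 5% level (p-value 0.02534), so for the same oil growth rates the gasoline price is predicted to grow faster when oil prices are rising than when they are falling. This supports the asymmetry hypothesis.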
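The two regimes can be reproduced directly from the estimated coefficients. The sketch below copies the estimates from the regression output in part (i); the helper `fitted_gas` and the example growth rates of plus or minus 0.02 are hypothetical and only meant to show the intercept shift.

```r
# Coefficient estimates copied from the regression output in part (i)
coefs <- c(intercept = -0.006445, indi = 0.012368,
           poil = 0.683127, poilL = 0.111927)

# Hypothetical helper: fitted gas growth rate given current and lagged oil growth
fitted_gas <- function(oil_now, oil_prev) {
  indicator <- ifelse(oil_now < 0, 0, 1)  # same coding as `indi` in the fit
  unname(coefs["intercept"] + coefs["indi"] * indicator +
           coefs["poil"] * oil_now + coefs["poilL"] * oil_prev)
}

intercept_rising  <- unname(coefs["intercept"] + coefs["indi"])  # oil growth >= 0
intercept_falling <- unname(coefs["intercept"])                  # oil growth < 0

# Illustrative inputs: a 2% rise versus a 2% fall in oil, no change the week before
up   <- fitted_gas( 0.02, 0)
down <- fitted_gas(-0.02, 0)
print(c(intercept_rising = intercept_rising, intercept_falling = intercept_falling))
print(c(up = up, down = down))
```

The fitted response to the rise and to the fall differ in magnitude only through the intercept shift of 0.012368, since the slope on the oil growth rate is the same in both regimes.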
Solution:
Based on the result obtained in (i), the fitted model has a residual standard error (RSE) of 0.04169 on 539 degrees of freedom. This means that the observed gas growth rates deviate from the fitted values by approximately 0.04169 on average (in log-difference units, roughly 4.2 percentage points).
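For reference, the residual standard error reported by `summary()` is the square root of the residual sum of squares divided by the residual degrees of freedom. The sketch below verifies this on a small simulated regression (the data and the name `fit_sim` are assumptions for illustration; the same lines could be applied to the fitted model from part (i)).

```r
# Simulated regression (illustrative only, not the oil/gas fit)
set.seed(7)
n  <- 200
x1 <- rnorm(n)
y  <- 0.5 * x1 + rnorm(n, sd = 0.04)
fit_sim <- lm(y ~ x1)

rss         <- sum(resid(fit_sim)^2)              # residual sum of squares
rse_manual  <- sqrt(rss / df.residual(fit_sim))   # sqrt(RSS / (n - number of coefficients))
rse_summary <- summary(fit_sim)$sigma             # value printed by summary()

print(c(manual = rse_manual, summary = rse_summary))
```

The RSE is on the scale of the response, which is why it is read as a typical size of the deviation between observed and fitted gas growth rates.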
Now, Figure 5 shows the plot of the residuals of the fitted model. We can obtain this using the R code below.
plot(ts(resid(fit)),ylab="",main="Residuals")
The residuals plot above resembles a white noise series. To check this, Figure 6 shows the plot of the ACF of the residuals. Please refer to the R code below.
acf(resid(fit))
As shown above, the ACF of the residuals is consistent with white noise. This indicates that the fitted model is adequate.
Problem 2.11
Use two different smoothing techniques described in Section 2.3 to estimate the trend in the global temperature series globtemp. Comment.
Solution:
To estimate the trend in the global temperature series, globtemp, we can use the locally weighted scatterplot smoothers (lowess) and the smoothing splines techniques.
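With the lowess technique, a smoother span of f = 0.05 (each local fit uses 5% of the data) follows the data closely, while the default span (f = 2/3, dashed line) emphasizes the overall trend. Figure 7 can be reproduced using the R code below.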
plot(globtemp, main = "Lowess")
lines(lowess(globtemp, f=.05), lwd=2, col="#003399")
lines(lowess(globtemp), lty=2, lwd=2, col="#A3081B")
On the other hand, with the smoothing splines technique, we use a spar value of 0.5 to emphasize the data and a spar value of 1 to emphasize the trend. Note that the smoothing parameter spar in R is monotonically related to lambda, which controls the trade-off between linear regression (completely smooth) and the data itself (no smoothness). Figure 8 can be reproduced using the R code below.
plot(globtemp, main = "Smoothing Splines")
lines(smooth.spline(time(globtemp), globtemp, spar=.5), lwd=2, col="#003399")
lines(smooth.spline(time(globtemp), globtemp, spar= 1), lty=2, lwd=2, col="#A3081B")
As observed, the trend estimates from the two smoothing techniques are almost the same. The smoothing splines fit just appears to be a bit smoother than the lowess fit.
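The visual impression of "smoother" can be given a rough number by summing the squared second differences of each fitted curve. The sketch below does this on a simulated trend-plus-noise series (an assumption for illustration; `tt`, `yy`, and the `roughness` helper are hypothetical and not part of the original solution), using the same span and spar settings as above.

```r
# Simulated trend plus noise (illustrative stand-in, not the globtemp data)
set.seed(1)
tt <- 1:136
yy <- 0.005 * (tt - 68) + 0.1 * rnorm(136)

# Hypothetical roughness measure: sum of squared second differences of the fit
roughness <- function(fitted) sum(diff(fitted, differences = 2)^2)

rough <- c(
  lowess_small   = roughness(lowess(tt, yy, f = 0.05)$y),           # 5% span
  lowess_default = roughness(lowess(tt, yy)$y),                     # default span f = 2/3
  spline_half    = roughness(smooth.spline(tt, yy, spar = 0.5)$y),  # follows the data
  spline_one     = roughness(smooth.spline(tt, yy, spar = 1)$y)     # follows the trend
)
print(signif(rough, 3))
```

Smaller values indicate a smoother curve, so within each technique the setting that emphasizes the trend should give the smaller number.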