Problem 2.10


Consider the two weekly time series oil and gas. The oil series is in dollars per barrel, while the gas series is in cents per gallon.



    1. Plot the data on the same graph. Which of the simulated series displayed in Section 1.2 do these series most resemble? Do you believe the series are stationary (explain your answer)?

      Solution:
      The two weekly time series of oil and gas can be plotted in Figure 1 using the R code below.

      library(astsa)
      par(mfrow=c(2,1),mar=c(5,4,2,4))
      plot(oil, col="#10710a", main="Weekly Time Series of Oil", ylab="Dollars per Barrel")
      plot(gas, col="#003399", main="Weekly Time Series of Gas", ylab="Cents per Gallon")

      Figure 1. Weekly time series of oil and gas


      Recalling the previous lesson, a (weakly) stationary time series has a mean value function \(\small \mu_t\) that is constant (it does not depend on time \(\small t\)) and an autocovariance function \(\small \gamma(s,t)\) that depends on \(\small s\) and \(\small t\) only through their difference \(\small |s-t|\).

      In Figure 1, both oil and gas prices trend upward until 2008, fall abruptly between 2008 and 2010, and then rise again. Of the simulated series in Section 1.2, this wandering behavior with an overall upward trend most resembles the random walk with drift. Because the mean level of each series clearly changes over time, these series are not stationary.
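A random walk with drift is one simple model whose mean changes with time. The sketch below simulates one and compares the sample mean over the first and second halves of the series. The drift, sample size, and seed are hypothetical choices for illustration, not values fitted to the oil or gas data.

```r
# Simulated random walk with drift (hypothetical parameters, not fitted to oil or gas)
set.seed(1)
n     <- 500
drift <- 0.5
w     <- rnorm(n)                  # white noise shocks
x     <- ts(cumsum(drift + w))     # x_t = drift + x_{t-1} + w_t, with x_0 = 0

# A stationary series would have similar means over any two stretches;
# here the second-half mean is far above the first-half mean
first_half  <- mean(window(x, end = n / 2))
second_half <- mean(window(x, start = n / 2 + 1))
c(first_half = first_half, second_half = second_half)

plot(x, main = "Simulated random walk with drift", ylab = "x")
```

The same half-sample comparison could be applied to oil or gas as a rough numerical check of what the plots suggest.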



    1. In economics, it is often the percentage change in price (termed growth rate or return), rather than the absolute price change, that is important. Argue that a transformation of the form \(\small y_t=\nabla \log x_t\) might be applied to the data, where \(\small x_t\) is the oil or gas price series. Hint: Recall Footnote 1.2.

      Solution:
      Let \(\small x_t\) be the oil or gas price series, and let \(\small y_t\) be the percentage change in price from one week to the next. That is, \[\small y_t=\frac{x_t-x_{t-1}}{x_{t-1}}\]

      This can be simplified into \[\small \begin{align} x_t&=x_{t-1}y_t+x_{t-1} \\ x_t&=x_{t-1}(1+y_t) \\ 1+y_t&=\frac{x_t}{x_{t-1}} \end{align}\]

      Taking the log of both sides of the equation, \[\small \begin{align} \log (1+y_t)&=\log \bigg( \frac{x_t}{x_{t-1}} \bigg) \\ \log (1+y_t)&=\log x_t - \log x_{t-1} \end{align}\]

      But \(\small \log (1+y_t)=y_t-\frac{y^2_t}{2}+\frac{y^3_t}{3}-\cdots\) for \(\small -1<y_t \le 1\). If \(\small y_t\) is near zero, the higher-order terms in the expansion are negligible. That is, \[\small \log (1+y_t) \approx y_t\]

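      So the growth rate is approximately the differenced log series, \[\small \begin{align} y_t&\approx\log x_t - \log x_{t-1} \\ &=\nabla \log x_t \end{align}\] which is why the transformation \(\small \nabla \log x_t\) is appropriate when percentage changes are of interest.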
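The approximation can be checked numerically. The sketch below uses a short, made-up price vector (not the oil or gas data) and compares the exact percentage change with the differenced logs.

```r
# Hypothetical prices, chosen only to illustrate the approximation
x <- c(100, 102, 101, 104, 103.5)

pct   <- diff(x) / head(x, -1)   # exact percentage change (x_t - x_{t-1}) / x_{t-1}
ldiff <- diff(log(x))            # differenced logs, log(x_t) - log(x_{t-1})

# The error is roughly y_t^2 / 2, so it is tiny when changes are a few percent
cbind(pct, ldiff, error = pct - ldiff)
```

Since \(\small \log(1+y) \le y\), the log difference never exceeds the exact percentage change, and the gap grows as the changes get larger.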



    1. Transform the data as described in part (b), plot the data on the same graph, look at the sample ACFs of the transformed data, and comment.

      Solution:
      The plots of the transformed data in part (b) and its corresponding autocorrelation function (ACF) as shown in Figure 2 can be obtained using the R code below.

      par(mfrow=c(2,2))
      # growth rates are the differenced logs, so label the axes accordingly
      plot(diff(log(oil)), main="Oil", col="#10710a", ylab = "diff(log(oil))")
      acf(diff(log(oil)), 208, main="ACF Diff(Oil)")
      plot(diff(log(gas)), main="Gas", col="#003399", ylab= "diff(log(gas))")
      acf(diff(log(gas)), 208, main="ACF Diff(Gas)")

      Figure 2. Transformed time series data of oil and gas with its corresponding autocorrelation function


      As shown in the plots above, the transformed series appear stationary in the mean, apart from bursts of larger fluctuations in oil around 2009 and in gas near 2006. The sample ACFs drop off quickly after lag zero and remain small at higher lags, so the correlation structure of both series is close to that of white noise. This behavior is consistent with a stationary series.
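One way to make "behaves like white noise" more concrete is to count how many sample autocorrelations fall outside the approximate \(\small \pm 2/\sqrt{n}\) bounds that `acf` draws. The sketch below does this for simulated white noise, used here as a hypothetical stand-in; replacing `y` with `diff(log(oil))` or `diff(log(gas))` would apply the same check to the transformed data.

```r
# Simulated white noise as a stand-in series (hypothetical, not the oil/gas data)
set.seed(42)
y <- rnorm(544)

r     <- acf(y, lag.max = 50, plot = FALSE)$acf[-1]  # sample ACF, lag 0 dropped
bound <- 2 / sqrt(length(y))                          # approximate 95% bounds

# Under white noise, roughly 5% of the values are expected outside the bounds
mean(abs(r) > bound)
```

A proportion well above 5%, or large values concentrated at the first few lags, would point to remaining autocorrelation.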



    1. Plot the CCF of the transformed data and comment. The small, but significant values when gas leads oil might be considered as feedback.

      Solution:
      The cross-correlation function (CCF) plot of the transformed data as shown in Figure 3 can be generated using the R code below.

      ccf(diff(log(oil)),diff(log(gas)), 208, ylab="CCF")

      Figure 3. Cross-correlation function of the transformed time series data of oil and gas


      As shown in the CCF plot above, the two growth rate series are strongly correlated. The largest cross-correlation, 0.665, occurs at lag zero. The correlations in this region are positive, so an above-average oil growth rate tends to coincide with an above-average gas growth rate in the same week. Likewise, a below-average oil growth rate tends to coincide with a below-average gas growth rate.
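Reading a CCF plot depends on the sign convention for the lag. In R, `ccf(x, y)` at lag k estimates the correlation between `x[t+k]` and `y[t]`, so a peak at a negative lag means `x` leads `y`. The sketch below illustrates this with simulated series in which `y` depends on the previous value of `x`; the coefficients and seed are hypothetical.

```r
# Simulated pair in which x leads y by one time step (hypothetical parameters)
set.seed(7)
n      <- 500
x_full <- rnorm(n + 1)
y      <- 0.8 * x_full[1:n] + rnorm(n, sd = 0.5)  # y_t = 0.8 * x_{t-1} + noise
x      <- x_full[2:(n + 1)]                       # x_t, aligned with y_t

cc       <- ccf(x, y, lag.max = 10, plot = FALSE)
peak_lag <- cc$lag[which.max(abs(cc$acf))]        # lag with the largest |CCF|
peak_lag                                          # negative: x leads y
```

For `ccf(diff(log(oil)), diff(log(gas)))`, negative lags therefore correspond to oil leading gas and positive lags to gas leading oil. The lag axis is in units of the series' time index, which is years for these weekly data.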



    1. Exhibit scatterplots of the oil and gas growth rate series for up to three weeks of lead time of oil prices; include a nonparametric smoother in each plot and comment on the results (e.g., Are there outliers? Are the relationships linear?).

      Solution:
      The lagged scatterplot matrix can be generated to check nonlinearity of the gas growth rate series against the oil growth rate series for up to three weeks of lead time. We can obtain its scatterplot matrix as shown in Figure 4 using the R code below.

      lag2.plot(diff(log(oil)), diff(log(gas)), 3)

      Figure 4. Lagged scatterplot matrix to check nonlinearity of the gas growth rate series against the oil growth rate series for up to three weeks of lead time


      The resulting figure plots the gas growth rate series on the vertical axis against the lagged oil growth rate series on the horizontal axis, and displays the sample cross-correlations together with the lowess fits. It indicates a strong linear relationship between gas and oil at lag zero, with a cross-correlation of 0.66. We also see positive but weak linear relationships, with cross-correlations of 0.18, 0.01, and 0.1 at lags 1, 2, and 3, respectively.



    1. There have been a number of studies questioning whether gasoline prices respond more quickly when oil prices are rising than when oil prices are falling (“asymmetry”). We will attempt to explore this question here with simple lagged regression; we will ignore some obvious problems such as outliers and autocorrelated errors, so this will not be a definitive analysis. Let \(\small G_t\) and \(\small O_t\) denote the gas and oil growth rates.

      1. Fit the regression (and comment on the results) \[\small G_t=\alpha_1 + \alpha_2 I_t + \beta_1 O_t + \beta_2 O_{t-1} + w_t\] where \(\small I_t=1\) if \(\small O_t \ge 0\) and 0 otherwise (\(\small I_t\) is the indicator of no growth or positive growth in oil price).


      Solution:
      If we add a dummy variable to account for the direction of change in oil prices, the regression model can be written as

      \[\small G_t = \begin{cases} \alpha_1 + \alpha_2 + \beta_1 O_t + \beta_2 O_{t-1} + w_t & \text{if } O_t \ge 0 \\ \alpha_1 + \beta_1 O_t + \beta_2 O_{t-1} + w_t & \text{if } O_t < 0 \end{cases}\]

      where \(\small I_t\) is the indicator of no growth or positive growth in oil price, \(\small O_t\) is the oil growth rate at time t, and \(\small G_t\) is the gas growth rate at time t.

      Using the R code below, we get this regression result as follows:

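      poil = diff(log(oil))            # oil growth rate O_t
      pgas = diff(log(gas))            # gas growth rate G_t
      indi = ifelse (poil < 0, 0, 1)   # indicator I_t: 1 if O_t >= 0, else 0
      # align G_t, O_t, O_{t-1}, and I_t on their common time span
      mess = ts.intersect(pgas, poil, poilL = lag(poil,-1), indi)
      summary(fit <- lm(pgas~poil + poilL + indi, data=mess))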
      ## 
      ## Call:
      ## lm(formula = pgas ~ poil + poilL + indi, data = mess)
      ## 
      ## Residuals:
      ##      Min       1Q   Median       3Q      Max 
      ## -0.18451 -0.02161 -0.00038  0.02176  0.34342 
      ## 
      ## Coefficients:
      ##              Estimate Std. Error t value Pr(>|t|)    
      ## (Intercept) -0.006445   0.003464  -1.860  0.06338 .  
      ## poil         0.683127   0.058369  11.704  < 2e-16 ***
      ## poilL        0.111927   0.038554   2.903  0.00385 ** 
      ## indi         0.012368   0.005516   2.242  0.02534 *  
      ## ---
      ## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
      ## 
      ## Residual standard error: 0.04169 on 539 degrees of freedom
      ## Multiple R-squared:  0.4563, Adjusted R-squared:  0.4532 
      ## F-statistic: 150.8 on 3 and 539 DF,  p-value: < 2.2e-16

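      Based on this result, the overall regression is significant: the F-statistic is 150.8 with a p-value below 2.2e-16. Each predictor is also significant at the 5% level (poil, poilL, and indi have p-values of < 2e-16, 0.00385, and 0.02534, respectively), and the model explains about 46% of the variation in the gas growth rate (R-squared of 0.4563).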



    1. What is the fitted model when there is negative growth in oil price at time t? What is the fitted model when there is no or positive growth in oil price? Do these results support the asymmetry hypothesis?

      Solution:
      Based on the result obtained in part (i), the following are the fitted models:

      • when there is positive or no growth in oil price at time t \[\small \begin{align} \hat{G}_t&=-0.006445+0.012368+0.683127O_t+0.111927O_{t-1} \\ &=0.005923+0.683127O_t+0.111927O_{t-1} \end{align}\]

      • when there is negative growth in oil price at time t \[\small \hat{G}_t=-0.006445+0.683127O_t+0.111927O_{t-1}\]

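      The two fitted models have the same slopes and differ only in the intercept, which is 0.005923 when oil prices are rising or flat and -0.006445 when they are falling. The difference, \(\small \hat\alpha_2=0.012368\), is positive and significant at the 5% level (p-value 0.02534), so for the same oil growth rates the gas growth rate is higher when oil prices are rising than when they are falling. This supports the asymmetry hypothesis.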
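The arithmetic behind the two fitted models can be reproduced from the coefficient estimates printed in part (i). The sketch below is only a check of that arithmetic; `fitted_gas` is a hypothetical helper name, and the 1% oil changes are made-up inputs.

```r
# Coefficient estimates copied from the lm() summary printed in part (i)
b <- c(intercept = -0.006445, poil = 0.683127, poilL = 0.111927, indi = 0.012368)

# Fitted gas growth rate for oil growth rates at times t and t-1
# (fitted_gas is a hypothetical helper, not part of the original solution)
fitted_gas <- function(o_t, o_lag) {
  i_t <- ifelse(o_t < 0, 0, 1)   # same indicator coding as indi
  unname(b["intercept"] + b["indi"] * i_t + b["poil"] * o_t + b["poilL"] * o_lag)
}

intercept_up   <- unname(b["intercept"] + b["indi"])  # oil rising or flat
intercept_down <- unname(b["intercept"])              # oil falling
c(up = intercept_up, down = intercept_down)

# A 1% rise versus a 1% fall in oil, with no change the week before
fitted_gas(c(0.01, -0.01), 0)
```

Because the slopes are shared, the gap between the two fitted values for equal-sized rises and falls comes entirely from the indicator coefficient.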



    1. Analyze the residuals from the fit and comment.

      Solution:
      Based on the result obtained in (i), the fitted model has a residual standard error (RSE) of 0.04169 on 539 degrees of freedom. This means that the observed gas growth rates deviate from the fitted values by about 0.04169 (roughly 4 percentage points) on average.

      Now, Figure 5 shows the plot of the residuals of the fitted model. We can obtain this using the R code below.

      plot(ts(resid(fit)),ylab="",main="Residuals") 

      Figure 5. Residuals of the fitted model



      The residual plot above resembles a white noise series. To check this, Figure 6 shows the ACF of the residuals, obtained using the R code below.

      acf(resid(fit))

      Figure 6. Autocorrelation function of the Residuals



      As shown above, the sample ACF of the residuals is consistent with white noise, which suggests that the fitted model is adequate.
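A formal complement to reading the ACF plot is a portmanteau test such as the Ljung-Box test, available in base R as `Box.test`. This is not part of the original solution. The sketch below contrasts simulated white noise with a simulated AR(1) series, both hypothetical stand-ins of the same length as the residuals (543, from 539 degrees of freedom plus 4 coefficients). The choice of 20 lags is also an assumption.

```r
# Hypothetical stand-ins for a residual series
set.seed(10)
wn  <- rnorm(543)                                   # no autocorrelation
ar1 <- arima.sim(model = list(ar = 0.5), n = 543)   # clear autocorrelation

# Ljung-Box test of the null hypothesis that the first 20 autocorrelations are zero
p_wn  <- Box.test(wn,  lag = 20, type = "Ljung-Box")$p.value
p_ar1 <- Box.test(ar1, lag = 20, type = "Ljung-Box")$p.value
c(white_noise = p_wn, ar1 = p_ar1)
```

Applied to the regression, the call would be `Box.test(resid(fit), lag = 20, type = "Ljung-Box")`; a small p-value would indicate autocorrelation left in the residuals.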





Problem 2.11


Use two different smoothing techniques described in Section 2.3 to estimate the trend in the global temperature series globtemp. Comment.


Solution:
To estimate the trend in the global temperature series, globtemp, we can use the locally weighted scatterplot smoothers (lowess) and the smoothing splines techniques.

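With the lowess technique, the solid line uses a span of 5% of the data (f=.05) for each local fit, which follows the shorter-term fluctuations, while the dashed line uses the default span of 2/3 of the data to estimate the overall trend. Figure 7 can be reproduced using the R code below.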

plot(globtemp, main = "Lowess")
lines(lowess(globtemp, f=.05), lwd=2, col="#003399")
lines(lowess(globtemp), lty=2, lwd=2, col="#A3081B")

Figure 7. Lowess of the globtemp series



On the other hand, with the smoothing splines technique, we use a spar value of 0.5 to emphasize the data and a spar value of 1 to emphasize the trend. Note that the smoothing parameter spar in R is monotonically related to lambda, which controls the trade-off between fidelity to the data (small lambda) and smoothness, with the fit approaching a linear regression as lambda grows. Figure 8 can be reproduced using the R code below.

plot(globtemp, main = "Smoothing Splines")
lines(smooth.spline(time(globtemp), globtemp, spar=.5), lwd=2, col="#003399")
lines(smooth.spline(time(globtemp), globtemp, spar= 1), lty=2, lwd=2, col="#A3081B")

Figure 8. Smoothing splines fit to the globtemp series



As observed, the two techniques give nearly the same estimated trend; the smoothing spline fits are only slightly smoother than the lowess fits.
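The effect of the lowess span on smoothness can be quantified with a simple roughness measure, the sum of squared second differences of the fitted curve. The sketch below uses a simulated trend-plus-noise series as a hypothetical stand-in for globtemp; `rough` is a hypothetical helper, and the slope, noise level, and length are made up.

```r
# Simulated trend plus noise (hypothetical stand-in, not the globtemp data)
set.seed(3)
tt <- 1:136
y  <- 0.01 * tt + rnorm(136, sd = 0.1)

fit_small <- lowess(tt, y, f = 0.05)$y    # narrow span: follows local wiggles
fit_large <- lowess(tt, y, f = 2 / 3)$y   # default span: overall trend

# Roughness of a fitted curve: sum of squared second differences
rough <- function(s) sum(diff(s, differences = 2)^2)
c(f_0.05 = rough(fit_small), f_default = rough(fit_large))
```

The same measure could be applied to the `y` component returned by `smooth.spline` to compare the two techniques on a common scale.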


