Ch.1: Q. 1-5, Ch. 2: Q. 1,3,6
Ch.1:
Q1: The goal of this study is to compare travel behavior patterns before and after September 11, 2001. This involves comparing a predictive data trend to a descriptive data trend. The predictive data trend was obtained by using data before September 11, 2001 to generate or predict data after September 11, 2001. This predictive portion was then compared to a descriptive data trend coming from the actual data post September 11, 2001. The actual data is simply the historical data being analyzed hence is descriptive (no forecasting involved). Therefore the goal of this study is both predictive and descriptive.
Q2: With respect to the consideration of a forecast horizon, we need to determine how far into the future we want to predict. Since the data is given monthly, it makes sense to forecast \(k\) months into the future. We may wish to forecast quarterly, that is, a 3-month step into the future. A next month forecast (\(k=1\)) would be even more precise and would be my preferred forecast horizon. If we had data given daily, it would be interesting to use a next-day forecast horizon to compare forecasts using data before 9/11 to the actual data for days after 9/11.
Q3: Since there are three different types of miles travelled (airplane, rail, vehicle), three series need to be forecasted. The forecasted data will be compared to the actual data from September 2001 to April 2004. If no new data is accumulated during the analysis and report generation, the forecasting will be a one-time event. The miles-travelled data is contained in the file Sept11Travel.xls, available at the Bureau of Transportation Statistics (BTS) website; the choice of software depends on what BTS uses in its analysis. I assume BTS has highly qualified statisticians/data analysts at their disposal to do the forecasting.
Q4: \(t=1,2,3\) in the Air series denotes the first three time periods (first three months) for the series of monthly airline revenue passenger miles. The first time period, \(t=1\), denotes the month of January 1990.
Q5: From the file “Sept11Travel.xls”, we locate the date “Jan-90” in the first column and in the “Air RPM (000s)” column we see the value \(y_1 = 35,153,577\). The next two values below this give us \(y_2\) and \(y_3\) which are equal to \(32,965,187\) and \(39,993,913\), respectively. To be clear, these values are the number of actual airline revenue passenger miles (thousands) for the months of January 1990, February 1990, and March 1990, respectively.
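The correspondence between the index \(t\) and the calendar month can be sketched in a few lines of Python. This is only an illustration: the helper name `month_label` is hypothetical, and it assumes the series is monthly with no gaps and that \(t=1\) is January 1990, as stated in Q4.

```python
# Map the 1-based time index t to a month label such as "Jan-90".
# Assumption: monthly data with no gaps, and t = 1 is January 1990.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def month_label(t, start_year=1990, start_month=1):
    """Return the month label for the 1-based period t."""
    if t < 1:
        raise ValueError("t must be a positive integer")
    offset = (start_month - 1) + (t - 1)  # months elapsed since January of start_year
    year = start_year + offset // 12
    return f"{MONTHS[offset % 12]}-{year % 100:02d}"


# First three Air RPM values (thousands) quoted in the answer to Q5.
air_rpm = {1: 35_153_577, 2: 32_965_187, 3: 39_993_913}

for t, y in air_rpm.items():
    print(f"t={t}  {month_label(t)}  y_t={y:,}")
```

Under the same assumption, September 2001 corresponds to \(t=141\), so the pre-event data are the periods \(t \le 140\).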
Ch.2:
Q1: The plots of the three pre-event (01/1990 to 08/2001) time series are displayed below.
There certainly appear to be seasonal components in all three time series, as shown by the periodic travel “peaks” in the summer months.
To better visualize the trend, we suppress seasonality by first aggregating quarterly.
Now aggregating by year yields the following:
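The aggregation step can be sketched as follows. This is a minimal illustration, not the code used to produce the plots: the function name `aggregate` and the toy series are made up, and summing (rather than averaging) within each block is an assumption. Incomplete trailing blocks are dropped, which matters here because the pre-event data end in August and the last year is therefore partial.

```python
def aggregate(values, block):
    """Sum consecutive non-overlapping blocks of `block` observations.

    Trailing observations that do not fill a complete block are dropped,
    so a partial quarter or year does not appear as an artificial dip.
    """
    n_full = len(values) // block
    return [sum(values[i * block:(i + 1) * block]) for i in range(n_full)]


# Hypothetical monthly series: two years of slow growth with a summer bump.
monthly = [100 + t + (20 if t % 12 in (5, 6, 7) else 0) for t in range(24)]

quarterly = aggregate(monthly, 3)   # 8 quarterly totals
yearly = aggregate(monthly, 12)     # 2 yearly totals
print(quarterly)
print(yearly)
```

Yearly totals span exactly one seasonal cycle, so the summer bump contributes the same amount to every year and only the trend remains visible.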
For the Air and Vehicle time series, there is a clear positive linear trend while the Rail series shows a somewhat negative trend up to year 1997 then seems to level out. We proceed by fitting linear trend lines to the Air and Vehicle series and a quadratic trend line to the Rail series.
\[\textbf{Air Series Linear Model} \\ \text{ } \\ \left.\begin{array}{l c c c} \text{Coefficient} & \text{Estimate} & \text{Std. Error} & \text{p-value} \\ \text{Intercept} & 35790086 & 868588 & <2\times 10^{-16} \\ \text{Trend} & -991842 & 145738 & 3.65\times 10^{-10} \end{array}\right. \]
\[\textbf{Rail Series Quadratic Model} \\ \text{ } \\ \left.\begin{array}{l c c c} \text{Coefficient} & \text{Estimate} & \text{Std. Error} & \text{p-value} \\ \text{Intercept} & 570791561 & 16364895 & <2\times 10^{-16} \\ \text{Trend} & -1732057 & 585658 & 3.65\times 10^{-10} \\ \text{Trend}^2 & 5738 & 4398 & 0.19439 \end{array}\right. \]
\[\textbf{Vehicle Series Linear Model} \\ \text{ } \\ \left.\begin{array}{l c c c} \text{Coefficient} & \text{Estimate} & \text{Std. Error} & \text{p-value} \\ \text{Intercept} & 172.78475 & 2.52783 & <2\times 10^{-16} \\ \text{Trend} & 0.45184 & 0.03401 & <2\times 10^{-16} \end{array}\right. \]
As observed from the above results, linear fits for the Air and Vehicle series seem appropriate, and the Trend estimates are highly statistically significant. The quadratic term in the Rail model is not statistically significant (p = 0.19439). However, a linear fit to the Rail series yields a negative trend even though the latter portion of the series increases. Thus a linear fit does not seem appropriate for the Rail series, and the quadratic fit is kept.
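To illustrate why a straight line can misrepresent a series that falls and then turns upward, here is a minimal least-squares sketch in plain Python. The function `fit_polynomial_trend` and the toy series are illustrative assumptions; they are not the data or the software behind the tables above.

```python
def fit_polynomial_trend(y, degree):
    """Least-squares fit of y_t = b0 + b1*t + ... + bd*t^d for t = 1..n.

    Solves the normal equations (X'X) b = X'y by Gaussian elimination
    with partial pivoting and returns [b0, b1, ..., bd].
    """
    n, p = len(y), degree + 1
    X = [[float(t) ** j for j in range(p)] for t in range(1, n + 1)]
    # Augmented matrix [X'X | X'y]
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, p):
            factor = A[r][col] / A[col][col]
            for c in range(col, p + 1):
                A[r][c] -= factor * A[col][c]
    # Back substitution
    b = [0.0] * p
    for r in range(p - 1, -1, -1):
        rest = sum(A[r][c] * b[c] for c in range(r + 1, p))
        b[r] = (A[r][p] - rest) / A[r][r]
    return b


# Hypothetical series that declines and then turns upward after t = 10.
y = [50 - 4 * t + 0.2 * t ** 2 for t in range(1, 13)]

linear = fit_polynomial_trend(y, 1)
quadratic = fit_polynomial_trend(y, 2)
print("linear slope:", round(linear[1], 4))
print("quadratic coefficients:", [round(b, 4) for b in quadratic])
```

On this toy series the linear fit reports a single negative slope and hides the upturn at the end, while the quadratic fit recovers the curvature. This is the same qualitative argument made above for keeping the quadratic Rail model.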
Plotting the Air and Vehicle series with a log-transformation on the y-axis does not produce a more linear fit. We try to obtain a more linear trend for the Rail series by performing the following log-transformation on the y-axis:
The log-transformation does not produce a visibly more linear trend and, as noted previously, a quadratic fit appears more appropriate than a linear fit for the Rail series.
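For context, a log-transformation straightens a trend only when the growth is multiplicative (roughly a constant percentage change per period). The sketch below uses a hypothetical series growing 5% per period; the helper `second_differences` is an illustrative name, and second differences near zero are used as a simple check that points lie on a straight line.

```python
import math


def second_differences(values):
    """Second differences; all near zero when the values lie on a line."""
    return [values[i + 2] - 2 * values[i + 1] + values[i]
            for i in range(len(values) - 2)]


# Hypothetical series with an exponential trend (5% growth per period).
y = [100 * 1.05 ** t for t in range(1, 13)]
log_y = [math.log(v) for v in y]

raw_curvature = max(abs(d) for d in second_differences(y))
log_curvature = max(abs(d) for d in second_differences(log_y))
print("raw scale:", raw_curvature)   # clearly non-zero: the trend is curved
print("log scale:", log_curvature)   # essentially zero: the trend is a line
```

A series that declines and then levels off is not of this exponential form, which is consistent with the log scale not helping for the Rail series.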
Q3: (a) A time plot of the quarterly data is shown below:
We do see a positive linear trend for the first two years, but the series then decreases roughly linearly during 1987 and increases roughly linearly during 1988. We will attempt to fit a linear trend to the data.
A summary of the linear model is given below. \[ \left.\begin{array}{l c c c} \text{Coefficient} & \text{Estimate} & \text{Std. Error} & \text{p-value} \\ \text{Intercept} & 4167.663 & 111.035 & <2\times 10^{-16} \\ \text{Trend} & 24.494 & 9.269 & 0.0165 \end{array}\right. \] Although the linear trend line does not seem to fit the data very well, the trend coefficient is statistically significant (p=0.0165) at the 5% significance level. Thus we can say that a linear trend does seem to be present in the series.
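As a quick check on the summary table, the test statistic for the Trend coefficient is the estimate divided by its standard error (the symbols \(\hat\beta_1\) and \(\alpha\) are notation introduced here for illustration): \[ t = \frac{\hat\beta_1}{\text{SE}(\hat\beta_1)} = \frac{24.494}{9.269} \approx 2.64 \] The reported p-value of \(0.0165\) is below \(\alpha = 0.05\), so the hypothesis of zero slope is rejected at the 5% level, although it would not be rejected at the 1% level.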
Q6: (a) A time plot of the data is shown below:
Although we cannot easily pinpoint the average value of the series by simply inspecting the plot, the level is always present in a series. Like the level, noise (random variation) is assumed to be unobservable but always present, and the plot does show some random variation. The trend is quite easily observed from the plot and can be described as approximately quadratic. There also appears to be some seasonality, with sales peaks occurring around the fall season every year (Nov. 1995, Oct. 1996, Sept. 1997).
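The four components can be made concrete with a small synthetic example. Everything in the sketch below is an assumption for illustration: the numbers are made up, an additive structure is assumed, the trend is taken as linear for simplicity (the plot suggests a roughly quadratic one), and the centered moving average is a standard smoothing device that was not used in the answer above.

```python
import random


def centered_moving_average(y, period=12):
    """2 x period centered moving average (for an even period).

    Each window spans exactly one seasonal cycle, so a pattern repeating
    every `period` observations averages out. Returns a list as long as
    y, with None where the window does not fit.
    """
    half = period // 2
    out = [None] * len(y)
    for i in range(half, len(y) - half):
        window = y[i - half:i + half + 1]
        # End points get half weight so the total weight equals `period`.
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        out[i] = total / period
    return out


random.seed(0)
n = 36                                                       # three years of monthly data
level = 200.0
trend = [2.0 * t for t in range(n)]                          # linear, for simplicity
season = [30.0 if t % 12 == 9 else 0.0 for t in range(n)]    # yearly spike (e.g. October)
noise = [random.gauss(0, 1) for _ in range(n)]

# Additive series: level + trend + seasonality + noise.
y = [level + trend[t] + season[t] + noise[t] for t in range(n)]
# Same series without the noise term, to show what the smoother does exactly.
y_clean = [level + trend[t] + season[t] for t in range(n)]

smooth = centered_moving_average(y)
smooth_clean = centered_moving_average(y_clean)
print([None if s is None else round(s, 1) for s in smooth])
```

On the noise-free version the smoothed values follow the straight line `level + trend`, shifted up by the average seasonal effect (30/12 = 2.5), which shows how averaging over a full year separates level and trend from seasonality.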