Iron is one of the key commodities traded worldwide. ‘The Economist’ has referred to iron as the most important commodity after oil. This is because iron is one of the raw materials used to make steel. Steel goes into buildings and infrastructure, automobiles, mechanical equipment and more. Because of this high demand, iron production is closely tied to the economy. Modeling and forecasting iron production is therefore important to those who study economics.
This study uses quarterly iron production data for Australia from 1956 to 1994. The data show the boom and bust of iron production in Australia and demonstrate how iron production changed as the number of competitors increased.
Figure 1
We start by visualizing the data. Figure 1 shows significant growth in iron production from 1956 to about 1975. This is due to high demand for iron after World War II, which brought a massive boom to the industry. There is a significant drop in 1980. In the early 1980s, many industries began to use manufacturers in China and Southeast Asia, where they could purchase goods, including iron and steel, for less money. Because of this behavior, it might be better not to use time series regression.
Figure 2
From Figure 2, we can see the ACF decreasing slowly, while the PACF looks like white noise. The slowly decaying ACF is a sign that this time series is non-stationary, so detrending will be required.
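To illustrate why a slowly decaying ACF points to non-stationarity, here is a minimal sketch of the sample autocorrelation function in plain Python. The function name `sample_acf` and the synthetic trending series are assumptions made for illustration; they are not the iron production data or the code used for Figure 2.

```python
def sample_acf(x, max_lag):
    """Return the sample autocorrelations of x at lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev) / n  # lag-0 autocovariance
    acf = []
    for h in range(max_lag + 1):
        ch = sum(dev[t] * dev[t + h] for t in range(n - h)) / n
        acf.append(ch / c0)
    return acf

# Hypothetical series with a pure linear trend (not the iron data).
trend = [10.0 + 0.5 * t for t in range(100)]
acf = sample_acf(trend, 5)
print([round(r, 3) for r in acf])
```

For a trending series like this one, the autocorrelations stay close to 1 and fall off only gradually with the lag, which is the same qualitative pattern described for Figure 2.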
Figure 3
To reduce the fluctuation of the time series, a log transformation was applied. As we can see from Figure 3, the fluctuation did not disappear, but it did decrease a little.
Figure 4
Since the behavior of this time series changes over time, I have decided to apply Lowess smoothing to the data. From the ‘Actual vs. Fitted’ plot in Figure 4, we can see that it fits fairly well but does not capture the significant decreases in 1980 and 1990.
Figure 5
As we can see from Figure 5, first-order and second-order differencing did not show a significant difference, so the first-order difference looks sufficient for the model.
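As a rough illustration of what first- and second-order differencing do to a log-transformed series, the sketch below uses a small made-up series that grows by 10% per step. The helper `difference` and the numbers are hypothetical and are not taken from the iron production data.

```python
import math

def difference(x, order=1):
    """Apply lag-1 differencing `order` times; each pass shortens x by one."""
    for _ in range(order):
        x = [b - a for a, b in zip(x, x[1:])]
    return x

# Hypothetical series growing 10% per step (not the iron data).
series = [100.0, 110.0, 121.0, 133.1, 146.41]
log_series = [math.log(v) for v in series]

d1 = difference(log_series, 1)  # roughly constant: the log growth rate
d2 = difference(log_series, 2)  # roughly zero: nothing left to remove
print(d1)
print(d2)
```

Here the first difference of the logs is already flat, so a second difference adds nothing. That is the kind of situation in which first-order differencing is enough.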
Figure 6
Based on Figure 6, the ACF and the PACF of the time series, I have decided to use ARIMA(3,1,0). Neither plot shows any sign of seasonality. To make sure there is no better model, AIC values from several different ARIMA models were compared.
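For reference, an ARIMA(3,1,0) model on the log-transformed series can be written as follows. The notation is an assumption introduced here, not taken from the original analysis: $x_t$ is the iron production in quarter $t$, $B$ is the backshift operator, $\phi_1, \phi_2, \phi_3$ are the autoregressive coefficients, and $w_t$ is white noise.

$$
(1 - \phi_1 B - \phi_2 B^2 - \phi_3 B^3)(1 - B)\,\log x_t = w_t
$$

Equivalently, with $y_t = \log x_t - \log x_{t-1}$ denoting the first difference of the logs:

$$
y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \phi_3 y_{t-3} + w_t
$$

Depending on how the model was fitted, a constant (drift) term may also be present on the right-hand side.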
| | MA0 | MA1 | MA2 | MA3 |
|---|---|---|---|---|
| AR0 | -302.9367 | -314.2411 | -316.9349 | -315.0675 |
| AR1 | -307.5330 | -315.2961 | -314.9564 | -315.9805 |
| AR2 | -315.4985 | -317.6391 | -319.3311 | -319.4251 |
| AR3 | -322.9428 | -323.1305 | -321.4552 | -319.4645 |
From the table above, we can see that ARIMA(3,1,1) has the lowest AIC value. However, ARIMA(3,1,0) has no MA term, and its AIC value is larger by only a small amount. Therefore, I have decided to use ARIMA(3,1,0).
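The comparison can be checked with a few lines of Python. This sketch copies the AIC values from the table into a dictionary keyed by `(p, q)`. The variable names are illustrative, and the code only reads the table; it does not refit any models.

```python
# AIC values from the table, keyed by (AR order p, MA order q), with d = 1.
aic = {
    (0, 0): -302.9367, (0, 1): -314.2411, (0, 2): -316.9349, (0, 3): -315.0675,
    (1, 0): -307.5330, (1, 1): -315.2961, (1, 2): -314.9564, (1, 3): -315.9805,
    (2, 0): -315.4985, (2, 1): -317.6391, (2, 2): -319.3311, (2, 3): -319.4251,
    (3, 0): -322.9428, (3, 1): -323.1305, (3, 2): -321.4552, (3, 3): -319.4645,
}

best = min(aic, key=aic.get)    # order with the lowest AIC
gap = aic[(3, 0)] - aic[best]   # AIC penalty for dropping the MA term

print(f"lowest AIC: ARIMA({best[0]},1,{best[1]}) = {aic[best]}")
print(f"ARIMA(3,1,0) is higher by {gap:.4f}")
```

The gap between the two candidates is about 0.19 AIC units, which is small enough to prefer the model with one fewer parameter.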
## initial value -2.405283
## iter 2 value -2.486043
## iter 3 value -2.502656
## iter 4 value -2.505649
## iter 5 value -2.505703
## iter 6 value -2.505721
## iter 7 value -2.505721
## iter 7 value -2.505721
## iter 7 value -2.505721
## final value -2.505721
## converged
## initial value -2.512672
## iter 2 value -2.512682
## iter 3 value -2.512694
## iter 4 value -2.512696
## iter 5 value -2.512696
## iter 6 value -2.512696
## iter 6 value -2.512696
## iter 6 value -2.512696
## final value -2.512696
## converged
Figure 7
In Figure 7, the plot of the standardized residuals does not show any obvious pattern. There is an outlier close to 4 standard deviations around 1980, but the rest of the points look fine. The ACF of the residuals looks like white noise. In the normal Q-Q plot, most of the points are within the bounds. The Q-statistic is not significant at the lags shown: the p-values are above 0.05 at every lag. This model is suitable to use.
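For readers unfamiliar with the Q-statistic, it is commonly the Ljung-Box statistic. The exact form plotted depends on the software, so the following is the standard definition rather than a description of the original code. Here $n$ is the number of residuals, $\hat{\rho}_e(h)$ is the sample autocorrelation of the residuals at lag $h$, and $H$ is the largest lag tested.

$$
Q = n(n+2)\sum_{h=1}^{H}\frac{\hat{\rho}_e^{\,2}(h)}{n-h}
$$

If the model is adequate, $Q$ is approximately $\chi^2$-distributed with $H - p - q$ degrees of freedom, which is $H - 3$ for ARIMA(3,1,0). A p-value above 0.05 means the hypothesis of uncorrelated residuals is not rejected at that lag.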
Figure 8
Figure 8 is a plot of actual vs. fitted values for the log-transformed time series. There are some gaps between the actual and fitted values, but the fit looks good overall.
Figure 9
Figure 9 shows the predicted values for the log-transformed time series. We can see that the predicted values increase slowly.
| | pred | se |
|---|---|---|
| 1994 Qtr4 | 1916.333 | 1.084304 |
| 1995 Qtr1 | 1900.281 | 1.101536 |
| 1995 Qtr2 | 1941.450 | 1.108358 |
| 1995 Qtr3 | 1969.898 | 1.112788 |
| 1995 Qtr4 | 1982.620 | 1.122813 |
| 1996 Qtr1 | 1989.613 | 1.132799 |
| 1996 Qtr2 | 2007.898 | 1.140607 |
| 1996 Qtr3 | 2028.855 | 1.146967 |
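The table does not say how its columns were scaled. One plausible reading, and it is only an assumption, is that both `pred` and `se` are the exponentials of the log-scale forecast and standard error, which would explain `se` values just above 1. Under that assumption, an approximate 95% interval on the original scale could be sketched as follows. The function name `log_scale_interval` is hypothetical.

```python
import math

def log_scale_interval(pred, se_factor, z=1.96):
    """Approximate interval, ASSUMING pred = exp(log forecast) and
    se_factor = exp(log-scale standard error). This is an assumption
    about the table, not something the source states."""
    log_se = math.log(se_factor)
    half_width = z * log_se
    return pred * math.exp(-half_width), pred * math.exp(half_width)

# First row of the forecast table (1994 Qtr4).
lower, upper = log_scale_interval(1916.333, 1.084304)
print(f"1994 Qtr4: {lower:.1f} to {upper:.1f}")
```

Because the interval is symmetric on the log scale, it is multiplicative around the point forecast on the original scale: the upper bound lies farther from `pred` than the lower bound does.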
Based on the model diagnostics and the predictions, we can conclude that the ARIMA(3,1,0) model is easy to handle and captures the main features of the time series. However, as we can see from Figure 1, there are sudden increases and decreases over time, which the model may not anticipate. There was no seasonal pattern in this time series, but it is possible that a seasonal pattern would appear with newer data.