I used simple summary statistics and plotting tools to look at trends in the data.
data(lynx)
plot(lynx)
summary(lynx)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 39.0 348.2 771.0 1538.0 2567.0 6991.0
class(lynx)
## [1] "ts"
The data was collected over a roughly 100 year period from 1821-1934. The plot reveals some very large fluctuations in trappings of Canadian lynx, with sharp increases and decreases every decade or so. class(lynx) reveals that the data set is in fact a time series (‘ts’).
library(moments)
## Warning: package 'moments' was built under R version 3.2.3
kt <-kurtosis(lynx)
sk <- skewness(lynx)
my.xlim <- range(lynx)
h<-hist(lynx, breaks=10, col="lightblue", xlab="lynx numbers",
main="",xlim=my.xlim)
xfit<-seq(min(lynx),max(lynx),length=100)
yfit<-dnorm(xfit,mean=mean(lynx),sd=sd(lynx))
yfit <- yfit*diff(h$mids[1:2])*length(lynx)
lines(xfit, yfit, col="darkblue", lwd=2)
boxplot(lynx, horizontal=TRUE, outline=TRUE, axes=FALSE,
ylim=my.xlim, col = "lightgreen", add = TRUE, boxwex=5)
text(x =3700 , y=40, labels = paste("Kurtosis=",round(kt,2)),pos = 4)
text(x =3700 , y=33, labels = paste("Skewness=",round(sk,2)),pos = 4)
I fit the data to a linear model to determine the extent to which the number of trapped lynx varied from year to year.
lynxtime <- time(lynx)
lynx.lm <- lm(lynx ~ lynxtime)
summary(lynx.lm)
##
## Call:
## lm(formula = lynx ~ lynxtime)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1594 -1211 -755 1032 5366
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -4630.034 8493.112 -0.545 0.587
## lynxtime 3.285 4.523 0.726 0.469
##
## Residual standard error: 1589 on 112 degrees of freedom
## Multiple R-squared: 0.004689, Adjusted R-squared: -0.004198
## F-statistic: 0.5276 on 1 and 112 DF, p-value: 0.4691
I interpreted the lm output to mean that the number of trapped lynx increased annually by 3 animals over the length of the data set.
I then used a moving average filtering method in order to smooth the data.
ma10 <- filter(x=lynx, filter=rep(x=1/10,times=10), sides=2)
ma5 <- filter(x=lynx, filter=rep(x=1/5,times=5), sides=2)
plot(lynx,col="grey")
lines(ma10,col="red",lwd=2)
lines(ma5,col="blue",lwd=2)
abline(lynx.lm, col="black",lwd=2, lty="dashed")
Note the trend line has a very slight increase over the time period.
I took some advise and played around with the filter lengths a little.
ma8 <- filter(x=lynx, filter=rep(x=1/8,times=8), sides=2)
ma5 <- filter(x=lynx, filter=rep(x=1/5,times=5), sides=2)
plot(lynx,col="grey")
lines(ma8,col="red",lwd=2)
lines(ma5,col="blue",lwd=2)
abline(lynx.lm, col="black",lwd=2, lty="dashed")
ma6 <- filter(x=lynx, filter=rep(x=1/6,times=6), sides=2)
ma5 <- filter(x=lynx, filter=rep(x=1/5,times=5), sides=2)
plot(lynx,col="grey")
lines(ma6,col="red",lwd=2)
lines(ma5,col="blue",lwd=2)
abline(lynx.lm, col="black",lwd=2, lty="dashed")
It seems that a filter rep of 1/8 is better than that of 10 or 6, and is a more of a median value.
lynxforecasts <- HoltWinters(lynx, beta=FALSE, gamma=FALSE)
lynxforecasts
## Holt-Winters exponential smoothing without trend and without seasonal component.
##
## Call:
## HoltWinters(x = lynx, beta = FALSE, gamma = FALSE)
##
## Smoothing parameters:
## alpha: 0.9999339
## beta : FALSE
## gamma: FALSE
##
## Coefficients:
## [,1]
## a 3395.951
lynxforecasts$fitted
## Time Series:
## Start = 1822
## End = 1934
## Frequency = 1
## xhat level
## 1822 269.00000 269.00000
## 1823 320.99656 320.99656
## 1824 584.98255 584.98255
## 1825 870.98109 870.98109
## 1826 1474.96007 1474.96007
## 1827 2820.91102 2820.91102
## 1828 3927.92681 3927.92681
## 1829 5942.86679 5942.86679
## 1830 4950.06564 4950.06564
## 1831 2577.15688 2577.15688
## 1832 523.13579 523.13579
## 1833 98.02810 98.02810
## 1834 183.99432 183.99432
## 1835 278.99372 278.99372
## 1836 408.99141 408.99141
## 1837 2284.87598 2284.87598
## 1838 2684.97355 2684.97355
## 1839 3408.95214 3408.95214
## 1840 1824.10478 1824.10478
## 1841 409.09355 409.09355
## 1842 151.01706 151.01706
## 1843 45.00701 45.00701
## 1844 67.99848 67.99848
## 1845 212.99041 212.99041
## 1846 545.97799 545.97799
## 1847 1032.96780 1032.96780
## 1848 2128.92754 2128.92754
## 1849 2535.97309 2535.97309
## 1850 957.10438 957.10438
## 1851 361.03941 361.03941
## 1852 376.99894 376.99894
## 1853 225.01005 225.01005
## 1854 359.99108 359.99108
## 1855 730.97547 730.97547
## 1856 1637.94004 1637.94004
## 1857 2724.92814 2724.92814
## 1858 2870.99034 2870.99034
## 1859 2119.04971 2119.04971
## 1860 684.09487 684.09487
## 1861 299.02546 299.02546
## 1862 236.00417 236.00417
## 1863 244.99941 244.99941
## 1864 551.97971 551.97971
## 1865 1622.92920 1622.92920
## 1866 3310.88841 3310.88841
## 1867 6720.77457 6720.77457
## 1868 4254.16307 4254.16307
## 1869 687.23581 687.23581
## 1870 255.02857 255.02857
## 1871 472.98559 472.98559
## 1872 358.00760 358.00760
## 1873 783.97184 783.97184
## 1874 1593.94645 1593.94645
## 1875 1675.99458 1675.99458
## 1876 2250.96199 2250.96199
## 1877 1426.05454 1426.05454
## 1878 756.04430 756.04430
## 1879 299.03021 299.03021
## 1880 201.00648 201.00648
## 1881 228.99815 228.99815
## 1882 468.98413 468.98413
## 1883 735.98235 735.98235
## 1884 2041.91366 2041.91366
## 1885 2810.94916 2810.94916
## 1886 4430.89290 4430.89290
## 1887 2511.12692 2511.12692
## 1888 389.14029 389.14029
## 1889 73.02090 73.02090
## 1890 39.00225 39.00225
## 1891 48.99934 48.99934
## 1892 58.99934 58.99934
## 1893 187.99147 187.99147
## 1894 376.98751 376.98751
## 1895 1291.93951 1291.93951
## 1896 4030.81893 4030.81893
## 1897 3495.03542 3495.03542
## 1898 587.19224 587.19224
## 1899 105.03188 105.03188
## 1900 152.99683 152.99683
## 1901 386.98453 386.98453
## 1902 757.97547 757.97547
## 1903 1306.96371 1306.96371
## 1904 3464.85734 3464.85734
## 1905 6990.76690 6990.76690
## 1906 6313.04481 6313.04481
## 1907 3794.16653 3794.16653
## 1908 1836.12945 1836.12945
## 1909 345.09857 345.09857
## 1910 381.99756 381.99756
## 1911 807.97184 807.97184
## 1912 1387.96166 1387.96166
## 1913 2712.91241 2712.91241
## 1914 3799.92814 3799.92814
## 1915 3091.04687 3091.04687
## 1916 2985.00701 2985.00701
## 1917 3789.94678 3789.94678
## 1918 674.20599 674.20599
## 1919 81.03922 81.03922
## 1920 80.00007 80.00007
## 1921 107.99815 107.99815
## 1922 228.99200 228.99200
## 1923 398.98876 398.98876
## 1924 1131.95154 1131.95154
## 1925 2431.91406 2431.91406
## 1926 3573.92450 3573.92450
## 1927 2935.04224 2935.04224
## 1928 1537.09242 1537.09242
## 1929 529.06664 529.06664
## 1930 485.00291 485.00291
## 1931 661.98830 661.98830
## 1932 999.97766 999.97766
## 1933 1589.96100 1589.96100
## 1934 2656.92946 2656.92946
plot(lynxforecasts)
I’m not really sure what these forecasts are revealing, but there seems to be a consistent skew to the right for the forecasted data.