## Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
## 1969 1687 1508 1507 1385 1632 1511 1559 1630 1579 1653 2152 2148
## 1970 1752 1765 1717 1558 1575 1520 1805 1800 1719 2008 2242 2478
## 1971 2030 1655 1693 1623 1805 1746 1795 1926 1619 1992 2233 2192
## 1972 2080 1768 1835 1569 1976 1853 1965 1689 1778 1976 2397 2654
## 1973 2097 1963 1677 1941 2003 1813 2012 1912 2084 2080 2118 2150
## 1974 1608 1503 1548 1382 1731 1798 1779 1887 2004 2077 2092 2051
## 1975 1577 1356 1652 1382 1519 1421 1442 1543 1656 1561 1905 2199
## 1976 1473 1655 1407 1395 1530 1309 1526 1327 1627 1748 1958 2274
## 1977 1648 1401 1411 1403 1394 1520 1528 1643 1515 1685 2000 2215
## 1978 1956 1462 1563 1459 1446 1622 1657 1638 1643 1683 2050 2262
## 1979 1813 1445 1762 1461 1556 1431 1427 1554 1645 1653 2016 2207
## 1980 1665 1361 1506 1360 1453 1522 1460 1552 1548 1827 1737 1941
## 1981 1474 1458 1542 1404 1522 1385 1641 1510 1681 1938 1868 1726
## 1982 1456 1445 1456 1365 1487 1558 1488 1684 1594 1850 1998 2079
## 1983 1494 1057 1218 1168 1236 1076 1174 1139 1427 1487 1483 1513
## 1984 1357 1165 1282 1110 1297 1185 1222 1284 1444 1575 1737 1763
This time series records the monthly number of car drivers killed or seriously injured in Great Britain, referred to below as driver deaths. The series runs in monthly increments from January 1969 to December 1984.
This project analyses predictions made by Meta's Prophet from the monthly time series of UK driver deaths from 1969 to 1984. The accuracy of these predictions will be assessed, and any discrepancies will be analysed to determine why the predictions may not coincide with the actual data.
Step 1
Install the prophet package from CRAN
Step 2
## Loading required package: Rcpp
## Loading required package: rlang
This will load the prophet package into R
Step 3
This installs the remotes package, which is used to install packages that are not on CRAN
Step 4
## Downloading GitHub repo facebook/prophet@v1.1.6-patched-pypi
##
## ── R CMD build ─────────────────────────────────────────────────────────────────
## ✔ checking for file 'C:\Users\ah22123\AppData\Local\Temp\RtmpmymPOw\remotes10c452c87494\facebook-prophet-2a57e9d\R/DESCRIPTION'
## ─ preparing 'prophet': (794ms)
## ✔ checking DESCRIPTION meta-information
## ─ cleaning src
## ─ checking for LF line-endings in source and make files and shell scripts (675ms)
## ─ checking for empty or unneeded directories
## Omitted 'LazyData' from DESCRIPTION
## ─ building 'prophet_1.1.6.tar.gz'
## Warning: Warning: file 'prophet/configure' did not have execute permissions: corrected
##
##
## Warning: package 'prophet' is in use and will not be installed
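This attempts to install the latest release of prophet from GitHub. As the final warning shows, the installation was skipped here because prophet was already loaded in the session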
Step 5
This loads the UKDriverDeaths dataset shown above
Step 6
This sets the start date of the dataset, 1 January 1969
Step 7
This creates a sequence of monthly dates starting from 01/01/1969, linking each data point in the UKDriverDeaths dataset to a month
Step 8
This creates a data frame in the format prophet expects: ds is the date sequence and y is the number of deaths.
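A minimal sketch of Steps 5 to 8, using base R only. The helper names start_date and dates are illustrative, and storing the dates as Date objects built with seq() is an assumption about how the sequence was created.

```r
# Step 5: load the built-in monthly time series (datasets package)
data("UKDriverDeaths")

# Step 6: start date of the series (assumed: first day of January 1969)
start_date <- as.Date("1969-01-01")

# Step 7: one date per observation, stepping one month at a time
dates <- seq(start_date, by = "month", length.out = length(UKDriverDeaths))

# Step 8: prophet expects a data frame with columns named ds and y
UKDriverDeaths.df <- data.frame(ds = dates, y = as.numeric(UKDriverDeaths))

head(UKDriverDeaths.df, 3)
```

The resulting data frame has one row per month, so it can be passed straight to prophet or to lm.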
Step 9
## Disabling weekly seasonality. Run prophet with weekly.seasonality=TRUE to override this.
## Disabling daily seasonality. Run prophet with daily.seasonality=TRUE to override this.
This will fit the prophet model to UKDriverDeaths.df
Step 10
This will create a data frame of future dates for the model to predict.
Step 11
This will use the fitted model m to predict for the future.
Step 12
This will plot the forecast on a graph
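Steps 9 to 12 can be sketched as follows. The horizon of 84 months (seven years, up to the end of 1991) is an assumption chosen to match the forecast period discussed later, and the names df and horizon_months are illustrative. The prophet calls are guarded so the sketch still runs where the package is not installed.

```r
# Rebuild the input data frame so this sketch runs on its own
df <- data.frame(
  ds = seq(as.Date("1969-01-01"), by = "month",
           length.out = length(UKDriverDeaths)),
  y  = as.numeric(UKDriverDeaths)
)

# Assumption: forecast 7 years ahead (84 monthly steps)
horizon_months <- 84

if (requireNamespace("prophet", quietly = TRUE)) {
  # Step 9: fit the prophet model to the historical data
  m <- prophet::prophet(df)

  # Step 10: extend the dates into the future, one row per month
  future <- prophet::make_future_dataframe(m, periods = horizon_months,
                                           freq = "month")

  # Step 11: predict for both the historical and the future dates
  forecast <- predict(m, future)

  # Step 12: plot the observed points with the forecast and its interval
  print(plot(m, forecast))
}
```

Because make_future_dataframe keeps the history by default, the forecast has one row for each of the 192 observed months plus one for each future month.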
Linear Regression Analysis
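# First fit: regress deaths directly on the date column
model <- lm(y ~ ds, data = UKDriverDeaths.df)
# Convert dates to numeric (days since 1970-01-01 for Date objects)
UKDriverDeaths.df$ds_numeric <- as.numeric(UKDriverDeaths.df$ds)
# Refit using the numeric date as the predictor
model <- lm(y ~ ds_numeric, data = UKDriverDeaths.df)
summary(model)
##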
## Call:
## lm(formula = y ~ ds_numeric, data = UKDriverDeaths.df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -501.44 -181.41 -41.26 172.39 870.42
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1865.33464 33.86402 55.083 <2e-16 ***
## ds_numeric -0.07676 0.01110 -6.913 7e-11 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 259.6 on 190 degrees of freedom
## Multiple R-squared: 0.201, Adjusted R-squared: 0.1968
## F-statistic: 47.79 on 1 and 190 DF, p-value: 6.998e-11
This code first fits a linear model with the date as the predictor, then converts the dates into a numeric format for regression and refits the model on the numeric dates. Finally, it produces a summary of the model, which contains important pieces of information including the coefficients, residual statistics, R-squared, F-statistic and p-value.
With these figures I am able to assess how well the model fits my data and understand the relationship between time and the number of driver deaths in the UK.
Residuals are the differences between the observed values and the predicted values (observed minus predicted), so a negative residual means the model overestimated and a positive residual means it underestimated:
Min: The smallest residual is -501.44, meaning the model overestimated this data point by 501.44.
1Q (First Quartile): 25% of the residuals are less than -181.41, so for a quarter of the months the model overestimated by more than 181.41.
Median: The median residual is -41.26, which is close to zero relative to the spread of the residuals, suggesting the model is not strongly biased in either direction.
3Q (Third Quartile): 75% of the residuals are less than 172.39, so for a quarter of the months the model underestimated by more than 172.39.
Max: The largest residual is 870.42, showing the model underpredicted this data point by a significant amount.
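The sign convention for residuals can be checked directly. This is a minimal sketch that rebuilds the data and the regression, assuming the dates are Date objects starting on 1 January 1969; the names df, fit, res, i_min and i_max are illustrative.

```r
# Rebuild the data and the regression on numeric dates
df <- data.frame(
  ds = seq(as.Date("1969-01-01"), by = "month",
           length.out = length(UKDriverDeaths)),
  y  = as.numeric(UKDriverDeaths)
)
df$ds_numeric <- as.numeric(df$ds)  # days since 1970-01-01
fit <- lm(y ~ ds_numeric, data = df)

# A residual is the observed value minus the fitted value
res <- unname(residuals(fit))
fit_vals <- unname(fitted(fit))
stopifnot(isTRUE(all.equal(res, df$y - fit_vals)))

# Inspect the months with the smallest and largest residuals
i_min <- which.min(res)
i_max <- which.max(res)
data.frame(
  month    = df$ds[c(i_min, i_max)],
  observed = df$y[c(i_min, i_max)],
  fitted   = round(fit_vals[c(i_min, i_max)], 1),
  residual = round(res[c(i_min, i_max)], 2)
)
```

In the first row the observed value lies below the fitted value (an overestimate), and in the second it lies above (an underestimate).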
Coefficients
The intercept is the estimated value of y when ds_numeric is 0. It is estimated at 1865.33 with a standard error of 33.86. The t-value of 55.083 measures how many standard errors the estimated coefficient is away from 0. The very small p-value shows that the intercept is significantly different from 0.
ds_numeric is the slope coefficient on the time variable. It indicates the change in deaths for a one-unit increase in time. The estimate is -0.07676, showing a small decline in deaths over time. The standard error (0.01110), t-value (-6.913) and p-value (7e-11) show that this negative trend is statistically significant.
The residual standard error is 259.6, which conveys how far the model's estimates typically stray from the actual data.
R-squared is 0.201, which is not high. This shows there are other factors not included in the model, which will be discussed later.
The model indicates that there is a statistically significant decrease in the number of driver deaths over time. Although the effect is small, it supports the overall prediction of a downward trend. However, the model only explains about 20.1% of the variance in driver deaths. Some of the factors it leaves out include improvements in car technology and the driving rules and regulations being implemented.
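To make the size of the slope easier to read, the reported coefficients can be converted to a yearly change. This sketch assumes that ds_numeric counts days since 1 January 1970, which is what as.numeric() returns for Date objects; the variable names are illustrative.

```r
# Coefficients copied from the summary output
intercept     <- 1865.33464
slope_per_day <- -0.07676   # assumes ds_numeric is measured in days

# Change in the monthly death count for each year that passes
slope_per_year <- slope_per_day * 365.25
round(slope_per_year, 1)

# Trend line at the first and last months of the series
days  <- as.numeric(as.Date(c("1969-01-01", "1984-12-01")))
trend <- intercept + slope_per_day * days
round(trend)
```

Under that assumption the trend line falls by about 28 monthly deaths for each year that passes, from roughly 1893 at the start of the series to roughly 1447 at the end.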
As seen in the graph above, in 1969 the monthly death rate is roughly 1700, giving a rough annual figure of 20,400. This decreased to approximately 1500 by 1984, making the annual figure around 18,000, which is roughly an 11.76% decrease.
It is then forecast to decrease further by 1991 to around 1100, making the annual approximation 13,200, which is a 35.29% decrease since 1969.
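The percentage decreases above can be reproduced with a short calculation. The monthly figures are the approximate values read from the graph, and pct_decrease is an illustrative helper name.

```r
# Percentage decrease from one value to another
pct_decrease <- function(from, to) 100 * (from - to) / from

# Approximate monthly figures read from the graph
monthly <- c(y1969 = 1700, y1984 = 1500, y1991 = 1100)

# Rough annual figures: 12 months at the monthly rate
annual <- monthly * 12

# Decrease relative to 1969, for 1984 and for the 1991 forecast
decrease <- pct_decrease(unname(annual["y1969"]), annual[c("y1984", "y1991")])
round(decrease, 2)
```

This gives 11.76 for 1984 and 35.29 for 1991, matching the figures quoted above.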
This graph shows the number of deaths on GB roads between 1926 and 2016, with an evident significant decrease as the years progress. Looking specifically at my chosen time frame, the graph shows a significant decrease in road deaths between 1969 and 1984. The number of deaths fell from 7,985 in 1966 to just over 5,000 in 1990, a decrease of roughly 36.54%, which closely follows the percentage decrease in my prediction. The real data and the predicted data show a similar overall pattern: although the numbers are not exactly the same, the decline is clearly captured and the percentage decreases are relatively similar.
A company in the UK has launched an all-electric sports vehicle for children. This initiative was formed as part of government efforts to reduce road deaths by training young people in road safety and driving.
The aim was to teach children the basics of driving and road safety as a preventative measure against road deaths.
Furthermore, there have been significant overall improvements in vehicle safety, due to engineering, research and crash data analytics. With a large focus placed on modern engineering, successive car models have incorporated numerous safety features, which enhance the driving experience and ensure increased occupant protection.
These features include:
Seat belts: These have evolved from optional lap belts to mandatory 3-point seat belts, which tighten during crashes, significantly enhancing safety.
Air bags: Since the initial design in the 1950s there has been monumental progression to a more sophisticated design that prevents injury or death upon release. This includes frontal and side air bags.
Electronic Stability Control: Required on all new cars since 2011, it helps avoid loss of vehicle control.
Backup Cameras: Required on all new cars since 2018, they enhance rear visibility to prevent backover crashes.
Blind Spot Warning: Alerts drivers to vehicles in nearby lanes using cameras or sensors, allowing for safer lane changes.
Safety Impact: According to NHTSA statistics, modern cars are significantly safer than the vehicles manufactured in the late 1950s. Safety progress has significantly reduced the risk of fatalities and saved over 600,000 lives between 1960 and 2012.
Road deaths in the UK have seen a 4% year-on-year reduction, to 1,645 deaths. The largest reduction was in motorcyclist deaths at 12%, followed by pedal cyclist deaths at 7% and car user deaths at 5%.
NHTSA: https://www.nhtsa.gov/how-vehicle-safety-has-improved-over-decades
Young Driver: https://www.youngdriver.eu/news/atco-road-safety
Wikipedia, The Free Encyclopedia: https://en.wikipedia.org/wiki/Reported_Road_Casualties_Great_Britain#