Given how volatile the oil market is, I would like to use exponential smoothing to find out if industries throughout the United States will have enough oil in their inventory to meet their customers’ expectations.
temps <- read.table("C:/Users/moham/Downloads/Georgia Tech University/WK3/week 3 Homework-Summer/week 3 data-summer/data 7.2/temps.txt", header=TRUE)
end_of_summer <- rep(NA, 20)
for (i in 1:20) {
year_ts <- ts(temps[, i+1])
hw <- HoltWinters(year_ts, beta=FALSE, gamma=FALSE)
smoothed <- hw$fitted[,1]
threshold <- 85
below <- smoothed < threshold
for (j in 1:(length(below)-6)) {
if (all(below[j:(j+6)])) {
end_of_summer[i] <- j
break
}
}
}
years <- 1996:2015
plot(years, end_of_summer, type="b",
xlab="Year", ylab="Day of Season (1=Jul 1)",
main="Unofficial End of Summer Over 20 Years",
pch=19, col="steelblue")
abline(lm(end_of_summer ~ years), col="red", lwd=2)
summary(lm(end_of_summer ~ years))
##
## Call:
## lm(formula = end_of_summer ~ years)
##
## Residuals:
## Min 1Q Median 3Q Max
## -62.735 -7.764 6.281 20.580 27.183
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 228.71053 2081.32040 0.110 0.914
## years -0.08195 1.03780 -0.079 0.938
##
## Residual standard error: 26.76 on 18 degrees of freedom
## Multiple R-squared: 0.0003463, Adjusted R-squared: -0.05519
## F-statistic: 0.006236 on 1 and 18 DF, p-value: 0.9379
Conclusion: There is no statistically significant evidence that the unofficial end of summer has shifted over 20 years (p = 0.938).
I want to predict my electricity bill for my apartment in San Fernando Valley, Los Angeles County to save money next year. Predictors: average monthly temperature, number of people, work from home days, billing cycle length, and apartment square footage.
uscrime <- read.table("C:/Users/moham/Downloads/Georgia Tech University/WK3/week 3 Homework-Summer/week 3 data-summer/data 8.2/uscrime.txt", header=TRUE)
model <- lm(Crime ~ ., data=uscrime)
summary(model)
##
## Call:
## lm(formula = Crime ~ ., data = uscrime)
##
## Residuals:
## Min 1Q Median 3Q Max
## -395.74 -98.09 -6.69 112.99 512.67
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -5.984e+03 1.628e+03 -3.675 0.000893 ***
## M 8.783e+01 4.171e+01 2.106 0.043443 *
## So -3.803e+00 1.488e+02 -0.026 0.979765
## Ed 1.883e+02 6.209e+01 3.033 0.004861 **
## Po1 1.928e+02 1.061e+02 1.817 0.078892 .
## Po2 -1.094e+02 1.175e+02 -0.931 0.358830
## LF -6.638e+02 1.470e+03 -0.452 0.654654
## M.F 1.741e+01 2.035e+01 0.855 0.398995
## Pop -7.330e-01 1.290e+00 -0.568 0.573845
## NW 4.204e+00 6.481e+00 0.649 0.521279
## U1 -5.827e+03 4.210e+03 -1.384 0.176238
## U2 1.678e+02 8.234e+01 2.038 0.050161 .
## Wealth 9.617e-02 1.037e-01 0.928 0.360754
## Ineq 7.067e+01 2.272e+01 3.111 0.003983 **
## Prob -4.855e+03 2.272e+03 -2.137 0.040627 *
## Time -3.479e+00 7.165e+00 -0.486 0.630708
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 209.1 on 31 degrees of freedom
## Multiple R-squared: 0.8031, Adjusted R-squared: 0.7078
## F-statistic: 8.429 on 15 and 31 DF, p-value: 3.539e-07
new_city <- data.frame(
M=14.0, So=0, Ed=10.0, Po1=12.0, Po2=15.5,
LF=0.640, M.F=94.0, Pop=150, NW=1.1,
U1=0.120, U2=3.6, Wealth=3200,
Ineq=20.1, Prob=0.04, Time=39.0
)
predict(model, new_city, interval="prediction")
## fit lwr upr
## 1 155.4349 -1370.845 1681.715
summary(uscrime[, -16])
## M So Ed Po1
## Min. :11.90 Min. :0.0000 Min. : 8.70 Min. : 4.50
## 1st Qu.:13.00 1st Qu.:0.0000 1st Qu.: 9.75 1st Qu.: 6.25
## Median :13.60 Median :0.0000 Median :10.80 Median : 7.80
## Mean :13.86 Mean :0.3404 Mean :10.56 Mean : 8.50
## 3rd Qu.:14.60 3rd Qu.:1.0000 3rd Qu.:11.45 3rd Qu.:10.45
## Max. :17.70 Max. :1.0000 Max. :12.20 Max. :16.60
## Po2 LF M.F Pop
## Min. : 4.100 Min. :0.4800 Min. : 93.40 Min. : 3.00
## 1st Qu.: 5.850 1st Qu.:0.5305 1st Qu.: 96.45 1st Qu.: 10.00
## Median : 7.300 Median :0.5600 Median : 97.70 Median : 25.00
## Mean : 8.023 Mean :0.5612 Mean : 98.30 Mean : 36.62
## 3rd Qu.: 9.700 3rd Qu.:0.5930 3rd Qu.: 99.20 3rd Qu.: 41.50
## Max. :15.700 Max. :0.6410 Max. :107.10 Max. :168.00
## NW U1 U2 Wealth
## Min. : 0.20 Min. :0.07000 Min. :2.000 Min. :2880
## 1st Qu.: 2.40 1st Qu.:0.08050 1st Qu.:2.750 1st Qu.:4595
## Median : 7.60 Median :0.09200 Median :3.400 Median :5370
## Mean :10.11 Mean :0.09547 Mean :3.398 Mean :5254
## 3rd Qu.:13.25 3rd Qu.:0.10400 3rd Qu.:3.850 3rd Qu.:5915
## Max. :42.30 Max. :0.14200 Max. :5.800 Max. :6890
## Ineq Prob Time
## Min. :12.60 Min. :0.00690 Min. :12.20
## 1st Qu.:16.55 1st Qu.:0.03270 1st Qu.:21.60
## Median :17.60 Median :0.04210 Median :25.80
## Mean :19.40 Mean :0.04709 Mean :26.60
## 3rd Qu.:22.75 3rd Qu.:0.05445 3rd Qu.:30.45
## Max. :27.60 Max. :0.11980 Max. :44.00
Conclusion: The predicted crime rate is 155 crimes per 100,000 people. The model explains 80% of variance (R²=0.803). The wide prediction interval reflects high uncertainty due to Po2 and LF sitting at the edge of training data.