Question 7.1 - Exponential Smoothing

Given how volatile the oil market is, I would like to use exponential smoothing to find out if industries throughout the United States will have enough oil in their inventory to meet their customers’ expectations.

Question 7.2 - End of Summer Analysis

temps <- read.table("C:/Users/moham/Downloads/Georgia Tech University/WK3/week 3 Homework-Summer/week 3 data-summer/data 7.2/temps.txt", header=TRUE)

end_of_summer <- rep(NA, 20)
for (i in 1:20) {
  year_ts <- ts(temps[, i+1])
  hw <- HoltWinters(year_ts, beta=FALSE, gamma=FALSE)
  smoothed <- hw$fitted[,1]
  threshold <- 85
  below <- smoothed < threshold
  for (j in 1:(length(below)-6)) {
    if (all(below[j:(j+6)])) {
      end_of_summer[i] <- j
      break
    }
  }
}

years <- 1996:2015
plot(years, end_of_summer, type="b",
     xlab="Year", ylab="Day of Season (1=Jul 1)",
     main="Unofficial End of Summer Over 20 Years",
     pch=19, col="steelblue")
abline(lm(end_of_summer ~ years), col="red", lwd=2)

summary(lm(end_of_summer ~ years))
## 
## Call:
## lm(formula = end_of_summer ~ years)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -62.735  -7.764   6.281  20.580  27.183 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)  228.71053 2081.32040   0.110    0.914
## years         -0.08195    1.03780  -0.079    0.938
## 
## Residual standard error: 26.76 on 18 degrees of freedom
## Multiple R-squared:  0.0003463,  Adjusted R-squared:  -0.05519 
## F-statistic: 0.006236 on 1 and 18 DF,  p-value: 0.9379

Conclusion: There is no statistically significant evidence that the unofficial end of summer has shifted over 20 years (p = 0.938).

Question 8.1 - Linear Regression Situation

I want to predict my electricity bill for my apartment in San Fernando Valley, Los Angeles County to save money next year. Predictors: average monthly temperature, number of people, work from home days, billing cycle length, and apartment square footage.

Question 8.2 - US Crime Rate Prediction

uscrime <- read.table("C:/Users/moham/Downloads/Georgia Tech University/WK3/week 3 Homework-Summer/week 3 data-summer/data 8.2/uscrime.txt", header=TRUE)

model <- lm(Crime ~ ., data=uscrime)
summary(model)
## 
## Call:
## lm(formula = Crime ~ ., data = uscrime)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -395.74  -98.09   -6.69  112.99  512.67 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -5.984e+03  1.628e+03  -3.675 0.000893 ***
## M            8.783e+01  4.171e+01   2.106 0.043443 *  
## So          -3.803e+00  1.488e+02  -0.026 0.979765    
## Ed           1.883e+02  6.209e+01   3.033 0.004861 ** 
## Po1          1.928e+02  1.061e+02   1.817 0.078892 .  
## Po2         -1.094e+02  1.175e+02  -0.931 0.358830    
## LF          -6.638e+02  1.470e+03  -0.452 0.654654    
## M.F          1.741e+01  2.035e+01   0.855 0.398995    
## Pop         -7.330e-01  1.290e+00  -0.568 0.573845    
## NW           4.204e+00  6.481e+00   0.649 0.521279    
## U1          -5.827e+03  4.210e+03  -1.384 0.176238    
## U2           1.678e+02  8.234e+01   2.038 0.050161 .  
## Wealth       9.617e-02  1.037e-01   0.928 0.360754    
## Ineq         7.067e+01  2.272e+01   3.111 0.003983 ** 
## Prob        -4.855e+03  2.272e+03  -2.137 0.040627 *  
## Time        -3.479e+00  7.165e+00  -0.486 0.630708    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 209.1 on 31 degrees of freedom
## Multiple R-squared:  0.8031, Adjusted R-squared:  0.7078 
## F-statistic: 8.429 on 15 and 31 DF,  p-value: 3.539e-07
new_city <- data.frame(
  M=14.0, So=0, Ed=10.0, Po1=12.0, Po2=15.5,
  LF=0.640, M.F=94.0, Pop=150, NW=1.1,
  U1=0.120, U2=3.6, Wealth=3200,
  Ineq=20.1, Prob=0.04, Time=39.0
)

predict(model, new_city, interval="prediction")
##        fit       lwr      upr
## 1 155.4349 -1370.845 1681.715
summary(uscrime[, -16])
##        M               So               Ed             Po1       
##  Min.   :11.90   Min.   :0.0000   Min.   : 8.70   Min.   : 4.50  
##  1st Qu.:13.00   1st Qu.:0.0000   1st Qu.: 9.75   1st Qu.: 6.25  
##  Median :13.60   Median :0.0000   Median :10.80   Median : 7.80  
##  Mean   :13.86   Mean   :0.3404   Mean   :10.56   Mean   : 8.50  
##  3rd Qu.:14.60   3rd Qu.:1.0000   3rd Qu.:11.45   3rd Qu.:10.45  
##  Max.   :17.70   Max.   :1.0000   Max.   :12.20   Max.   :16.60  
##       Po2               LF              M.F              Pop        
##  Min.   : 4.100   Min.   :0.4800   Min.   : 93.40   Min.   :  3.00  
##  1st Qu.: 5.850   1st Qu.:0.5305   1st Qu.: 96.45   1st Qu.: 10.00  
##  Median : 7.300   Median :0.5600   Median : 97.70   Median : 25.00  
##  Mean   : 8.023   Mean   :0.5612   Mean   : 98.30   Mean   : 36.62  
##  3rd Qu.: 9.700   3rd Qu.:0.5930   3rd Qu.: 99.20   3rd Qu.: 41.50  
##  Max.   :15.700   Max.   :0.6410   Max.   :107.10   Max.   :168.00  
##        NW              U1                U2            Wealth    
##  Min.   : 0.20   Min.   :0.07000   Min.   :2.000   Min.   :2880  
##  1st Qu.: 2.40   1st Qu.:0.08050   1st Qu.:2.750   1st Qu.:4595  
##  Median : 7.60   Median :0.09200   Median :3.400   Median :5370  
##  Mean   :10.11   Mean   :0.09547   Mean   :3.398   Mean   :5254  
##  3rd Qu.:13.25   3rd Qu.:0.10400   3rd Qu.:3.850   3rd Qu.:5915  
##  Max.   :42.30   Max.   :0.14200   Max.   :5.800   Max.   :6890  
##       Ineq            Prob              Time      
##  Min.   :12.60   Min.   :0.00690   Min.   :12.20  
##  1st Qu.:16.55   1st Qu.:0.03270   1st Qu.:21.60  
##  Median :17.60   Median :0.04210   Median :25.80  
##  Mean   :19.40   Mean   :0.04709   Mean   :26.60  
##  3rd Qu.:22.75   3rd Qu.:0.05445   3rd Qu.:30.45  
##  Max.   :27.60   Max.   :0.11980   Max.   :44.00

Conclusion: The predicted crime rate is 155 crimes per 100,000 people. The model explains 80% of variance (R²=0.803). The wide prediction interval reflects high uncertainty due to Po2 and LF sitting at the edge of training data.