Question 5.1

# clear env
rm(list = ls())
# import packages
library(outliers)
set.seed(12)
uscrime <- read.table("/Users/wstamatis/OMSA/ISYE 6501/Week 3 Homework/uscrime.txt",
                   stringsAsFactors = FALSE, header = TRUE)
temps <- read.table("/Users/wstamatis/OMSA/ISYE 6501/Week 3 Homework/temps.txt",
                               stringsAsFactors = FALSE, header = TRUE)
                               
# check for outliers using grubbs.test
grubbs.test(uscrime$Crime)

    Grubbs test for one outlier

data:  uscrime$Crime
G = 2.81290, U = 0.82426, p-value = 0.07887
alternative hypothesis: highest value 1993 is an outlier
# verify with a five-number summary
summary(uscrime$Crime)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  342.0   658.5   831.0   905.1  1057.5  1993.0 
# visualize
plot(uscrime$Crime)

uscrime$Crime[0:10]
 [1]  791 1635  578 1969 1234  682  963 1555  856  705

We see that 1993 is the clearest outlier, with 1969 being a close second.

Question 6.1

I work for an EdTech platform, and tracking active users is a very important metric for us. We want to be able to reach out to schools that aren’t using our product BEFORE we lose them, and applying the CUSUM approach we can detect schools that are at risk of dropping and reach out to them. To be honest, I would experiment a little to find an accurate critical value and threshold. To start out, I would choose a relatively critical value and low threshold. I would gradually increase the threshold until we noticed that we actually started losing accounts, at which point I would adjust to find some stability.

Question 6.2.1

Using July through October daily-high-temperature for Atlanta for 1996. # through 2015, use a CUSUM approach to identify when unofficial summer ends. (i.e., when the weather starts cooling off) each year.

For this question, I chose to average out the temperature for each day across all years. This gave me the best insight into general trends in Atlanta summer. I varied the critical value (and visualized in order to choose a threshold) and determined that, with a C = 4 and a threshold of 85, the weather begins to cool off (on average) on August 25th.

# average the temperature for each day across the years
date_avgs <- rowMeans(temps[c(2:length(temps))], dims=1, na.rm=T)
# compute the mean of the (now averaged) time series
da_mu <- mean(date_avgs)
# compute the difference between the mean of the time series and each "day"
da_minus_mu <- date_avgs - da_mu
# set C
C <- 4
# subtract C from the difference score
damimu_minus_C <- da_minus_mu - C
# create an empty vector for looping
# include an additional zero to help with indexing
precusum <- 0 * damimu_minus_C
cusum <- append(precusum, 0)
# loop through each day, check the cumulative sum, update the 
# index of our accumulator with the appropriate value
for (i in 1:length(damimu_minus_C)) 
     {
  checker <- cusum[i] + damimu_minus_C[i]
  
  ifelse(checker > 0, cusum[i+1] <- checker, cusum[i+1] <- 0) 
}
plot(cusum)

# get the index of the max cusum
which(cusum >= 85)
[1] 56 58 59 60 61
# return the date
temps[56, 1]
[1] "25-Aug"

Question 6.2.2

Use a CUSUM approach to make a judgment of whether Atlanta’s summer

climate has gotten warmer in that time (and if so, when).

For this question, I averaged the temperature for each year, and computed a CUSUM on those averages.

See the Excel document Tab “year_avgs” to see how I calculated the answer to this question.

In short, with a Critical value of 0.5 and a threshold of 3, I determined that, yes, the overall temperature has increased in that time.

The “date_avgs” Tab shows the same thing I already calculated

LS0tCnRpdGxlOiAiSVNZQSA2NTAxIFdlZWsgMyBIb21ld29yayIKb3V0cHV0OiBodG1sX25vdGVib29rCi0tLQoKIyMgUXVlc3Rpb24gNS4xCmBgYHtyfQojIGNsZWFyIGVudgpybShsaXN0ID0gbHMoKSkKCiMgaW1wb3J0IHBhY2thZ2VzCmxpYnJhcnkob3V0bGllcnMpCgoKc2V0LnNlZWQoMTIpCgp1c2NyaW1lIDwtIHJlYWQudGFibGUoIi9Vc2Vycy93c3RhbWF0aXMvT01TQS9JU1lFIDY1MDEvV2VlayAzIEhvbWV3b3JrL3VzY3JpbWUudHh0IiwKICAgICAgICAgICAgICAgICAgIHN0cmluZ3NBc0ZhY3RvcnMgPSBGQUxTRSwgaGVhZGVyID0gVFJVRSkKCnRlbXBzIDwtIHJlYWQudGFibGUoIi9Vc2Vycy93c3RhbWF0aXMvT01TQS9JU1lFIDY1MDEvV2VlayAzIEhvbWV3b3JrL3RlbXBzLnR4dCIsCiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICBzdHJpbmdzQXNGYWN0b3JzID0gRkFMU0UsIGhlYWRlciA9IFRSVUUpCiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAKIyBjaGVjayBmb3Igb3V0bGllcnMgdXNpbmcgZ3J1YmJzLnRlc3QKCmdydWJicy50ZXN0KHVzY3JpbWUkQ3JpbWUpCmBgYAoKYGBge3J9CiMgdmVyaWZ5IHdpdGggYSBmaXZlLW51bWJlciBzdW1tYXJ5CnN1bW1hcnkodXNjcmltZSRDcmltZSkKYGBgCgoKYGBge3J9CiMgdmlzdWFsaXplCnBsb3QodXNjcmltZSRDcmltZSkKYGBgCgpgYGB7cn0KdXNjcmltZSRDcmltZVswOjEwXQoKYGBgCgoKV2Ugc2VlIHRoYXQgMTk5MyBpcyB0aGUgY2xlYXJlc3Qgb3V0bGllciwgd2l0aCAxOTY5IGJlaW5nIGEgY2xvc2Ugc2Vjb25kLgoKCgojIyBRdWVzdGlvbiA2LjEKSSB3b3JrIGZvciBhbiBFZFRlY2ggcGxhdGZvcm0sIGFuZCB0cmFja2luZyBhY3RpdmUgdXNlcnMgaXMgYSB2ZXJ5IGltcG9ydGFudCBtZXRyaWMgZm9yIHVzLiAKV2Ugd2FudCB0byBiZSBhYmxlIHRvIHJlYWNoIG91dCB0byBzY2hvb2xzIHRoYXQgYXJlbid0IHVzaW5nIG91ciBwcm9kdWN0IEJFRk9SRSB3ZSBsb3NlIHRoZW0sCmFuZCBhcHBseWluZyB0aGUgQ1VTVU0gYXBwcm9hY2ggd2UgY2FuIGRldGVjdCBzY2hvb2xzIHRoYXQgYXJlIGF0IHJpc2sgb2YgZHJvcHBpbmcgYW5kIHJlYWNoIG91dCB0byB0aGVtLgpUbyBiZSBob25lc3QsIEkgd291bGQgZXhwZXJpbWVudCBhIGxpdHRsZSB0byBmaW5kIGFuIGFjY3VyYXRlIGNyaXRpY2FsIHZhbHVlIGFuZCB0aHJlc2hvbGQuIFRvIHN0YXJ0IG91dCwKSSB3b3VsZCBjaG9vc2UgYSByZWxhdGl2ZWx5IGNyaXRpY2FsIHZhbHVlIGFuZCBsb3cgdGhyZXNob2xkLiBJIHdvdWxkIGdyYWR1YWxseSBpbmNyZWFzZSB0aGUgdGhyZXNob2xkIHVudGlsCndlIG5vdGljZWQgdGhhdCB3ZSBhY3R1YWxseSBzdGFydGVkIGxvc2luZyBhY2NvdW50cywgYXQgd2hpY2ggcG9pbnQgSSB3b3VsZCBhZGp1c3QgdG8gZmluZCBzb21lIHN0YWJpbGl0eS4KCgojIyBRdWVzdGlvbiA2LjIuMQojIFVzaW5nIEp1bHkgdGhyb3VnaCBPY3RvYmVyIGRhaWx5LWhpZ2gtdGVtcGVyYXR1cmUgZm9yIEF0bGFudGEgZm9yIDE5OTYuICMgdGhyb3VnaCAyMDE1LCB1c2UgYSBDVVNVTSBhcHByb2FjaCB0byBpZGVudGlmeSB3aGVuIHVub2ZmaWNpYWwgc3VtbWVyIGVuZHMuIChpLmUuLCB3aGVuIHRoZSB3ZWF0aGVyIHN0YXJ0cyBjb29saW5nIG9mZikgZWFjaCB5ZWFyLgoKRm9yIHRoaXMgcXVlc3Rpb24sIEkgY2hvc2UgdG8gYXZlcmFnZSBvdXQgdGhlIHRlbXBlcmF0dXJlIGZvciBlYWNoIGRheSBhY3Jvc3MgYWxsIHllYXJzLiBUaGlzIGdhdmUgbWUgdGhlIGJlc3QgaW5zaWdodAppbnRvIGdlbmVyYWwgdHJlbmRzIGluIEF0bGFudGEgc3VtbWVyLiBJIHZhcmllZCB0aGUgY3JpdGljYWwgdmFsdWUgKGFuZCB2aXN1YWxpemVkIGluIG9yZGVyIHRvIGNob29zZSBhIHRocmVzaG9sZCkgYW5kCmRldGVybWluZWQgdGhhdCwgd2l0aCBhIEMgPSA0IGFuZCBhIHRocmVzaG9sZCBvZiA4NSwgdGhlIHdlYXRoZXIgYmVnaW5zIHRvIGNvb2wgb2ZmIChvbiBhdmVyYWdlKSBvbiBBdWd1c3QgMjV0aC4KCgpgYGB7cn0KCiMgYXZlcmFnZSB0aGUgdGVtcGVyYXR1cmUgZm9yIGVhY2ggZGF5IGFjcm9zcyB0aGUgeWVhcnMKZGF0ZV9hdmdzIDwtIHJvd01lYW5zKHRlbXBzW2MoMjpsZW5ndGgodGVtcHMpKV0sIGRpbXM9MSwgbmEucm09VCkKCiMgY29tcHV0ZSB0aGUgbWVhbiBvZiB0aGUgKG5vdyBhdmVyYWdlZCkgdGltZSBzZXJpZXMKZGFfbXUgPC0gbWVhbihkYXRlX2F2Z3MpCgojIGNvbXB1dGUgdGhlIGRpZmZlcmVuY2UgYmV0d2VlbiB0aGUgbWVhbiBvZiB0aGUgdGltZSBzZXJpZXMgYW5kIGVhY2ggImRheSIKZGFfbWludXNfbXUgPC0gZGF0ZV9hdmdzIC0gZGFfbXUKCiMgc2V0IEMKQyA8LSA0CgojIHN1YnRyYWN0IEMgZnJvbSB0aGUgZGlmZmVyZW5jZSBzY29yZQpkYW1pbXVfbWludXNfQyA8LSBkYV9taW51c19tdSAtIEMKCiMgY3JlYXRlIGFuIGVtcHR5IHZlY3RvciBmb3IgbG9vcGluZwojIGluY2x1ZGUgYW4gYWRkaXRpb25hbCB6ZXJvIHRvIGhlbHAgd2l0aCBpbmRleGluZwpwcmVjdXN1bSA8LSAwICogZGFtaW11X21pbnVzX0MKY3VzdW0gPC0gYXBwZW5kKHByZWN1c3VtLCAwKQoKIyBsb29wIHRocm91Z2ggZWFjaCBkYXksIGNoZWNrIHRoZSBjdW11bGF0aXZlIHN1bSwgdXBkYXRlIHRoZSAKIyBpbmRleCBvZiBvdXIgYWNjdW11bGF0b3Igd2l0aCB0aGUgYXBwcm9wcmlhdGUgdmFsdWUKZm9yIChpIGluIDE6bGVuZ3RoKGRhbWltdV9taW51c19DKSkgCiAgICAgewogIGNoZWNrZXIgPC0gY3VzdW1baV0gKyBkYW1pbXVfbWludXNfQ1tpXQogIAogIGlmZWxzZShjaGVja2VyID4gMCwgY3VzdW1baSsxXSA8LSBjaGVja2VyLCBjdXN1bVtpKzFdIDwtIDApIAp9CgoKcGxvdChjdXN1bSkKYGBgCgoKCmBgYHtyfQojIGdldCB0aGUgaW5kZXggb2YgdGhlIG1heCBjdXN1bQp3aGljaChjdXN1bSA+PSA4NSkKYGBgCgoKYGBge3J9CiMgcmV0dXJuIHRoZSBkYXRlCnRlbXBzWzU2LCAxXQpgYGAKCiMjIFF1ZXN0aW9uIDYuMi4yCiMgVXNlIGEgQ1VTVU0gYXBwcm9hY2ggdG8gbWFrZSBhIGp1ZGdtZW50IG9mIHdoZXRoZXIgQXRsYW50YeKAmXMgc3VtbWVyCiMgY2xpbWF0ZSBoYXMgZ290dGVuIHdhcm1lciBpbiB0aGF0IHRpbWUgKGFuZCBpZiBzbywgd2hlbikuCgpGb3IgdGhpcyBxdWVzdGlvbiwgSSBhdmVyYWdlZCB0aGUgdGVtcGVyYXR1cmUgZm9yIGVhY2ggeWVhciwgYW5kIGNvbXB1dGVkIGEgQ1VTVU0gb24gdGhvc2UgYXZlcmFnZXMuCgpTZWUgdGhlIEV4Y2VsIGRvY3VtZW50IFRhYiAieWVhcl9hdmdzIiB0byBzZWUgaG93IEkgY2FsY3VsYXRlZCB0aGUgYW5zd2VyIHRvIHRoaXMgcXVlc3Rpb24uCgpJbiBzaG9ydCwgd2l0aCBhIENyaXRpY2FsIHZhbHVlIG9mIDAuNSBhbmQgYSB0aHJlc2hvbGQgb2YgMywgSSBkZXRlcm1pbmVkIHRoYXQsIHllcywgdGhlIG92ZXJhbGwgdGVtcGVyYXR1cmUgaGFzIGluY3JlYXNlZCBpbiB0aGF0IHRpbWUuCgpUaGUgImRhdGVfYXZncyIgVGFiIHNob3dzIHRoZSBzYW1lIHRoaW5nIEkgYWxyZWFkeSBjYWxjdWxhdGVkCg==