Question 1

Register an application with the GitHub API here: https://github.com/settings/applications.

Access the API to get information on your instructor's repositories (hint: this is the URL you want: “https://api.github.com/users/jtleek/repos”).

Use this data to find the time that the datasharing repo was created. What time was it created?

This tutorial may be useful (https://github.com/hadley/httr/blob/master/demo/oauth2-github.r).

You may also need to run the code in the base R console rather than RStudio.


Answer


Loading packages…

library(httr)


Find OAuth settings for GitHub: http://developer.github.com/v3/oauth/

oauth_endpoints("github")
<oauth_endpoint>
 authorize: https://github.com/login/oauth/authorize
 access:    https://github.com/login/oauth/access_token


Register an application at https://github.com/settings/applications

myapp <- oauth_app("github",
  key = "75ffc4989df8001de43a",
  secret = "389877827ca7031f4586a37206816ec5152088dc")
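
Hard-coding the client ID and secret in a script makes them easy to leak. A minimal sketch, assuming the credentials are stored in environment variables with the hypothetical names GITHUB_KEY and GITHUB_SECRET (neither name comes from the quiz or from httr), reads them at run time and fails early if one is missing:

```r
# Hypothetical helper: read a credential from an environment variable
# and stop with a clear message if it is not set.
read_credential <- function(var_name) {
  value <- Sys.getenv(var_name, unset = "")
  if (!nzchar(value)) {
    stop("Environment variable ", var_name, " is not set")
  }
  value
}

# Demo values only; in practice these would be set outside the script
# (for example in ~/.Renviron), never in the code itself.
Sys.setenv(GITHUB_KEY = "demo-key", GITHUB_SECRET = "demo-secret")

app_key <- read_credential("GITHUB_KEY")
app_secret <- read_credential("GITHUB_SECRET")
# These could then be passed on as:
# oauth_app("github", key = app_key, secret = app_secret)
```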


Get OAuth credentials

github_token <- oauth2.0_token(oauth_endpoints("github"), myapp)


Use API

req <- GET("https://api.github.com/users/jtleek/repos", config(token = github_token))
stop_for_status(req)
output <- content(req)


Find “datasharing”

datashare <- which(sapply(output, FUN=function(X) "datasharing" %in% X))
datashare
[1] 15
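
The %in% test above scans every field of each repository, so it could also match a repo in which some other field happens to equal "datasharing". A minimal sketch of matching on the name field only, using a small made-up list in place of the real API response (the repo "alpha" and its timestamp are invented for illustration):

```r
# Toy stand-in for content(req): a list of repos, each a named list.
repos <- list(
  list(name = "alpha", created_at = "2012-01-01T00:00:00Z"),
  list(name = "datasharing", created_at = "2013-11-07T13:25:07Z")
)

# Pull out just the name of each repo, then locate the one we want.
repo_names <- vapply(repos, function(repo) repo$name, character(1))
idx <- match("datasharing", repo_names)

idx
repos[[idx]]$created_at
```

Looking the repo up by name also avoids relying on a fixed position in the list, which can change as repositories are added.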


Find the time that the datasharing repo was created.

list(output[[datashare]]$name, output[[datashare]]$created_at)
[[1]]
[1] "datasharing"

[[2]]
[1] "2013-11-07T13:25:07Z"
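
The created_at value is an ISO 8601 timestamp in UTC (the trailing Z). If a date-time object is more convenient than a string, it can be parsed with base R. A minimal sketch, with variable names chosen here for illustration:

```r
created_at <- "2013-11-07T13:25:07Z"

# Parse the string as UTC; the literal T and Z are matched by the format.
created <- as.POSIXct(created_at, format = "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")

format(created, "%Y-%m-%d")  # date part
format(created, "%H:%M:%S")  # time part
```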


Question 2

The sqldf package allows for execution of SQL commands on R data frames. We will use the sqldf package to practice the queries we might send with the dbSendQuery() command in RMySQL.

Download the American Community Survey data and load it into an R object called acs: https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06pid.csv

Which of the following commands will select only the data for the probability weights pwgtp1 with ages less than 50?


Answer

Loading packages…

library(sqldf)


Downloading file…

fileUrl <- "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06pid.csv"
download.file(fileUrl, destfile = "acs.csv")
trying URL 'https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06pid.csv'
Content type 'text/csv' length 11462469 bytes (10.9 MB)
==================================================
downloaded 10.9 MB


Loading data…

acs <- read.csv("acs.csv")
head(acs)


Finding answer…


  • Option A:
sqldf("select * from acs")


  • Option B:
sqldf("select * from acs where AGEP < 50 and pwgtp1")


  • Option C:
sqldf("select pwgtp1 from acs")


  • Option D:
sqldf("select pwgtp1 from acs where AGEP < 50")


Result:

Option D: sqldf("select pwgtp1 from acs where AGEP < 50")
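
For comparison, the same selection can be written with base R subsetting. A minimal sketch on a small made-up data frame (the values are invented; only the column names AGEP and pwgtp1 come from the survey data):

```r
acs_toy <- data.frame(
  AGEP   = c(43, 16, 80, 49, 50),
  pwgtp1 = c(87, 53, 145, 24, 68)
)

# Equivalent of: select pwgtp1 from acs where AGEP < 50
# drop = FALSE keeps the result as a one-column data frame, as sqldf returns.
under_50 <- acs_toy[acs_toy$AGEP < 50, "pwgtp1", drop = FALSE]
under_50
```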



Question 3

Using the same data frame you created in the previous problem, what is the equivalent function to unique(acs$AGEP)?


Answer

Z <- unique(acs$AGEP)


  • Option A:
A <- sqldf("select AGEP where unique from acs")
Error in result_create(conn@ptr, statement) : near "unique": syntax error


  • Option B:
B <- sqldf("select distinct AGEP from acs")
identical(Z, B$AGEP)
[1] TRUE


  • Option C:
C <- sqldf("select distinct pwgtp1 from acs")
identical(Z, C$pwgtp1)
[1] FALSE


  • Option D:
D <- sqldf("select unique AGEP from acs")
Error in result_create(conn@ptr, statement) : near "unique": syntax error


Result:

Option B: sqldf("select distinct AGEP from acs")
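
The identical() check for Option B passes only because the query happens to return the ages in the same order as unique(). SQL does not guarantee row order without ORDER BY, so an order-insensitive comparison is a safer test. A minimal sketch with made-up ages, where a reversed vector stands in for a DISTINCT result that comes back in a different order:

```r
ages <- c(43L, 42L, 16L, 43L, 16L, 80L)

z <- unique(ages)        # values in order of first appearance
distinct_ages <- rev(z)  # same values, different order

identical(z, distinct_ages)              # order-sensitive comparison
setequal(z, distinct_ages)               # ignores order
identical(sort(z), sort(distinct_ages))  # ignores order, still checks type
```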




Question 4

How many characters are in the 10th, 20th, 30th and 100th lines of HTML from this page:

http://biostat.jhsph.edu/~jleek/contact.html

(Hint: the nchar() function in R may be helpful)

Answer

Fetching data…

htmlUrl <- url("http://biostat.jhsph.edu/~jleek/contact.html")
htmlCode <- readLines(htmlUrl)
close(htmlUrl)


Viewing data…

head(htmlCode)
[1] "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">"
[2] ""                                                                                                                 
[3] "<html xmlns=\"http://www.w3.org/1999/xhtml\" xml:lang=\"en\" lang=\"en\">"                                        
[4] ""                                                                                                                 
[5] "<head>"                                                                                                           
[6] ""                                                                                                                 


Finding answer…

c(nchar(htmlCode[10]), nchar(htmlCode[20]), nchar(htmlCode[30]), nchar(htmlCode[100]))
[1] 45 31  7 25
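
Because nchar() is vectorized, the four calls can be collapsed into one by indexing with a vector of line numbers. A minimal sketch on a short made-up character vector (not the real page):

```r
lines_toy <- c("<html>", "", "<head>", "<title>Contact</title>", "</head>")

# One call returns the length of each selected line.
line_lengths <- nchar(lines_toy[c(1, 2, 4)])
line_lengths
```

With the real data this would be nchar(htmlCode[c(10, 20, 30, 100)]).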




Question 5

Read this data set into R and report the sum of the numbers in the fourth of the nine columns.

https://d396qusza40orc.cloudfront.net/getdata%2Fwksst8110.for

Original source of the data: http://www.cpc.ncep.noaa.gov/data/indices/wksst8110.for

(Hint: this is a fixed width file format)


Answer

Fetching data…

fileUrl <- "https://d396qusza40orc.cloudfront.net/getdata%2Fwksst8110.for"
SST <- read.fwf(fileUrl, skip=4, widths=c(12, 7, 4, 9, 4, 9, 4, 9, 4))


Viewing file…

head(SST)


Finding answer…

sum(SST[,4])
[1] 32426.7
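
In read.fwf(), a negative width skips that many characters, which can be used to drop padding between fields instead of folding it into a neighbouring column. A minimal sketch on a few made-up lines with a simplified layout (a 9-character date, then pairs of 4-character numbers separated by single spaces; this is not the exact layout of the quiz file, and the column names are hypothetical):

```r
fwf_lines <- c(
  "03JAN1990 23.4-0.4 25.1-0.3",
  "10JAN1990 23.4-0.8 25.2-0.3",
  "17JAN1990 24.2-0.3 25.3-0.3"
)

# Widths: date (9), skip 1, two numbers (4 + 4), skip 1, two numbers (4 + 4).
sst_toy <- read.fwf(
  textConnection(fwf_lines),
  widths = c(9, -1, 4, 4, -1, 4, 4),
  col.names = c("week", "sst1", "anom1", "sst2", "anom2")
)

# Sum one numeric column, as in the quiz answer.
sum(sst_toy$sst1)
```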



END
