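library(ltsk)   # lk() and the epa_cl / ex example data
library(gstat)  # variogram()
library(fields) # quilt.plot()
# Read the case locations and the PM2.5 monitor observations
case_data <- read.table("/Users/katyhaller/Library/Mobile Documents/com~apple~CloudDocs/R Directory/EPH 727/EPH 727/Session_13_Lab_7/TimeSpace/patient_data.csv", header=TRUE, sep=",")
data <- read.table("/Users/katyhaller/Library/Mobile Documents/com~apple~CloudDocs/R Directory/EPH 727/EPH 727/Session_13_Lab_7/TimeSpace/cl_pm25_data.csv", header=TRUE, sep=",")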
obs <- as.matrix(data[c(2,3,7)])
colnames(obs) <- c("x","y","z")
data.query <- as.matrix(case_data[c(2,3)])
summary(data.query)
##     x_coord          y_coord
##  Min.   :-82.54   Min.   :40.46
##  1st Qu.:-81.98   1st Qu.:40.77
##  Median :-81.50   Median :41.04
##  Mean   :-81.55   Mean   :41.06
##  3rd Qu.:-81.12   3rd Qu.:41.32
##  Max.   :-80.71   Max.   :41.92
colnames(data.query) <- c("x","y")
plot(obs[,1],obs[,2], main="Map of Observation Locations", xlab="Longitude", ylab="Latitude", cex=0.1)
points(data.query[,1],data.query[,2], col="red", pch=1)
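# Local kriging of z at the case locations, using neighbors within distance th
nk <- lk(data.query, obs, th=0.05, xcoord="x", ycoord="y", zcoord="z", vlen=10)
# Empirical variogram of log PM2.5 over the observation locations
nk_var <- variogram(log(pm25)~1, loc=~along+alat, data=data)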
plot(nk_var)
data(epa_cl)
data(ex)
quilt.plot(ex1.data$x,ex1.data$y,ex1.data$z,nrow=20,ncol=20,main="Observed PM[2.5] 01/01/2005")
system.time(out <- lk(ex1.grid,ex1.data,0.1,zcoord='z'))
##    user  system elapsed
##   0.023   0.011   2.182
quilt.plot(ex1.grid$x,ex1.grid$y,out$krig,nrow=50,ncol=50,main='predicted PM[2.5] 01/01/2005')
quilt.plot(ex1.grid$x,ex1.grid$y,out$krig,nrow=20,ncol=50,main='predicted PM[2.5] 01/01/2005')
Spatiotemporal mismatch is a significant challenge for researchers investigating the relationship between environmental exposures (such as PM2.5) and health outcomes. Typically, exposure must be estimated either by interpolating the monitoring data (as in this lab) or by aggregating it (as in the first several labs). Interpolation adds uncertainty to the exposure estimates, while aggregation assumes that everyone in a given geographic area has the same exposure to an environmental source. Generally, uncertainty increases with distance from an observed measurement. In addition, when too few neighboring observations fall within the distance threshold, interpolation cannot produce an estimate, which leaves substantial missingness. In practice, neither method is optimal, because both often lead to exposure misclassification. Studies of the same exposure and outcome may therefore differ considerably in both statistical significance and magnitude of effect, depending on how much exposure misclassification the chosen method of resolving the mismatch introduces. Exposure uncertainty is one of the most significant factors underlying inconsistent findings on the adverse impacts of criteria pollutants.
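To make the neighbor-threshold point concrete, the following is a minimal base-R sketch of how a distance threshold can leave some locations without an estimate. Everything in it is hypothetical and for illustration only: the toy coordinates, the helper `neighbor_summary`, the threshold of 0.15, and the minimum neighbor count `min_n` are assumptions, not values from the lab data or the internals of `lk()`. Distances are plain Euclidean distances in coordinate units, which is a simplification.

```r
# Hypothetical toy data: five monitors and three query locations (degrees).
monitors <- data.frame(x = c(-81.6, -81.5, -81.4, -81.5, -80.0),
                       y = c( 41.0,  41.1,  41.0,  40.9,  42.5))
queries  <- data.frame(x = c(-81.5, -81.45, -82.5),
                       y = c( 41.0,  41.05,  40.5))

# For each query point, count the monitors within distance `th` and
# record the distance to the nearest monitor.
neighbor_summary <- function(queries, monitors, th) {
  # Pairwise Euclidean distances: rows are queries, columns are monitors.
  d <- sqrt(outer(queries$x, monitors$x, "-")^2 +
            outer(queries$y, monitors$y, "-")^2)
  data.frame(n_within = rowSums(d <= th),
             nearest  = apply(d, 1, min))
}

res <- neighbor_summary(queries, monitors, th = 0.15)

# Flag query points with too few neighbors to interpolate.
# min_n is an illustrative cutoff, not a setting taken from lk().
min_n <- 3
res$estimable <- res$n_within >= min_n
print(res)
```

In this toy setup only the first query point has enough monitors nearby. The other two would be left without an estimate, and the `nearest` column shows how far the third point is from any measurement, which is where interpolation uncertainty would be largest.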