In the second part of this lab you will download hourly temperature measurements from weather stations within the boundary of the Landsat image to compare observed temperatures against the estimated temperatures from Landsat on July 30th, 2017. You will then use R to extract the temperatures from the land surface temperatures raster you created last week. These observed and estimated temperatures will be compared visually and statistically to hypothesize what time the Landsat image was captured. Finally, you will create a properly formatted R Notebook (modified from this version) or PDF lab report to present your findings.
Given the varying levels of familiarity with R, much of the coding for this lab will be done for you. However, make sure that you carefully walk through each step and understand what is being done because you will have less code written for you in future labs.
If you don’t already have R, you can download it at http://www.r-project.org.
We also recommend using R Studio as an integrated development environment (IDE) for R: https://www.rstudio.com/.
Open R (or R Studio) and install the necessary packages. R relies on externally written “packages” that carry many of the functions needed. You will be using several packages for this lab. For each package you will need to install them by typing the following code into the command console: install.packages("raster"), where raster is the name of the first package.
The cool thing about an R Notebook is that you can view it in any browser, but you can also open it in R. You can download this document as an Rmd file by clicking on the button that says “Code” in the top right-hand corner. Then select Download Rmd and open the file from R.
Here are a few helpful notes about R:
R allows you to run either the entire script, or just selected lines at a time. This notebook does a lot of the work for you. It has all the elements you need to modify the script,
Since this is an R Markdown Notebook, when you execute code within the notebook, the results appear beneath the code. Try executing the first code chunk by clicking the Run button within the chunk or by placing your cursor inside it and pressing Ctrl+Shift+Enter.
Add a new chunk by clicking the Insert Chunk button on the toolbar or by pressing Ctrl+Alt+I.
When you save the notebook, an HTML file containing the code and output will be saved alongside it (click the Preview button or press Ctrl+Shift+K to preview the HTML file).
If you want to learn more about best practices for code writing, you can check out this style guide: http://adv-r.had.co.nz/Style.html.
Setup
This code chunk loads the packages used in this notebook.
library(raster)
library(rgdal)
library(data.table)
library(sp)
library(lubridate)
library(tidyr)
library(dplyr)
library(ggplot2)
library(plotly)
library(knitr)
knitr::opts_knit$set(echo=TRUE, warning=FALSE, message=FALSE)
This is also a good place to put helper functions. Below is a function that we will use to create a correlation matrix of the different temperatures against the estimated land surface temperatures from Landsat.
cor.matrix<-function(x,data=NA){
# panel.hist function adds the histogram
panel.hist <- function(x, ...)
{
usr <- par("usr"); on.exit(par(usr))
par(usr = c(usr[1:2], 0, 1.5) )
h <- hist(x, plot = FALSE)
breaks <- h$breaks; nB <- length(breaks)
y <- h$counts; y <- y/max(y)
rect(breaks[-nB], 0, breaks[-1], y, col="cyan", ...)
box()
}
panel.cor <- function(x, y, digits=2, prefix="", cex.cor)
{
usr <- par("usr"); on.exit(par(usr))
par(usr = c(0, 1, 0, 1))
r <- abs(cor(x, y,method="spearman"))
txt <- format(c(r, 0.123456789), digits=digits)[1]
txt <- paste(prefix, txt, sep="")
if(missing(cex.cor)) cex <- 0.8/strwidth(txt)
text(0.5, 0.5, txt, cex = cex)
}
if (class(x)=="formula"){
x<-model.frame(x,data=data)
}
pairs(x,
lower.panel=panel.smooth,
upper.panel=panel.cor,
diag.panel=panel.hist,
cex.labels = 1, font.labels=2)
}
Loading Files
Here the landsurfacetemp.tif file is loaded. If you had trouble converting your temperature values from Kelvin to Fahrenheit you can do that here. Then the weather and weather station data are loaded and merged.
To get the weather and weather station data, follow these steps:
Access the Daily Temperature and Precipitation Reports - Data Tables from NOAA at NOAA Temperature Tables. Click on “Data Access” and select GHCN Daily Observations. It may be better to use Chrome to view the NOAA maps.
Click on the Time-Related Maps tab. Select Hourly/Sub-Hourly.
Type in Portland, OR and zoom out so that you can select the same extent of the Landsat image. You may need to look at the latitude and longitude of the image.
Click the wrench icon next to Hourly Global. Then select “Rectangle” to draw a boundary around all the applicable weather stations.
Select all the weather stations and download the station list.
Then click the orange button, “Access Data”. Use the date range to select data from midnight to midnight on July 30th, 2017.
You will eventually be directed to a page with a text version of the weather data. Save this text file and convert it into a CSV file that you can then import into R. Here’s an example of how the files should be formatted before loading them into R:
You will need to change the directories in the code below so that they properly reference your files.
sat_temp = raster("D:/Google Drive/School/PDX/2018 Winter/ENGR510-DevelopmentEngineering/ENGR510-CourseMaterials/Labs/Remote Sensing/QGIS Files/surface_temperature3.tif")
# sat_temp = raster("C:/Users/bryan/Google Drive/School/PDX/2018 Winter/ENGR510-DevelopmentEngineering/ENGR510-CourseMaterials/Labs/Remote Sensing/QGIS Files/surface_temperature3.tif")
plot(sat_temp,col=rev(heat.colors(5)))

# converting kelvin to fahrenheit
sat_tempF = sat_temp * 9/5 - 459.67
# excluding edge values
sat_tempF = calc(sat_tempF, fun=function(x){ x[x < 0] <- NA; return(x)})
# changing projection
sat_tempF = projectRaster(sat_tempF,crs="+proj=longlat +datum=WGS84")
plot(sat_tempF,col=rev(heat.colors(5)))

# loading the weather station data
stations = setDT(read.csv("D:/Google Drive/School/PDX/2018 Winter/ENGR510-DevelopmentEngineering/ENGR510-CourseMaterials/Labs/Remote Sensing/R Files/weather_data/stations.csv"))
# eliminating repeat stations
stations = unique(stations,by="USAF")
stations = unique(stations,by="BAN_ID")
stations = stations[USAF!=726881]
stations = stations[USAF!=726988]
# loading the weather data (from NOAA)
weather = setDT(read.csv("D:/Google Drive/School/PDX/2018 Winter/ENGR510-DevelopmentEngineering/ENGR510-CourseMaterials/Labs/Remote Sensing/R Files/weather_data/weather.csv"))
weather[,dt:=ymd_hms(paste(YEAR,MONTH,DAY,HOUR,MINUTE,SECOND,sep="-"))]
weather_data = merge(stations, weather, by="USAF")
Correlating Temperatures
In this section we extract the estimated land surface temperatures for each weather station. Then we compare the hourly weather station data against the estimated land surface temperatures to hypothesize what time the landsat image was captured.
# creating a spatial points data frame for each weather station
coordinates = SpatialPoints(stations[,.(LONGITUDE,LATITUDE)],
proj4string = CRS("+proj=longlat +datum=WGS84
+ellps=WGS84 +towgs84=0,0,0"))
weather_map = SpatialPointsDataFrame(coordinates,stations)
# pulling temperatures from sat_tempF at each spatial point
sat_temps = raster::extract(sat_tempF,weather_map)
# merging observed temperatures with station location
stations[,sat_temp:=sat_temps]
# plotting sat_temps with weather stations highlighted
plot(sat_tempF,col=rev(heat.colors(5)))
points(stations[,.(LONGITUDE,LATITUDE,sat_temps)],pch="+",cex=2,col=rev(heat.colors(5)))

# display data table
weather_data
# combining station and satellite temperature data
stations[,HOUR:="sat"]
setnames(stations,"sat_temp","TEMP")
sat_station_temps = rbind(weather_data[,.(USAF,NAME,LATITUDE,LONGITUDE,TEMP,HOUR)],
stations[,.(USAF,NAME,LATITUDE,LONGITUDE,TEMP,HOUR)])
sat_station_temps$TEMP = as.numeric(as.character(sat_station_temps$TEMP))
NAs introduced by coercion
# converting groups to factor
sat_station_temps$HOUR = as.factor(sat_station_temps$HOUR)
sat_station_temps$USAF = as.factor(sat_station_temps$USAF)
# graphing the temperatures
ggplot(sat_station_temps, aes(x=USAF,y=TEMP)) +
geom_point(aes(color=HOUR))

Sometimes ggplot or static graphics are not helpful for distinguishing patterns in the data. Here’s another plot using an interactive HTML widget. You can use your mouse to highlight particular points, toggle values in the legend, or zoom in and out for a better view.
# now in plotly
plot_ly(data=sat_station_temps, x=~USAF, y=~TEMP, color=~HOUR, type = 'scatter')
No scatter mode specifed:
Setting the mode to markers
Read more about this attribute -> https://plot.ly/r/reference/#scatter-mode
Ignoring 6 observationsn too large, allowed maximum for palette Set2 is 8
Returning the palette you asked for with that many colors
No scatter mode specifed:
Setting the mode to markers
Read more about this attribute -> https://plot.ly/r/reference/#scatter-mode
Ignoring 6 observationsn too large, allowed maximum for palette Set2 is 8
Returning the palette you asked for with that many colors
However, if we want to get a better idea what the actual correlations are, we can calculate them for each hour. You can change the columns displayed in the cor.matrix argument (i.e., change weather_combined[,c(2:10,26),with=FALSE] to a different subset of columns like 2:10, 10-20, etc.).
# reshaping the data so that the hours are arranged as columns
weather_data$TEMP = as.numeric(as.character(weather_data$TEMP))
NAs introduced by coercion
weather_data[,HOUR_column:=paste0("H",HOUR)]
weather_simple = weather_data[,.(TEMP),by=.(USAF,HOUR_column)]
weather_simple = unique(weather_simple,by=c("USAF","HOUR_column"))
weather_wide = weather_simple %>% spread(HOUR_column,TEMP)
# merge hour temperatures with sat temperatures
setnames(stations,"TEMP","sat_temp")
weather_combined = merge(weather_wide,stations[,.(USAF,sat_temp)],by="USAF")
cor.matrix(weather_combined[,c(15:20,26),with=FALSE])
the condition has length > 1 and only the first element will be used

# may have to omit NA values to perform cor function
# weather_combined = na.omit(weather_combined)
# correlations = cor(weather_combined[,2:25,with=FALSE],weather_combined[,26,with=FALSE],method="spearman")
# correlations
# rownames(correlations)[which.max(correlations)]
# correlations[which.max(correlations)]
Lab Assignment
You will need to compose a formal lab report not exceeding five pages (excluding appendices). This can be submitted as an R Notebook (PDF version of the main report less than 5 pages) or as a PDF. You will need to include a brief introduction, a methods section, results, discussion, and a brief conclusion. See the syllabus and grading rubric for more a more detailed description. Feel free to include information or images from the first assignment that you think are relevant. Also, if you feel pressed for space you can include some of the images or tables in the appendices.
In particular, you should address the following questions:
Based on your analysis, what time do you hypothesize that the Landsat image was captured? Provide a justification for your hypothesis.
See if you can determine the actual time the image was captured. Does it match up with your analysis? If they are different, what could contribute to the difference? *Hint: check the file name when downloading through the SCP plug-in or search for the MTL file.
How would you characterize the different forms of error that may have influenced your analysis? Provide a qualitative analysis of precision (random errors), accuracy (systematic errors), and operator error (proper execution of methodology and/or use of technology).
Provide a brief description of how this ground-truth data could be used to calibrate measurements from remote sensors.
How would you characterize the potential for remote sensors to be used in development engineering?
Submit your final PDF or Rmd file on D2L through Dropbox by 3pm on Tuesday, January 30th.
---
title: "Remote Temperature Sensing Lab Part II"
output: 
   html_notebook:
     toc: yes
     toc_float: yes
     theme: simplex
---

In the second part of this lab you will download hourly temperature measurements from weather stations within the boundary of the Landsat image to compare observed temperatures against the estimated temperatures from Landsat on July 30th, 2017. You will then use R to extract the temperatures from the land surface temperatures raster you created last week. These observed and estimated temperatures will be compared visually and statistically to hypothesize what time the Landsat image was captured. Finally, you will create a properly formatted R Notebook (modified from this version) or PDF lab report to present your findings.

Given the varying levels of familiarity with R, much of the coding for this lab will be done for you. However, make sure that you carefully walk through each step and understand what is being done because you will have less code written for you in future labs.

1. If you don't already have R, you can download it at <a href="http://www.r-project.org" target="_blank">http://www.r-project.org</a>.

2. We also recommend using R Studio as an integrated development environment (IDE) for R: <a href="https://www.rstudio.com/" target="_blank">https://www.rstudio.com/</a>.

3. Open R (or R Studio) and install the necessary packages. R relies on externally written "packages" that carry many of the functions needed. You will be using several packages for this lab. For each package you will need to install them by typing the following code into the command console: `install.packages("raster")`, where `raster` is the name of the first package.

4. The cool thing about an R Notebook is that you can view it in any browser, but you can also open it in R. You can download this document as an Rmd file by clicking on the button that says "Code" in the top right-hand corner. Then select Download Rmd and open the file from R.

Here are a few helpful notes about R:

* R allows you to run either the entire script, or just selected lines at a time. This notebook does a lot of the work for you. It has all the elements you need to modify the script,

* Since this is an [R Markdown](http://rmarkdown.rstudio.com) Notebook, when you execute code within the notebook, the results appear beneath the code. Try executing the first code chunk by clicking the *Run* button within the chunk or by placing your cursor inside it and pressing *Ctrl+Shift+Enter*.

* Add a new chunk by clicking the *Insert Chunk* button on the toolbar or by pressing *Ctrl+Alt+I*.

* When you save the notebook, an HTML file containing the code and output will be saved alongside it (click the *Preview* button or press *Ctrl+Shift+K* to preview the HTML file).

* If you want to learn more about best practices for code writing, you can check out this style guide: <a href="http://adv-r.had.co.nz/Style.html" target="_blank">http://adv-r.had.co.nz/Style.html</a>.

# Setup

This code chunk loads the packages used in this notebook.

```{r global_options, echo=TRUE, warning=FALSE, message=FALSE}
library(raster)
library(rgdal)
library(data.table)
library(sp)
library(lubridate)
library(tidyr)
library(dplyr)
library(ggplot2)
library(plotly)
library(knitr)
knitr::opts_knit$set(echo=TRUE, warning=FALSE, message=FALSE)

```

This is also a good place to put helper functions. Below is a function that we will use to create a correlation matrix of the different temperatures against the estimated land surface temperatures from Landsat.

```{r}
cor.matrix<-function(x,data=NA){
  # panel.hist function adds the histogram 
  panel.hist <- function(x, ...)
  {
      usr <- par("usr"); on.exit(par(usr))
      par(usr = c(usr[1:2], 0, 1.5) )
      h <- hist(x, plot = FALSE)
      breaks <- h$breaks; nB <- length(breaks)
      y <- h$counts; y <- y/max(y)
      rect(breaks[-nB], 0, breaks[-1], y, col="cyan", ...)
      box()
  }
  panel.cor <- function(x, y, digits=2, prefix="", cex.cor)
  {
      usr <- par("usr"); on.exit(par(usr))
      par(usr = c(0, 1, 0, 1))
      r <- abs(cor(x, y,method="spearman"))
      txt <- format(c(r, 0.123456789), digits=digits)[1]
      txt <- paste(prefix, txt, sep="")
      if(missing(cex.cor)) cex <- 0.8/strwidth(txt)
      text(0.5, 0.5, txt, cex = cex)
  }

  if (class(x)=="formula"){
    x<-model.frame(x,data=data)
  }

  pairs(x,
        lower.panel=panel.smooth,
        upper.panel=panel.cor,
        diag.panel=panel.hist,
        cex.labels = 1, font.labels=2)

}

```

# Loading Files

Here the landsurfacetemp.tif file is loaded. If you had trouble converting your temperature values from Kelvin to Fahrenheit you can do that here. Then the weather and weather station data are loaded and merged.

To get the weather and weather station data, follow these steps:

1. Access the Daily Temperature and Precipitation Reports - Data Tables from NOAA at <a href="https://www.climate.gov/maps-data/dataset/daily-temperature-and-precipitation-reports-data-tables" target="_blank">NOAA Temperature Tables</a>. Click on "Data Access" and select GHCN Daily Observations. It may be better to use Chrome to view the NOAA maps.

2. Click on the Time-Related Maps tab. Select Hourly/Sub-Hourly.

3. Type in Portland, OR and zoom out so that you can select the same extent of the Landsat image. You may need to look at the latitude and longitude of the image.

4. Click the wrench icon next to Hourly Global. Then select "Rectangle" to draw a boundary around all the applicable weather stations.

5. Select all the weather stations and download the station list.

6. Then click the orange button, "Access Data". Use the date range to select data from midnight to midnight on July 30th, 2017.

7. You will eventually be directed to a page with a text version of the weather data. Save this text file and convert it into a CSV file that you can then import into R. Here's an example of how the files should be formatted before loading them into R:

    + <a href="https://drive.google.com/open?id=1ME8rJ76hDTYpWqQw7XZbh17WecJ4YJHE" target="_blank">Weather data example</a>
    
    + <a href="https://drive.google.com/a/pdx.edu/file/d/1B99CsHA2-FdEBcN3-3Yx4C73VRLIp0TP/view?usp=sharing" target="_blank">Weather station data example</a>

You will need to change the directories in the code below so that they properly reference your files.

```{r}
sat_temp = raster("D:/Google Drive/School/PDX/2018 Winter/ENGR510-DevelopmentEngineering/ENGR510-CourseMaterials/Labs/Remote Sensing/QGIS Files/surface_temperature3.tif")
# sat_temp = raster("C:/Users/bryan/Google Drive/School/PDX/2018 Winter/ENGR510-DevelopmentEngineering/ENGR510-CourseMaterials/Labs/Remote Sensing/QGIS Files/surface_temperature3.tif")
plot(sat_temp,col=rev(heat.colors(5)))

# converting kelvin to fahrenheit
sat_tempF = sat_temp * 9/5 - 459.67

# excluding edge values
sat_tempF = calc(sat_tempF, fun=function(x){ x[x < 0] <- NA; return(x)})

# changing projection
sat_tempF = projectRaster(sat_tempF,crs="+proj=longlat +datum=WGS84")
plot(sat_tempF,col=rev(heat.colors(5)))

# loading the weather station data
stations = setDT(read.csv("D:/Google Drive/School/PDX/2018 Winter/ENGR510-DevelopmentEngineering/ENGR510-CourseMaterials/Labs/Remote Sensing/R Files/weather_data/stations.csv"))

# eliminating repeat stations
stations = unique(stations,by="USAF")
stations = unique(stations,by="BAN_ID")
stations = stations[USAF!=726881]
stations = stations[USAF!=726988]

# loading the weather data (from NOAA)
weather = setDT(read.csv("D:/Google Drive/School/PDX/2018 Winter/ENGR510-DevelopmentEngineering/ENGR510-CourseMaterials/Labs/Remote Sensing/R Files/weather_data/weather.csv"))
weather[,dt:=ymd_hms(paste(YEAR,MONTH,DAY,HOUR,MINUTE,SECOND,sep="-"))]

weather_data = merge(stations, weather, by="USAF")

```

# Correlating Temperatures

In this section we extract the estimated land surface temperatures for each weather station. Then we compare the hourly weather station data against the estimated land surface temperatures to hypothesize what time the landsat image was captured.

```{r}

# creating a spatial points data frame for each weather station
coordinates = SpatialPoints(stations[,.(LONGITUDE,LATITUDE)],
                            proj4string = CRS("+proj=longlat +datum=WGS84 
                                              +ellps=WGS84 +towgs84=0,0,0"))
weather_map = SpatialPointsDataFrame(coordinates,stations)

# pulling temperatures from sat_tempF at each spatial point
sat_temps = raster::extract(sat_tempF,weather_map)

# merging observed temperatures with station location
stations[,sat_temp:=sat_temps]

# plotting sat_temps with weather stations highlighted
plot(sat_tempF,col=rev(heat.colors(5)))
points(stations[,.(LONGITUDE,LATITUDE,sat_temps)],pch="+",cex=2,col=rev(heat.colors(5)))

# display data table
weather_data

# combining station and satellite temperature data
stations[,HOUR:="sat"]
setnames(stations,"sat_temp","TEMP")
sat_station_temps = rbind(weather_data[,.(USAF,NAME,LATITUDE,LONGITUDE,TEMP,HOUR)],
                          stations[,.(USAF,NAME,LATITUDE,LONGITUDE,TEMP,HOUR)])
sat_station_temps$TEMP = as.numeric(as.character(sat_station_temps$TEMP))

# converting groups to factor
sat_station_temps$HOUR = as.factor(sat_station_temps$HOUR)
sat_station_temps$USAF = as.factor(sat_station_temps$USAF)

# graphing the temperatures
ggplot(sat_station_temps, aes(x=USAF,y=TEMP)) +
  geom_point(aes(color=HOUR))

```

Sometimes ggplot or static graphics are not helpful for distinguishing patterns in the data. Here's another plot using an interactive HTML widget. You can use your mouse to highlight particular points, toggle values in the legend, or zoom in and out for a better view.

```{r}

# now in plotly
plot_ly(data=sat_station_temps, x=~USAF, y=~TEMP, color=~HOUR, type = 'scatter')

```

However, if we want to get a better idea what the actual correlations are, we can calculate them for each hour. You can change the columns displayed in the cor.matrix argument (i.e., change `weather_combined[,c(2:10,26),with=FALSE]` to a different subset of columns like 2:10, 10-20, etc.).

```{r}
# reshaping the data so that the hours are arranged as columns
weather_data$TEMP = as.numeric(as.character(weather_data$TEMP))
weather_data[,HOUR_column:=paste0("H",HOUR)]
weather_simple = weather_data[,.(TEMP),by=.(USAF,HOUR_column)]
weather_simple = unique(weather_simple,by=c("USAF","HOUR_column"))
weather_wide = weather_simple %>% spread(HOUR_column,TEMP)

# merge hour temperatures with sat temperatures
setnames(stations,"TEMP","sat_temp")
weather_combined = merge(weather_wide,stations[,.(USAF,sat_temp)],by="USAF")
cor.matrix(weather_combined[,c(15:20,26),with=FALSE])

# may have to omit NA values to perform cor function
# weather_combined = na.omit(weather_combined)
# correlations = cor(weather_combined[,2:25,with=FALSE],weather_combined[,26,with=FALSE],method="spearman")
# correlations
# rownames(correlations)[which.max(correlations)]
# correlations[which.max(correlations)]

```

# Lab Assignment

You will need to compose a formal lab report not exceeding five pages (excluding appendices). This can be submitted as an R Notebook (PDF version of the main report less than 5 pages) or as a PDF. You will need to include a brief introduction, a methods section, results, discussion, and a brief conclusion. See the syllabus and grading rubric for more a more detailed description. Feel free to include information or images from the first assignment that you think are relevant. Also, if you feel pressed for space you can include some of the images or tables in the appendices.

In particular, you should address the following questions:

1. Based on your analysis, what time do you hypothesize that the Landsat image was captured? Provide a justification for your hypothesis.

2. See if you can determine the actual time the image was captured. Does it match up with your analysis? If they are different, what could contribute to the difference? *Hint: check the file name when downloading through the SCP plug-in or search for the MTL file.

3. How would you characterize the different forms of error that may have influenced your analysis? Provide a qualitative analysis of precision (random errors), accuracy (systematic errors), and operator error (proper execution of methodology and/or use of technology).

4. Provide a brief description of how this ground-truth data could be used to calibrate measurements from remote sensors.

5. How would you characterize the potential for remote sensors to be used in development engineering?

Submit your final PDF or Rmd file on D2L through Dropbox by 3pm on Tuesday, January 30th.
