This is a quick example of displaying weather data with the following workflow:

  1. Download raw weather data from Weather Underground and save it as CSV.
  2. Import the CSV data into a local data frame using tbl_df().
  3. Convert the date column from text to a date value (UTC) using mdy() from the lubridate package.
  4. Create plots:
    1. Create a Temperature over Time plot for Mountain View, CA during Feb 2014.
    2. Create a Temperature Density plot (histogram) for Mountain View, CA during Feb 2014.
  5. Display plots.
  6. Get a job @Granular!

Setting up environment

library(dplyr)
library(lubridate)
library(ggvis)
library(corrplot)


# Working directories for input data and code
dir_Data <- "/Users/richleung/Dropbox/Projects/Weather/1_Input"
dir_Main <- "/Users/richleung/Dropbox/Projects/Weather/2_Code"



# Import raw data file
setwd(dir_Data)
wxdata <- read.table("wx_mountainviewCA_v2.csv", sep=",", header=TRUE)
# convert to local data frame
wxdata <- tbl_df(wxdata)

Computing

# ---------------------------------------------------------------------
# 1.) Add column PSTdate to wxdata via mutate() and mdy()
wxdata <- wxdata %>%
  mutate(PSTdate = mdy(PST))
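
To show what mdy() does on its own, here is a minimal sketch. It assumes the PST column holds month/day/year text such as "2/1/2014"; the sample values below are made up for illustration. Recent lubridate releases return a Date object here, while older releases returned a date-time in UTC, so the sketch formats the result rather than relying on its class.

```r
library(lubridate)

# Hypothetical sample values in month/day/year order (assumed PST format)
pst_text <- c("2/1/2014", "2/14/2014", "2/28/2014")

# mdy() parses month-day-year text; the separator is detected automatically
pst_date <- mdy(pst_text)

# Format as ISO dates so the result reads the same across lubridate versions
pst_iso <- format(pst_date, "%Y-%m-%d")
print(pst_iso)
```

If the raw file stores dates in year-month-day order instead, ymd() is the matching parser.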

Plots

Temperature during Feb 2014 in Mountain View, CA
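
The plotting code is not shown here, so the following is only a sketch of how a temperature-over-time plot could be built with ggvis. The data frame `feb` and its random values are stand-ins I made up; the column names PSTdate and Mean_TemperatureC are taken from elsewhere in this write-up and are assumed to exist in wxdata.

```r
library(ggvis)

# Synthetic stand-in for the Feb 2014 rows of wxdata (values are random)
set.seed(1)
feb <- data.frame(
  PSTdate           = seq(as.Date("2014-02-01"), as.Date("2014-02-28"), by = "day"),
  Mean_TemperatureC = round(rnorm(28, mean = 12, sd = 2), 1)
)

# Map date to x and temperature to y, then draw a line with point markers
temp_plot <- feb %>%
  ggvis(~PSTdate, ~Mean_TemperatureC) %>%
  layer_lines() %>%
  layer_points(size := 30) %>%
  add_axis("x", title = "Date") %>%
  add_axis("y", title = "Mean temperature (C)")

# print(temp_plot) renders the plot in the RStudio viewer or a browser
```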

Temperature Histogram during Feb 2014 in Mountain View, CA
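
As a rough sketch of the histogram step, assuming the same Mean_TemperatureC column: the data frame `feb` and the 1-degree bin width are my own choices for illustration, not settings from the original plot.

```r
library(ggvis)

# Synthetic stand-in for the Feb 2014 temperatures (values are random)
set.seed(2)
feb <- data.frame(Mean_TemperatureC = round(rnorm(28, mean = 12, sd = 2), 1))

# One variable on x; layer_histograms() bins it and counts days per bin
temp_hist <- feb %>%
  ggvis(~Mean_TemperatureC) %>%
  layer_histograms(width = 1) %>%
  add_axis("x", title = "Mean temperature (C)") %>%
  add_axis("y", title = "Number of days")

# print(temp_hist) renders the plot in the RStudio viewer or a browser
```

Swapping layer_histograms() for layer_densities() would draw a smoothed density curve instead of bars.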

Correlations

Correlation between Temperature, Dew Point, Humidity, Pressure, Visibility, Wind Speed, Precipitation (mm), and Cloud Cover during Q1 2015 in Mountain View, CA.
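
A correlation plot like this can be sketched with cor() and corrplot(). The data frame `wx_num` below is synthetic and only carries four of the variables; the column names follow the regression output in this write-up, and the plot options are assumptions rather than the original settings.

```r
library(corrplot)

# Synthetic stand-in for the numeric weather columns (90 days, random values)
set.seed(3)
n   <- 90
dew <- rnorm(n, mean = 7, sd = 3)
hum <- rnorm(n, mean = 70, sd = 10)
wx_num <- data.frame(
  Mean_TemperatureC     = 23 + 0.8 * dew - 0.24 * hum + rnorm(n, sd = 1.2),
  MeanDew_PointC        = dew,
  X_Mean_Humidity       = hum,
  X_Mean_Wind_SpeedKm.h = rnorm(n, mean = 8, sd = 3)
)

# Pairwise Pearson correlations, ignoring missing values pair by pair
cor_mat <- cor(wx_num, use = "pairwise.complete.obs")
print(round(cor_mat, 2))

# Draw the upper triangle of the matrix as circles sized by correlation
corrplot(cor_mat, method = "circle", type = "upper")
```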

Linear Regression

Predict temperature based on atmospheric signals

Regress Temperature on Dew Point as the single predictor.

## 
## Call:
## lm(formula = Mean_TemperatureC ~ MeanDew_PointC, data = wxdata_cor)
## 
## Coefficients:
##    (Intercept)  MeanDew_PointC  
##        12.4502          0.2225
## 
## Call:
## lm(formula = Mean_TemperatureC ~ MeanDew_PointC, data = wxdata_cor)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -5.1177 -1.3127 -0.0076  1.5474  6.9948 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    12.45021    0.44762  27.814  < 2e-16 ***
## MeanDew_PointC  0.22249    0.06211   3.582 0.000558 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.198 on 88 degrees of freedom
## Multiple R-squared:  0.1273, Adjusted R-squared:  0.1174 
## F-statistic: 12.83 on 1 and 88 DF,  p-value: 0.0005577
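
The two output blocks above look like what print() and summary() give for an lm fit. Here is a sketch that produces the same kind of output; the data frame `wx` is synthetic, so the numbers will not match the ones above.

```r
# Synthetic stand-in for wxdata_cor (90 days, random values)
set.seed(4)
wx <- data.frame(MeanDew_PointC = rnorm(90, mean = 7, sd = 3))
wx$Mean_TemperatureC <- 12 + 0.2 * wx$MeanDew_PointC + rnorm(90, sd = 2)

# Simple linear regression: temperature explained by dew point
fit1 <- lm(Mean_TemperatureC ~ MeanDew_PointC, data = wx)

print(fit1)           # call and coefficients only
print(summary(fit1))  # residuals, standard errors, R-squared, F-statistic

# Pull out individual pieces for later comparison
coef(fit1)
summary(fit1)$adj.r.squared
```

With 90 rows and two estimated coefficients, the residual degrees of freedom come out to 88, as in the output above.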

Regress Temperature on Dew Point, Humidity, and Wind Speed as predictors.

## 
## Call:
## lm(formula = Mean_TemperatureC ~ MeanDew_PointC + X_Mean_Humidity + 
##     X_Mean_Wind_SpeedKm.h, data = wxdata_cor)
## 
## Coefficients:
##           (Intercept)         MeanDew_PointC        X_Mean_Humidity  
##              23.56040                0.80940               -0.24237  
## X_Mean_Wind_SpeedKm.h  
##               0.08546
## 
## Call:
## lm(formula = Mean_TemperatureC ~ MeanDew_PointC + X_Mean_Humidity + 
##     X_Mean_Wind_SpeedKm.h, data = wxdata_cor)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.5183 -0.7240 -0.1794  0.7893  2.9836 
## 
## Coefficients:
##                       Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           23.56040    0.89517  26.320   <2e-16 ***
## MeanDew_PointC         0.80940    0.06005  13.478   <2e-16 ***
## X_Mean_Humidity       -0.24237    0.01766 -13.721   <2e-16 ***
## X_Mean_Wind_SpeedKm.h  0.08546    0.03665   2.332   0.0221 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.218 on 86 degrees of freedom
## Multiple R-squared:  0.7379, Adjusted R-squared:  0.7288 
## F-statistic: 80.72 on 3 and 86 DF,  p-value: < 2.2e-16
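
To make the comparison between the two fits concrete, here is a sketch that fits both models on the same synthetic data and compares them. The data frame `wx` and its coefficients are invented for illustration; only the formula mirrors the call above.

```r
# Synthetic stand-in for wxdata_cor (90 days, random values)
set.seed(5)
n  <- 90
wx <- data.frame(
  MeanDew_PointC        = rnorm(n, mean = 7, sd = 3),
  X_Mean_Humidity       = rnorm(n, mean = 70, sd = 10),
  X_Mean_Wind_SpeedKm.h = rnorm(n, mean = 8, sd = 3)
)
wx$Mean_TemperatureC <- 23.5 + 0.8 * wx$MeanDew_PointC -
  0.24 * wx$X_Mean_Humidity + 0.09 * wx$X_Mean_Wind_SpeedKm.h +
  rnorm(n, sd = 1.2)

# Single-predictor model and the three-predictor model
fit1 <- lm(Mean_TemperatureC ~ MeanDew_PointC, data = wx)
fit2 <- lm(Mean_TemperatureC ~ MeanDew_PointC + X_Mean_Humidity +
             X_Mean_Wind_SpeedKm.h, data = wx)

# Adjusted R-squared penalizes extra predictors, so it is a fair comparison
adj1 <- summary(fit1)$adj.r.squared
adj2 <- summary(fit2)$adj.r.squared
print(c(single = adj1, multiple = adj2))

# F-test for whether the added predictors improve on the nested model
print(anova(fit1, fit2))
```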

Conclusion: Using Dew Point, Humidity, and Wind Speed to predict air temperature fits the data far better than using Dew Point alone. The adjusted R-squared rises from 0.1174 to 0.7288, and the residual standard error falls from 2.198 to 1.218. The overall F-test p-value is also lower (< 2.2e-16 versus 0.0005577), and every predictor in the second regression is statistically significant at the 5% level.
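
As a quick check on the reported figures, the adjusted R-squared can be recomputed from the multiple R-squared, the sample size, and the number of predictors. The helper `adj_r2` is my own name for this sketch; n = 90 follows from 88 residual degrees of freedom plus 2 coefficients in the first model.

```r
# Adjusted R-squared from R-squared (r2), sample size (n), predictors (p)
adj_r2 <- function(r2, n, p) {
  1 - (1 - r2) * (n - 1) / (n - p - 1)
}

n <- 90  # 88 residual df + 2 estimated coefficients

single   <- adj_r2(0.1273, n, p = 1)  # dew point only
multiple <- adj_r2(0.7379, n, p = 3)  # dew point, humidity, wind speed

print(round(c(single = single, multiple = multiple), 4))
```

Both values agree with the summaries above (0.1174 and 0.7288) after rounding to four decimals.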