This is a quick example of displaying weather data with the following workflow: convert the imported data to a local data frame via tbl_df(), then parse the date column via mdy() from the lubridate package.
library(dplyr)
library(lubridate)
library(ggvis)
library(corrplot)
# Initializing working directory
dir_Data = "/Users/richleung/Dropbox/Projects/Weather/1_Input"
dir_Main = "/Users/richleung/Dropbox/Projects/Weather/2_Code"
# Import raw data file
setwd(dir_Data)
wxdata <- read.table("wx_mountainviewCA_v2.csv", sep=",", header=TRUE)
# convert to local data frame
wxdata <- tbl_df(wxdata)
# ---------------------------------------------------------------------
# 1.) Add column: PSTdate to [wxdata] via mutate() and mdy()
wxdata <- wxdata %>%
  mutate(PSTdate = mdy(PST))
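The date-parsing step can be tried in isolation. This is a minimal sketch on a hypothetical three-row data frame, assuming the PST column holds month/day/year strings (which is what the use of mdy() suggests). The name wx_demo, the PSTmonth column and the dates themselves are made up for illustration.

```r
library(dplyr)
library(lubridate)

# Hypothetical rows standing in for the PST column of the CSV (not real data)
wx_demo <- data.frame(
  PST = c("1/1/2015", "2/14/2015", "3/31/2015"),
  stringsAsFactors = FALSE
)

# mdy() turns month/day/year strings into Date objects;
# month() then extracts the month number from each parsed date
wx_demo <- wx_demo %>%
  mutate(PSTdate = mdy(PST),
         PSTmonth = month(PSTdate))

class(wx_demo$PSTdate)
```

Once the column is a Date, it can be filtered and plotted by month or year without further string handling.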
Temperature during Feb 2014 in Mountain View, CA
Temperature Histogram during Feb 2014 in Mountain View, CA
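The code behind the two temperature charts is not shown in this write-up. One way such charts might be built with the loaded packages is sketched below, assuming a parsed PSTdate column and a Mean_TemperatureC column. The data here is simulated and wx_demo, wx_feb, temp_line and temp_hist are illustrative names, not the original objects.

```r
library(dplyr)
library(lubridate)
library(ggvis)

# Simulated stand-in for wxdata: one row per day with made-up temperatures
set.seed(1)
wx_demo <- data.frame(
  PSTdate = seq(as.Date("2014-01-01"), as.Date("2014-03-31"), by = "day")
)
wx_demo$Mean_TemperatureC <- round(rnorm(nrow(wx_demo), mean = 13, sd = 2))

# Keep only February 2014
wx_feb <- wx_demo %>%
  dplyr::filter(year(PSTdate) == 2014, month(PSTdate) == 2)

# Line chart of daily mean temperature
temp_line <- wx_feb %>%
  ggvis(~PSTdate, ~Mean_TemperatureC) %>%
  layer_lines()

# Histogram of daily mean temperature, one bin per degree Celsius
temp_hist <- wx_feb %>%
  ggvis(~Mean_TemperatureC) %>%
  layer_histograms(width = 1)
```

Printing temp_line or temp_hist in an interactive session renders the chart.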
Correlation between Temperature, Dew Point, Humidity, Pressure, Visibility, Wind Speed, Precipitation (mm), and Cloud Cover during 1Q2015 in Mountain View, CA.
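A correlation plot of this kind is typically produced by passing a correlation matrix to corrplot(). The sketch below assumes that approach and uses simulated values for three of the variables. The data frame wx_demo and its numbers are made up, and only the column names are taken from the regression output in this document.

```r
library(corrplot)

# Simulated stand-in data (not the real weather observations)
set.seed(42)
n <- 90
wx_demo <- data.frame(
  MeanDew_PointC  = rnorm(n, mean = 8, sd = 3),
  X_Mean_Humidity = rnorm(n, mean = 70, sd = 10)
)
wx_demo$Mean_TemperatureC <- 20 + 0.5 * wx_demo$MeanDew_PointC -
  0.15 * wx_demo$X_Mean_Humidity + rnorm(n, sd = 1)

# Pairwise Pearson correlations, ignoring rows with missing values
cor_mat <- cor(wx_demo, use = "complete.obs")

# Draw the matrix; circle size and colour encode the correlation
corrplot(cor_mat, method = "circle")
```

Columns must be numeric for cor(), so date and text columns would need to be dropped first.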
Regress Temperature on Dew Point as the single predictor.
##
## Call:
## lm(formula = Mean_TemperatureC ~ MeanDew_PointC, data = wxdata_cor)
##
## Coefficients:
##    (Intercept)  MeanDew_PointC  
##        12.4502          0.2225  
##
## Call:
## lm(formula = Mean_TemperatureC ~ MeanDew_PointC, data = wxdata_cor)
##
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -5.1177 -1.3127 -0.0076  1.5474  6.9948 
##
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    12.45021    0.44762  27.814  < 2e-16 ***
## MeanDew_PointC  0.22249    0.06211   3.582 0.000558 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.198 on 88 degrees of freedom
## Multiple R-squared: 0.1273, Adjusted R-squared: 0.1174
## F-statistic: 12.83 on 1 and 88 DF, p-value: 0.0005577
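The printed coefficients define a fitted line, Mean_TemperatureC ≈ 12.4502 + 0.2225 × MeanDew_PointC. As a small worked example, the sketch below plugs a dew point into that line. The helper predict_temp and the 8 °C input are illustrative choices, not part of the original analysis.

```r
# Coefficients copied from the printed lm() output above
intercept <- 12.4502
slope     <- 0.2225

# Hypothetical helper: predicted mean temperature (°C) for a given dew point (°C)
predict_temp <- function(dew_point_c) {
  intercept + slope * dew_point_c
}

predict_temp(8)  # 12.4502 + 0.2225 * 8 = 14.2302
```

With a residual standard error of 2.198 °C and an R-squared of 0.1273, such a prediction carries considerable uncertainty.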
Regress Temperature on Dew Point, Humidity, and Wind Speed as predictors.
##
## Call:
## lm(formula = Mean_TemperatureC ~ MeanDew_PointC + X_Mean_Humidity +
## X_Mean_Wind_SpeedKm.h, data = wxdata_cor)
##
## Coefficients:
##           (Intercept)         MeanDew_PointC        X_Mean_Humidity  
##              23.56040                0.80940               -0.24237  
## X_Mean_Wind_SpeedKm.h  
##               0.08546  
##
## Call:
## lm(formula = Mean_TemperatureC ~ MeanDew_PointC + X_Mean_Humidity +
## X_Mean_Wind_SpeedKm.h, data = wxdata_cor)
##
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.5183 -0.7240 -0.1794  0.7893  2.9836 
##
## Coefficients:
##                       Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           23.56040    0.89517  26.320   <2e-16 ***
## MeanDew_PointC         0.80940    0.06005  13.478   <2e-16 ***
## X_Mean_Humidity       -0.24237    0.01766 -13.721   <2e-16 ***
## X_Mean_Wind_SpeedKm.h  0.08546    0.03665   2.332   0.0221 *  
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.218 on 86 degrees of freedom
## Multiple R-squared: 0.7379, Adjusted R-squared: 0.7288
## F-statistic: 80.72 on 3 and 86 DF, p-value: < 2.2e-16
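The two models are nested, so they can be compared directly on adjusted R-squared and with an F-test via anova(). The sketch below shows the pattern on simulated data. The data frame wx_demo and the coefficients used to generate it are assumptions made for illustration, so the numbers it prints will not match the output in this document.

```r
# Simulated stand-in data (not the real weather observations)
set.seed(2015)
n <- 90
wx_demo <- data.frame(
  MeanDew_PointC        = rnorm(n, mean = 8, sd = 3),
  X_Mean_Humidity       = rnorm(n, mean = 70, sd = 10),
  X_Mean_Wind_SpeedKm.h = rnorm(n, mean = 8, sd = 3)
)
wx_demo$Mean_TemperatureC <- 23.5 + 0.8 * wx_demo$MeanDew_PointC -
  0.24 * wx_demo$X_Mean_Humidity + 0.09 * wx_demo$X_Mean_Wind_SpeedKm.h +
  rnorm(n, sd = 1.2)

# Single-predictor and three-predictor fits, as in the two regressions above
fit1 <- lm(Mean_TemperatureC ~ MeanDew_PointC, data = wx_demo)
fit2 <- lm(Mean_TemperatureC ~ MeanDew_PointC + X_Mean_Humidity +
             X_Mean_Wind_SpeedKm.h, data = wx_demo)

# Adjusted R-squared penalises extra predictors, so it is comparable across models
adj1 <- summary(fit1)$adj.r.squared
adj2 <- summary(fit2)$adj.r.squared

# F-test: do the added predictors reduce residual variance beyond chance?
cmp <- anova(fit1, fit2)
print(c(adj1 = adj1, adj2 = adj2))
print(cmp)
```

A small p-value in the anova() table indicates that the extra predictors jointly improve the fit.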
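Conclusion: Using Dew Point, Humidity, and Wind Speed to predict air temperature seems pretty reliable. This is evidenced by the higher adjusted R-squared (0.7288 versus 0.1174) and the lower residual standard error (1.218 versus 2.198) compared with the single-predictor regression, as well as the lower overall p-value (< 2.2e-16 versus 0.0005577). Moreover, every coefficient in the second regression is statistically significant at the 5% level, so each variable contributes to the fit given the others.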