So What’s the Relationship Between Wind Speed and Temperature in New York?

To work with the New York airquality data more effectively, I first wanted to get all of my data into R and then create the new variables discussed in the unit, making the data more meaningful for any queries I wanted to run. First I loaded the data set into R. Since it has too many rows to show in full, I display only the first six and the last six rows, and then summarize the data briefly.

head(airquality)
##   Ozone Solar.R Wind Temp Month Day
## 1    41     190  7.4   67     5   1
## 2    36     118  8.0   72     5   2
## 3    12     149 12.6   74     5   3
## 4    18     313 11.5   62     5   4
## 5    NA      NA 14.3   56     5   5
## 6    28      NA 14.9   66     5   6
tail(airquality)
##     Ozone Solar.R Wind Temp Month Day
## 148    14      20 16.6   63     9  25
## 149    30     193  6.9   70     9  26
## 150    NA     145 13.2   77     9  27
## 151    14     191 14.3   75     9  28
## 152    18     131  8.0   76     9  29
## 153    20     223 11.5   68     9  30
summary(airquality)
##      Ozone           Solar.R           Wind             Temp      
##  Min.   :  1.00   Min.   :  7.0   Min.   : 1.700   Min.   :56.00  
##  1st Qu.: 18.00   1st Qu.:115.8   1st Qu.: 7.400   1st Qu.:72.00  
##  Median : 31.50   Median :205.0   Median : 9.700   Median :79.00  
##  Mean   : 42.13   Mean   :185.9   Mean   : 9.958   Mean   :77.88  
##  3rd Qu.: 63.25   3rd Qu.:258.8   3rd Qu.:11.500   3rd Qu.:85.00  
##  Max.   :168.00   Max.   :334.0   Max.   :20.700   Max.   :97.00  
##  NA's   :37       NA's   :7                                       
##      Month            Day      
##  Min.   :5.000   Min.   : 1.0  
##  1st Qu.:6.000   1st Qu.: 8.0  
##  Median :7.000   Median :16.0  
##  Mean   :6.993   Mean   :15.8  
##  3rd Qu.:8.000   3rd Qu.:23.0  
##  Max.   :9.000   Max.   :31.0  
## 
str(airquality)
## 'data.frame':    153 obs. of  6 variables:
##  $ Ozone  : int  41 36 12 18 NA 28 23 19 8 NA ...
##  $ Solar.R: int  190 118 149 313 NA NA 299 99 19 194 ...
##  $ Wind   : num  7.4 8 12.6 11.5 14.3 14.9 8.6 13.8 20.1 8.6 ...
##  $ Temp   : int  67 72 74 62 56 66 65 59 61 69 ...
##  $ Month  : int  5 5 5 5 5 5 5 5 5 5 ...
##  $ Day    : int  1 2 3 4 5 6 7 8 9 10 ...
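
The summary above lists NA's for Ozone and Solar.R but not for Wind or Temp. As a quick check before plotting, missing values can be counted per column. This is a minimal sketch; the names `na_counts` and `complete_wt` are illustrative and not part of the original analysis.

```r
# Count missing values in each column of the built-in data set.
na_counts <- colSums(is.na(airquality))
print(na_counts)

# Flag rows where both Wind and Temp are present, since those are
# the only two columns the plots in this post rely on.
complete_wt <- complete.cases(airquality[, c("Wind", "Temp")])
print(sum(complete_wt))
```

If Wind and Temp have no missing values, every row can be used in the plots that follow without any filtering.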

Then I made a scatter plot of wind speed against temperature, with the points colored by month.

library(ggplot2)
airquality$Month<-factor(airquality$Month)
qplot(Wind,Temp,data=airquality,color=Month)

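Then I added a smoothed trend line to the scatter plot (ggplot2 chose loess smoothing by default, as the message below shows), first with one panel per month and then with all months overlaid in a single plot.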

library(ggplot2)
airquality$Month<-factor(airquality$Month)
qplot(Wind,Temp,data=airquality,color=Month,geom = c("point","smooth"),facets = .~Month)
## `geom_smooth()` using method = 'loess'
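
In recent ggplot2 releases `qplot()` is deprecated, so the same faceted plot may be easier to maintain when written with `ggplot()`. The following is a rough equivalent, not the code used for the figure above; it assumes a copy of the data named `aq` (a hypothetical name) and states the loess method explicitly.

```r
library(ggplot2)

# Work on a copy so the built-in data set is left untouched.
aq <- airquality
aq$Month <- factor(aq$Month)

# Points plus a loess smoother, with one panel per month,
# mirroring qplot(..., geom = c("point", "smooth"), facets = . ~ Month).
p <- ggplot(aq, aes(Wind, Temp, color = Month)) +
  geom_point() +
  geom_smooth(method = "loess", formula = y ~ x) +
  facet_grid(. ~ Month)
print(p)
```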

ggplot(airquality,aes(Wind,Temp))+
  geom_point(aes(color=factor(Month)))+
  geom_smooth(se=FALSE,aes(color=factor(Month)))
## `geom_smooth()` using method = 'loess'

Then I fit a separate linear regression line for each month.

library(ggplot2)
airquality$Month<-factor(airquality$Month)
ggplot(airquality,aes(Wind,Temp))+
  geom_point(aes(color=factor(Month)))+
  geom_smooth(method="lm",se=FALSE,aes(color=factor(Month)))
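
To put numbers on the per-month lines, the slope of each month's fit can be extracted with base R. This is a sketch of one way to do it; `by_month` and `slopes` are illustrative names, and the model `Temp ~ Wind` is assumed to match the one `geom_smooth(method="lm")` fits for each color group.

```r
# Split the data by month and fit Temp ~ Wind within each subset.
by_month <- split(airquality, airquality$Month)

# Collect the Wind coefficient (the slope) from each monthly fit.
slopes <- sapply(by_month, function(d) {
  coef(lm(Temp ~ Wind, data = d))[["Wind"]]
})
print(round(slopes, 2))
```

A negative entry means that, within that month, higher wind speed goes with lower temperature.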

Finally, I fit a single regression line to the data from all months combined.

library(ggplot2)
airquality$Month<-factor(airquality$Month)
ggplot(airquality,aes(Wind,Temp))+
  geom_point(aes(color=factor(Month)))+
  # group=1 pools all months into one fit, so no per-month color is mapped to the line
  geom_smooth(method="lm",se=FALSE,aes(group=1))

Based on this graph of the aggregated data, temperature appears to be negatively related to wind speed. In other words, temperature tends to decrease as wind speed increases.
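
The visual impression can be checked numerically with a pooled linear model and a correlation coefficient. This is a minimal sketch using base R only; the names `fit`, `wind_slope`, and `r` are illustrative.

```r
# Pooled linear fit across all months: Temp as a function of Wind.
fit <- lm(Temp ~ Wind, data = airquality)
wind_slope <- coef(fit)[["Wind"]]

# Pearson correlation between the two variables.
r <- cor(airquality$Wind, airquality$Temp)

# A negative slope and a negative correlation both indicate that
# temperature tends to fall as wind speed rises.
print(wind_slope)
print(r)
```

Because this fit pools all five months, part of the relationship may reflect seasonal differences between months rather than a day-to-day effect of wind on temperature.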