boxplot(airquality$Ozone~airquality$Temp)
Figure 1: Data plot showing ozone and temperature in New York
hist(airquality$Ozone)
Figure 2: Histogram for ozone in New York
hist(airquality$Temp)
Figure 3: Histogram for temperature in New York
qqnorm(airquality$Ozone) #This needs to be transformed.
qqline(airquality$Ozone)
Figure 4: Q-Q plot showing normality for the ozone levels in New York
qqnorm(airquality$Temp) #This data is kind of normal for Temperature.
qqline(airquality$Temp)
Figure 5: Q-Q plot showing normality for the temperature in New York
hist(airquality$Temp) #This shows the data is normal for temperature
Figure 6: Histogram showing normality for the temperature in New York
The raw ozone data were not normally distributed, so I transformed them. I used a log10 transformation (after adding a small constant, 0.0001) because the histogram of the transformed values was closer to a normal distribution. The temperature histogram was already approximately normal, so I left temperature untransformed.
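The visual judgement above can be supplemented with a formal check. The following is a minimal sketch, not part of the original analysis, that runs base R's `shapiro.test()` on the raw ozone values, the log10-transformed values, and temperature. The variable names (`ozone_raw`, `ozone_log`, `sw_*`) are introduced here for illustration only.

```r
# Shapiro-Wilk test: a small p-value suggests departure from normality.
# airquality ships with base R; shapiro.test() drops NA values itself.
ozone_raw <- airquality$Ozone
ozone_log <- log10(airquality$Ozone + 0.0001)  # same transform as the report

sw_raw  <- shapiro.test(ozone_raw)
sw_log  <- shapiro.test(ozone_log)
sw_temp <- shapiro.test(airquality$Temp)

# Collect the W statistics and p-values side by side for comparison.
normality_check <- data.frame(
  variable = c("Ozone", "log10(Ozone)", "Temp"),
  W        = unname(c(sw_raw$statistic, sw_log$statistic, sw_temp$statistic)),
  p_value  = c(sw_raw$p.value, sw_log$p.value, sw_temp$p.value)
)
print(normality_check)
```

A small p-value is evidence against normality. With a sample of this size the test is best read alongside the histograms and Q-Q plots rather than in place of them.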
airquality$OzoneLog<-log10(airquality$Ozone+0.0001)
hist(airquality$OzoneLog)
Figure 7: Log transformation of the data on a histogram.
qqnorm(airquality$OzoneLog) #Doesn't help with normality
qqline(airquality$OzoneLog)
Figure 8: Log transformation of the ozone data on a Q-Q plot.
hist(airquality$Temp) #This histogram is pretty normal.
Figure 9: Histogram of temperature.
qqnorm(airquality$Temp) #This also shows normality.
qqline(airquality$Temp)
Figure 10: Q-Q plot of Temperature.
airquality.LM<-lm(airquality$Ozone~airquality$Temp)
airqualityTrans.LM <-lm(airquality$OzoneLog~airquality$Temp)
plot(airquality.LM) #It approximates homoscedasticity.
plot(airqualityTrans.LM)
I rejected the null hypothesis of no relationship between ozone and temperature; the result is statistically significant.
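Calling `plot()` on an `lm` object draws four diagnostic plots in sequence (Residuals vs Fitted, Normal Q-Q, Scale-Location, Residuals vs Leverage). As a sketch of one way to view them together when judging homoscedasticity, the plots can be arranged in a 2 x 2 grid. The model is refit here with a `data =` argument so the block runs on its own; the name `fit_log` is an assumption made for this example.

```r
# Refit the transformed model using the data argument (same model as
# airqualityTrans.LM, written so this block is self-contained).
fit_log <- lm(log10(Ozone + 0.0001) ~ Temp, data = airquality)

# Show the four default diagnostic plots on one page, then restore
# the previous graphics settings.
op <- par(mfrow = c(2, 2))
plot(fit_log)
par(op)
```

In the Residuals vs Fitted and Scale-Location panels, a roughly flat trend line and an even vertical spread are what support a homoscedasticity claim.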
summary(airqualityTrans.LM)
##
## Call:
## lm(formula = airquality$OzoneLog ~ airquality$Temp)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.93139 -0.14373  0.01286  0.15855  0.64893
##
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)
## (Intercept)     -0.798204   0.195865  -4.075 8.53e-05 ***
## airquality$Temp  0.029316   0.002497  11.741  < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.254 on 114 degrees of freedom
## (37 observations deleted due to missingness)
## Multiple R-squared: 0.5473, Adjusted R-squared: 0.5434
## F-statistic: 137.8 on 1 and 114 DF, p-value: < 2.2e-16
plot(airqualityTrans.LM)
There is a significant positive relationship between ozone concentration and temperature in New York (linear model, p-value < 0.001, multiple R-squared = 0.5473). The ozone measurement was log10-transformed to approximate normality of the residuals. The temperature measurement was not transformed.
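Because the response is on a log10 scale, the slope is easier to interpret after back-transforming it. The sketch below is an addition to the original analysis: it converts the slope into a multiplicative change in ozone per one-degree rise in temperature. The names `fit_log`, `multiplier`, and `pct_change` are hypothetical, chosen for this example.

```r
# Same transformed model as airqualityTrans.LM, refit with data = so the
# coefficient is simply named "Temp".
fit_log <- lm(log10(Ozone + 0.0001) ~ Temp, data = airquality)

slope <- coef(fit_log)[["Temp"]]     # change in log10(ozone) per degree

# A slope b on the log10 scale means ozone is multiplied by 10^b for each
# one-degree increase; with the reported slope of 0.029316 this is about 1.07.
multiplier <- 10^slope
pct_change <- (multiplier - 1) * 100

# Back-transformed 95% confidence interval for the multiplier.
ci_multiplier <- 10^confint(fit_log, "Temp")

print(round(c(multiplier = multiplier, pct_change = pct_change), 3))
print(ci_multiplier)
```

Read this way, the fitted model implies roughly a 7% increase in ozone for each additional degree of temperature, within the range of temperatures observed.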
plot(airquality$OzoneLog~airquality$Temp)
abline(airqualityTrans.LM)
Please turn in your homework via Sakai by saving and submitting an R Markdown PDF or HTML file from R Pubs!