A: Plot your data first (as always). Be sure to include an informative figure caption.

data("airquality")
summary(airquality)
##      Ozone           Solar.R           Wind             Temp      
##  Min.   :  1.00   Min.   :  7.0   Min.   : 1.700   Min.   :56.00  
##  1st Qu.: 18.00   1st Qu.:115.8   1st Qu.: 7.400   1st Qu.:72.00  
##  Median : 31.50   Median :205.0   Median : 9.700   Median :79.00  
##  Mean   : 42.13   Mean   :185.9   Mean   : 9.958   Mean   :77.88  
##  3rd Qu.: 63.25   3rd Qu.:258.8   3rd Qu.:11.500   3rd Qu.:85.00  
##  Max.   :168.00   Max.   :334.0   Max.   :20.700   Max.   :97.00  
##  NA's   :37       NA's   :7                                       
##      Month            Day      
##  Min.   :5.000   Min.   : 1.0  
##  1st Qu.:6.000   1st Qu.: 8.0  
##  Median :7.000   Median :16.0  
##  Mean   :6.993   Mean   :15.8  
##  3rd Qu.:8.000   3rd Qu.:23.0  
##  Max.   :9.000   Max.   :31.0  
## 
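plot(Ozone ~ Temp, data = airquality,
     xlab = "Temperature (degrees F)", ylab = "Ozone (ppb)",
     main = "Ozone vs. temperature, New York, May-September 1973")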

B: Are both of your continuous variables normally distributed? Provide two pieces of evidence that support your normality assessment for each variable.

hist(airquality$Temp) # histogram of Temp

hist(airquality$Ozone) # histogram of Ozone

qqnorm(airquality$Temp) # Q-Q plot of Temp
qqline(airquality$Temp)

qqnorm(airquality$Ozone) # Q-Q plot of Ozone
qqline(airquality$Ozone)
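
The histograms and Q-Q plots above are visual evidence. As a possible numerical supplement, a Shapiro-Wilk test can be run on each variable. This is only a sketch and was not part of the original answer; the names `sw_temp` and `sw_ozone` are made up for illustration.

```r
# Sketch: Shapiro-Wilk normality tests (supplementary, not in the original answer)
data("airquality")

# shapiro.test() ignores missing values in its input vector
sw_temp  <- shapiro.test(airquality$Temp)
sw_ozone <- shapiro.test(airquality$Ozone)

# A small p-value suggests the variable departs from a normal distribution
print(sw_temp)
print(sw_ozone)
```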

C: Did you transform one or more of your variables? If so, state which transformation you used. Provide two pieces of evidence that your data more closely approximate a normal distribution. If not, state why you did not transform the data.

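Yes. Ozone was transformed with log10 (a constant of 0.0001 was added before taking the log). Temp was not transformed.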

airquality$OzoneLog<-log10(airquality$Ozone+0.0001)
hist(airquality$OzoneLog)
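
Question C asks for two pieces of evidence, and only the histogram is shown. A Q-Q plot of the transformed variable could serve as the second one. The sketch below repeats the transformation used above so that it runs on its own; `ozone_log` is an illustrative name.

```r
# Sketch: Q-Q plot of log10-transformed Ozone as a second piece of evidence
data("airquality")

# Same transformation as in the answer above (including the 0.0001 offset)
ozone_log <- log10(airquality$Ozone + 0.0001)

qqnorm(ozone_log, main = "Normal Q-Q plot of log10(Ozone)")
qqline(ozone_log)
```

Points that fall closer to the reference line than in the raw Ozone Q-Q plot would support the transformation.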

D: Create a linear regression object with the appropriate data (raw or transformed). Be sure to place the variables on the correct axes. Present your code and output in your R Markdown file.

# Linear model
airquality.LM <- lm(OzoneLog ~ Temp, data = airquality)
plot(airquality.LM) # Assumption plots - run in the R console

summary(airquality.LM)
## 
## Call:
## lm(formula = OzoneLog ~ Temp, data = airquality)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.93139 -0.14373  0.01286  0.15855  0.64893 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.798204   0.195865  -4.075 8.53e-05 ***
## Temp         0.029316   0.002497  11.741  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.254 on 114 degrees of freedom
##   (37 observations deleted due to missingness)
## Multiple R-squared:  0.5473, Adjusted R-squared:  0.5434 
## F-statistic: 137.8 on 1 and 114 DF,  p-value: < 2.2e-16
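
Because the response is on a log10 scale, the Temp slope is easier to read after back-transforming it. The sketch below refits the same model and converts the slope into a multiplicative change in Ozone per degree; `fit`, `slope`, `multiplier`, and `pct_change` are illustrative names, not part of the original answer. With the slope reported above (0.029316), the multiplier comes out near 1.07, or roughly a 7% increase in Ozone per degree.

```r
# Sketch: interpret the log10-scale slope as a percent change in Ozone
data("airquality")
airquality$OzoneLog <- log10(airquality$Ozone + 0.0001)

fit <- lm(OzoneLog ~ Temp, data = airquality)

slope      <- coef(fit)[["Temp"]]    # change in log10(Ozone) per degree F
multiplier <- 10^slope               # multiplicative change in Ozone per degree F
pct_change <- (multiplier - 1) * 100 # same change expressed as a percentage

print(c(slope = slope, multiplier = multiplier, pct_change = pct_change))
```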

E: Did you accept or reject the null hypothesis? Are the results statistically significant? Provide and interpret two graphs as evidence that the residuals meet the assumptions of the linear model.

#I reject the null hypothesis that the slope between ozone and temperature equals zero. The results are statistically significant (p < 0.001).
plot(OzoneLog ~ Temp, data = airquality) # original x-y plot

abline(airquality.LM) # regression line
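
Question E asks for two graphs about the residuals. One way to produce them is with the `which` argument of `plot()` for `lm` objects, which selects individual diagnostic plots. This is a sketch that refits the model so it runs on its own; `old_par` is an illustrative name.

```r
# Sketch: two residual diagnostic plots for the fitted model
data("airquality")
airquality$OzoneLog <- log10(airquality$Ozone + 0.0001)
airquality.LM <- lm(OzoneLog ~ Temp, data = airquality)

old_par <- par(mfrow = c(1, 2))  # two panels side by side
plot(airquality.LM, which = 1)   # Residuals vs Fitted: look for even scatter around zero
plot(airquality.LM, which = 2)   # Normal Q-Q: look for points close to the line
par(old_par)                     # restore the previous plotting layout
```

Residuals that scatter evenly around zero in the first panel suggest constant variance, and residuals that follow the line in the second panel suggest approximate normality.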

F: Summarize your results in a paragraph similar to the example in the "Reporting Your Results" section. Be sure to also provide a final graph of your data including a best fit line.

#A significant positive relationship exists between log10-transformed ozone and temperature in the airquality dataset (linear model: F = 137.8, df = 1, 114, p < 0.001, multiple R-squared = 0.55). Ozone was log10-transformed so that the residuals more closely approximate normality.
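
Question F also asks for a final graph with a best fit line. A minimal sketch follows; the axis labels, title, and line styling are choices made for illustration, and `fit` is an illustrative name. In R Markdown, a caption could be supplied through the `fig.cap` chunk option.

```r
# Sketch: final figure of log10(Ozone) against Temp with the fitted line
data("airquality")
airquality$OzoneLog <- log10(airquality$Ozone + 0.0001)
fit <- lm(OzoneLog ~ Temp, data = airquality)

plot(OzoneLog ~ Temp, data = airquality,
     xlab = "Temperature (degrees F)",
     ylab = "log10(Ozone, ppb)",
     main = "Ozone increases with temperature (New York, 1973)")
abline(fit, col = "red", lwd = 2)  # best fit line from the linear model
```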

Please turn in your homework via Sakai by saving and submitting an R Markdown PDF or HTML file from RPubs!