mod <- lm(Temp ~ Solar.R, data=aqc)
summary(mod)
##
## Call:
## lm(formula = Temp ~ Solar.R, data = aqc)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -19.735  -6.292   1.080   6.231  18.648
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) 72.110720   1.970502  36.595  < 2e-16 ***
## Solar.R      0.030747   0.009571   3.212  0.00173 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.15 on 109 degrees of freedom
## Multiple R-squared: 0.08649, Adjusted R-squared: 0.07811
## F-statistic: 10.32 on 1 and 109 DF, p-value: 0.001731
This is the output of a simple linear regression model. The coefficients are the first thing to look at, because they define the fitted model. In this case, the coefficient for ‘Solar.R’ is \(0.030747\) and the intercept is \(72.110720\), so the fitted model is: \(\widehat{Temp} = 72.110720 + 0.030747*(Solar.R)\)
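As a minimal sketch, the fitted equation can be rebuilt from the model object rather than copied by hand. It assumes `aqc` is the built-in `airquality` data with incomplete rows dropped (the definition of `aqc` is not shown in this section), and the Solar.R value of 200 is an arbitrary illustration.

```r
# Assumption: aqc is airquality with rows containing NA removed.
aqc <- na.omit(airquality)
mod <- lm(Temp ~ Solar.R, data = aqc)

# Named vector holding the intercept and the Solar.R slope.
b <- coef(mod)

# Prediction by hand for a hypothetical Solar.R value of 200.
by_hand <- unname(b["(Intercept)"] + b["Solar.R"] * 200)

# The same prediction obtained through predict().
by_predict <- unname(predict(mod, newdata = data.frame(Solar.R = 200)))

print(c(by_hand = by_hand, by_predict = by_predict))
```

Both values should agree, which confirms that the model is just the intercept plus the slope times Solar.R.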
Graph of Simple Linear Regression
This graph overlays the regression line on the scatter plot of Solar.R and Temp, showing how the line follows the general trend of the data points.
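One possible way to draw such a graph is sketched below. It is not necessarily the code behind the original figure; it assumes `aqc` is `airquality` with incomplete rows removed and uses `geom_smooth()` from ggplot2 to add the least-squares line.

```r
library(ggplot2)

# Assumption: aqc is airquality with rows containing NA removed.
aqc <- na.omit(airquality)

# Scatter plot of Temp against Solar.R with the fitted regression line.
# se = FALSE hides the confidence band so only the line is drawn.
p <- ggplot(aqc, aes(Solar.R, Temp)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

print(p)
```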
Interpretation of Coefficients
- Simple linear regression models have the form \(\hat{y} = \beta_0+\beta_1x\).
- \(\beta_0\) is the constant (intercept) of the model, and \(\beta_1\) is the coefficient of \(x\), the independent variable.
- To interpret \(\beta_0\), we would say “If \(x\) is \(0\), then the predicted \(y\) is equal to \(\beta_0\)”.
- To interpret \(\beta_1\), we would say “A one-unit increase in \(x\) is associated with a \(\beta_1\) change in the predicted \(y\)”.
- When giving a formal interpretation of \(\beta_0\) and \(\beta_1\), use the actual variable names instead of \(x\) and \(y\).
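The two interpretations above can be checked numerically. The sketch below assumes `aqc` is `airquality` with incomplete rows removed, and the Solar.R values 0, 100 and 101 are arbitrary choices for illustration.

```r
# Assumption: aqc is airquality with rows containing NA removed.
aqc <- na.omit(airquality)
mod <- lm(Temp ~ Solar.R, data = aqc)

# Predicted Temp at Solar.R = 0, 100 and 101 (hypothetical values).
preds <- unname(predict(mod, newdata = data.frame(Solar.R = c(0, 100, 101))))

# Prediction at Solar.R = 0: this is the intercept, beta_0.
at_zero <- preds[1]

# Change in predicted Temp for a one-unit increase in Solar.R: the slope, beta_1.
one_unit_change <- preds[3] - preds[2]

print(c(at_zero = at_zero, one_unit_change = one_unit_change))
print(coef(mod))
```

The first printed value should match the intercept and the second should match the Solar.R coefficient.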
Strength of Models
- When judging how well a model works, the main things to look at are the \(r^2\) statistic and the significance of the independent variable.
- \(r^2\) measures how well the model fits the data: it is the proportion of the variation in \(y\) that the model explains. The closer the actual data points are to the fitted line, the closer \(r^2\) is to 1.
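To make the definition concrete, here is a small sketch that computes \(r^2\) by hand as one minus the ratio of residual variation to total variation, and compares it with the value reported by `summary()`. It assumes `aqc` is `airquality` with incomplete rows removed.

```r
# Assumption: aqc is airquality with rows containing NA removed.
aqc <- na.omit(airquality)
mod <- lm(Temp ~ Solar.R, data = aqc)

# Variation left over after fitting the line (residual sum of squares).
ss_res <- sum(residuals(mod)^2)

# Total variation of Temp around its mean (total sum of squares).
ss_tot <- sum((aqc$Temp - mean(aqc$Temp))^2)

# Proportion of variation explained by the model.
r2_by_hand <- 1 - ss_res / ss_tot

# The value R reports as "Multiple R-squared".
r2_reported <- summary(mod)$r.squared

print(c(by_hand = r2_by_hand, reported = r2_reported))
```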
ggplot(aqc, aes(Solar.R, Temp)) + geom_point()

We know from the results of our regression model that the \(r^2\) statistic is about 0.086, which is very low. This is evident in the scatter plot: the points are widely scattered and do not show a strong pattern.
ggplot(aqc, aes(Ozone, Temp)) + geom_point()

From this graph, we see a clear upward trend that looks fairly strong, and the data points are closer together. We can expect a much higher \(r^2\) statistic for this model than for the previous one.
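That expectation can be checked by fitting both models and comparing their \(r^2\) values. This is a sketch that assumes `aqc` is `airquality` with incomplete rows removed; the names `mod_solar` and `mod_ozone` are introduced here for illustration only.

```r
# Assumption: aqc is airquality with rows containing NA removed.
aqc <- na.omit(airquality)

# Fit one simple regression per predictor (illustrative names).
mod_solar <- lm(Temp ~ Solar.R, data = aqc)
mod_ozone <- lm(Temp ~ Ozone, data = aqc)

# Collect the r-squared of each model for a side-by-side comparison.
r2 <- c(
  Solar.R = summary(mod_solar)$r.squared,
  Ozone   = summary(mod_ozone)$r.squared
)

print(round(r2, 3))
```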
Strength of Models pt.2
- Checking the significance of the independent variable is an important part of checking the validity of a model.
- This is a hypothesis test of whether the value of \(\beta_1\) is actually 0.
- The output of the regression model tells us most of what we need to know.
- This hypothesis can be checked like other tests: with the t-statistic, the p-value, or a confidence interval. The R summary output provides all of these except the confidence interval.
- If the null hypothesis \(\beta_1 = 0\) is not rejected, the model is not trustworthy, because the data do not provide enough evidence that \(x\) has any effect on \(y\).
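The test statistics can be pulled straight from the model object, and the confidence interval that the summary omits can be obtained with `confint()`. The sketch below assumes `aqc` is `airquality` with incomplete rows removed and uses a 5% significance level, which is a conventional choice rather than one stated in this section.

```r
# Assumption: aqc is airquality with rows containing NA removed.
aqc <- na.omit(airquality)
mod <- lm(Temp ~ Solar.R, data = aqc)

# Coefficient table: estimate, standard error, t value and p-value.
coef_table <- coef(summary(mod))
t_value <- coef_table["Solar.R", "t value"]
p_value <- coef_table["Solar.R", "Pr(>|t|)"]

# 95% confidence interval for the slope (not shown by summary()).
ci <- confint(mod, "Solar.R", level = 0.95)

# Two equivalent ways to decide whether to reject beta_1 = 0 at the 5% level.
rejects_null <- p_value < 0.05
ci_excludes_zero <- ci[1, 1] > 0 || ci[1, 2] < 0

print(coef_table)
print(ci)
print(c(rejects_null = rejects_null, ci_excludes_zero = ci_excludes_zero))
```

The p-value check and the confidence interval check always lead to the same decision, since both are based on the same t distribution.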
Conclusion
- In conclusion, we have seen how regression models can be useful for prediction and for identifying trends in data.
- We also know how to interpret the coefficients of the model.
- Finally, we can identify the key statistics that indicate how trustworthy the model is and how strong its predictive power is.