##    hours score
## 1      1    52
## 2      2    55
## 3      3    61
## 4      4    63
## 5      5    68
## 6      6    72
## 7      7    75
## 8      8    81
## 9      9    85
## 10    10    88
Here is the equation for a simple linear regression model: \[ \hat{y} = b_0 + b_1 x \]
where:
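\(\hat{y}\) is the predicted value of the response variable (here, the exam score)
\(b_0\) is the intercept, the predicted value when \(x = 0\)
\(b_1\) is the slope
\(x\) is the predictor variable (here, hours studied)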
The slope tells us how much the predicted value of the response variable changes for each one-unit increase in \(x\).
In R, the lm() function computes the regression line.
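As a minimal sketch of that step, the data from the table above can be entered by hand and passed to lm(). The data frame name `study` and the object name `fit` are assumptions made for illustration; the original analysis may have used different names.

```r
# Data from the table above; the name `study` is an assumption.
study <- data.frame(
  hours = 1:10,
  score = c(52, 55, 61, 63, 68, 72, 75, 81, 85, 88)
)

# Fit score as a linear function of hours by ordinary least squares.
fit <- lm(score ~ hours, data = study)

# Named vector holding the intercept (b_0) and the slope on hours (b_1).
coef(fit)
```

Calling summary(fit) on the same object additionally reports standard errors and R-squared for the fit.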
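Fitting the model to the data above gives:
\[\hat{y} = 47.53 + 4.08x\]
That is, each additional hour of study is associated with an increase of about 4.08 points in the predicted exam score.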
This plot shows the fitted regression line along with the data.
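A plot of this kind could be drawn with base R graphics roughly as follows. This is a sketch, not necessarily how the original figure was produced; the names `study` and `fit`, the axis labels, and the styling choices are all assumptions.

```r
# Data from the table above; the name `study` is an assumption.
study <- data.frame(
  hours = 1:10,
  score = c(52, 55, 61, 63, 68, 72, 75, 81, 85, 88)
)
fit <- lm(score ~ hours, data = study)

# Scatter plot of the observed data.
plot(study$hours, study$score,
     xlab = "Hours studied", ylab = "Exam score", pch = 19)

# Overlay the fitted regression line using the coefficients stored in `fit`.
abline(fit, col = "blue", lwd = 2)
```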
Residuals are the differences between the observed values and the predicted values.
\[ e_i = y_i - \hat{y}_i \]
A good model has residuals that are relatively small and randomly scattered around 0.
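The residual definition above can be checked directly in R. The sketch below computes the residuals two ways, with resid() and by subtracting the fitted values from the observed scores, and then draws a residuals-versus-fitted plot for inspecting the scatter around 0. The names `study`, `fit`, `res`, and `manual` are assumptions made for illustration.

```r
# Data from the table above; the name `study` is an assumption.
study <- data.frame(
  hours = 1:10,
  score = c(52, 55, 61, 63, 68, 72, 75, 81, 85, 88)
)
fit <- lm(score ~ hours, data = study)

# Residuals as returned by the model object.
res <- resid(fit)

# The same quantity from the definition e_i = y_i - y_hat_i.
manual <- study$score - fitted(fit)

# Print the residuals rounded to two decimals.
round(res, 2)

# Residuals against fitted values, with a dashed reference line at 0.
plot(fitted(fit), res,
     xlab = "Fitted values", ylab = "Residuals", pch = 19)
abline(h = 0, lty = 2)
```

For a least-squares fit that includes an intercept, the residuals sum to zero up to floating-point error, so it is the pattern of the points, not their average, that is informative.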
This interactive plot shows the data points and fitted regression line.
Simple linear regression is a useful statistical method for understanding the relationship between two variables. In this example, exam scores tended to increase with hours studied, by about 4 points for each additional hour.
This model allows us to:
describe relationships between variables
make predictions
identify trends in data
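As a sketch of the prediction use listed above, predict() evaluates the fitted line at new values of the predictor. The names `study`, `fit`, `new_hours`, and `pred` are assumptions, and the chosen values of 5.5 and 7.5 hours are illustrative; both lie inside the observed range of 1 to 10 hours, since predictions far outside that range are less trustworthy.

```r
# Data from the table above; the name `study` is an assumption.
study <- data.frame(
  hours = 1:10,
  score = c(52, 55, 61, 63, 68, 72, 75, 81, 85, 88)
)
fit <- lm(score ~ hours, data = study)

# New predictor values; the column name must match the one used in the model.
new_hours <- data.frame(hours = c(5.5, 7.5))

# Predicted exam scores at 5.5 and 7.5 hours of study.
pred <- predict(fit, newdata = new_hours)
pred
```

The prediction at 5.5 hours equals the mean score of 70, because a least-squares line always passes through the point of means.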
Overall, simple linear regression is a widely applicable method that can be used in a variety of fields and industries.