Introduction

Simple Linear Regression studies the relationship between two variables.

Example question: how does horsepower affect fuel efficiency (mpg) in cars? We will analyze this using the mtcars dataset in R.

What is Simple Linear Regression?

Simple linear regression models the relationship between:

  • one independent variable \(x\)
  • one dependent variable \(y\)

It helps us predict one variable using another.

Regression Model

The regression model is:

\[ y = \beta_0 + \beta_1 x + \varepsilon \]

where:

  • \(y\) = response variable
  • \(x\) = predictor variable
  • \(\beta_0\) = intercept
  • \(\beta_1\) = slope
  • \(\varepsilon\) = error term

Estimated Regression Equation

In practice we estimate the model using data:

\[ \hat{y} = b_0 + b_1 x \]

where:

  • \(b_0\) = estimated intercept
  • \(b_1\) = estimated slope

This equation gives the predicted value of \(y\) for a given value of \(x\).
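As a rough illustration of where \(b_0\) and \(b_1\) come from, the sketch below computes them by hand with the ordinary least squares formulas. It assumes least squares is the estimation method, which is what R's lm() uses. The variable names x, y, b0 and b1 are illustrative, and the data are the mtcars columns described in the Dataset section.

```r
# Ordinary least squares estimates for y-hat = b0 + b1 * x, computed by hand.
# x, y, b0 and b1 are illustrative names, not part of any package.
x <- mtcars$hp   # predictor: horsepower
y <- mtcars$mpg  # response: miles per gallon

# Slope: sum of cross-deviations divided by the sum of squared deviations of x
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)

# Intercept: chosen so the fitted line passes through (mean(x), mean(y))
b0 <- mean(y) - b1 * mean(x)

c(intercept = b0, slope = b1)
```

These values should agree with the coefficients that lm() reports in the Fit the Model section (about 30.10 and -0.068).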

Dataset

We will use the built-in mtcars dataset.

Variables used:

  • hp = horsepower
  • mpg = miles per gallon
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
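
The preview above looks like the output of head(). A minimal sketch that would reproduce it, and then narrow the data to the two variables of interest, might be the following (cars_subset is an illustrative name):

```r
# mtcars ships with base R (datasets package), so nothing needs to be installed.
head(mtcars)  # first six rows, all 11 columns

# Keep only the two columns used in this analysis.
cars_subset <- mtcars[, c("hp", "mpg")]

str(cars_subset)      # structure: 32 observations of 2 numeric variables
summary(cars_subset)  # minimum, quartiles, mean and maximum of hp and mpg
```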

Scatter Plot (ggplot)

This plot shows the relationship between horsepower and fuel efficiency.
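
A scatter plot like the one described could be drawn with something like the sketch below. It assumes the ggplot2 package is installed; the title, axis labels and the object name scatter_plot are illustrative choices rather than the original settings.

```r
library(ggplot2)

# Horsepower on the x-axis (predictor), mpg on the y-axis (response)
scatter_plot <- ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point() +
  labs(
    title = "Fuel efficiency vs. horsepower",
    x = "Horsepower (hp)",
    y = "Miles per gallon (mpg)"
  )

print(scatter_plot)
```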

Regression Line (ggplot)

## `geom_smooth()` using formula = 'y ~ x'

The regression line shows the trend between variables.
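
One way to overlay such a line is geom_smooth() with method = "lm". The sketch below assumes ggplot2 is installed. Whether the original plot kept the confidence band is not known, so se = TRUE is an assumption. Leaving the formula argument unspecified is what triggers the message "`geom_smooth()` using formula = 'y ~ x'".

```r
library(ggplot2)

# method = "lm" draws the least squares line; se = TRUE adds the
# default 95% confidence band around it.
line_plot <- ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", se = TRUE) +
  labs(x = "Horsepower (hp)", y = "Miles per gallon (mpg)")

print(line_plot)
```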

Interactive Plot (Plotly)

## `geom_smooth()` using formula = 'y ~ x'

Plotly makes the graph interactive.

You can hover to see values.
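
A minimal sketch of how a ggplot can be made interactive, assuming both the ggplot2 and plotly packages are installed (static_plot and interactive_plot are illustrative names):

```r
library(ggplot2)
library(plotly)

# Build the static plot first: points plus the least squares line
static_plot <- ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm")

# ggplotly() converts a ggplot object into an interactive plotly widget
interactive_plot <- ggplotly(static_plot)
interactive_plot
```

In an R Markdown document rendered to HTML, printing interactive_plot embeds the widget, and hovering over a point shows its hp and mpg values.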

Fit the Model

## 
## Call:
## lm(formula = mpg ~ hp, data = mtcars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -5.7121 -2.1122 -0.8854  1.5819  8.2360 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 30.09886    1.63392  18.421  < 2e-16 ***
## hp          -0.06823    0.01012  -6.742 1.79e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.863 on 30 degrees of freedom
## Multiple R-squared:  0.6024, Adjusted R-squared:  0.5892 
## F-statistic: 45.46 on 1 and 30 DF,  p-value: 1.788e-07

The lm() function fits a linear regression model.

The summary shows:

  • coefficients
  • significance
  • model fit statistics
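
The output above matches the call lm(formula = mpg ~ hp, data = mtcars). A short sketch of fitting the model and pulling individual pieces out of the summary follows; the object names model and model_summary are illustrative.

```r
# Fit mpg as a linear function of hp
model <- lm(mpg ~ hp, data = mtcars)

# Full report: residuals, coefficient table, R-squared, F-statistic
model_summary <- summary(model)
print(model_summary)

# Individual pieces can also be extracted directly
coef(model)              # intercept and slope estimates
coef(model_summary)      # estimates with std. errors, t values and p-values
model_summary$r.squared  # proportion of variance in mpg explained by hp
```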

Interpretation

From the regression model:

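  • horsepower and mpg have a negative relationship (estimated slope -0.068, p < 0.001)
  • each additional unit of horsepower is associated with about 0.068 fewer miles per gallon on average

This means more powerful cars tend to use more fuel. With an R-squared of 0.60, horsepower explains about 60% of the variation in mpg in this sample.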
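
To put the slope on a more readable scale, the sketch below expresses it per 100 horsepower and requests a confidence interval. It refits the model so that it runs on its own; slope and change_per_100hp are illustrative names.

```r
model <- lm(mpg ~ hp, data = mtcars)

# Estimated change in mpg associated with one extra horsepower
slope <- coef(model)[["hp"]]

# The same effect expressed per 100 horsepower
change_per_100hp <- 100 * slope
change_per_100hp  # roughly -6.8 mpg, given the slope of -0.06823

# 95% confidence intervals for the intercept and slope
confint(model, level = 0.95)
```

The interval for the hp coefficient should exclude zero, consistent with the small p-value in the summary output.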

Conclusion

Simple Linear Regression helps us:

  • understand relationships between variables
  • predict outcomes
  • analyze trends in data

In this example, we predicted fuel efficiency using horsepower.
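
As an illustration of the prediction step, the sketch below applies predict() to a few hypothetical horsepower values. The values 100, 150 and 200 and the names new_cars and predicted_mpg are made up for the example.

```r
model <- lm(mpg ~ hp, data = mtcars)

# Hypothetical cars; the column name must match the predictor in the formula
new_cars <- data.frame(hp = c(100, 150, 200))

# Point predictions from the fitted line b0 + b1 * hp
predicted_mpg <- predict(model, newdata = new_cars)
predicted_mpg  # about 23.3, 19.9 and 16.5 mpg

# Prediction intervals show the uncertainty for an individual car
predict(model, newdata = new_cars, interval = "prediction", level = 0.95)
```

All three values fall inside the range of horsepower observed in mtcars. Predictions far outside that range rely on the straight-line trend continuing to hold.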

Regression models are widely used in statistics, economics, and data science.