---
title: "Multiple Linear Regression"
author: "J Sigma"
editor: source
format:
  html:
    css: styles.css
    toc: true
    toc-depth: 3
    number-sections: false
    theme: cosmo
    code-fold: true
    code-tools: true
    smooth-scroll: true
    embed-resources: true
    page-navigation: true
  pdf:
    documentclass: article
    toc: true
    number-sections: false
execute:
  engine: knitr
  echo: true
  warning: false
  message: false
---
In **simple linear regression**, we try to explain or predict a response variable using **one explanatory variable**.

For example:

- Response variable: Final mark
- Explanatory variable: Lecture attendance

In reality, final marks are rarely determined by lecture attendance alone.
Other factors may matter too:

- study hours
- tutorial attendance
- assignment performance
- sleep
- prior mathematical background

So instead of using one explanatory variable, we use **multiple predictors simultaneously**. This is **multiple linear regression**.

# **Simple Linear Regression vs Multiple Linear Regression**
## **Population and Sample Models**
We have already established, for simple linear regression, that the population model is given by
$$y_{i}=\beta_{0}+\beta_{1}x_{i}+\epsilon_{i}$$
where
- $i$ refers to a specific observation
- $\beta_{0}$ is the intercept parameter
- $\beta_{1}$ is the slope parameter; and
- $\epsilon_{i}$ is the error for the $i$-th observation
The sample model (the fitted regression line) is then given by
$$\hat{y}_{i}=\hat{\beta}_{0}+\hat{\beta}_{1}x_{i}$$
The error term drops out because we assume that the errors are normally distributed with a mean of $0$. Here, there is one dependent variable and only one explanatory variable.

For multiple linear regression, we have more than one explanatory variable, so we extend the population model for simple linear regression:
$$y_{i}=\beta_{0}+\beta_{1}x_{1i}+\beta_{2}x_{2i}+\dots+\beta_{p}x_{pi}+\epsilon_{i}$$
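Here, we have $p$ independent variables, and

- $i$ refers to a specific observation
- $\beta_{0}$ is the intercept parameter
- $\beta_{j}$ is the slope parameter for the $j$-th independent variable ($j=1,\dots,p$)
- $x_{ji}$ is the value of the $j$-th independent variable for the $i$-th observation; and
- $\epsilon_{i}$ is the error for the $i$-th observation
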
For multiple linear regression, the sample model is given by
$$\hat{y}_{i}=\hat{\beta}_{0}+\hat{\beta}_{1}x_{1i}+\hat{\beta}_{2}x_{2i}+\dots+\hat{\beta}_{p}x_{pi}$$
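
The relationship between the population model and the sample model can be sketched with a small simulation. Everything in the chunk below is an assumption made for illustration: the sample size, the "true" parameter values ($\beta_{0}=2$, $\beta_{1}=0.5$, $\beta_{2}=-1.5$), the predictor ranges, and the object names (`sim_data`, `fit_simple_sim`, `fit_multiple_sim`). We generate data from a population model with $p=2$ and then fit a simple and a multiple regression with `lm()`.

```{r}
# Illustrative simulation only: all numbers below are assumed, not taken from real data
set.seed(1)
n_sim <- 200

# Two explanatory variables and normally distributed errors with mean 0
x1  <- runif(n_sim, min = 0, max = 10)
x2  <- runif(n_sim, min = 0, max = 5)
eps <- rnorm(n_sim, mean = 0, sd = 1)

# Population model: y_i = beta_0 + beta_1 * x_1i + beta_2 * x_2i + eps_i
beta0 <- 2
beta1 <- 0.5
beta2 <- -1.5
y <- beta0 + beta1 * x1 + beta2 * x2 + eps

sim_data <- data.frame(y = y, x1 = x1, x2 = x2)

# Sample models: one explanatory variable versus two
fit_simple_sim   <- lm(y ~ x1, data = sim_data)
fit_multiple_sim <- lm(y ~ x1 + x2, data = sim_data)

# Estimated parameters (the beta-hats) for each model
coef(fit_simple_sim)
coef(fit_multiple_sim)
```

The estimates from `fit_multiple_sim` should lie close to the assumed parameter values, but not exactly on them, because they are computed from one random sample.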
Visually, the two models differ as follows:

[Figure: simple linear regression versus multiple linear regression]
## **Estimating the** $\beta$ **Parameters**
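As in simple linear regression, the **ordinary least squares (OLS)** method is used to estimate the $\beta$ parameters: the estimates $\hat{\beta}_{0},\hat{\beta}_{1},\dots,\hat{\beta}_{p}$ are the values that minimise the sum of squared residuals $\sum_{i}(y_{i}-\hat{y}_{i})^{2}$.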
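
To make the least squares idea concrete, the chunk below is a minimal sketch on simulated data (the data, the parameter values, and the names `X_ols`, `beta_hat_ols`, `fit_ols` and `rss_ols` are all assumptions for illustration). It computes the OLS estimates by solving the normal equations $(X^{\top}X)\hat{\beta}=X^{\top}y$, where $X$ is the matrix of predictors with a leading column of ones, and checks that they agree with `lm()`.

```{r}
# Illustrative data only: the coefficients 1, 2 and -3 are assumed values
set.seed(42)
n_ols <- 50
x1 <- rnorm(n_ols)
x2 <- rnorm(n_ols)
y  <- 1 + 2 * x1 - 3 * x2 + rnorm(n_ols)

# Design matrix: a column of ones for the intercept, then the predictors
X_ols <- cbind(1, x1, x2)

# Solve the normal equations (X'X) b = X'y for b
beta_hat_ols <- solve(crossprod(X_ols), crossprod(X_ols, y))

# Residual sum of squares for any candidate parameter vector b
rss_ols <- function(b) sum((y - X_ols %*% b)^2)

# lm() should give the same estimates
fit_ols <- lm(y ~ x1 + x2)
all.equal(as.numeric(beta_hat_ols), as.numeric(coef(fit_ols)))

# Moving away from the OLS estimates can only increase the sum of squares
rss_ols(beta_hat_ols)
rss_ols(beta_hat_ols + 0.1)
```

In practice we let `lm()` do this computation; the matrix version is shown only to connect the fitted coefficients to the quantity being minimised.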
## **Aim of Multiple Linear Regression**
1. To **model** the dependent variable using several independent variables
2. To **predict** the value of the dependent variable from the values of the independent variables
3. To **understand** how the dependent variable changes with the independent variables
We will try to achieve these aims by using the following example problem:
::: {.callout-warning title="Working Example" icon="false"}
A company produces ***Fresh***, a brand of detergent. In order to manage its inventory more effectively and make revenue predictions, this company would like to better predict the demand for Fresh. To develop a prediction model, the company has gathered data concerning demand for Fresh over the last 30 sales periods. The first few lines of the dataset are shown below:
```{r}
#########################
# IMPORTING DATA INTO R
#########################
fresh_data <- data.frame(
demand = c(
7.38, 8.51, 9.52, 7.50, 9.33, 8.28, 8.75, 7.87,
7.10, 8.00, 7.89, 8.15, 9.10, 8.86, 8.90, 8.87,
9.26, 9.30, 8.75, 7.95, 7.65, 7.27, 8.30, 8.50,
8.75, 9.21, 8.27, 7.67, 7.93, 9.26
),
fresh_price = c(
3.85, 3.75, 3.70, 3.70, 3.60, 3.60, 3.60, 3.80,
3.80, 3.85, 3.90, 3.90, 3.70, 3.75, 3.75, 3.80,
3.70, 3.80, 3.70, 3.80, 3.80, 3.75, 3.70, 3.55,
3.60, 3.65, 3.70, 3.75, 3.80, 3.70
),
ads_expenditure = c(
5.5, 6.75, 7.25, 5.5, 7.0, 6.5, 6.75, 5.25,
5.25, 6.0, 6.5, 6.25, 7.0, 6.9, 6.8, 6.8,
7.1, 7.0, 6.8, 6.5, 6.25, 6.0, 6.5, 7.0,
6.8, 6.8, 6.5, 5.75, 5.8, 6.8
),
size = c(
"Small", "Big", "Big", "Small", "Big", "Small", "Big", "Small",
"Small", "Small", "Small", "Small", "Big", "Big", "Big", "Big",
"Big", "Big", "Big", "Small", "Small", "Small", "Small", "Big",
"Big", "Big", "Small", "Small", "Small", "Big"
),
ads_campaign = c(
"B", "B", "B", "A", "C", "A", "C", "C",
"B", "C", "A", "C", "C", "A", "B", "B",
"B", "A", "B", "B", "C", "A", "A", "A",
"A", "B", "C", "B", "C", "C"
),
competitor_price = c(
3.80, 4.00, 4.30, 3.70, 3.85, 3.80, 3.75, 3.85,
3.65, 4.00, 4.10, 4.00, 4.10, 4.20, 4.10, 4.10,
4.20, 4.30, 4.10, 3.75, 3.75, 3.65, 3.90, 3.65,
4.10, 4.25, 3.65, 3.75, 3.85, 4.25
)
)
# first six lines of the data
head(fresh_data)
```
Here, we have

- $y$: demand for Fresh (in 100 000s), stored in `demand`
- $x_{1}$: price of Fresh (in 10 rands), stored in `fresh_price`
- $x_{2}$: ad expenditure to promote Fresh (in 1000 rands), stored in `ads_expenditure`
- $x_{3}$: size of the company (big or small), stored in `size`
- $x_{4}$: ads campaign used by the company (A: TV campaigns; B: mixture of TV and radio ads; C: mixture of TV, radio, magazine and newspaper ads), stored in `ads_campaign`
- $x_{5}$: average competitor price for liquid detergents, stored in `competitor_price`

:::
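
As a sketch of how this notation maps onto R, the chunk below writes the model as a formula and builds the corresponding matrix of predictors. To stay self-contained it re-enters only the first six observations of the working example; the names `fresh_head`, `demand_formula` and `X_fresh` are introduced here for illustration. The two categorical variables are stored as factors, and by default R expands each factor into indicator (dummy) columns.

```{r}
# First six observations of the working example, re-entered for a self-contained sketch
fresh_head <- data.frame(
  demand           = c(7.38, 8.51, 9.52, 7.50, 9.33, 8.28),
  fresh_price      = c(3.85, 3.75, 3.70, 3.70, 3.60, 3.60),
  ads_expenditure  = c(5.5, 6.75, 7.25, 5.5, 7.0, 6.5),
  size             = factor(c("Small", "Big", "Big", "Small", "Big", "Small")),
  ads_campaign     = factor(c("B", "B", "B", "A", "C", "A")),
  competitor_price = c(3.80, 4.00, 4.30, 3.70, 3.85, 3.80)
)

# y on the left of ~, the explanatory variables x1 to x5 on the right
demand_formula <- demand ~ fresh_price + ads_expenditure + size +
  ads_campaign + competitor_price

# Matrix of predictors implied by the formula (intercept column included)
X_fresh <- model.matrix(demand_formula, data = fresh_head)
colnames(X_fresh)
dim(X_fresh)
```

Six observations are too few to estimate this model, so the chunk stops at the predictor matrix. With the full `fresh_data` from the chunk above, the same formula could be passed to `lm()`, for example `lm(demand_formula, data = fresh_data)`.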