- Introduction to regression analysis.
- Importance of simple linear regression in statistics.
2024-02-15
The simple linear regression model can be represented as:
\[ y = \beta_0 + \beta_1 x + \varepsilon \]
where: \(y\) is the dependent variable, \(x\) is the independent variable, \(\beta_0\) is the intercept, \(\beta_1\) is the slope, and \(\varepsilon\) is the error term.
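For reference, the least-squares estimates of the two coefficients have a closed form when there is a single predictor. The notation below (\(\hat{\beta}_0\), \(\hat{\beta}_1\), \(\bar{x}\), \(\bar{y}\), and the sample size \(n\)) does not appear in the original slides and is introduced here only for illustration:
\[ \hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \]
Here \(\bar{x}\) and \(\bar{y}\) are the sample means of the observed \(x_i\) and \(y_i\).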
Let’s consider an example where we want to predict the sales of a product based on the advertising budget spent on it.
# Generate example data
set.seed(123)
budget <- seq(100, 500, by = 50)
sales <- 100 + 0.5 * budget + rnorm(length(budget), mean = 0, sd = 20)
data <- data.frame(Budget = budget, Sales = sales)

# Display the first few rows of the data
head(data)
##   Budget    Sales
## 1    100 138.7905
## 2    150 170.3965
## 3    200 231.1742
## 4    250 226.4102
## 5    300 252.5858
## 6    350 309.3013
model <- lm(Sales ~ Budget, data = data)
summary(model)
##
## Call:
## lm(formula = Sales ~ Budget, data = data)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -23.789 -11.413  -2.626   9.344  33.040
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 110.9711    17.5711   6.316 0.000398 ***
## Budget        0.4723     0.0538   8.778 5.02e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 20.84 on 7 degrees of freedom
## Multiple R-squared:  0.9167, Adjusted R-squared:  0.9048
## F-statistic: 77.05 on 1 and 7 DF,  p-value: 5.017e-05
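The quantities in the summary can also be extracted programmatically. The sketch below is one possible way to do this, not part of the original slides: it rebuilds the same data so that it runs on its own, cross-checks the fitted coefficients against the closed-form least-squares solution, and predicts sales at a budget of 400. The names `slope_hat`, `intercept_hat`, `new_budget`, and `pred`, and the budget value of 400, are assumptions chosen for illustration.

```r
# Recreate the example data and model so this sketch runs on its own.
set.seed(123)
budget <- seq(100, 500, by = 50)
sales <- 100 + 0.5 * budget + rnorm(length(budget), mean = 0, sd = 20)
data <- data.frame(Budget = budget, Sales = sales)
model <- lm(Sales ~ Budget, data = data)

# Estimated intercept and slope as a named numeric vector.
coef(model)

# Cross-check against the closed-form least-squares solution
# (slope = sample covariance / sample variance of the predictor).
slope_hat <- cov(data$Budget, data$Sales) / var(data$Budget)
intercept_hat <- mean(data$Sales) - slope_hat * mean(data$Budget)
all.equal(unname(coef(model)), c(intercept_hat, slope_hat))

# 95% confidence intervals for both coefficients.
confint(model, level = 0.95)

# Predicted sales, with a 95% prediction interval, at a hypothetical budget of 400.
new_budget <- data.frame(Budget = 400)
pred <- predict(model, newdata = new_budget, interval = "prediction")
pred
```

A prediction interval is used rather than a confidence interval because the aim here is to bound a single future sales figure, not the mean response at that budget.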
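# plotly provides plot_ly(), add_trace(), layout(), and the %>% pipe
library(plotly)

# Scatter plot of the observations with the fitted regression line overlaid
plot_ly(data, x = ~Budget, y = ~Sales, type = "scatter", mode = "markers", name = "Data") %>%
  add_trace(x = ~Budget, y = predict(model), mode = "lines", name = "Regression Line") %>%
  layout(title = "Fitted Regression Line",
         xaxis = list(title = "Advertising Budget"),
         yaxis = list(title = "Sales"))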
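In this presentation, we introduced simple linear regression, fitted the model to an example data set, and examined the results. The estimated slope of 0.4723 indicates that each additional unit of advertising budget is associated with about 0.47 more units of sales, and the model accounts for about 92% of the variation in sales (R-squared of 0.9167).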