Linear regression is a statistical method used to predict the value of a dependent variable from an independent variable by fitting a straight line through the data points, assuming that the relationship between them is linear.
2024-11-18
We tend to use linear regression when the relationship between the variables already looks linear, especially when we want to estimate data points that are missing. We can also use it to determine how strong the relationship between two variables is.
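The mathematical formula for simple linear regression is
\[y = \beta_0 + \beta_1 X_1 + \varepsilon\]
where \(\beta_0\) is the intercept, \(\beta_1\) is the slope, and \(\varepsilon\) is an error term assumed to be normally distributed with mean zero and constant variance:
\[\varepsilon \sim \mathcal{N}(0,\sigma^2)\]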
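For reference, the coefficients are typically estimated by ordinary least squares, which is what R's `lm()` does. Writing \(x_i\) and \(y_i\) for the \(i\)-th observed values of \(X_1\) and \(y\), and \(\bar{x}\), \(\bar{y}\) for their sample means (notation introduced here, not in the original post), the estimates that minimise the sum of squared residuals are
\[\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}\]
so the fitted line always passes through the point \((\bar{x}, \bar{y})\).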
| YearsExperience | Salary |
|---|---|
| 1.2 | 39344 |
| 1.4 | 46206 |
| 1.6 | 37732 |
| 2.1 | 43526 |
| 2.3 | 39892 |
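    library(plotly)  # provides plot_ly(), add_lines(), layout(), config() and %>%

    # salary_data is assumed to be loaded already as a data frame
    x <- salary_data$YearsExperience
    y <- salary_data$Salary

    # Fit the simple linear regression of salary on years of experience
    mod <- lm(y ~ x)

    # Axis titles and fonts
    xax <- list(
      title = "Years Experience",
      titlefont = list(family = "Modern Computer Roman")
    )
    yax <- list(
      title = "Salary",
      titlefont = list(family = "Modern Computer Roman")
    )

    # Scatter plot of the data with the fitted regression line on top
    graph <- plot_ly(salary_data, x = x, y = y, type = "scatter", mode = "markers") %>%
      add_lines(x = x, y = fitted(mod)) %>%
      layout(xaxis = xax, yaxis = yax)

    config(graph, displaylogo = FALSE)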
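The table below shows the estimated coefficients of the fitted line
\[\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 X_1\]
The `(Intercept)` row is the estimate of \(\beta_0\), and the `x` row is the estimate of \(\beta_1\), the expected change in salary for each additional year of experience.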
    ##              Estimate Std. Error  t value     Pr(>|t|)
    ## (Intercept) 24848.204  2306.6537 10.77240 1.816526e-11
    ## x            9449.962   378.7546 24.95009 1.143068e-20
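To make the table concrete, here is a minimal sketch that plugs the reported estimates into the fitted line and reproduces the t value for the slope. The numbers are copied from the output above. The helper `predict_salary()` and the variable names are illustrative assumptions, not part of the original analysis.

```r
# Estimates and standard error copied from the coefficient table above
beta0    <- 24848.204   # (Intercept)
beta1    <- 9449.962    # slope for x (years of experience)
se_beta1 <- 378.7546    # standard error of the slope

# Hypothetical helper: point prediction from the fitted line
predict_salary <- function(years) {
  beta0 + beta1 * years
}

# Predicted salary for 5 years of experience (roughly 72098)
predict_salary(5)

# The t value in each row is the estimate divided by its standard error
t_slope <- beta1 / se_beta1
t_slope
```

With the fitted model object available, `predict(mod, newdata = data.frame(x = 5))` should give the same point prediction. Predictions far outside the observed range of experience rest entirely on the linearity assumption.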
In conclusion, there is a strong relationship between years of experience and salary: each additional year of experience is associated with an increase of about 9,450 in salary, and the p-value for the slope is roughly 1.1e-20.

Thus we can fit a line that gives us an idea of what values of the dependent variable to expect for given values of the independent variable.