- Statistical model to represent/estimate relationship between one or more variables
- Can take unknown dependent variable and predict it by using the known independent variable
How does this relate to Data Science?
We can use a combination of statistics and coding to calculate linear regression to predict useful information, given what we already know.
Examples:
- using rainfall to predict soil erosion
- using age to predict income
- using height to predict weight