Nishant Upadhyay
21/11/2015 2.30 pm
Correlation: a bivariate analysis that describes the strength of the linear association between two numerical variables.
Correlation has two properties: strength and direction.
Correlation is not causation.
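To make "strength and direction" concrete, here is a minimal sketch that computes the Pearson correlation coefficient by hand. The height values are made up purely for illustration, and the helper name `pearson_r` is hypothetical, not something from the original slides. The sign of r gives the direction; how close |r| is to 1 gives the strength.

```python
import numpy as np

# Hypothetical heights in inches, invented for illustration only.
father = np.array([65.0, 67.0, 68.0, 70.0, 72.0, 74.0])
son = np.array([66.5, 67.5, 68.0, 69.5, 70.5, 72.0])


def pearson_r(x, y):
    """Pearson correlation: co-variation scaled by the spread of x and y."""
    dx = x - x.mean()
    dy = y - y.mean()
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))


r = pearson_r(father, son)
# Direction comes from the sign of r, strength from its absolute value.
direction = "positive" if r > 0 else "negative" if r < 0 else "none"
print(f"r = {r:.3f}, direction: {direction}, strength |r| = {abs(r):.3f}")
```

A high r here only says the two columns move together along a line; it says nothing about why.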
Let's take Shaquille O'Neal as an example. Shaq is really tall: 7 ft 1 in, to be exact (for you metric fans, that's about 2.16 meters).
Galton called this phenomenon regression, as in “A father's son's height tends to regress (or drift) towards the mean (average) height.”
Linear regression is an approach to model a relationship between a dependent variable (y) and one or more independent variables (x).
\[ Y_i=\beta_0 + \beta_1 X_i \]
A very simple example helps to show how the relationship between two variables works. We will explore the relationship between a father's height and his son's height.
Regression fits a single line through the scatter plot using the principle of least squares: it minimises the sum of squared errors, where each error is the vertical distance between a point and the line.
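The least-squares idea can be sketched in a few lines. This is an illustrative example, not the slides' own code: the heights are invented, and `fit_line` is a hypothetical helper that applies the standard closed-form estimates for the slope and intercept of a simple linear regression.

```python
import numpy as np

# Hypothetical heights in inches, invented for illustration only.
father = np.array([65.0, 67.0, 68.0, 70.0, 72.0, 74.0])
son = np.array([66.5, 67.5, 68.0, 69.5, 70.5, 72.0])


def fit_line(x, y):
    """Return (intercept, slope) that minimise the sum of squared errors."""
    dx = x - x.mean()
    slope = float((dx * (y - y.mean())).sum() / (dx ** 2).sum())
    intercept = float(y.mean() - slope * x.mean())
    return intercept, slope


b0, b1 = fit_line(father, son)

# Errors (residuals) are the vertical gaps between each point and the line.
residuals = son - (b0 + b1 * father)
sse = float((residuals ** 2).sum())
print(f"intercept = {b0:.3f}, slope = {b1:.3f}, SSE = {sse:.3f}")
```

The fitted `b0` and `b1` play the roles of the intercept and slope in the line equation, and `sse` is the quantity the fit makes as small as possible.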
A critical question to ask at this point: why this red line? Why not the green or the blue line?
How can we claim that this line is the best, and what does it mean to be the best-fit line?
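One way to answer this is to score each candidate line by its sum of squared errors (SSE) and see which is smallest. The sketch below is only an illustration: the data is invented, and because the original plot is not reproduced here, the "green" and "blue" lines are assumed stand-ins (son equals father, and a flat line at the sons' mean height).

```python
import numpy as np

# Hypothetical heights in inches, invented for illustration only.
father = np.array([65.0, 67.0, 68.0, 70.0, 72.0, 74.0])
son = np.array([66.5, 67.5, 68.0, 69.5, 70.5, 72.0])


def sse(intercept, slope, x, y):
    """Sum of squared vertical distances between the points and a line."""
    return float(((y - (intercept + slope * x)) ** 2).sum())


# Least-squares estimates (closed form for one predictor).
dx = father - father.mean()
slope = float((dx * (son - son.mean())).sum() / (dx ** 2).sum())
intercept = float(son.mean() - slope * father.mean())

# Candidate lines as (intercept, slope); green and blue are assumed examples.
candidates = {
    "red (least squares)": (intercept, slope),
    "green (son = father)": (0.0, 1.0),
    "blue (flat at mean)": (float(son.mean()), 0.0),
}

results = {name: sse(b0, b1, father, son) for name, (b0, b1) in candidates.items()}
best = min(results, key=results.get)

for name, value in results.items():
    print(f"{name}: SSE = {value:.3f}")
print(f"smallest SSE: {best}")
```

"Best" in this sense means exactly one thing: no other straight line gives a smaller SSE on this data.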
One day, Simba, you will be 4' 10".
Concepts at this stage will be confusing or nagging, which is a good thing. Explore, read, ask, collaborate.
Reference books: