This practice focuses on the simple linear regression model. A regression model helps answer four broad questions: Is there an association between x and y? What is the nature of that association? What is the expected response for a new predictor value? And what is the expected change in the mean response for each unit change in the predictor?
In this practice, we will work with the Animals dataset from the MASS package.
The simple linear regression model follows the mathematical equation:
\[ y_{i} = E(Y_{i}) +\epsilon_{i} = \beta_{0} + \beta_{1}{x_{i}} +\epsilon_{i} \ (1) \]
Equation (1) describes the population regression line, but we rarely observe the whole population. Therefore, we have to rely on a sample of data from the population to estimate the population regression line.
We use a similar equation for the estimated (sample) regression line:
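\[ \hat{y}_{i} = b_{0} + b_{1}x_{i} \ (2) \] Here \(\hat{y}_{i}\) is the fitted value for observation i; it carries no error term, because the line gives the estimated mean response. The sample intercept \(b_{0}\) in (2) estimates the population intercept \(\beta_{0}\) in (1), and the sample slope \(b_{1}\) in (2) estimates the population slope \(\beta_{1}\) in (1).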
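For reference, the sample intercept and slope in equation (2), written here as \(b_{0}\) and \(b_{1}\), have a closed form under ordinary least squares. This is the standard textbook result rather than something derived in this practice, and \(\bar{x}\), \(\bar{y}\), and \(n\) (the sample means and sample size) are symbols introduced only for this illustration:

\[ b_{1} = \frac{\sum_{i=1}^{n}(x_{i}-\bar{x})(y_{i}-\bar{y})}{\sum_{i=1}^{n}(x_{i}-\bar{x})^{2}}, \qquad b_{0} = \bar{y} - b_{1}\bar{x} \]

These are the values that minimise the sum of squared differences between the observed \(y_{i}\) and the fitted line.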
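Before fitting a simple linear regression, note that the model rests on the LINE conditions: the mean response is a Linear function of x, and the error terms are Independent, Normally distributed, and have Equal variance. The model should satisfy all of these conditions. To read more about the LINE conditions, see LINE condition (http://www.statisticssolutions.com/assumptions-of-linear-regression/).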
Plotting the relationship between brain and body
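# Get to know the dataset
library(MASS)
head(Animals)
# Plot brain against body on the original scale
plot(brain ~ body, data = Animals, main = "Brain vs Body", col = 4, pch = 16)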

On the original scale there is a weak or no linear relationship between brain and body. But how about applying a log transformation to both variables?
plot(brain~body,data=Animals,main="Brain vs Body",col=4,pch=16,log="xy")

This looks much better: there is a positive linear relationship between the natural log of body and the natural log of brain.
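The visual impression from the two plots can be backed up with a number. The sketch below compares the Pearson correlation on the raw scale and on the log-log scale; the variable names `r_raw` and `r_log` are chosen here for illustration and are not part of the original practice.

```r
library(MASS)  # provides the Animals data

# Pearson correlation on the raw scale and on the log-log scale
r_raw <- cor(Animals$body, Animals$brain)
r_log <- cor(log(Animals$body), log(Animals$brain))

round(c(raw = r_raw, log_log = r_log), 3)
```

In a simple linear regression, the squared correlation between the two variables equals the model's R-squared, so `r_log^2` should agree with the R-squared of a regression of log(brain) on log(body).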
Now we fit a simple linear regression:
m1<-lm(log(brain)~log(body),data=Animals)
summary(m1)
Call:
lm(formula = log(brain) ~ log(body), data = Animals)

Residuals:
    Min      1Q  Median      3Q     Max
-3.2890 -0.6763  0.3316  0.8646  2.5835

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  2.55490    0.41314   6.184 1.53e-06 ***
log(body)    0.49599    0.07817   6.345 1.02e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.532 on 26 degrees of freedom
Multiple R-squared:  0.6076, Adjusted R-squared:  0.5925
F-statistic: 40.26 on 1 and 26 DF,  p-value: 1.017e-06
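The slope in this output answers the fourth question from the introduction, but because both variables are logged it describes relative rather than absolute change. The sketch below shows one way to read it: on the log-log scale, multiplying body by a factor k multiplies the back-transformed prediction of brain by k raised to the slope. The names `b1`, `doubling_factor`, and `one_percent` are illustrative only.

```r
library(MASS)

m1 <- lm(log(brain) ~ log(body), data = Animals)
b1 <- unname(coef(m1)["log(body)"])

# On the log-log scale, multiplying body by k multiplies the predicted brain by k^b1
doubling_factor <- 2^b1     # effect of doubling body
one_percent     <- 1.01^b1  # effect of a 1% increase in body

c(slope = b1, doubling_factor = doubling_factor, one_percent = one_percent)
```

With a slope of about 0.496, doubling body is associated with the predicted brain being multiplied by roughly 1.41, and a 1% increase in body with roughly a 0.5% increase in predicted brain.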
We may also want 95% confidence intervals (CIs) for the model coefficients:
confint(m1,level = 0.95)
                2.5 %    97.5 %
(Intercept) 1.7056829 3.4041133
log(body)   0.3353152 0.6566742
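To see where these limits come from, the intervals can be rebuilt by hand as estimate plus or minus a t quantile times the standard error. This is a minimal sketch that assumes the same model as above; `est`, `se`, `t_crit`, and `manual_ci` are names introduced here for illustration.

```r
library(MASS)

m1 <- lm(log(brain) ~ log(body), data = Animals)

# Pull the estimates and standard errors out of the coefficient table
est <- coef(summary(m1))[, "Estimate"]
se  <- coef(summary(m1))[, "Std. Error"]

# Two-sided 95% interval: use the 97.5% quantile of the t distribution
t_crit <- qt(0.975, df = df.residual(m1))  # 26 residual degrees of freedom here

manual_ci <- cbind(lower = est - t_crit * se, upper = est + t_crit * se)
manual_ci
```

The result should match the output of `confint(m1, level = 0.95)` up to floating-point error.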
Because the p-value for the slope is far below 0.05 (1.02e-06) and its 95% confidence interval (0.335 to 0.657) excludes zero, there is significant evidence of a linear association between the natural log of body and the natural log of brain.
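These p-values and intervals are only trustworthy if the LINE conditions mentioned earlier hold reasonably well. The sketch below shows one common way to check them with base R diagnostics; it is an illustration rather than part of the original practice, and it cannot check independence, which depends on how the data were collected.

```r
library(MASS)

m1 <- lm(log(brain) ~ log(body), data = Animals)

# Residuals vs fitted: look for curvature (Linearity) and funnel shapes (Equal variance)
plot(m1, which = 1)

# Normal Q-Q plot of standardized residuals: look for departures from the line (Normality)
plot(m1, which = 2)

# Shapiro-Wilk test as a rough numeric companion to the Q-Q plot
sw <- shapiro.test(residuals(m1))
sw$p.value
```

Points that sit far from the rest in either plot are worth inspecting individually before relying on the fitted line.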
A main objective of the model is prediction. We will therefore predict brain when body is 32.
new_point <- data.frame(body = 32)
pre_value <- predict(m1, new_point)
# predict() returns the natural log of brain; exp() is the inverse of log(),
# so it converts the prediction back to the brain scale
exp(pre_value)
# about 71.8, because the fitted log value is 4.273885
We can calculate a confidence interval for the mean response at this value (still on the log scale):
predict(m1,new_point,interval = "confidence")
       fit      lwr      upr
1 4.273885 3.676913 4.870857
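We can also calculate a prediction interval for a single new observation at this value. It is wider than the confidence interval because it includes the variability of an individual response around the mean: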
predict(m1,new_point,interval = "prediction")
       fit      lwr      upr
1 4.273885 1.069607 7.478162
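Both intervals above are on the natural-log scale. A minimal sketch for expressing them in the original units of brain is to apply `exp()` to the whole output of `predict()`; the names `ci_brain` and `pi_brain` are introduced here for illustration.

```r
library(MASS)

m1 <- lm(log(brain) ~ log(body), data = Animals)
new_point <- data.frame(body = 32)

# Intervals on the log scale, then exp() to return to the original brain scale
ci_brain <- exp(predict(m1, new_point, interval = "confidence"))
pi_brain <- exp(predict(m1, new_point, interval = "prediction"))

ci_brain
pi_brain
```

Because `exp()` is monotonic, the interval endpoints transform directly. Note that the back-transformed fit is best read as an estimate of the median (geometric mean) of brain at that body value, not its arithmetic mean.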
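Conclusion: the simple linear regression model is useful in many cases. Here, after a log transformation of both variables, it explained about 61% of the variation in log brain (R-squared 0.6076) and let us test for an association, estimate the slope, and make predictions with confidence and prediction intervals.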