Simple Linear Regression

Problem Definition

Check whether height and weight in the dataset below are strongly correlated and whether the data is suitable for a linear regression model.
If so, predict the weight for heights of 160, 170, and 180 cm.

Dataset

# height in cm
hght <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131, 153, 177, 148, 189, 138, 146, 199, 167, 153, 130)
# weight in kg
wght <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48, 65, 84, 59, 93, 49, 55, 79, 75, 66, 49)

Setup

library(ggplot2)
data <- data.frame(hght,wght)
head(data)
##   hght wght
## 1  151   63
## 2  174   81
## 3  138   56
## 4  186   91
## 5  128   47
## 6  136   57
cor.test(data$hght,data$wght)
## 
##  Pearson's product-moment correlation
## 
## data:  data$hght and data$wght
## t = 12.215, df = 18, p-value = 3.788e-10
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.8627911 0.9782375
## sample estimates:
##      cor 
## 0.944644
# Visualise the correlation: scatter plot with a fitted least-squares line
ggplot(data, aes(x = hght, y = wght)) +
  geom_point(shape = 19, colour = 'red') +
  geom_smooth(method = 'lm', formula = y ~ x)

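Observation: Weight increases with height. The correlation is strong and statistically significant (cor = 0.944644, p-value = 3.788e-10), so the data is suitable for a linear regression model.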
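
The figures reported by cor.test can be reproduced from their definitions, which makes it easier to see what the test measures. The following is a minimal sketch, assuming the standard Pearson formulas (covariance divided by the product of the standard deviations, and a t statistic with n - 2 degrees of freedom). The names r_manual, t_stat, and p_value are introduced here for illustration only.

```r
# Same data as in the Dataset section, repeated so the snippet runs on its own
hght <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131, 153, 177, 148, 189, 138, 146, 199, 167, 153, 130)
wght <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48, 65, 84, 59, 93, 49, 55, 79, 75, 66, 49)

n <- length(hght)

# Pearson's r: covariance scaled by both standard deviations
r_manual <- cov(hght, wght) / (sd(hght) * sd(wght))

# t statistic for H0: true correlation is 0, with n - 2 degrees of freedom
t_stat <- r_manual * sqrt(n - 2) / sqrt(1 - r_manual^2)

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df = n - 2)

print(c(r = r_manual, t = t_stat, p = p_value))
```

The printed values should agree with the cor, t, and p-value lines of the cor.test output.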

Linear Model

x <- data$hght
y <- data$wght
model <- lm(y~x)
summary(model)
## 
## Call:
## lm(formula = y ~ x)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -14.1573  -1.7267   0.7701   2.6045   6.2102 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -33.55669    8.25032  -4.067 0.000723 ***
## x             0.63675    0.05213  12.215 3.79e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.846 on 18 degrees of freedom
## Multiple R-squared:  0.8924, Adjusted R-squared:  0.8864 
## F-statistic: 149.2 on 1 and 18 DF,  p-value: 3.788e-10
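
The coefficients in the summary can be checked against the closed-form least-squares estimates for a single predictor. This is a sketch for illustration, not part of the original analysis. The names slope, intercept, and r_squared are assumptions introduced here.

```r
# Same data as in the Dataset section, repeated so the snippet runs on its own
hght <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131, 153, 177, 148, 189, 138, 146, 199, 167, 153, 130)
wght <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48, 65, 84, 59, 93, 49, 55, 79, 75, 66, 49)

# Closed-form least-squares estimates for one predictor
slope <- cov(hght, wght) / var(hght)
intercept <- mean(wght) - slope * mean(hght)

# In simple linear regression, R-squared equals the squared Pearson correlation
r_squared <- cor(hght, wght)^2

print(c(intercept = intercept, slope = slope, r_squared = r_squared))
```

Read this way, the slope says that each additional centimetre of height is associated with roughly 0.64 kg of additional weight. The negative intercept should not be read as a weight at height 0, since that point lies far outside the observed heights of 128 to 199 cm.
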
# predict() matches predictors by name, so the column must be called x as in lm(y ~ x)
test <- data.frame(x = c(160, 170, 180))
predictions <- predict(model, test)
print(predictions)
##        1        2        3 
## 68.32394 74.69148 81.05902
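
A point prediction alone hides how much individual weights scatter around the fitted line. As a hedged sketch, the model can be refit with named variables and predict asked for 95% prediction intervals. The names fit, new_heights, by_hand, and with_interval are introduced here for illustration and do not appear in the original analysis.

```r
# Same data as in the Dataset section, repeated so the snippet runs on its own
hght <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131, 153, 177, 148, 189, 138, 146, 199, 167, 153, 130)
wght <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48, 65, 84, 59, 93, 49, 55, 79, 75, 66, 49)

# Same fit as lm(y ~ x), but with named variables so no column renaming is needed
fit <- lm(wght ~ hght)
new_heights <- data.frame(hght = c(160, 170, 180))

# By hand: intercept + slope * height
b <- coef(fit)
by_hand <- b[["(Intercept)"]] + b[["hght"]] * new_heights$hght

# Point predictions plus 95% prediction intervals for individual weights
with_interval <- predict(fit, new_heights, interval = "prediction", level = 0.95)

print(by_hand)
print(with_interval)
```

The fit column of with_interval should match both by_hand and the predictions printed above, while lwr and upr bound the range in which a single new observation is expected to fall.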