LINEAR REGRESSION

IVY UKIRU S3810139

5/25/2020

INTRODUCTION

Some body measurements can be used to estimate the sizes of other body parts. Hospitals could use such estimates to make faster decisions when sizing equipment, and clothing companies could use them to determine garment sizes. The data were obtained from the MATH 1324 Applied Analytics data repository on Google Drive.

PROBLEM STATEMENT

This investigation seeks to determine whether there is a statistically significant relationship between a person’s chest diameter (dependent variable, y) and height (independent variable, x).

A scatter plot will be used to assess the bivariate relationship.

Simple linear regression will be used to examine the relationship.

Validation of the following assumptions will be carried out:

Independence, linearity, normality of residuals, and homoscedasticity.

SCATTER PLOT

The scatter plot of chest diameter against height exhibits a positive linear relationship.
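A plot of this kind could be drawn along the following lines. This is only a sketch: it assumes `bdims` is already loaded as a data frame with numeric columns `hgt` and `che.di` (the names shown in the model call), and the function name `plot_chest_vs_height` is hypothetical.

```r
# Sketch only: assumes `bdims` is a data frame with numeric columns
# `hgt` (height) and `che.di` (chest diameter).
# `plot_chest_vs_height` is a hypothetical helper name.
plot_chest_vs_height <- function(bdims) {
  # Scatter plot of the dependent variable against the predictor
  plot(che.di ~ hgt, data = bdims,
       xlab = "Height", ylab = "Chest diameter",
       main = "Chest diameter vs height")
  # Overlay the least-squares line to help judge linearity by eye
  abline(lm(che.di ~ hgt, data = bdims), col = "red")
  invisible(NULL)
}
```

Calling `plot_chest_vs_height(bdims)` would then show whether the points follow a roughly straight, upward-sloping band.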

FITTING REGRESSION

## 
## Call:
## lm(formula = che.di ~ hgt, data = bdims)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -6.3102 -1.4326 -0.0696  1.4168  6.8929 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -3.2947     1.7319  -1.902   0.0577 .  
## hgt           0.1827     0.0101  18.082   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.138 on 505 degrees of freedom
## Multiple R-squared:  0.393,  Adjusted R-squared:  0.3918 
## F-statistic:   327 on 1 and 505 DF,  p-value: < 2.2e-16
##                  2.5 %    97.5 %
## (Intercept) -6.6972252 0.1079121
## hgt          0.1628512 0.2025541
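The figures in this output are linked by simple formulas, and recomputing them from the rounded values is a quick sanity check. The sketch below uses only numbers printed above; small differences from the reported values are expected because the printed estimate and standard error are rounded.

```r
# Values copied from the regression output above (rounded as printed)
est      <- 0.1827   # slope estimate for hgt
se       <- 0.0101   # standard error of the slope
df_resid <- 505      # residual degrees of freedom

# t statistic = estimate / standard error (roughly 18.09 from rounded inputs)
t_value <- est / se

# With a single predictor, the F statistic is the square of the slope's t value
f_value <- 18.082^2

# R-squared can be recovered from F and the residual degrees of freedom
r_squared <- f_value / (f_value + df_resid)

# 95% confidence interval: estimate +/- critical t value * standard error
t_crit <- qt(0.975, df_resid)
ci <- est + c(-1, 1) * t_crit * se

print(c(t = t_value, F = f_value, R2 = r_squared))
print(ci)
```

The recomputed values should agree with the reported t value, F statistic, R-squared and confidence limits to within rounding error.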

TESTED ASSUMPTIONS

Residuals vs Fitted / Scale-Location: we can safely assume homoscedasticity.

Normal Q-Q plot: no major deviations from normality.

Residuals vs Leverage: no evidence of influential cases.
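The four diagnostic plots named above are the default panels that base R draws for a fitted `lm` object. A minimal sketch of how they might be produced follows; the helper name `check_assumptions` is hypothetical, and `model` is assumed to be the fitted regression of `che.di` on `hgt`.

```r
# Sketch only: `model` is assumed to be an object returned by lm().
# `check_assumptions` is a hypothetical helper name.
check_assumptions <- function(model) {
  # Arrange the four default diagnostic panels in a 2 x 2 grid
  old_par <- par(mfrow = c(2, 2))
  on.exit(par(old_par))
  # Default panels: Residuals vs Fitted, Normal Q-Q,
  # Scale-Location, Residuals vs Leverage
  plot(model)
  # Return Cook's distances so influential cases can also be checked numerically
  invisible(cooks.distance(model))
}
```

These plots address linearity, normality of residuals, homoscedasticity and influence. Independence is usually argued from how the data were collected rather than read from these panels.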

CONCLUSION

A linear regression model was fitted to predict the dependent variable, chest diameter, using height as a single predictor. Prior to fitting the regression, a scatter plot assessing the bivariate relationship between chest diameter and height was inspected. The scatter plot demonstrated evidence of a positive linear relationship, and non-linear trends were ruled out. The overall regression model was statistically significant, F(1, 505) = 327, p < .001, and explained 39.3% of the variability in chest diameter, R² = .393. The estimated regression equation was che.di = -3.295 + 0.183 * hgt. The positive slope for height was statistically significant, b = 0.183, t(505) = 18.082, p < .001, 95% CI [0.163, 0.203]. Final inspection of the residuals supported normality and homoscedasticity.
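To illustrate how the estimated equation would be used, the sketch below plugs a height into it. The coefficients are the ones reported in the output; the helper name `predict_chest` and the example height of 170 (in the units of `hgt`) are hypothetical choices made only for illustration.

```r
# Sketch only: coefficients are the reported estimates;
# `predict_chest` and the example height are hypothetical.
predict_chest <- function(hgt, intercept = -3.2947, slope = 0.1827) {
  intercept + slope * hgt
}

# Predicted chest diameter for a height of 170
predict_chest(170)

# The slope is the expected change in chest diameter per one-unit increase in height
predict_chest(171) - predict_chest(170)
```

Predictions of this kind are only reasonable for heights within the range observed in the data.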