Call:
lm(formula = Overall_Risk_Score ~ Air_Pollution + Alcohol_Use + 
    Obesity + Occupational_Hazards, data = cancer_data)
Residuals:
      Min        1Q    Median        3Q       Max 
-0.305332 -0.055599 -0.002445  0.056468  0.263308 
Coefficients:
                      Estimate Std. Error t value Pr(>|t|)    
(Intercept)          0.1688402  0.0064234   26.29   <2e-16 ***
Air_Pollution        0.0178606  0.0005837   30.60   <2e-16 ***
Alcohol_Use          0.0134763  0.0005705   23.62   <2e-16 ***
Obesity              0.0103827  0.0006076   17.09   <2e-16 ***
Occupational_Hazards 0.0121961  0.0005797   21.04   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.08296 on 1995 degrees of freedom
Multiple R-squared:  0.5466,    Adjusted R-squared:  0.5457 
F-statistic: 601.2 on 4 and 1995 DF,  p-value: < 2.2e-16
- Air pollution, alcohol use, obesity, and occupational hazards each have a very small p-value (< 2e-16), so each has a statistically significant association with the overall risk score that is very unlikely to be the result of chance alone.
 
- Additionally, all the variables have high t values (17.09 to 30.60). At a significance level of 0.05 with 1995 degrees of freedom, the corresponding cutoff is |t| of about 2, so every coefficient is far beyond it, which is consistent with each risk factor having a legitimate effect on the overall risk score.
 
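- Lastly, the linear model explains about 55% of the variation in the overall risk score (multiple R-squared = 0.5466, adjusted R-squared = 0.5457), leaving roughly 45% unexplained by these four risk factors.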
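
The figures quoted in these bullets can be cross-checked from the printed summary alone, without access to `cancer_data`. The sketch below is only a sanity check under that assumption: the numbers are copied by hand from the output above, and the variable names (`est`, `se`, `t_crit`, and so on) are illustrative, not part of the original analysis. It recomputes each t value as estimate divided by standard error, looks up the two-sided critical t for a 0.05 significance level, and rederives the F-statistic and adjusted R-squared from the multiple R-squared.

```r
# Values copied by hand from the summary() output above (rounded as printed)
est <- c(Air_Pollution = 0.0178606, Alcohol_Use = 0.0134763,
         Obesity = 0.0103827, Occupational_Hazards = 0.0121961)
se  <- c(Air_Pollution = 0.0005837, Alcohol_Use = 0.0005705,
         Obesity = 0.0006076, Occupational_Hazards = 0.0005797)
df_resid <- 1995    # residual degrees of freedom
k        <- 4       # number of predictors (intercept not counted)
r2       <- 0.5466  # multiple R-squared

# t value for each coefficient: estimate / standard error
t_val <- est / se

# Two-sided critical t at alpha = 0.05; this is the "cutoff of about 2"
t_crit <- qt(0.975, df = df_resid)

# Two-sided p-values implied by the t values
p_val <- 2 * pt(abs(t_val), df = df_resid, lower.tail = FALSE)

# Overall F-statistic and adjusted R-squared, both derived from R-squared
f_stat <- (r2 / k) / ((1 - r2) / df_resid)
adj_r2 <- 1 - (1 - r2) * (df_resid + k) / df_resid  # df_resid + k = n - 1

print(round(t_val, 2))
print(t_crit)
print(p_val < 0.05)
print(c(F = f_stat, adj_R2 = adj_r2))
```

Because the inputs are already rounded, the recomputed values should match the reported t values, F-statistic, and adjusted R-squared only up to rounding. The check also makes clear that the p-values and t values are two views of the same evidence rather than independent confirmations.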