library(ggplot2)
library(dplyr)
load("new_data")
str(brfss2013)
## 'data.frame': 491775 obs. of 330 variables:
## $ X_state : Factor w/ 55 levels "0","Alabama",..: 2 2 2 2 2 2 2 2 2 2 ...
## $ fmonth : Factor w/ 12 levels "January","February",..: 1 1 1 1 2 3 3 3 4 4 ...
## $ idate : int 1092013 1192013 1192013 1112013 2062013 3272013 3222013 3042013 4242013 4242013 ...
## $ imonth : Factor w/ 12 levels "January","February",..: 1 1 1 1 2 3 3 3 4 4 ...
## $ iday : Factor w/ 31 levels "1","2","3","4",..: 9 19 19 11 6 27 22 4 24 24 ...
## $ iyear : Factor w/ 2 levels "2013","2014": 1 1 1 1 1 1 1 1 1 1 ...
## $ dispcode : Factor w/ 2 levels "Completed interview",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ seqno : int 2013000580 2013000593 2013000600 2013000606 2013000608 2013000630 2013000634 2013000644 2013001305 2013001338 ...
## $ X_psu : int 2013000580 2013000593 2013000600 2013000606 2013000608 2013000630 2013000634 2013000644 2013001305 2013001338 ...
## $ ctelenum : Factor w/ 1 level "Yes": 1 1 1 1 1 1 1 1 1 1 ...
## $ pvtresd1 : Factor w/ 2 levels "Yes","No": 1 1 1 1 1 1 1 1 1 1 ...
## $ colghous : Factor w/ 1 level "Yes": NA NA NA NA NA NA NA NA NA NA ...
## $ stateres : Factor w/ 1 level "Yes": 1 1 1 1 1 1 1 1 1 1 ...
## $ cellfon3 : Factor w/ 1 level "Not a cellular phone": 1 1 1 1 1 1 1 1 1 1 ...
## $ ladult : Factor w/ 2 levels "Yes, male respondent",..: NA NA NA NA NA NA NA NA NA NA ...
## $ numadult : Factor w/ 19 levels "1","2","3","4",..: 2 2 3 2 2 1 2 1 5 2 ...
## $ nummen : Factor w/ 14 levels "0","1","2","3",..: 2 2 3 2 2 1 2 1 5 2 ...
## $ numwomen : Factor w/ 12 levels "0","1","2","3",..: 2 2 2 2 2 2 2 2 2 2 ...
## $ genhlth : Factor w/ 5 levels "Excellent","Very good",..: 4 3 3 2 3 2 4 3 1 3 ...
## $ physhlth : int 30 0 3 2 10 0 1 5 0 0 ...
## $ menthlth : int 29 0 2 0 2 0 15 0 0 0 ...
## $ poorhlth : int 30 NA 0 0 0 NA 0 10 NA NA ...
## $ hlthpln1 : Factor w/ 2 levels "Yes","No": 1 1 1 1 1 1 1 1 1 1 ...
## $ persdoc2 : Factor w/ 3 levels "Yes, only one",..: 1 1 1 1 1 1 2 1 1 1 ...
## $ medcost : Factor w/ 2 levels "Yes","No": 2 2 2 2 2 2 2 2 2 2 ...
## $ checkup1 : Factor w/ 5 levels "Within past year",..: 1 1 1 2 4 1 1 1 1 1 ...
## $ sleptim1 : int NA 6 9 8 6 8 7 6 8 8 ...
## $ bphigh4 : Factor w/ 4 levels "Yes","Yes, but female told only during pregnancy",..: 1 3 3 3 1 1 1 1 3 3 ...
## $ bpmeds : Factor w/ 2 levels "Yes","No": 1 NA NA NA 2 1 1 1 NA NA ...
## $ bloodcho : Factor w/ 2 levels "Yes","No": 1 1 1 1 1 1 1 1 1 1 ...
## $ cholchk : Factor w/ 4 levels "Within past year",..: 1 1 4 1 2 1 1 1 1 1 ...
## $ toldhi2 : Factor w/ 2 levels "Yes","No": 1 2 2 1 2 1 2 1 1 2 ...
## $ cvdinfr4 : Factor w/ 2 levels "Yes","No": 2 2 2 2 2 2 2 2 2 2 ...
## $ cvdcrhd4 : Factor w/ 2 levels "Yes","No": NA 2 2 2 2 2 2 1 2 2 ...
## $ cvdstrk3 : Factor w/ 2 levels "Yes","No": 2 2 2 2 2 2 2 2 2 2 ...
## $ asthma3 : Factor w/ 2 levels "Yes","No": 1 2 2 2 1 2 2 2 2 2 ...
## $ asthnow : Factor w/ 2 levels "Yes","No": 1 NA NA NA 2 NA NA NA NA NA ...
## $ chcscncr : Factor w/ 2 levels "Yes","No": 2 2 2 2 2 2 2 2 2 2 ...
## $ chcocncr : Factor w/ 2 levels "Yes","No": 2 2 2 2 1 2 2 2 2 2 ...
## $ chccopd1 : Factor w/ 2 levels "Yes","No": 1 2 2 2 2 2 2 2 2 2 ...
## $ havarth3 : Factor w/ 2 levels "Yes","No": 1 2 1 2 2 2 1 1 1 2 ...
## $ addepev2 : Factor w/ 2 levels "Yes","No": 1 1 1 2 2 2 2 2 2 2 ...
## $ chckidny : Factor w/ 2 levels "Yes","No": 1 2 2 2 2 2 2 2 2 2 ...
## $ diabete3 : Factor w/ 4 levels "Yes","Yes, but female told only during pregnancy",..: 3 3 3 3 3 3 3 3 3 3 ...
## $ veteran3 : Factor w/ 2 levels "Yes","No": 2 2 2 2 2 2 2 2 2 2 ...
## $ marital : Factor w/ 6 levels "Married","Divorced",..: 2 1 1 1 1 2 1 3 1 1 ...
## $ children : int 0 2 0 0 0 0 1 0 1 0 ...
## $ educa : Factor w/ 6 levels "Never attended school or only kindergarten",..: 6 5 6 4 6 6 4 5 6 4 ...
## $ employ1 : Factor w/ 8 levels "Employed for wages",..: 7 1 1 7 7 1 1 7 7 5 ...
## $ income2 : Factor w/ 8 levels "Less than $10,000",..: 7 8 8 7 6 8 NA 6 8 4 ...
## $ weight2 : Factor w/ 570 levels "",".b","100",..: 154 30 63 31 169 128 9 1 139 73 ...
## $ height3 : int 507 510 504 504 600 503 500 505 602 505 ...
## $ numhhol2 : Factor w/ 2 levels "Yes","No": 1 2 2 2 2 1 2 2 2 2 ...
## $ numphon2 : Factor w/ 6 levels "1 residential telephone number",..: 2 NA NA NA NA 1 NA NA NA NA ...
## $ cpdemo1 : Factor w/ 2 levels "Yes","No": 1 1 1 1 1 1 1 1 1 1 ...
## $ cpdemo4 : int 10 70 70 75 0 70 40 1 60 50 ...
## $ internet : Factor w/ 2 levels "Yes","No": 1 1 1 1 1 1 1 1 1 1 ...
## $ renthom1 : Factor w/ 3 levels "Own","Rent","Other arrangement": 1 1 1 1 1 1 1 2 1 1 ...
## $ sex : Factor w/ 2 levels "Male","Female": 2 2 2 2 1 2 2 2 1 2 ...
## $ pregnant : Factor w/ 2 levels "Yes","No": NA NA NA NA NA NA 2 NA NA NA ...
## $ qlactlm2 : Factor w/ 2 levels "Yes","No": 1 2 1 2 2 2 1 1 2 2 ...
## $ useequip : Factor w/ 2 levels "Yes","No": 1 2 2 2 2 2 2 2 2 2 ...
## $ blind : Factor w/ 2 levels "Yes","No": 2 2 2 2 2 2 2 2 2 2 ...
## $ decide : Factor w/ 2 levels "Yes","No": 2 2 2 2 2 2 2 2 2 2 ...
## $ diffwalk : Factor w/ 2 levels "Yes","No": 1 2 1 2 2 2 2 1 2 2 ...
## $ diffdres : Factor w/ 2 levels "Yes","No": 2 2 2 2 2 2 2 2 2 2 ...
## $ diffalon : Factor w/ 2 levels "Yes","No": 1 2 2 2 2 2 2 2 2 2 ...
## $ smoke100 : Factor w/ 2 levels "Yes","No": 1 2 1 2 1 2 1 1 2 2 ...
## $ smokday2 : Factor w/ 3 levels "Every day","Some days",..: 3 NA 2 NA 3 NA 3 1 NA NA ...
## $ stopsmk2 : Factor w/ 2 levels "Yes","No": NA NA 1 NA NA NA NA 2 NA NA ...
## $ lastsmk2 : Factor w/ 8 levels "Within the past month",..: 7 NA NA NA 1 NA 5 NA NA NA ...
## $ usenow3 : Factor w/ 3 levels "Every day","Some days",..: 3 3 3 3 3 3 3 3 1 3 ...
## $ alcday5 : int 201 0 220 208 210 0 201 202 101 0 ...
## $ avedrnk2 : int 2 NA 4 2 2 NA 1 1 1 NA ...
## $ drnk3ge5 : int 0 NA 20 0 0 NA 0 0 0 NA ...
## $ maxdrnks : int 2 NA 10 2 3 NA 1 1 2 NA ...
## $ fruitju1 : int 304 305 301 202 0 205 320 0 0 202 ...
## $ fruit1 : int 104 301 203 306 302 206 325 320 101 202 ...
## $ fvbeans : int 303 310 202 202 101 0 330 360 202 203 ...
## $ fvgreen : int 310 203 202 310 310 203 315 315 203 201 ...
## $ fvorang : int 303 202 310 305 303 0 310 325 0 201 ...
## $ vegetab1 : int NA 203 330 204 101 207 310 308 101 203 ...
## $ exerany2 : Factor w/ 2 levels "Yes","No": 2 1 2 1 2 1 1 1 1 1 ...
## $ exract11 : Factor w/ 75 levels "Active Gaming Devices (Wii Fit, Dance, Dance revolution)",..: NA 64 NA 64 NA 6 64 64 7 64 ...
## $ exeroft1 : int NA 105 NA 205 NA 102 220 102 102 220 ...
## $ exerhmm1 : int NA 20 NA 30 NA 15 100 15 100 30 ...
## $ exract21 : Factor w/ 76 levels "Active Gaming Devices (Wii Fit, Dance, Dance revolution)",..: NA 71 NA 75 NA 18 75 75 75 18 ...
## $ exeroft2 : int NA 101 NA NA NA 102 NA NA NA 101 ...
## $ exerhmm2 : int NA 10 NA NA NA 30 NA NA NA 100 ...
## $ strength : int 0 0 0 0 0 0 205 0 102 0 ...
## $ lmtjoin3 : Factor w/ 2 levels "Yes","No": 1 NA 1 NA NA NA 2 1 2 NA ...
## $ arthdis2 : Factor w/ 2 levels "Yes","No": 1 NA 1 NA NA NA 1 2 2 NA ...
## $ arthsocl : Factor w/ 3 levels "A lot","A little",..: 1 NA 2 NA NA NA 3 1 3 NA ...
## $ joinpain : int 7 NA 5 NA NA NA 3 8 4 NA ...
## $ seatbelt : Factor w/ 6 levels "Always","Nearly always",..: 1 1 1 1 1 1 1 1 2 1 ...
## $ flushot6 : Factor w/ 2 levels "Yes","No": 2 1 1 2 2 1 2 1 1 2 ...
## $ flshtmy2 : Factor w/ 26 levels "January 2012",..: NA 10 13 NA NA NA NA 10 10 NA ...
## $ tetanus : Factor w/ 4 levels "Yes, received Tdap",..: 4 1 1 4 4 4 4 4 1 4 ...
## $ pneuvac3 : Factor w/ 2 levels "Yes","No": 1 2 2 2 2 1 2 2 2 2 ...
## [list output truncated]
The data are described as follows (quoted directly from the CDC website): “The Behavioral Risk Factor Surveillance System (BRFSS) is the nation’s premier system of health-related telephone surveys that collect state data about U.S. residents regarding their health-related risk behaviors, chronic health conditions, and use of preventative services. Established in 1984 with 15 states, BRFSS now collects data in all 50 states as well as the District of Columbia and three U.S. territories. BRFSS completes more than 400,000 adult interviews each year, making it the largest continuously conducted health survey system in the world.”
Generalizability: Survey data were collected from adults in all 50 states, the District of Columbia, and U.S. territories, with respondents selected at random by telephone, so results should be broadly generalizable to the U.S. adult population.
Causality: The data are observational; participants were not randomly assigned to treatment and control groups. Causation therefore cannot be inferred; only associations can be measured.
Further, because of the large sample, associations estimated from these data should be more reliable than those from smaller surveys. However, caution is warranted because the information was collected by telephone: some respondents may have been reluctant to share personal information or may not have answered truthfully. Telephone-only collection also excludes anyone without a landline or a cell phone.
The methodology (how the samples were collected) is quoted directly from the CDC website below: “BRFSS is a cross-sectional telephone survey that state health departments conduct monthly over landline telephones and cellular telephones with a standardized questionnaire and technical and methodological assistance from CDC. In conducting the BRFSS landline telephone survey, interviewers collect data from a randomly selected adult in a household. In conducting the cellular telephone version of the BRFSS questionnaire, interviewers collect data from an adult who participates by using a cellular telephone and resides in a private residence or college housing.”
For future reference, it would be useful if the dataset included details about each interview, such as the time of day the data were collected and the duration of the interview. These additional pieces of information would provide further insight into who did or did not take part in the survey.
The BRFSS is an ongoing surveillance system designed to measure behavioral risk factors for the non-institutionalized adult population. The objective is to collect uniform, state-specific data on preventive health practices and risk behaviors that are linked to chronic diseases, injuries, and preventable infectious diseases.
Telephone surveys are the basis of data collection. BRFSS collects health-related information through more than 400,000 telephone surveys of U.S. residents in all 50 states. It is the largest continuously conducted health survey system in the world.
Given the size of the sample and the geographic breadth of the respondents, the data are likely among the most representative surveys of the U.S. population available. As such, inferences made using these data should be generalizable to the larger U.S. population.
Research question 1:
Are sleep time, fruit consumption, and exercise associated with fewer reports of asthma in the past 12 months?
The variables to use for this question are:
asattack: Asthma During Past 12 Months
exerany2: Exercise In Past 30 Days
fruit1: How Many Times Did You Eat Fruit?
sleptim1: How Much Time Do You Sleep
Research question 2:
Are education level and employment status associated with having health insurance coverage?
The variables to use for this question are:
hlthcvrg: Health Insurance Coverage
educa: Education Level
employ1: Employment Status
income2: Income Level
Research question 3:
Among people with arthritis, are limitations on social activities due to joint symptoms less severe for those who exercise, even if they have smoked?
The variables to use for this question are:
smoke100: Smoked At Least 100 Cigarettes
exeroft2: How Many Times Walking, Running, Jogging, Or Swimming
lmtjoin3: Limited Because Of Joint Symptoms
arthsocl: Social Activities Limited Because Of Joint Symptoms
Research question 1:
For this question, we first explore the data:
Are sleep time, fruit consumption, and exercise associated with fewer reports of asthma in the past 12 months?
The variables to use for this question are:
asattack: Asthma During Past 12 Months
exerany2: Exercise In Past 30 Days
fruit1: How Many Times Did You Eat Fruit?
sleptim1: How Much Time Do You Sleep
brfss2013 %>% ggplot(aes(exerany2, asattack)) + geom_point() + ggtitle(" Asthma vs Exercises")
brfss2013 %>% ggplot(aes(fruit1, asattack)) + geom_point() + ggtitle(" Asthma vs Consumption of Fruits")
## Warning: Removed 33798 rows containing missing values (geom_point).
brfss2013 %>% ggplot(aes(sleptim1, asattack)) + geom_point() + ggtitle(" Asthma vs Sleep Time")
## Warning: Removed 7387 rows containing missing values (geom_point).
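Values of fruit1 such as 101, 203, and 320 are not raw counts. Assuming the usual BRFSS frequency coding (1xx = times per day, 2xx = times per week, 3xx = times per month, 0 = never), which should be checked against the codebook, a conversion to times per day could be sketched as follows. The helper name fruit_per_day and the 7-day and 30-day divisors are illustrative choices, not part of the original analysis.

```r
# Hypothetical helper: convert BRFSS-style frequency codes to times per day.
# Assumed coding (verify against the codebook):
#   1xx = times per day, 2xx = times per week, 3xx = times per month, 0 = never.
fruit_per_day <- function(code) {
  unit  <- code %/% 100   # leading digit: 1, 2, or 3
  count <- code %% 100    # last two digits: the reported frequency
  ifelse(code == 0, 0,
    ifelse(unit == 1, count,
      ifelse(unit == 2, count / 7,
        ifelse(unit == 3, count / 30, NA_real_))))
}

# Example on a few codes like those shown in the str() output.
fruit_per_day(c(101, 203, 320, 0, NA))
```

With a conversion like this, the x axis of the fruit plot would be on a single scale rather than mixing daily, weekly, and monthly codes.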
The plots suggest there may be a relationship between these variables, so we fit a model:
newdata <- brfss2013 %>% select(asattack,exerany2,fruit1,sleptim1) %>%
filter(asattack != is.na(asattack)) %>%
filter(exerany2 != is.na(exerany2)) %>%
filter(fruit1 != is.na(fruit1)) %>%
filter(sleptim1 != is.na(sleptim1))
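One caveat about the filter above: x != is.na(x) compares each value with TRUE or FALSE rather than testing for missingness. For a numeric column this also drops zeros (0 != FALSE is FALSE), so rows with fruit1 equal to 0 are removed along with the NAs. A small sketch on made-up values (toy is hypothetical data, not the survey) shows the difference from the usual !is.na(x) test. In the dplyr pipeline the equivalent would be filter(!is.na(fruit1)).

```r
# Toy data (made up) containing a zero and missing values.
toy <- data.frame(fruit1 = c(0, 101, NA, 305), sleptim1 = c(7, NA, 6, 8))

# The comparison used above: 0 != FALSE is FALSE, and NA != TRUE is NA,
# so both the zero and the NA would be dropped by filter().
keep_original <- toy$fruit1 != is.na(toy$fruit1)
keep_original

# The usual missingness test keeps the zero and drops only the NA.
keep_not_na <- !is.na(toy$fruit1)
keep_not_na

# Base R: keep rows with no missing value in any column.
toy_complete <- toy[complete.cases(toy), ]
toy_complete
```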
newdata %>% ggplot(aes(exerany2, asattack)) + geom_point() + ggtitle(" Asthma vs Exercises") + geom_smooth()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
newdata %>% ggplot(aes(fruit1, asattack)) + geom_point() + ggtitle(" Asthma vs Consumption of Fruits") + geom_smooth()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
newdata %>% ggplot(aes(sleptim1, asattack)) + geom_point() + ggtitle(" Asthma vs Sleep Time") + geom_smooth()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
head(newdata)
## asattack exerany2 fruit1 sleptim1
## 1 Yes Yes 101 6
## 2 Yes Yes 315 10
## 3 No No 101 6
## 4 Yes No 305 6
## 5 Yes Yes 314 8
## 6 Yes Yes 205 3
fit <- lm(as.numeric(asattack) ~ as.numeric(exerany2) + fruit1 + sleptim1, data= newdata)
summary(fit)
##
## Call:
## lm(formula = as.numeric(asattack) ~ as.numeric(exerany2) + fruit1 +
## sleptim1, data = newdata)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.7947 -0.4849 -0.3902 0.5135 0.6705
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.5141293 0.0855765 17.693 <2e-16 ***
## as.numeric(exerany2) -0.0504127 0.0344072 -1.465 0.1432
## fruit1 -0.0003278 0.0001851 -1.771 0.0769 .
## sleptim1 0.0151839 0.0086396 1.757 0.0792 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4983 on 907 degrees of freedom
## Multiple R-squared: 0.009833, Adjusted R-squared: 0.006558
## F-statistic: 3.002 on 3 and 907 DF, p-value: 0.02972
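Assuming both Yes/No factors are coded Yes = 1 and No = 2, as for the other Yes/No factors in the str() output, the negative coefficient on exerany2 means that respondents who did not exercise were somewhat more likely to report asthma in the past 12 months, and the positive coefficient on sleptim1 means that more sleep is associated with fewer reports. The fruit1 coefficient is negative, but its values are frequency codes rather than counts, so its sign is hard to interpret. None of the three predictors is significant at the 5% level, and the model explains less than 1% of the variance (R-squared = 0.0098), so these data give only weak evidence of any relationship.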
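Because asattack is a Yes/No outcome, a logistic regression is a common alternative to a linear model on as.numeric() codes. The sketch below uses simulated data, so the data frame sim, its columns, and the effect sizes are made up purely for illustration and say nothing about the BRFSS results.

```r
# Simulated data (hypothetical), reusing the survey's variable names.
set.seed(1)
n <- 500
sim <- data.frame(
  sleptim1 = sample(4:10, n, replace = TRUE),
  exerany2 = factor(sample(c("Yes", "No"), n, replace = TRUE),
                    levels = c("Yes", "No"))
)

# Simulated outcome; the "true" effects here are arbitrary.
p <- plogis(-0.5 + 0.1 * sim$sleptim1 - 0.4 * (sim$exerany2 == "No"))
sim$asattack <- factor(ifelse(runif(n) < p, "Yes", "No"),
                       levels = c("No", "Yes"))

# With a factor response, glm(family = binomial) treats the first level
# ("No") as failure and models the probability of the other level ("Yes").
fit_logit <- glm(asattack ~ exerany2 + sleptim1, data = sim, family = binomial)
summary(fit_logit)

# Exponentiated coefficients are odds ratios, often easier to read.
exp(coef(fit_logit))
```

Keeping exerany2 as a factor lets R build the Yes/No contrast itself, which avoids relying on the 1/2 level codes.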
The residuals of the model:
library(car)
## Warning: package 'car' was built under R version 4.0.2
## Loading required package: carData
##
## Attaching package: 'car'
## The following object is masked from 'package:dplyr':
##
## recode
residualPlots(fit)
## Test stat Pr(>|Test stat|)
## as.numeric(exerany2) -0.5132 0.6079
## fruit1 -0.7166 0.4738
## sleptim1 -0.9648 0.3349
## Tukey test -0.1227 0.9023
Research question 2:
Are education level and employment status associated with having health insurance coverage?
The variables to use for this question are:
hlthcvrg: Health Insurance Coverage
educa: Education Level
employ1: Employment Status
income2: Income Level
newdata2 <- brfss2013 %>% select(hlthcvrg, educa, employ1, income2) %>%
filter(hlthcvrg != is.na(hlthcvrg),
educa != is.na(educa),
employ1 != is.na(employ1),
income2 != is.na(income2))
head(newdata2)
## hlthcvrg educa
## 1 3 7 College 4 years or more (College graduate)
## 2 2 College 1 year to 3 years (Some college or technical school)
## 3 3 College 4 years or more (College graduate)
## 4 3 Grade 12 or GED (High school graduate)
## 5 4 2 College 4 years or more (College graduate)
## 6 1 College 4 years or more (College graduate)
## employ1 income2
## 1 Retired Less than $75,000
## 2 Employed for wages $75,000 or more
## 3 Employed for wages $75,000 or more
## 4 Retired Less than $75,000
## 5 Retired Less than $50,000
## 6 Employed for wages $75,000 or more
newdata2 %>% ggplot(aes(employ1, hlthcvrg, color = educa)) + geom_point() + facet_grid(.~ income2)
Next, we fit the model:
fit2 <- lm(as.numeric(hlthcvrg)~ as.numeric(educa), as.numeric(employ1), as.numeric(income2), data= newdata2)
summary(fit2)
##
## Call:
## lm(formula = as.numeric(hlthcvrg) ~ as.numeric(educa), data = newdata2,
## subset = as.numeric(employ1), weights = as.numeric(income2))
##
## Weighted Residuals:
## Min 1Q Median 3Q Max
## -194.54 10.00 42.19 42.91 59.33
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 282.60381 0.44051 641.5 <2e-16 ***
## as.numeric(educa) 11.86264 0.07804 152.0 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 70.89 on 418654 degrees of freedom
## Multiple R-squared: 0.05231, Adjusted R-squared: 0.0523
## F-statistic: 2.311e+04 on 1 and 418654 DF, p-value: < 2.2e-16
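The coefficient on education is positive and highly significant, which we read as more education being associated with greater access to health insurance. This reading is tentative: as.numeric(hlthcvrg) is a factor level code rather than a direct measure of coverage, and the model explains only about 5% of the variance (R-squared = 0.052).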
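Note that in the lm() call above the predictors are separated by commas, so R matches the second and third to lm()'s subset and weights arguments, as the Call line in the output shows. If the intent was to include all three as predictors, they would be joined with + inside the formula. A sketch on made-up data (d, y, x1, x2, and x3 are hypothetical names) illustrates the difference:

```r
# Made-up data for illustration only.
set.seed(2)
d <- data.frame(
  x1 = rnorm(50),
  x2 = sample(1:5, 50, replace = TRUE),
  x3 = runif(50, 1, 2)
)
d$y <- 1 + 2 * d$x1 + rnorm(50)

# Commas: x2 is matched to `subset` and x3 to `weights`, not used as predictors.
fit_commas <- lm(y ~ x1, x2, x3, data = d)
names(fit_commas$call)
names(coef(fit_commas))

# Plus signs: all three variables enter the model as predictors.
fit_plus <- lm(y ~ x1 + x2 + x3, data = d)
names(coef(fit_plus))
```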
residualPlots(fit2)
## Test stat Pr(>|Test stat|)
## as.numeric(educa) 688.31 < 2.2e-16 ***
## Tukey test 532.22 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Research question 3:
Among people with arthritis, are limitations on social activities due to joint symptoms less severe for those who exercise, even if they have smoked?
The variables to use for this question are:
smoke100: Smoked At Least 100 Cigarettes
exeroft2: How Many Times Walking, Running, Jogging, Or Swimming
arthsocl: Social Activities Limited Because Of Joint Symptoms
newdata3 <- brfss2013 %>% select(arthsocl, smoke100, exeroft2) %>% filter(arthsocl != is.na(arthsocl),
exeroft2 != is.na(exeroft2),
smoke100 != is.na(smoke100)
)
head(newdata3)
## arthsocl smoke100 exeroft2
## 1 Not at all Yes 103
## 2 Not at all No 102
## 3 Not at all Yes 203
## 4 A little No 107
## 5 A lot Yes 107
## 6 Not at all Yes 103
newdata3 %>% ggplot(aes(exeroft2, arthsocl, color= smoke100)) + geom_point() + geom_smooth()+ facet_wrap(smoke100~.)
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
## Warning: Computation failed in `stat_smooth()`:
## NA/NaN/Inf in foreign function call (arg 3)
## Warning: Computation failed in `stat_smooth()`:
## NA/NaN/Inf in foreign function call (arg 3)
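The smoother did not compute here (see the warnings above), and a smooth curve is in any case hard to interpret when the y variable is a three-level factor. For two categorical variables, a table of row proportions is one simple way to compare groups. The sketch uses a few made-up rows (toy3 is hypothetical), not the survey data.

```r
# Made-up rows for illustration; the factor levels mirror the survey's labels.
toy3 <- data.frame(
  smoke100 = factor(c("Yes", "Yes", "Yes", "No", "No", "No", "No"),
                    levels = c("Yes", "No")),
  arthsocl = factor(c("A lot", "A little", "Not at all", "Not at all",
                      "Not at all", "A little", "Not at all"),
                    levels = c("A lot", "A little", "Not at all"))
)

# Cross-tabulate, then convert each row to proportions (rows sum to 1).
counts <- table(toy3$smoke100, toy3$arthsocl)
row_props <- prop.table(counts, margin = 1)
round(row_props, 2)
```

Each row then shows how the limitation categories are distributed within one smoking group, which can be compared directly across rows.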
So our final model is:
fit3 <- lm(as.numeric(arthsocl)~ as.numeric(smoke100)+ exeroft2, data= newdata3)
summary(fit3)
##
## Call:
## lm(formula = as.numeric(arthsocl) ~ as.numeric(smoke100) + exeroft2,
## data = newdata3)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.6272 -0.5399 0.3738 0.4598 0.5222
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.486e+00 1.154e-02 215.366 < 2e-16 ***
## as.numeric(smoke100) 8.663e-02 5.414e-03 16.003 < 2e-16 ***
## exeroft2 -3.168e-04 5.206e-05 -6.084 1.18e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.6683 on 61088 degrees of freedom
## Multiple R-squared: 0.004936, Adjusted R-squared: 0.004903
## F-statistic: 151.5 on 2 and 61088 DF, p-value: < 2.2e-16
and the residuals:
residualPlots(fit3)
## Test stat Pr(>|Test stat|)
## as.numeric(smoke100) 0.8845 0.3764422
## exeroft2 -4.9161 8.851e-07 ***
## Tukey test -3.7337 0.0001887 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
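With arthsocl coded from 1 (A lot) to 3 (Not at all) and smoke100 coded Yes = 1, No = 2, the positive smoke100 coefficient indicates that respondents who have smoked at least 100 cigarettes report more limitation of social activities than those who have not. The exeroft2 coefficient is negative and significant, but exeroft2 is a frequency code rather than a count, so its direction should be interpreted with care. Both effects are small: the model explains about 0.5% of the variance (R-squared = 0.0049), and because the data are observational these are associations, not causal effects.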