Make sure to load the appropriate libraries and load the data before you begin your analysis. Knit the document early and often to make sure it will knit properly. You will print these documents to submit.
For this first assignment, I have loaded the data and packages for you. In future assignments, use this section for loading the data and libraries.
This is a test output; you should see it only when you knit the document.
There are 20 000 cases in this data set with 9 variables.
genhlth - categorical
exerany - categorical
hlthplan - categorical
smoke100 - categorical
height - numerical
weight - numerical
wtdesire - numerical
age - numerical
gender - categorical
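One way to double-check a classification like the one above is to look at how each column is stored and how many distinct values it takes. The sketch below uses a small made-up data frame called `toy` as a stand-in for `cdc`; its values are hypothetical and only the column names come from the list above.

```r
# Toy stand-in for the cdc data frame (values are made up for illustration).
toy <- data.frame(
  genhlth = factor(c("good", "fair", "excellent")),
  exerany = c(0, 1, 1),
  height  = c(70, 64, 67),
  gender  = factor(c("m", "f", "f"))
)

# Storage class of each column; factor columns are categorical.
col_classes <- sapply(toy, class)
print(col_classes)

# A 0/1 indicator is stored as numeric but is categorical in meaning,
# so the number of distinct values is a useful second check.
n_distinct <- sapply(toy, function(x) length(unique(x)))
print(n_distinct)
```

A column such as `exerany` shows why the storage class alone is not enough: it is numeric in R but only takes two values, so it is treated as categorical.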
Numerical Summary - Height
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 48.00 64.00 67.00 67.18 70.00 93.00
Numerical Summary - Age
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 18.00 31.00 43.00 45.07 57.00 99.00
IQR - Weight
## [1] 50
IQR - Age
## [1] 26
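The interquartile range is the third quartile minus the first quartile, so the IQR for age can be checked against the numerical summary above. The sketch below does that arithmetic with the reported quartiles and then confirms, on a short made-up vector `x`, that `IQR()` matches the difference of the quartiles.

```r
# Quartiles copied from the age summary above.
age_q1 <- 31
age_q3 <- 57
age_iqr <- age_q3 - age_q1
print(age_iqr)  # 26, matching the IQR output for age

# On raw data, IQR() gives the same result as differencing the quartiles.
x <- c(18, 25, 31, 43, 57, 60, 99)  # made-up ages, for illustration only
iqr_builtin <- IQR(x)
iqr_manual  <- unname(diff(quantile(x, c(0.25, 0.75))))
print(c(builtin = iqr_builtin, manual = iqr_manual))
```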
##
## 0 1
## m 4547 5022
## f 6012 4419
This plot shows that a randomly chosen male is more likely to have smoked at least 100 cigarettes than a randomly chosen female (about 52% of males versus 42% of females, from the table above).
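The comparison between males and females rests on proportions within each gender rather than on raw counts. As a rough sketch, the counts from the table above can be typed into a matrix and converted to row proportions with `prop.table()`; the object names `counts` and `row_props` are my own.

```r
# Counts copied from the gender by smoke100 table above.
counts <- matrix(
  c(4547, 5022,
    6012, 4419),
  nrow = 2, byrow = TRUE,
  dimnames = list(gender = c("m", "f"), smoke100 = c("0", "1"))
)

# Row proportions: within each gender, the share in each smoke100 group.
row_props <- prop.table(counts, margin = 1)
print(round(row_props, 3))
```

Each row of `row_props` sums to 1, so the `1` column can be read directly as the share of each gender that has smoked at least 100 cigarettes.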
fdata25 <- cdc %>% filter(gender == "f", age > 25)
I chose to look at the relationship between age and weight, and can see that there is a weak negative correlation. Weight varies a lot at every age, so age explains little of the variation in this data set of 20 000 cases. However, fewer people weigh over 250 pounds after age 75. One possible explanation is that the health problems linked to a very high weight make people less likely to live to that age, although the plot alone cannot confirm this.
This scatterplot shows weight against desired weight. There is a moderately strong positive trend. We can also see that heavier people tend to have a desired weight that differs more from their actual weight.
wdiff <- cdc$wtdesire - cdc$weight
“wdiff” is a numerical variable. It is the difference between a person's desired weight and their actual weight (desired minus actual). If wdiff = 0, the person is happy with their weight. If wdiff < 0, they want to lose weight, and if wdiff > 0, they want to gain weight.
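The sign rule for wdiff can be illustrated with a short sketch. The weights below are made up, and the `goal` labels ("lose", "keep", "gain") are my own names for the three cases described above.

```r
# Made-up actual and desired weights, in pounds, for four people.
weight   <- c(180, 150, 200, 130)
wtdesire <- c(170, 150, 180, 140)

# Desired minus actual weight, as in the definition of wdiff.
wdiff <- wtdesire - weight

# Negative: wants to lose; zero: happy with weight; positive: wants to gain.
goal <- ifelse(wdiff < 0, "lose", ifelse(wdiff > 0, "gain", "keep"))
print(data.frame(weight, wtdesire, wdiff, goal))
```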
Here is a histogram that plots the frequency of how many pounds people want to lose or gain.
Shape - Unimodal, skewed left (the mean, about -15, lies below the median, -10)
Centre - Median of -10
Spread - IQR of 21
The typical person wants to lose between 10 and 15 pounds: the median is -10 and the mean is about -15.
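Centre, spread, and a rough hint about skew can all be computed directly. The sketch below uses a made-up `wdiff` vector, not the real data, and applies the common rule of thumb that a mean below the median suggests a longer left tail. That rule is only a heuristic and should be checked against the histogram.

```r
# Made-up wdiff values (desired minus actual weight), for illustration only.
wdiff <- c(-80, -40, -25, -20, -10, -10, -5, 0, 0, 5)

centre <- median(wdiff)  # robust measure of centre
spread <- IQR(wdiff)     # robust measure of spread

# Rule of thumb: mean below median hints at left skew, above at right skew.
skew_hint <- if (mean(wdiff) < median(wdiff)) {
  "left"
} else if (mean(wdiff) > median(wdiff)) {
  "right"
} else {
  "roughly symmetric"
}

print(c(mean = mean(wdiff), median = centre, IQR = spread))
print(skew_hint)
```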
Numerical summary for males.
## wdiff cdc.gender
## Min. :-300.00 m:9569
## 1st Qu.: -20.00 f: 0
## Median : -5.00
## Mean : -10.71
## 3rd Qu.: 0.00
## Max. : 500.00
Numerical summary for females
## wdiff cdc.gender
## Min. :-300.00 m: 0
## 1st Qu.: -27.00 f:10431
## Median : -10.00
## Mean : -18.15
## 3rd Qu.: 0.00
## Max. : 83.00
In this side-by-side box plot, we can see that women on average want to lose more weight than men do. In fact, more men than women want to gain weight.
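A comparison by gender can also be made numerically by computing a summary within each group. The sketch below uses a small made-up data frame `toy` (not the real `cdc` data) and `tapply()` to get the median wdiff per gender; the commented-out `boxplot()` line shows how a side-by-side box plot of the same toy data could be drawn.

```r
# Made-up wdiff values for three males and three females.
toy <- data.frame(
  gender = factor(c("m", "m", "m", "f", "f", "f"), levels = c("m", "f")),
  wdiff  = c(-20, -5, 10, -30, -10, 0)
)

# Median wdiff within each gender.
group_medians <- tapply(toy$wdiff, toy$gender, median)
print(group_medians)

# Uncomment to draw the side-by-side box plot for the toy data.
# boxplot(wdiff ~ gender, data = toy)
```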
mean of weight = 169.7
standard deviation = 40.08
## [1] 0.7076
Proportion within one standard deviation of the mean = 0.7076
## On My Own…
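The proportion of weights within one standard deviation of the mean, reported above, can be computed by flagging each value that falls in the interval from mean minus sd to mean plus sd and then averaging the flags. The sketch below uses a made-up `weight` vector, so its result will not match the 0.7076 from the real data.

```r
# Made-up weights in pounds, for illustration only.
weight <- c(120, 140, 150, 160, 170, 180, 190, 210, 250, 300)

m <- mean(weight)
s <- sd(weight)

# TRUE for each weight inside [m - s, m + s].
within_one_sd <- weight >= m - s & weight <= m + s

# The mean of a logical vector is the proportion of TRUE values.
prop_within <- mean(within_one_sd)
print(prop_within)
```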