Make sure to load the appropriate libraries and load the data before you begin your analysis. Knit the document early and often to make sure it will knit properly. You will print these documents to submit.

First In-Class Assignment

For this first assignment, I have loaded the data and packages for you. In future assignments, use this section to load the data and libraries.

This is a test output; you should see it only when you knit the document.

Exercise 1

There are 20 000 cases in this data set with 9 variables.

genhlth - categorical

exerany - categorical

hlthplan - categorical

smoke100 - categorical

height - numerical

weight - numerical

wtdesire - numerical

age - numerical

gender - categorical

Exercise 2

Numerical Summary - Height

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   48.00   64.00   67.00   67.18   70.00   93.00

Numerical Summary - Age

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   18.00   31.00   43.00   45.07   57.00   99.00

IQR - Weight

## [1] 50

IQR - Age

## [1] 26
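As a rough sketch of how summaries like these are produced, the base R functions `summary()` and `IQR()` can be called on a single column. The `toy` data frame below is a made-up stand-in for `cdc` (the values are invented for illustration), so its output will not match the numbers above.

```r
# Hypothetical stand-in for the cdc data: six made-up respondents.
toy <- data.frame(
  height = c(60, 64, 67, 70, 72, 75),
  age    = c(18, 25, 31, 43, 57, 80)
)

# Minimum, quartiles, median, mean, and maximum for one variable.
height_summary <- summary(toy$height)
print(height_summary)

# IQR() returns the third quartile minus the first quartile.
age_iqr <- IQR(toy$age)
print(age_iqr)
```

With the real data, the same calls would be made on `cdc$height`, `cdc$age`, and `cdc$weight`.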

Exercise 3

##    
##        0    1
##   m 4547 5022
##   f 6012 4419

This plot shows that a randomly chosen male is more likely to have smoked at least 100 cigarettes than a randomly chosen female: about 52% of males (5022 of 9569) compared with about 42% of females (4419 of 10431).
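One way to check this comparison is to turn the counts into row proportions. The sketch below rebuilds the table from the counts printed above (rather than from `cdc` itself) and uses base R's `prop.table()`; the object names `counts` and `row_props` are only illustrative.

```r
# Counts copied from the table above (rows: gender, columns: smoke100).
counts <- matrix(c(4547, 5022,
                   6012, 4419),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(gender   = c("m", "f"),
                                 smoke100 = c("0", "1")))

# Row-wise proportions: each row sums to 1, so the "1" column is the
# share of that gender who have smoked at least 100 cigarettes.
row_props <- prop.table(counts, margin = 1)
print(round(row_props, 3))
```

Comparing proportions within each gender matters here because the two groups are different sizes, so raw counts alone could mislead.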

Exercise 4

fdata25 <- cdc %>% filter(gender == "f", age > 25)

Exercise 5

Exercise 6

I chose to look at the relationship between age and weight, and can see that there is a weak negative correlation. There is a lot of variation in weight at every age, which is expected in a data set of 20 000 people. However, fewer people over age 75 weigh more than 250 pounds, possibly because unhealthy lifestyles shorten the lives of the heaviest people.

Exercise 7

This is a scatterplot of weight against desired weight. There is a moderately strong positive trend in this data set. Also, we can see that heavier people are more likely to have a desired weight that differs much more from their actual weight.

Exercise 8

wdiff <- cdc$wtdesire - cdc$weight

Exercise 9

“wdiff” is numerical data. It shows the difference between people's desired weight and their actual weight (desired minus actual). If wdiff = 0, the person is happy with their weight. If wdiff < 0, they want to lose weight, and if wdiff > 0, they want to gain weight.
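The sign interpretation can be illustrated with a small sketch. The four respondents below are made up for illustration (they are not rows from `cdc`), and the `goal` column is a hypothetical label added only to show how the sign of wdiff maps to each case.

```r
# Hypothetical respondents (invented values, not taken from cdc).
toy <- data.frame(
  weight   = c(150, 200, 130, 180),
  wtdesire = c(150, 170, 140, 160)
)

# Desired weight minus actual weight.
toy$wdiff <- toy$wtdesire - toy$weight

# sign() returns -1, 0, or 1; adding 2 turns that into an index 1, 2, or 3.
toy$goal <- c("lose", "maintain", "gain")[sign(toy$wdiff) + 2]
print(toy)
```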

Exercise 10

Here is a histogram that plots the frequency of how many pounds people want to lose or gain.

Shape - Unimodal, skewed left (the mean, -15, is below the median, -10)

Centre - Median of -10

Spread - IQR = 21

People typically want to lose between 10 and 15 pounds, because the median is -10 and the mean is -15.

Exercise 11

Numerical summary for males.

##      wdiff         cdc.gender
##  Min.   :-300.00   m:9569    
##  1st Qu.: -20.00   f:   0    
##  Median :  -5.00             
##  Mean   : -10.71             
##  3rd Qu.:   0.00             
##  Max.   : 500.00

Numerical summary for females

##      wdiff         cdc.gender
##  Min.   :-300.00   m:    0   
##  1st Qu.: -27.00   f:10431   
##  Median : -10.00             
##  Mean   : -18.15             
##  3rd Qu.:   0.00             
##  Max.   :  83.00

In this side-by-side box plot, we can see that women on average want to lose more weight than men do. In fact, more men than women want to gain weight.
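Both claims can be checked numerically by summarising wdiff within each gender. This is a minimal sketch using base R's `tapply()` on a made-up data frame (`toy` is hypothetical and much smaller than `cdc`), so the numbers it prints are not the ones reported above.

```r
# Hypothetical data: three made-up men and three made-up women.
toy <- data.frame(
  gender = factor(c("m", "m", "m", "f", "f", "f"), levels = c("m", "f")),
  wdiff  = c(-20, 0, 10, -30, -10, 0)
)

# Median wdiff within each gender (a more negative value means
# that group wants to lose more weight).
group_medians <- tapply(toy$wdiff, toy$gender, median)
print(group_medians)

# Share of each gender with wdiff > 0, i.e. who want to gain weight.
share_gain <- tapply(toy$wdiff > 0, toy$gender, mean)
print(share_gain)
```

Using the share who want to gain weight, rather than the count, keeps the comparison fair when the two groups differ in size.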

Exercise 12

mean of weight = 169.7

standard deviation = 40.08

## [1] 0.7076
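A proportion like the one above could be computed along these lines. This is only a sketch under assumptions: the vector `w` holds made-up weights rather than `cdc$weight`, and strict inequalities are assumed at the interval boundaries, so the result differs from 0.7076.

```r
# Hypothetical weights in pounds (invented for illustration).
w <- c(120, 150, 160, 170, 180, 190, 250)

m <- mean(w)  # sample mean
s <- sd(w)    # sample standard deviation

# A logical vector is TRUE where the weight lies within one standard
# deviation of the mean; the mean of TRUE/FALSE values is the proportion.
within_one_sd <- mean(w > m - s & w < m + s)
print(within_one_sd)
```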

Proportion within one standard deviation of mean = 0.7076

On My Own…