Instructions: Complete bullet points 1-5 in section 2.6 from the OpenIntro website: https://nulib.github.io/kuyper-stat202/introduction-to-data.html. Be sure to include all relevant code, output, and answers to exercise questions. When you are done, submit your knitted html file to the assignment link.


2.6 On Your Own

Exercise 1. Make a scatterplot of weight versus desired weight. Describe the relationship between these two variables.
plot(cdc$weight,cdc$wtdesire)
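
One way to make the relationship easier to describe is to label the axes, add a y = x reference line, and compute the correlation. The sketch below is only an illustration: `cdc_toy` and its values are hypothetical stand-ins for the lab's `cdc` data frame, so the result says nothing about the real data.

```r
# Hypothetical toy data standing in for the lab's cdc data frame
cdc_toy <- data.frame(
  weight   = c(175, 125, 105, 132, 150, 220),
  wtdesire = c(175, 115, 105, 124, 130, 180)
)

# Same scatterplot as above, with labelled axes
plot(cdc_toy$weight, cdc_toy$wtdesire,
     xlab = "Current weight (lb)", ylab = "Desired weight (lb)")

# Reference line y = x: points below it belong to people who
# weigh more than they would like to
abline(a = 0, b = 1, lty = 2)

# Correlation as a numerical summary of the linear association
r <- cor(cdc_toy$weight, cdc_toy$wtdesire)
r
```

Swapping `cdc_toy` for `cdc` should apply the same steps to the full data set.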


Exercise 2. Let’s consider a new variable: the difference between desired weight (wtdesire) and current weight (weight). Create this new variable by subtracting the two columns in the data frame and assigning them to a new object called wdiff.
wdiff <- (cdc$weight - cdc$wtdesire)


Exercise 3. What type of data is wdiff? If an observation wdiff is 0, what does this mean about the person’s weight and desired weight? What if wdiff is positive or negative?

wdiff is numerical data. Because the weights are recorded in whole pounds, it takes integer values (discrete rather than continuous), and it can be negative, zero, or positive. If wdiff is 0, the person’s actual weight equals their desired weight. If wdiff is positive, the person’s actual weight is more than their desired weight. If wdiff is negative, their actual weight is less than their desired weight.
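
The type and sign of the differences can also be checked in code. This is a minimal sketch on a hypothetical vector `wdiff_toy` (not the real wdiff); the names `is_whole` and `counts` are introduced here for illustration only.

```r
# Hypothetical differences (weight - wtdesire), in pounds
wdiff_toy <- c(0L, 10L, 0L, 8L, 20L, -5L)

# Storage type of the vector
class(wdiff_toy)

# TRUE when every value is a whole number
is_whole <- all(wdiff_toy == round(wdiff_toy))

# sign() maps each value to -1 (below desired weight),
# 0 (at desired weight), or 1 (above desired weight)
counts <- table(sign(wdiff_toy))
counts
```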


Exercise 4: Describe the distribution of wdiff in terms of its center, shape, and spread, including any plots you use. What does this tell us about how people feel about their current weight?
median(wdiff)
## [1] 10
mean(wdiff)
## [1] 14.5891
table(wdiff)
## wdiff
## -500 -311 -110  -91  -90  -86  -85  -83  -80  -75  -73  -72  -70  -68  -65  -64 
##    1    1    1    1    1    1    2    1    2    1    1    1    3    1    5    2 
##  -63  -61  -60  -55  -53  -52  -50  -47  -45  -43  -42  -41  -40  -39  -38  -37 
##    1    1    6    2    1    1    9    1    7    1    2    3   21    1    1    3 
##  -36  -35  -33  -32  -31  -30  -29  -28  -27  -26  -25  -24  -23  -22  -21  -20 
##    3   26    6    5    2   46    1    9    9    3   79    5    9   17    2  151 
##  -19  -18  -17  -16  -15  -14  -13  -12  -11  -10   -9   -8   -7   -6   -5   -4 
##    9   14   13    5  208    8   23   32   20  333   17   43   58   34  232   21 
##   -3   -2   -1    0    1    2    3    4    5    6    7    8    9   10   11   12 
##   33   39   19 5616   51  125  188  131 1253  132  234  234  103 1856   64  171 
##   13   14   15   16   17   18   19   20   21   22   23   24   25   26   27   28 
##  157   71 1173   73   69  141   43 1467   41   74   70   47  677   26   65   70 
##   29   30   31   32   33   34   35   36   37   38   39   40   41   42   43   44 
##   36  893   22   28   30   32  360   23   34   31   18  556    8   21   12   18 
##   45   46   47   48   49   50   51   52   53   54   55   56   57   58   59   60 
##  189   13   18   23   12  395    4   14   11   13  101   12    8   15    8  184 
##   61   62   63   64   65   66   67   68   69   70   71   72   73   74   75   76 
##    4   10    8    8   62    9    5    9    6  125    2    8    5    6   43    3 
##   77   78   79   80   81   82   83   84   85   86   87   88   90   91   92   93 
##    5    6    4   72    2    5    4    3   22    5    2    6   46    1    2    2 
##   94   95   96   97   98   99  100  103  105  107  108  109  110  112  113  115 
##    2   11    1    2    3    1  115    1   12    1    3    3   16    3    3   11 
##  117  120  122  125  126  128  130  132  133  135  139  140  142  145  147  148 
##    1   27    2    9    2    1   15    3    1    4    1    8    2    4    1    1 
##  150  152  155  160  165  170  175  180  190  200  210  220  235  246  300 
##   17    4    1    3    2    4    1    1    1    6    1    1    1    1    2
barplot(table(wdiff))

max(wdiff)
## [1] 300
min(wdiff)
## [1] -500
max(cdc$wtdesire)
## [1] 680
min(cdc$wtdesire)
## [1] 68

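On average, respondents weighed 14.59 pounds more than their desired weight, and the median difference was 10 pounds. Because the mean exceeds the median, the distribution appears to be right-skewed. The most common value was 0 (5,616 respondents), and far more people reported being heavier than their desired weight than lighter. Values ranged from -500 to 300 pounds, with a few extreme outliers at both ends. The most extreme response came from someone whose current weight is 500 pounds below their desired weight. This may be an aspiring sumo wrestler or something similar. Overall, most people are either at their desired weight or would like to weigh less.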
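
Center, shape, and spread can each be tied to a specific summary. The sketch below assumes a hypothetical vector `wdiff_toy` (made-up values, not the cdc data) and uses a histogram for shape, the mean and median for center, and the standard deviation and IQR for spread.

```r
# Hypothetical differences (weight - wtdesire), in pounds
wdiff_toy <- c(-20, -5, 0, 0, 0, 5, 10, 10, 15, 20, 30, 50, 120)

# Shape: a histogram bins the values instead of drawing one bar each
hist(wdiff_toy, breaks = 10,
     xlab = "weight - wtdesire (lb)", main = "Toy wdiff")

# Center: a mean above the median points toward right skew
centre <- c(mean = mean(wdiff_toy), median = median(wdiff_toy))

# Spread: the IQR is less sensitive to outliers than the sd or range
spread <- c(sd = sd(wdiff_toy), IQR = IQR(wdiff_toy))

# Share of (toy) respondents who weigh more than they would like
share_over <- mean(wdiff_toy > 0)
```

Replacing `wdiff_toy` with `wdiff` should give the corresponding summaries for the real data.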


Exercise 5: Using numerical summaries and a side-by-side box plot, determine if men tend to view their weight differently than women.
mdata <- subset(cdc, cdc$gender == "m")
fdata <- subset(cdc, cdc$gender == "f")
mdiff <- (mdata$weight - mdata$wtdesire)
fdiff <- (fdata$weight - fdata$wtdesire)
boxplot(mdiff, fdiff, names = c("m", "f"))

summary(mdiff)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -500.00    0.00    5.00   10.71   20.00  300.00
summary(fdiff)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  -83.00    0.00   10.00   18.15   27.00  300.00

Men and women differ in how their current weight compares with their desired weight. Men’s differences span a wider range (800 pounds, from -500 to 300) than women’s (383 pounds, from -83 to 300), although both ranges are driven by outliers. The median man was 5 pounds above his desired weight, and the median woman was 10 pounds above hers. The mean difference was 10.71 pounds for men and 18.15 pounds for women. Women also had a wider interquartile range (27 pounds versus 20 pounds). Overall, women tend to report a larger gap between their current and desired weight than men do.
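
The comparison can also be made without splitting the data into two subsets, by using the formula interface of boxplot() and tapply() for grouped summaries. This is a sketch under the assumption of a hypothetical data frame `cdc_toy` with made-up values; the column names mirror those in `cdc`.

```r
# Hypothetical toy data standing in for the lab's cdc data frame
cdc_toy <- data.frame(
  weight   = c(175, 125, 105, 132, 150, 220, 190, 140),
  wtdesire = c(175, 115, 105, 124, 130, 180, 185, 120),
  gender   = factor(c("m", "f", "f", "f", "m", "m", "m", "f"))
)

# Store the difference as a column so it stays aligned with gender
cdc_toy$wdiff <- cdc_toy$weight - cdc_toy$wtdesire

# Side-by-side box plots: one box per level of gender
boxplot(wdiff ~ gender, data = cdc_toy,
        ylab = "weight - wtdesire (lb)")

# Numerical summaries by group
by_gender <- tapply(cdc_toy$wdiff, cdc_toy$gender, summary)
medians   <- tapply(cdc_toy$wdiff, cdc_toy$gender, median)
medians
```

Keeping wdiff inside the data frame avoids having to recompute it for each subset.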