1. Make a scatterplot of weight versus desired weight. Describe the relationship between these two variables.
ANSWER - I PLOTTED BOTH WAYS AND FOUND THE DESIRE VARIABLE ON X TO BE EASIER TO EVALUATE. THE RELATIONSHIP SHOWS THAT MOST PEOPLE DESIRE TO WEIGH LESS: WITH WEIGHT ON X AND DESIRED WEIGHT ON Y, THE POINTS CLUSTER BELOW THE Y = X LINE (AND ABOVE IT WHEN THE AXES ARE SWAPPED). HOWEVER, WITH 20000 DATA POINTS, THIS SCATTERPLOT IS HARD TO READ IN BOTH ORIENTATIONS. I PLAYED AROUND WITH JUST USING AN X,Y PLOT COMMAND, AND THE GRAPH WAS IDENTICAL TO THE VERSUS (~) GRAPH WITH THE VARIABLES IN THE OPPOSITE ORDER. THIS SHOWED ME THAT THE ~ FORM IS Y ~ X, PLACING THE FIRST VARIABLE ON Y RATHER THAN X, WHILE THE COMMA FORM IS PLOT(X, Y). SINCE BOTH FORMS DRAW THE SAME SCATTERPLOT, THE HARD-TO-READ RESULT COMES FROM THE MANY OVERLAPPING POINTS OF DISCRETE NUMERICAL DATA, NOT FROM WHICH FORM IS USED.
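# Formula form is y ~ x: weight on the y-axis, desired weight on the x-axis
plot(cdc$weight ~ cdc$wtdesire)

# Axes swapped: desired weight on y, weight on x
plot(cdc$wtdesire ~ cdc$weight)

# Comma form is plot(x, y): same picture as the second plot above
plot(cdc$weight, cdc$wtdesire)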
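
To double-check which variable lands on which axis, here is a minimal sketch on a small made-up data frame (`toy` and its values are hypothetical stand-ins, not the cdc data). It uses `xy.coords()`, the base R helper that resolves x and y for plotting and also accepts a formula, then draws the scatterplot with semi-transparent points and a dashed y = x reference line, which may help when many points overlap.

```r
# Hypothetical toy data standing in for cdc (values are made up)
toy <- data.frame(
  weight   = c(150, 180, 200, 130, 165),
  wtdesire = c(140, 170, 180, 130, 170)
)

# Formula form is y ~ x; comma form is (x, y)
via_formula <- xy.coords(toy$wtdesire ~ toy$weight)
via_comma   <- xy.coords(toy$weight, toy$wtdesire)

# TRUE when both forms put weight on x and desired weight on y
same_axes <- identical(via_formula$x, via_comma$x) &&
  identical(via_formula$y, via_comma$y)
print(same_axes)

# Semi-transparent points reduce overplotting; the dashed line is y = x
plot(wtdesire ~ weight, data = toy, pch = 16, col = rgb(0, 0, 0, 0.3))
abline(a = 0, b = 1, lty = 2)
```

In this orientation, points below the dashed line are people whose desired weight is lower than their current weight.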

2. Let’s consider a new variable: the difference between desired weight (wtdesire) and current weight (weight). Create this new variable by subtracting the two columns in the data frame and assigning them to a new object called wdiff.
wdiff <- cdc$wtdesire - cdc$weight
summary(wdiff)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
-300.00  -21.00  -10.00  -14.59    0.00  500.00 
3. What type of data is wdiff? If an observation wdiff is 0, what does this mean about the person’s weight and desired weight? What if wdiff is positive or negative?
ANSWER - WDIFF IS NUMERICAL DATA, A DIFFERENCE OF TWO WEIGHTS IN POUNDS. WHEN I RAN THE SUMMARY I COULD SEE AT A GLANCE THAT MOST PEOPLE WANT TO WEIGH LESS (THE MEDIAN IS -10 AND THE THIRD QUARTILE IS 0). A 0 MEANS THE PERSON’S DESIRED WEIGHT EQUALS THEIR CURRENT WEIGHT, SO THEY DO NOT WANT TO LOSE OR GAIN WEIGHT. POSITIVE VALUES MEAN THEY WANT TO GAIN, NEGATIVE VALUES MEAN THEY WANT TO LOSE.
var(wdiff)
[1] 578.2032
sd(wdiff)
[1] 24.04586
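
The lose / same / gain reading of the sign can also be counted directly. The sketch below uses a made-up vector (`toy_wdiff` is hypothetical, not the real wdiff) to show one way of turning the sign of each difference into a labelled group and reporting the share in each.

```r
# Hypothetical differences (desired minus current weight), made up for illustration
toy_wdiff <- c(-40, -25, -20, -10, -10, -10, -5, 0, 0, 15)

# sign() gives -1, 0, or 1; label those as lose / same / gain
direction <- factor(sign(toy_wdiff), levels = c(-1, 0, 1),
                    labels = c("lose", "same", "gain"))

counts <- table(direction)
props  <- prop.table(counts)   # share of people in each group
print(props)
```

Applied to the real wdiff, the same three lines would give the exact proportion of people who want to lose weight instead of an impression from the summary.
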
4. Describe the distribution of wdiff in terms of its center, shape, and spread, including any plots you use. What does this tell us about how people feel about their current weight?
ANSWER - I TRIED A HISTOGRAM ON THIS ONE TO SHOW THE DISTRIBUTION OF THE NEW SINGLE VARIABLE WDIFF (AS THE TUTORIAL SAID IT WOULD BE HELPFUL FOR SHOWING A SINGLE DISTRIBUTION). IT SHOWS THE SHAPE TO BE EXTREMELY TALL AT THE -50 LB BIN MARK. VALUES ARE DOMINANT IN THE NEGATIVE ZONE, WHICH CORRESPONDS TO THE SUMMARY SHOWING THAT MOST PEOPLE WISH TO LOSE, NOT GAIN, WEIGHT. HOWEVER, THE DEFAULT HISTOGRAM IS NOT VERY GRANULAR OR VISUALLY HELPFUL FOR SHOWING THE SPREAD. I TRIED ADDING BREAKS = 100, AND THIS SHOWED THE PEAK AND THE SLOPE TO THE LEFT OF 0 MORE GRANULARLY.

    I WOULD DESCRIBE THE DISTRIBUTION AS UNIMODAL, WITH ITS PEAK JUST BELOW 0 AND ITS CENTER LEFT OF 0 (MEDIAN -10, MEAN -14.59). THE SPREAD IS AN SD OF ABOUT 24 LB, WITH THE MIDDLE HALF OF THE VALUES BETWEEN -21 AND 0.

    THIS SHOWS MORE CLEARLY THAT THE LARGEST GROUP WISHES TO LOSE BETWEEN 0 AND WHATEVER THE FIRST BIN EDGE BELOW 0 IS IN THE BREAKS = 100 VERSION (I DON’T KNOW HOW TO MAKE IT LABEL THIS), AND THERE ARE ALSO SIGNIFICANT NUMBERS WHO WISH TO LOSE AMOUNTS IN THE NEXT SEVERAL BINS.

hist(wdiff)

hist(wdiff, breaks = 100)
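
One way to find out where the bins start and end is to ask `hist()` for its bins without drawing them. A single number for `breaks` is only a suggestion, since R picks "pretty" breakpoints, so the sketch below passes explicit bin edges instead. The vector `toy_wdiff` and the 5 lb bin width are assumptions chosen for illustration, not the real data.

```r
# Hypothetical differences, made up for illustration
toy_wdiff <- c(-40, -25, -20, -10, -10, -10, -5, 0, 0, 15)

# Explicit bin edges every 5 lb; plot = FALSE returns the bins without drawing
h <- hist(toy_wdiff, breaks = seq(-40, 20, by = 5), plot = FALSE)

# Bins are right-closed by default: (lower, upper]; find the tallest one
tallest     <- which.max(h$counts)
modal_lower <- h$breaks[tallest]
modal_upper <- h$breaks[tallest + 1]

# Table of bin edges and counts, useful for labelling the plot by hand
bins <- data.frame(lower = head(h$breaks, -1),
                   upper = tail(h$breaks, -1),
                   count = h$counts)
print(bins[bins$count > 0, ])
```

The same `h$breaks` and `h$counts` fields are available for `hist(wdiff, breaks = 100, plot = FALSE)`, which would reveal the edges of the tallest bar in the plot above.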

5. Using numerical summaries and a side-by-side box plot, determine if men tend to view their weight differently than women.
ANSWER - I DID THE BOX PLOT OF THE NEW VARIABLE WDIFF BY GENDER AND GOT A BOX PLOT THAT WAS A LITTLE CONFUSING. IT APPEARED TO SHOW NO DIFFERENCE, WITH BOTH FEMALE AND MALE HOVERING AT 0, THOUGH THE BOX FOR THE FEMALE DATA LOOKS A LITTLE FATTER. IT IS HARD TO TELL HERE. I COULD NOT PULL A SUMMARY BY GENDER, AND THE DATA SET IS SO HUGE IT IS UNCLEAR IF THE BOXPLOT IS DISPLAYING WELL. I THOUGHT TO TRY A BAR GRAPH COMPARING FEMALES TO THE WHOLE AND TO USE THE SUBSETTING PROCESS LEARNED (CREATING AN FDATA AS THEY DID FOR MDATA SO I COULD VIEW JUST FEMALE WTDESIRES), BUT THIS DID NOT WORK BECAUSE WDIFF IS NOT A CDC COLUMN: IT IS A SEPARATE VECTOR WITH ONE VALUE PER ROW OF CDC, SO IT HAS MORE VALUES THAN THE FEMALE ROWS PULLED USING CDC$GENDER. I WANTED TO SEE MORE BUT I WILL DO THIS LATER WHEN I LEARN MORE.
boxplot(wdiff ~ cdc$gender)
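
A possible way around the subsetting problem is to store the difference as a column of the data frame, so that it travels with the rows, and to use `tapply()` for the per-gender summaries. This is a sketch on made-up data: the data frame `toy`, its values, and the "m"/"f" coding are assumptions for illustration.

```r
# Hypothetical toy data standing in for cdc (values and "m"/"f" coding are assumptions)
toy <- data.frame(
  gender   = factor(c("m", "f", "f", "m", "f", "m")),
  weight   = c(190, 150, 140, 175, 160, 200),
  wtdesire = c(185, 135, 130, 175, 140, 190)
)

# Store the difference as a column so it is subset together with the rows
toy$wdiff <- toy$wtdesire - toy$weight

# Numerical summaries by gender
by_gender <- tapply(toy$wdiff, toy$gender, summary)
medians   <- tapply(toy$wdiff, toy$gender, median)
print(by_gender)

# Row subsetting now keeps the matching wdiff values
fdata <- subset(toy, gender == "f")

# Side-by-side box plot from the same data frame
boxplot(wdiff ~ gender, data = toy)
```

With the real data, the equivalent first step would be `cdc$wdiff <- cdc$wtdesire - cdc$weight`, after which the rest applies with `cdc` in place of `toy`.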

6. Now it’s time to get creative. Find the mean and standard deviation of weight and determine what proportion of the weights are within one standard deviation of the mean.

ANSWER - THE MEAN IS 169.7 AND THE SD IS ABOUT 40 (I RAN SUMMARY, MEAN, AND VAR, AND CALCULATED THE SQUARE ROOT OF 1606.484, THE VARIANCE). I TRIED A HISTOGRAM OF ALL CDC WEIGHTS TO SEE WHAT THAT COULD SHOW ABOUT THE DISTRIBUTION. IT SHOWED THAT THE BIGGEST PROPORTION OF WEIGHTS CERTAINLY FALLS BETWEEN 100 AND 200.

FOR THE PROPORTION PART OF THIS QUESTION - I COULD EYEBALL IT AND SEE ABOUT HOW MUCH OF THE WHOLE FALLS WITHIN 40 LBS BELOW AND ABOVE 169.7, BUT I AM GUESSING THAT IS NOT GOOD ENOUGH. USING R, I CALCULATED THE RANGE OF WEIGHTS ONE SD BELOW AND ABOVE THE MEAN, WHICH IS 129.7 TO 209.7. I THEN COUNTED THE ROWS OF THE DATA IN THIS RANGE AND DIVIDED BY THE TOTAL NUMBER OF ROWS, WHICH SHOWED THAT THIS RANGE COMPRISES 0.7076, ABOUT 71%, OF THE WEIGHTS. THIS COMPORTS, MORE OR LESS, WITH THE 68% RULE FOR NORMAL DISTRIBUTIONS THAT I READ ABOUT ONLINE.

summary(cdc$weight)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   68.0   140.0   165.0   169.7   190.0   500.0 
mean(cdc$weight)
[1] 169.683
var(cdc$weight)
[1] 1606.484
hist(cdc$weight)

169.7 - 40
[1] 129.7
169.7 + 40
[1] 209.7
w_within <- cdc[cdc$weight >= 129.7 & cdc$weight <= 209.7, ]
nrow(w_within) / nrow(cdc)
[1] 0.7076
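
The same proportion can be computed without typing rounded numbers, using `sd()` directly and the fact that the mean of a logical vector is the share of TRUE values. The sketch below runs on made-up weights (`toy_weight` is hypothetical) and also shows where the 68% figure comes from for a normal distribution.

```r
# Hypothetical weights in pounds, made up for illustration
toy_weight <- c(120, 150, 160, 170, 180, 200, 250)

m <- mean(toy_weight)
s <- sd(toy_weight)          # same as sqrt(var(toy_weight))

# A logical vector averages to the proportion of TRUE values
within_one_sd <- toy_weight >= m - s & toy_weight <= m + s
prop_within   <- mean(within_one_sd)

# Share of a normal distribution within one SD of its mean (the "68% rule")
normal_share <- pnorm(1) - pnorm(-1)

print(c(proportion = prop_within, normal = normal_share))
```

Replacing `toy_weight` with `cdc$weight` would use the unrounded mean and SD, so the result may differ slightly from the 0.7076 obtained with 169.7 and 40.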