Question 1: Read in the gambling dataset, check the first couple of rows, and describe the data types. Identify incorrect data types, if any. ( 5 Points )
mydata = read.csv(file = "data/gambling.csv")
# Show only the first few rows instead of printing the whole dataset
head(mydata)
Sex is a categorical (nominal) data type. Income and gamble are ratio data types. Status and verbal are interval data types.
The sex column has an incorrect data type: it is stored as numeric 0/1 codes even though it is categorical.
Status, income, verbal, and gamble are all hard to interpret because the dataset does not explain what the numbers measure or in what units.
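One way to check the stored types and correct the sex column is sketched below. The small data frame is hypothetical stand-in data (the real file is data/gambling.csv), and the labels "group0" and "group1" are placeholders, since the dataset does not say which sex each code represents.

```r
# Hypothetical stand-in rows with the same column names as the gambling data
toy = data.frame(
  sex    = c(0, 1, 0, 1),
  status = c(51, 28, 37, 28),
  income = c(2.0, 2.5, 2.0, 7.0),
  verbal = c(8, 8, 6, 4),
  gamble = c(0.0, 0.0, 0.0, 7.3)
)

# Report the storage class of every column; sex shows up as "numeric"
sapply(toy, class)

# Recode sex as a factor so R treats it as categorical.
# The labels are placeholders: the data does not document the coding.
toy$sex = factor(toy$sex, levels = c(0, 1), labels = c("group0", "group1"))

# summary() now reports counts per level instead of a mean and quartiles
summary(toy$sex)
```

Recoding before calling summary() keeps the categorical column from being averaged like a measurement.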
Question 2: Describe the data using full sentences and using descriptive statistics. ( 5 Points )
summary(mydata)
      sex             status          income           verbal          gamble
 Min.   :0.0000   Min.   :18.00   Min.   : 0.600   Min.   : 1.00   Min.   :  0.0
 1st Qu.:0.0000   1st Qu.:28.00   1st Qu.: 2.000   1st Qu.: 6.00   1st Qu.:  1.1
 Median :0.0000   Median :43.00   Median : 3.250   Median : 7.00   Median :  6.0
 Mean   :0.4043   Mean   :45.23   Mean   : 4.642   Mean   : 6.66   Mean   : 19.3
 3rd Qu.:1.0000   3rd Qu.:61.50   3rd Qu.: 6.210   3rd Qu.: 8.00   3rd Qu.: 19.4
 Max.   :1.0000   Max.   :75.00   Max.   :15.000   Max.   :10.00   Max.   :156.0
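The first column has a problem: sex is a categorical variable, but its values are stored as numbers (0 and 1), so summary() reports quartiles and a mean (0.4043) that only make sense as the share of rows coded 1. For the remaining columns, the summary shows that status ranges from 18 to 75 (mean 45.23), income from 0.6 to 15 (mean 4.642, median 3.25), verbal from 1 to 10 (mean 6.66), and gamble from 0 to 156 (mean 19.3, median 6.0). Beyond these figures the values are hard to interpret, because the dataset provides no descriptors indicating what each column measures or in what units.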
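Because sex is stored as 0/1, the mean that summary() reports for it (0.4043) is simply the proportion of rows coded 1. The sketch below illustrates this with a short hypothetical vector of codes (not the real column) and shows a frequency table, which is usually a more natural summary for a categorical variable.

```r
# Hypothetical 0/1 codes, standing in for the sex column
sexCodes = c(0, 0, 1, 0, 1)

# The mean of a 0/1 column equals the proportion of rows coded 1
mean(sexCodes)

# Counts and proportions per category
counts = table(sexCodes)
counts
prop.table(counts)
```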
Question 3: Estimate the upper and lower threshold for the verbal score ( 5 Points )
HINT: A common way to estimate the upper and lower threshold is to take the mean (+ or -) 3 * standard deviation.
#Use the formula above to calculate the upper and lower threshold
verbal = mydata$verbal
mean(verbal)
[1] 6.659574
spreadverbal = sd(verbal)
spreadverbal
[1] 1.856558
upperVerbal = mean(verbal) + 3 * spreadverbal
upperVerbal
[1] 12.22925
lowerVerbal = mean(verbal) - 3 * spreadverbal
lowerVerbal
[1] 1.0899
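The thresholds become useful once values are compared against them. Note that the summary above lists a minimum verbal score of 1.00, which is slightly below the lower threshold of 1.0899, so at least one observation would fall outside the range. A minimal sketch of that check follows; the function name thresholds and the vector scores are hypothetical and only illustrate the mean (+ or -) 3 * standard deviation rule from the hint.

```r
# Return the lower and upper thresholds: mean -/+ k standard deviations
thresholds = function(x, k = 3) {
  c(lower = mean(x) - k * sd(x), upper = mean(x) + k * sd(x))
}

# Hypothetical scores, not the real verbal column
scores = c(7, 6, 8, 7, 6, 7, 8, 6, 7, 7, 6, 8, 7, 30)
limits = thresholds(scores)
limits

# Keep the values that fall outside the thresholds
outliers = scores[scores < limits["lower"] | scores > limits["upper"]]
outliers
```

With the real data, the same filter could be applied to mydata$verbal to list the flagged rows.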
Question 4: Calculate the z-score for income where x=13. Based on the income value x=13 pounds per week, how would you rate the income: low income, average income, or high income? Why? ( 5 Points )
Hint: zscore = (x - mean)/sd
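# Pull the income column out of the data frame so it can be referenced directly
income = mydata$income
spreadIncome = sd(income)
zscore = (13 - mean(income))/spreadIncome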
zscore
[1] 2.353481
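A z-score of about 2.35 means that an income of 13 lies more than two standard deviations above the mean, which points to a high income relative to this sample. The sketch below turns that reading into a rule. The helper names zScore and rateIncome and the cutoff of 1 standard deviation are assumptions made for illustration, not part of the assignment.

```r
# Standardize a single value against a reference mean and standard deviation
zScore = function(x, centre, spread) (x - centre) / spread

# Label a z-score using cutoffs of -1 and +1 (an assumed convention)
rateIncome = function(z, cutoff = 1) {
  if (z > cutoff) {
    "high income"
  } else if (z < -cutoff) {
    "low income"
  } else {
    "average income"
  }
}

# The z-score reported above for x = 13
rateIncome(2.353481)

# Hypothetical reference values, only to show zScore() in use
rateIncome(zScore(13, centre = 5, spread = 4))
```

A stricter cutoff (for example 2) would still rate 2.35 as high, so the conclusion does not hinge on the exact threshold chosen here.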
Question 5: Create a histogram for the zscore of income. What do you notice about the shape? ( 5 Points )
Hint: To plot a histogram, use the function hist(variable).
zscoreIncome = (income - mean(income))/spreadIncome
hist(zscoreIncome)
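Standardizing only shifts and rescales the values, so the histogram of the z-scores has the same shape as the histogram of raw income. The summary statistics suggest that shape is right-skewed, since the mean income (4.642) sits above the median (3.250) and the maximum is 15. The sketch below checks the shape-preservation claim on a hypothetical right-skewed vector (not the real income column), using a hand-written skewness helper that is an assumption for illustration.

```r
# Sample skewness: third central moment divided by the cubed standard deviation
skewness = function(x) mean((x - mean(x))^3) / sd(x)^3

# Hypothetical right-skewed values, not the real income column
x = c(0.6, 1, 1.5, 2, 2, 2.5, 3, 3.5, 4.5, 6, 8, 10, 15)
z = (x - mean(x)) / sd(x)

# After standardizing, the mean is (approximately) 0 and the sd is 1
mean(z)
sd(z)

# The skewness is the same before and after: the shape is unchanged
skewness(x)
skewness(z)

# The long tail stays on the right-hand side of the histogram
hist(z)
```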
