Question 1: Read in the gambling dataset, check the first couple of rows, and describe the data types. Identify incorrect data types, if any. ( 5 Points )

mydata = read.csv("data/gambling.csv")
head(mydata)

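All five columns (sex, status, income, verbal, and gamble) are read in as numeric. On the measurement scale, income and gamble are ratio variables, status and verbal are numeric scores, and sex is a nominal (categorical) variable coded as 0 and 1.

The one incorrect data type is sex: it is stored as a number but represents a category, so it should be converted to a factor.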
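
As a minimal sketch of how a column's type can be checked and corrected, the example below uses a small made-up data frame called `toy` in place of `mydata`. The name `toy` and its values are assumptions for illustration only; the same calls would apply to the real data.

```r
# Hypothetical stand-in for mydata; the values are made up for illustration.
toy <- data.frame(
  sex    = c(0, 1, 0, 1),
  income = c(2.0, 2.5, 7.0, 3.47)
)

# Inspect how R stored each column (both start out numeric).
sapply(toy, class)

# Convert the 0/1 code to a factor so R treats it as a category.
toy$sex <- factor(toy$sex, levels = c(0, 1))

# The class of sex is now "factor", and summary() reports counts per level.
sapply(toy, class)
summary(toy$sex)
```

After the conversion, `summary()` counts the observations in each level instead of reporting a mean, which is the more meaningful description for a categorical variable.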

Question 2: Describe the data using full sentences and using descriptive statistics. ( 5 Points )

#str(mydata)
summary(mydata)
      sex             status     
 Min.   :0.0000   Min.   :18.00  
 1st Qu.:0.0000   1st Qu.:28.00  
 Median :0.0000   Median :43.00  
 Mean   :0.4043   Mean   :45.23  
 3rd Qu.:1.0000   3rd Qu.:61.50  
 Max.   :1.0000   Max.   :75.00  
     income           verbal     
 Min.   : 0.600   Min.   : 1.00  
 1st Qu.: 2.000   1st Qu.: 6.00  
 Median : 3.250   Median : 7.00  
 Mean   : 4.642   Mean   : 6.66  
 3rd Qu.: 6.210   3rd Qu.: 8.00  
 Max.   :15.000   Max.   :10.00  
     gamble     
 Min.   :  0.0  
 1st Qu.:  1.1  
 Median :  6.0  
 Mean   : 19.3  
 3rd Qu.: 19.4  
 Max.   :156.0  
income = mydata$income
income
 [1]  2.00  2.50  2.00  7.00  2.00  3.47  5.50
 [8]  6.42  2.00  6.00  3.00  4.75  2.20  2.00
[15]  3.00  1.50  9.50 10.00  4.00  3.50  3.00
[22]  2.50  3.50 10.00  6.50  1.50  5.44  1.00
[29]  0.60  5.50 12.00  7.00 15.00  2.00  1.50
[36]  4.50  2.50  8.00 10.00  1.60  2.00 15.00
[43]  3.00  3.25  4.94  1.50  2.50
mean_income = mean(income)
mean_income
[1] 4.641915
status = mydata$status
status
 [1] 51 28 37 28 65 61 28 27 43 18 18 43 30 28
[15] 38 38 28 18 43 51 62 47 43 27 71 38 51 38
[29] 51 62 18 30 38 71 28 61 71 28 51 65 48 61
[43] 75 66 62 71 71
mean_status = mean(status)
mean_status
[1] 45.23404

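The data set has 47 observations of five variables: ‘sex’, ‘status’, ‘verbal’, ‘income’, and ‘gamble’.

The mean income is about 4.64 (median 3.25, range 0.60 to 15.00), and the mean status score is about 45.23 (median 43, range 18 to 75). Verbal scores range from 1 to 10 with a mean of about 6.66. Gamble ranges from 0 to 156 with a mean of 19.3 and a median of 6.0, so a few large values pull the mean well above the median. The mean of sex is 0.4043, meaning about 40% of the observations are coded 1.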
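
A mean alone says little about spread, so the sketch below adds the median, standard deviation, and quartiles for status. It retypes the status values printed above so that it runs on its own; with the data loaded, `mydata$status` would be used directly.

```r
# Status values copied from the printed output above (47 observations).
status <- c(51, 28, 37, 28, 65, 61, 28, 27, 43, 18, 18, 43, 30, 28,
            38, 38, 28, 18, 43, 51, 62, 47, 43, 27, 71, 38, 51, 38,
            51, 62, 18, 30, 38, 71, 28, 61, 71, 28, 51, 65, 48, 61,
            75, 66, 62, 71, 71)

length(status)                    # number of observations
mean(status)                      # centre: mean
median(status)                    # centre: median
sd(status)                        # spread: standard deviation
quantile(status, c(0.25, 0.75))   # spread: first and third quartiles
range(status)                     # minimum and maximum
```

The mean, median, quartiles, and range computed here should agree with the `summary(mydata)` output for status.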

Question 3: Estimate the upper and lower threshold for the verbal score ( 5 Points )

HINT: A common way to estimate the upper and lower threshold is to take the mean (+ or -) 3 * standard deviation.

verbal = mydata$verbal
verbal
 [1]  8  8  6  4  8  6  7  5  6  7  6  6  4  6
[15]  6  8  8  5  8  9  8  9  5  4  7  7  4  6
[29]  7  8  2  7  7 10  1  8  7  6  6  6  9  9
[43]  8  9  6  7  9
mean_verbal = mean(verbal)
mean_verbal
[1] 6.659574
verbal_sd = sd(verbal)
verbal_sd
[1] 1.856558
verbal_lower = mean_verbal - (3) * verbal_sd
verbal_upper = mean_verbal + (3) * verbal_sd
verbal_upper
[1] 12.22925
verbal_lower
[1] 1.0899
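
Once the thresholds are known, a natural next step is to check which scores fall outside them. The sketch below does that for the verbal scores printed above; the variable names `lower`, `upper`, and `outside` are chosen here for illustration.

```r
# Verbal scores copied from the printed output above (47 observations).
verbal <- c(8, 8, 6, 4, 8, 6, 7, 5, 6, 7, 6, 6, 4, 6,
            6, 8, 8, 5, 8, 9, 8, 9, 5, 4, 7, 7, 4, 6,
            7, 8, 2, 7, 7, 10, 1, 8, 7, 6, 6, 6, 9, 9,
            8, 9, 6, 7, 9)

# Thresholds: mean plus or minus three standard deviations.
lower <- mean(verbal) - 3 * sd(verbal)
upper <- mean(verbal) + 3 * sd(verbal)

# Positions and values of scores outside the thresholds.
outside <- which(verbal < lower | verbal > upper)
outside
verbal[outside]
```

With these data, the only flagged score is the minimum of 1, which sits just below the lower threshold of about 1.09. No score exceeds the upper threshold of about 12.23.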

Question 4: Calculate the z-score for income where x=13. Based on the income value x=13 pounds per week, how would you rate the income: low income, average income, or high income? Why? ( 5 Points )

income_sd = sd(income)
income_sd
[1] 3.551371

Hint: zscore = (x - mean)/sd

zscore = (13 - 4.641915)/ 3.551371
zscore
[1] 2.353481

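I would rate the income as high. The z-score of about 2.35 is positive, so 13 pounds per week lies more than two standard deviations above the mean income of about 4.64.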
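
The formula from the hint can be wrapped in a small helper so that the sign and size of the z-score drive the rating. In the sketch below, `z_score`, `rate_income`, and the cutoffs of plus or minus 1 are assumptions made for illustration, not part of the assignment.

```r
# Mean and standard deviation of income, copied from the output above.
mean_income <- 4.641915
income_sd   <- 3.551371

# Hypothetical helper: standardize a value against a mean and sd.
z_score <- function(x, mean, sd) (x - mean) / sd

# Hypothetical rating rule: the +/- 1 cutoffs are an assumption, not a standard.
rate_income <- function(z) {
  if (z > 1) {
    "high income"
  } else if (z < -1) {
    "low income"
  } else {
    "average income"
  }
}

z <- z_score(13, mean_income, income_sd)
z               # about 2.35
rate_income(z)  # "high income"
```

A positive z-score means the value is above the mean and a negative one means it is below, so x=13 with a z-score of about 2.35 falls on the high side under any reasonable cutoff.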

Question 5: Create a histogram for the zscore of income. What do you notice about the shape? ( 5 Points )

Hint: To plot a histogram, use the function hist(variable).

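income_z = (income - mean_income) / income_sd
hist(income_z)

Standardizing only shifts and rescales the values, so the histogram has the same shape as the raw income distribution. The summary above (mean 4.64 above median 3.25, maximum 15.00) suggests the shape is right-skewed.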
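
A histogram needs one z-score per observation, so the whole income vector has to be standardized. The sketch below retypes the income values printed earlier so that it runs on its own, and compares the manual formula with base R's `scale()`. The names `income_z` and `manual_z` are chosen here for illustration.

```r
# Income values copied from the printed output above (47 observations).
income <- c(2.00, 2.50, 2.00, 7.00, 2.00, 3.47, 5.50,
            6.42, 2.00, 6.00, 3.00, 4.75, 2.20, 2.00,
            3.00, 1.50, 9.50, 10.00, 4.00, 3.50, 3.00,
            2.50, 3.50, 10.00, 6.50, 1.50, 5.44, 1.00,
            0.60, 5.50, 12.00, 7.00, 15.00, 2.00, 1.50,
            4.50, 2.50, 8.00, 10.00, 1.60, 2.00, 15.00,
            3.00, 3.25, 4.94, 1.50, 2.50)

# Standardize every value: scale() centres by the mean and divides by the sd.
income_z <- as.numeric(scale(income))
manual_z <- (income - mean(income)) / sd(income)
all.equal(income_z, manual_z)   # the two approaches agree

# Plot the distribution of the standardized values.
hist(income_z, main = "Standardized income", xlab = "z-score")

# A mean above the median is one simple hint of right skew.
mean(income) > median(income)
```

The standardized values have mean 0 and standard deviation 1, but the shape of the histogram is unchanged from that of the raw incomes.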

Question 6: Analyze the correlation plot below. Give relevant information about the negatively correlated, uncorrelated, and positively correlated variables. ( 5 Points )

There are positive correlations between fb likes and user votes, between user votes and gross, and between fb likes and gross.

There are signs of negative correlation between score and cast likes, and between director and cast likes.

The correlation between fb likes and user votes appears to be very strong.

Extra Credit: Analyze the correlation table below. Give relevant information about the negatively correlated, uncorrelated, and positively correlated variables. ( 5 Points )

# Create a correlation table "cor(movies)"
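
The `movies` data are not included here, so the sketch below builds a simulated stand-in to show how a correlation table could be produced and read. The column names and all values are hypothetical, and the columns are constructed so that each kind of relationship appears once.

```r
# Hypothetical stand-in for the movies data; values are simulated, not real.
set.seed(1)
n <- 100
user_votes <- rnorm(n)

movies <- data.frame(
  user_votes = user_votes,
  fb_likes   = user_votes + rnorm(n, sd = 0.5),   # built to correlate positively
  neg_col    = -user_votes + rnorm(n, sd = 0.5),  # built to correlate negatively
  noise      = rnorm(n)                           # built to be roughly uncorrelated
)

# Pairwise Pearson correlations, rounded for readability.
cor_table <- round(cor(movies), 2)
cor_table
```

In such a table, values near 1 indicate a strong positive relationship, values near -1 a strong negative one, and values near 0 little or no linear relationship. The diagonal is always 1 because each variable is perfectly correlated with itself.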