Business Analytics

Question 1: Read in the gambling dataset, check the first couple of rows, and describe the data types. Identify incorrect data types, if any. ( 5 Points )

mydata = read.csv(file="data/gambling.csv")

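All five columns are read in as numbers. status, income, verbal, and gamble are quantitative measurements, so a numeric type suits them. sex has a mean of 0.40 and a standard deviation of 0.50, which is the pattern a 0/1 code produces, so it is most likely a categorical variable stored as a number and would be better treated as a factor.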
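A quick way to check the first rows and the stored types is head() together with str(). The sketch below uses a small made-up data frame called toy (its values are hypothetical, not taken from gambling.csv) with the same column names, and it assumes sex is a 0/1 code that should be treated as a category.

```r
# Hypothetical stand-in for mydata; the values are invented for illustration only
toy <- data.frame(
  sex    = c(0, 1, 0, 1),
  status = c(51, 28, 37, 28),
  income = c(2.0, 2.5, 2.0, 7.0),
  verbal = c(8, 8, 6, 4),
  gamble = c(0.0, 0.0, 0.0, 7.3)
)

head(toy, 2)                 # first two rows
str(toy)                     # storage type of every column
types <- sapply(toy, class)  # named vector of column classes
types

# Assumption: sex is a 0/1 code, so a factor describes it better than a number
toy$sex <- factor(toy$sex)
class(toy$sex)
```

The same three calls can be pointed at mydata once the file has been read in.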

Question 2: Describe the data using full sentences and using descriptive statistics. ( 5 Points )

sex = mydata$sex 
status = mydata$status 
income = mydata$income 
verbal = mydata$verbal 
gamble = mydata$gamble

meansex = mean(sex) 
sdsex = sd(sex) 
meanstatus = mean(status) 
sdstatus = sd(status) 
meanincome = mean(income) 
sdincome = sd(income) 
meanverbal = mean(verbal) 
sdverbal = sd(verbal) 
meangamble = mean(gamble) 
sdgamble = sd(gamble)  
meansex

## [1] 0.4042553

sdsex

## [1] 0.4960529

meanstatus

## [1] 45.23404

sdstatus

## [1] 17.26294

meanincome

## [1] 4.641915

sdincome

## [1] 3.551371

meanverbal

## [1] 6.659574

sdverbal

## [1] 1.856558

meangamble

## [1] 19.30106

sdgamble

## [1] 31.51587

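The data includes five variables: sex, status, income, verbal, and gamble, all stored as numbers. The mean of sex is 0.40 (SD 0.50), so if sex is a 0/1 code, about 40% of the rows are coded 1. Status averages 45.23 (SD 17.26), income averages 4.64 pounds per week (SD 3.55), the verbal score averages 6.66 (SD 1.86), and gamble averages 19.30 (SD 31.52). The standard deviation of gamble is larger than its mean, so the gambling values are very widely spread.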
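The mean and standard deviation of every column can also be computed in one step with sapply() instead of one variable at a time. This is a minimal sketch on a made-up data frame (toy and its values are hypothetical), not the assignment's required method.

```r
# Hypothetical data; only the column names echo the gambling dataset
toy <- data.frame(
  income = c(2, 4, 6, 8),
  verbal = c(5, 7, 7, 9)
)

# One row per variable, one column per statistic
stats <- data.frame(
  mean = sapply(toy, mean),
  sd   = sapply(toy, sd)
)
round(stats, 2)

# summary() adds the minimum, quartiles, median, and maximum
summary(toy)
```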

Question 3: Estimate the upper and lower threshold for the verbal score ( 5 Points )

HINT: A common way to estimate the upper and lower threshold is to take the mean (+ or -) 3 * standard deviation.

upper = meanverbal + 3 * sdverbal
lower = meanverbal - 3 * sdverbal
round(upper, 2)

## [1] 12.23

round(lower, 2)

## [1] 1.09
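The rule in the hint, mean plus or minus 3 standard deviations, can be wrapped in a small helper and used to flag values outside the thresholds. The function name thresholds and the scores vector are hypothetical and only illustrate the calculation.

```r
# Hypothetical helper: thresholds at mean +/- k standard deviations
thresholds <- function(x, k = 3) {
  c(lower = mean(x) - k * sd(x),
    upper = mean(x) + k * sd(x))
}

# Made-up scores, not the verbal column from the dataset
scores <- c(4, 5, 6, 6, 7, 7, 7, 8, 8, 9, 10)

lim <- thresholds(scores)
lim

# Values outside the thresholds would be treated as potential outliers
outliers <- scores[scores < lim["lower"] | scores > lim["upper"]]
outliers
```

Note that the multiplication by 3 applies to the standard deviation only; the mean is added or subtracted afterwards.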

Question 4: Calculate the z-score for income where x=13. Based on the income value x=13 pounds per week, how would you rate the income: low income, average income, high income. Why? ( 5 Points )

Hint: zscore = (x - mean)/sd

zscore = (13 - meanincome)/sdincome 
zscore

## [1] 2.353481

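The z-score of 2.35 means that an income of 13 pounds per week is about 2.35 standard deviations above the mean income of 4.6419. That makes it a high income relative to this sample, although it still falls within 3 standard deviations of the mean.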
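As a worked check, the z-score can be recomputed from the mean and standard deviation of income reported earlier. The cutoffs of -1 and 1 used to label the result are an assumed rule of thumb for illustration, not something given in the assignment.

```r
mean_income <- 4.641915  # mean of income reported above
sd_income   <- 3.551371  # standard deviation of income reported above

# Number of standard deviations that x = 13 lies from the mean
z <- (13 - mean_income) / sd_income
round(z, 2)

# Assumed cutoffs: within 1 SD of the mean counts as "average"
rating <- if (z > 1) "high" else if (z < -1) "low" else "average"
rating
```

A positive z-score always means the value lies above the mean, and a negative one means it lies below.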

Question 5: Create a histogram for the zscore of income. What do you notice about the shape? ( 5 Points )

Hint: To plot a histogram, use the function hist(variable).

zincome = (income - meanincome)/sdincome  # z-score of every income value, not only x = 13
hist(zincome)
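A histogram needs a whole vector of z-scores, one per observation. The sketch below standardizes a made-up vector with scale(), which computes (x - mean(x)) / sd(x); the values in x are hypothetical. Standardizing shifts and rescales the values but does not change the shape of their distribution.

```r
# Made-up, right-skewed values; not the income column from the dataset
x <- c(1, 2, 2, 3, 3, 3, 4, 10)

# scale() centers on the mean and divides by the standard deviation
z <- as.numeric(scale(x))

mean(z)  # essentially 0 after standardizing
sd(z)    # 1 after standardizing

# hist() draws the plot and returns the bin counts
h <- hist(z, main = "Histogram of z-scores", xlab = "z")
h$counts
```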

Question 6: Analyze the correlation plot below. Give relevant information about the negatively correlated, uncorrelated, and positively correlated variables. ( 5 Points )

The correlation plot is decreasing from left to right, indicating a negative correlation. Toward the right the information is denser, meaning there is a larger weight on those units for those variables. fb_likes, user_votes, and score greatly affect one another.
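The three cases named in the question can be told apart numerically with cor(), which returns values between -1 and 1. The sketch below uses a made-up data frame whose columns (x, pos, neg, none) are hypothetical and constructed so that each case appears once; it does not use the variables from the plot.

```r
x <- c(-2, -1, 0, 1, 2)

# Hypothetical columns built to show each kind of relationship with x
toy <- data.frame(
  x    = x,
  pos  = 2 * x + 1,  # rises with x: correlation of 1
  neg  = -3 * x,     # falls as x rises: correlation of -1
  none = x^2         # no linear trend with x: correlation of 0
)

r <- cor(toy)
round(r, 2)
```

Values near 1 indicate positive correlation, values near -1 indicate negative correlation, and values near 0 indicate no linear relationship. The none column shows that a correlation of 0 rules out only a linear relationship, since it still depends on x.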

Extra Credit: Analyze the correlation table below. Give relevant information about the negatively correlated, uncorrelated, and positively correlated variables. ( 5 Points )

Business Analytics - MIDTERM

CME Group Foundation Business Analytics Lab

Ally Ungashick