1.1 First, the data sets should be imported into R. Since CRAN hosts a dedicated package containing the examples and cases from The Statistical Sleuth (3rd ed.), the data sets can be loaded easily by installing the Sleuth3 package.

# install.packages("Sleuth3")
require(Sleuth3)
## Loading required package: Sleuth3

1.2 After installing the Sleuth3 package, we are ready to start examining our first case. Since the data sets for the cases are installed along with the package, we can access Case 1's data set directly:

data <- Sleuth3::case0101
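
A quick look at the structure of the data frame confirms what was loaded (a minimal sketch; output not shown):

str(data) # 47 observations of 2 variables: a numeric Score and a two-level Treatment factor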

1.3 Case 1 covers a randomized experiment conducted by Teresa Amabile for the research piece listed below:

Amabile, T.M. (1985). Motivation and Creativity: Effects of Motivational Orientation on Creative Writers, Journal of Personality and Social Psychology 48(2): 393-399.

The idea behind the experiment is to test the effect of intrinsic versus extrinsic motivation on creativity. Subjects with considerable experience in creative writing were randomly assigned to one of two treatment groups.

table(data$Treatment)
## 
## Extrinsic Intrinsic 
##        23        24

1.4 The 24 subjects in the intrinsic group were asked to complete a questionnaire that involved ranking intrinsic reasons for writing (doing something because doing it brings satisfaction), while the other 23 subjects completed a different questionnaire used as a device to get them thinking about extrinsic motivation. A small portion of the data set can be viewed with the following command.

head(data)
##   Score Treatment
## 1   5.0 Extrinsic
## 2   5.4 Extrinsic
## 3   6.1 Extrinsic
## 4  10.9 Extrinsic
## 5  11.8 Extrinsic
## 6  12.0 Extrinsic

1.5 After finishing the questionnaires, all subjects were asked to write poems. The poems were submitted to 12 poets, who evaluated them on a 40-point creativity scale. Below is a summary of the creativity scores for each group, where each subject's score is the average of the 12 judges' ratings.

tapply(data$Score,data$Treatment,summary)
## $Extrinsic
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    5.00   12.15   17.20   15.74   18.95   24.00 
## 
## $Intrinsic
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   12.00   17.42   20.40   19.88   22.30   29.70

And the standard deviations

tapply(data$Score,data$Treatment,sd)
## Extrinsic Intrinsic 
##  5.252596  4.439513

1.6 We can also use a box plot to explore the data.

boxplot(data$Score ~ data$Treatment, xlab="Treatment", ylab="Creativity Scores",
        horizontal=FALSE, main="Haiku Creativity Scores for 47 Creative Writing Students",
        names=c("23 'Extrinsic' Group Students","24 'Intrinsic' Group Students"))
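
The same comparison could also be drawn with ggplot2 (a minimal sketch, assuming the package is installed; it is loaded later as a dependency of mosaic):

library(ggplot2)
ggplot(data, aes(x = Treatment, y = Score)) +
  geom_boxplot() +
  labs(x = "Treatment", y = "Creativity Scores",
       title = "Haiku Creativity Scores for 47 Creative Writing Students")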

1.7 By employing t.test() we can check whether there is a statistically significant difference between the means of the two groups.

But before conducting a t test, we should first verify that our data satisfies its assumptions.

Assumption 1: The scores of both groups should be normally distributed

tapply(data$Score,data$Treatment,shapiro.test) 
## $Extrinsic
## 
##  Shapiro-Wilk normality test
## 
## data:  X[[i]]
## W = 0.92362, p-value = 0.07969
## 
## 
## $Intrinsic
## 
##  Shapiro-Wilk normality test
## 
## data:  X[[i]]
## W = 0.96813, p-value = 0.6211

Since both p-values exceed 0.05, we fail to reject normality: the scores of both the Extrinsic and Intrinsic groups can be treated as normally distributed at the 0.05 alpha level.
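
The Shapiro-Wilk results can also be complemented with a visual check, for instance with base R Q-Q plots (a minimal sketch):

par(mfrow = c(1, 2)) # two panels side by side
qqnorm(data$Score[data$Treatment == "Extrinsic"], main = "Extrinsic")
qqline(data$Score[data$Treatment == "Extrinsic"])
qqnorm(data$Score[data$Treatment == "Intrinsic"], main = "Intrinsic")
qqline(data$Score[data$Treatment == "Intrinsic"])
par(mfrow = c(1, 1)) # reset the plotting layout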

Assumption 2: The two groups should have equal variances

var.test(data$Score[data$Treatment=="Intrinsic"],data$Score[data$Treatment=="Extrinsic"])
## 
##  F test to compare two variances
## 
## data:  data$Score[data$Treatment == "Intrinsic"] and data$Score[data$Treatment == "Extrinsic"]
## F = 0.71437, num df = 23, denom df = 22, p-value = 0.4289
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
##  0.3047427 1.6612046
## sample estimates:
## ratio of variances 
##          0.7143691

Given that the p-value of the F test equals 0.4289, we fail to reject the null hypothesis, so we may treat the variances of the two groups' scores as equal.
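
Since the F test is sensitive to departures from normality, a more robust alternative such as Levene's test could be used as a cross-check (a sketch, assuming the car package is installed):

# install.packages("car")
library(car)
leveneTest(Score ~ Treatment, data = data) # H0: the group variances are equal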

1.7.1 Finally, at the 95% confidence level, we can say that creativity scores are significantly lower under extrinsic motivation than under intrinsic motivation.

t.test(data$Score ~ data$Treatment, conf.level=0.95 )
## 
##  Welch Two Sample t-test
## 
## data:  data$Score by data$Treatment
## t = -2.9153, df = 43.108, p-value = 0.005618
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -7.010803 -1.277603
## sample estimates:
## mean in group Extrinsic mean in group Intrinsic 
##                15.73913                19.88333
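
Note that t.test() defaults to Welch's test, which does not assume equal variances. Since the F test above supported equal variances, the pooled-variance version could be used as well (a sketch; the conclusion should not change):

t.test(data$Score ~ data$Treatment, conf.level=0.95, var.equal=TRUE)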

1.8 We found that motivation has a significant effect on the level of creativity. Even though this was a randomized experiment, it is also possible that the difference in creativity scores was caused by differences between the questionnaires themselves: since the subjects were primed by these two surveys, their engagement with those particular questions, rather than motivational orientation as such, could be the factor behind the difference between the creativity measures.

Also, extending this inference to any other group might be speculative.

1.9 Randomization Distribution of the Test Statistic

a.1 If μ1 and μ2 denote the means of populations 1 and 2, respectively, then μ2 − μ1 becomes the single parameter for answering the questions of interest. Inferences and uncertainty measures are based on the difference between the sample averages, Ȳ2 − Ȳ1, which estimates the difference in population means.

Given the treatment assignments supplied by the book, the difference between the means of the "Intrinsic" and "Extrinsic" groups is 4.14:

diffmeans = mean(data$Score[data$Treatment=="Intrinsic"]) - mean(data$Score[data$Treatment=="Extrinsic"])
diffmeans
## [1] 4.144203

a.2 The crucial question is: if we repeated this analysis with many different random treatment assignments for the 47 individuals, would a difference as large as 4.14 be common, or would the differences cluster in a much narrower range? If differences this large rarely occur under random reassignment, then the observed difference is a rare outcome under the null hypothesis of no treatment effect, and chance assignment alone is an implausible explanation.

a.3 Let's check this by generating 1000 different random treatment assignments for our 47 individuals.

numsim=1000
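# set.seed() could be called here to make the shuffles below reproducible (seed value arbitrary)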
require(mosaic)
## Loading required package: mosaic
## Loading required package: dplyr
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
## Loading required package: lattice
## Loading required package: ggplot2
## Loading required package: mosaicData
## Loading required package: Matrix
## 
## The 'mosaic' package masks several functions from core packages in order to add additional features.  
## The original behavior of these functions should not be affected by this.
## 
## Attaching package: 'mosaic'
## The following object is masked from 'package:Matrix':
## 
##     mean
## The following objects are masked from 'package:dplyr':
## 
##     count, do, tally
## The following objects are masked from 'package:stats':
## 
##     D, IQR, binom.test, cor, cov, fivenum, median, prop.test,
##     quantile, sd, t.test, var
## The following objects are masked from 'package:base':
## 
##     max, mean, min, prod, range, sample, sum
nulldist = do(numsim)*diff(mean(data$Score~shuffle(data$Treatment))) # Creates 1000 shuffled treatment assignments and computes the difference between group score averages for each

a.4 Computing a 95% percentile interval from the 1000 simulated differences, we see that the observed difference of 4.14 lies well outside the interval. A difference this large is therefore very unlikely to arise from random label assignment alone, which supports a genuine treatment effect.

nrow(nulldist) #Shows number of rows for nulldist distribution 
## [1] 1000
confint(nulldist)
##        name     lower    upper level     method estimate
## 1 Intrinsic -2.991368 3.182491  0.95 percentile 4.144203
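
Equivalently, an empirical two-sided p-value can be computed directly from the null distribution: the proportion of shuffled differences at least as extreme as the observed one (a sketch; the exact value varies from run to run but should be small, consistent with the t test):

sum(abs(nulldist$Intrinsic) >= abs(diffmeans)) / numsim # empirical two-sided p-value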

a.5 A histogram might be an even better way to illustrate how small a portion of the null distribution reaches values as extreme as the observed 4.14.

histogram(~ Intrinsic, nint=25, data=nulldist, v=c(-4.14,4.14))

CASE 2

1.9 Did a bank discriminate by paying higher starting salaries to men than to women? The H. V. Roberts study cited below investigates this question using the salaries of entry-level clerical employees hired by a bank.

Roberts, H.V. (1979). Harris Trust and Savings Bank: An Analysis of Employee Compensation, Report 7946, Center for Mathematical Studies in Business and Economics, University of Chicago Graduate School of Business.

data2<-Sleuth3::case0102

1.10 As can be seen from the exhibit below, there are 32 male and 61 female employees, and the two groups differ when the mean and median salary levels are compared.

summary(data2)
##      Salary         Sex    
##  Min.   :3900   Female:61  
##  1st Qu.:4980   Male  :32  
##  Median :5400              
##  Mean   :5420              
##  3rd Qu.:6000              
##  Max.   :8100
tapply(data2$Salary,data2$Sex,summary)
## $Female
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    3900    4800    5220    5139    5400    6300 
## 
## $Male
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    4620    5400    6000    5957    6075    8100

1.11 Perhaps some visual representations work better for this case. Here are box plots for the two groups.

boxplot(data2$Salary ~ data2$Sex, ylab="Beginning Salary", names=c("61 Female Hires","32 Male Hires"), main="Entry Level Clerical Employee Salaries for a Bank between 1969 and 1977")

1.12 The histograms suggest that women's salaries are concentrated between 5000 and 5500, while men's salaries mainly cluster between 5500 and 6000. We can also observe the influence of a strong outlier in the men's series.

hist(data2$Salary[data2$Sex=="Female"], main="Beginning Salaries for Women", xlab="Salary Level", ylab="Number of Employees")

hist(data2$Salary[data2$Sex=="Male"], main="Beginning Salaries for Men", xlab="Salary Level", ylab="Number of Employees")

1.13 After this visual evaluation, we can test the significance of the difference between men's and women's salary levels with a t test. But before conducting it, we should first check the basic assumptions.

A - Homogeneity of variances: from the calculated standard deviations alone it is impossible to tell whether the difference between the two statistics is significant, so we run an F test to see whether the variances are equal.

tapply(data2$Salary,data2$Sex,sd)
##   Female     Male 
## 539.8707 690.7333

The F test result shows that at the 0.95 confidence level it is impossible to reject H0, so we may treat the variances of the two series as equal.

var.test(data2$Salary[data2$Sex=="Female"],data2$Salary[data2$Sex=="Male"])
## 
##  F test to compare two variances
## 
## data:  data2$Salary[data2$Sex == "Female"] and data2$Salary[data2$Sex == "Male"]
## F = 0.61088, num df = 60, denom df = 31, p-value = 0.1022
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
##  0.3178084 1.1034440
## sample estimates:
## ratio of variances 
##          0.6108839
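
For completeness, normality could be checked here as well, just as in Case 1 (a sketch, output not shown; the strong outlier in the men's series may affect the result):

tapply(data2$Salary, data2$Sex, shapiro.test)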

1.14 Finally, the t test shows that the true difference in means is not equal to 0, so we can say women's starting salaries are significantly lower than men's. The evidence is consistent with discrimination, but other possible explanations cannot be ruled out; for example, the male hires may have had more years of prior experience.

 t.test(data2$Salary ~ data2$Sex, var.equal=TRUE)
## 
##  Two Sample t-test
## 
## data:  data2$Salary by data2$Sex
## t = -6.2926, df = 91, p-value = 1.076e-08
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1076.2465  -559.7985
## sample estimates:
## mean in group Female   mean in group Male 
##             5138.852             5956.875
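
As a robustness check, the equal-variance assumption could be dropped; t.test() runs Welch's version by default (a sketch, output not shown; the conclusion is expected to hold):

t.test(data2$Salary ~ data2$Sex) # Welch two-sample t-test by default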

1.15 Permutation Test for Case 2

numsim=1000
require(mosaic)
nulldist2 = do(numsim)*diff(mean(data2$Salary~shuffle(data2$Sex))) # Creates 1000 shuffled sex-label assignments and computes the difference between group salary averages for each
nrow(nulldist2)
## [1] 1000
confint(nulldist2)
##   name     lower    upper level     method estimate
## 1 Male -281.1117 313.5142  0.95 percentile 818.0225
histogram(~ Male, nint=50, data=nulldist2, v=c(-818,818))
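
As in Case 1, the observed difference (about 818) lies far outside the null interval. An empirical two-sided p-value can be computed directly (a sketch; the exact value varies from run to run but should be very small):

diffmeans2 = diff(mean(data2$Salary ~ data2$Sex)) # observed Male - Female difference in mean salary
sum(abs(nulldist2$Male) >= abs(diffmeans2)) / numsim # empirical two-sided p-value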