Experimental Research Design Exam

For this project, we will use the “jobsatisfaction” dataset in the package “Datarium”. We will try to understand whether there are differences in the job satisfaction between males and females, and wether the education level impact the job satisfaction score.

Let’s start by loading the first libraries required!

library(datarium)
library(ggplot2)
library(dplyr)

## 
## Caricamento pacchetto: 'dplyr'

## I seguenti oggetti sono mascherati da 'package:stats':
## 
##     filter, lag

## I seguenti oggetti sono mascherati da 'package:base':
## 
##     intersect, setdiff, setequal, union

library(nortest)

And now we can pick our dataset.

data("jobsatisfaction")

head(jobsatisfaction)

## # A tibble: 6 x 4
##   id    gender education_level score
##   <fct> <fct>  <fct>           <dbl>
## 1 1     male   school           5.51
## 2 2     male   school           5.65
## 3 3     male   school           5.07
## 4 4     male   school           5.51
## 5 5     male   school           5.94
## 6 6     male   school           5.8

The dataset records the job satisfaction score of 58 people. Our aim in this project is to evaluate the variables that statistically impact this score.

First things first, let’s try to visualize the situation.

ggplot(jobsatisfaction, aes(y=score, x=education_level, fill= gender)) +
   geom_boxplot()+ ylab("Satisfaction score")+ ggtitle("Satisfaction score by gender and education level") +
   scale_fill_manual(values = c("blue", "red"))

Seems that for school and college level of education, female are a little bit more satisfied, while for university males are more satisfied. Let’s see if this differences are significant.

First things first, we need to check for the sample sizes.

table(select(jobsatisfaction, -c(id, score)))

##         education_level
## gender   school college university
##   male        9       9         10
##   female     10      10         10

As we can see, the size of females and males (at least for school and college) samples are different.

We start by performing an independent t-test on whether or not there is a difference in satisfaction between males and females if they have just school as education level. Let’s prepare our data to this test.

male_school_score <- jobsatisfaction %>%
  filter(gender == "male" & education_level == "school")%>%
  select(score)
  

female_school_score <- jobsatisfaction %>%
  filter(gender == "female" & education_level == "school")%>%
  select(score)

both_school_score <- jobsatisfaction%>%
  filter(education_level == "school")%>%
  select(score, gender)

Let’s check for the t-test assumptions!

shapiro.test(male_school_score$score)

## 
##  Shapiro-Wilk normality test
## 
## data:  male_school_score$score
## W = 0.98043, p-value = 0.9664

shapiro.test(female_school_score$score)

## 
##  Shapiro-Wilk normality test
## 
## data:  female_school_score$score
## W = 0.96292, p-value = 0.8185

The p-value is really high in both cases, so we can go on with assumption number 2: the samples have equal variances.

Rough check: the ratio between the SDs must not be greater than 2.

sd(male_school_score$score)/sd(female_school_score$score)

## [1] 0.7669707

sd(female_school_score$score)/sd(male_school_score$score)

## [1] 1.303831

It seems that we are good, but we need to be sure so let’s perform a test.

bartlett.test(both_school_score$score,both_school_score$gender)

## 
##  Bartlett test of homogeneity of variances
## 
## data:  both_school_score$score and both_school_score$gender
## Bartlett's K-squared = 0.55084, df = 1, p-value = 0.458

The last assumption is that the dependent variable should not contain any outlier. In order to remove outliers, we create a function.

remove_outliers_iqr=function(x){
  IQR=summary(x)[5]-summary(x)[2]
  up_out=summary(x)[5]+1.5*IQR
  down_out=summary(x)[2]-1.5*IQR
  out_data=c(which(x<down_out), which(x>up_out))
  return (out_data)
}


male_school_score <- male_school_score %>%
  filter(!score %in% remove_outliers_iqr(male_school_score$score))

female_school_score <- female_school_score %>%
  filter(!score %in% remove_outliers_iqr(female_school_score$score))

Let’s check for the dimensions of the samples.

dim(male_school_score)

## [1] 9 1

dim(female_school_score)

## [1] 10  1

Seems that we hadn’t any outlier.

Now that assumptions are checked, we can perform our t-test. Hyphotesis: we’re going to test the hyphotesis assumption that females have different satisfaction with respect to males. Our null Hyphotesis will be that the satisfaction is equal for the 2 genders.

H0: mean(females) = mean(males)

Ha: mean(females) \(\neq\) mean(males)

t.test(female_school_score$score, male_school_score$score, alternative = "two.sided" )

## 
##  Welch Two Sample t-test
## 
## data:  female_school_score$score and male_school_score$score
## t = 1.6293, df = 16.621, p-value = 0.122
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.09340319  0.72206985
## sample estimates:
## mean of x mean of y 
##  5.741000  5.426667

Our p-value= 0.12 is greater than our significance level alpha = 0.05, so we can accept the null Hyphotesis: Females are equal satisfied by their jobs if they just have primary school as education.

Let’s now conduct the same experiment, but with college as level of education.

male_college_score <- jobsatisfaction %>%
  filter(gender == "male" & education_level == "college")%>%
  select(score)


female_college_score <- jobsatisfaction %>%
  filter(gender == "female" & education_level == "college")%>%
  select(score)

both_college_score <- jobsatisfaction %>%
  filter(education_level =="college")%>%
  select(score, gender)

Let’s check the assumptions now! 1st: Normality. We check with the Anderson-Darling normality test.

ad.test(male_college_score$score)

## 
##  Anderson-Darling normality test
## 
## data:  male_college_score$score
## A = 0.265, p-value = 0.5989

ad.test(female_college_score$score)

## 
##  Anderson-Darling normality test
## 
## data:  female_college_score$score
## A = 0.16973, p-value = 0.9055

At significance level 0.05 the normality is checked.

2nd: Homoschedasticity

sd(male_college_score$score)/sd(female_college_score$score)

## [1] 0.7154759

sd(female_college_score$score)/sd(male_college_score$score)

## [1] 1.397671

Checked: the ratio between the SDs is less than 2. Let’s perform also a test:

fligner.test(both_college_score$score, both_college_score$gender)

## 
##  Fligner-Killeen test of homogeneity of variances
## 
## data:  both_college_score$score and both_college_score$gender
## Fligner-Killeen:med chi-squared = 1.0117, df = 1, p-value = 0.3145

Checked with rough roule and with test.

We need to remove outliers.

male_college_score <- male_college_score%>%
  filter(!score %in% remove_outliers_iqr(male_college_score$score))

female_college_score <- female_college_score %>%
  filter(!score %in% remove_outliers_iqr(female_college_score$score))

dim(male_college_score)

## [1] 9 1

dim(female_college_score)

## [1] 10  1

Again, seems that we didn’t have any outlier.

Now we can perform the t-test.

Hyphotesis: Female at college education level have different satisfaction with respect to males on the job.

H0: mean(females) = mean(males)

Ha: mean(females) \(\neq\) mean(males)

t.test(female_college_score$score, male_college_score$score, alternative = "two.sided")

## 
##  Welch Two Sample t-test
## 
## data:  female_college_score$score and male_college_score$score
## t = 1.2747, df = 16.239, p-value = 0.2203
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.1584331  0.6377664
## sample estimates:
## mean of x mean of y 
##  6.463000  6.223333

Again, we accept the null Hyphotesis since our p-value = 0.22 is greater than the level of significance alpha = 0.05. So the level of satisfaction seems quite the same.

Now we perform the last t-test!

male_university_score <- jobsatisfaction %>%
  filter(gender == "male" & education_level == "university") %>%
  select(score)

female_university_score <- jobsatisfaction %>%
  filter(gender == "female" & education_level == "university") %>%
  select(score)

both_university_score <- jobsatisfaction %>%
  filter(education_level == "university") %>%
  select(score, gender)

Let’s check again the assumptions for the t-test.

1st: Normality. We use a Lilliefors(Kolmogorov–Smirnov) test.

lillie.test(female_university_score$score)

## 
##  Lilliefors (Kolmogorov-Smirnov) normality test
## 
## data:  female_university_score$score
## D = 0.13815, p-value = 0.8501

lillie.test(male_university_score$score)

## 
##  Lilliefors (Kolmogorov-Smirnov) normality test
## 
## data:  male_university_score$score
## D = 0.21077, p-value = 0.2364

Normality checked for both samples.

Homoschedasticity with rough rule:

sd(male_university_score$score)/sd(female_university_score$score)

## [1] 0.4739722

sd(female_university_score$score)/sd(male_university_score$score)

## [1] 2.109829

Seems that we have a problem by comparing the two variances with the rough rule! Let’s try using a test.

fligner.test(both_university_score$score, both_university_score$gender)

## 
##  Fligner-Killeen test of homogeneity of variances
## 
## data:  both_university_score$score and both_university_score$gender
## Fligner-Killeen:med chi-squared = 2.2418, df = 1, p-value = 0.1343

At significance level 0.05, homoschedasticity assumption is checked since our p-value is 0.13.

We can now remove outliers!

male_university_score <- male_university_score %>%
  filter(!score %in% remove_outliers_iqr(male_university_score$score))

female_university_score <- female_university_score %>%
  filter(!score %in% remove_outliers_iqr(female_university_score$score))

dim(male_university_score)

## [1] 10  1

dim(female_university_score)

## [1] 10  1

Let’s perform the t-test!

H0:mean(females) = mean(males)

Ha: mean(females) \(\neq\) mean(males)

t.test(male_university_score$score, female_university_score$score, alternative = "two.sided" )

## 
##  Welch Two Sample t-test
## 
## data:  male_university_score$score and female_university_score$score
## t = 2.6994, df = 12.849, p-value = 0.01837
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.1760765 1.5959235
## sample estimates:
## mean of x mean of y 
##     9.292     8.406

Look! The p-value is smaller than the level of significance alpha = 0.05, so we reject the null Hyphotesis.

ONE-WAY ANOVA

We want to know if there is any significant difference between the average satisfaction score in 3 different education levels! In this case, we are not interested in differences between genders!

H0: There is not statistically significant difference between the means in different education levels

Ha: At least one mean is significantly different from the others.

Let’s do some primary investigations!

ggplot(jobsatisfaction) + 
  geom_density(aes(x=score, fill=education_level), alpha = 0.4)+
  xlab("Job satisfaction score") + ylab("Density")+
  ggtitle("Score distributions according to the education level")

Each group distribution seems pretty normal, but we perform some additional tests to be sure.

  shapiro.test(both_school_score$score)

## 
##  Shapiro-Wilk normality test
## 
## data:  both_school_score$score
## W = 0.97513, p-value = 0.8722

shapiro.test(both_college_score$score)

## 
##  Shapiro-Wilk normality test
## 
## data:  both_college_score$score
## W = 0.97435, p-value = 0.8588

shapiro.test(both_university_score$score)

## 
##  Shapiro-Wilk normality test
## 
## data:  both_university_score$score
## W = 0.91518, p-value = 0.08005

The normality is checked for each of the education level groups.

Now we need to check for the homogeneity of variances.

fligner.test(jobsatisfaction$score, jobsatisfaction$education_level)

## 
##  Fligner-Killeen test of homogeneity of variances
## 
## data:  jobsatisfaction$score and jobsatisfaction$education_level
## Fligner-Killeen:med chi-squared = 4.1433, df = 2, p-value = 0.126

Checked! The p-value= 0.126 is greater than our significance level alpha = 0.05.

Let’s perform Anova.

mod <- aov(score ~ education_level , data =jobsatisfaction)
summary(mod)

##                 Df Sum Sq Mean Sq F value Pr(>F)    
## education_level  2  114.0   57.00   153.7 <2e-16 ***
## Residuals       55   20.4    0.37                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The p-value is really close to 0, so we reject the null-Hyphotesis. This means that there is a significant difference in job satisfaction between different education levels.

Problem: Which couples of means are different from each other?

We perform a Post-hoc test in order to find out this.

TUKEY = TukeyHSD(mod)
TUKEY

##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = score ~ education_level, data = jobsatisfaction)
## 
## $education_level
##                         diff       lwr      upr     p adj
## college-school     0.7573684 0.2814578 1.233279 0.0009414
## university-school  3.2568947 2.7869706 3.726819 0.0000000
## university-college 2.4995263 2.0296022 2.969450 0.0000000

All the p-values are smaller than our level of significance. With the university education level, however, it seems we have a stronger evidence against the null hyphotesis.

library(multcompView)
plot(TUKEY , las=1 , col="red")

TWO-WAY ANOVA

We use Two-way ANOVA to evaluate simultaneously the effect of two categorical variables on a response numerical variable. So we will evaluate the effect of the 2 categorical variables(gender and education level) simultaneously on the job satisfaction score.

First things first, let’s check(again) sample sizes.

table(select(jobsatisfaction,-c(id, score)))

##         education_level
## gender   school college university
##   male        9       9         10
##   female     10      10         10

As we see, not all of the cells have equal size, so this is an UNBALANCED design.

Let’s start with preliminary investigations by doing a Two-way interaction plot.

interaction.plot(x.factor = jobsatisfaction$education_level, trace.factor = jobsatisfaction$gender, 
                 response = jobsatisfaction$score, fun = mean, type = "b", legend = TRUE,
                 xlab = "Education level",
                 ylab = "Job Satisfaction Score",pch = c(1,19), col = c("blue", "red"))

Let’s check the assumptions for two-way anova! First assumption: normality. We need to check it for each cell.

shapiro.test(jobsatisfaction$score[jobsatisfaction$gender=="male" & jobsatisfaction$education_level == "school"]) #School

## 
##  Shapiro-Wilk normality test
## 
## data:  jobsatisfaction$score[jobsatisfaction$gender == "male" & jobsatisfaction$education_level == "school"]
## W = 0.98043, p-value = 0.9664

shapiro.test(jobsatisfaction$score[jobsatisfaction$gender == "female" & jobsatisfaction$education_level == "school"])

## 
##  Shapiro-Wilk normality test
## 
## data:  jobsatisfaction$score[jobsatisfaction$gender == "female" & jobsatisfaction$education_level == "school"]
## W = 0.96292, p-value = 0.8185

ad.test(jobsatisfaction$score[jobsatisfaction$gender=="male" & jobsatisfaction$education_level == "college"])#College

## 
##  Anderson-Darling normality test
## 
## data:  jobsatisfaction$score[jobsatisfaction$gender == "male" & jobsatisfaction$education_level ==     "college"]
## A = 0.265, p-value = 0.5989

ad.test(jobsatisfaction$score[jobsatisfaction$gender == "female" & jobsatisfaction$education_level == "college"])

## 
##  Anderson-Darling normality test
## 
## data:  jobsatisfaction$score[jobsatisfaction$gender == "female" & jobsatisfaction$education_level ==     "college"]
## A = 0.16973, p-value = 0.9055

shapiro.test(jobsatisfaction$score[jobsatisfaction$gender == "male" & jobsatisfaction$education_level=="university"])#University

## 
##  Shapiro-Wilk normality test
## 
## data:  jobsatisfaction$score[jobsatisfaction$gender == "male" & jobsatisfaction$education_level == "university"]
## W = 0.91572, p-value = 0.3226

shapiro.test(jobsatisfaction$score[jobsatisfaction$gender=="female" & jobsatisfaction$education_level =="university"])

## 
##  Shapiro-Wilk normality test
## 
## data:  jobsatisfaction$score[jobsatisfaction$gender == "female" & jobsatisfaction$education_level == "university"]
## W = 0.95044, p-value = 0.6737

The normality assumption is checked for each cell.

We need to check for similarity of variances.

library(car)

## Caricamento del pacchetto richiesto: carData

## 
## Caricamento pacchetto: 'car'

## Il seguente oggetto è mascherato da 'package:dplyr':
## 
##     recode

leveneTest(score ~ education_level*gender, data = jobsatisfaction)

## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value  Pr(>F)  
## group  5   2.197 0.06856 .
##       52                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The p-value=0.06 is greater than our significance level alpha =0.05, so our assumption is checked.

Now we can perform our Anova.

H0:The effect of education level on the satisfaction score does not depend on the gender.

Ha: For different levels of education level, the gender affects differently the job satisfaction score.

two_way_anova <- aov(score ~ education_level*gender, data = jobsatisfaction)
Anova(two_way_anova, type = "III")

## Anova Table (Type III tests)
## 
## Response: score
##                         Sum Sq Df  F value    Pr(>F)    
## (Intercept)            265.038  1 876.0876 < 2.2e-16 ***
## education_level         80.128  2 132.4321 < 2.2e-16 ***
## gender                   0.468  1   1.5471  0.219147    
## education_level:gender   4.440  2   7.3379  0.001559 ** 
## Residuals               15.731 52                       
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

What we can see from the table above is that there is no statistically significant difference in mean job satisfaction score between females and males (p = 0.21) but there are significant differences between different education levels (p close to 0). Also the interaction, as we see, is significant.

Now, wait: Which pair of education levels are different between them? Let’s do a post-hoc test!

Tukey = TukeyHSD(two_way_anova, which = "education_level")
Tukey

##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = score ~ education_level * gender, data = jobsatisfaction)
## 
## $education_level
##                         diff       lwr      upr     p adj
## college-school     0.7573684 0.3268386 1.187898 0.0002637
## university-school  3.2568947 2.8317806 3.682009 0.0000000
## university-college 2.4995263 2.0744121 2.924641 0.0000000

All the couples, as we see, lead to reject the null Hyphotesis.

Experimental Research Design Exam

Michele Corvino

7/1/2022

ONE-WAY ANOVA

TWO-WAY ANOVA