Design of the Study

The main objective of this research was to investigate the effect of online digital stories on the perception of the distinction between two similar English consonants, /p/ and /b/, in Saudi kindergarten children (aged between four and five years-old). The investigation looks at differences between participants’ pre- and post-performance on phoneme identification tasks in an interactive digitalized story group (IG) and those in the non-interactive digitalized story group (NIG). Also, the current research tries to establish whether Saudi EFL kindergarten children enjoy using online digitalized stories (whether interactive or non-interactive) and the differences in enjoyment of the two types. This investigation has four phases:
1. Phoneme perception pre-test.
2. The digital stories (interactive or non-interactive) (i.e., a training phase).
3. Phoneme perception post-test.
4. Evaluation using a rating task.

Questions Addressed

1- What is the influence of interactive and non-interactive DST on the ability of Saudi EFL kindergarten students to perceive the distinction between English [p] and [b]?
2- Do the Saudi EFL kindergarten students prefer the interactive or non-interactive version of DST?


Strategy

  1. General Data Exploration: assess and deal with (a) missing values, (b) distribution and variance of Features (c) collinearity between Features, (d) Theoratical relevance of features.

  2. Answer Question 1

  3. Answer Question 2


Importing Required Libraries

## MagrittR allows us to pipe Data using %>%
library(magrittr)

## Dplyr allows us to manipulate Data
library(dplyr)

## Reshape allows Data Format Manipulation
library(reshape2)

## Data Visualization
library(plotly)

## Visualize Missing Values
library(naniar)

## Stringr is used to match string patters
library(stringr)

## T-test
library(lsr)

## Fisher Test
library(stats)

## Linear Mixed Model Logit Function
library(lme4)

## ICC Calculation
library(sjstats)

## VIF Calculation
library(car)

## Dev tools
library(devtools)

## lattice for visualization
library(lattice)

## Stargazer for model Comparisons
library(stargazer)

General Data Exploration

  1. missing values
  1. distribution and variance of Features
  1. correlation between Features
  1. theoratical relevance of Features

Import Data Set

## Read all Data set
all_data <- read.csv("alldata.csv")

## Read Data set Dictionary
#> Might use through the analysis
data_dictionary <- read.csv("alldata columns dictionary.csv")[1,]

A. Exploring NA Values

## Check Half of the Data 
gg_miss_var(all_data[,1:(ncol(all_data)/2)])

## Check Second Half of the Data
gg_miss_var(all_data[,(ncol(all_data)/2):ncol(all_data)])

  1. other:first_english_interaction NA values reaching over 900 missing values. To deal with this issue, we replace all NA values with “None” to resemble that no other first English interaction was made.
all_data$other.first_English_interaction[is.na(all_data$other.first_English_interaction)] = "None"
  1. what_otherlanguage NA values will be dealt with similarly as above.
all_data$what_otherlanguage[is.na(all_data$what_otherlanguage)] = "None"
  1. namecountry NA values cannot be assumed to be any value, and “None” will not signify anything. Therefore, we will remove this column from our data since more than 600 values /1000 are missing. Also, country will be enough to assess whether the child has lived in an English speaking country or not.
all_data <- all_data %>% dplyr::select(-namecountry)
  1. formal_study has around 200 missing values corresponding to 10 students, we impute these values by the mode (the majority of formal_study) as we cannot assume that these missing values refer to 0 years or other.
## Function to get mode
getmode <- function(v) {
   uniqv <- unique(v)
   uniqv[which.max(tabulate(match(v, uniqv)))]
}

## Get mode
mode_fs = getmode(all_data$formal_study)

## Replace
all_data$formal_study[is.na(all_data$formal_study)] = mode_fs
  1. freqlistX is expected to have missing values, as the less frequent words will fill in less space in the table. We ignore this issue for now, and take it into consideration later in the analysis.
## Re-Check for missing values in the first half
## Check Half of the Data 
gg_miss_var(all_data[,1:(ncol(all_data)/2)])

B. Exploring Distribution

Age Distribution by Gender

Age is restricted between 4 and 5 over both genders.

fig <- plot_ly(all_data, y = ~age, color = ~gender, type = "box") %>%
  layout(title="Age Distribution by Gender")
fig

Gender Distribution

Gender is nearly equally distributed with M=26, F=24

## Get Data
data_gender = unique(all_data[c("ID", "gender")])

## Plot
fig <- plot_ly(data_gender, y= ~gender, color = ~gender, type = "histogram") %>%
  layout(title="Gender Distribution")
fig

KG Stage Distribution

All participants are in KG 2 (20) or 3 (30). The data is nearly balanced. We proceed with no change.

## Get Data
data_kg = unique(all_data[c("ID", "KG_stage")])

fig <- plot_ly(data_kg, y= ~as.factor(KG_stage), color = ~as.factor(KG_stage), type="histogram") %>%
  layout(title="KG Stage Distribution", yaxis=list(title="KG Stage"))
fig

Age Distribution by KG Stage

We can see that all Those of age 5 are in KG3 and all those of age 4 are in KG2. This means that these two variables are directly correlated. We should keep only one in our analysis. We keep age and we remove KG_stage from the data.

## Get Data
data_age_kg = unique(all_data[c("ID", "age", "KG_stage")])

fig <- plot_ly(data_age_kg, y = ~as.factor(age), color = ~as.factor(KG_stage), type = "histogram") %>%
  layout(title="Age Distribution by KG Stage", yaxis=list(title="Age"))
fig
## Remove KG_stage from Data
all_data <- all_data %>% dplyr::select(-KG_stage)

Country: English or Not

The graph shows 17 participants have lived in English speaking countries while 33 of them have lived in non-English Speaking countries. In future analysis, we will be interested in assessing how this difference in countries that speak english or not affects the ability of the participants to differentiate between the phonemes of interest.

## Get Data
data_country = unique(all_data[c("ID", "country")])

fig <- plot_ly(data_country, y = ~as.factor(country), color= ~as.factor(country), type = "histogram") %>%
  layout(title="Number of Participants that Lived in English Countries or Not",
         yaxis=list(title="Lived in English Country (yes) or not (no)"))
fig

First Encounter w/English

The Data Collected contains age of the child when first encountered with english, to deduce years after first encounter with english we subtract age from age when first encountered with english (age_english).

Later in the study, we will analyze the scores of the participants in relation to this value if necessary.

## Save in Temporary vector
age_english = all_data$age_english
## Remove from data set
all_data <- all_data %>% dplyr::select(-age_english)
## Add years of English
all_data <- all_data %>% mutate(years_english = age - age_english)
## Check Years of English has been added
head(unique(all_data[c("ID","years_english")], n= 10))
##       ID years_english
## 1   IG.1             1
## 21  IG.2             3
## 41  IG.3             2
## 61  IG.4             1
## 81  IG.5             2
## 101 IG.6             0
## Plot
data_yearsE = unique(all_data[c("ID", "years_english")])

fig <- plot_ly(data_yearsE, y = ~years_english, color = ~as.factor(years_english), type = "histogram") %>%
  layout(title="Number of Participants vs Years of English Knowledge",
         yaxis=list(title="Years of English Knowledge"),
         xaxis=list(title="Frequency of Participants"))
fig

Years Learning English

Years learning english is a more accurate measure than the first encounter with english. This variable is already collected and has 5 levels in the following order: 0-6 months, 1year, 2years, 3years, 4years. We will encode these levels as 0,1,2,3,4 respecitevely.

Note: The original data set has miss-spelling with “1 year” and “1year” and “1year” but they are all the same. We correct this issue and continue.

We will assess this variable with respect to the students’ scores later in the study.

We also note that the frequency of 1year is high, this might be due to the imputation technique we did. We might have to reassess this technique.

## Check Levels
levels(as.factor(all_data$formal_study))
## [1] "0-6 months" "1 year"     "1year"      "1year "     "2years"    
## [6] "3years"     "4years"
## Correct Levels
all_data$formal_study[all_data$formal_study == "1 year"] = "1year"
all_data$formal_study[all_data$formal_study == "1year "] = "1year"

## Encode
all_data$formal_study[all_data$formal_study == "0-6 months"] = 0
all_data$formal_study[all_data$formal_study == "1year"]      = 1
all_data$formal_study[all_data$formal_study == "2years"]     = 2
all_data$formal_study[all_data$formal_study == "3years"]     = 3
all_data$formal_study[all_data$formal_study == "4years"]     = 4

## Plot
data_formalE = unique(all_data[c("ID", "formal_study")])

fig <- plot_ly(data_formalE, y = ~formal_study, color = ~as.factor(formal_study), type = "histogram") %>%
  layout(title="Number of Participants vs Years of Formal English Education",
         yaxis=list(title="Years of Formal English Education"),
         xaxis=list(title="Frequency of Participants"))
fig

English Proficiency

The Data Collected contains english proficiency of children in three levels Few words, Short Phrases, and Fluent. We encode these levels as 0, 1, 2. Another possible encoding would be -1,0,1 but the latter will penalize participants with few words of knowledge instead of being neutral (0), which we cannot assume. For this reason, we stick to 0,1,2 to resemble the distance between knowledge and how well theoratically might perform.

Later in the study, we will analyze the scores of the participants in relation to their english proficiency.

## Observe Levels
levels(as.factor(all_data$spoken))
## [1] "Few words"     "Fluent"        "Short phrases"
## Encode
all_data$spoken[all_data$spoken=="Few words"]     = 0
all_data$spoken[all_data$spoken=="Short phrases"] = 1
all_data$spoken[all_data$spoken=="Fluent"]        = 2

## Plot
data_spoken = unique(all_data[c("ID", "spoken")])

fig <- plot_ly(data_spoken, y = ~as.factor(spoken), color = ~as.factor(spoken), type = "histogram") %>%
  layout(title="Number of participants vs proficiency of spoken english",
         yaxis=list(title="English Proficiency (0 few words, 1 short phrases, 2 fluent)"),
         xaxis=list(title="Frequency of Participants"))
fig

Parents’ Educational Level

The parents’ educational level has 4 levels: Primary/Intermediate, Intermediate/Secondary, Undergraduate, and Postgraduate. We will encode these as 0,1,2,3 respectively.

Similary, later in the study, we will analyze the scores of the participants in relation to their parents’ level of education. It is important to note that the data is skewed.

## Show levels
levels(as.factor(all_data$parent_ed))
## [1] "Intermediate/Secondary" "Postgraduate"           "Primary/Intermediate"  
## [4] "Undergraduate"
## Encode
all_data$parent_ed[all_data$parent_ed == "Primary/Intermediate"]   = 0
all_data$parent_ed[all_data$parent_ed == "Intermediate/Secondary"] = 1
all_data$parent_ed[all_data$parent_ed == "Undergraduate"]          = 2
all_data$parent_ed[all_data$parent_ed == "Postgraduate"]           = 3

## Plot
data_parent_ed = unique(all_data[c("ID","parent_ed")])

fig <- plot_ly(data_parent_ed, y = ~as.factor(parent_ed), color = ~as.factor(parent_ed), type = "histogram") %>%
  layout(title="Number of Participants vs Parents' level of Education",
         yaxis=list(title="Level of Education (0 primary, 3 postgraduate)"),
         xaxis=list(title="Frequency of Participants"))

fig

Older Siblings

  1. participants have older siblings while the rest (30) dont. It might also be interesting to assess their scores with respect to having older siblings or not.
data_older_sib = unique(all_data[c("ID","older.Sib")])

fig <- plot_ly(data_older_sib, y= ~as.factor(older.Sib), color = ~as.factor(older.Sib), type = "histogram") %>%
  layout(title="Number of Participants vs Presence of Older Siblings",
         yaxis=list(title="Older Siblings"),
         xaxis=list(title="Frequency of Participants"))

fig

Speaking English

We assess the percentages of time the participants speak to with their family and friends and school in English. Later we do the same thing on both groups IG and NIG, pre and post-test.

Findings: What we can see here is that the students speak English consistently with their Family less than they do with their Friends, which in turn is less than they do in School. This finding is logical.

Family

data_speak_fam = unique(all_data[c("ID","R_speak_family")])

fig <- plot_ly(data_speak_fam, y= ~R_speak_family, type = "histogram") %>%
  layout(title="Number of Participants vs % of speaking English with Family",
         yaxis=list(title="% of Speaking English with Family"),
         xaxis=list(title="Frequency of Participants"))

fig

Friends

data_speak_frnds = unique(all_data[c("ID","R_speak_friends")])

fig <- plot_ly(data_speak_frnds, y= ~R_speak_friends, type = "histogram") %>%
  layout(title="Number of Participants vs % of speaking English with Friends",
         yaxis=list(title="% of Speaking English with Friends"),
         xaxis=list(title="Frequency of Participants"))

fig

School

data_speak_school = unique(all_data[c("ID","R_speak_school")])

fig <- plot_ly(data_speak_school, y= ~R_speak_school, type = "histogram") %>%
  layout(title="Number of Participants vs % of speaking English in School",
         yaxis=list(title="% of Speaking English in School"),
         xaxis=list(title="Frequency of Participants"))

fig

Writing English

We assess the percentages of time the participants write with their family and friends and school in English. Later we do the same thing on both groups IG and NIG, pre and post-test.

Findings: the participants write in English more in School than with family and friends. This is a logical finding as well in an Arabian country scenario.

Family

data_write_fam = unique(all_data[c("ID","R_write_family")])

fig <- plot_ly(data_write_fam, y= ~R_write_family, type = "histogram") %>%
  layout(title="Number of Participants vs % of writing English with Family",
         yaxis=list(title="% of Writing English with Family"),
         xaxis=list(title="Frequency of Participants"))

fig

Friends

data_write_frnds = unique(all_data[c("ID","R_write_friends")])

fig <- plot_ly(data_write_frnds, y= ~R_write_friends, type = "histogram") %>%
  layout(title="Number of Participants vs % of writing English with Friends",
         yaxis=list(title="% of Writing English with Friends"),
         xaxis=list(title="Frequency of Participants"))

fig

School

data_write_school = unique(all_data[c("ID","R_write_school")])

fig <- plot_ly(data_write_school, y= ~R_write_school, type = "histogram") %>%
  layout(title="Number of Participants vs % of writing English at School",
         yaxis=list(title="% of Writing English at School"),
         xaxis=list(title="Frequency of Participants"))

fig

Reading English

We assess the percentages of time the participants read english for pleasure or at school. Later we do the same thing on both groups IG and NIG, pre and post-test.

Finding: More participants read more often at school than for pleasure. This finding is also logical for the participants’ age and medium.

Pleasure

data_read_pleaure = unique(all_data[c("ID","R_read_pleasure")])

fig <- plot_ly(data_read_pleaure, y= ~R_read_pleasure, type = "histogram") %>%
  layout(title="Number of Participants vs % of Reading English for Pleasure",
         yaxis=list(title="% of Reading English for Pleasure"),
         xaxis=list(title="Frequency of Participants"))

fig

School

data_read_school = unique(all_data[c("ID","R_read_school")])

fig <- plot_ly(data_read_school, y= ~R_read_school, type = "histogram") %>%
  layout(title="Number of Participants vs % of Reading English at School",
         yaxis=list(title="% of Reading English at School"),
         xaxis=list(title="Frequency of Participants"))

fig

Watching English

We assess the percentages of time the participants watch TV in English. Later we do the same thing on both groups IG and NIG, pre and post-test.

data_watch_eng = unique(all_data[c("ID","R_watch_tv")])

fig <- plot_ly(data_watch_eng, y= ~R_watch_tv, type = "histogram") %>%
  layout(title="Number of Participants vs % of Watching TV in English",
         yaxis=list(title="% of Watching TV in English"),
         xaxis=list(title="Frequency of Participants"))

fig

Listening English

We assess the percentages of time the participants that listen to English through audio books and classes. Later we do the same thing on both groups IG and NIG, pre and post-test.

Findings: We can see that the participants % and their frequency of listening to audio books are uniformly distributed. Most participants however listen to English <50% of the time through online classes.

Audio Books

data_listen_books = unique(all_data[c("ID","R_audio_books")])

fig <- plot_ly(data_listen_books, y= ~R_audio_books, type = "histogram") %>%
  layout(title="Number of Participants vs % of Listening to English Audio Books",
         yaxis=list(title="% of Listening to English Audio Books"),
         xaxis=list(title="Frequency of Participants"))

fig

Online Classes

data_listen_class = unique(all_data[c("ID","R_classes")])

fig <- plot_ly(data_listen_class, y= ~R_classes, type = "histogram") %>%
  layout(title="Number of Participants vs % of Listening to English in Online Classes",
         yaxis=list(title="% of Listening to English in Online Classes"),
         xaxis=list(title="Frequency of Participants"))

fig

Gaming English

We assess the percentages of time the participants that encounter English through PC Games, iPad or Phone Games. Later we do the same thing on both groups IG and NIG, pre and post-test.

PC

data_games = unique(all_data[c("ID","R_games_PC")])

fig <- plot_ly(data_games, y= ~R_games_PC, type = "histogram") %>%
  layout(title="Number of Participants vs % of PC Games played in English",
         yaxis=list(title="% of PC Games played in English"),
         xaxis=list(title="Frequency of Participants"))

fig

iPad

data_games = unique(all_data[c("ID","R_games_iPad")])

fig <- plot_ly(data_games, y= ~R_games_iPad, type = "histogram") %>%
  layout(title="Number of Participants vs % of iPAD Games played in English",
         yaxis=list(title="% of PC Games played in English"),
         xaxis=list(title="Frequency of Participants"))

fig

Phone

data_games = unique(all_data[c("ID","R_games_phone")])

fig <- plot_ly(data_games, y= ~R_games_phone, type = "histogram") %>%
  layout(title="Number of Participants vs % of Phone Games played in English",
         yaxis=list(title="% of PC Games played in English"),
         xaxis=list(title="Frequency of Participants"))

fig

Mixing Languages

We assess the frequency of participants that mix languages while speaking and those who dont. Later we do the same thing on both groups IG and NIG, pre and post-test.

We encode the variables as 0 - Never, 1 - Sometimes, 2 - Always.

The results are normally distributed, with most participants sometimes mixing languages (38) while 6 of them never do and 6 others always do.

## Encode
all_data$mix[all_data$mix == "Never"] = 0
all_data$mix[all_data$mix == "Sometimes"] = 1
all_data$mix[all_data$mix == "Always"] = 2

## Plot
data_mix = unique(all_data[c("ID","mix")])

fig <- plot_ly(data_mix, y= ~mix, type = "histogram") %>%
  layout(title="Number of Participants vs Mixing Languages",
         yaxis=list(title="Mixing Languages (0 never, 1 sometimes, 2 always)"),
         xaxis=list(title="Frequency of Participants"))

fig

C. Features Correlation

In this section, we assess the correlation between features. If features (u,v) have high correlation, we select only one feature for our future analysis. This is to reduce colinearity in the model we aim to build later on.

Further, we can reduce Multicolinearity (emerges when three or more variables, which are highly correlated, are included within a model) by calculating the Variance Inflation Factor of the model. This will be done after modeling to test whether the model has any issues with Multicolinearity or not.

Age & KG_Stage

We have already removed KG_stage and kept age since they were highly correlated in the previous section.

Other Variables

  1. We select all the variables that might potentially be correlated, and we create a correlation matrix and rank it. The variables we are interested in are age, years_english, spoken, parent_ed, formal_study, R_speak_family, R_speak_friends, R_speak_school, R_write_family, R_write_friends, R_write_school, R_read_school, R_read_pleasure, R_watch_tv, R_audio_books, R_classes, R_games_PC, R_games_iPad, R_games_phone, mix.
  1. We remove one of the highly correlated variables to reduce colinearity and increase interpretability of the model we aim to build later.
## Select
data_to_assess <- all_data %>% select(age, years_english, parent_ed,    formal_study,   R_speak_family, R_speak_friends,    R_speak_school, R_write_family, R_write_friends,    R_write_school, R_read_school,  R_read_pleasure,    R_watch_tv, R_audio_books,  R_classes,  R_games_PC, R_games_iPad,   R_games_phone,  mix)
 
## Check Types == All must be numeric (int)
str(data_to_assess)
## 'data.frame':    1000 obs. of  19 variables:
##  $ age            : int  4 4 4 4 4 4 4 4 4 4 ...
##  $ years_english  : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ parent_ed      : chr  "2" "2" "2" "2" ...
##  $ formal_study   : chr  "2" "2" "2" "2" ...
##  $ R_speak_family : int  40 40 40 40 40 40 40 40 40 40 ...
##  $ R_speak_friends: int  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_speak_school : int  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_write_family : int  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_write_friends: int  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_write_school : int  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_read_school  : int  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_read_pleasure: int  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_watch_tv     : int  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_audio_books  : int  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_classes      : int  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_games_PC     : int  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_games_iPad   : int  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_games_phone  : int  50 50 50 50 50 50 50 50 50 50 ...
##  $ mix            : chr  "1" "1" "1" "1" ...
## Change parent_ed, formal_study and mix to int
data_to_assess$parent_ed = as.numeric(data_to_assess$parent_ed)
data_to_assess$formal_study = as.numeric(data_to_assess$formal_study)
data_to_assess$mix = as.numeric(data_to_assess$mix)

As expected, the correlation between many variables is high > 0.70, and all highly correlated variables are positively correlated as expected.

For example, R_write_friends is highly correlated with R_write_family, which makes sense since participants who write in English will write in English whether for friends or family.

## Get Correlation Matrix and Plot, Reshape into linear format, and Order Result
corr <- cor(data_to_assess)
fig <- plot_ly(
    x = colnames(corr), y = colnames(corr),
    z = corr, type = "heatmap"
)

fig
## Reshape into linear format
format     = reshape2::melt(corr, na.rm = TRUE, value.name = "corr")
format     = format[format$Var1 != format$Var2,]
format     = format[order(abs(format$corr), decreasing=TRUE),]
final_corr = unique(format)

## Order the Results
final_corr = final_corr[seq(1,nrow(final_corr),by=2),]

## Print top 30 highest correlated variables
head(final_corr, n=30)
##                Var1            Var2      corr
## 142 R_write_friends  R_write_family 0.8794682
## 242   R_audio_books      R_watch_tv 0.8575611
## 182   R_read_school  R_write_school 0.8522485
## 102  R_speak_school R_speak_friends 0.8464040
## 183 R_read_pleasure  R_write_school 0.8403286
## 162  R_write_school R_write_friends 0.8025353
## 125   R_read_school  R_speak_school 0.7834723
## 202 R_read_pleasure   R_read_school 0.7697258
## 302    R_games_iPad      R_games_PC 0.7693502
## 143  R_write_school  R_write_family 0.7622563
## 164 R_read_pleasure R_write_friends 0.7522794
## 145 R_read_pleasure  R_write_family 0.7297233
## 226    R_games_iPad R_read_pleasure 0.7279022
## 124  R_write_school  R_speak_school 0.7243320
## 223   R_audio_books R_read_pleasure 0.7184819
## 188    R_games_iPad  R_write_school 0.7045557
## 106   R_read_school R_speak_friends 0.6998321
## 207    R_games_iPad   R_read_school 0.6815862
## 264    R_games_iPad   R_audio_books 0.6641871
## 222      R_watch_tv R_read_pleasure 0.6598502
## 245    R_games_iPad      R_watch_tv 0.6583373
## 104 R_write_friends R_speak_friends 0.6508115
## 225      R_games_PC R_read_pleasure 0.6461757
## 185   R_audio_books  R_write_school 0.6433082
## 187      R_games_PC  R_write_school 0.6396081
## 126 R_read_pleasure  R_speak_school 0.6387176
## 224       R_classes R_read_pleasure 0.6376098
## 168      R_games_PC R_write_friends 0.6316348
## 148       R_classes  R_write_family 0.6304613
## 186       R_classes  R_write_school 0.6161651

To reduce the dimensions of the dataset, we replace the columns R_write_* with R_write which is the average of all R_write_* and R_read_* with R_read and R_games_* with R_games with their average, and other R_.*… and reassess the correlation matrix.

## New Filtered Data variable to store our new data
filtered_data = all_data

## Get all columns whose names match "R_write"
write_data = all_data[str_detect(names(all_data), "R_write")]
R_write_mean = write_data %>% rowwise() %>% dplyr::mutate(R_write_mean = mean(c(R_write_family, R_write_friends, R_write_school))) %>% select(R_write_mean)
#> Remove columns and add R_write_mean only
filtered_data = filtered_data[!str_detect(names(filtered_data),"R_write")]
#> Add new R_write_mean column = mean of all R_write*
filtered_data = filtered_data %>% mutate(R_write_mean = R_write_mean$R_write_mean)


## Do the same thing for R_read_*
read_data = all_data[str_detect(names(all_data), "R_read")]
R_read_mean = read_data %>% rowwise() %>% dplyr::mutate(R_read_mean = mean(c(R_read_school, R_read_pleasure))) %>% select(R_read_mean)
#> Remove columns and add R_read_mean only
filtered_data = filtered_data[!str_detect(names(filtered_data),"R_read")]
#> Add new R_write_mean column = mean of all R_write*
filtered_data = filtered_data %>% mutate(R_read_mean = R_read_mean$R_read_mean)

## Do the same thing for R_games
games_data = all_data[str_detect(names(all_data), "R_games")]
R_games_mean = games_data %>% rowwise() %>% dplyr::mutate(R_games_mean = mean(c(R_games_PC, R_games_iPad, R_games_phone))) %>% select(R_games_mean)
#> Remove columns and add R_read_mean only
filtered_data = filtered_data[!str_detect(names(filtered_data),"R_games")]
#> Add new R_write_mean column = mean of all R_write*
filtered_data = filtered_data %>% mutate(R_games_mean = R_games_mean$R_games_mean)

## Do the same thing for R_speak
speak_data = all_data[str_detect(names(all_data), "R_speak")]
R_speak_mean = speak_data %>% rowwise() %>% dplyr::mutate(R_speak_mean = mean(c(R_speak_family, R_speak_friends, R_speak_school))) %>% select(R_speak_mean)
#> Remove columns and add R_read_mean only
filtered_data = filtered_data[!str_detect(names(filtered_data),"R_speak")]
#> Add new R_write_mean column = mean of all R_write*
filtered_data = filtered_data %>% mutate(R_speak_mean = R_speak_mean$R_speak_mean)

## Do the same thing for R_watch_tv, R_audio_books, R_classes
listen_data = all_data[c("R_watch_tv", "R_audio_books", "R_classes")]
R_listen_mean = listen_data %>% rowwise() %>% dplyr::mutate(R_listen_mean = mean(c(R_watch_tv, R_audio_books, R_classes))) %>% select(R_listen_mean)
#> Remove columns and add R_read_mean only
filtered_data = filtered_data[,-which(names(filtered_data) == "R_watch_tv")]
filtered_data = filtered_data[,-which(names(filtered_data) == "R_audio_books")]
filtered_data = filtered_data[,-which(names(filtered_data) == "R_classes")]

#> Add new R_write_mean column = mean of all R_write*
filtered_data = filtered_data %>% mutate(R_listen_mean = R_listen_mean$R_listen_mean)

Reassess Correlation

Finding: We can still see few high correlations between the calculated means, which we keep note of for future analysis. The data dimensions were reduced significantly, and the correlations are reduced. We proceed our analysis using filtered_data.

## Select
data_to_assess <- filtered_data %>% select(age, years_english, parent_ed,   formal_study,   R_write_mean, R_read_mean, R_games_mean,R_speak_mean, R_listen_mean,    mix)
 
## Check Types == All must be numeric (int)
str(data_to_assess)
## 'data.frame':    1000 obs. of  10 variables:
##  $ age          : int  4 4 4 4 4 4 4 4 4 4 ...
##  $ years_english: int  1 1 1 1 1 1 1 1 1 1 ...
##  $ parent_ed    : chr  "2" "2" "2" "2" ...
##  $ formal_study : chr  "2" "2" "2" "2" ...
##  $ R_write_mean : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_read_mean  : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_games_mean : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_speak_mean : num  46.7 46.7 46.7 46.7 46.7 ...
##  $ R_listen_mean: num  50 50 50 50 50 50 50 50 50 50 ...
##  $ mix          : chr  "1" "1" "1" "1" ...
## Change parent_ed, formal_study and mix to int
data_to_assess$parent_ed = as.numeric(data_to_assess$parent_ed)
data_to_assess$formal_study = as.numeric(data_to_assess$formal_study)
data_to_assess$mix = as.numeric(data_to_assess$mix)

## Get Correlation Matrix and Plot, Reshape into linear format, and Order Result
corr <- cor(data_to_assess)
fig <- plot_ly(
    x = colnames(corr), y = colnames(corr),
    z = corr, type = "heatmap"
)

fig
## Reshape into linear format
format     = reshape2::melt(corr, na.rm = TRUE, value.name = "corr")
format     = format[format$Var1 != format$Var2,]
format     = format[order(abs(format$corr), decreasing=TRUE),]
final_corr = unique(format)

## Order the Results
final_corr = final_corr[seq(1,nrow(final_corr),by=2),]

## Print top 30 highest correlated variables
head(final_corr, n=30)
##             Var1          Var2       corr
## 46   R_read_mean  R_write_mean  0.8226387
## 47  R_games_mean  R_write_mean  0.7673658
## 59 R_listen_mean   R_read_mean  0.7664570
## 58  R_speak_mean   R_read_mean  0.7370083
## 57  R_games_mean   R_read_mean  0.7220222
## 69 R_listen_mean  R_games_mean  0.6661825
## 49 R_listen_mean  R_write_mean  0.6630379
## 48  R_speak_mean  R_write_mean  0.6452089
## 79 R_listen_mean  R_speak_mean  0.5887342
## 68  R_speak_mean  R_games_mean  0.5308447
## 80           mix  R_speak_mean  0.4894100
## 19 R_listen_mean years_english  0.3791474
## 29 R_listen_mean     parent_ed  0.2981671
## 60           mix   R_read_mean  0.2949220
## 28  R_speak_mean     parent_ed  0.2829055
## 16   R_read_mean years_english  0.2560244
## 10           mix           age -0.2500000
## 26   R_read_mean     parent_ed  0.2482795
## 90           mix R_listen_mean  0.2440038
## 20           mix years_english  0.2401922
## 15  R_write_mean years_english  0.2184415
## 30           mix     parent_ed  0.2097031
## 18  R_speak_mean years_english  0.2010753
## 13     parent_ed years_english -0.1914024
## 36   R_read_mean  formal_study  0.1755496
## 24  formal_study     parent_ed  0.1618014
## 50           mix  R_write_mean  0.1461835
## 6    R_read_mean           age  0.1367253
## 37  R_games_mean  formal_study  0.1338545
## 17  R_games_mean years_english  0.1309563

D. Theoratical Significance of Features

In this section, note that we need to filter out our data based on the theoratical significance of features as discussed in the literature. The further filtered data will be used for modeling, but we keep our filtered_data data untouched to analyze data per groups (IG vs NIG) pre and post-test.

The features considered theoratically significant are: age, country (lived in english speaking country), formal_study, parent_ed, spoken (English Proficiency), as main features, and we include R_*_mean for now, which potentially can be removed later in the modeling stage. And we include scorepre and scorepost as they are our response variables, in addition to ID and group to allow us to subset by ID and group further in the analysis.


Answering Q1: IG vs NIG

Q1- What is the influence of interactive and non-interactive DST on the ability of Saudi EFL kindergarten students to perceive the distinction between English [p] and [b]?

To answer this question, we follow the strategy:
1. Split and Analyze by Group (IG vs NIG)
2. Visualize the attributes of the two groups
3. Split into step 1 for pre-test analysis, and step 2 for post-test analysis

Step 1 - pretest analysis:
a. Model (linear mixed-effect model with a logit function) the sounds score (soundpre).
b. Model (linear mixed-effect model with a logit function) the Items score (scorepre).

Visualize Attributes

Age by Group

Age Distribution between groups is equivalent. t-test p-value = 1.

## Get Data
data_age = unique(filtered_data[c("ID","age", "group")])

fig <- plot_ly(filtered_data, y = ~age, color = ~group, type = "box") %>%
  layout(title="Age Distribution by Group")
fig
## T-test
independentSamplesTTest(formula = age~group, data=data_age)
## 
##    Welch's independent samples t-test 
## 
## Outcome variable:   age 
## Grouping variable:  group 
## 
## Descriptive statistics: 
##                IG   NIG
##    mean     4.600 4.600
##    std dev. 0.500 0.500
## 
## Hypotheses: 
##    null:        population means equal for both groups
##    alternative: different population means in each group
## 
## Test results: 
##    t-statistic:  0 
##    degrees of freedom:  48 
##    p-value:  1 
## 
## Other information: 
##    two-sided 95% confidence interval:  [-0.284, 0.284] 
##    estimated effect size (Cohen's d):  0

Gender by Group

The number of males and females are distributed equally across the groups IG vs NIG. Fisher’s Exact Test shows p-value = 1.

## Get Group Color by group value for each participant
group_color = unique(filtered_data[c("ID", "group")])

## Get Data
data_gender = unique(filtered_data[c("ID", "gender", "group")])

## Plot
fig <- plot_ly(data_gender, y= ~gender, color = group_color$group, type = "histogram") %>%
  layout(title="Gender Distribution by group")
fig
## Fisher's Test
fisher.test(table(data_gender[c("gender","group")]))
## 
##  Fisher's Exact Test for Count Data
## 
## data:  table(data_gender[c("gender", "group")])
## p-value = 1
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
##  0.2866588 3.4884677
## sample estimates:
## odds ratio 
##          1

Country by Group

Here, we assess the distribution of participants that lived in an english country or not by their groups. There is no significant difference at all, p-value = 1.

## Get Data
data_country = unique(all_data[c("ID", "country", "group")])

fig <- plot_ly(data_country, y = ~as.factor(country), color= group_color$group, type = "histogram") %>%
  layout(title="Number of Participants that Lived in English Countries or Not",
         yaxis=list(title="Lived in English Country (yes) or not (no) by Group IG vs NIG"))
fig
## Fisher's Test
fisher.test(table(data_country[c("country","group")]))
## 
##  Fisher's Exact Test for Count Data
## 
## data:  table(data_country[c("country", "group")])
## p-value = 1
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
##  0.3169554 4.5472629
## sample estimates:
## odds ratio 
##   1.191047

First Encounter w/English by Group

We see that the two groups have very similar distribution of Years of first encounter with English. Fisher’s Exact Test shows p-value of 0.86.

## Get Data
data_yearsE = unique(filtered_data[c("ID", "years_english", "group")])

## Plot
fig <- plot_ly(data_yearsE, y = ~years_english, color = group_color$group, type = "histogram") %>%
  layout(title="Number of Participants vs Years of English Knowledge by Group",
         yaxis=list(title="Years of English Knowledge by Group"),
         xaxis=list(title="Frequency of Participants"))
fig
## Fisher's Test
fisher.test(table(data_yearsE[c("years_english","group")]))
## 
##  Fisher's Exact Test for Count Data
## 
## data:  table(data_yearsE[c("years_english", "group")])
## p-value = 0.8686
## alternative hypothesis: two.sided

Years Learning English by Group

We see that the two groups have very similar distribution of Years Learning English. Fisher’s p-value is 0.9.

## Get Data
data_formalE = unique(all_data[c("ID", "formal_study", "group")])

## Plot
fig <- plot_ly(data_formalE, y = ~formal_study, color = group_color$group, type = "histogram") %>%
  layout(title="Number of Participants vs Years of Formal English Education by Group",
         yaxis=list(title="Years of Formal English Education by Group"),
         xaxis=list(title="Frequency of Participants"))
fig
## Fisher's Test
fisher.test(table(data_formalE[c("formal_study","group")]))
## 
##  Fisher's Exact Test for Count Data
## 
## data:  table(data_formalE[c("formal_study", "group")])
## p-value = 0.902
## alternative hypothesis: two.sided

English Proficiency by Group

IG group has slightly lower Spoken English Proficiency than NIG. It shows that NIG will increase the proficiency with ~1.2 which is one and quarter step into a higher proficiency on a scale of (0-2), however, this is not significant as Fisher’s Test shows p-value 0.13 which is greater than 0.05.

## Get Data
data_spoken = unique(all_data[c("ID", "spoken","group")])

## Plot
fig <- plot_ly(data_spoken, y = ~as.factor(spoken), color = group_color$group, type = "histogram") %>%
  layout(title="Number of participants vs proficiency of spoken english by Group",
         yaxis=list(title="English Proficiency (0 few words, 1 short phrases, 2 fluent) by Group"),
         xaxis=list(title="Frequency of Participants"))
fig
## Check Significance of difference
#> Get Data
mod_data  = unique(all_data[c("ID","spoken","group")])
sig_model = glm(formula = as.factor(spoken)~as.factor(group), data = mod_data, family="binomial")
summ_sig  = summary(sig_model) 
p_value   = summ_sig$coefficients[2,4]
message("P-value of significance (",p_value," > 0.05) of Group when determining English Proficiency is not significant.")
## P-value of significance (0.139767986968651 > 0.05) of Group when determining English Proficiency is not significant.
## Fisher's Test
fisher.test(table(data_spoken[c("spoken","group")]))
## 
##  Fisher's Exact Test for Count Data
## 
## data:  table(data_spoken[c("spoken", "group")])
## p-value = 0.1318
## alternative hypothesis: two.sided

Parents’ Education by Group

Fisher’s Test shows that there is a significant difference in Parents’ Education with p-value of 0.0002. There is more Postgraduate Parents in group NIG than group IG. But the majority in both groups belong to levels 2 (Undergraduate), and 3 (Postgraduate). Since both groups have most participants’ parents belonging to two groups level 2 and level 3, we might be able to assume that this difference is not drastic on the outcome. Yes there is a difference, but this difference is between two close levels.

When we convert the column parent_ed into a numerical column, considering that there is continuous levels between level 2 and level 3; i.e. the mean of parent_ed could be calculated => We do a T-test and see that the means are exactly the same across groups, with p-value of 1. This supports the above analysis; there is no significant difference in parent_ed across groups.

## Get Data
data_parent_ed = unique(all_data[c("ID","parent_ed","group")])

## Plot
fig <- plot_ly(data_parent_ed, y = ~as.factor(parent_ed), color = group_color$group, type = "histogram") %>%
  layout(title="Number of Participants vs Parents' level of Education by Group",
         yaxis=list(title="Level of Education (0 primary, 3 postgraduate) by Group"),
         xaxis=list(title="Frequency of Participants"))
fig
## Fisher's Test
fisher.test(table(data_parent_ed[c("parent_ed","group")]))
## 
##  Fisher's Exact Test for Count Data
## 
## data:  table(data_parent_ed[c("parent_ed", "group")])
## p-value = 0.0002868
## alternative hypothesis: two.sided
## T-test 
t_data_ed <- data_parent_ed
t_data_ed$parent_ed <- as.numeric(t_data_ed$parent_ed)

independentSamplesTTest(formula = parent_ed~group, data=t_data_ed)
## 
##    Welch's independent samples t-test 
## 
## Outcome variable:   parent_ed 
## Grouping variable:  group 
## 
## Descriptive statistics: 
##                IG   NIG
##    mean     2.440 2.440
##    std dev. 0.507 1.003
## 
## Hypotheses: 
##    null:        population means equal for both groups
##    alternative: different population means in each group
## 
## Test results: 
##    t-statistic:  0 
##    degrees of freedom:  35.491 
##    p-value:  1 
## 
## Other information: 
##    two-sided 95% confidence interval:  [-0.456, 0.456] 
##    estimated effect size (Cohen's d):  0
## To maintain no difference between groups in our data,
#> we conver the column into numerical column
filtered_data$parent_ed <- as.numeric(filtered_data$parent_ed)

Speaking English by Group

The difference of average % of speaking English across groups is insignificant with T-test p-value of 0.354, and close averages of 48.94 (IG) and 57.2% (NIG).

data_speak = unique(filtered_data[c("ID","R_speak_mean", "group")])

fig <- plot_ly(data_speak, y= ~R_speak_mean, color = group_color$group, type = "histogram") %>%
  layout(title="Number of Participants vs Average % of speaking English by Group",
         yaxis=list(title="Average % of Speaking English by Group"),
         xaxis=list(title="Frequency of Participants"))

fig
## T-Test Test
independentSamplesTTest(formula = R_speak_mean~group, data=data_speak)
## 
##    Welch's independent samples t-test 
## 
## Outcome variable:   R_speak_mean 
## Grouping variable:  group 
## 
## Descriptive statistics: 
##                 IG    NIG
##    mean     49.973 57.200
##    std dev. 27.968 26.573
## 
## Hypotheses: 
##    null:        population means equal for both groups
##    alternative: different population means in each group
## 
## Test results: 
##    t-statistic:  -0.937 
##    degrees of freedom:  47.875 
##    p-value:  0.354 
## 
## Other information: 
##    two-sided 95% confidence interval:  [-22.741, 8.288] 
##    estimated effect size (Cohen's d):  0.265

Writing English by Group

There is slight difference in the 20%-40% bins distribution of participants by group NIG vs IG. However since the % is close, we do can consider this difference insignificant. T-test shows highly insignificant difference of p-value = 0.935.

## Get Data
data_write = unique(filtered_data[c("ID","R_write_mean", "group")])

## Plot
fig <- plot_ly(data_write, y= ~R_write_mean,color = group_color$group, type = "histogram") %>%
  layout(title="Number of Participants vs Average % of writing English by Group",
         yaxis=list(title="Average % of writing English by Group"),
         xaxis=list(title="Frequency of Participants"))
fig
## T-Test Test
independentSamplesTTest(formula = R_write_mean~group, data=data_write)
## 
##    Welch's independent samples t-test 
## 
## Outcome variable:   R_write_mean 
## Grouping variable:  group 
## 
## Descriptive statistics: 
##                 IG    NIG
##    mean     43.013 43.827
##    std dev. 35.608 34.108
## 
## Hypotheses: 
##    null:        population means equal for both groups
##    alternative: different population means in each group
## 
## Test results: 
##    t-statistic:  -0.082 
##    degrees of freedom:  47.911 
##    p-value:  0.935 
## 
## Other information: 
##    two-sided 95% confidence interval:  [-20.642, 19.016] 
##    estimated effect size (Cohen's d):  0.023

Reading English by Group

There is a slight difference in the distribution of Average % of reading in English between NIG and IG groups. This difference doesn’t seem to be significant; NIG has slightly higher average % (60) than IG (50). T-Test shows p-value of 0.311 > 0.05 => Not significant.

## Get Data
data_read = unique(filtered_data[c("ID","R_read_mean", "group")])

## Plot
fig <- plot_ly(data_read, y= ~R_read_mean, color = group_color$group, type = "histogram") %>%
  layout(title="Number of Participants vs Average % of Reading English by Group",
         yaxis=list(title="Average % of Reading English by Group"),
         xaxis=list(title="Frequency of Participants"))
fig
## T-Test Test
independentSamplesTTest(formula = R_read_mean~group, data=data_read)
## 
##    Welch's independent samples t-test 
## 
## Outcome variable:   R_read_mean 
## Grouping variable:  group 
## 
## Descriptive statistics: 
##                 IG    NIG
##    mean     50.600 60.280
##    std dev. 30.917 35.716
## 
## Hypotheses: 
##    null:        population means equal for both groups
##    alternative: different population means in each group
## 
## Test results: 
##    t-statistic:  -1.025 
##    degrees of freedom:  47.034 
##    p-value:  0.311 
## 
## Other information: 
##    two-sided 95% confidence interval:  [-28.686, 9.326] 
##    estimated effect size (Cohen's d):  0.29

Listening to English by Group

There is a slight difference in the distribution of Average % of listening to English between NIG and IG groups. This difference doesn’t seem to be significant; the major difference lies in between the frequency of participants whose Average % of listning to english is between %50 and %70 where NIG participants have higher % of listening to English (more participants have 60-70%) compared to the IG (more in the 40-50%). This is consistent with the difference seen in the section analyzing Average % Reading English across groups. This is expected since Reading English and Listening to English were seen to be correlated. Also, similarly as above, T-test shows very insignificant difference with p-value of 0.831.

## Get Data
data_listen_eng = unique(filtered_data[c("ID","R_listen_mean", "group")])

## Plot
fig <- plot_ly(data_listen_eng, y= ~R_listen_mean, color = group_color$group, type = "histogram") %>%
  layout(title="Number of Participants vs Average % of Listening to English by Group",
         yaxis=list(title="Average % of Listening to English by Group"),
         xaxis=list(title="Frequency of Participants"))

fig
## T-Test Test
independentSamplesTTest(formula = R_listen_mean~group, data=data_listen_eng)
## 
##    Welch's independent samples t-test 
## 
## Outcome variable:   R_listen_mean 
## Grouping variable:  group 
## 
## Descriptive statistics: 
##                 IG    NIG
##    mean     52.587 54.200
##    std dev. 28.557 24.544
## 
## Hypotheses: 
##    null:        population means equal for both groups
##    alternative: different population means in each group
## 
## Test results: 
##    t-statistic:  -0.214 
##    degrees of freedom:  46.94 
##    p-value:  0.831 
## 
## Other information: 
##    two-sided 95% confidence interval:  [-16.764, 13.538] 
##    estimated effect size (Cohen's d):  0.061

English Games by Group

There doesn’t seem to be any significant different in % of Games being in English across the two groups IG and NIG. T-test shows p-value is 0.165 > 0.05.

## Get Data
data_games = unique(filtered_data[c("ID","R_games_mean", "group")])

## Plot
fig <- plot_ly(data_games, y= ~R_games_mean, color = group_color$group, type = "histogram") %>%
  layout(title="Number of Participants vs Average % of Games played in English by Group",
         yaxis=list(title="Average % of Games played in English by Group"),
         xaxis=list(title="Frequency of Participants"))

fig
## T-Test Test
independentSamplesTTest(formula = R_games_mean~group, data=data_games)
## 
##    Welch's independent samples t-test 
## 
## Outcome variable:   R_games_mean 
## Grouping variable:  group 
## 
## Descriptive statistics: 
##                 IG    NIG
##    mean     54.280 42.800
##    std dev. 28.351 29.175
## 
## Hypotheses: 
##    null:        population means equal for both groups
##    alternative: different population means in each group
## 
## Test results: 
##    t-statistic:  1.411 
##    degrees of freedom:  47.961 
##    p-value:  0.165 
## 
## Other information: 
##    two-sided 95% confidence interval:  [-4.879, 27.839] 
##    estimated effect size (Cohen's d):  0.399

Mixing Languages by Group

There is no signficant difference of frequency of participants mixing languages in speech between IG and NIG. Fisher’s p-value shows 0.231 > 0.05.

## Get Data
data_mix = unique(filtered_data[c("ID","mix","group")])

## Plot
fig <- plot_ly(data_mix, y= ~mix, color = group_color$group, type = "histogram") %>%
  layout(title="Number of Participants vs Mixing Languages by Group (IG vs NIG)",
         yaxis=list(title="Mixing Languages (0 never, 1 sometimes, 2 always) by Group (IG vs NIG)"),
         xaxis=list(title="Frequency of Participants"))

fig
## Fisher's Test
fisher.test(table(data_mix[c("mix","group")]))
## 
##  Fisher's Exact Test for Count Data
## 
## data:  table(data_mix[c("mix", "group")])
## p-value = 0.2318
## alternative hypothesis: two.sided

Pre-Test Analysis

Strategy: 1. The influence of interactive DST on the perception of English /p/ and /b/ among Saudi EFL kindergarten children
1.a. Participants’ Performance Scores
1.b. Performance on Items Scores
1.c. Performance on Sounds Scores

2. The influence of non-interactive DST on the perception of English /p/ and /b/ among Saudi EFL kindergarten children
Do the same analysis as above (a,b,c).

3. Compare the two groups’ performance

Comarping Participants’ Scores

Comparing Total Scores across Groups

This method of comparison comprises of taking the total number of correct - total number of incorrect answers per participant and comparing the latter across groups (IG vs NIG) using a T-test.

The t-test shows p-value of 0.736, which is insignificant. This is favorable as the performance of the two groups should pre-test should similar.

## Get scores per ID (0/1)
scores <- table(filtered_data[c("ID","scorepre")])
groups <- unique(filtered_data[c("ID","group")])

## Total Score = Sum(correct) - Sum(incorrect)
total_score <- scores[,2] - scores[,1]

## Add to dataframe
df_scores <- data.frame(group = as.factor(groups$group), total_score = total_score)

## T-test for difference in Total Participants' Scores
independentSamplesTTest(formula = total_score~group, data=df_scores)
## 
##    Welch's independent samples t-test 
## 
## Outcome variable:   total_score 
## Grouping variable:  group 
## 
## Descriptive statistics: 
##                IG   NIG
##    mean     0.480 1.040
##    std dev. 5.636 6.031
## 
## Hypotheses: 
##    null:        population means equal for both groups
##    alternative: different population means in each group
## 
## Test results: 
##    t-statistic:  -0.339 
##    degrees of freedom:  47.781 
##    p-value:  0.736 
## 
## Other information: 
##    two-sided 95% confidence interval:  [-3.88, 2.76] 
##    estimated effect size (Cohen's d):  0.096

Modeling

Prepare Data for Modeling

As we’ve noted: The features considered theoratically significant are: age, country (lived in english speaking country), formal_study, parent_ed, spoken (English Proficiency), as main features, and we include R_*_mean for now.

## Select Columns
data_model = filtered_data %>%
  dplyr::select(ID, age, country, formal_study, parent_ed, spoken,R_write_mean, R_read_mean, R_speak_mean, R_listen_mean, R_games_mean, mix, group, scorepre)

## Check Data Types
str(data_model)
## 'data.frame':    1000 obs. of  14 variables:
##  $ ID           : chr  "IG.1" "IG.1" "IG.1" "IG.1" ...
##  $ age          : int  4 4 4 4 4 4 4 4 4 4 ...
##  $ country      : chr  "No" "No" "No" "No" ...
##  $ formal_study : chr  "2" "2" "2" "2" ...
##  $ parent_ed    : num  2 2 2 2 2 2 2 2 2 2 ...
##  $ spoken       : chr  "1" "1" "1" "1" ...
##  $ R_write_mean : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_read_mean  : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_speak_mean : num  46.7 46.7 46.7 46.7 46.7 ...
##  $ R_listen_mean: num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_games_mean : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ mix          : chr  "1" "1" "1" "1" ...
##  $ group        : chr  "IG" "IG" "IG" "IG" ...
##  $ scorepre     : int  0 1 0 0 0 0 0 0 0 0 ...
## Set to Factors 
data_model$country      <- as.factor(data_model$country)
data_model$formal_study <- as.numeric(data_model$formal_study)
data_model$parent_ed    <- as.numeric(data_model$parent_ed)
data_model$spoken       <- as.numeric(data_model$spoken)
data_model$mix          <- as.factor(data_model$mix)
data_model$scorepre     <- as.factor(data_model$scorepre)

## Validate Data Types have been changed
str(data_model)
## 'data.frame':    1000 obs. of  14 variables:
##  $ ID           : chr  "IG.1" "IG.1" "IG.1" "IG.1" ...
##  $ age          : int  4 4 4 4 4 4 4 4 4 4 ...
##  $ country      : Factor w/ 2 levels "No","Yes": 1 1 1 1 1 1 1 1 1 1 ...
##  $ formal_study : num  2 2 2 2 2 2 2 2 2 2 ...
##  $ parent_ed    : num  2 2 2 2 2 2 2 2 2 2 ...
##  $ spoken       : num  1 1 1 1 1 1 1 1 1 1 ...
##  $ R_write_mean : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_read_mean  : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_speak_mean : num  46.7 46.7 46.7 46.7 46.7 ...
##  $ R_listen_mean: num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_games_mean : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ mix          : Factor w/ 3 levels "0","1","2": 2 2 2 2 2 2 2 2 2 2 ...
##  $ group        : chr  "IG" "IG" "IG" "IG" ...
##  $ scorepre     : Factor w/ 2 levels "0","1": 1 2 1 1 1 1 1 1 1 1 ...
## Re-scaling Data Function (0 to 1)
rescale <- function(x){(x-min(x))/(max(x)-min(x))}
#> Re-scale R_means
data_model$R_write_mean <- rescale(data_model$R_write_mean)
data_model$R_read_mean <- rescale(data_model$R_read_mean)
data_model$R_speak_mean <- rescale(data_model$R_speak_mean)
data_model$R_listen_mean <- rescale(data_model$R_listen_mean)
data_model$R_games_mean <- rescale(data_model$R_games_mean)

## Split into two groups IG and NIG z
data_ig <- data_model %>% subset (group == "IG")  %>% select(-group)
data_nig <- data_model %>% subset (group == "NIG") %>% select(-group)

Model

IG - Intercept Only Model

First, we model the data using the ID only.

model_intercept_ig <- glmer(formula = scorepre ~ 1 + (1|ID),
                            family = binomial(link="logit"),
                            data = data_ig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))

We take note of the AIC = 694.2, BIC = 702.6; the intercept is very low 0.0493 and close to zero, indicating that it is nearly equally probably to get a score of 0 and a score of 1 in the pre-test. This means that the participants in this group are nearly guessing.

summary(model_intercept_ig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepre ~ 1 + (1 | ID)
##    Data: data_ig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    694.2    702.6   -345.1    690.2      498 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.2173 -0.9800  0.8215  0.9243  1.1791 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0.1125   0.3354  
## Number of obs: 500, groups:  ID, 25
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)   0.0493     0.1128   0.437    0.662

Third, We check the Inter-coefficient correlation = 0.033. This means that only 3.3% of the variation in the scores is explained by the difference in participants. Therefore, the variance in the scores is not due to the clusters (participants). This reaffirms that no participants perform remarkably better/worse than others.

Also, this indicates that a simpler analysis can be used here, instead of linear mixed models, we can remove the random effect and use a linear model instead.

performance::icc(model_intercept_ig)
## # Intraclass Correlation Coefficient
## 
##      Adjusted ICC: 0.033
##   Conditional ICC: 0.033
NIG - Intercept Only

We observe the same result for group NIG.

model_intercept_nig <- glmer(formula = scorepre ~ 1 + (1|ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))

performance::icc(model_intercept_nig)
## # Intraclass Correlation Coefficient
## 
##      Adjusted ICC: 0.047
##   Conditional ICC: 0.047

We take note of the AIC = 690.8 and BIC 699.2.

summary(model_intercept_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepre ~ 1 + (1 | ID)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    690.8    699.2   -343.4    686.8      498 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.2934 -0.9848  0.7731  0.8869  1.1628 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0.1638   0.4047  
## Number of obs: 500, groups:  ID, 25
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)   0.1081     0.1221   0.886    0.376
IG - Full Model

We note that parent_ed is slightly significant (p-value of 0.0437, Wald Test for Linear Mixed Models), with coeff = 0.59. Therefore, every one level increase in the parents’ education, increases the odds of scoring (1) by exp(0.59) = 1.803 times.

model_full_ig      <- glmer(formula = scorepre ~ age + country + formal_study +
                            parent_ed + spoken + R_write_mean + R_read_mean + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_ig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_full_ig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepre ~ age + country + formal_study + parent_ed + spoken +  
##     R_write_mean + R_read_mean + (1 | ID)
##    Data: data_ig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    696.3    734.3   -339.2    678.3      491 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.3265 -0.9420  0.7538  0.9076  1.3971 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0        0       
## Number of obs: 500, groups:  ID, 25
## 
## Fixed effects:
##              Estimate Std. Error z value Pr(>|z|)  
## (Intercept)  -1.57544    1.13682  -1.386   0.1658  
## age           0.08218    0.20114   0.409   0.6829  
## countryYes    0.02275    0.24054   0.095   0.9247  
## formal_study -0.05999    0.10075  -0.595   0.5515  
## parent_ed     0.48351    0.23974   2.017   0.0437 *
## spoken       -0.30816    0.18447  -1.671   0.0948 .
## R_write_mean  0.13631    0.52875   0.258   0.7966  
## R_read_mean   0.79640    0.64398   1.237   0.2162  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) age    cntryY frml_s prnt_d spoken R_wrt_
## age         -0.896                                          
## countryYes   0.146 -0.046                                   
## formal_stdy  0.040 -0.150  0.172                            
## parent_ed   -0.547  0.176 -0.327 -0.081                     
## spoken      -0.162  0.256 -0.183 -0.067 -0.197              
## R_write_men -0.081 -0.011 -0.303  0.058  0.279 -0.109       
## R_read_mean  0.160 -0.097  0.221 -0.031 -0.354 -0.246 -0.773
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')

However, we see an issue of singularity (some variance is very close to zero) and the full model contains high VIF for spoken, R_write_mean and R_read_mean. This was expected as these variables were found to be correlated. We proceed by subsetting the model.

car::vif(model_full_ig)
##          age      country formal_study    parent_ed       spoken R_write_mean 
##     1.178534     1.515645     1.067370     1.714932     2.132834     4.085525 
##  R_read_mean 
##     4.550349
IG - Subset Model

We see that the parent_ed is still significant and is proven by P-value 0.01992 < 0.05 of Chi-sq test with Anova between a model containing Parent_ed and a model without Parent_ed.

model_sub_ig           <- glmer(formula = scorepre ~ age + country + formal_study +
                                  parent_ed + spoken + (1 | ID),
                                  family = binomial(link="logit"),
                                  data = data_ig,
                                  control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
summary(model_sub_ig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepre ~ age + country + formal_study + parent_ed + spoken +  
##     (1 | ID)
##    Data: data_ig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    697.2    726.7   -341.6    683.2      493 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.3017 -0.9643  0.7682  0.9882  1.2629 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0.03752  0.1937  
## Number of obs: 500, groups:  ID, 25
## 
## Fixed effects:
##              Estimate Std. Error z value Pr(>|z|)  
## (Intercept)  -1.97097    1.21953  -1.616    0.106  
## age           0.16051    0.21609   0.743    0.458  
## countryYes    0.05430    0.24915   0.218    0.827  
## formal_study -0.06806    0.10919  -0.623    0.533  
## parent_ed     0.59386    0.24420   2.432    0.015 *
## spoken       -0.08789    0.16978  -0.518    0.605  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) age    cntryY frml_s prnt_d
## age         -0.898                            
## countryYes   0.128 -0.051                     
## formal_stdy  0.045 -0.152  0.202              
## parent_ed   -0.532  0.154 -0.277 -0.098       
## spoken      -0.112  0.204 -0.283 -0.056 -0.355
## Prove Significance
model_sub_no_ed_ig      <- glmer(formula = scorepre ~ age + country + formal_study + spoken + (1 | ID),
                                  family = binomial(link="logit"),
                                  data = data_ig,
                                  control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))

## Proves the 
anova(model_sub_no_ed_ig, model_sub_ig)
## Data: data_ig
## Models:
## model_sub_no_ed_ig: scorepre ~ age + country + formal_study + spoken + (1 | ID)
## model_sub_ig: scorepre ~ age + country + formal_study + parent_ed + spoken + (1 | ID)
##                    npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)  
## model_sub_no_ed_ig    6 700.65 725.94 -344.33   688.65                       
## model_sub_ig          7 697.24 726.74 -341.62   683.24 5.4188  1    0.01992 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Multicollinearity seems to be good in the subset model.

car::vif(model_sub_ig)
##          age      country formal_study    parent_ed       spoken 
##     1.151055     1.383693     1.065951     1.503609     1.538038

Compare Full model to Sub Model: When comparing the full model and the sub model we see that there isn’t any significant difference; the AIC and BIC are nearly the same with the sub_model with lower BIC. The sub model however has no multicollinearity.

anova(model_full_ig, model_sub_ig)
## Data: data_ig
## Models:
## model_sub_ig: scorepre ~ age + country + formal_study + parent_ed + spoken + (1 | ID)
## model_full_ig: scorepre ~ age + country + formal_study + parent_ed + spoken + R_write_mean + R_read_mean + (1 | ID)
##               npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)  
## model_sub_ig     7 697.24 726.74 -341.62   683.24                       
## model_full_ig    9 696.32 734.25 -339.16   678.32 4.9162  2     0.0856 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
NIG - Full Model

We note that parent_ed is more significant here and spoken is slightly significant in the model. This means that the education of the parents and spoken english level by participants are the most significant in determining the scores, and they increase the likelyhood of scoring (1) as their coefficients are positive.

model_full_nig      <- glmer(formula = scorepre ~ age + country + formal_study +
                            parent_ed + spoken + R_write_mean + R_read_mean + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_full_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepre ~ age + country + formal_study + parent_ed + spoken +  
##     R_write_mean + R_read_mean + (1 | ID)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    684.6    722.5   -333.3    666.6      491 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.7558 -0.9618  0.6093  0.9185  1.4055 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0        0       
## Number of obs: 500, groups:  ID, 25
## 
## Fixed effects:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  -1.39720    1.28303  -1.089 0.276161    
## age           0.09532    0.25382   0.376 0.707261    
## countryYes   -0.59848    0.22976  -2.605 0.009194 ** 
## formal_study -0.14268    0.10051  -1.420 0.155738    
## parent_ed     0.37416    0.10577   3.537 0.000404 ***
## spoken        0.45308    0.23085   1.963 0.049685 *  
## R_write_mean  1.32431    0.51728   2.560 0.010464 *  
## R_read_mean  -1.06849    0.60277  -1.773 0.076287 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) age    cntryY frml_s prnt_d spoken R_wrt_
## age         -0.967                                          
## countryYes   0.358 -0.323                                   
## formal_stdy -0.337  0.339 -0.116                            
## parent_ed   -0.350  0.186 -0.201 -0.149                     
## spoken      -0.574  0.474 -0.365  0.018  0.236              
## R_write_men -0.087  0.065  0.040 -0.012  0.295 -0.065       
## R_read_mean  0.485 -0.479  0.042 -0.199 -0.359 -0.462 -0.675
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')

However, we see an issue of singularity (some variance is very close to zero) and the full model contains high VIF for spoken, R_write_mean and R_read_mean. This was expected as these variables were found to be correlated. We proceed by subsetting the model.

car::vif(model_full_ig)
##          age      country formal_study    parent_ed       spoken R_write_mean 
##     1.178534     1.515645     1.067370     1.714932     2.132834     4.085525 
##  R_read_mean 
##     4.550349
NIG - Subset Model

We see that the parent_ed is still significant, and AIC, BIC are very close to the full model. Country is also showing to be significant, with a negative coefficient. Theoratically, it is difficult to tell whether living in an English speaking country would impact a child that is still 3-4 years of age without understanding the social behaviour/participation inside or outside the house.

model_sub_nig      <- glmer(formula = scorepre ~ age + country + formal_study +
                            parent_ed + spoken + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
summary(model_sub_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepre ~ age + country + formal_study + parent_ed + spoken +  
##     (1 | ID)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    687.2    716.7   -336.6    673.2      493 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.5119 -0.9567  0.6614  0.9324  1.3682 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0.01078  0.1038  
## Number of obs: 500, groups:  ID, 25
## 
## Fixed effects:
##              Estimate Std. Error z value Pr(>|z|)   
## (Intercept)  -1.06926    1.07012  -0.999  0.31770   
## age           0.04252    0.20973   0.203  0.83935   
## countryYes   -0.63083    0.23265  -2.712  0.00670 **
## formal_study -0.14385    0.09852  -1.460  0.14424   
## parent_ed     0.29547    0.10077   2.932  0.00337 **
## spoken        0.49120    0.17116   2.870  0.00411 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) age    cntryY frml_s prnt_d
## age         -0.950                            
## countryYes   0.389 -0.347                     
## formal_stdy -0.229  0.230 -0.088              
## parent_ed   -0.268  0.057 -0.209 -0.230       
## spoken      -0.314  0.131 -0.414 -0.257  0.155

Multicollinearity seems to be fixed.

car::vif(model_sub_nig)
##          age      country formal_study    parent_ed       spoken 
##     1.197155     1.432279     1.260034     1.120106     1.353055

Compare Full model to Sub Model: There is a slightly significant difference between the subset model and the full model. R_write and R_read definitely might an impact on the score even if they cause multicollinearity.

anova(model_full_nig, model_sub_nig)
## Data: data_nig
## Models:
## model_sub_nig: scorepre ~ age + country + formal_study + parent_ed + spoken + (1 | ID)
## model_full_nig: scorepre ~ age + country + formal_study + parent_ed + spoken + R_write_mean + R_read_mean + (1 | ID)
##                npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)  
## model_sub_nig     7 687.21 716.71 -336.60   673.21                       
## model_full_nig    9 684.59 722.52 -333.29   666.59 6.6237  2    0.03645 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

NIG vs IG Participants Score

It appears that there is a very slight difference in the models between IG and NIG, although the predictors were assessed previously and no significant difference was shown across groups (using T-tests and Fisher’s Tests).

stargazer(model_sub_ig, model_sub_nig, title="Results Comparison of Linear Mixed Models between Groups", align=TRUE, type="text")
## 
## Results Comparison of Linear Mixed Models between Groups
## ================================================
##                         Dependent variable:     
##                     ----------------------------
##                               scorepre          
##                          (1)            (2)     
## ------------------------------------------------
## age                     0.161          0.043    
##                        (0.216)        (0.210)   
##                                                 
## countryYes              0.054        -0.631***  
##                        (0.249)        (0.233)   
##                                                 
## formal_study            -0.068        -0.144    
##                        (0.109)        (0.099)   
##                                                 
## parent_ed              0.594**       0.295***   
##                        (0.244)        (0.101)   
##                                                 
## spoken                  -0.088       0.491***   
##                        (0.170)        (0.171)   
##                                                 
## Constant                -1.971        -1.069    
##                        (1.220)        (1.070)   
##                                                 
## ------------------------------------------------
## Observations             500            500     
## Log Likelihood         -341.618      -336.604   
## Akaike Inf. Crit.      697.236        687.209   
## Bayesian Inf. Crit.    726.738        716.711   
## ================================================
## Note:                *p<0.1; **p<0.05; ***p<0.01

Comarping Performance on Items

Comparing Total Scores across Groups by Items

This method of comparison comprises of taking the total number of correct - total number of incorrect answers per Item and comparing the latter across groups (IG vs NIG) using a bar plot.

We can see that the performance on items is almost the same across groups; when an item scores positively in NIG, it also scores positively on IG (also for negative scores) and vice versa; with exception of two items people and picture that score near 0 total score.

Words such as ball, black, penguin, pin, plant, play, pot performed worse than other words.

It is very iteresting that the words bed, bus and pen performed the best!

## Get scores per ID (0/1)
scores <- table(filtered_data[c("most.freqitems1","scorepre","group")])
scores <- as.data.frame(scores)
items  <- scores$most.freqitems1

## Total Score = Sum(correct) - Sum(incorrect)
ig_score = scores %>% subset(group == "IG" & scorepre==1) %>% select(Freq) - scores %>% subset(group == "IG" & scorepre==0) %>% select(Freq)
ig_score = data.frame(items = items[0:20], group = rep("IG", nrow(ig_score)), score = ig_score)

nig_score = scores %>% subset(group == "NIG" & scorepre==1) %>% select(Freq) - scores %>% subset(group == "IG" & scorepre==0) %>% select(Freq)
nig_score = data.frame(items = items[0:20], group = rep("NIG", nrow(nig_score)), score = nig_score)

all_scores = rbind(ig_score, nig_score)

## Plot
fig <- plot_ly(all_scores, x=~items, y=~Freq, color =~group, type = "bar") %>%
  layout(title="Total Scores on Words in NIG vs IG groups",
         yaxis=list(title="Total Scores on Words"),
         xaxis=list(title="Words/Items"))

fig

Model

IG - Intercept Only Model

First, we model the data using the Items as random effect.

## Add Items
data_ig['items'] = filtered_data %>% subset (group == "IG") %>% select(most.freqitems1)

## Model
model_intercept_ig <- glmer(formula = scorepre ~ 1 + (1|items),
                            family = binomial(link="logit"),
                            data = data_ig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))

lattice::dotplot(ranef(model_intercept_ig, which = "items", condVar = TRUE))
## $items

We take note of the AIC = 669, BIC = 677.5; the intercept is very low 0.0493 and close to zero, indicating that it is nearly equally probably to get a score of 0 and a score of 1 in the pre-test. This means that the participants in this group are nearly guessing.

summary(model_intercept_ig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepre ~ 1 + (1 | items)
##    Data: data_ig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    669.0    677.5   -332.5    665.0      498 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.7514 -0.8121  0.5710  0.8019  1.7250 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  items  (Intercept) 0.4987   0.7062  
## Number of obs: 500, groups:  items, 20
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)  0.05055    0.18417   0.274    0.784

Third, We check the Inter-coefficient correlation = 0.132 which shows that 13.2% of the variation in the score is due to the difference in the items.

performance::icc(model_intercept_ig)
## # Intraclass Correlation Coefficient
## 
##      Adjusted ICC: 0.132
##   Conditional ICC: 0.132
NIG - Intercept Only

We observe the same result for group NIG.

## Add Items
data_nig['items'] = filtered_data %>% subset (group == "NIG") %>% select(most.freqitems1)

## Model
model_intercept_nig <- glmer(formula = scorepre ~ 1 + (1|items),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))

lattice::dotplot(ranef(model_intercept_nig, which = "items", condVar = TRUE))
## $items

We take note of the AIC = 663.2 and BIC 671.6.

summary(model_intercept_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepre ~ 1 + (1 | items)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    663.2    671.6   -329.6    659.2      498 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.8025 -0.8129  0.5548  0.7910  1.5042 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  items  (Intercept) 0.5625   0.75    
## Number of obs: 500, groups:  items, 20
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)   0.1187     0.1931   0.615    0.539

Third, We check the Inter-coefficient correlation = 0.146 which shows that 14.6% of the variation in the score is due to the difference in the items. This is very similar to the results of group IG.

performance::icc(model_intercept_nig)
## # Intraclass Correlation Coefficient
## 
##      Adjusted ICC: 0.146
##   Conditional ICC: 0.146
Check Fixed Effect of Items

We observe the same result for group NIG.

## Model
model_items_ig <- glmer(formula = scorepre ~ 1 + items + (1|ID),
                            family = binomial(link="logit"),
                            data = data_ig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))

## Model
model_items_nig <- glmer(formula = scorepre ~ 1 + items + (1|ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))

stargazer(model_items_ig, model_items_nig, type = "text", title="Model Comparison between Two Groups")
## 
## Model Comparison between Two Groups
## ================================================
##                         Dependent variable:     
##                     ----------------------------
##                               scorepre          
##                          (1)            (2)     
## ------------------------------------------------
## itemsball               0.350        -1.734***  
##                        (0.593)        (0.662)   
##                                                 
## itemsbat               -1.043*        -1.216*   
##                        (0.602)        (0.661)   
##                                                 
## itemsbed               1.473**        -0.246    
##                        (0.690)        (0.702)   
##                                                 
## itemsbird               0.735         -1.216*   
##                        (0.614)        (0.661)   
##                                                 
## itemsblack            -1.979***      -2.704***  
##                        (0.691)        (0.706)   
##                                                 
## itemsboat               -0.169        -0.580    
##                        (0.582)        (0.674)   
##                                                 
## itemsbox                0.537        -2.049***  
##                        (0.602)        (0.675)   
##                                                 
## itemsbubble             0.537         -0.858    
##                        (0.602)        (0.669)   
##                                                 
## itemsbus                0.537          0.284    
##                        (0.602)        (0.756)   
##                                                 
## itemspat                -0.169       -1.389**   
##                        (0.582)        (0.660)   
##                                                 
## itemspen                0.735         0.00000   
##                        (0.614)        (0.724)   
##                                                 
## itemspenguin           -1.457**      -2.132***  
##                        (0.632)        (0.668)   
##                                                 
## itemspeople             -0.678        -0.668    
##                        (0.587)        (0.676)   
##                                                 
## itemspicture           0.00000       -2.091***  
##                        (0.584)        (0.671)   
##                                                 
## itemspig                0.537         -0.668    
##                        (0.602)        (0.676)   
##                                                 
## itemspin                -0.927       -2.449***  
##                        (0.590)        (0.693)   
##                                                 
## itemsplant              -0.856       -2.484***  
##                        (0.594)        (0.690)   
##                                                 
## itemsplay              -1.043*       -2.484***  
##                        (0.602)        (0.690)   
##                                                 
## itemspot               -1.174*       -2.091***  
##                        (0.617)        (0.671)   
##                                                 
## Constant                0.253        1.474***   
##                        (0.422)        (0.525)   
##                                                 
## ------------------------------------------------
## Observations             500            500     
## Log Likelihood         -306.431      -300.680   
## Akaike Inf. Crit.      654.862        643.360   
## Bayesian Inf. Crit.    743.369        731.867   
## ================================================
## Note:                *p<0.1; **p<0.05; ***p<0.01
IG - Full Model

As we’ve seen in the first graph, the items with negative coefficient had negative performance (scores) while the items with positive coefficients had positive performance (scores).

model_full_ig      <- glmer(formula = scorepre ~ age + country + formal_study +
                            parent_ed + spoken + R_write_mean + R_read_mean + items + (1 | items) + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_ig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_full_ig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepre ~ age + country + formal_study + parent_ed + spoken +  
##     R_write_mean + R_read_mean + items + (1 | items) + (1 | ID)
##    Data: data_ig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    659.0    781.2   -300.5    601.0      471 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.2108 -0.8320  0.3385  0.7699  2.6913 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0.03581  0.1892  
##  items  (Intercept) 0.00000  0.0000  
## Number of obs: 500, groups:  ID, 25; items, 20
## 
## Fixed effects:
##              Estimate Std. Error z value Pr(>|z|)   
## (Intercept)  -1.67546    1.38238  -1.212  0.22551   
## age           0.09827    0.23429   0.419  0.67489   
## countryYes    0.02827    0.27997   0.101  0.91957   
## formal_study -0.07118    0.11718  -0.607  0.54354   
## parent_ed     0.57286    0.27981   2.047  0.04063 * 
## spoken       -0.36073    0.21464  -1.681  0.09283 . 
## R_write_mean  0.15824    0.61603   0.257  0.79728   
## R_read_mean   0.93623    0.75053   1.247  0.21224   
## itemsball     0.35020    0.59320   0.590  0.55496   
## itemsbat     -1.04270    0.60240  -1.731  0.08347 . 
## itemsbed      1.47260    0.69048   2.133  0.03295 * 
## itemsbird     0.73512    0.61393   1.197  0.23115   
## itemsblack   -1.97933    0.69151  -2.862  0.00421 **
## itemsboat    -0.16892    0.58182  -0.290  0.77156   
## itemsbox      0.53652    0.60182   0.891  0.37266   
## itemsbubble   0.53651    0.60182   0.891  0.37268   
## itemsbus      0.53652    0.60182   0.891  0.37266   
## itemspat     -0.16894    0.58182  -0.290  0.77154   
## itemspen      0.73512    0.61393   1.197  0.23115   
## itemspenguin -1.45767    0.63167  -2.308  0.02102 * 
## itemspeople  -0.67837    0.58760  -1.154  0.24830   
## itemspicture  0.00013    0.58358   0.000  0.99982   
## itemspig      0.53653    0.60183   0.892  0.37266   
## itemspin     -0.93716    0.59062  -1.587  0.11258   
## itemsplant   -0.85628    0.59365  -1.442  0.14919   
## itemsplay    -1.04269    0.60240  -1.731  0.08347 . 
## itemspot     -1.16290    0.61759  -1.883  0.05970 . 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation matrix not shown by default, as p = 27 > 12.
## Use print(x, correlation=TRUE)  or
##     vcov(x)        if you need it
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')

VIF for multicollinearity check: R_write and R_read are correlated as expected.

car::vif(model_full_ig)
##                  GVIF Df GVIF^(1/(2*Df))
## age          1.179655  1        1.086119
## country      1.515653  1        1.231119
## formal_study 1.069098  1        1.033972
## parent_ed    1.720952  1        1.311851
## spoken       2.131194  1        1.459861
## R_write_mean 4.093269  1        2.023183
## R_read_mean  4.571377  1        2.138078
## items        1.021267 19        1.000554
IG - Sub Model
model_sub_ig      <- glmer(formula = scorepre ~ age + country + formal_study +
                            parent_ed + spoken + items + (1 | items) + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_ig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_sub_ig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepre ~ age + country + formal_study + parent_ed + spoken +  
##     items + (1 | items) + (1 | ID)
##    Data: data_ig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    659.9    773.7   -302.9    605.9      473 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.3412 -0.7957  0.3442  0.7629  2.5724 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0.09566  0.3093  
##  items  (Intercept) 0.00000  0.0000  
## Number of obs: 500, groups:  ID, 25; items, 20
## 
## Fixed effects:
##                Estimate Std. Error z value Pr(>|z|)   
## (Intercept)  -2.1403860  1.4938052  -1.433  0.15190   
## age           0.1898080  0.2550997   0.744  0.45684   
## countryYes    0.0647102  0.2941187   0.220  0.82586   
## formal_study -0.0810591  0.1288368  -0.629  0.52924   
## parent_ed     0.7039331  0.2891162   2.435  0.01490 * 
## spoken       -0.1028288  0.2004020  -0.513  0.60787   
## itemsball     0.3500810  0.5930298   0.590  0.55497   
## itemsbat     -1.0429176  0.6024158  -1.731  0.08341 . 
## itemsbed      1.4719947  0.6903221   2.132  0.03298 * 
## itemsbird     0.7348509  0.6137598   1.197  0.23119   
## itemsblack   -1.9804430  0.6916738  -2.863  0.00419 **
## itemsboat    -0.1689137  0.5816914  -0.290  0.77152   
## itemsbox      0.5363357  0.6016433   0.891  0.37269   
## itemsbubble   0.5363336  0.6016467   0.891  0.37269   
## itemsbus      0.5363283  0.6016386   0.891  0.37269   
## itemspat     -0.1689110  0.5816784  -0.290  0.77152   
## itemspen      0.7348297  0.6137559   1.197  0.23120   
## itemspenguin -1.4582151  0.6317502  -2.308  0.02099 * 
## itemspeople  -0.6783952  0.5875433  -1.155  0.24824   
## itemspicture  0.0001229  0.5834283   0.000  0.99983   
## itemspig      0.5363275  0.6016397   0.891  0.37269   
## itemspin     -0.9179916  0.5903925  -1.555  0.11997   
## itemsplant   -0.8563655  0.5936225  -1.443  0.14913   
## itemsplay    -1.0429002  0.6024089  -1.731  0.08341 . 
## itemspot     -1.1844576  0.6177348  -1.917  0.05518 . 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation matrix not shown by default, as p = 25 > 12.
## Use print(x, correlation=TRUE)  or
##     vcov(x)        if you need it
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')

Comparing Models; the full model has lower AIC and higher BIC, it is ambiguous which model performs better, but we know theoratically R_write and R_read can impact the scores; but they are correlated. We can include one of them in the model and check if VIF is acceptable.

anova(model_full_ig, model_sub_ig)
## Data: data_ig
## Models:
## model_sub_ig: scorepre ~ age + country + formal_study + parent_ed + spoken + items + (1 | items) + (1 | ID)
## model_full_ig: scorepre ~ age + country + formal_study + parent_ed + spoken + R_write_mean + R_read_mean + items + (1 | items) + (1 | ID)
##               npar    AIC    BIC  logLik deviance Chisq Df Pr(>Chisq)  
## model_sub_ig    27 659.88 773.68 -302.94   605.88                      
## model_full_ig   29 659.01 781.23 -300.50   601.01 4.875  2    0.08738 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Including R_Write_mean; we see P-value is significant in the anova test, indicating that R_write truly affects the score; which is logical.

model_sub2_ig      <- glmer(formula = scorepre ~ age + country + formal_study +
                            parent_ed + spoken + R_write_mean + items + (1 | items) + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_ig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
car::vif(model_sub2_ig)
##                  GVIF Df GVIF^(1/(2*Df))
## age          1.171154  1        1.082198
## country      1.445846  1        1.202433
## formal_study 1.069859  1        1.034340
## parent_ed    1.512749  1        1.229939
## spoken       1.997357  1        1.413279
## R_write_mean 1.640472  1        1.280809
## items        1.017973 19        1.000469
anova(model_sub2_ig, model_sub_ig)
## Data: data_ig
## Models:
## model_sub_ig: scorepre ~ age + country + formal_study + parent_ed + spoken + items + (1 | items) + (1 | ID)
## model_sub2_ig: scorepre ~ age + country + formal_study + parent_ed + spoken + R_write_mean + items + (1 | items) + (1 | ID)
##               npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)  
## model_sub_ig    27 659.88 773.68 -302.94   605.88                       
## model_sub2_ig   28 658.52 776.53 -301.26   602.52 3.3579  1    0.06689 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Visualize the Variation of scores accross Participants.

lattice::dotplot(ranef(model_sub2_ig, which = "ID", condVar = TRUE))
## $ID

NIG - Full Model
model_full_nig      <- glmer(formula = scorepre ~ age + country + formal_study +
                            parent_ed + spoken + R_write_mean + R_read_mean + items + (1 | items) + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_full_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepre ~ age + country + formal_study + parent_ed + spoken +  
##     R_write_mean + R_read_mean + items + (1 | items) + (1 | ID)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    638.4    760.6   -290.2    580.4      471 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.2993 -0.7710  0.3427  0.7364  2.2231 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0        0       
##  items  (Intercept) 0        0       
## Number of obs: 500, groups:  ID, 25; items, 20
## 
## Fixed effects:
##                Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  -3.207e-01  1.484e+00  -0.216 0.828906    
## age           1.078e-01  2.775e-01   0.388 0.697760    
## countryYes   -7.177e-01  2.528e-01  -2.839 0.004527 ** 
## formal_study -1.775e-01  1.100e-01  -1.614 0.106494    
## parent_ed     4.521e-01  1.171e-01   3.862 0.000112 ***
## spoken        5.570e-01  2.537e-01   2.195 0.028139 *  
## R_write_mean  1.589e+00  5.690e-01   2.793 0.005214 ** 
## R_read_mean  -1.278e+00  6.629e-01  -1.928 0.053849 .  
## itemsball    -1.735e+00  6.618e-01  -2.622 0.008736 ** 
## itemsbat     -1.217e+00  6.610e-01  -1.841 0.065665 .  
## itemsbed     -2.455e-01  7.022e-01  -0.350 0.726625    
## itemsbird    -1.217e+00  6.610e-01  -1.841 0.065666 .  
## itemsblack   -2.710e+00  7.056e-01  -3.840 0.000123 ***
## itemsboat    -5.721e-01  6.742e-01  -0.849 0.396086    
## itemsbox     -2.058e+00  6.742e-01  -3.053 0.002268 ** 
## itemsbubble  -8.585e-01  6.691e-01  -1.283 0.199466    
## itemsbus      2.838e-01  7.559e-01   0.375 0.707314    
## itemspat     -1.390e+00  6.597e-01  -2.107 0.035120 *  
## itemspen      4.606e-14  7.237e-01   0.000 1.000000    
## itemspenguin -2.129e+00  6.682e-01  -3.186 0.001443 ** 
## itemspeople  -6.682e-01  6.766e-01  -0.988 0.323352    
## itemspicture -2.094e+00  6.710e-01  -3.121 0.001802 ** 
## itemspig     -6.682e-01  6.766e-01  -0.988 0.323352    
## itemspin     -2.459e+00  6.927e-01  -3.550 0.000385 ***
## itemsplant   -2.489e+00  6.901e-01  -3.606 0.000311 ***
## itemsplay    -2.489e+00  6.901e-01  -3.606 0.000311 ***
## itemspot     -2.094e+00  6.710e-01  -3.121 0.001802 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation matrix not shown by default, as p = 27 > 12.
## Use print(x, correlation=TRUE)  or
##     vcov(x)        if you need it
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')

VIF for multicollinearity check: R_write and R_read are correlated as expected.

car::vif(model_full_nig)
##                  GVIF Df GVIF^(1/(2*Df))
## age          1.822665  1        1.350061
## country      1.455063  1        1.206260
## formal_study 1.360738  1        1.166507
## parent_ed    1.315681  1        1.147031
## spoken       2.597210  1        1.611586
## R_write_mean 3.509100  1        1.873259
## R_read_mean  5.360962  1        2.315375
## items        1.046250 19        1.001191
IG - Sub Model
model_sub_nig      <- glmer(formula = scorepre ~ age + country + formal_study +
                            parent_ed + spoken + items + (1 | items) + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_sub_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepre ~ age + country + formal_study + parent_ed + spoken +  
##     items + (1 | items) + (1 | ID)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    641.6    755.4   -293.8    587.6      473 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.8152 -0.7652  0.3564  0.7245  2.1821 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0.06792  0.2606  
##  items  (Intercept) 0.00000  0.0000  
## Number of obs: 500, groups:  ID, 25; items, 20
## 
## Fixed effects:
##                Estimate Std. Error z value Pr(>|z|)    
## (Intercept)   5.618e-02  1.382e+00   0.041 0.967576    
## age           4.568e-02  2.532e-01   0.180 0.856837    
## countryYes   -7.604e-01  2.820e-01  -2.696 0.007011 ** 
## formal_study -1.796e-01  1.190e-01  -1.510 0.131140    
## parent_ed     3.596e-01  1.221e-01   2.945 0.003234 ** 
## spoken        6.092e-01  2.074e-01   2.937 0.003316 ** 
## itemsball    -1.732e+00  6.618e-01  -2.617 0.008858 ** 
## itemsbat     -1.215e+00  6.608e-01  -1.839 0.065974 .  
## itemsbed     -2.453e-01  7.019e-01  -0.349 0.726775    
## itemsbird    -1.215e+00  6.608e-01  -1.839 0.065974 .  
## itemsblack   -2.703e+00  7.059e-01  -3.829 0.000129 ***
## itemsboat    -5.678e-01  6.741e-01  -0.842 0.399647    
## itemsbox     -2.056e+00  6.744e-01  -3.048 0.002302 ** 
## itemsbubble  -8.575e-01  6.688e-01  -1.282 0.199828    
## itemsbus      2.837e-01  7.557e-01   0.375 0.707388    
## itemspat     -1.388e+00  6.595e-01  -2.104 0.035368 *  
## itemspen      2.831e-05  7.234e-01   0.000 0.999969    
## itemspenguin -2.123e+00  6.683e-01  -3.176 0.001492 ** 
## itemspeople  -6.674e-01  6.763e-01  -0.987 0.323675    
## itemspicture -2.090e+00  6.711e-01  -3.114 0.001846 ** 
## itemspig     -6.674e-01  6.763e-01  -0.987 0.323679    
## itemspin     -2.455e+00  6.929e-01  -3.543 0.000396 ***
## itemsplant   -2.483e+00  6.904e-01  -3.596 0.000323 ***
## itemsplay    -2.483e+00  6.904e-01  -3.596 0.000323 ***
## itemspot     -2.090e+00  6.711e-01  -3.114 0.001846 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation matrix not shown by default, as p = 25 > 12.
## Use print(x, correlation=TRUE)  or
##     vcov(x)        if you need it
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')

Comparing Models; the full model has lower AIC and higher BIC, it is ambiguous which model performs better, but we know theoratically R_write and R_read can impact the scores; but they are correlated. We can include one of them in the model and check if VIF is acceptable.

anova(model_full_nig, model_sub_nig)
## Data: data_nig
## Models:
## model_sub_nig: scorepre ~ age + country + formal_study + parent_ed + spoken + items + (1 | items) + (1 | ID)
## model_full_nig: scorepre ~ age + country + formal_study + parent_ed + spoken + R_write_mean + R_read_mean + items + (1 | items) + (1 | ID)
##                npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)  
## model_sub_nig    27 641.58 755.37 -293.79   587.58                       
## model_full_nig   29 638.38 760.60 -290.19   580.38 7.2016  2     0.0273 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Including R_Write_mean; we see P-value is nearly significant in the anova test, indicating that R_write might affect the score; which is logical.

model_sub2_nig      <- glmer(formula = scorepre ~ age + country + formal_study +
                            parent_ed + spoken + R_write_mean + items + (1 | items) + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
car::vif(model_sub2_nig)
##                  GVIF Df GVIF^(1/(2*Df))
## age          1.413883  1        1.189068
## country      1.449737  1        1.204050
## formal_study 1.311138  1        1.145049
## parent_ed    1.140867  1        1.068114
## spoken       2.041773  1        1.428906
## R_write_mean 1.892055  1        1.375520
## items        1.038140 19        1.000986
anova(model_sub2_nig, model_sub_nig)
## Data: data_nig
## Models:
## model_sub_nig: scorepre ~ age + country + formal_study + parent_ed + spoken + items + (1 | items) + (1 | ID)
## model_sub2_nig: scorepre ~ age + country + formal_study + parent_ed + spoken + R_write_mean + items + (1 | items) + (1 | ID)
##                npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)  
## model_sub_nig    27 641.58 755.37 -293.79   587.58                       
## model_sub2_nig   28 640.03 758.04 -292.01   584.03 3.5494  1    0.05957 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Visualize the Variation of scores accross Participants and Across Items.

lattice::dotplot(ranef(model_sub2_nig, which = "ID", condVar = TRUE))
## $ID

Comarping Sound Scores

Comparing Total Scores across Groups

This method of comparison comprises of taking the total number of correct - total number of incorrect answers per sound and comparing the latter across groups (IG vs NIG). We can see that sound ‘b’ is percieved correctly way more than ‘p’. This finding is logical as phoneme ‘p’ does not exist in the arabic language and children will logically find it harder to percieve.

## Get scores per group
scores <- table(filtered_data[c("soundpre","scorepre","group")])
scores <- as.data.frame(scores)
sounds  <- scores$soundpre

## Total Score = Sum(correct) - Sum(incorrect)
ig_score = scores %>% subset(group == "IG" & scorepre==1) %>% select(Freq) - scores %>% subset(group == "IG" & scorepre==0) %>% select(Freq)
ig_score = data.frame(sounds = sounds[0:2], group = rep("IG", nrow(ig_score)), score = ig_score)

nig_score = scores %>% subset(group == "NIG" & scorepre==1) %>% select(Freq) - scores %>% subset(group == "IG" & scorepre==0) %>% select(Freq)
nig_score = data.frame(sounds = sounds[0:2], group = rep("NIG", nrow(nig_score)), score = nig_score)

all_scores = rbind(ig_score, nig_score)

## Plot
fig <- plot_ly(all_scores, x=~sounds, y=~Freq, color =~group, type = "bar") %>%
  layout(title="Sounds Differentiation accross groups",
         yaxis=list(title="Total Scores on Sounds"),
         xaxis=list(title="Sounds"))

fig

We can see higher peaks of ‘b’ and low peaks of ‘p’ across both groups and across different participants. Participant 1 has relatively low score on both.

## Get scores per ID (0/1)
scores <- aggregate(filtered_data$scorepre, by=list(filtered_data$ID, filtered_data$soundpre), FUN=sum)

fig <- plot_ly(scores, x=~Group.1, y=~x, color =~Group.2, type = "bar") %>%
  layout(title="Sounds Differentiation accross participants",
         yaxis=list(title="Total Scores on Sounds"),
         xaxis=list(title="Sounds"))

fig

Total Scores for participants

total_scores <- aggregate(scores$x, by=list(scores$Group.1), FUN=sum)

fig <- plot_ly(total_scores, x=~Group.1, y=~x, type = "bar") %>%
  layout(title="Sounds Differentiation accross participants",
         yaxis=list(title="Total Scores on Sounds"),
         xaxis=list(title="Sounds"))

fig

Model

IG - Intercept Only Model

First, we model the data using the sounds only.

## Add Items
data_ig['sounds'] = filtered_data %>% subset (group == "IG") %>% select(soundpre)

## Model
model_intercept_ig <- glmer(formula = scorepre ~ 1 + (1|sounds),
                            family = binomial(link="logit"),
                            data = data_ig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))

We take note of the AIC = 669, BIC = 677.5; the intercept is very low 0.0487 and close to zero, indicating that it is nearly equally probably to get a score of 0 and a score of 1 in the pre-test. This means that the participants in this group are nearly guessing.

summary(model_intercept_ig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepre ~ 1 + (1 | sounds)
##    Data: data_ig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    692.6    701.1   -344.3    688.6      498 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.1407 -0.9203  0.8766  0.8766  1.0866 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  sounds (Intercept) 0.05906  0.243   
## Number of obs: 500, groups:  sounds, 2
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)  0.04872    0.19405   0.251    0.802

Third, We check the Inter-coefficient correlation = 0.018, which shows that 1.8% of the score variation is due to sound.

performance::icc(model_intercept_ig)
## # Intraclass Correlation Coefficient
## 
##      Adjusted ICC: 0.018
##   Conditional ICC: 0.018
NIG - Intercept Only

We observe the same result for group NIG.

## Add Items
data_nig['sounds'] = filtered_data %>% subset (group == "NIG") %>% select(soundpre)

## Model
model_intercept_nig <- glmer(formula = scorepre ~ 1 + (1|sounds),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))

We take note of the AIC = 663.2 and BIC 671.6.

summary(model_intercept_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepre ~ 1 + (1 | sounds)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    692.0    700.4   -344.0    688.0      498 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.1681 -0.9512  0.8561  0.8561  1.0514 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  sounds (Intercept) 0.05493  0.2344  
## Number of obs: 500, groups:  sounds, 2
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)   0.1055     0.1887   0.559    0.576

Third, We check the Inter-coefficient correlation = 0.016 which shows that 1.6% of the variation is due to sound.

Since the ICC is very low, a linear mixed model will perform the same as a normal Ordinary least squares model, which means that it will perform similarly as the above analysis since the introduction of sound as a random effect is not causing any significant change.

performance::icc(model_intercept_nig)
## # Intraclass Correlation Coefficient
## 
##      Adjusted ICC: 0.016
##   Conditional ICC: 0.016

Visualize the Variation of scores accross Participants and Across Items.

lattice::dotplot(ranef(model_intercept_nig, which = "sounds", condVar = TRUE))
## $sounds

lattice::dotplot(ranef(model_intercept_ig, which = "sounds", condVar = TRUE))
## $sounds

##### NIG & IG - Subset Models

It is clear that in both groups, sound being p negatively impacts the score. The participants are exp(-0.56) = 1.75 times more likely to score (0) if the sound is p. This is consistent in both groups IG and NIG.

model_sub_nig      <- glmer(formula = scorepre ~ age + country + formal_study +
                            parent_ed + spoken + R_write_mean + sounds + (1 | sounds) + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
model_sub_ig      <- glmer(formula = scorepre ~ age + country + formal_study +
                            parent_ed + spoken + R_write_mean + sounds + (1 | sounds) + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_ig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
stargazer(model_sub_ig, model_sub_nig, type="text")
## 
## ================================================
##                         Dependent variable:     
##                     ----------------------------
##                               scorepre          
##                          (1)            (2)     
## ------------------------------------------------
## age                     0.108         -0.122    
##                        (0.208)        (0.225)   
##                                                 
## countryYes              -0.044       -0.597***  
##                        (0.244)        (0.230)   
##                                                 
## formal_study            -0.058        -0.182*   
##                        (0.104)        (0.099)   
##                                                 
## parent_ed              0.603***      0.314***   
##                        (0.233)        (0.099)   
##                                                 
## spoken                  -0.258         0.271    
##                        (0.185)        (0.207)   
##                                                 
## R_write_mean            0.658*        0.721*    
##                        (0.346)        (0.385)   
##                                                 
## soundsp               -0.564***      -0.557***  
##                        (0.184)        (0.185)   
##                                                 
## Constant                -1.558        -0.029    
##                        (1.168)        (1.137)   
##                                                 
## ------------------------------------------------
## Observations             500            500     
## Log Likelihood         -335.143      -330.297   
## Akaike Inf. Crit.      690.286        680.595   
## Bayesian Inf. Crit.    732.432        722.741   
## ================================================
## Note:                *p<0.1; **p<0.05; ***p<0.01

Final Conclusion so far: There is no significant difference between IG and NIG in pre-test either in participants’ score, considering items, or sounds.



Post-Test Analysis

4 post-test analysis: post-test word type 1, post-test word type 2, post-test word type 3, post-test word type 4; in each do:

Strategy: 1. The influence of interactive DST on the perception of English /p/ and /b/ among Saudi EFL kindergarten children
1.a. Participants’ Performance Scores
1.b. Performance on Items Scores
1.c. Performance on Sounds Scores

2. The influence of non-interactive DST on the perception of English /p/ and /b/ among Saudi EFL kindergarten children
Do the same analysis as above (a,b,c).

3. Compare the two groups’ performance

Word Type 1 - Post Test

Comarping Participants’ Scores

Comparing Total Scores across Groups

This method of comparison comprises of taking the total number of correct - total number of incorrect answers per participant and comparing the latter across groups (IG vs NIG) using a T-test.

The t-test shows p-value of 0.58, which is insignificant.

## Get scores per ID (0/1)
scores <- table(filtered_data[c("ID","scorepost")])
groups <- unique(filtered_data[c("ID","group")])

## Total Score = Sum(correct) - Sum(incorrect)
total_score <- scores[,2] - scores[,1]

## Add to dataframe
df_scores <- data.frame(group = as.factor(groups$group), total_score = total_score)

## T-test for difference in Total Participants' Scores
independentSamplesTTest(formula = total_score~group, data=df_scores)
## 
##    Welch's independent samples t-test 
## 
## Outcome variable:   total_score 
## Grouping variable:  group 
## 
## Descriptive statistics: 
##                IG   NIG
##    mean     8.720 7.680
##    std dev. 5.029 8.117
## 
## Hypotheses: 
##    null:        population means equal for both groups
##    alternative: different population means in each group
## 
## Test results: 
##    t-statistic:  0.545 
##    degrees of freedom:  40.059 
##    p-value:  0.589 
## 
## Other information: 
##    two-sided 95% confidence interval:  [-2.82, 4.9] 
##    estimated effect size (Cohen's d):  0.154

Modeling

Prepare Data for Modeling

As we’ve noted: The features considered theoratically significant are: age, country (lived in english speaking country), formal_study, parent_ed, spoken (English Proficiency), as main features, and we include R_*_mean for now.

## Select Columns
data_model = filtered_data %>%
  dplyr::select(ID, age, country, formal_study, parent_ed, spoken,R_write_mean, R_read_mean, R_speak_mean, R_listen_mean, R_games_mean, mix, group, scorepost)

## Check Data Types
str(data_model)
## 'data.frame':    1000 obs. of  14 variables:
##  $ ID           : chr  "IG.1" "IG.1" "IG.1" "IG.1" ...
##  $ age          : int  4 4 4 4 4 4 4 4 4 4 ...
##  $ country      : chr  "No" "No" "No" "No" ...
##  $ formal_study : chr  "2" "2" "2" "2" ...
##  $ parent_ed    : num  2 2 2 2 2 2 2 2 2 2 ...
##  $ spoken       : chr  "1" "1" "1" "1" ...
##  $ R_write_mean : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_read_mean  : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_speak_mean : num  46.7 46.7 46.7 46.7 46.7 ...
##  $ R_listen_mean: num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_games_mean : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ mix          : chr  "1" "1" "1" "1" ...
##  $ group        : chr  "IG" "IG" "IG" "IG" ...
##  $ scorepost    : int  1 1 1 1 1 0 0 1 1 0 ...
## Set to Factors 
data_model$country      <- as.factor(data_model$country)
data_model$formal_study <- as.numeric(data_model$formal_study)
data_model$parent_ed    <- as.numeric(data_model$parent_ed)
data_model$spoken       <- as.numeric(data_model$spoken)
data_model$mix          <- as.factor(data_model$mix)
data_model$scorepost     <- as.factor(data_model$scorepost)

## Validate Data Types have been changed
str(data_model)
## 'data.frame':    1000 obs. of  14 variables:
##  $ ID           : chr  "IG.1" "IG.1" "IG.1" "IG.1" ...
##  $ age          : int  4 4 4 4 4 4 4 4 4 4 ...
##  $ country      : Factor w/ 2 levels "No","Yes": 1 1 1 1 1 1 1 1 1 1 ...
##  $ formal_study : num  2 2 2 2 2 2 2 2 2 2 ...
##  $ parent_ed    : num  2 2 2 2 2 2 2 2 2 2 ...
##  $ spoken       : num  1 1 1 1 1 1 1 1 1 1 ...
##  $ R_write_mean : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_read_mean  : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_speak_mean : num  46.7 46.7 46.7 46.7 46.7 ...
##  $ R_listen_mean: num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_games_mean : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ mix          : Factor w/ 3 levels "0","1","2": 2 2 2 2 2 2 2 2 2 2 ...
##  $ group        : chr  "IG" "IG" "IG" "IG" ...
##  $ scorepost    : Factor w/ 2 levels "0","1": 2 2 2 2 2 1 1 2 2 1 ...
## Re-scaling Data Function (0 to 1)
rescale <- function(x){(x-min(x))/(max(x)-min(x))}
#> Re-scale R_means
data_model$R_write_mean <- rescale(data_model$R_write_mean)
data_model$R_read_mean <- rescale(data_model$R_read_mean)
data_model$R_speak_mean <- rescale(data_model$R_speak_mean)
data_model$R_listen_mean <- rescale(data_model$R_listen_mean)
data_model$R_games_mean <- rescale(data_model$R_games_mean)

## Split into two groups IG and NIG z
data_ig <- data_model %>% subset (group == "IG")  %>% select(-group)
data_nig <- data_model %>% subset (group == "NIG") %>% select(-group)

Model

IG - Intercept Only Model

First, we model the data using the ID only.

model_intercept_ig <- glmer(formula = scorepost ~ 1 + (1|ID),
                            family = binomial(link="logit"),
                            data = data_ig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))

We take note of the AIC = 596, and BIC = 604, with estimate of intercept = 0.964 which means that participants are 2.6 more likely to score (1) in this group.

summary(model_intercept_ig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ 1 + (1 | ID)
##    Data: data_ig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    596.3    604.8   -296.2    592.3      498 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.9900 -1.3940  0.5771  0.6307  0.7800 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0.136    0.3687  
## Number of obs: 500, groups:  ID, 25
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)   0.9640     0.1269   7.598 3.01e-14 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
exp(0.964)
## [1] 2.622164

Third, We check the Inter-coefficient correlation = 0.04. This means that only 4% of the variation in the scores is explained by the difference in participants. Therefore, the variance in the scores is not due to the clusters (participants). This reaffirms that no participants perform remarkably better/worse than others even after post-test.

Also, this indicates that a simpler analysis can be used here, instead of linear mixed models, we can remove the random effect and use a linear model instead.

performance::icc(model_intercept_ig)
## # Intraclass Correlation Coefficient
## 
##      Adjusted ICC: 0.040
##   Conditional ICC: 0.040
NIG - Intercept Only

We observe the same result for group NIG.

model_intercept_nig <- glmer(formula = scorepost ~ 1 + (1|ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))

Similarly as IG, we see the same results for NIG.

summary(model_intercept_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ 1 + (1 | ID)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    582.1    590.5   -289.0    578.1      498 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.8362 -0.9260  0.4102  0.6460  1.0799 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0.9158   0.957   
## Number of obs: 500, groups:  ID, 25
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)   0.9756     0.2228   4.379 1.19e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
exp(0.9756)
## [1] 2.652758

In the NIG group, 21% of the scores post-test are due to student clustering. Therefore there is a difference between participants inside the group NIG.

performance::icc(model_intercept_nig)
## # Intraclass Correlation Coefficient
## 
##      Adjusted ICC: 0.218
##   Conditional ICC: 0.218

View Difference:

lattice::dotplot(ranef(model_intercept_nig, which = "ID", condVar = TRUE))
## $ID

IG - Full Model

We note that parent_ed is slightly significant (p-value of 0.0437, Wald Test for Linear Mixed Models), with coeff = 0.59. Therefore, every one level increase in the parents’ education, increases the odds of scoring (1) by exp(0.59) = 1.803 times.

model_full_ig      <- glmer(formula = scorepost ~ age + country + formal_study +
                            parent_ed + spoken + R_write_mean + R_read_mean + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_ig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_full_ig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ age + country + formal_study + parent_ed + spoken +  
##     R_write_mean + R_read_mean + (1 | ID)
##    Data: data_ig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    599.8    637.8   -290.9    581.8      491 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.8041 -1.2064  0.5461  0.6602  0.8648 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0        0       
## Number of obs: 500, groups:  ID, 25
## 
## Fixed effects:
##              Estimate Std. Error z value Pr(>|z|)  
## (Intercept)   -0.8968     1.2887  -0.696   0.4865  
## age            0.1729     0.2292   0.754   0.4506  
## countryYes    -0.5192     0.2752  -1.887   0.0592 .
## formal_study  -0.1895     0.1135  -1.669   0.0951 .
## parent_ed      0.4293     0.2696   1.592   0.1114  
## spoken         0.1060     0.2027   0.523   0.6011  
## R_write_mean   0.1755     0.5999   0.293   0.7698  
## R_read_mean    0.5095     0.7148   0.713   0.4760  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) age    cntryY frml_s prnt_d spoken R_wrt_
## age         -0.903                                          
## countryYes   0.192 -0.091                                   
## formal_stdy  0.125 -0.211  0.272                            
## parent_ed   -0.548  0.193 -0.383 -0.169                     
## spoken      -0.225  0.308 -0.230 -0.122 -0.153              
## R_write_men -0.092  0.015 -0.287  0.002  0.275 -0.103       
## R_read_mean  0.163 -0.111  0.231  0.029 -0.345 -0.209 -0.796
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')

However, we see an issue of singularity (some variance is very close to zero) and the full model contains high VIF for spoken, R_write_mean and R_read_mean. This was expected as these variables were found to be correlated. We proceed by subsetting the model.

car::vif(model_full_ig)
##          age      country formal_study    parent_ed       spoken R_write_mean 
##     1.242124     1.631382     1.134301     1.707019     2.023338     4.148094 
##  R_read_mean 
##     4.449098
IG - Subset Model

We see that the parent_ed is close to be significant. THe Anova test shows p-value of 0.06 between model with parent_ed and model without parent_ed. 

model_sub_ig           <- glmer(formula = scorepost ~ age + country + formal_study +
                                  parent_ed + spoken + (1 | ID),
                                  family = binomial(link="logit"),
                                  data = data_ig,
                                  control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
summary(model_sub_ig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ age + country + formal_study + parent_ed + spoken +  
##     (1 | ID)
##    Data: data_ig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    598.3    627.8   -292.1    584.3      493 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.6755 -1.2680  0.5608  0.6715  0.8169 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0.023    0.1517  
## Number of obs: 500, groups:  ID, 25
## 
## Fixed effects:
##              Estimate Std. Error z value Pr(>|z|)  
## (Intercept)   -1.1471     1.3230  -0.867   0.3859  
## age            0.2289     0.2356   0.972   0.3312  
## countryYes    -0.4863     0.2758  -1.763   0.0779 .
## formal_study  -0.1958     0.1187  -1.649   0.0991 .
## parent_ed      0.4972     0.2675   1.859   0.0631 .
## spoken         0.2575     0.1853   1.390   0.1646  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) age    cntryY frml_s prnt_d
## age         -0.901                            
## countryYes   0.176 -0.094                     
## formal_stdy  0.097 -0.189  0.273              
## parent_ed   -0.527  0.156 -0.328 -0.151       
## spoken      -0.175  0.268 -0.302 -0.098 -0.311
## Prove Significance
model_sub_no_ed_ig      <- glmer(formula = scorepost ~ age + country + formal_study + spoken + (1 | ID),
                                  family = binomial(link="logit"),
                                  data = data_ig,
                                  control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))

## Proves the 
anova(model_sub_no_ed_ig, model_sub_ig)
## Data: data_ig
## Models:
## model_sub_no_ed_ig: scorepost ~ age + country + formal_study + spoken + (1 | ID)
## model_sub_ig: scorepost ~ age + country + formal_study + parent_ed + spoken + (1 | ID)
##                    npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)  
## model_sub_no_ed_ig    6 599.60 624.89 -293.80   587.60                       
## model_sub_ig          7 598.29 627.79 -292.14   584.29 3.3145  1    0.06867 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Multicollinearity seems to be good in the subset model.

car::vif(model_sub_ig)
##          age      country formal_study    parent_ed       spoken 
##     1.203525     1.493529     1.115386     1.542661     1.586921

Compare Full model to Sub Model: It is ambiguous which model performs better; however sub_model has lower BIC.

anova(model_full_ig, model_sub_ig)
## Data: data_ig
## Models:
## model_sub_ig: scorepost ~ age + country + formal_study + parent_ed + spoken + (1 | ID)
## model_full_ig: scorepost ~ age + country + formal_study + parent_ed + spoken + R_write_mean + R_read_mean + (1 | ID)
##               npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)
## model_sub_ig     7 598.29 627.79 -292.14   584.29                     
## model_full_ig    9 599.82 637.75 -290.91   581.82 2.4664  2     0.2914
NIG - Full Model

We note that parent_ed is more significant here and spoken is slightly significant in the model. This means that the education of the parents and spoken english level by participants are the most significant in determining the scores, and they increase the likelyhood of scoring (1) as their coefficients are positive.

model_full_nig      <- glmer(formula = scorepost ~ age + country + formal_study +
                            parent_ed + spoken + R_write_mean + R_read_mean + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
summary(model_full_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ age + country + formal_study + parent_ed + spoken +  
##     R_write_mean + R_read_mean + (1 | ID)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    579.4    617.3   -280.7    561.4      491 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.4796 -0.9325  0.4084  0.6185  1.2198 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0.3092   0.5561  
## Number of obs: 500, groups:  ID, 25
## 
## Fixed effects:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  -1.28814    2.10903  -0.611 0.541349    
## age           0.09594    0.42051   0.228 0.819518    
## countryYes   -0.74166    0.38093  -1.947 0.051539 .  
## formal_study -0.39658    0.17219  -2.303 0.021270 *  
## parent_ed     0.65515    0.17754   3.690 0.000224 ***
## spoken        0.67284    0.38216   1.761 0.078305 .  
## R_write_mean  1.56935    0.86616   1.812 0.070008 .  
## R_read_mean  -0.93907    0.97080  -0.967 0.333388    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) age    cntryY frml_s prnt_d spoken R_wrt_
## age         -0.967                                          
## countryYes   0.314 -0.282                                   
## formal_stdy -0.304  0.298 -0.045                            
## parent_ed   -0.331  0.180 -0.213 -0.225                     
## spoken      -0.553  0.449 -0.344  0.014  0.207              
## R_write_men -0.070  0.043  0.065 -0.038  0.318 -0.101       
## R_read_mean  0.481 -0.471 -0.001 -0.181 -0.355 -0.438 -0.658

The full model contains high VIF for spoken, R_write_mean and R_read_mean. This was expected as these variables were found to be correlated. We proceed by subsetting the model.

car::vif(model_full_nig)
##          age      country formal_study    parent_ed       spoken R_write_mean 
##     1.763008     1.410374     1.344774     1.372268     2.414321     3.338671 
##  R_read_mean 
##     5.021610
NIG - Subset Model

We see that the parent_ed is still significant, and AIC, BIC are very close to the full model. Country is also showing to be significant, with a negative coefficient. Theoratically, it is difficult to tell whether living in an English speaking country would impact a child that is still 3-4 years of age without understanding the social behaviour/participation inside or outside the house.

In the analysis, we need to consider the SD of the coefficient. If coeff +- SD == 0 or changes the sign of the coeff then we cannot deduce any significant relation.

model_sub_nig      <- glmer(formula = scorepost ~ age + country + formal_study +
                            parent_ed + spoken + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
summary(model_sub_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ age + country + formal_study + parent_ed + spoken +  
##     (1 | ID)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    578.5    608.0   -282.2    564.5      493 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.0645 -0.9020  0.4168  0.6381  1.2034 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0.4034   0.6352  
## Number of obs: 500, groups:  ID, 25
## 
## Fixed effects:
##              Estimate Std. Error z value Pr(>|z|)   
## (Intercept)   -1.3930     1.8580  -0.750  0.45341   
## age            0.1401     0.3673   0.381  0.70289   
## countryYes    -0.8162     0.4085  -1.998  0.04571 * 
## formal_study  -0.3834     0.1778  -2.156  0.03107 * 
## parent_ed      0.5788     0.1781   3.250  0.00115 **
## spoken         0.8230     0.3016   2.729  0.00635 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) age    cntryY frml_s prnt_d
## age         -0.951                            
## countryYes   0.363 -0.324                     
## formal_stdy -0.191  0.180 -0.028              
## parent_ed   -0.265  0.075 -0.237 -0.291       
## spoken      -0.294  0.107 -0.410 -0.245  0.152

Multicollinearity seems to be fixed.

car::vif(model_sub_nig)
##          age      country formal_study    parent_ed       spoken 
##     1.159680     1.406180     1.239135     1.173819     1.306776

Compare Full model to Sub Model: No significant difference between the models, but subset model has lower AIC and BIC.

anova(model_full_nig, model_sub_nig)
## Data: data_nig
## Models:
## model_sub_nig: scorepost ~ age + country + formal_study + parent_ed + spoken + (1 | ID)
## model_full_nig: scorepost ~ age + country + formal_study + parent_ed + spoken + R_write_mean + R_read_mean + (1 | ID)
##                npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)
## model_sub_nig     7 578.49 608.00 -282.25   564.49                     
## model_full_nig    9 579.35 617.28 -280.68   561.35 3.1425  2     0.2078

NIG vs IG Participants Score

It appears that there is a very slight difference in the models between IG and NIG, although the predictors were assessed previously and no significant difference was shown across groups (using T-tests and Fisher’s Tests). However, we cannot deduce a significant difference as all the coefficients have the same signs, and most of them (country, formal_study, spoken have similar significance).

stargazer(model_sub_ig, model_sub_nig, title="Results Comparison of Linear Mixed Models between Groups", align=TRUE, type="text")
## 
## Results Comparison of Linear Mixed Models between Groups
## ================================================
##                         Dependent variable:     
##                     ----------------------------
##                              scorepost          
##                          (1)            (2)     
## ------------------------------------------------
## age                     0.229          0.140    
##                        (0.236)        (0.367)   
##                                                 
## countryYes             -0.486*       -0.816**   
##                        (0.276)        (0.409)   
##                                                 
## formal_study           -0.196*       -0.383**   
##                        (0.119)        (0.178)   
##                                                 
## parent_ed               0.497*       0.579***   
##                        (0.268)        (0.178)   
##                                                 
## spoken                  0.257        0.823***   
##                        (0.185)        (0.302)   
##                                                 
## Constant                -1.147        -1.393    
##                        (1.323)        (1.858)   
##                                                 
## ------------------------------------------------
## Observations             500            500     
## Log Likelihood         -292.144      -282.247   
## Akaike Inf. Crit.      598.288        578.495   
## Bayesian Inf. Crit.    627.790        607.997   
## ================================================
## Note:                *p<0.1; **p<0.05; ***p<0.01

Comarping Performance on Items

Comparing Total Scores across Groups by Items

There is a huge improvement over pre-test! in both groups.

## Get scores per ID (0/1)
scores <- table(filtered_data[c("most.freqitems1","scorepost","group")])
scores <- as.data.frame(scores)
items  <- scores$most.freqitems1

## Total Score = Sum(correct) - Sum(incorrect)
ig_score = scores %>% subset(group == "IG" & scorepost==1) %>% select(Freq) - scores %>% subset(group == "IG" & scorepost==0) %>% select(Freq)
ig_score = data.frame(items = items[0:20], group = rep("IG", nrow(ig_score)), score = ig_score)

nig_score = scores %>% subset(group == "NIG" & scorepost==1) %>% select(Freq) - scores %>% subset(group == "IG" & scorepost==0) %>% select(Freq)
nig_score = data.frame(items = items[0:20], group = rep("NIG", nrow(nig_score)), score = nig_score)

all_scores = rbind(ig_score, nig_score)

## Plot
fig <- plot_ly(all_scores, x=~items, y=~Freq, color =~group, type = "bar") %>%
  layout(title="Total Scores on Words in NIG vs IG groups",
         yaxis=list(title="Total Scores on Words"),
         xaxis=list(title="Words/Items"))

fig

Model

IG - Intercept Only Model

First, we model the data using the ID only.

## Add Items
data_ig['items'] = filtered_data %>% subset (group == "IG") %>% select(most.freqitems1)

## Model
model_intercept_ig <- glmer(formula = scorepost ~ 1 + (1|items),
                            family = binomial(link="logit"),
                            data = data_ig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))

We take note of the AIC = 596 and BIC = 605 much lower than pre-test AIC and BIC. We also note that the coeff = 0.95 (pre-test was nearly zero), which means that the participants are now 2.58 times more likely to score (1).

summary(model_intercept_ig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ 1 + (1 | items)
##    Data: data_ig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    596.7    605.1   -296.3    592.7      498 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.0267 -1.3956  0.5761  0.6437  0.7166 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  items  (Intercept) 0.1164   0.3411  
## Number of obs: 500, groups:  items, 20
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)   0.9597     0.1280   7.499 6.41e-14 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
exp(0.9597)
## [1] 2.610913

Third, We check the Inter-coefficient correlation = 0.034 showing that only 3.4% of the variation in the scores is due to items post-test. This shows that despite the items, the participants are 2.5 more likely to score correctly (1).

performance::icc(model_intercept_ig)
## # Intraclass Correlation Coefficient
## 
##      Adjusted ICC: 0.034
##   Conditional ICC: 0.034
NIG - Intercept Only

We observe similar result for group NIG with coeff = 0.87.

## Add Items
data_nig['items'] = filtered_data %>% subset (group == "NIG") %>% select(most.freqitems1)

## Model
model_intercept_nig <- glmer(formula = scorepost ~ 1 + (1|items),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))

summary(model_intercept_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ 1 + (1 | items)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    608.9    617.3   -302.5    604.9      498 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.4483 -1.1551  0.5557  0.6421  0.9435 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  items  (Intercept) 0.3387   0.582   
## Number of obs: 500, groups:  items, 20
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)   0.8703     0.1661   5.239 1.62e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Then, We check the Inter-coefficient correlation = 0.093 which shows that 9.3% of the variation in the score is due to the difference in the items. Therefore, in NIG we see that the items still have a slight effect on the scores.

performance::icc(model_intercept_nig)
## # Intraclass Correlation Coefficient
## 
##      Adjusted ICC: 0.093
##   Conditional ICC: 0.093
NIG - Full Model

All the items have slight significance in determining score in post-test.

model_full_nig      <- glmer(formula = scorepost ~ age + country + formal_study +
                            parent_ed + spoken + R_write_mean + R_read_mean + items + (1 | items) + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_full_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ age + country + formal_study + parent_ed + spoken +  
##     R_write_mean + R_read_mean + items + (1 | items) + (1 | ID)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    553.9    676.1   -247.9    495.9      471 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -12.2250  -0.6303   0.3041   0.5748   1.8514 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0.4707   0.6861  
##  items  (Intercept) 0.0000   0.0000  
## Number of obs: 500, groups:  ID, 25; items, 20
## 
## Fixed effects:
##                Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  -1.130e+00  2.505e+00  -0.451 0.651867    
## age           9.548e-02  4.894e-01   0.195 0.845317    
## countryYes   -8.582e-01  4.435e-01  -1.935 0.053021 .  
## formal_study -4.721e-01  1.999e-01  -2.362 0.018189 *  
## parent_ed     7.638e-01  2.076e-01   3.679 0.000234 ***
## spoken        8.025e-01  4.459e-01   1.800 0.071920 .  
## R_write_mean  1.786e+00  1.005e+00   1.777 0.075556 .  
## R_read_mean  -1.050e+00  1.138e+00  -0.923 0.356168    
## itemsball    -1.750e+00  6.894e-01  -2.538 0.011137 *  
## itemsbat     -1.132e+00  6.849e-01  -1.653 0.098431 .  
## itemsbed      2.773e-01  7.432e-01   0.373 0.709112    
## itemsbird     2.773e-01  7.433e-01   0.373 0.709111    
## itemsblack   -4.865e-01  6.980e-01  -0.697 0.485853    
## itemsboat     2.379e+00  1.162e+00   2.048 0.040545 *  
## itemsbox     -1.557e-01  7.239e-01  -0.215 0.829699    
## itemsbubble  -7.089e-01  6.913e-01  -1.025 0.305142    
## itemsbus     -2.519e-01  7.080e-01  -0.356 0.721951    
## itemspat     -1.750e+00  6.894e-01  -2.538 0.011136 *  
## itemspen      9.707e-01  8.257e-01   1.176 0.239742    
## itemspenguin -1.042e-01  7.065e-01  -0.147 0.882794    
## itemspeople  -7.089e-01  6.913e-01  -1.025 0.305145    
## itemspicture -1.338e+00  6.846e-01  -1.954 0.050700 .  
## itemspig      1.464e+00  9.203e-01   1.591 0.111574    
## itemspin     -1.513e+00  6.903e-01  -2.192 0.028406 *  
## itemsplant    7.401e-06  7.224e-01   0.000 0.999992    
## itemsplay    -9.230e-01  6.871e-01  -1.343 0.179128    
## itemspot     -2.519e-01  7.080e-01  -0.356 0.721951    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation matrix not shown by default, as p = 27 > 12.
## Use print(x, correlation=TRUE)  or
##     vcov(x)        if you need it
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')

VIF for multicollinearity check: R_write and R_read are correlated as expected.

car::vif(model_full_nig)
##                  GVIF Df GVIF^(1/(2*Df))
## age          1.767131  1        1.329335
## country      1.416695  1        1.190250
## formal_study 1.347799  1        1.160947
## parent_ed    1.369350  1        1.170192
## spoken       2.432231  1        1.559561
## R_write_mean 3.360498  1        1.833166
## R_read_mean  5.058221  1        2.249049
## items        1.027823 19        1.000722
IG - Sub Model
model_sub_nig      <- glmer(formula = scorepost ~ age + country + formal_study +
                            parent_ed + spoken + items + (1 | items) + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_sub_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ age + country + formal_study + parent_ed + spoken +  
##     items + (1 | items) + (1 | ID)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    552.9    666.7   -249.5    498.9      473 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -11.0514  -0.6324   0.3052   0.5780   1.9155 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0.5953   0.7716  
##  items  (Intercept) 0.0000   0.0000  
## Number of obs: 500, groups:  ID, 25; items, 20
## 
## Fixed effects:
##                Estimate Std. Error z value Pr(>|z|)   
## (Intercept)  -1.302e+00  2.221e+00  -0.586  0.55757   
## age           1.548e-01  4.274e-01   0.362  0.71719   
## countryYes   -9.401e-01  4.750e-01  -1.979  0.04782 * 
## formal_study -4.558e-01  2.063e-01  -2.209  0.02714 * 
## parent_ed     6.796e-01  2.080e-01   3.267  0.00109 **
## spoken        9.827e-01  3.518e-01   2.794  0.00521 **
## itemsball    -1.752e+00  6.892e-01  -2.541  0.01104 * 
## itemsbat     -1.134e+00  6.848e-01  -1.655  0.09786 . 
## itemsbed      2.781e-01  7.436e-01   0.374  0.70842   
## itemsbird     2.781e-01  7.436e-01   0.374  0.70842   
## itemsblack   -4.876e-01  6.981e-01  -0.698  0.48492   
## itemsboat     2.383e+00  1.162e+00   2.051  0.04027 * 
## itemsbox     -1.550e-01  7.243e-01  -0.214  0.83057   
## itemsbubble  -7.104e-01  6.913e-01  -1.028  0.30417   
## itemsbus     -2.526e-01  7.082e-01  -0.357  0.72135   
## itemspat     -1.752e+00  6.892e-01  -2.541  0.01104 * 
## itemspen      9.737e-01  8.262e-01   1.179  0.23859   
## itemspenguin -1.055e-01  7.067e-01  -0.149  0.88137   
## itemspeople  -7.104e-01  6.913e-01  -1.028  0.30417   
## itemspicture -1.340e+00  6.845e-01  -1.957  0.05035 . 
## itemspig      1.469e+00  9.207e-01   1.595  0.11071   
## itemspin     -1.514e+00  6.902e-01  -2.194  0.02826 * 
## itemsplant    1.995e-05  7.227e-01   0.000  0.99998   
## itemsplay    -9.247e-01  6.870e-01  -1.346  0.17832   
## itemspot     -2.526e-01  7.082e-01  -0.357  0.72136   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation matrix not shown by default, as p = 25 > 12.
## Use print(x, correlation=TRUE)  or
##     vcov(x)        if you need it
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')

Comparing Models; It is ambiguous which model performs better, but we know theoratically R_write and R_read can impact the scores; but they are correlated. We can include one of them in the model and check if VIF is acceptable.

anova(model_full_nig, model_sub_nig)
## Data: data_nig
## Models:
## model_sub_nig: scorepost ~ age + country + formal_study + parent_ed + spoken + items + (1 | items) + (1 | ID)
## model_full_nig: scorepost ~ age + country + formal_study + parent_ed + spoken + R_write_mean + R_read_mean + items + (1 | items) + (1 | ID)
##                npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)
## model_sub_nig    27 552.92 666.71 -249.46   498.92                     
## model_full_nig   29 553.86 676.08 -247.93   495.86 3.0589  2     0.2167

Including R_Write_mean; We see that there isnt much of a difference between the two models.

model_sub2_nig      <- glmer(formula = scorepost ~ age + country + formal_study +
                            parent_ed + spoken + R_write_mean + items + (1 | items) + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
car::vif(model_sub2_nig)
##                  GVIF Df GVIF^(1/(2*Df))
## age          1.376175  1        1.173105
## country      1.419002  1        1.191219
## formal_study 1.301956  1        1.141033
## parent_ed    1.197457  1        1.094284
## spoken       1.965659  1        1.402020
## R_write_mean 1.889114  1        1.374450
## items        1.025945 19        1.000674
anova(model_sub2_nig, model_sub_nig)
## Data: data_nig
## Models:
## model_sub_nig: scorepost ~ age + country + formal_study + parent_ed + spoken + items + (1 | items) + (1 | ID)
## model_sub2_nig: scorepost ~ age + country + formal_study + parent_ed + spoken + R_write_mean + items + (1 | items) + (1 | ID)
##                npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)
## model_sub_nig    27 552.92 666.71 -249.46   498.92                     
## model_sub2_nig   28 552.69 670.70 -248.35   496.69 2.2258  1     0.1357

Visualize the Variation of scores intercept accross Participants and Across Items.

lattice::dotplot(ranef(model_sub2_nig, which = "ID", condVar = TRUE))
## $ID

Comarping Sound Scores

Comparing Total Scores across Groups

This method of comparison comprises of taking the total number of correct - total number of incorrect answers per sound and comparing the latter across groups (IG vs NIG).

Previously: In Pre-Test: We can see that sound 'b' is percieved correctly way more than 'p'. This finding is logical as phoneme 'p' does not exist in the arabic language and children will logically find it harder to percieve.

Post-Test: We see huge improvement in both groups, IG and NIG in differentiating ‘p’ and ‘b’ sounds.

## Get scores per group
scores <- table(filtered_data[c("soundpost","scorepost","group")])
scores <- as.data.frame(scores)
sounds  <- scores$soundpost

## Total Score = Sum(correct) - Sum(incorrect)
ig_score = scores %>% subset(group == "IG" & scorepost==1) %>% select(Freq) - scores %>% subset(group == "IG" & scorepost==0) %>% select(Freq)
ig_score = data.frame(sounds = sounds[0:2], group = rep("IG", nrow(ig_score)), score = ig_score)

nig_score = scores %>% subset(group == "NIG" & scorepost==1) %>% select(Freq) - scores %>% subset(group == "IG" & scorepost==0) %>% select(Freq)
nig_score = data.frame(sounds = sounds[0:2], group = rep("NIG", nrow(nig_score)), score = nig_score)

all_scores = rbind(ig_score, nig_score)

## Plot
fig <- plot_ly(all_scores, x=~sounds, y=~Freq, color =~group, type = "bar") %>%
  layout(title="Sounds Differentiation accross groups Post Test",
         yaxis=list(title="Total Scores on Sounds"),
         xaxis=list(title="Sounds"))

fig

We can see higher peaks of ‘b’ and low peaks of ‘p’ across both groups and across different participants.

Several participants that were low performers in the pre-test performed relatively well post-test such as Participant IG1.

## Get scores per ID (0/1)
scores <- aggregate(filtered_data$scorepost, by=list(filtered_data$ID, filtered_data$soundpost), FUN=sum)

fig <- plot_ly(scores, x=~Group.1, y=~x, color =~Group.2, type = "bar") %>%
  layout(title="Sounds Differentiation accross participants Post Test",
         yaxis=list(title="Total Scores on Sounds"),
         xaxis=list(title="Sounds"))

fig

Total Scores for participants; It seems like on average, IG participants are performing slightly better than NIG participants. However, we conducted a significance test in the beginning and showed that there is no significant difference between the groups.

total_scores <- aggregate(scores$x, by=list(scores$Group.1), FUN=sum)

fig <- plot_ly(total_scores, x=~Group.1, y=~x, type = "bar") %>%
  layout(title="Sounds Differentiation accross participants Post Test",
         yaxis=list(title="Total Scores on Sounds"),
         xaxis=list(title="Sounds"))

fig

Model

IG - Intercept Only Model

First, we model the data using the ID only.

## Add Items
data_ig['sounds'] = filtered_data %>% subset (group == "IG") %>% select(scorepost)

## Model
model_intercept_ig <- glmer(formula = scorepost ~ 1 + (1|sounds),
                            family = binomial(link="logit"),
                            data = data_ig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))

We take note of the AIC = 9 and BIC = 17.5 which are very low compared to previous (pre-test) model including sounds.

summary(model_intercept_ig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ 1 + (1 | sounds)
##    Data: data_ig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##      9.0     17.5     -2.5      5.0      498 
## 
## Scaled residuals: 
##       Min        1Q    Median        3Q       Max 
## -0.007494 -0.007494  0.004696  0.004696  0.004696 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  sounds (Intercept) 1295     35.99   
## Number of obs: 500, groups:  sounds, 2
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)   0.4681    20.3479   0.023    0.982

Third, We check the Inter-coefficient correlation = 0.997 which means nearly 100% of the variance that’s left over after including the fixed effect (1) is explained by difference in the sounds. This indicates high similarity between values from the same group (group sound b or group sound p)

performance::icc(model_intercept_ig)
## # Intraclass Correlation Coefficient
## 
##      Adjusted ICC: 0.997
##   Conditional ICC: 0.997
NIG - Intercept Only

We observe the same result for group NIG.

## Add Items
data_nig['sounds'] = filtered_data %>% subset (group == "NIG") %>% select(scorepost)

## Model
model_intercept_nig <- glmer(formula = scorepost ~ 1 + (1|sounds),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))

summary(model_intercept_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ 1 + (1 | sounds)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##      9.0     17.5     -2.5      5.0      498 
## 
## Scaled residuals: 
##       Min        1Q    Median        3Q       Max 
## -0.007151 -0.007151  0.004771  0.004771  0.004771 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  sounds (Intercept) 1306     36.14   
## Number of obs: 500, groups:  sounds, 2
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)   0.4054    22.1294   0.018    0.985

Similarly for NIG. ICC ~= 100% meaning that all the variance in the score is due to the sounds.

performance::icc(model_intercept_nig)
## # Intraclass Correlation Coefficient
## 
##      Adjusted ICC: 0.997
##   Conditional ICC: 0.997

Visualize the Variation of sounds intercepts across groups; nearly identical.

lattice::dotplot(ranef(model_intercept_nig, which = "sounds", condVar = TRUE))
## $sounds

lattice::dotplot(ranef(model_intercept_ig, which = "sounds", condVar = TRUE))
## $sounds

##### Subset Models

We can see that the p-values of all other variables is, showing that the improvements in the scores of participants is not due to the predictors in the model.

model_sub_nig      <- glmer(formula = scorepost ~ age + country + formal_study +
                            parent_ed + spoken + sounds + (1 | sounds) + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
model_sub_ig      <- glmer(formula = scorepost ~ age + country + formal_study +
                            parent_ed + spoken + sounds + (1 | sounds) + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_ig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_sub_nig)
## Warning in vcov.merMod(object, use.hessian = use.hessian): variance-covariance matrix computed from finite-difference Hessian is
## not positive definite or contains NA values: falling back to var-cov estimated from RX
## Warning in vcov.merMod(object, correlation = correlation, sigm = sig): variance-covariance matrix computed from finite-difference Hessian is
## not positive definite or contains NA values: falling back to var-cov estimated from RX
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ age + country + formal_study + parent_ed + spoken +  
##     sounds + (1 | sounds) + (1 | ID)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##     18.0     55.9      0.0      0.0      491 
## 
## Scaled residuals: 
##       Min        1Q    Median        3Q       Max 
## -1.49e-08 -1.49e-08  1.49e-08  1.49e-08  1.49e-08 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0.1192   0.3452  
##  sounds (Intercept) 0.0000   0.0000  
## Number of obs: 500, groups:  ID, 25; sounds, 2
## 
## Fixed effects:
##                Estimate Std. Error z value Pr(>|z|)
## (Intercept)  -3.957e+01  3.440e+07       0        1
## age           4.913e-07  6.720e+06       0        1
## countryYes    1.198e-06  7.484e+06       0        1
## formal_study  2.267e-07  3.174e+06       0        1
## parent_ed    -6.191e-07  3.288e+06       0        1
## spoken       -1.710e-07  5.547e+06       0        1
## sounds        7.888e+01  6.774e+06       0        1
## 
## Correlation of Fixed Effects:
##             (Intr) age    cntryY frml_s prnt_d spoken
## age         -0.949                                   
## countryYes   0.383 -0.358                            
## formal_stdy -0.224  0.211 -0.079                     
## parent_ed   -0.257  0.070 -0.206 -0.244              
## spoken      -0.316  0.156 -0.418 -0.263  0.176       
## sounds      -0.040 -0.030  0.126  0.135 -0.218 -0.204
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
summary(model_sub_ig)
## Warning in vcov.merMod(object, use.hessian = use.hessian): variance-covariance matrix computed from finite-difference Hessian is
## not positive definite or contains NA values: falling back to var-cov estimated from RX

## Warning in vcov.merMod(object, use.hessian = use.hessian): variance-covariance matrix computed from finite-difference Hessian is
## not positive definite or contains NA values: falling back to var-cov estimated from RX
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ age + country + formal_study + parent_ed + spoken +  
##     sounds + (1 | sounds) + (1 | ID)
##    Data: data_ig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##     18.0     55.9      0.0      0.0      491 
## 
## Scaled residuals: 
##       Min        1Q    Median        3Q       Max 
## -1.49e-08 -1.49e-08  1.49e-08  1.49e-08  1.49e-08 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0.1199   0.3463  
##  sounds (Intercept) 0.0000   0.0000  
## Number of obs: 500, groups:  ID, 25; sounds, 2
## 
## Fixed effects:
##                Estimate Std. Error z value Pr(>|z|)
## (Intercept)  -3.957e+01  3.698e+07       0        1
## age          -1.062e-05  6.555e+06       0        1
## countryYes    1.051e-06  7.601e+06       0        1
## formal_study -1.437e-06  3.327e+06       0        1
## parent_ed    -2.244e-06  7.418e+06       0        1
## spoken       -2.100e-06  5.169e+06       0        1
## sounds        7.888e+01  6.737e+06       0        1
## 
## Correlation of Fixed Effects:
##             (Intr) age    cntryY frml_s prnt_d spoken
## age         -0.891                                   
## countryYes   0.126 -0.057                            
## formal_stdy  0.036 -0.154  0.204                     
## parent_ed   -0.523  0.153 -0.282 -0.095              
## spoken      -0.105  0.201 -0.291 -0.063 -0.345       
## sounds      -0.061 -0.041  0.079  0.073 -0.084 -0.064
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')


Final Conclusion: Both groups IG and NIG improved in Post-test results for type-1 Words; despite age, country, formal study, etc…

Word Type 2 - Post Test

Comarping Participants’ Scores

Comparing Total Scores across Groups

This method of comparison comprises of taking the total number of correct - total number of incorrect answers per participant and comparing the latter across groups (IG vs NIG) using a T-test.

The t-test shows p-value of 0.604, which is insignificant.

## Get scores per ID (0/1)
scores <- table(filtered_data[c("ID","scoreF2")])
groups <- unique(filtered_data[c("ID","group")])

## Total Score = Sum(correct) - Sum(incorrect)
total_score <- scores[,2] - scores[,1]

## Add to dataframe
df_scores <- data.frame(group = as.factor(groups$group), total_score = total_score)

## T-test for difference in Total Participants' Scores
independentSamplesTTest(formula = total_score~group, data=df_scores)
## 
##    Welch's independent samples t-test 
## 
## Outcome variable:   total_score 
## Grouping variable:  group 
## 
## Descriptive statistics: 
##                IG   NIG
##    mean     6.560 7.200
##    std dev. 4.454 4.203
## 
## Hypotheses: 
##    null:        population means equal for both groups
##    alternative: different population means in each group
## 
## Test results: 
##    t-statistic:  -0.523 
##    degrees of freedom:  47.839 
##    p-value:  0.604 
## 
## Other information: 
##    two-sided 95% confidence interval:  [-3.103, 1.823] 
##    estimated effect size (Cohen's d):  0.148

Modeling

Prepare Data for Modeling

As we’ve noted: The features considered theoratically significant are: age, country (lived in english speaking country), formal_study, parent_ed, spoken (English Proficiency), as main features, and we include R_*_mean for now.

## Select Columns
data_model = filtered_data %>%
  dplyr::select(ID, age, country, formal_study, parent_ed, spoken,R_write_mean, R_read_mean, R_speak_mean, R_listen_mean, R_games_mean, mix, group, scorepost)

## Check Data Types
str(data_model)
## 'data.frame':    1000 obs. of  14 variables:
##  $ ID           : chr  "IG.1" "IG.1" "IG.1" "IG.1" ...
##  $ age          : int  4 4 4 4 4 4 4 4 4 4 ...
##  $ country      : chr  "No" "No" "No" "No" ...
##  $ formal_study : chr  "2" "2" "2" "2" ...
##  $ parent_ed    : num  2 2 2 2 2 2 2 2 2 2 ...
##  $ spoken       : chr  "1" "1" "1" "1" ...
##  $ R_write_mean : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_read_mean  : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_speak_mean : num  46.7 46.7 46.7 46.7 46.7 ...
##  $ R_listen_mean: num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_games_mean : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ mix          : chr  "1" "1" "1" "1" ...
##  $ group        : chr  "IG" "IG" "IG" "IG" ...
##  $ scorepost    : int  1 1 1 1 1 0 0 1 1 0 ...
## Set to Factors 
data_model$country      <- as.factor(data_model$country)
data_model$formal_study <- as.numeric(data_model$formal_study)
data_model$parent_ed    <- as.numeric(data_model$parent_ed)
data_model$spoken       <- as.numeric(data_model$spoken)
data_model$mix          <- as.factor(data_model$mix)
data_model$scorepost     <- as.factor(filtered_data$scoreF2)

## Validate Data Types have been changed
str(data_model)
## 'data.frame':    1000 obs. of  14 variables:
##  $ ID           : chr  "IG.1" "IG.1" "IG.1" "IG.1" ...
##  $ age          : int  4 4 4 4 4 4 4 4 4 4 ...
##  $ country      : Factor w/ 2 levels "No","Yes": 1 1 1 1 1 1 1 1 1 1 ...
##  $ formal_study : num  2 2 2 2 2 2 2 2 2 2 ...
##  $ parent_ed    : num  2 2 2 2 2 2 2 2 2 2 ...
##  $ spoken       : num  1 1 1 1 1 1 1 1 1 1 ...
##  $ R_write_mean : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_read_mean  : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_speak_mean : num  46.7 46.7 46.7 46.7 46.7 ...
##  $ R_listen_mean: num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_games_mean : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ mix          : Factor w/ 3 levels "0","1","2": 2 2 2 2 2 2 2 2 2 2 ...
##  $ group        : chr  "IG" "IG" "IG" "IG" ...
##  $ scorepost    : Factor w/ 2 levels "0","1": 2 2 2 2 2 2 2 2 2 2 ...
## Re-scaling Data Function (0 to 1)
rescale <- function(x){(x-min(x))/(max(x)-min(x))}
#> Re-scale R_means
data_model$R_write_mean <- rescale(data_model$R_write_mean)
data_model$R_read_mean <- rescale(data_model$R_read_mean)
data_model$R_speak_mean <- rescale(data_model$R_speak_mean)
data_model$R_listen_mean <- rescale(data_model$R_listen_mean)
data_model$R_games_mean <- rescale(data_model$R_games_mean)

## Split into two groups IG and NIG z
data_ig <- data_model %>% subset (group == "IG")  %>% select(-group) %>% subset(!(is.na(scorepost)))
data_nig <- data_model %>% subset (group == "NIG") %>% select(-group) %>% subset(!(is.na(scorepost)))

Model

IG - Intercept Only Model

First, we model the data using the ID only.

model_intercept_ig <- glmer(formula = scorepost ~ 1 + (1|ID),
                            family = binomial(link="logit"),
                            data = data_ig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))

We take note of the AIC = 487.4 and BIC = 495.3, with a 0.8994 coefficient of intercept and low std error of 0.1360. Also P-value is significant. This means that the participants are more likely to perform positively on word list-2.

summary(model_intercept_ig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ 1 + (1 | ID)
##    Data: data_ig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    487.4    495.3   -241.7    483.4      398 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.8713 -1.3394  0.5898  0.6492  0.7815 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0.1386   0.3722  
## Number of obs: 400, groups:  ID, 25
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)   0.8994     0.1360   6.614 3.73e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Third, We check the Inter-coefficient correlation = 0.04. This means that only 4% of the variation in the scores is explained by the difference in participants. Therefore, the variance in the scores is not due to the clusters (participants). This reaffirms that no participants perform remarkably better/worse than others even after post-test.

Also, this indicates that a simpler analysis can be used here, instead of linear mixed models, we can remove the random effect and use a linear model instead.

performance::icc(model_intercept_ig)
## # Intraclass Correlation Coefficient
## 
##      Adjusted ICC: 0.040
##   Conditional ICC: 0.040
NIG - Intercept Only

We observe the same result for group NIG.

model_intercept_nig <- glmer(formula = scorepost ~ 1 + (1|ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))

The participants perform similarly in NIG.

summary(model_intercept_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ 1 + (1 | ID)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    473.4    481.4   -234.7    469.4      398 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.8779 -1.3454  0.5774  0.6251  0.7580 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0.1055   0.3248  
## Number of obs: 400, groups:  ID, 25
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)   0.9930     0.1329   7.474 7.77e-14 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

In the NIG group, ICC is 0.031, which means 3.1% of the score is explained by clusters of students - there is no significant pattern of performance between participants.

performance::icc(model_intercept_nig)
## # Intraclass Correlation Coefficient
## 
##      Adjusted ICC: 0.031
##   Conditional ICC: 0.031

View Difference:

lattice::dotplot(ranef(model_intercept_nig, which = "ID", condVar = TRUE))
## $ID

IG - Full Model

No predictors are significant in predicting the score.

model_full_ig      <- glmer(formula = scorepost ~ age + country + formal_study +
                            parent_ed + spoken + R_write_mean + R_read_mean + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_ig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
summary(model_full_ig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ age + country + formal_study + parent_ed + spoken +  
##     R_write_mean + R_read_mean + (1 | ID)
##    Data: data_ig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    493.1    529.0   -237.5    475.1      391 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.3934 -1.2220  0.5863  0.6612  0.8505 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0.01015  0.1008  
## Number of obs: 400, groups:  ID, 25
## 
## Fixed effects:
##              Estimate Std. Error z value Pr(>|z|)  
## (Intercept)  -0.02425    1.41930  -0.017   0.9864  
## age          -0.02228    0.25426  -0.088   0.9302  
## countryYes   -0.19344    0.31104  -0.622   0.5340  
## formal_study -0.23506    0.12648  -1.859   0.0631 .
## parent_ed     0.59795    0.31625   1.891   0.0587 .
## spoken        0.04951    0.22875   0.216   0.8286  
## R_write_mean -0.62011    0.67045  -0.925   0.3550  
## R_read_mean   0.32579    0.81016   0.402   0.6876  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) age    cntryY frml_s prnt_d spoken R_wrt_
## age         -0.893                                          
## countryYes   0.178 -0.058                                   
## formal_stdy  0.061 -0.150  0.233                            
## parent_ed   -0.505  0.121 -0.376 -0.142                     
## spoken      -0.175  0.280 -0.177 -0.042 -0.245              
## R_write_men -0.089  0.006 -0.282  0.001  0.272 -0.121       
## R_read_mean  0.142 -0.078  0.204  0.010 -0.350 -0.168 -0.797

However, we see an issue of singularity (some variance is very close to zero) and the full model contains high VIF for spoken, R_write_mean and R_read_mean. This was expected as these variables were found to be correlated. We proceed by subsetting the model.

We repeat the prior workflow - you might notice here that some of the codes are just repetitive.

car::vif(model_full_ig)
##          age      country formal_study    parent_ed       spoken R_write_mean 
##     1.183883     1.635163     1.090241     1.858023     2.069721     4.247381 
##  R_read_mean 
##     4.587964
IG - Subset Model
model_sub_ig           <- glmer(formula = scorepost ~ age + country + formal_study +
                                  parent_ed + spoken + (1 | ID),
                                  family = binomial(link="logit"),
                                  data = data_ig,
                                  control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
summary(model_sub_ig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ age + country + formal_study + parent_ed + spoken +  
##     (1 | ID)
##    Data: data_ig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    490.2    518.1   -238.1    476.2      393 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.3173 -1.3058  0.5680  0.6696  0.8568 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0.0236   0.1536  
## Number of obs: 400, groups:  ID, 25
## 
## Fixed effects:
##              Estimate Std. Error z value Pr(>|z|)  
## (Intercept)  -0.06038    1.43408  -0.042   0.9664  
## age          -0.03586    0.25828  -0.139   0.8896  
## countryYes   -0.28153    0.30295  -0.929   0.3527  
## formal_study -0.23256    0.12922  -1.800   0.0719 .
## parent_ed     0.63993    0.29859   2.143   0.0321 *
## spoken       -0.02811    0.20673  -0.136   0.8918  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) age    cntryY frml_s prnt_d
## age         -0.898                            
## countryYes   0.158 -0.062                     
## formal_stdy  0.079 -0.162  0.259              
## parent_ed   -0.491  0.107 -0.334 -0.167       
## spoken      -0.163  0.263 -0.269 -0.043 -0.356
## Prove Significance
model_sub_no_ed_ig      <- glmer(formula = scorepost ~ age + country + formal_study + spoken + (1 | ID),
                                  family = binomial(link="logit"),
                                  data = data_ig,
                                  control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))

## Proves the 
anova(model_sub_no_ed_ig, model_sub_ig)
## Data: data_ig
## Models:
## model_sub_no_ed_ig: scorepost ~ age + country + formal_study + spoken + (1 | ID)
## model_sub_ig: scorepost ~ age + country + formal_study + parent_ed + spoken + (1 | ID)
##                    npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)  
## model_sub_no_ed_ig    6 492.52 516.47 -240.26   480.52                       
## model_sub_ig          7 490.20 518.14 -238.10   476.20 4.3205  1    0.03766 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Multicollinearity seems to be good in the subset model.

car::vif(model_sub_ig)
##          age      country formal_study    parent_ed       spoken 
##     1.176402     1.488176     1.104562     1.593679     1.618517

Compare Full model to Sub Model: It is ambiguous which model performs better; however sub_model has lower AIC and BIC.

anova(model_full_ig, model_sub_ig)
## Data: data_ig
## Models:
## model_sub_ig: scorepost ~ age + country + formal_study + parent_ed + spoken + (1 | ID)
## model_full_ig: scorepost ~ age + country + formal_study + parent_ed + spoken + R_write_mean + R_read_mean + (1 | ID)
##               npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)
## model_sub_ig     7 490.20 518.14 -238.10   476.20                     
## model_full_ig    9 493.05 528.98 -237.53   475.05 1.1486  2     0.5631
NIG - Full Model
model_full_nig      <- glmer(formula = scorepost ~ age + country + formal_study +
                            parent_ed + spoken + R_write_mean + R_read_mean + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_full_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ age + country + formal_study + parent_ed + spoken +  
##     R_write_mean + R_read_mean + (1 | ID)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    476.1    512.0   -229.0    458.1      391 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.1902 -1.2089  0.4935  0.6570  0.8595 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0        0       
## Number of obs: 400, groups:  ID, 25
## 
## Fixed effects:
##               Estimate Std. Error z value Pr(>|z|)   
## (Intercept)   3.200357   1.531467   2.090  0.03664 * 
## age          -0.395867   0.300354  -1.318  0.18750   
## countryYes   -0.141625   0.284573  -0.498  0.61871   
## formal_study -0.365482   0.120829  -3.025  0.00249 **
## parent_ed    -0.093217   0.137850  -0.676  0.49890   
## spoken       -0.001887   0.287962  -0.007  0.99477   
## R_write_mean -0.876179   0.653558  -1.341  0.18004   
## R_read_mean   1.403636   0.777984   1.804  0.07120 . 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) age    cntryY frml_s prnt_d spoken R_wrt_
## age         -0.964                                          
## countryYes   0.354 -0.323                                   
## formal_stdy -0.322  0.317 -0.048                            
## parent_ed   -0.366  0.196 -0.194 -0.153                     
## spoken      -0.535  0.436 -0.385  0.024  0.207              
## R_write_men -0.118  0.098  0.085  0.036  0.265 -0.065       
## R_read_mean  0.467 -0.451  0.013 -0.232 -0.351 -0.455 -0.699
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')

The full model contains high VIF for spoken, R_write_mean and R_read_mean. This was expected as these variables were found to be correlated. We proceed by subsetting the model.

car::vif(model_full_nig)
##          age      country formal_study    parent_ed       spoken R_write_mean 
##     1.668634     1.476697     1.358870     1.296269     2.706788     3.687190 
##  R_read_mean 
##     5.844424
NIG - Subset Model

We see that formal study is significant, which might be interesting - does that mean only formal study is important in post-test (on word list 2)? What would be a theoratical reasoning behind that?

In the analysis, we need to consider the SD of the coefficient. If coeff +- SD == 0 or changes the sign of the coeff then we cannot deduce any significant relation.

model_sub_nig      <- glmer(formula = scorepost ~ age + country + formal_study +
                            parent_ed + spoken + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_sub_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ age + country + formal_study + parent_ed + spoken +  
##     (1 | ID)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    475.5    503.4   -230.7    461.5      393 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.0379 -1.2753  0.5445  0.6331  0.8851 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0        0       
## Number of obs: 400, groups:  ID, 25
## 
## Fixed effects:
##               Estimate Std. Error z value Pr(>|z|)   
## (Intercept)   1.950416   1.295238   1.506  0.13211   
## age          -0.155571   0.253475  -0.614  0.53938   
## countryYes   -0.135829   0.279203  -0.486  0.62662   
## formal_study -0.320523   0.115602  -2.773  0.00556 **
## parent_ed    -0.005004   0.126280  -0.040  0.96839   
## spoken        0.213125   0.204488   1.042  0.29730   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) age    cntryY frml_s prnt_d
## age         -0.950                            
## countryYes   0.378 -0.342                     
## formal_stdy -0.220  0.218 -0.051              
## parent_ed   -0.272  0.062 -0.195 -0.258       
## spoken      -0.302  0.130 -0.431 -0.252  0.125
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')

Multicollinearity seems to be fixed.

car::vif(model_sub_nig)
##          age      country formal_study    parent_ed       spoken 
##     1.193123     1.433234     1.256286     1.129138     1.358078

Compare Full model to Sub Model: No significant difference between the models, but subset model has lower AIC and BIC.

anova(model_full_nig, model_sub_nig)
## Data: data_nig
## Models:
## model_sub_nig: scorepost ~ age + country + formal_study + parent_ed + spoken + (1 | ID)
## model_full_nig: scorepost ~ age + country + formal_study + parent_ed + spoken + R_write_mean + R_read_mean + (1 | ID)
##                npar   AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)
## model_sub_nig     7 475.5 503.44 -230.75    461.5                     
## model_full_nig    9 476.1 512.02 -229.05    458.1 3.3982  2     0.1828

NIG vs IG Participants Score

There doesnt appear to be a significant difference between the two groups.

stargazer(model_sub_ig, model_sub_nig, title="Results Comparison of Linear Mixed Models between Groups", align=TRUE, type="text")
## 
## Results Comparison of Linear Mixed Models between Groups
## ================================================
##                         Dependent variable:     
##                     ----------------------------
##                              scorepost          
##                          (1)            (2)     
## ------------------------------------------------
## age                     -0.036        -0.156    
##                        (0.258)        (0.253)   
##                                                 
## countryYes              -0.282        -0.136    
##                        (0.303)        (0.279)   
##                                                 
## formal_study           -0.233*       -0.321***  
##                        (0.129)        (0.116)   
##                                                 
## parent_ed              0.640**        -0.005    
##                        (0.299)        (0.126)   
##                                                 
## spoken                  -0.028         0.213    
##                        (0.207)        (0.204)   
##                                                 
## Constant                -0.060         1.950    
##                        (1.434)        (1.295)   
##                                                 
## ------------------------------------------------
## Observations             400            400     
## Log Likelihood         -238.101      -230.749   
## Akaike Inf. Crit.      490.202        475.497   
## Bayesian Inf. Crit.    518.142        503.437   
## ================================================
## Note:                *p<0.1; **p<0.05; ***p<0.01

Comarping Performance on Items

Comparing Total Scores across Groups by Items

boy, bell, prince have relatively high scores; basket, pail, police have relatively lower scores. This is interesting; to see boy being correctly classified by participants (maybe because it is a highly used word at their age/in their environment?). Similarly to prince (which might occur often in kids stories), as well as bell (which might occur frequently between classes; bell rings).

## Get scores per ID (0/1)
scores <- table(filtered_data[c("lessfreqlist2items","scoreF2","group")])
scores <- as.data.frame(scores)
items  <- scores$lessfreqlist2items

## Total Score = Sum(correct) - Sum(incorrect)
ig_score = scores %>% subset(group == "IG" & scoreF2==1) %>% select(Freq) - scores %>% subset(group == "IG" & scoreF2==0) %>% select(Freq)
ig_score = data.frame(items = items[0:16], group = rep("IG", nrow(ig_score)), score = ig_score)

nig_score = scores %>% subset(group == "NIG" & scoreF2==1) %>% select(Freq) - scores %>% subset(group == "IG" & scoreF2==0) %>% select(Freq)
nig_score = data.frame(items = items[0:16], group = rep("NIG", nrow(nig_score)), score = nig_score)

all_scores = rbind(ig_score, nig_score)

## Plot
fig <- plot_ly(all_scores, x=~items, y=~Freq, color =~group, type = "bar") %>%
  layout(title="Total Scores on Words (list 2, less frequency) in NIG vs IG groups",
         yaxis=list(title="Total Scores on Words of less Freq (2)"),
         xaxis=list(title="Words/Items"))

fig

Model

IG - Intercept Only Model

First, we model the data using the ID only.

## Add Items
data_ig['items'] = filtered_data %>% subset (group == "IG") %>% select(lessfreqlist2items) %>% subset(!is.na(lessfreqlist2items))

## Model
model_intercept_ig <- glmer(formula = scorepost ~ 1 + (1|items),
                            family = binomial(link="logit"),
                            data = data_ig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))

We take note of the AIC = 487.4 and BIC = 495.4 much lower than pre-test AIC and BIC. We also note that the coeff = 0.8965 (pre-test was nearly zero), which means that the participants are now 2.45101 times more likely to score (1).

summary(model_intercept_ig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ 1 + (1 | items)
##    Data: data_ig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    487.4    495.4   -241.7    483.4      398 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.9996 -1.4140  0.6094  0.6818  0.7333 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  items  (Intercept) 0.1223   0.3497  
## Number of obs: 400, groups:  items, 16
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)   0.8965     0.1432   6.259 3.87e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
exp(0.8965)
## [1] 2.45101

Third, We check the Inter-coefficient correlation = 0.036 showing that only 3.6% of the variation in the scores is due to items post-test. This shows that despite the items, the participants are 2.5 more likely to score correctly (1).

performance::icc(model_intercept_ig)
## # Intraclass Correlation Coefficient
## 
##      Adjusted ICC: 0.036
##   Conditional ICC: 0.036
NIG - Intercept Only

We observe similar result for group NIG with coeff = 2.6363 and but sligthly higher STD (0.1576 > 0.14) which shows that it is not necessary that participants in NIG group have slightly higher odds of scoring (1) than participants in the IG group.

## Add Items
data_nig['items'] = filtered_data %>% subset (group == "NIG") %>% select(lessfreqlist2items) %>% subset(!is.na(lessfreqlist2items))

## Model
model_intercept_nig <- glmer(formula = scorepost ~ 1 + (1|items),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))

summary(model_intercept_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ 1 + (1 | items)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    470.7    478.7   -233.3    466.7      398 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.0972 -1.1915  0.5565  0.6131  0.8393 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  items  (Intercept) 0.1795   0.4237  
## Number of obs: 400, groups:  items, 16
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)   1.0088     0.1576     6.4 1.55e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
exp(1.0088)
## [1] 2.742308
NIG - Full Model

All the items have slight significance in determining score in post-test.

model_full_nig      <- glmer(formula = scorepost ~ age + country + formal_study +
                            parent_ed + spoken + R_write_mean + R_read_mean + items + (1 | items) + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_full_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ age + country + formal_study + parent_ed + spoken +  
##     R_write_mean + R_read_mean + items + (1 | items) + (1 | ID)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    477.1    576.9   -213.6    427.1      375 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.6031 -0.8868  0.4366  0.6196  1.6612 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0        0       
##  items  (Intercept) 0        0       
## Number of obs: 400, groups:  ID, 25; items, 16
## 
## Fixed effects:
##                Estimate Std. Error z value Pr(>|z|)   
## (Intercept)   3.830e+00  1.679e+00   2.281  0.02254 * 
## age          -4.304e-01  3.138e-01  -1.371  0.17022   
## countryYes   -1.517e-01  2.964e-01  -0.512  0.60867   
## formal_study -3.972e-01  1.265e-01  -3.139  0.00169 **
## parent_ed    -1.010e-01  1.430e-01  -0.706  0.48030   
## spoken       -1.700e-03  2.999e-01  -0.006  0.99548   
## R_write_mean -9.476e-01  6.802e-01  -1.393  0.16355   
## R_read_mean   1.518e+00  8.093e-01   1.876  0.06071 . 
## itemsbell     2.221e-07  7.178e-01   0.000  1.00000   
## itemsbin      6.213e-01  8.025e-01   0.774  0.43880   
## itemsblock   -2.412e-01  6.961e-01  -0.347  0.72894   
## itemsboot     1.134e-06  7.178e-01   0.000  1.00000   
## itemsboy      1.079e+00  8.993e-01   1.200  0.23030   
## itemsbride   -8.412e-01  6.626e-01  -1.270  0.20422   
## itemsbutter  -2.412e-01  6.961e-01  -0.347  0.72894   
## itemspail    -1.695e+00  6.548e-01  -2.589  0.00963 **
## itemspan     -1.191e+00  6.543e-01  -1.820  0.06882 . 
## itemspanda   -6.553e-01  6.702e-01  -0.978  0.32819   
## itemsplane   -1.019e+00  6.574e-01  -1.549  0.12127   
## itemsplates  -2.412e-01  6.961e-01  -0.347  0.72894   
## itemspolice  -8.412e-01  6.626e-01  -1.270  0.20422   
## itemspony     2.795e-01  7.502e-01   0.373  0.70945   
## itemsprince  -4.570e-01  6.809e-01  -0.671  0.50211   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation matrix not shown by default, as p = 23 > 12.
## Use print(x, correlation=TRUE)  or
##     vcov(x)        if you need it
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')

VIF for multicollinearity check: R_write and R_read are correlated as expected.

car::vif(model_full_nig)
##                  GVIF Df GVIF^(1/(2*Df))
## age          1.678891  1        1.295720
## country      1.473604  1        1.213921
## formal_study 1.367911  1        1.169577
## parent_ed    1.294126  1        1.137597
## spoken       2.698874  1        1.642825
## R_write_mean 3.678256  1        1.917878
## R_read_mean  5.820152  1        2.412499
## items        1.011061 15        1.000367
IG - Sub Model
model_sub_nig      <- glmer(formula = scorepost ~ age + country + formal_study +
                            parent_ed + spoken + items + (1 | items) + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_sub_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ age + country + formal_study + parent_ed + spoken +  
##     items + (1 | items) + (1 | ID)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    476.7    568.5   -215.4    430.7      377 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.3802 -0.9075  0.4480  0.6117  1.7239 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0.02114  0.1454  
##  items  (Intercept) 0.00000  0.0000  
## Number of obs: 400, groups:  ID, 25; items, 16
## 
## Fixed effects:
##                Estimate Std. Error z value Pr(>|z|)   
## (Intercept)   2.464e+00  1.474e+00   1.671  0.09468 . 
## age          -1.676e-01  2.718e-01  -0.617  0.53741   
## countryYes   -1.501e-01  3.002e-01  -0.500  0.61707   
## formal_study -3.493e-01  1.250e-01  -2.794  0.00521 **
## parent_ed    -5.461e-03  1.350e-01  -0.040  0.96773   
## spoken        2.339e-01  2.204e-01   1.061  0.28856   
## itemsbell     1.558e-05  7.165e-01   0.000  0.99998   
## itemsbin      6.198e-01  8.016e-01   0.773  0.43940   
## itemsblock   -2.402e-01  6.947e-01  -0.346  0.72947   
## itemsboot     1.991e-05  7.165e-01   0.000  0.99998   
## itemsboy      1.077e+00  8.986e-01   1.198  0.23084   
## itemsbride   -8.368e-01  6.611e-01  -1.266  0.20558   
## itemsbutter  -2.402e-01  6.947e-01  -0.346  0.72947   
## itemspail    -1.684e+00  6.537e-01  -2.576  0.01000 * 
## itemspan     -1.183e+00  6.528e-01  -1.813  0.06988 . 
## itemspanda   -6.521e-01  6.687e-01  -0.975  0.32947   
## itemsplane   -1.013e+00  6.560e-01  -1.544  0.12254   
## itemsplates  -2.402e-01  6.947e-01  -0.346  0.72947   
## itemspolice  -8.368e-01  6.611e-01  -1.266  0.20558   
## itemspony     2.787e-01  7.491e-01   0.372  0.70983   
## itemsprince  -4.550e-01  6.795e-01  -0.670  0.50310   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation matrix not shown by default, as p = 21 > 12.
## Use print(x, correlation=TRUE)  or
##     vcov(x)        if you need it
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')

Comparing Models; It is ambiguous which model performs better, but we know theoratically R_write and R_read can impact the scores; but they are correlated. We can include one of them in the model and check if VIF is acceptable.

anova(model_full_nig, model_sub_nig)
## Data: data_nig
## Models:
## model_sub_nig: scorepost ~ age + country + formal_study + parent_ed + spoken + items + (1 | items) + (1 | ID)
## model_full_nig: scorepost ~ age + country + formal_study + parent_ed + spoken + R_write_mean + R_read_mean + items + (1 | items) + (1 | ID)
##                npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)
## model_sub_nig    23 476.73 568.53 -215.36   430.73                     
## model_full_nig   25 477.10 576.89 -213.55   427.10 3.6279  2      0.163

Including R_Write_mean; We see that there isnt much of a difference between the two models.

model_sub2_nig      <- glmer(formula = scorepost ~ age + country + formal_study +
                            parent_ed + spoken + R_write_mean + items + (1 | items) + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
car::vif(model_sub2_nig)
##                  GVIF Df GVIF^(1/(2*Df))
## age          1.349161  1        1.161534
## country      1.459417  1        1.208063
## formal_study 1.303340  1        1.141639
## parent_ed    1.128820  1        1.062460
## spoken       2.131893  1        1.460100
## R_write_mean 1.862702  1        1.364808
## items        1.008853 15        1.000294
anova(model_sub2_nig, model_sub_nig)
## Data: data_nig
## Models:
## model_sub_nig: scorepost ~ age + country + formal_study + parent_ed + spoken + items + (1 | items) + (1 | ID)
## model_sub2_nig: scorepost ~ age + country + formal_study + parent_ed + spoken + R_write_mean + items + (1 | items) + (1 | ID)
##                npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)
## model_sub_nig    23 476.73 568.53 -215.36   430.73                     
## model_sub2_nig   24 478.71 574.51 -215.35   430.71 0.0184  1     0.8922

Visualize the Variation of scores intercept accross Participants and Across Items.

lattice::dotplot(ranef(model_sub2_nig, which = "ID", condVar = TRUE))
## $ID

Comarping Sound Scores

Comparing Total Scores across Groups

This method of comparison comprises of taking the total number of correct - total number of incorrect answers per sound and comparing the latter across groups (IG vs NIG).

Previously: In Pre-Test: We can see that sound 'b' is percieved correctly way more than 'p'. This finding is logical as phoneme 'p' does not exist in the arabic language and children will logically find it harder to percieve.

Post-Test: We see huge improvement in both groups, IG and NIG in differentiating ‘p’ and ‘b’ sounds.

Post-Test Word List 2: We Also see a huge improvement in both groups, IG and NIG in differentiating ‘p’ and ‘b’ sounds over pre-test.

## Get scores per group
scores <- table(filtered_data[c("soundF2","scoreF2","group")])
scores <- as.data.frame(scores)
sounds  <- scores$soundF2

## Total Score = Sum(correct) - Sum(incorrect)
ig_score = scores %>% subset(group == "IG" & scoreF2==1) %>% select(Freq) - scores %>% subset(group == "IG" & scoreF2==0) %>% select(Freq)
ig_score = data.frame(sounds = sounds[0:2], group = rep("IG", nrow(ig_score)), score = ig_score)

nig_score = scores %>% subset(group == "NIG" & scoreF2==1) %>% select(Freq) - scores %>% subset(group == "IG" & scoreF2==0) %>% select(Freq)
nig_score = data.frame(sounds = sounds[0:2], group = rep("NIG", nrow(nig_score)), score = nig_score)

all_scores = rbind(ig_score, nig_score)

## Plot
fig <- plot_ly(all_scores, x=~sounds, y=~Freq, color =~group, type = "bar") %>%
  layout(title="Sounds Differentiation accross groups Post Test - Words List 2",
         yaxis=list(title="Total Scores on Sounds"),
         xaxis=list(title="Sounds"))

fig

We can see higher peaks of ‘b’ and low peaks of ‘p’ across both groups and across different participants.

Several participants that were low performers in the pre-test performed relatively well post-test such as Participant IG1.

## Get scores per ID (0/1)
scores <- aggregate(filtered_data$scoreF2, by=list(filtered_data$ID, filtered_data$soundF2), FUN=sum)

fig <- plot_ly(scores, x=~Group.1, y=~x, color =~Group.2, type = "bar") %>%
  layout(title="Sounds Differentiation accross participants Post Test Word List 2",
         yaxis=list(title="Total Scores on Sounds"),
         xaxis=list(title="Sounds"))

fig

Total Scores for participants; It seems like on average, IG participants are performing slightly better than NIG participants. However, we conducted a significance test in the beginning and showed that there is no significant difference between the groups.

total_scores <- aggregate(scores$x, by=list(scores$Group.1), FUN=sum)

fig <- plot_ly(total_scores, x=~Group.1, y=~x, type = "bar") %>%
  layout(title="Sounds Differentiation accross participants Post Test",
         yaxis=list(title="Total Scores on Sounds"),
         xaxis=list(title="Sounds"))

fig

Model

IG - Intercept Only Model

First, we model the data using the Sound only. There doesnt seem to be enough variation in the scores accross sounds in IG group.

## Add Sounds
data_ig['sounds'] = filtered_data %>% subset (group == "IG") %>% select(soundF2) %>% subset(!is.na(soundF2))

## Model
model_intercept_ig <- glmer(formula = scorepost ~ 1 + (1|sounds),
                            family = binomial(link="logit"),
                            data = data_ig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_intercept_ig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ 1 + (1 | sounds)
##    Data: data_ig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    489.3    497.2   -242.6    485.3      398 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.5459 -1.5459  0.6469  0.6469  0.6469 
## 
## Random effects:
##  Groups Name        Variance  Std.Dev. 
##  sounds (Intercept) 2.566e-15 5.066e-08
## Number of obs: 400, groups:  sounds, 2
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)   0.8712     0.1096   7.946 1.92e-15 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
NIG - Intercept Only

We observe the same result for group NIG.

## Add Items
data_nig['sounds'] = filtered_data %>% subset (group == "NIG") %>% select(soundF2) %>% subset(!is.na(soundF2))

## Model
model_intercept_nig <- glmer(formula = scorepost ~ 1 + (1|sounds),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))

summary(model_intercept_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ 1 + (1 | sounds)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    469.8    477.8   -232.9    465.8      398 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.8899 -1.4220  0.5291  0.7032  0.7032 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  sounds (Intercept) 0.1022   0.3197  
## Number of obs: 400, groups:  sounds, 2
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)   0.9924     0.2534   3.917 8.97e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
performance::icc(model_intercept_nig)
## # Intraclass Correlation Coefficient
## 
##      Adjusted ICC: 0.030
##   Conditional ICC: 0.030

Visualize the Variation of sounds intercepts across groups; nearly identical.

lattice::dotplot(ranef(model_intercept_nig, which = "sounds", condVar = TRUE))
## $sounds

lattice::dotplot(ranef(model_intercept_ig, which = "sounds", condVar = TRUE))
## $sounds

##### Subset Models

We can see that the p-values of all other variables is, showing that the improvements in the scores of participants is not due to the predictors in the model.

model_sub_nig      <- glmer(formula = scorepost ~ age + country + formal_study +
                            parent_ed + spoken + sounds + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))

model_sub_ig      <- glmer(formula = scorepost ~ age + country + formal_study +
                            parent_ed + spoken + sounds + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_ig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))

stargazer(model_sub_nig, model_sub_ig, type = "text")
## 
## ================================================
##                         Dependent variable:     
##                     ----------------------------
##                              scorepost          
##                          (1)            (2)     
## ------------------------------------------------
## age                     -0.160        -0.036    
##                        (0.257)        (0.260)   
##                                                 
## countryYes              -0.139        -0.283    
##                        (0.284)        (0.304)   
##                                                 
## formal_study          -0.329***       -0.234*   
##                        (0.118)        (0.130)   
##                                                 
## parent_ed               -0.005        0.643**   
##                        (0.128)        (0.300)   
##                                                 
## spoken                  0.219         -0.028    
##                        (0.208)        (0.208)   
##                                                 
## soundsp               -0.732***       -0.297    
##                        (0.233)        (0.223)   
##                                                 
## Constant                2.372*         0.087    
##                        (1.324)        (1.446)   
##                                                 
## ------------------------------------------------
## Observations             400            400     
## Log Likelihood         -225.672      -237.210   
## Akaike Inf. Crit.      467.344        490.421   
## Bayesian Inf. Crit.    499.276        522.352   
## ================================================
## Note:                *p<0.1; **p<0.05; ***p<0.01

Final Conclusion: …



Word Type 3 - Post Test

Comarping Participants’ Scores

Comparing Total Scores across Groups

This method of comparison comprises of taking the total number of correct - total number of incorrect answers per participant and comparing the latter across groups (IG vs NIG) using a T-test.

The t-test shows p-value of 0.2, which is insignificant.

## Get scores per ID (0/1)
scores <- table(filtered_data[c("ID","scoreF3")])
groups <- unique(filtered_data[c("ID","group")])

## Total Score = Sum(correct) - Sum(incorrect)
total_score <- scores[,2] - scores[,1]

## Add to dataframe
df_scores <- data.frame(group = as.factor(groups$group), total_score = total_score)

## T-test for difference in Total Participants' Scores
independentSamplesTTest(formula = total_score~group, data=df_scores)
## 
##    Welch's independent samples t-test 
## 
## Outcome variable:   total_score 
## Grouping variable:  group 
## 
## Descriptive statistics: 
##                IG   NIG
##    mean     1.360 0.160
##    std dev. 3.040 3.555
## 
## Hypotheses: 
##    null:        population means equal for both groups
##    alternative: different population means in each group
## 
## Test results: 
##    t-statistic:  1.283 
##    degrees of freedom:  46.868 
##    p-value:  0.206 
## 
## Other information: 
##    two-sided 95% confidence interval:  [-0.682, 3.082] 
##    estimated effect size (Cohen's d):  0.363

Modeling

Prepare Data for Modeling

As we’ve noted: The features considered theoratically significant are: age, country (lived in english speaking country), formal_study, parent_ed, spoken (English Proficiency), as main features, and we include R_*_mean for now.

## Select Columns
data_model = filtered_data %>%
  dplyr::select(ID, age, country, formal_study, parent_ed, spoken,R_write_mean, R_read_mean, R_speak_mean, R_listen_mean, R_games_mean, mix, group, scorepost)

## Check Data Types
str(data_model)
## 'data.frame':    1000 obs. of  14 variables:
##  $ ID           : chr  "IG.1" "IG.1" "IG.1" "IG.1" ...
##  $ age          : int  4 4 4 4 4 4 4 4 4 4 ...
##  $ country      : chr  "No" "No" "No" "No" ...
##  $ formal_study : chr  "2" "2" "2" "2" ...
##  $ parent_ed    : num  2 2 2 2 2 2 2 2 2 2 ...
##  $ spoken       : chr  "1" "1" "1" "1" ...
##  $ R_write_mean : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_read_mean  : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_speak_mean : num  46.7 46.7 46.7 46.7 46.7 ...
##  $ R_listen_mean: num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_games_mean : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ mix          : chr  "1" "1" "1" "1" ...
##  $ group        : chr  "IG" "IG" "IG" "IG" ...
##  $ scorepost    : int  1 1 1 1 1 0 0 1 1 0 ...
## Set to Factors 
data_model$country      <- as.factor(data_model$country)
data_model$formal_study <- as.numeric(data_model$formal_study)
data_model$parent_ed    <- as.numeric(data_model$parent_ed)
data_model$spoken       <- as.numeric(data_model$spoken)
data_model$mix          <- as.factor(data_model$mix)
data_model$scorepost     <- as.factor(filtered_data$scoreF3)

## Validate Data Types have been changed
str(data_model)
## 'data.frame':    1000 obs. of  14 variables:
##  $ ID           : chr  "IG.1" "IG.1" "IG.1" "IG.1" ...
##  $ age          : int  4 4 4 4 4 4 4 4 4 4 ...
##  $ country      : Factor w/ 2 levels "No","Yes": 1 1 1 1 1 1 1 1 1 1 ...
##  $ formal_study : num  2 2 2 2 2 2 2 2 2 2 ...
##  $ parent_ed    : num  2 2 2 2 2 2 2 2 2 2 ...
##  $ spoken       : num  1 1 1 1 1 1 1 1 1 1 ...
##  $ R_write_mean : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_read_mean  : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_speak_mean : num  46.7 46.7 46.7 46.7 46.7 ...
##  $ R_listen_mean: num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_games_mean : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ mix          : Factor w/ 3 levels "0","1","2": 2 2 2 2 2 2 2 2 2 2 ...
##  $ group        : chr  "IG" "IG" "IG" "IG" ...
##  $ scorepost    : Factor w/ 2 levels "0","1": 2 1 1 2 2 2 2 2 2 2 ...
## Re-scaling Data Function (0 to 1)
rescale <- function(x){(x-min(x))/(max(x)-min(x))}
#> Re-scale R_means
data_model$R_write_mean <- rescale(data_model$R_write_mean)
data_model$R_read_mean <- rescale(data_model$R_read_mean)
data_model$R_speak_mean <- rescale(data_model$R_speak_mean)
data_model$R_listen_mean <- rescale(data_model$R_listen_mean)
data_model$R_games_mean <- rescale(data_model$R_games_mean)

## Split into two groups IG and NIG z
data_ig <- data_model %>% subset (group == "IG")  %>% select(-group) %>% subset(!(is.na(scorepost)))
data_nig <- data_model %>% subset (group == "NIG") %>% select(-group) %>% subset(!(is.na(scorepost)))

Model

IG - Intercept Only Model

First, we model the data using the ID only.

model_intercept_ig <- glmer(formula = scorepost ~ 1 + (1|ID),
                            family = binomial(link="logit"),
                            data = data_ig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')

We take note of the AIC = 487.4 and BIC = 495.3, with a 0.8994 coefficient of intercept and low std error of 0.1360. Also P-value is significant. This means that the participants are more likely to perform positively on word list-2.

summary(model_intercept_ig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ 1 + (1 | ID)
##    Data: data_ig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    345.9    353.0   -171.0    341.9      248 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.1466 -1.1466  0.8721  0.8721  0.8721 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0        0       
## Number of obs: 250, groups:  ID, 25
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)  
## (Intercept)   0.2737     0.1277   2.144   0.0321 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
performance::icc(model_intercept_ig)
## Warning: Can't compute random effect variances. Some variance components equal
##   zero. Your model may suffer from singularity (see '?lme4::isSingular'
##   and '?performance::check_singularity').
##   Solution: Respecify random structure! You may also decrease the
##   'tolerance' level to enforce the calculation of random effect variances.
## [1] NA
NIG - Intercept Only

We observe the same result for group NIG.

model_intercept_nig <- glmer(formula = scorepost ~ 1 + (1|ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))

The participants perform similarly in NIG.

summary(model_intercept_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ 1 + (1 | ID)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##      350      357     -173      346      248 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.1327 -1.0134  0.8828  0.9509  1.1029 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0.09102  0.3017  
## Number of obs: 250, groups:  ID, 25
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)  0.03266    0.14146   0.231    0.817

In the NIG group, ICC is 0.031, which means 3.1% of the score is explained by clusters of students - there is no significant pattern of performance between participants.

performance::icc(model_intercept_nig)
## # Intraclass Correlation Coefficient
## 
##      Adjusted ICC: 0.027
##   Conditional ICC: 0.027

View Difference:

lattice::dotplot(ranef(model_intercept_nig, which = "ID", condVar = TRUE))
## $ID

IG - Full Model

No predictors are significant in predicting the score.

model_full_ig      <- glmer(formula = scorepost ~ age + country + formal_study +
                            parent_ed + spoken + R_write_mean + R_read_mean + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_ig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_full_ig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ age + country + formal_study + parent_ed + spoken +  
##     R_write_mean + R_read_mean + (1 | ID)
##    Data: data_ig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    356.5    388.1   -169.2    338.5      241 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.4947 -1.0995  0.7419  0.9001  1.1649 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0        0       
## Number of obs: 250, groups:  ID, 25
## 
## Fixed effects:
##              Estimate Std. Error z value Pr(>|z|)
## (Intercept)   1.23230    1.60821   0.766    0.444
## age          -0.25472    0.28458  -0.895    0.371
## countryYes    0.13349    0.34298   0.389    0.697
## formal_study -0.05738    0.14213  -0.404    0.686
## parent_ed     0.14092    0.34509   0.408    0.683
## spoken        0.14894    0.26073   0.571    0.568
## R_write_mean -0.38910    0.75618  -0.515    0.607
## R_read_mean  -0.15586    0.92122  -0.169    0.866
## 
## Correlation of Fixed Effects:
##             (Intr) age    cntryY frml_s prnt_d spoken R_wrt_
## age         -0.893                                          
## countryYes   0.153 -0.050                                   
## formal_stdy  0.030 -0.147  0.171                            
## parent_ed   -0.539  0.161 -0.328 -0.065                     
## spoken      -0.145  0.244 -0.163 -0.070 -0.215              
## R_write_men -0.087 -0.009 -0.300  0.059  0.288 -0.124       
## R_read_mean  0.156 -0.085  0.203 -0.032 -0.368 -0.222 -0.776
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')

However, we see an issue of singularity (some variance is very close to zero) and the full model contains high VIF for spoken, R_write_mean and R_read_mean. This was expected as these variables were found to be correlated. We proceed by subsetting the model.

We repeat the prior workflow - you might notice here that some of the codes are just repetitive.

car::vif(model_full_ig)
##          age      country formal_study    parent_ed       spoken R_write_mean 
##     1.161128     1.531018     1.065586     1.768360     2.127645     4.205563 
##  R_read_mean 
##     4.666107
IG - Subset Model
model_sub_ig           <- glmer(formula = scorepost ~ age + country + formal_study +
                                  parent_ed + spoken + (1 | ID),
                                  family = binomial(link="logit"),
                                  data = data_ig,
                                  control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_sub_ig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ age + country + formal_study + parent_ed + spoken +  
##     (1 | ID)
##    Data: data_ig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    353.5    378.2   -169.8    339.5      243 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.3495 -1.0598  0.7651  0.8673  1.0065 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0        0       
## Number of obs: 250, groups:  ID, 25
## 
## Fixed effects:
##              Estimate Std. Error z value Pr(>|z|)
## (Intercept)   1.37620    1.58180   0.870    0.384
## age          -0.29606    0.28100  -1.054    0.292
## countryYes    0.06840    0.32514   0.210    0.833
## formal_study -0.04984    0.14136  -0.353    0.724
## parent_ed     0.12030    0.31842   0.378    0.706
## spoken        0.01402    0.22128   0.063    0.949
## 
## Correlation of Fixed Effects:
##             (Intr) age    cntryY frml_s prnt_d
## age         -0.897                            
## countryYes   0.135 -0.055                     
## formal_stdy  0.039 -0.148  0.200              
## parent_ed   -0.525  0.143 -0.283 -0.093       
## spoken      -0.105  0.199 -0.276 -0.054 -0.363
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
## Prove Significance
model_sub_no_ed_ig      <- glmer(formula = scorepost ~ age + country + formal_study + spoken + (1 | ID),
                                  family = binomial(link="logit"),
                                  data = data_ig,
                                  control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
## Proves the 
anova(model_sub_no_ed_ig, model_sub_ig)
## Data: data_ig
## Models:
## model_sub_no_ed_ig: scorepost ~ age + country + formal_study + spoken + (1 | ID)
## model_sub_ig: scorepost ~ age + country + formal_study + parent_ed + spoken + (1 | ID)
##                    npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)
## model_sub_no_ed_ig    6 351.68 372.81 -169.84   339.68                     
## model_sub_ig          7 353.54 378.19 -169.77   339.54 0.1428  1     0.7055

Multicollinearity seems to be good in the subset model.

car::vif(model_sub_ig)
##          age      country formal_study    parent_ed       spoken 
##     1.138302     1.383551     1.063466     1.511487     1.535665

Compare Full model to Sub Model: It is ambiguous which model performs better; however sub_model has lower AIC and BIC.

anova(model_full_ig, model_sub_ig)
## Data: data_ig
## Models:
## model_sub_ig: scorepost ~ age + country + formal_study + parent_ed + spoken + (1 | ID)
## model_full_ig: scorepost ~ age + country + formal_study + parent_ed + spoken + R_write_mean + R_read_mean + (1 | ID)
##               npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)
## model_sub_ig     7 353.54 378.19 -169.77   339.54                     
## model_full_ig    9 356.45 388.15 -169.23   338.45 1.0863  2     0.5809
NIG - Full Model
model_full_nig      <- glmer(formula = scorepost ~ age + country + formal_study +
                            parent_ed + spoken + R_write_mean + R_read_mean + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_full_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ age + country + formal_study + parent_ed + spoken +  
##     R_write_mean + R_read_mean + (1 | ID)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    347.7    379.4   -164.8    329.7      241 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.5806 -1.0018  0.6327  0.8739  1.7733 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0        0       
## Number of obs: 250, groups:  ID, 25
## 
## Fixed effects:
##              Estimate Std. Error z value Pr(>|z|)   
## (Intercept)   -1.6949     1.8357  -0.923  0.35587   
## age            0.3889     0.3627   1.072  0.28357   
## countryYes    -0.2261     0.3273  -0.691  0.48959   
## formal_study  -0.2075     0.1421  -1.461  0.14413   
## parent_ed      0.1219     0.1471   0.829  0.40709   
## spoken         0.6643     0.3369   1.972  0.04860 * 
## R_write_mean   1.7545     0.7510   2.336  0.01948 * 
## R_read_mean   -2.8124     0.9085  -3.096  0.00196 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) age    cntryY frml_s prnt_d spoken R_wrt_
## age         -0.968                                          
## countryYes   0.375 -0.350                                   
## formal_stdy -0.293  0.290 -0.122                            
## parent_ed   -0.361  0.202 -0.170 -0.150                     
## spoken      -0.593  0.501 -0.356 -0.001  0.227              
## R_write_men -0.134  0.120  0.032  0.021  0.283 -0.024       
## R_read_mean  0.496 -0.496  0.054 -0.184 -0.340 -0.490 -0.707
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')

The full model contains high VIF for spoken, R_write_mean and R_read_mean. This was expected as these variables were found to be correlated. We proceed by subsetting the model.

car::vif(model_full_nig)
##          age      country formal_study    parent_ed       spoken R_write_mean 
##     1.842310     1.427050     1.309931     1.246560     2.755155     3.788420 
##  R_read_mean 
##     5.957584
NIG - Subset Model

We see that formal study is significant, which might be interesting - does that mean only formal study is important in post-test (on word list 2)? What would be a theoratical reasoning behind that?

In the analysis, we need to consider the SD of the coefficient. If coeff +- SD == 0 or changes the sign of the coeff then we cannot deduce any significant relation.

model_sub_nig      <- glmer(formula = scorepost ~ age + country + formal_study +
                            parent_ed + spoken + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_sub_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ age + country + formal_study + parent_ed + spoken +  
##     (1 | ID)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    354.1    378.8   -170.1    340.1      243 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.2680 -1.0123  0.7886  0.9499  1.3741 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 2.5e-13  5e-07   
## Number of obs: 250, groups:  ID, 25
## 
## Fixed effects:
##              Estimate Std. Error z value Pr(>|z|)  
## (Intercept)   1.02336    1.46528   0.698    0.485  
## age          -0.14557    0.28651  -0.508    0.611  
## countryYes   -0.19841    0.31566  -0.629    0.530  
## formal_study -0.28619    0.13571  -2.109    0.035 *
## parent_ed    -0.04599    0.13682  -0.336    0.737  
## spoken        0.20600    0.23232   0.887    0.375  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) age    cntryY frml_s prnt_d
## age         -0.951                            
## countryYes   0.385 -0.351                     
## formal_stdy -0.218  0.216 -0.089              
## parent_ed   -0.271  0.063 -0.185 -0.221       
## spoken      -0.326  0.149 -0.403 -0.255  0.139
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')

Multicollinearity seems to be fixed.

car::vif(model_sub_nig)
##          age      country formal_study    parent_ed       spoken 
##     1.197082     1.402032     1.244957     1.102436     1.337918

Compare Full model to Sub Model: No significant difference between the models, but subset model has lower AIC and BIC.

anova(model_full_nig, model_sub_nig)
## Data: data_nig
## Models:
## model_sub_nig: scorepost ~ age + country + formal_study + parent_ed + spoken + (1 | ID)
## model_full_nig: scorepost ~ age + country + formal_study + parent_ed + spoken + R_write_mean + R_read_mean + (1 | ID)
##                npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)   
## model_sub_nig     7 354.10 378.75 -170.05   340.10                        
## model_full_nig    9 347.68 379.38 -164.84   329.68 10.417  2   0.005469 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

NIG vs IG Participants Score

There doesnt appear to be a significant difference between the two groups.

stargazer(model_sub_ig, model_sub_nig, title="Results Comparison of Linear Mixed Models between Groups", align=TRUE, type="text")
## 
## Results Comparison of Linear Mixed Models between Groups
## ================================================
##                         Dependent variable:     
##                     ----------------------------
##                              scorepost          
##                          (1)            (2)     
## ------------------------------------------------
## age                     -0.296        -0.146    
##                        (0.281)        (0.287)   
##                                                 
## countryYes              0.068         -0.198    
##                        (0.325)        (0.316)   
##                                                 
## formal_study            -0.050       -0.286**   
##                        (0.141)        (0.136)   
##                                                 
## parent_ed               0.120         -0.046    
##                        (0.318)        (0.137)   
##                                                 
## spoken                  0.014          0.206    
##                        (0.221)        (0.232)   
##                                                 
## Constant                1.376          1.023    
##                        (1.582)        (1.465)   
##                                                 
## ------------------------------------------------
## Observations             250            250     
## Log Likelihood         -169.770      -170.050   
## Akaike Inf. Crit.      353.541        354.100   
## Bayesian Inf. Crit.    378.191        378.751   
## ================================================
## Note:                *p<0.1; **p<0.05; ***p<0.01

Comarping Performance on Items

Comparing Total Scores across Groups by Items

boy, bell, prince have relatively high scores; basket, pail, police have relatively lower scores. This is interesting; to see boy being correctly classified by participants (maybe because it is a highly used word at their age/in their environment?). Similarly to prince (which might occur often in kids stories), as well as bell (which might occur frequently between classes; bell rings).

## Get scores per ID (0/1)
scores <- table(filtered_data[c("lessfreqlist2items","scoreF2","group")])
scores <- as.data.frame(scores)
items  <- scores$lessfreqlist2items

## Total Score = Sum(correct) - Sum(incorrect)
ig_score = scores %>% subset(group == "IG" & scoreF2==1) %>% select(Freq) - scores %>% subset(group == "IG" & scoreF2==0) %>% select(Freq)
ig_score = data.frame(items = items[0:16], group = rep("IG", nrow(ig_score)), score = ig_score)

nig_score = scores %>% subset(group == "NIG" & scoreF2==1) %>% select(Freq) - scores %>% subset(group == "IG" & scoreF2==0) %>% select(Freq)
nig_score = data.frame(items = items[0:16], group = rep("NIG", nrow(nig_score)), score = nig_score)

all_scores = rbind(ig_score, nig_score)

## Plot
fig <- plot_ly(all_scores, x=~items, y=~Freq, color =~group, type = "bar") %>%
  layout(title="Total Scores on Words (list 2, less frequency) in NIG vs IG groups",
         yaxis=list(title="Total Scores on Words of less Freq (2)"),
         xaxis=list(title="Words/Items"))

fig

Model

IG - Intercept Only Model

First, we model the data using the ID only.

## Add Items
data_ig['items'] = filtered_data %>% subset (group == "IG") %>% select(lessfreqlist2items) %>% subset(!is.na(lessfreqlist2items))
## Warning in `[<-.data.frame`(`*tmp*`, "items", value =
## structure(list(lessfreqlist2items = c("boy", : replacement element 1 has 400
## rows to replace 250 rows
## Model
model_intercept_ig <- glmer(formula = scorepost ~ 1 + (1|items),
                            family = binomial(link="logit"),
                            data = data_ig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))

We take note of the AIC = 487.4 and BIC = 495.4 much lower than pre-test AIC and BIC. We also note that the coeff = 0.8965 (pre-test was nearly zero), which means that the participants are now 2.45101 times more likely to score (1).

summary(model_intercept_ig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ 1 + (1 | items)
##    Data: data_ig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    345.8    352.8   -170.9    341.8      248 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.2723 -1.1373  0.8282  0.8793  0.9194 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  items  (Intercept) 0.04029  0.2007  
## Number of obs: 250, groups:  items, 16
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)  
## (Intercept)   0.2758     0.1379       2   0.0455 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
exp(0.8965)
## [1] 2.45101

Third, We check the Inter-coefficient correlation = 0.036 showing that only 3.6% of the variation in the scores is due to items post-test. This shows that despite the items, the participants are 2.5 more likely to score correctly (1).

performance::icc(model_intercept_ig)
## # Intraclass Correlation Coefficient
## 
##      Adjusted ICC: 0.012
##   Conditional ICC: 0.012
NIG - Intercept Only

We observe similar result for group NIG with coeff = 2.6363 and but sligthly higher STD (0.1576 > 0.14) which shows that it is not necessary that participants in NIG group have slightly higher odds of scoring (1) than participants in the IG group.

## Add Items
data_nig['items'] = filtered_data %>% subset (group == "NIG") %>% select(lessfreqlist2items) %>% subset(!is.na(lessfreqlist2items))
## Warning in `[<-.data.frame`(`*tmp*`, "items", value =
## structure(list(lessfreqlist2items = c("boy", : replacement element 1 has 400
## rows to replace 250 rows
## Model
model_intercept_nig <- glmer(formula = scorepost ~ 1 + (1|items),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))

summary(model_intercept_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ 1 + (1 | items)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    350.5    357.5   -173.2    346.5      248 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.0490 -1.0150  0.9533  0.9805  1.0179 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  items  (Intercept) 0.02015  0.142   
## Number of obs: 250, groups:  items, 16
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)  0.03226    0.13171   0.245    0.807
exp(1.0088)
## [1] 2.742308
NIG - Full Model

All the items have slight significance in determining score in post-test.

model_full_nig      <- glmer(formula = scorepost ~ age + country + formal_study +
                            parent_ed + spoken + R_write_mean + R_read_mean + items + (1 | items) + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_full_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ age + country + formal_study + parent_ed + spoken +  
##     R_write_mean + R_read_mean + items + (1 | items) + (1 | ID)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    360.2    448.3   -155.1    310.2      225 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.0011 -0.8583  0.4082  0.8449  3.0439 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0        0       
##  items  (Intercept) 0        0       
## Number of obs: 250, groups:  ID, 25; items, 16
## 
## Fixed effects:
##              Estimate Std. Error z value Pr(>|z|)   
## (Intercept)  -2.09966    2.03446  -1.032  0.30205   
## age           0.39391    0.38657   1.019  0.30821   
## countryYes   -0.29655    0.34491  -0.860  0.38991   
## formal_study -0.26808    0.15060  -1.780  0.07507 . 
## parent_ed     0.16331    0.15455   1.057  0.29064   
## spoken        0.70158    0.35591   1.971  0.04870 * 
## R_write_mean  2.10792    0.81727   2.579  0.00990 **
## R_read_mean  -3.03529    0.98283  -3.088  0.00201 **
## itemsbell     0.68283    0.75322   0.907  0.36464   
## itemsbin      0.09905    0.73891   0.134  0.89337   
## itemsblock    0.75632    0.75592   1.001  0.31705   
## itemsboot     0.27328    0.74029   0.369  0.71201   
## itemsboy      0.21390    0.74302   0.288  0.77344   
## itemsbride    0.93095    0.76776   1.213  0.22530   
## itemsbutter  -0.71117    0.76494  -0.930  0.35252   
## itemspail     0.82490    0.77093   1.070  0.28462   
## itemspan      1.59944    0.82303   1.943  0.05197 . 
## itemspanda   -0.63405    0.81138  -0.781  0.43454   
## itemsplane   -0.56521    0.76948  -0.735  0.46262   
## itemsplates   0.60570    0.76009   0.797  0.42552   
## itemspolice   0.29495    0.74504   0.396  0.69219   
## itemspony     0.89952    0.76799   1.171  0.24149   
## itemsprince  -0.06506    0.76740  -0.085  0.93244   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation matrix not shown by default, as p = 23 > 12.
## Use print(x, correlation=TRUE)  or
##     vcov(x)        if you need it
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')

VIF for multicollinearity check: R_write and R_read are correlated as expected.

car::vif(model_full_nig)
##                  GVIF Df GVIF^(1/(2*Df))
## age          1.930778  1        1.389524
## country      1.455186  1        1.206311
## formal_study 1.367372  1        1.169347
## parent_ed    1.279867  1        1.131312
## spoken       2.860048  1        1.691168
## R_write_mean 4.130102  1        2.032265
## R_read_mean  6.457261  1        2.541114
## items        1.258963 15        1.007706
IG - Sub Model
model_sub_nig      <- glmer(formula = scorepost ~ age + country + formal_study +
                            parent_ed + spoken + items + (1 | items) + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_sub_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ age + country + formal_study + parent_ed + spoken +  
##     items + (1 | items) + (1 | ID)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    366.9    447.9   -160.5    320.9      227 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.8317 -0.9031  0.4596  0.8945  1.8434 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0        0       
##  items  (Intercept) 0        0       
## Number of obs: 250, groups:  ID, 25; items, 16
## 
## Fixed effects:
##              Estimate Std. Error z value Pr(>|z|)  
## (Intercept)   0.70897    1.65134   0.429    0.668  
## age          -0.15455    0.30252  -0.511    0.609  
## countryYes   -0.26335    0.33070  -0.796    0.426  
## formal_study -0.33925    0.14346  -2.365    0.018 *
## parent_ed    -0.01921    0.14279  -0.135    0.893  
## spoken        0.26021    0.24373   1.068    0.286  
## itemsbell     0.75641    0.73299   1.032    0.302  
## itemsbin      0.28303    0.72196   0.392    0.695  
## itemsblock    0.83996    0.73522   1.142    0.253  
## itemsboot     0.25957    0.72147   0.360    0.719  
## itemsboy      0.31154    0.72320   0.431    0.667  
## itemsbride    1.09310    0.75027   1.457    0.145  
## itemsbutter  -0.57748    0.74900  -0.771    0.441  
## itemspail     0.69256    0.74644   0.928    0.354  
## itemspan      1.34064    0.79083   1.695    0.090 .
## itemspanda   -0.75945    0.78826  -0.963    0.335  
## itemsplane   -0.56786    0.75035  -0.757    0.449  
## itemsplates   0.51950    0.73900   0.703    0.482  
## itemspolice   0.24341    0.72272   0.337    0.736  
## itemspony     0.80239    0.74633   1.075    0.282  
## itemsprince  -0.14900    0.74572  -0.200    0.842  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation matrix not shown by default, as p = 21 > 12.
## Use print(x, correlation=TRUE)  or
##     vcov(x)        if you need it
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')

Comparing Models; It is ambiguous which model performs better, but we know theoratically R_write and R_read can impact the scores; but they are correlated. We can include one of them in the model and check if VIF is acceptable.

anova(model_full_nig, model_sub_nig)
## Data: data_nig
## Models:
## model_sub_nig: scorepost ~ age + country + formal_study + parent_ed + spoken + items + (1 | items) + (1 | ID)
## model_full_nig: scorepost ~ age + country + formal_study + parent_ed + spoken + R_write_mean + R_read_mean + items + (1 | items) + (1 | ID)
##                npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)   
## model_sub_nig    23 366.91 447.90 -160.45   320.91                        
## model_full_nig   25 360.22 448.25 -155.11   310.22 10.689  2   0.004775 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Including R_Write_mean; We see that there isnt much of a difference between the two models.

model_sub2_nig      <- glmer(formula = scorepost ~ age + country + formal_study +
                            parent_ed + spoken + R_write_mean + items + (1 | items) + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
car::vif(model_sub2_nig)
##                  GVIF Df GVIF^(1/(2*Df))
## age          1.428688  1        1.195277
## country      1.431915  1        1.196627
## formal_study 1.333136  1        1.154615
## parent_ed    1.119154  1        1.057901
## spoken       2.111042  1        1.452942
## R_write_mean 2.000471  1        1.414380
## items        1.157043 15        1.004874
anova(model_sub2_nig, model_sub_nig)
## Data: data_nig
## Models:
## model_sub_nig: scorepost ~ age + country + formal_study + parent_ed + spoken + items + (1 | items) + (1 | ID)
## model_sub2_nig: scorepost ~ age + country + formal_study + parent_ed + spoken + R_write_mean + items + (1 | items) + (1 | ID)
##                npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)
## model_sub_nig    23 366.91 447.90 -160.45   320.91                     
## model_sub2_nig   24 368.57 453.08 -160.28   320.57 0.3378  1     0.5611

Visualize the Variation of scores intercept accross Participants and Across Items.

lattice::dotplot(ranef(model_sub2_nig, which = "ID", condVar = TRUE))
## $ID

Comarping Sound Scores

Comparing Total Scores across Groups

This method of comparison comprises of taking the total number of correct - total number of incorrect answers per sound and comparing the latter across groups (IG vs NIG).

Previously: In Pre-Test: We can see that sound 'b' is percieved correctly way more than 'p'. This finding is logical as phoneme 'p' does not exist in the arabic language and children will logically find it harder to percieve.

Post-Test: We see huge improvement in both groups, IG and NIG in differentiating ‘p’ and ‘b’ sounds.

Post-Test Word List 2: We Also see a huge improvement in both groups, IG and NIG in differentiating ‘p’ and ‘b’ sounds over pre-test.

## Get scores per group
scores <- table(filtered_data[c("soundF2","scoreF2","group")])
scores <- as.data.frame(scores)
sounds  <- scores$soundF2

## Total Score = Sum(correct) - Sum(incorrect)
ig_score = scores %>% subset(group == "IG" & scoreF2==1) %>% select(Freq) - scores %>% subset(group == "IG" & scoreF2==0) %>% select(Freq)
ig_score = data.frame(sounds = sounds[0:2], group = rep("IG", nrow(ig_score)), score = ig_score)

nig_score = scores %>% subset(group == "NIG" & scoreF2==1) %>% select(Freq) - scores %>% subset(group == "IG" & scoreF2==0) %>% select(Freq)
nig_score = data.frame(sounds = sounds[0:2], group = rep("NIG", nrow(nig_score)), score = nig_score)

all_scores = rbind(ig_score, nig_score)

## Plot
fig <- plot_ly(all_scores, x=~sounds, y=~Freq, color =~group, type = "bar") %>%
  layout(title="Sounds Differentiation accross groups Post Test - Words List 2",
         yaxis=list(title="Total Scores on Sounds"),
         xaxis=list(title="Sounds"))

fig

We can see higher peaks of ‘b’ and low peaks of ‘p’ across both groups and across different participants.

Several participants that were low performers in the pre-test performed relatively well post-test such as Participant IG1.

## Get scores per ID (0/1)
scores <- aggregate(filtered_data$scoreF2, by=list(filtered_data$ID, filtered_data$soundF2), FUN=sum)

fig <- plot_ly(scores, x=~Group.1, y=~x, color =~Group.2, type = "bar") %>%
  layout(title="Sounds Differentiation accross participants Post Test Word List 2",
         yaxis=list(title="Total Scores on Sounds"),
         xaxis=list(title="Sounds"))

fig

Total Scores for participants; It seems like on average, IG participants are performing slightly better than NIG participants. However, we conducted a significance test in the beginning and showed that there is no significant difference between the groups.

total_scores <- aggregate(scores$x, by=list(scores$Group.1), FUN=sum)

fig <- plot_ly(total_scores, x=~Group.1, y=~x, type = "bar") %>%
  layout(title="Sounds Differentiation accross participants Post Test",
         yaxis=list(title="Total Scores on Sounds"),
         xaxis=list(title="Sounds"))

fig

Model

IG - Intercept Only Model

First, we model the data using the Sound only.

## Add Sounds
data_ig['sounds'] = filtered_data %>% subset (group == "IG") %>% select(soundF2) %>% subset(!is.na(soundF2))
## Warning in `[<-.data.frame`(`*tmp*`, "sounds", value = structure(list(soundF2 =
## c("b", : replacement element 1 has 400 rows to replace 250 rows
## Model
model_intercept_ig <- glmer(formula = scorepost ~ 1 + (1|sounds),
                            family = binomial(link="logit"),
                            data = data_ig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_intercept_ig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ 1 + (1 | sounds)
##    Data: data_ig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    345.9    353.0   -171.0    341.9      248 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.1466 -1.1466  0.8721  0.8721  0.8721 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  sounds (Intercept) 0        0       
## Number of obs: 250, groups:  sounds, 2
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)  
## (Intercept)   0.2737     0.1277   2.144   0.0321 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
NIG - Intercept Only

We observe the same result for group NIG.

## Add Items
data_nig['sounds'] = filtered_data %>% subset (group == "NIG") %>% select(soundF2) %>% subset(!is.na(soundF2))
## Warning in `[<-.data.frame`(`*tmp*`, "sounds", value = structure(list(soundF2 =
## c("b", : replacement element 1 has 400 rows to replace 250 rows
## Model
model_intercept_nig <- glmer(formula = scorepost ~ 1 + (1|sounds),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_intercept_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ 1 + (1 | sounds)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    350.5    357.6   -173.3    346.5      248 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.0161 -1.0161  0.9841  0.9841  0.9841 
## 
## Random effects:
##  Groups Name        Variance  Std.Dev. 
##  sounds (Intercept) 1.755e-17 4.189e-09
## Number of obs: 250, groups:  sounds, 2
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)   0.0320     0.1265   0.253      0.8
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
performance::icc(model_intercept_nig)
## Warning: Can't compute random effect variances. Some variance components equal
##   zero. Your model may suffer from singularity (see '?lme4::isSingular'
##   and '?performance::check_singularity').
##   Solution: Respecify random structure! You may also decrease the
##   'tolerance' level to enforce the calculation of random effect variances.
## [1] NA

Visualize the Variation of sounds intercepts across groups; nearly identical.

lattice::dotplot(ranef(model_intercept_nig, which = "sounds", condVar = TRUE))
## $sounds

lattice::dotplot(ranef(model_intercept_ig, which = "sounds", condVar = TRUE))
## $sounds

##### Subset Models

We can see that the p-values of all other variables is, showing that the improvements in the scores of participants is not due to the predictors in the model.

model_sub_nig      <- glmer(formula = scorepost ~ age + country + formal_study +
                            parent_ed + spoken + sounds + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
model_sub_ig      <- glmer(formula = scorepost ~ age + country + formal_study +
                            parent_ed + spoken + sounds + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_ig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
stargazer(model_sub_nig, model_sub_ig, type = "text")
## 
## ================================================
##                         Dependent variable:     
##                     ----------------------------
##                              scorepost          
##                          (1)            (2)     
## ------------------------------------------------
## age                     -0.161        -0.295    
##                        (0.289)        (0.281)   
##                                                 
## countryYes              -0.191         0.063    
##                        (0.316)        (0.328)   
##                                                 
## formal_study           -0.286**       -0.048    
##                        (0.136)        (0.142)   
##                                                 
## parent_ed               -0.044         0.122    
##                        (0.137)        (0.319)   
##                                                 
## spoken                  0.199          0.013    
##                        (0.233)        (0.222)   
##                                                 
## soundsp                 -0.112        -0.031    
##                        (0.259)        (0.263)   
##                                                 
## Constant                1.153          1.383    
##                        (1.497)        (1.583)   
##                                                 
## ------------------------------------------------
## Observations             250            250     
## Log Likelihood         -169.958      -169.763   
## Akaike Inf. Crit.      355.915        355.527   
## Bayesian Inf. Crit.    384.087        383.699   
## ================================================
## Note:                *p<0.1; **p<0.05; ***p<0.01


Word Type 3 - Post Test

Comarping Participants’ Scores

Comparing Total Scores across Groups

This method of comparison comprises of taking the total number of correct - total number of incorrect answers per participant and comparing the latter across groups (IG vs NIG) using a T-test.

The t-test shows p-value of 0.2, which is insignificant.

## Get scores per ID (0/1)
scores <- table(filtered_data[c("ID","scoreF3")])
groups <- unique(filtered_data[c("ID","group")])

## Total Score = Sum(correct) - Sum(incorrect)
total_score <- scores[,2] - scores[,1]

## Add to dataframe
df_scores <- data.frame(group = as.factor(groups$group), total_score = total_score)

## T-test for difference in Total Participants' Scores
independentSamplesTTest(formula = total_score~group, data=df_scores)
## 
##    Welch's independent samples t-test 
## 
## Outcome variable:   total_score 
## Grouping variable:  group 
## 
## Descriptive statistics: 
##                IG   NIG
##    mean     1.360 0.160
##    std dev. 3.040 3.555
## 
## Hypotheses: 
##    null:        population means equal for both groups
##    alternative: different population means in each group
## 
## Test results: 
##    t-statistic:  1.283 
##    degrees of freedom:  46.868 
##    p-value:  0.206 
## 
## Other information: 
##    two-sided 95% confidence interval:  [-0.682, 3.082] 
##    estimated effect size (Cohen's d):  0.363

Modeling

Prepare Data for Modeling

As we’ve noted: The features considered theoratically significant are: age, country (lived in english speaking country), formal_study, parent_ed, spoken (English Proficiency), as main features, and we include R_*_mean for now.

## Select Columns
data_model = filtered_data %>%
  dplyr::select(ID, age, country, formal_study, parent_ed, spoken,R_write_mean, R_read_mean, R_speak_mean, R_listen_mean, R_games_mean, mix, group, scorepost)

## Check Data Types
str(data_model)
## 'data.frame':    1000 obs. of  14 variables:
##  $ ID           : chr  "IG.1" "IG.1" "IG.1" "IG.1" ...
##  $ age          : int  4 4 4 4 4 4 4 4 4 4 ...
##  $ country      : chr  "No" "No" "No" "No" ...
##  $ formal_study : chr  "2" "2" "2" "2" ...
##  $ parent_ed    : num  2 2 2 2 2 2 2 2 2 2 ...
##  $ spoken       : chr  "1" "1" "1" "1" ...
##  $ R_write_mean : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_read_mean  : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_speak_mean : num  46.7 46.7 46.7 46.7 46.7 ...
##  $ R_listen_mean: num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_games_mean : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ mix          : chr  "1" "1" "1" "1" ...
##  $ group        : chr  "IG" "IG" "IG" "IG" ...
##  $ scorepost    : int  1 1 1 1 1 0 0 1 1 0 ...
## Set to Factors 
data_model$country      <- as.factor(data_model$country)
data_model$formal_study <- as.numeric(data_model$formal_study)
data_model$parent_ed    <- as.numeric(data_model$parent_ed)
data_model$spoken       <- as.numeric(data_model$spoken)
data_model$mix          <- as.factor(data_model$mix)
data_model$scorepost     <- as.factor(filtered_data$scoreF3)

## Validate Data Types have been changed
str(data_model)
## 'data.frame':    1000 obs. of  14 variables:
##  $ ID           : chr  "IG.1" "IG.1" "IG.1" "IG.1" ...
##  $ age          : int  4 4 4 4 4 4 4 4 4 4 ...
##  $ country      : Factor w/ 2 levels "No","Yes": 1 1 1 1 1 1 1 1 1 1 ...
##  $ formal_study : num  2 2 2 2 2 2 2 2 2 2 ...
##  $ parent_ed    : num  2 2 2 2 2 2 2 2 2 2 ...
##  $ spoken       : num  1 1 1 1 1 1 1 1 1 1 ...
##  $ R_write_mean : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_read_mean  : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_speak_mean : num  46.7 46.7 46.7 46.7 46.7 ...
##  $ R_listen_mean: num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_games_mean : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ mix          : Factor w/ 3 levels "0","1","2": 2 2 2 2 2 2 2 2 2 2 ...
##  $ group        : chr  "IG" "IG" "IG" "IG" ...
##  $ scorepost    : Factor w/ 2 levels "0","1": 2 1 1 2 2 2 2 2 2 2 ...
## Re-scaling Data Function (0 to 1)
rescale <- function(x){(x-min(x))/(max(x)-min(x))}
#> Re-scale R_means
data_model$R_write_mean <- rescale(data_model$R_write_mean)
data_model$R_read_mean <- rescale(data_model$R_read_mean)
data_model$R_speak_mean <- rescale(data_model$R_speak_mean)
data_model$R_listen_mean <- rescale(data_model$R_listen_mean)
data_model$R_games_mean <- rescale(data_model$R_games_mean)

## Split into two groups IG and NIG z
data_ig <- data_model %>% subset (group == "IG")  %>% select(-group) %>% subset(!(is.na(scorepost)))
data_nig <- data_model %>% subset (group == "NIG") %>% select(-group) %>% subset(!(is.na(scorepost)))

Model

IG - Intercept Only Model

First, we model the data using the ID only. The singularity issue could be due to less and less amount of data to model.

model_intercept_ig <- glmer(formula = scorepost ~ 1 + (1|ID),
                            family = binomial(link="logit"),
                            data = data_ig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')

We take note of the AIC = 345.9 and BIC = 353.0, with a 0.2737 coefficient of intercept which means that the participants are 1.3 times more likely to score (1).

summary(model_intercept_ig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ 1 + (1 | ID)
##    Data: data_ig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    345.9    353.0   -171.0    341.9      248 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.1466 -1.1466  0.8721  0.8721  0.8721 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0        0       
## Number of obs: 250, groups:  ID, 25
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)  
## (Intercept)   0.2737     0.1277   2.144   0.0321 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
NIG - Intercept Only

We observe the same result for group NIG.

model_intercept_nig <- glmer(formula = scorepost ~ 1 + (1|ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))

In the NIG group, the performance of the students is less likely to be (1) and the p-value is not significant.

summary(model_intercept_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ 1 + (1 | ID)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##      350      357     -173      346      248 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.1327 -1.0134  0.8828  0.9509  1.1029 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0.09102  0.3017  
## Number of obs: 250, groups:  ID, 25
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)  0.03266    0.14146   0.231    0.817
exp(0.032)
## [1] 1.032518

In the NIG group, ICC is 0.027, which means 2.7% of the score is explained by clusters of students - there is no significant pattern of performance between participants.

performance::icc(model_intercept_nig)
## # Intraclass Correlation Coefficient
## 
##      Adjusted ICC: 0.027
##   Conditional ICC: 0.027

View Difference:

lattice::dotplot(ranef(model_intercept_nig, which = "ID", condVar = TRUE))
## $ID

IG - Subset Model
model_sub_ig           <- glmer(formula = scorepost ~ age + country + formal_study +
                                  parent_ed + spoken + (1 | ID),
                                  family = binomial(link="logit"),
                                  data = data_ig,
                                  control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_sub_ig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ age + country + formal_study + parent_ed + spoken +  
##     (1 | ID)
##    Data: data_ig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    353.5    378.2   -169.8    339.5      243 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.3495 -1.0598  0.7651  0.8673  1.0065 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0        0       
## Number of obs: 250, groups:  ID, 25
## 
## Fixed effects:
##              Estimate Std. Error z value Pr(>|z|)
## (Intercept)   1.37620    1.58180   0.870    0.384
## age          -0.29606    0.28100  -1.054    0.292
## countryYes    0.06840    0.32514   0.210    0.833
## formal_study -0.04984    0.14136  -0.353    0.724
## parent_ed     0.12030    0.31842   0.378    0.706
## spoken        0.01402    0.22128   0.063    0.949
## 
## Correlation of Fixed Effects:
##             (Intr) age    cntryY frml_s prnt_d
## age         -0.897                            
## countryYes   0.135 -0.055                     
## formal_stdy  0.039 -0.148  0.200              
## parent_ed   -0.525  0.143 -0.283 -0.093       
## spoken      -0.105  0.199 -0.276 -0.054 -0.363
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
## Prove Significance
model_sub_no_ed_ig      <- glmer(formula = scorepost ~ age + country + formal_study + spoken + (1 | ID),
                                  family = binomial(link="logit"),
                                  data = data_ig,
                                  control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
## Proves the 
anova(model_sub_no_ed_ig, model_sub_ig)
## Data: data_ig
## Models:
## model_sub_no_ed_ig: scorepost ~ age + country + formal_study + spoken + (1 | ID)
## model_sub_ig: scorepost ~ age + country + formal_study + parent_ed + spoken + (1 | ID)
##                    npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)
## model_sub_no_ed_ig    6 351.68 372.81 -169.84   339.68                     
## model_sub_ig          7 353.54 378.19 -169.77   339.54 0.1428  1     0.7055

Multicollinearity seems to be good in the subset model.

car::vif(model_sub_ig)
##          age      country formal_study    parent_ed       spoken 
##     1.138302     1.383551     1.063466     1.511487     1.535665

Compare Full model to Sub Model: It is ambiguous which model performs better; however sub_model has lower AIC and BIC.

anova(model_full_ig, model_sub_ig)
## Data: data_ig
## Models:
## model_sub_ig: scorepost ~ age + country + formal_study + parent_ed + spoken + (1 | ID)
## model_full_ig: scorepost ~ age + country + formal_study + parent_ed + spoken + R_write_mean + R_read_mean + (1 | ID)
##               npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)
## model_sub_ig     7 353.54 378.19 -169.77   339.54                     
## model_full_ig    9 356.45 388.15 -169.23   338.45 1.0863  2     0.5809
NIG - Subset Model

We see that formal study is significant, which might be interesting - does that mean only formal study is important in post-test (on word list 2)? What would be a theoratical reasoning behind that?

In the analysis, we need to consider the SD of the coefficient. If coeff +- SD == 0 or changes the sign of the coeff then we cannot deduce any significant relation.

model_sub_nig      <- glmer(formula = scorepost ~ age + country + formal_study +
                            parent_ed + spoken + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_sub_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ age + country + formal_study + parent_ed + spoken +  
##     (1 | ID)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    354.1    378.8   -170.1    340.1      243 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.2680 -1.0123  0.7886  0.9499  1.3741 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 2.5e-13  5e-07   
## Number of obs: 250, groups:  ID, 25
## 
## Fixed effects:
##              Estimate Std. Error z value Pr(>|z|)  
## (Intercept)   1.02336    1.46528   0.698    0.485  
## age          -0.14557    0.28651  -0.508    0.611  
## countryYes   -0.19841    0.31566  -0.629    0.530  
## formal_study -0.28619    0.13571  -2.109    0.035 *
## parent_ed    -0.04599    0.13682  -0.336    0.737  
## spoken        0.20600    0.23232   0.887    0.375  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) age    cntryY frml_s prnt_d
## age         -0.951                            
## countryYes   0.385 -0.351                     
## formal_stdy -0.218  0.216 -0.089              
## parent_ed   -0.271  0.063 -0.185 -0.221       
## spoken      -0.326  0.149 -0.403 -0.255  0.139
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')

Multicollinearity seems to be fixed.

car::vif(model_sub_nig)
##          age      country formal_study    parent_ed       spoken 
##     1.197082     1.402032     1.244957     1.102436     1.337918

NIG vs IG Participants Score

There doesnt appear to be a significant difference between the two groups.

stargazer(model_sub_ig, model_sub_nig, title="Results Comparison of Linear Mixed Models between Groups", align=TRUE, type="text")
## 
## Results Comparison of Linear Mixed Models between Groups
## ================================================
##                         Dependent variable:     
##                     ----------------------------
##                              scorepost          
##                          (1)            (2)     
## ------------------------------------------------
## age                     -0.296        -0.146    
##                        (0.281)        (0.287)   
##                                                 
## countryYes              0.068         -0.198    
##                        (0.325)        (0.316)   
##                                                 
## formal_study            -0.050       -0.286**   
##                        (0.141)        (0.136)   
##                                                 
## parent_ed               0.120         -0.046    
##                        (0.318)        (0.137)   
##                                                 
## spoken                  0.014          0.206    
##                        (0.221)        (0.232)   
##                                                 
## Constant                1.376          1.023    
##                        (1.582)        (1.465)   
##                                                 
## ------------------------------------------------
## Observations             250            250     
## Log Likelihood         -169.770      -170.050   
## Akaike Inf. Crit.      353.541        354.100   
## Bayesian Inf. Crit.    378.191        378.751   
## ================================================
## Note:                *p<0.1; **p<0.05; ***p<0.01

Comarping Performance on Items

Comparing Total Scores across Groups by Items

We see that the performances have dropped on infrequent words…

## Get scores per ID (0/1)
scores <- table(filtered_data[c("infreqlist3items","scoreF3","group")])
scores <- as.data.frame(scores)
items  <- scores$infreqlist3items

## Total Score = Sum(correct) - Sum(incorrect)
ig_score = scores %>% subset(group == "IG" & scoreF3==1) %>% select(Freq) - scores %>% subset(group == "IG" & scoreF3==0) %>% select(Freq)
ig_score = data.frame(items = items[0:10], group = rep("IG", nrow(ig_score)), score = ig_score)

nig_score = scores %>% subset(group == "NIG" & scoreF3==1) %>% select(Freq) - scores %>% subset(group == "IG" & scoreF3==0) %>% select(Freq)
nig_score = data.frame(items = items[0:10], group = rep("NIG", nrow(nig_score)), score = nig_score)

all_scores = rbind(ig_score, nig_score)

## Plot
fig <- plot_ly(all_scores, x=~items, y=~Freq, color =~group, type = "bar") %>%
  layout(title="Total Scores on Words (list 3, less frequency) in NIG vs IG groups",
         yaxis=list(title="Total Scores on Words of less Freq (3)"),
         xaxis=list(title="Words/Items"))

fig

Model

IG - Intercept Only Model

First, we model the data using the ID only.

## Add Items
data_ig['items'] = filtered_data %>% subset (group == "IG") %>% select(infreqlist3items) %>% subset(!is.na(infreqlist3items))

## Model
model_intercept_ig <- glmer(formula = scorepost ~ 1 + (1|items),
                            family = binomial(link="logit"),
                            data = data_ig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')

We take note of the AIC = 345.9 and BIC = 353.0. We also note that the coeff = 0.2737 (pre-test was nearly zero), which means that the participants are now 1.31482 times more likely to score (1).

summary(model_intercept_ig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ 1 + (1 | items)
##    Data: data_ig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    345.9    353.0   -171.0    341.9      248 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.1466 -1.1466  0.8721  0.8721  0.8721 
## 
## Random effects:
##  Groups Name        Variance  Std.Dev. 
##  items  (Intercept) 1.326e-17 3.642e-09
## Number of obs: 250, groups:  items, 10
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)  
## (Intercept)   0.2737     0.1277   2.144   0.0321 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
exp(0.2737)
## [1] 1.31482
NIG - Intercept Only

We observe similar result as word list 2- the performance drops from IG group to NIG for word list 3.

## Add Items
data_nig['items'] = filtered_data %>% subset (group == "NIG") %>% select(infreqlist3items) %>% subset(!is.na(infreqlist3items))

## Model
model_intercept_nig <- glmer(formula = scorepost ~ 1 + (1|items),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))

summary(model_intercept_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ 1 + (1 | items)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    342.9    349.9   -169.4    338.9      248 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.7381 -0.8793  0.5753  0.9160  1.3438 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  items  (Intercept) 0.3309   0.5753  
## Number of obs: 250, groups:  items, 10
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)  0.04094    0.22457   0.182    0.855
exp(0.04094)
## [1] 1.04179
IG - Sub Model
model_sub_nig      <- glmer(formula = scorepost ~ age + country + formal_study +
                            parent_ed + spoken + items + (1 | items) + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_sub_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ age + country + formal_study + parent_ed + spoken +  
##     items + (1 | items) + (1 | ID)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    344.1    404.0   -155.1    310.1      233 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.3233 -0.8533  0.2850  0.8487  2.0897 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0.05279  0.2298  
##  items  (Intercept) 0.00000  0.0000  
## Number of obs: 250, groups:  ID, 25; items, 10
## 
## Fixed effects:
##                Estimate Std. Error z value Pr(>|z|)  
## (Intercept)   1.548e+00  1.699e+00   0.911   0.3621  
## age          -1.647e-01  3.225e-01  -0.511   0.6097  
## countryYes   -2.263e-01  3.561e-01  -0.636   0.5251  
## formal_study -3.272e-01  1.539e-01  -2.126   0.0335 *
## parent_ed    -5.239e-02  1.534e-01  -0.341   0.7327  
## spoken        2.347e-01  2.616e-01   0.897   0.3696  
## itemsbeaver  -1.408e+00  6.189e-01  -2.276   0.0229 *
## itemsblend   -1.718e-01  5.867e-01  -0.293   0.7696  
## itemsbomb     3.187e-05  5.904e-01   0.000   1.0000  
## itemsbulb    -8.480e-01  5.912e-01  -1.434   0.1515  
## itemspaw     -1.025e+00  5.974e-01  -1.716   0.0862 .
## itemspeacock -1.718e-01  5.867e-01  -0.293   0.7696  
## itemspearl    1.642e+00  7.513e-01   2.186   0.0288 *
## itemspears   -1.025e+00  5.974e-01  -1.716   0.0862 .
## itemspit     -5.079e-01  5.851e-01  -0.868   0.3853  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation matrix not shown by default, as p = 15 > 12.
## Use print(x, correlation=TRUE)  or
##     vcov(x)        if you need it
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')

Visualize the Variation of scores intercept accross Participants and Across Items.

lattice::dotplot(ranef(model_sub_nig, which = "ID", condVar = TRUE))
## $ID

Comarping Sound Scores

Comparing Total Scores across Groups

This method of comparison comprises of taking the total number of correct - total number of incorrect answers per sound and comparing the latter across groups (IG vs NIG).

Previously: In Pre-Test: We can see that sound 'b' is percieved correctly way more than 'p'. This finding is logical as phoneme 'p' does not exist in the arabic language and children will logically find it harder to percieve.

Post-Test: We see huge improvement in both groups, IG and NIG in differentiating ‘p’ and ‘b’ sounds.

Post-Test Word List 2: We Also see a huge improvement in both groups, IG and NIG in differentiating ‘p’ and ‘b’ sounds over pre-test.

Post-Test Word List 3: We see that IG Group Performs better than NIG group, and both groups perform better with p sounds.

## Get scores per group
scores <- table(filtered_data[c("soundF3","scoreF3","group")])
scores <- as.data.frame(scores)
sounds  <- scores$soundF3

## Total Score = Sum(correct) - Sum(incorrect)
ig_score = scores %>% subset(group == "IG" & scoreF3==1) %>% select(Freq) - scores %>% subset(group == "IG" & scoreF3==0) %>% select(Freq)
ig_score = data.frame(sounds = sounds[0:2], group = rep("IG", nrow(ig_score)), score = ig_score)

nig_score = scores %>% subset(group == "NIG" & scoreF3==1) %>% select(Freq) - scores %>% subset(group == "IG" & scoreF3==0) %>% select(Freq)
nig_score = data.frame(sounds = sounds[0:2], group = rep("NIG", nrow(nig_score)), score = nig_score)

all_scores = rbind(ig_score, nig_score)

## Plot
fig <- plot_ly(all_scores, x=~sounds, y=~Freq, color =~group, type = "bar") %>%
  layout(title="Sounds Differentiation accross groups Post Test - Words List 3",
         yaxis=list(title="Total Scores on Sounds"),
         xaxis=list(title="Sounds"))

fig

We can see higher peaks of ‘b’ and low peaks of ‘p’ across both groups and across different participants.

Several participants that were low performers in the pre-test performed relatively well post-test such as Participant IG1.

## Get scores per ID (0/1)
scores <- aggregate(filtered_data$scoreF3, by=list(filtered_data$ID, filtered_data$soundF3), FUN=sum)

fig <- plot_ly(scores, x=~Group.1, y=~x, color =~Group.2, type = "bar") %>%
  layout(title="Sounds Differentiation accross participants Post Test Word List 3",
         yaxis=list(title="Total Scores on Sounds"),
         xaxis=list(title="Sounds"))

fig

Total Scores for participants…

total_scores <- aggregate(scores$x, by=list(scores$Group.1), FUN=sum)

fig <- plot_ly(total_scores, x=~Group.1, y=~x, type = "bar") %>%
  layout(title="Sounds Differentiation accross participants Post Test Word List 3",
         yaxis=list(title="Total Scores on Sounds"),
         xaxis=list(title="Sounds"))

fig

Model

IG - Intercept Only Model

First, we model the data using the Sound only. There doesnt seem to be enough variation in the scores accross sounds in IG group.

## Add Sounds
data_ig['sounds'] = filtered_data %>% subset (group == "IG") %>% select(soundF3) %>% subset(!is.na(soundF3))

## Model
model_intercept_ig <- glmer(formula = scorepost ~ 1 + (1|sounds),
                            family = binomial(link="logit"),
                            data = data_ig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_intercept_ig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ 1 + (1 | sounds)
##    Data: data_ig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    345.9    353.0   -171.0    341.9      248 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.1466 -1.1466  0.8721  0.8721  0.8721 
## 
## Random effects:
##  Groups Name        Variance  Std.Dev.
##  sounds (Intercept) 1.339e-17 3.66e-09
## Number of obs: 250, groups:  sounds, 2
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)  
## (Intercept)   0.2737     0.1277   2.144   0.0321 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
NIG - Intercept Only

We observe the same result for group NIG.

## Add Items
data_nig['sounds'] = filtered_data %>% subset (group == "NIG") %>% select(soundF3) %>% subset(!is.na(soundF3))

## Model
model_intercept_nig <- glmer(formula = scorepost ~ 1 + (1|sounds),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_intercept_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ 1 + (1 | sounds)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    350.5    357.6   -173.3    346.5      248 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.0161 -1.0161  0.9841  0.9841  0.9841 
## 
## Random effects:
##  Groups Name        Variance  Std.Dev. 
##  sounds (Intercept) 7.629e-16 2.762e-08
## Number of obs: 250, groups:  sounds, 2
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)   0.0320     0.1265   0.253      0.8
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
stargazer(model_intercept_ig, model_intercept_nig, type = "text")
## 
## ================================================
##                         Dependent variable:     
##                     ----------------------------
##                              scorepost          
##                          (1)            (2)     
## ------------------------------------------------
## Constant               0.274**         0.032    
##                        (0.128)        (0.127)   
##                                                 
## ------------------------------------------------
## Observations             250            250     
## Log Likelihood         -170.968      -173.255   
## Akaike Inf. Crit.      345.935        350.510   
## Bayesian Inf. Crit.    352.978        357.553   
## ================================================
## Note:                *p<0.1; **p<0.05; ***p<0.01

Final Conclusion: IG performs better than NIG with word list - 3.



Word Type 4 - Post Test

Comarping Participants’ Scores

Comparing Total Scores across Groups

This method of comparison comprises of taking the total number of correct - total number of incorrect answers per participant and comparing the latter across groups (IG vs NIG) using a T-test.

The t-test shows p-value of 0.4, which is insignificant.

## Get scores per ID (0/1)
scores <- table(filtered_data[c("ID","novelscore4")])
groups <- unique(filtered_data[c("ID","group")])

## Total Score = Sum(correct) - Sum(incorrect)
total_score <- scores[,2] - scores[,1]

## Add to dataframe
df_scores <- data.frame(group = as.factor(groups$group), total_score = total_score)

## T-test for difference in Total Participants' Scores
independentSamplesTTest(formula = total_score~group, data=df_scores)
## 
##    Welch's independent samples t-test 
## 
## Outcome variable:   total_score 
## Grouping variable:  group 
## 
## Descriptive statistics: 
##                IG   NIG
##    mean     0.560 0.880
##    std dev. 1.583 1.301
## 
## Hypotheses: 
##    null:        population means equal for both groups
##    alternative: different population means in each group
## 
## Test results: 
##    t-statistic:  -0.781 
##    degrees of freedom:  46.265 
##    p-value:  0.439 
## 
## Other information: 
##    two-sided 95% confidence interval:  [-1.145, 0.505] 
##    estimated effect size (Cohen's d):  0.221

Modeling

Prepare Data for Modeling

As we’ve noted: The features considered theoratically significant are: age, country (lived in english speaking country), formal_study, parent_ed, spoken (English Proficiency), as main features, and we include R_*_mean for now.

## Select Columns
data_model = filtered_data %>%
  dplyr::select(ID, age, country, formal_study, parent_ed, spoken,R_write_mean, R_read_mean, R_speak_mean, R_listen_mean, R_games_mean, mix, group, scorepost)

## Check Data Types
str(data_model)
## 'data.frame':    1000 obs. of  14 variables:
##  $ ID           : chr  "IG.1" "IG.1" "IG.1" "IG.1" ...
##  $ age          : int  4 4 4 4 4 4 4 4 4 4 ...
##  $ country      : chr  "No" "No" "No" "No" ...
##  $ formal_study : chr  "2" "2" "2" "2" ...
##  $ parent_ed    : num  2 2 2 2 2 2 2 2 2 2 ...
##  $ spoken       : chr  "1" "1" "1" "1" ...
##  $ R_write_mean : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_read_mean  : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_speak_mean : num  46.7 46.7 46.7 46.7 46.7 ...
##  $ R_listen_mean: num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_games_mean : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ mix          : chr  "1" "1" "1" "1" ...
##  $ group        : chr  "IG" "IG" "IG" "IG" ...
##  $ scorepost    : int  1 1 1 1 1 0 0 1 1 0 ...
## Set to Factors 
data_model$country      <- as.factor(data_model$country)
data_model$formal_study <- as.numeric(data_model$formal_study)
data_model$parent_ed    <- as.numeric(data_model$parent_ed)
data_model$spoken       <- as.numeric(data_model$spoken)
data_model$mix          <- as.factor(data_model$mix)
data_model$scorepost     <- as.factor(filtered_data$novelscore4)

## Validate Data Types have been changed
str(data_model)
## 'data.frame':    1000 obs. of  14 variables:
##  $ ID           : chr  "IG.1" "IG.1" "IG.1" "IG.1" ...
##  $ age          : int  4 4 4 4 4 4 4 4 4 4 ...
##  $ country      : Factor w/ 2 levels "No","Yes": 1 1 1 1 1 1 1 1 1 1 ...
##  $ formal_study : num  2 2 2 2 2 2 2 2 2 2 ...
##  $ parent_ed    : num  2 2 2 2 2 2 2 2 2 2 ...
##  $ spoken       : num  1 1 1 1 1 1 1 1 1 1 ...
##  $ R_write_mean : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_read_mean  : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_speak_mean : num  46.7 46.7 46.7 46.7 46.7 ...
##  $ R_listen_mean: num  50 50 50 50 50 50 50 50 50 50 ...
##  $ R_games_mean : num  50 50 50 50 50 50 50 50 50 50 ...
##  $ mix          : Factor w/ 3 levels "0","1","2": 2 2 2 2 2 2 2 2 2 2 ...
##  $ group        : chr  "IG" "IG" "IG" "IG" ...
##  $ scorepost    : Factor w/ 2 levels "0","1": 2 2 1 1 NA NA NA NA NA NA ...
## Re-scaling Data Function (0 to 1)
rescale <- function(x){(x-min(x))/(max(x)-min(x))}
#> Re-scale R_means
data_model$R_write_mean <- rescale(data_model$R_write_mean)
data_model$R_read_mean <- rescale(data_model$R_read_mean)
data_model$R_speak_mean <- rescale(data_model$R_speak_mean)
data_model$R_listen_mean <- rescale(data_model$R_listen_mean)
data_model$R_games_mean <- rescale(data_model$R_games_mean)

## Split into two groups IG and NIG z
data_ig <- data_model %>% subset (group == "IG")  %>% select(-group) %>% subset(!(is.na(scorepost)))
data_nig <- data_model %>% subset (group == "NIG") %>% select(-group) %>% subset(!(is.na(scorepost)))

Model

IG - Intercept Only Model

First, we model the data using the ID only.

model_intercept_ig <- glmer(formula = scorepost ~ 1 + (1|ID),
                            family = binomial(link="logit"),
                            data = data_ig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_intercept_ig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ 1 + (1 | ID)
##    Data: data_ig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    140.7    145.9    -68.3    136.7       98 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.1513 -1.1513  0.8686  0.8686  0.8686 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0        0       
## Number of obs: 100, groups:  ID, 25
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)   0.2819     0.2020   1.395    0.163
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
NIG - Intercept Only

We observe the same result for group NIG.

model_intercept_nig <- glmer(formula = scorepost ~ 1 + (1|ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')

The participants perform similarly in NIG.

summary(model_intercept_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ 1 + (1 | ID)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    137.7    143.0    -66.9    133.7       98 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.2506 -1.2506  0.7996  0.7996  0.7996 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0        0       
## Number of obs: 100, groups:  ID, 25
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)  
## (Intercept)   0.4473     0.2050   2.182   0.0291 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
IG - Subset Model
model_sub_ig           <- glmer(formula = scorepost ~ age + country + formal_study +
                                  parent_ed + spoken + (1 | ID),
                                  family = binomial(link="logit"),
                                  data = data_ig,
                                  control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_sub_ig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ age + country + formal_study + parent_ed + spoken +  
##     (1 | ID)
##    Data: data_ig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    145.9    164.1    -65.9    131.9       93 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.7674 -1.0001  0.6638  0.7862  1.2963 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0        0       
## Number of obs: 100, groups:  ID, 25
## 
## Fixed effects:
##              Estimate Std. Error z value Pr(>|z|)  
## (Intercept)  -4.24888    2.56063  -1.659   0.0971 .
## age           0.87602    0.45149   1.940   0.0523 .
## countryYes    0.25168    0.52194   0.482   0.6297  
## formal_study  0.13147    0.23321   0.564   0.5729  
## parent_ed     0.07528    0.50762   0.148   0.8821  
## spoken        0.06796    0.35498   0.191   0.8482  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) age    cntryY frml_s prnt_d
## age         -0.898                            
## countryYes   0.103 -0.029                     
## formal_stdy  0.007 -0.118  0.206              
## parent_ed   -0.534  0.159 -0.266 -0.084       
## spoken      -0.125  0.211 -0.277 -0.042 -0.345
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
## Prove Significance
model_sub_no_ed_ig      <- glmer(formula = scorepost ~ age + country + formal_study + spoken + (1 | ID),
                                  family = binomial(link="logit"),
                                  data = data_ig,
                                  control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
## Proves the 
anova(model_sub_no_ed_ig, model_sub_ig)
## Data: data_ig
## Models:
## model_sub_no_ed_ig: scorepost ~ age + country + formal_study + spoken + (1 | ID)
## model_sub_ig: scorepost ~ age + country + formal_study + parent_ed + spoken + (1 | ID)
##                    npar    AIC    BIC  logLik deviance Chisq Df Pr(>Chisq)
## model_sub_no_ed_ig    6 143.92 159.55 -65.961   131.92                    
## model_sub_ig          7 145.90 164.14 -65.950   131.90 0.022  1     0.8821

Multicollinearity seems to be good in the subset model.

car::vif(model_sub_ig)
##          age      country formal_study    parent_ed       spoken 
##     1.160204     1.370978     1.060154     1.477561     1.520544
NIG - Subset Model

In the analysis, we need to consider the SD of the coefficient. If coeff +- SD == 0 or changes the sign of the coeff then we cannot deduce any significant relation.

model_sub_nig      <- glmer(formula = scorepost ~ age + country + formal_study +
                            parent_ed + spoken + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_sub_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ age + country + formal_study + parent_ed + spoken +  
##     (1 | ID)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    141.2    159.5    -63.6    127.2       93 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.0776 -0.9279  0.5684  0.7456  1.1802 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0        0       
## Number of obs: 100, groups:  ID, 25
## 
## Fixed effects:
##              Estimate Std. Error z value Pr(>|z|)  
## (Intercept)   -4.0347     2.4367  -1.656   0.0978 .
## age            0.8646     0.4761   1.816   0.0694 .
## countryYes    -0.3324     0.5227  -0.636   0.5248  
## formal_study   0.3738     0.2346   1.593   0.1111  
## parent_ed      0.1817     0.2192   0.829   0.4071  
## spoken        -0.2461     0.3933  -0.626   0.5314  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) age    cntryY frml_s prnt_d
## age         -0.951                            
## countryYes   0.406 -0.375                     
## formal_stdy -0.249  0.250 -0.133              
## parent_ed   -0.286  0.086 -0.195 -0.204       
## spoken      -0.331  0.140 -0.378 -0.249  0.158
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')

Multicollinearity seems to be fixed.

car::vif(model_sub_nig)
##          age      country formal_study    parent_ed       spoken 
##     1.233528     1.411546     1.269077     1.106378     1.313796

NIG vs IG Participants Score

There doesnt appear to be a significant difference between the two groups.

stargazer(model_sub_ig, model_sub_nig, title="Results Comparison of Linear Mixed Models between Groups", align=TRUE, type="text")
## 
## Results Comparison of Linear Mixed Models between Groups
## ================================================
##                         Dependent variable:     
##                     ----------------------------
##                              scorepost          
##                          (1)            (2)     
## ------------------------------------------------
## age                     0.876*        0.865*    
##                        (0.451)        (0.476)   
##                                                 
## countryYes              0.252         -0.332    
##                        (0.522)        (0.523)   
##                                                 
## formal_study            0.131          0.374    
##                        (0.233)        (0.235)   
##                                                 
## parent_ed               0.075          0.182    
##                        (0.508)        (0.219)   
##                                                 
## spoken                  0.068         -0.246    
##                        (0.355)        (0.393)   
##                                                 
## Constant               -4.249*        -4.035*   
##                        (2.561)        (2.437)   
##                                                 
## ------------------------------------------------
## Observations             100            100     
## Log Likelihood         -65.950        -63.623   
## Akaike Inf. Crit.      145.900        141.245   
## Bayesian Inf. Crit.    164.136        159.481   
## ================================================
## Note:                *p<0.1; **p<0.05; ***p<0.01

Comarping Performance on Items

Comparing Total Scores across Groups by Items

The groups perform similarly, NIG group performs better on word pemlev while IG performs better on word Poffkin

## Get scores per ID (0/1)
scores <- table(filtered_data[c("novellist4items","novelscore4","group")])
scores <- as.data.frame(scores)
items  <- scores$novellist4items

## Total Score = Sum(correct) - Sum(incorrect)
ig_score = scores %>% subset(group == "IG" & novelscore4==1) %>% select(Freq) - scores %>% subset(group == "IG" & novelscore4==0) %>% select(Freq)
ig_score = data.frame(items = items[0:4], group = rep("IG", nrow(ig_score)), score = ig_score)

nig_score = scores %>% subset(group == "NIG" & novelscore4==1) %>% select(Freq) - scores %>% subset(group == "IG" & novelscore4==0) %>% select(Freq)
nig_score = data.frame(items = items[0:4], group = rep("NIG", nrow(nig_score)), score = nig_score)

all_scores = rbind(ig_score, nig_score)

## Plot
fig <- plot_ly(all_scores, x=~items, y=~Freq, color =~group, type = "bar") %>%
  layout(title="Total Scores on Words (list 4, less frequency) in NIG vs IG groups",
         yaxis=list(title="Total Scores on Words of Novel List (4)"),
         xaxis=list(title="Words/Items"))

fig

Model

IG - Intercept Only Model

First, we model the data using the ID only.

## Add Items
data_ig['items'] = filtered_data %>% subset (group == "IG") %>% select(novellist4items) %>% subset(!is.na(novellist4items))

## Model
model_intercept_ig <- glmer(formula = scorepost ~ 1 + (1|items),
                            family = binomial(link="logit"),
                            data = data_ig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))

Participants are 1.3 times more likely to score (1) in IG group.

summary(model_intercept_ig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ 1 + (1 | items)
##    Data: data_ig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    139.4    144.6    -67.7    135.4       98 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.4195 -1.0501  0.7045  0.8035  1.0359 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  items  (Intercept) 0.1772   0.421   
## Number of obs: 100, groups:  items, 4
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)   0.2945     0.2951   0.998    0.318
exp(0.2945)
## [1] 1.342455

Third, We check the Inter-coefficient correlation = 0.051 showing that only 5.1% of the variation in the scores is due to items post-test. This shows that despite the items, the participants are 1.3 more likely to score correctly (1).

performance::icc(model_intercept_ig)
## # Intraclass Correlation Coefficient
## 
##      Adjusted ICC: 0.051
##   Conditional ICC: 0.051
NIG - Intercept Only

We observe similar result for group NIG with coeff = 0.47 (1.6 times more likely to score (1)) and but sligthly higher STD (0.3317 > 0.2951) which shows that it is not necessary that participants in NIG group have slightly higher odds of scoring (1) than participants in the IG group.

## Add Items
data_nig['items'] = filtered_data %>% subset (group == "NIG") %>% select(novellist4items) %>% subset(!is.na(novellist4items))

## Model
model_intercept_nig <- glmer(formula = scorepost ~ 1 + (1|items),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))

summary(model_intercept_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ 1 + (1 | items)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    135.6    140.8    -65.8    131.6       98 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.5330 -1.0297  0.6523  0.6523  0.9835 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  items  (Intercept) 0.259    0.5089  
## Number of obs: 100, groups:  items, 4
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)   0.4756     0.3317   1.434    0.152
exp(0.4756)
## [1] 1.608979
NIG - Full Model

Just two items have significant effect on the score, as seen in the plot too; Bukmuz is (1) mostly in NIG.

model_full_nig      <- glmer(formula = scorepost ~ age + country + formal_study +
                            parent_ed + spoken + R_write_mean + R_read_mean + items + (1 | items) + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_full_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ age + country + formal_study + parent_ed + spoken +  
##     R_write_mean + R_read_mean + items + (1 | items) + (1 | ID)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    141.3    175.2    -57.6    115.3       87 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.3899 -0.7730  0.4241  0.6763  1.8686 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0        0       
##  items  (Intercept) 0        0       
## Number of obs: 100, groups:  ID, 25; items, 4
## 
## Fixed effects:
##              Estimate Std. Error z value Pr(>|z|)  
## (Intercept)   -2.9556     3.2252  -0.916   0.3594  
## age            0.5355     0.6276   0.853   0.3935  
## countryYes    -0.3294     0.5639  -0.584   0.5591  
## formal_study   0.3436     0.2570   1.337   0.1812  
## parent_ed      0.1068     0.2495   0.428   0.6685  
## spoken        -0.7118     0.5986  -1.189   0.2344  
## R_write_mean  -0.7153     1.2820  -0.558   0.5769  
## R_read_mean    1.7890     1.5526   1.152   0.2492  
## itemsBukmuz    1.3544     0.6491   2.087   0.0369 *
## itemsPemlev    1.3544     0.6491   2.087   0.0369 *
## itemsPoffkin  -0.1786     0.5980  -0.299   0.7652  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) age    cntryY frml_s prnt_d spoken R_wrt_ R_rd_m itmsBk
## age         -0.962                                                        
## countryYes   0.377 -0.352                                                 
## formal_stdy -0.314  0.311 -0.144                                          
## parent_ed   -0.367  0.216 -0.183 -0.125                                   
## spoken      -0.579  0.490 -0.334 -0.021  0.229                            
## R_write_men -0.094  0.077  0.056  0.046  0.276 -0.035                     
## R_read_mean  0.468 -0.469  0.027 -0.190 -0.342 -0.488 -0.700              
## itemsBukmuz -0.131  0.043 -0.027  0.059  0.022 -0.051 -0.022  0.050       
## itemsPemlev -0.131  0.043 -0.027  0.059  0.022 -0.051 -0.022  0.050  0.438
## itemsPoffkn -0.087 -0.005  0.004 -0.010 -0.003  0.009  0.005 -0.009  0.455
##             itmsPm
## age               
## countryYes        
## formal_stdy       
## parent_ed         
## spoken            
## R_write_men       
## R_read_mean       
## itemsBukmuz       
## itemsPemlev       
## itemsPoffkn  0.455
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')

VIF for multicollinearity check: R_write and R_read are correlated as expected.

car::vif(model_full_nig)
##                  GVIF Df GVIF^(1/(2*Df))
## age          1.880282  1        1.371234
## country      1.448778  1        1.203652
## formal_study 1.346057  1        1.160197
## parent_ed    1.260811  1        1.122858
## spoken       2.748725  1        1.657928
## R_write_mean 3.732448  1        1.931954
## R_read_mean  5.924956  1        2.434123
## items        1.039303  3        1.006446
IG - Sub Model
model_sub_nig      <- glmer(formula = scorepost ~ age + country + formal_study +
                            parent_ed + spoken + items + (1 | items) + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_sub_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ age + country + formal_study + parent_ed + spoken +  
##     items + (1 | items) + (1 | ID)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    138.8    167.5    -58.4    116.8       89 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.8083 -0.8117  0.4494  0.7621  1.7938 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  ID     (Intercept) 0        0       
##  items  (Intercept) 0        0       
## Number of obs: 100, groups:  ID, 25; items, 4
## 
## Fixed effects:
##              Estimate Std. Error z value Pr(>|z|)  
## (Intercept)   -5.1396     2.6438  -1.944   0.0519 .
## age            0.9676     0.5074   1.907   0.0565 .
## countryYes    -0.3701     0.5532  -0.669   0.5034  
## formal_study   0.4158     0.2478   1.678   0.0933 .
## parent_ed      0.2044     0.2329   0.878   0.3801  
## spoken        -0.2743     0.4150  -0.661   0.5086  
## itemsBukmuz    1.3320     0.6431   2.071   0.0383 *
## itemsPemlev    1.3320     0.6431   2.071   0.0383 *
## itemsPoffkin  -0.1755     0.5928  -0.296   0.7672  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) age    cntryY frml_s prnt_d spoken itmsBk itmsPm
## age         -0.943                                                 
## countryYes   0.403 -0.376                                          
## formal_stdy -0.260  0.258 -0.134                                   
## parent_ed   -0.289  0.091 -0.196 -0.197                            
## spoken      -0.314  0.135 -0.379 -0.252  0.153                     
## itemsBukmuz -0.192  0.088 -0.029  0.072  0.043 -0.029              
## itemsPemlev -0.192  0.088 -0.029  0.072  0.043 -0.029  0.436       
## itemsPoffkn -0.098 -0.013  0.005 -0.013 -0.006  0.005  0.456  0.456
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')

Comparing Models; It is ambiguous which model performs better, but we know theoratically R_write and R_read can impact the scores; but they are correlated. We can include one of them in the model and check if VIF is acceptable.

anova(model_full_nig, model_sub_nig)
## Data: data_nig
## Models:
## model_sub_nig: scorepost ~ age + country + formal_study + parent_ed + spoken + items + (1 | items) + (1 | ID)
## model_full_nig: scorepost ~ age + country + formal_study + parent_ed + spoken + R_write_mean + R_read_mean + items + (1 | items) + (1 | ID)
##                npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)
## model_sub_nig    11 138.79 167.45 -58.397   116.79                     
## model_full_nig   13 141.29 175.16 -57.644   115.29 1.5053  2     0.4711

Including R_Write_mean; We see that there isnt much of a difference between the two models.

model_sub2_nig      <- glmer(formula = scorepost ~ age + country + formal_study +
                            parent_ed + spoken + R_write_mean + items + (1 | items) + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
car::vif(model_sub2_nig)
##                  GVIF Df GVIF^(1/(2*Df))
## age          1.477814  1        1.215654
## country      1.433646  1        1.197350
## formal_study 1.309368  1        1.144276
## parent_ed    1.114763  1        1.055823
## spoken       2.064234  1        1.436744
## R_write_mean 1.887017  1        1.373687
## items        1.032975  3        1.005422
anova(model_sub2_nig, model_sub_nig)
## Data: data_nig
## Models:
## model_sub_nig: scorepost ~ age + country + formal_study + parent_ed + spoken + items + (1 | items) + (1 | ID)
## model_sub2_nig: scorepost ~ age + country + formal_study + parent_ed + spoken + R_write_mean + items + (1 | items) + (1 | ID)
##                npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)
## model_sub_nig    11 138.79 167.45 -58.397   116.79                     
## model_sub2_nig   12 140.68 171.94 -58.338   116.68 0.1187  1     0.7305

Visualize the Variation of scores intercept accross Participants and Across Items.

lattice::dotplot(ranef(model_sub2_nig, which = "ID", condVar = TRUE))
## $ID

Comarping Sound Scores

Comparing Total Scores across Groups

This method of comparison comprises of taking the total number of correct - total number of incorrect answers per sound and comparing the latter across groups (IG vs NIG).

Previously: In Pre-Test: We can see that sound 'b' is percieved correctly way more than 'p'. This finding is logical as phoneme 'p' does not exist in the arabic language and children will logically find it harder to percieve.

Post-Test: We see huge improvement in both groups, IG and NIG in differentiating ‘p’ and ‘b’ sounds.

Post-Test Word List 2: We Also see a huge improvement in both groups, IG and NIG in differentiating ‘p’ and ‘b’ sounds over pre-test.

## Get scores per group
scores <- table(filtered_data[c("soundnovel4","novelscore4","group")])
scores <- as.data.frame(scores)
sounds  <- scores$soundnovel4

## Total Score = Sum(correct) - Sum(incorrect)
ig_score = scores %>% subset(group == "IG" & novelscore4==1) %>% select(Freq) - scores %>% subset(group == "IG" & novelscore4==0) %>% select(Freq)
ig_score = data.frame(sounds = sounds[0:2], group = rep("IG", nrow(ig_score)), score = ig_score)

nig_score = scores %>% subset(group == "NIG" & novelscore4==1) %>% select(Freq) - scores %>% subset(group == "IG" & novelscore4==0) %>% select(Freq)
nig_score = data.frame(sounds = sounds[0:2], group = rep("NIG", nrow(nig_score)), score = nig_score)

all_scores = rbind(ig_score, nig_score)

## Plot
fig <- plot_ly(all_scores, x=~sounds, y=~Freq, color =~group, type = "bar") %>%
  layout(title="Sounds Differentiation accross groups Post Test - Novel List 4",
         yaxis=list(title="Total Scores on Sounds"),
         xaxis=list(title="Sounds"))

fig

We can see higher peaks of ‘b’ and low peaks of ‘p’ across both groups and across different participants.

Variable performances can be seen.

## Get scores per ID (0/1)
scores <- aggregate(filtered_data$novelscore4, by=list(filtered_data$ID, filtered_data$soundnovel4), FUN=sum)

fig <- plot_ly(scores, x=~Group.1, y=~x, color =~Group.2, type = "bar") %>%
  layout(title="Sounds Differentiation accross participants Post Test Novel List 2",
         yaxis=list(title="Total Scores on Sounds"),
         xaxis=list(title="Sounds"))

fig

Total Scores for participants; We see higher peeks for IG but the data size is low.

total_scores <- aggregate(scores$x, by=list(scores$Group.1), FUN=sum)

fig <- plot_ly(total_scores, x=~Group.1, y=~x, type = "bar") %>%
  layout(title="Sounds Differentiation accross participants Post Test Novel List 4",
         yaxis=list(title="Total Scores on Sounds"),
         xaxis=list(title="Sounds"))

fig

Model

IG - Intercept Only Model

First, we model the data using the Sound only.

## Add Sounds
data_ig['sounds'] = filtered_data %>% subset (group == "IG") %>% select(soundnovel4) %>% subset(!is.na(soundnovel4))

## Model
model_intercept_ig <- glmer(formula = scorepost ~ 1 + (1|sounds),
                            family = binomial(link="logit"),
                            data = data_ig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_intercept_ig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ 1 + (1 | sounds)
##    Data: data_ig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    140.7    145.9    -68.3    136.7       98 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.1513 -1.1513  0.8686  0.8686  0.8686 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  sounds (Intercept) 0        0       
## Number of obs: 100, groups:  sounds, 2
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)   0.2819     0.2020   1.395    0.163
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
NIG - Intercept Only

We observe the same result for group NIG.

## Add Items
data_nig['sounds'] = filtered_data %>% subset (group == "NIG") %>% select(soundnovel4) %>% subset(!is.na(soundnovel4))

## Model
model_intercept_nig <- glmer(formula = scorepost ~ 1 + (1|sounds),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
summary(model_intercept_nig)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: scorepost ~ 1 + (1 | sounds)
##    Data: data_nig
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    137.7    143.0    -66.9    133.7       98 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.2506 -1.2506  0.7996  0.7996  0.7996 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  sounds (Intercept) 0        0       
## Number of obs: 100, groups:  sounds, 2
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)  
## (Intercept)   0.4473     0.2050   2.182   0.0291 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## optimizer (bobyqa) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
Subset Models

We can see that the p-values of all other variables is, showing that the improvements in the scores of participants is not due to the predictors in the model.

model_sub_nig      <- glmer(formula = scorepost ~ age + country + formal_study +
                            parent_ed + spoken + sounds + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_nig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
model_sub_ig      <- glmer(formula = scorepost ~ age + country + formal_study +
                            parent_ed + spoken + sounds + (1 | ID),
                            family = binomial(link="logit"),
                            data = data_ig,
                            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=2e5)))
## boundary (singular) fit: see help('isSingular')
stargazer(model_sub_nig, model_sub_ig, type = "text")
## 
## ================================================
##                         Dependent variable:     
##                     ----------------------------
##                              scorepost          
##                          (1)            (2)     
## ------------------------------------------------
## age                     0.865*        0.886*    
##                        (0.476)        (0.454)   
##                                                 
## countryYes              -0.333         0.254    
##                        (0.523)        (0.525)   
##                                                 
## formal_study            0.374          0.133    
##                        (0.235)        (0.234)   
##                                                 
## parent_ed               0.182          0.076    
##                        (0.219)        (0.510)   
##                                                 
## spoken                  -0.246         0.069    
##                        (0.393)        (0.357)   
##                                                 
## soundsp                 -0.090        -0.430    
##                        (0.424)        (0.417)   
##                                                 
## Constant                -3.992        -4.081    
##                        (2.445)        (2.577)   
##                                                 
## ------------------------------------------------
## Observations             100            100     
## Log Likelihood         -63.600        -65.413   
## Akaike Inf. Crit.      143.200        146.827   
## Bayesian Inf. Crit.    164.042        167.668   
## ================================================
## Note:                *p<0.1; **p<0.05; ***p<0.01

We see that both groups performed relatively the same with the novel list 4.


END OF QUESTION 1


Answering Question 2

Do the Saudi EFL kindergarten students prefer the interactive or non-interactive version of DST?

Model

The model shows that the major contributor to the evaluation is the group; if NIG => There is an average decrease of -0.85 in the evaluation. The participants favored interactive stories more.

model <- glm(formula = evaluation ~ age + country + formal_study +
                            parent_ed + spoken + R_speak_mean + group + scorepost + scoreF2 + scoreF3 + novelscore4,
                            data = filtered_data)

summary(model)
## 
## Call:
## glm(formula = evaluation ~ age + country + formal_study + parent_ed + 
##     spoken + R_speak_mean + group + scorepost + scoreF2 + scoreF3 + 
##     novelscore4, data = filtered_data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.1995  -0.5088   0.1531   0.3869   1.9006  
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    5.062286   0.782507   6.469 8.64e-10 ***
## age           -0.055650   0.136738  -0.407   0.6845    
## countryYes    -0.346479   0.162445  -2.133   0.0343 *  
## formal_study1  0.180546   0.203258   0.888   0.3756    
## formal_study2  0.021445   0.230977   0.093   0.9261    
## formal_study3 -0.303102   0.238227  -1.272   0.2049    
## formal_study4  0.268665   0.486756   0.552   0.5817    
## parent_ed      0.019748   0.085083   0.232   0.8167    
## spoken1       -0.578087   0.196018  -2.949   0.0036 ** 
## spoken2       -0.484313   0.252507  -1.918   0.0567 .  
## R_speak_mean   0.001131   0.003461   0.327   0.7442    
## groupNIG      -0.855193   0.124803  -6.852 1.06e-10 ***
## scorepost      0.066245   0.139282   0.476   0.6349    
## scoreF2        0.150926   0.146161   1.033   0.3031    
## scoreF3        0.064002   0.126760   0.505   0.6142    
## novelscore4    0.087863   0.123012   0.714   0.4760    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for gaussian family taken to be 0.6773176)
## 
##     Null deviance: 186.32  on 199  degrees of freedom
## Residual deviance: 124.63  on 184  degrees of freedom
##   (800 observations deleted due to missingness)
## AIC: 506.98
## 
## Number of Fisher Scoring iterations: 2

END