1a)

Barbara! When we look at items loading on a factor, we are basically measuring the “meaningful” variance shared by a set of items. You are not getting 0.40 loadings because there is more than one factor: these items are not unidimensional and are more appropriately fit by a 2-factor model. You have to conduct the following steps:

1b)

Example:

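library(mvoutlier)
set.seed(123)

# Flag multivariate outliers (logical vector) and store them as a column
Cogtasks.listwise$MCD75 = aq.plot(Cogtasks.listwise, quan = .75, alpha = .001)$outliers

# Keep the rows that were not flagged
Cogtasks.listwise.final = Cogtasks.listwise[!Cogtasks.listwise$MCD75, ]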
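
For intuition about what this kind of screening does, here is a minimal base-R sketch of multivariate outlier flagging. It is a simplified stand-in, not a reproduction of aq.plot(): it uses classical Mahalanobis distances rather than the robust distances that aq.plot() works with. The data frame `toy`, its column names, and the planted outliers are all made up for illustration.

```r
# Simulated stand-in data (hypothetical; not the Cogtasks data)
set.seed(123)
toy <- data.frame(task1 = rnorm(200), task2 = rnorm(200), task3 = rnorm(200))
toy[1:3, ] <- toy[1:3, ] + 8   # plant three obvious multivariate outliers

# Squared Mahalanobis distance of each row from the multivariate centroid
d2 <- mahalanobis(toy, center = colMeans(toy), cov = cov(toy))

# Flag rows beyond the chi-square cutoff (alpha = .001, df = number of variables)
cutoff <- qchisq(1 - 0.001, df = ncol(toy))
is_outlier <- d2 > cutoff

# Logical indexing keeps every row when nothing is flagged, whereas
# negative indexing with an empty which() would drop all rows
toy.final <- toy[!is_outlier, ]
```

Because classical distances are computed from a mean and covariance matrix that the outliers themselves distort, a robust method is usually the safer choice on real data.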

1c)

You can examine the factor correlation matrix, which shows the correlation between each pair of factors. When you run the oblimin rotation in R, the output also provides these factor intercorrelations.

Example:

ML.CogTasks = fa(CogTasks.listwise, nfactors = 5, rotate = "oblimin", fm = "ml", max.iter = 1000)

Printing the factor loadings:

print.psych(ML.CogTasks, cut = 0.30, sort = TRUE)
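
To make the factor correlation matrix concrete, here is a rough sketch on simulated data. It swaps in base R's factanal() with a promax rotation in place of psych's fa() with oblimin, so the numbers would not match an oblimin solution; the item names and the population correlation of .50 are assumptions made for the example. With a psych fa() object, the corresponding matrix should be available as the Phi element (for example, ML.CogTasks$Phi).

```r
# Simulated items driven by two correlated factors (hypothetical data)
set.seed(123)
n  <- 500
f1 <- rnorm(n)
f2 <- 0.5 * f1 + sqrt(1 - 0.5^2) * rnorm(n)   # population correlation with f1 = .50
noise <- function() rnorm(n, sd = 0.6)
items <- data.frame(
  a1 = 0.8 * f1 + noise(), a2 = 0.7 * f1 + noise(), a3 = 0.6 * f1 + noise(),
  b1 = 0.8 * f2 + noise(), b2 = 0.7 * f2 + noise(), b3 = 0.6 * f2 + noise()
)

# Maximum-likelihood factor analysis with an oblique (promax) rotation
fit <- factanal(items, factors = 2, rotation = "promax")

# Factor correlation matrix recovered from the rotation matrix
tmat <- solve(fit$rotmat)
phi  <- tmat %*% t(tmat)
round(phi, 2)
```

A clearly non-zero off-diagonal value in this matrix is what justifies an oblique rather than an orthogonal rotation.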

2a)

The reason that Mike’s univariate boxplots method is sub-par is actually written in the problem: it would “treat all predictors at the same level”.

Univariate boxplots, while useful for detecting outliers within each variable, are sub-optimal in Mike’s case because they don’t consider the hierarchical structure of the data. In Hierarchical Linear Modeling (HLM), data are nested (employees within companies), and univariate analyses don’t account for this nesting. This can lead to misleading conclusions about variability and outlier detection.

Mike could perform a multilevel EDA. For instance, he could create boxplots of employee-level variables within each company to observe within- and between-company variability. Additionally, using lmer() (available through the lmerTest package) and the influence() function, Mike could fit the model and obtain influence diagnostics, somewhat like below:

Exploratory Data Analysis (EDA), also known as assumption checking, based on the model:

ProductivityComp = conscientiousness + extraversion + grit + nfc + ahw + grit*ahw

HLM_model = lmer(ProductivityComp ~ conscientiousness + extraversion + grit + nfc + ahw + grit:ahw + (1 | Company), data = datafile)

estimated.HLM_model = influence(HLM_model)
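
The per-company boxplot idea can be sketched in base R as follows. The data are simulated (6 companies with 30 employees each, with made-up means and spreads), and only the variable names ProductivityComp and Company come from the model above; everything else is an assumption for illustration.

```r
# Simulated stand-in for Mike's data (hypothetical values; 6 companies x 30 employees)
set.seed(123)
datafile <- data.frame(Company = factor(rep(paste0("C", 1:6), each = 30)))
company_effect <- rnorm(6, sd = 2)[as.integer(datafile$Company)]
datafile$ProductivityComp <- 50 + company_effect + rnorm(nrow(datafile), sd = 3)

# One boxplot per company: box heights show within-company spread,
# differences between medians show between-company variability.
# plot = FALSE returns the statistics only; set plot = TRUE to draw the figure.
bp <- boxplot(ProductivityComp ~ Company, data = datafile, plot = FALSE)

# Points flagged relative to their own company rather than the pooled sample
flagged <- data.frame(Company = bp$names[bp$group], ProductivityComp = bp$out)

# Numeric companion: company means and within-company SDs
company_summary <- aggregate(ProductivityComp ~ Company, data = datafile,
                             FUN = function(x) c(mean = mean(x), sd = sd(x)))
```

An employee who looks unremarkable in a pooled boxplot can still be extreme within their own company, which is exactly what the univariate approach misses.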

2b)

The reason is the same as in 2a: Mike’s collinearity analysis treats all predictors as if they were at the same level and ignores the multilevel nature of the model.

A potential solution for Mike’s collinearity issue is to center the terms that make up the interaction effect. Centering reduces non-essential collinearity, which arises from the scaling of the variables when interaction terms are included. It involves subtracting the mean of each predictor from its values. Once the predictors are centered around zero, the correlation between the main effects and their product term is no longer artificially inflated by the scale of the original variables.

datafile$centered_conscientiousness = datafile$conscientiousness - mean(datafile$conscientiousness, na.rm = TRUE)

datafile$centered_grit = datafile$grit - mean(datafile$grit, na.rm = TRUE)

# ahw is also part of the interaction, so center it as well
datafile$centered_ahw = datafile$ahw - mean(datafile$ahw, na.rm = TRUE)

And then create the interaction term with these centered variables:

datafile$interaction_term = datafile$centered_grit * datafile$centered_ahw
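
To see why centering removes this non-essential collinearity, here is a small simulated check. The means and standard deviations below are invented, and grit and ahw are generated as independent, so any correlation between a predictor and the raw product term comes purely from the scale of the variables.

```r
# Simulated predictors on their raw scales (hypothetical means and SDs)
set.seed(123)
grit <- rnorm(1000, mean = 5, sd = 1)
ahw  <- rnorm(1000, mean = 40, sd = 5)

# Raw product term: strongly correlated with its own component
cor_raw <- cor(grit, grit * ahw)

# Grand-mean center both components, then form the product
grit_c <- grit - mean(grit)
ahw_c  <- ahw - mean(ahw)
cor_centered <- cor(grit_c, grit_c * ahw_c)

round(c(raw = cor_raw, centered = cor_centered), 2)
```

Centering changes the interpretation of the lower-order coefficients (each becomes the effect at the mean of the other variable), but it does not change the interaction coefficient or the overall model fit.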

2c)

Low tolerance values indicate a high degree of multicollinearity: most of the variance in a predictor can be explained by the other predictors. Here, this most plausibly means that conscientiousness and grit are highly correlated with each other. In regression analysis, high multicollinearity inflates the standard errors of the regression coefficients and makes the estimates unstable or unreliable. It becomes difficult to isolate the individual effect of each predictor on the outcome variable. This could be why Mike is seeing non-significant results even for variables expected to be significant (like conscientiousness).
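
As a sketch of where a tolerance value comes from, the snippet below computes it by hand: tolerance for a predictor is 1 minus the R-squared from regressing that predictor on all the others, and its reciprocal is the variance inflation factor. The data are simulated, with grit deliberately built to correlate about .90 with conscientiousness, and this single-level calculation ignores the nesting within companies.

```r
# Simulated predictors (hypothetical): grit correlates about .90 with conscientiousness
set.seed(123)
n <- 500
conscientiousness <- rnorm(n)
grit <- 0.9 * conscientiousness + sqrt(1 - 0.9^2) * rnorm(n)
extraversion <- rnorm(n)
predictors <- data.frame(conscientiousness, grit, extraversion)

# Tolerance of a predictor = 1 - R^2 from regressing it on all other predictors
tolerance <- sapply(names(predictors), function(v) {
  fit <- lm(reformulate(setdiff(names(predictors), v), response = v), data = predictors)
  1 - summary(fit)$r.squared
})

round(tolerance, 2)       # values near 0 signal collinearity
round(1 / tolerance, 2)   # the corresponding variance inflation factors
```

In this setup, conscientiousness and grit should both show low tolerance, while extraversion, which is unrelated to the other two, should stay close to 1.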

Options to Proceed:

-Centering Variables: As discussed earlier, centering the variables (either grand mean or group mean centering) can help reduce multicollinearity that arises due to scaling and the inclusion of interaction terms.

-Remove Highly Correlated Predictors: If certain predictors are very highly correlated, consider removing one to reduce multicollinearity. This decision should be based on theoretical justifications and the research questions at hand. Alternatively, he can combine the two correlated predictors into a single composite.

-Regularization Techniques: If retaining all predictors is important, Mike might consider using regularization techniques like Ridge Regression or Lasso, which are designed to handle multicollinearity by penalizing the regression coefficients.

-Partial Least Squares Regression (PLS): If the primary goal is prediction and not inference, PLS can be an option. It’s a technique that reduces the predictors to a smaller set of uncorrelated components.
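
The two centering options from the first bullet differ only in which mean is subtracted. A minimal base-R sketch, using invented data with 4 companies of 25 employees each and hypothetical column names for the centered versions:

```r
# Hypothetical data: 4 companies x 25 employees, with different company means
set.seed(123)
datafile <- data.frame(Company = rep(c("A", "B", "C", "D"), each = 25))
datafile$grit <- rnorm(100, mean = rep(c(3, 4, 5, 6), each = 25))

# Grand-mean centering: subtract the overall mean
datafile$grit_gmc <- datafile$grit - mean(datafile$grit)

# Group-mean centering: subtract each company's own mean (ave() returns it per row)
datafile$grit_company_mean <- ave(datafile$grit, datafile$Company)
datafile$grit_cwc <- datafile$grit - datafile$grit_company_mean

# After group-mean centering, every company's mean is zero
tapply(datafile$grit_cwc, datafile$Company, mean)
```

Group-mean centering removes all between-company differences from the predictor, so the company means are often added back as a company-level predictor when those differences are of interest.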

3a)

With participants as a random factor, the 5:1 cases-to-variables ratio is met, but with experimenter as a random factor, having only 3 does not meet the minimum standard.

3b)

C.R. can include the experimenters’ gender identity as a potential moderator of the relationship between condition (the predictor) and the participants’ self-reported attraction to the human target (the outcome variable). This involves dummy coding the three categories of gender identity and allows any potential interactions with condition to be tested.
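
A rough sketch of what this looks like in R, using simulated data. The variable names, the two condition labels, and the three gender identity labels are all placeholders, since the actual design details are not given here.

```r
# Hypothetical data: two conditions and three experimenter gender identity categories
set.seed(123)
d <- expand.grid(
  condition = factor(c("control", "treatment")),
  experimenter_gender = factor(c("man", "woman", "nonbinary")),
  replicate = 1:20
)
d$attraction <- rnorm(nrow(d), mean = 4)

# R dummy codes factors automatically; the first level
# (alphabetical by default) serves as the reference category
X <- model.matrix(~ condition * experimenter_gender, data = d)
colnames(X)

# Moderation test: the condition x gender identity interaction terms
fit <- lm(attraction ~ condition * experimenter_gender, data = d)
anova(fit)
```

With three categories, R creates two dummy variables for gender identity and two product terms with condition, so the interaction is tested on 2 degrees of freedom.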