Experimental Methods: Data Analysis

knitr::opts_chunk$set(
    echo = TRUE,
    message = FALSE,
    warning = FALSE
)
knitr::opts_knit$set(root.dir = normalizePath(".."))

Introduction

In this notebook, we work with the concepts of mediation and moderated mediation. Together, they account for a large share of the mediation models observed and measured in consumer behavior research. I will present several ways to test mediation effects, but all of them start from the paradigm presented by Hayes.

About the data

For consistency and to make learning easier, I use the data sets from Hayes's book. From the book:

“To illustrate the estimation of direct and indirect effects in a simple mediation model, I use data from a study conducted in Israel by Tal-Or, Cohen, Tsfati, and Gunther (2010). The data file is named PMI and can be downloaded from www.afhayes.com. The participants in this study (43 male and 80 female students studying political science or communication at a large university in Israel) read one of two newspaper articles describing an economic crisis that purportedly may affect the price and supply of sugar in Israel. Approximately half of the participants (n = 58) were given an article they were told would be appearing on the front page of a major Israeli newspaper (henceforth referred to as the front page condition). The remaining participants (n = 65) were given the same article but were told it would appear in the middle of an economic supplement of this newspaper (referred to here as the internal page condition). Which of the two articles any participant read was determined by random assignment. In all other respects, the participants in the study were treated equivalently, the instructions they were given were the same, and all measurement procedures were identical in both experimental conditions. After the participants read the article, they were asked a number of questions about their reactions to the story. Some questions asked participants how soon they planned on buying sugar and how much they intended to buy. Their responses were aggregated to form an intention to buy sugar measure (REACTION in the data file), such that higher scores reflected greater intention to buy sugar (soon and in larger quantities). They were also asked questions used to quantify how much they believed that others in the community would be prompted to buy sugar as a result of exposure to the article, a measure referred to as presumed media influence (PMI in the data file).” (pp. 93-94)

In other words, the researchers tested the effect of where the article appeared (COND), an article about an economic crisis and its possible effects on the price of sugar, on the intention to buy sugar (REACTION). They also tested whether this effect is mediated by beliefs about the media's influence on other consumers' buying habits (PMI). This is the proposed model:

[Figure: Hayes mediation example model]

As in the previous examples, the first step is to import the data. Here, however, our job is simpler, since the data need minimal formatting. This is a classic case of importing data from a CSV file.

require(readr)
setwd("/home/lnicolao/Google Drive/Drive/UFRGS/Ensino/20211/ADP - Experimentos/Scripts Aulas/")
hayes4<-read_csv("hayes_cap 4.csv",
             col_names = TRUE)

Since this example is about mediation, I will not inspect the properties of the data in any detail beyond the means per condition, which Hayes also reports in the book. For that, I will create another variable with “friendlier” labels.

require(psych)
require(knitr)
hayes4$condicao <- factor (hayes4$cond, levels = c(0,1), labels = c("Interior", "Front"))
desc.reaction<-describeBy(hayes4$reaction, hayes4$condicao, mat = T, digits = 2)
rownames(desc.reaction)<-NULL
names(desc.reaction)[2]<-"Condições"

kable(desc.reaction[,c(2,4,5,6)], caption = "Descritivas de Reaction por Condição")

Descritivas de Reaction por Condição
Condições	n	mean	sd
Interior	65	3.25	1.61
Front	58	3.75	1.45

I will build the same descriptive table for presumed media influence (PMI), the candidate mediator.

desc.pmi<-describeBy(hayes4$pmi, hayes4$condicao, mat = T, digits = 2)
rownames(desc.pmi)<-NULL
names(desc.pmi)[2]<-"Condições"

kable(desc.pmi[,c(2,4,5,6)], caption = "Descritivas de PMI por Condição")

Descritivas de PMI por Condição
Condições	n	mean	sd
Interior	65	5.38	1.34
Front	58	5.85	1.27

Since it costs nothing, I will also make a couple of plots!

require(ggplot2)
require(wesanderson)
ggplot(hayes4, aes(x=condicao,y=reaction, fill=condicao)) +
  geom_violin(trim=FALSE) +
  stat_summary(fun.data=mean_sdl, fun.args = list(mult = 1),
                 geom="pointrange", color="white") +
  scale_fill_manual(values=wes_palette(n=2, name="GrandBudapest1")) +
  labs(title = "Reaction per Page Positioning", x = "Condições Experimentais", 
       y = "Reaction", fill = "Condições")

ggplot(hayes4, aes(x=condicao,y=pmi, fill=condicao)) +
  geom_violin(trim=FALSE) +
  stat_summary(fun.data=mean_sdl, fun.args = list(mult = 1),
                 geom="pointrange", color="white") +
  scale_fill_manual(values=wes_palette(n=2, name="GrandBudapest2")) +
  labs(title = "PMI per Page Positioning", x = "Condições Experimentais", 
       y = "PMI", fill = "Condições")

Mediation

In this part of the class, we run mediation analyses on the Hayes data set described above, following different methods and recommendations. First, we use the Sobel test, as recommended by Baron and Kenny (1986). Then we run more modern tests, based on normal theory and on confidence intervals, using conditional models. The latter are done in several ways, culminating in Hayes's PROCESS.

Traditional Mediation Analysis - Sobel Test

Here we follow the Sobel test, which was for a long time the standard way to test mediation.

require(multilevel)
sobel.h4<-sobel(hayes4$cond, hayes4$pmi, hayes4$reaction)
2*pnorm(abs(sobel.h4$z.value), lower.tail=FALSE) # two-sided p-value

## [1] 0.05939319
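
As a rough check on where this p-value comes from, the sketch below recomputes a first-order Sobel statistic by hand. The coefficient values are copied (rounded) from the lm() outputs reported further down in this notebook, and the helper name sobel_z is hypothetical; it is not part of the multilevel package. Under these assumptions the result appears to agree with the value above to about three decimals.

```r
# Hypothetical helper: first-order Sobel z statistic for the indirect effect a*b.
sobel_z <- function(a, se_a, b, se_b) {
  se_ab <- sqrt(a^2 * se_b^2 + b^2 * se_a^2)  # first-order (delta method) standard error
  (a * b) / se_ab
}

# Rounded estimates taken from lm(pmi ~ cond) and lm(reaction ~ cond + pmi)
z_sobel <- sobel_z(a = 0.4765, se_a = 0.2357, b = 0.50645, se_b = 0.09705)
p_sobel <- 2 * pnorm(abs(z_sobel), lower.tail = FALSE)  # two-sided p-value
round(c(z = z_sobel, p = p_sobel), 3)
```

The step-by-step section that follows adds a third term (se.a^2 * se.b^2) to the standard error, which is why its z and p differ slightly from the Sobel result.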

Traditional Mediation Analysis via Normal Theory - Step by Step

Recall the diagram of the simple mediation model; it is the same regardless of the estimation method.
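
In equation form, and using the path labels that the code in this section adopts (a, b and c'), the simple mediation model can be sketched as two linear regressions. The intercept and error symbols are notation assumed here for illustration; in this example X is cond, M is pmi and Y is reaction.

```latex
\begin{aligned}
M &= i_M + aX + e_M \\
Y &= i_Y + c'X + bM + e_Y \\
\text{indirect effect} &= ab, \qquad
\text{direct effect} = c', \qquad
\text{total effect } c = c' + ab
\end{aligned}
```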

## This step-by-step code was written by Rodrigo Heldt

# Estimate path "a"
esta <- lm(pmi ~ cond, data = hayes4)
summary(esta)

## 
## Call:
## lm(formula = pmi ~ cond, data = hayes4)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.8534 -0.8534  0.1466  1.1231  1.6231 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   5.3769     0.1618  33.222   <2e-16 ***
## cond          0.4765     0.2357   2.022   0.0454 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.305 on 121 degrees of freedom
## Multiple R-squared:  0.03268,    Adjusted R-squared:  0.02468 
## F-statistic: 4.088 on 1 and 121 DF,  p-value: 0.0454

a <- summary(esta)$coefficients[2,1]
se.a <- summary(esta)$coefficients[2,2]

# Estimate paths "b" and "c'"
estc_b  <- lm(reaction ~ cond + pmi, data = hayes4)
summary(estc_b)

## 
## Call:
## lm(formula = reaction ~ cond + pmi, data = hayes4)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -3.07636 -1.06128 -0.06346  0.94573  2.94299 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.52687    0.54968   0.958    0.340    
## cond         0.25435    0.25582   0.994    0.322    
## pmi          0.50645    0.09705   5.219 7.66e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.393 on 120 degrees of freedom
## Multiple R-squared:  0.2059, Adjusted R-squared:  0.1927 
## F-statistic: 15.56 on 2 and 120 DF,  p-value: 9.83e-07

b <- summary(estc_b)$coefficients[3,1]
c.prime <- summary(estc_b)$coefficients[2,1] # direct effect c'; named c.prime to avoid masking base::c()

se.b <- summary(estc_b)$coefficients[3,2]

indirect.effect <- a*b
total.effect <- c.prime + indirect.effect

summary(lm(reaction~cond,data=hayes4)) # another way to obtain the total effect

## 
## Call:
## lm(formula = reaction ~ cond, data = hayes4)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.4957 -1.0000 -0.2457  1.2543  3.5000 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   3.2500     0.1906  17.052   <2e-16 ***
## cond          0.4957     0.2775   1.786   0.0766 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.537 on 121 degrees of freedom
## Multiple R-squared:  0.02568,    Adjusted R-squared:  0.01763 
## F-statistic:  3.19 on 1 and 121 DF,  p-value: 0.07661
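
The identity behind total.effect (total = direct + indirect) can be checked on simulated data. This is only a sketch: the data-generating values and variable names below are assumptions made for illustration and have nothing to do with the PMI data set.

```r
# Simulated data (an assumption for illustration; not the PMI data set)
set.seed(123)
n <- 200
x <- rbinom(n, 1, 0.5)             # binary treatment
m <- 0.5 * x + rnorm(n)            # mediator
y <- 0.3 * x + 0.6 * m + rnorm(n)  # outcome
sim <- data.frame(x, m, y)

a_hat  <- coef(lm(m ~ x, data = sim))["x"]   # path a
fit_y  <- lm(y ~ x + m, data = sim)
b_hat  <- coef(fit_y)["m"]                   # path b
cp_hat <- coef(fit_y)["x"]                   # direct effect c'
c_hat  <- coef(lm(y ~ x, data = sim))["x"]   # total effect c

# With OLS and the same cases in every model, c = c' + a*b holds exactly
all.equal(unname(c_hat), unname(cp_hat + a_hat * b_hat))
```

In the notebook's own numbers, 0.25435 + 0.4765 * 0.50645 is approximately 0.4957, the coefficient of cond in the total-effect regression above.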

### Test significance via normal theory

# Standard error of a*b (second-order formula; it adds the se.a^2*se.b^2 term to the first-order Sobel formula)
se.ab <- sqrt(a^2*se.b^2 + b^2*se.a^2 + se.a^2*se.b^2)

# z statistic
z <- a*b/se.ab  # z = 1.85...
p.value <- 2*pnorm(abs(z), lower.tail=FALSE)  # ...so p = 0.064

# Confidence interval (alpha = 0.05, z = 1.96)

## Option 1: use the tabulated value 1.96 directly.
# The drawback is that the critical value has to be typed in by hand
z.ci <- 1.96
ci <- z.ci*se.ab

ci.lower <- a*b - ci
ci.upper <- a*b + ci # the interval includes zero


## Option 2: compute the critical value from the confidence level.
# The advantage is that the code works automatically for any value of "conf"
conf <- .95
alphaL <- (1-conf)/2
alphaU <- 1- alphaL
ci.lower.manual <- a*b+qnorm(alphaL)*se.ab
ci.upper.manual <- a*b+qnorm(alphaU)*se.ab

## Bootstrap confidence interval

# function returning a*b for each bootstrap sample

boot.ab <- function(formula.a, formula.b, data, indices) {
  d <- data[indices,] # allows boot to select sample
  esta <- lm(formula.a, data=d)
  estc_b <- lm(formula.b, data=d)
  
  a <- coefficients(esta)[2]  
  b <- coefficients(estc_b)[3]  
  a*b                           

}


# Run the bootstrap and compute the confidence intervals
require(boot)
med.boot <- boot(data=hayes4, statistic=boot.ab,
                R=5000, formula.a = pmi ~ cond, formula.b = reaction ~ cond + pmi)


print(med.boot) # here "original" is the value of the target statistic in the original data.

## 
## ORDINARY NONPARAMETRIC BOOTSTRAP
## 
## 
## Call:
## boot(data = hayes4, statistic = boot.ab, R = 5000, formula.a = pmi ~ 
##     cond, formula.b = reaction ~ cond + pmi)
## 
## 
## Bootstrap Statistics :
##      original       bias    std. error
## t1* 0.2413355 -0.000668284   0.1287751

# "bias" is the difference between the mean of the statistic across bootstrap samples (mean(med.boot$t)) and "original",
# i.e., how far the mean of the bootstrap distribution is from the value of the statistic in the original data.

# Compute the confidence intervals. "Percentile" is the interval between the 0.025 and 0.975 quantiles.
med.boot.ci <- boot.ci(med.boot, index=1,conf=(.95), type= c("norm", "basic", "bca", "perc" ))
print (med.boot.ci)

## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 5000 bootstrap replicates
## 
## CALL : 
## boot.ci(boot.out = med.boot, conf = (0.95), type = c("norm", 
##     "basic", "bca", "perc"), index = 1)
## 
## Intervals : 
## Level      Normal              Basic         
## 95%   (-0.0104,  0.4944 )   (-0.0306,  0.4796 )  
## 
## Level     Percentile            BCa          
## 95%   ( 0.0031,  0.5133 )   ( 0.0141,  0.5237 )  
## Calculations and Intervals on Original Scale

plot(med.boot)

Later we will see that this approach is already very close to Hayes's.
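
To make the "Percentile" interval less of a black box, here is a minimal base-R sketch of the same idea without the boot package: resample rows with replacement, recompute a*b each time, and take the 2.5% and 97.5% quantiles. The simulated data and the helper name ab_stat are assumptions for illustration, and quantile() may interpolate slightly differently from boot.ci(), so the limits need not match boot's to the last decimal.

```r
# Simulated data (an assumption for illustration; not the PMI data set)
set.seed(2021)
n <- 150
x <- rbinom(n, 1, 0.5)
m <- 0.5 * x + rnorm(n)
y <- 0.2 * x + 0.5 * m + rnorm(n)
sim <- data.frame(x, m, y)

# Hypothetical helper: indirect effect a*b for one data set
ab_stat <- function(d) {
  a <- coef(lm(m ~ x, data = d))["x"]
  b <- coef(lm(y ~ x + m, data = d))["m"]
  unname(a * b)
}

R <- 2000
ab_boot <- replicate(R, {
  idx <- sample(nrow(sim), replace = TRUE)  # resample rows with replacement
  ab_stat(sim[idx, ])
})

ab_hat  <- ab_stat(sim)                                 # estimate in the full sample
perc_ci <- quantile(ab_boot, probs = c(0.025, 0.975))   # percentile interval
ab_hat
perc_ci
```

Because the sampling distribution of a product of coefficients is typically skewed, a percentile interval can exclude zero even when the symmetric normal-theory interval does not, which is the pattern seen in the output above.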

Confidence Interval Models - mediation Package

The nice thing about R is that you can estimate the same model in many different ways. There are functions for simple mediation, for example, that do not depend on Hayes's algorithm, so you can run “unbranded” analyses. Knowing how to do that makes you a better-equipped researcher.

require(mediation)
mediacao<-mediate(esta, estc_b, treat = "cond", mediator = "pmi",
                  boot = TRUE, sims = 5000)
summary(mediacao)

## 
## Causal Mediation Analysis 
## 
## Nonparametric Bootstrap Confidence Intervals with the Percentile Method
## 
##                Estimate 95% CI Lower 95% CI Upper p-value  
## ACME            0.24134      0.00886         0.52   0.043 *
## ADE             0.25435     -0.25982         0.76   0.322  
## Total Effect    0.49569     -0.03990         1.04   0.064 .
## Prop. Mediated  0.48687     -0.48863         2.74   0.087 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Sample Size Used: 123 
## 
## 
## Simulations: 5000

We are getting closer and closer to Hayes's result. Let's take one more step, with another estimation method.

Conditional Models - Structural Equation Modeling

Everything that PROCESS does can also be done with structural equation modeling. R's SEM package (lavaan) is extremely powerful, although, as you will see, its interface and syntax are not the friendliest. Even so, it beats proprietary, paid systems hands down, and everything happens in the same program where you run t tests, plots, and practically any other statistical test.

require(lavaan)
m.hayes4 <- ' 
            # direct effect
              reaction ~ c*cond
              direct := c

            # regressions
              pmi ~ a*cond
              reaction ~ b*pmi

            # indirect effect (a*b)
              indirect := a*b

            # total effect
              total := c + (a*b)
            '

sem2 <- sem(model = m.hayes4,
            data = hayes4,
            se = "bootstrap",
            bootstrap = 5000)
# fit measures
summary(sem2,
        fit.measures = TRUE,
        standardized = TRUE,
        rsquare = TRUE)

## lavaan 0.6-7 ended normally after 21 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of free parameters                          5
##                                                       
##   Number of observations                           123
##                                                       
## Model Test User Model:
##                                                       
##   Test statistic                                 0.000
##   Degrees of freedom                                 0
## 
## Model Test Baseline Model:
## 
##   Test statistic                                32.444
##   Degrees of freedom                                 3
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    1.000
##   Tucker-Lewis Index (TLI)                       1.000
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)               -420.028
##   Loglikelihood unrestricted model (H1)       -420.028
##                                                       
##   Akaike (AIC)                                 850.057
##   Bayesian (BIC)                               864.117
##   Sample-size adjusted Bayesian (BIC)          848.308
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.000
##   90 Percent confidence interval - lower         0.000
##   90 Percent confidence interval - upper         0.000
##   P-value RMSEA <= 0.05                             NA
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.000
## 
## Parameter Estimates:
## 
##   Standard errors                            Bootstrap
##   Number of requested bootstrap draws             5000
##   Number of successful bootstrap draws            5000
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   reaction ~                                                            
##     cond       (c)    0.254    0.258    0.986    0.324    0.254    0.082
##   pmi ~                                                                 
##     cond       (a)    0.477    0.232    2.050    0.040    0.477    0.181
##   reaction ~                                                            
##     pmi        (b)    0.506    0.081    6.224    0.000    0.506    0.432
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .reaction          1.893    0.193    9.829    0.000    1.893    0.794
##    .pmi               1.675    0.283    5.911    0.000    1.675    0.967
## 
## R-Square:
##                    Estimate
##     reaction          0.206
##     pmi               0.033
## 
## Defined Parameters:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##     direct            0.254    0.258    0.986    0.324    0.254    0.082
##     indirect          0.241    0.129    1.869    0.062    0.241    0.078
##     total             0.496    0.276    1.795    0.073    0.496    0.160

Hayes's Model 4 (Mediation)

Now, finally, we get to Hayes's mediation Model 4. The big advantage of this algorithm is its accessibility and simple syntax, especially for SPSS and SAS users. In addition, the output resembles the one used in the book, which makes learning easier.

Unfortunately, Hayes's algorithm is not (yet) a package, but a script. We therefore source the script, which must be in the same directory as the hayes4 data set, and then call the process() function it defines. Note that the script is VERY long and may take a while to run, depending on your machine. PROCESS can also produce very long output because of the bootstrap “progress” animation. There is not much we can do about that beyond passing progress = 0, as in the mediation calls in this notebook.

setwd("/home/lnicolao/Google Drive/Drive/UFRGS/Ensino/20211/ADP - Experimentos/Scripts Aulas")
source("process.R")

## 
## ***************** PROCESS for R Version 3.5.3 beta0.6 ***************** 
##  
##            Written by Andrew F. Hayes, Ph.D.  www.afhayes.com              
##    Documentation available in Hayes (2018). www.guilford.com/p/hayes3   
##  
## *********************************************************************** 
##  
## PROCESS is now ready for use.
## Copyright 2020 by Andrew F. Hayes ALL RIGHTS RESERVED
##  
## Distribution of this beta release of PROCESS is prohibited
## without written authorization from the copyright holder.

process (data = hayes4, y = "reaction", x = "cond", m ="pmi", model = 4, progress = 0)

## 
## ***************** PROCESS for R Version 3.5.3 beta0.6 ***************** 
##  
##            Written by Andrew F. Hayes, Ph.D.  www.afhayes.com              
##    Documentation available in Hayes (2018). www.guilford.com/p/hayes3   
##  
## *********************************************************************** 
##                 
## Model : 4       
##     Y : reaction
##     X : cond    
##     M : pmi     
## 
## Sample size: 123
## 
## Random seed: 826684
## 
## 
## *********************************************************************** 
## Outcome Variable: pmi
## 
## Model Summary: 
##           R      R-sq       MSE         F       df1       df2         p
##      0.1808    0.0327    1.7026    4.0878    1.0000  121.0000    0.0454
## 
## Model: 
##              coeff        se         t         p      LLCI      ULCI
## constant    5.3769    0.1618   33.2222    0.0000    5.0565    5.6973
## cond        0.4765    0.2357    2.0218    0.0454    0.0099    0.9431
## 
## *********************************************************************** 
## Outcome Variable: reaction
## 
## Model Summary: 
##           R      R-sq       MSE         F       df1       df2         p
##      0.4538    0.2059    1.9404   15.5571    2.0000  120.0000    0.0000
## 
## Model: 
##              coeff        se         t         p      LLCI      ULCI
## constant    0.5269    0.5497    0.9585    0.3397   -0.5615    1.6152
## cond        0.2544    0.2558    0.9943    0.3221   -0.2522    0.7609
## pmi         0.5064    0.0970    5.2185    0.0000    0.3143    0.6986
## 
## *********************************************************************** 
## Bootstrapping in progress. Please wait.
## 
## **************** DIRECT AND INDIRECT EFFECTS OF X ON Y ****************
## 
## Direct effect of X on Y:
##      effect        se         t         p      LLCI      ULCI
##      0.2544    0.2558    0.9943    0.3221   -0.2522    0.7609
## 
## Indirect effect(s) of X on Y:
##        Effect    BootSE  BootLLCI  BootULCI
## pmi    0.2413    0.1314    0.0062    0.5253
## 
## ******************** ANALYSIS NOTES AND ERRORS ************************ 
## 
## Level of confidence for all confidence intervals in output: 95
## 
## Number of bootstraps for percentile bootstrap confidence intervals: 5000

Moderation

We will have a specific module (Data Analysis 3) on moderation. For now, I will replicate the first example from Hayes's moderation chapter (Chapter 7) so that we can see how this works in R.

# importing the data
setwd("/home/lnicolao/Google Drive/Drive/UFRGS/Ensino/20211/ADP - Experimentos/Scripts Aulas")
hayes7<-read.csv("protest.csv", header = T)
hayes11 <- hayes7 # keep a copy because I will use it later
head(hayes7)

About the data:

“In this study, 129 participants, all of whom were female, received a written account of the fate of a female attorney (Catherine) who lost a promotion to a less qualified male as a result of discriminatory actions of the senior partners. After reading this story, the participants were given a description of how Catherine responded to this act of sexual discrimination. Those randomly assigned to the no protest condition (coded PROTEST = 0 in the data file) learned that though very disappointed by the decision, Catherine decided not to take any action against this discrimination and continued working at the firm. The remainder of the participants, assigned to a protest condition (coded PROTEST = 1 in the data file), were told that Catherine approached the partners with the request that they reconsider the decision, while giving various explanations as to why the decision was unfair. Following this procedure, the participants were asked to respond to six questions evaluating Catherine (e.g., “Catherine has many positive traits,” “Catherine is the type of person I would like to be friends with”). Their responses were aggregated into a measure of liking, such that participants with higher scores liked her relatively more (LIKING in the data file). In addition to this measure of liking, each participant was scored on the Modern Sexism Scale, used to measure how pervasive a person believes sex discrimination is in society. The higher a person’s score, the more pervasive he or she believes sex discrimination is in society (SEXISM in the data file).”

Because the interaction is between a dichotomous variable and a (presumably) continuous one, neither a bar chart nor a table of means makes sense. Instead, we plot regression lines.

# for these analyses, Hayes collapsed the two protest conditions into one.
hayes7$protest[hayes7$protest==2]<-1

describeBy(hayes7$liking, hayes7$protest, mat = T, digits = 3)

# the data match the book!
hayes7$protest.f <- factor (hayes7$protest, levels = c(0,1), labels = c("No Protest", "Protest"))

ggplot(hayes7) +
  aes(x = sexism, y = liking, color = protest.f) +
  geom_smooth(method = "lm", se = FALSE) +
  # zoom to 4-6 without dropping observations from the fit (xlim() would discard them)
  coord_cartesian(xlim = c(4, 6)) +
  # the "+" above attaches labs() to the plot instead of printing it as a separate object
  labs (title = "Liking per Protest Condition and Sexism", x = "Sexism", 
        y = "Liking", color = "Condições")

Testing the interaction with an ordinary regression.

reg.raw<-lm(liking ~ protest*sexism, data = hayes7)
summary(reg.raw)

## 
## Call:
## lm(formula = liking ~ protest * sexism, data = hayes7)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.9894 -0.6381  0.0478  0.7404  2.3650 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      7.7062     1.0449   7.375 1.99e-11 ***
## protest         -3.7727     1.2541  -3.008  0.00318 ** 
## sexism          -0.4725     0.2038  -2.318  0.02205 *  
## protest:sexism   0.8336     0.2436   3.422  0.00084 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9888 on 125 degrees of freedom
## Multiple R-squared:  0.1335, Adjusted R-squared:  0.1127 
## F-statistic: 6.419 on 3 and 125 DF,  p-value: 0.0004439

Although it is not required, let's mean-center the predictors to help with the interpretability of the model. This is made clear in Hayes's Chapter 7 and will become even clearer when we discuss spotlight and floodlight analyses in the next class, on moderation.

hayes7$protest.mc<-hayes7$protest-mean(hayes7$protest,na.rm = T)
hayes7$sexism.mc<-hayes7$sexism-mean(hayes7$sexism,na.rm = T)

Re-estimating the model with mean-centered predictors. There is no need to center the dependent variable or variables that are not part of the interaction (such as covariates).

reg.mc <- lm(liking ~ protest.mc*sexism.mc, data = hayes7)
summary(reg.mc)

## 
## Call:
## lm(formula = liking ~ protest.mc * sexism.mc, data = hayes7)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.9894 -0.6381  0.0478  0.7404  2.3650 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           5.62456    0.08713  64.553  < 2e-16 ***
## protest.mc            0.49262    0.18722   2.631  0.00958 ** 
## sexism.mc             0.09613    0.11169   0.861  0.39102    
## protest.mc:sexism.mc  0.83355    0.24356   3.422  0.00084 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9888 on 125 degrees of freedom
## Multiple R-squared:  0.1335, Adjusted R-squared:  0.1127 
## F-statistic: 6.419 on 3 and 125 DF,  p-value: 0.0004439
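
Comparing the two outputs, the interaction coefficient, residual standard error and R-squared are identical; only the intercept and the lower-order coefficients change. The sketch below illustrates why on simulated data (the data-generating values and names are assumptions for illustration, not the protest data): after centering, the coefficient of the focal predictor is simply its conditional effect at the mean of the moderator.

```r
# Simulated data (an assumption for illustration; not the protest data set)
set.seed(7)
n <- 129
x <- rbinom(n, 1, 0.6)             # dichotomous focal predictor
w <- rnorm(n, mean = 5, sd = 0.8)  # continuous moderator
y <- 7 - 3 * x - 0.4 * w + 0.8 * x * w + rnorm(n)
sim <- data.frame(x, w, y)
sim$x_mc <- sim$x - mean(sim$x)    # mean-centered copies
sim$w_mc <- sim$w - mean(sim$w)

raw <- coef(lm(y ~ x * w, data = sim))
mc  <- coef(lm(y ~ x_mc * w_mc, data = sim))

# The interaction coefficient does not change with centering
all.equal(unname(raw["x:w"]), unname(mc["x_mc:w_mc"]))

# The centered coefficient of x equals the raw conditional effect of x at mean(w)
all.equal(unname(mc["x_mc"]), unname(raw["x"] + raw["x:w"] * mean(sim$w)))
```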

Estimating the same model with PROCESS. The only reason to do this is that it is a friendlier way to get confidence intervals for the model coefficients, rather than just p-values. It also has the advantage of mean-centering the model components without having to create new variables.

process(hayes7, y = "liking", x = "protest", w = "sexism", model = 1)

## 
## ***************** PROCESS for R Version 3.5.3 beta0.6 ***************** 
##  
##            Written by Andrew F. Hayes, Ph.D.  www.afhayes.com              
##    Documentation available in Hayes (2018). www.guilford.com/p/hayes3   
##  
## *********************************************************************** 
##                
## Model : 1      
##     Y : liking 
##     X : protest
##     W : sexism 
## 
## Sample size: 129
## 
## 
## *********************************************************************** 
## Outcome Variable: liking
## 
## Model Summary: 
##           R      R-sq       MSE         F       df1       df2         p
##      0.3654    0.1335    0.9777    6.4190    3.0000  125.0000    0.0004
## 
## Model: 
##              coeff        se         t         p      LLCI      ULCI
## constant    7.7062    1.0449    7.3750    0.0000    5.6382    9.7743
## protest    -3.7727    1.2541   -3.0084    0.0032   -6.2546   -1.2907
## sexism     -0.4725    0.2038   -2.3184    0.0220   -0.8758   -0.0692
## Int_1       0.8336    0.2436    3.4224    0.0008    0.3515    1.3156
## 
## Product terms key:
## Int_1  :  protest  x  sexism      
## 
## Test(s) of highest order unconditional interaction(s):
##       R2-chng         F       df1       df2         p
## X*W    0.0812   11.7126    1.0000  125.0000    0.0008
## ----------
## Focal predictor: protest (X)
##       Moderator: sexism (W)
## 
## Conditional effects of the focal predictor at values of the moderator(s):
##      sexism    effect        se         t         p      LLCI      ULCI
##      4.2500   -0.2301    0.2775   -0.8291    0.4086   -0.7792    0.3191
##      5.1200    0.4951    0.1872    2.6443    0.0092    0.1245    0.8657
##      5.8960    1.1420    0.2710    4.2141    0.0000    0.6056    1.6783
## 
## ******************** ANALYSIS NOTES AND ERRORS ************************ 
## 
## Level of confidence for all confidence intervals in output: 95
## 
## W values in conditional tables are the 16th, 50th, and 84th percentiles.
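
The "Conditional effects" table can be reproduced from the regression itself: the effect of X at a moderator value W is b_X + b_XW * W, for example -3.7727 + 0.8336 * 4.25, which is approximately -0.23. The sketch below shows one way to get these effects and their standard errors from an lm fit. The helper conditional_effect is hypothetical (it is not part of PROCESS), the data are simulated for illustration, and quantile() may pick slightly different percentile values than PROCESS does.

```r
# Hypothetical helper: effect of x at moderator value(s) w0 in lm(y ~ x * w)
conditional_effect <- function(fit, x, w, w0) {
  int <- paste0(x, ":", w)          # name of the product term in coef(fit)
  b <- coef(fit)
  V <- vcov(fit)
  est <- b[x] + b[int] * w0
  # Var(b_x + w0 * b_int) = Var(b_x) + w0^2 * Var(b_int) + 2 * w0 * Cov(b_x, b_int)
  se <- sqrt(V[x, x] + w0^2 * V[int, int] + 2 * w0 * V[x, int])
  data.frame(w = w0, effect = unname(est), se = unname(se))
}

# Simulated data (an assumption for illustration; not the protest data set)
set.seed(11)
n <- 129
sim <- data.frame(x = rbinom(n, 1, 0.6), w = rnorm(n, 5, 0.8))
sim$y <- 7 - 3 * sim$x - 0.4 * sim$w + 0.8 * sim$x * sim$w + rnorm(n)
fit <- lm(y ~ x * w, data = sim)

w_probe <- unname(quantile(sim$w, c(0.16, 0.50, 0.84)))  # 16th, 50th, 84th percentiles
ce <- conditional_effect(fit, "x", "w", w_probe)
ce
```

An equivalent route is to subtract the probe value from the moderator and refit; the coefficient of x in that model is the conditional effect at that value.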

Re-estimating the model, now with mean centering.

process(hayes7, y = "liking", x = "protest", w = "sexism", model = 1, center = 1)

## 
## ***************** PROCESS for R Version 3.5.3 beta0.6 ***************** 
##  
##            Written by Andrew F. Hayes, Ph.D.  www.afhayes.com              
##    Documentation available in Hayes (2018). www.guilford.com/p/hayes3   
##  
## *********************************************************************** 
##                
## Model : 1      
##     Y : liking 
##     X : protest
##     W : sexism 
## 
## Sample size: 129
## 
## 
## *********************************************************************** 
## Outcome Variable: liking
## 
## Model Summary: 
##           R      R-sq       MSE         F       df1       df2         p
##      0.3654    0.1335    0.9777    6.4190    3.0000  125.0000    0.0004
## 
## Model: 
##              coeff        se         t         p      LLCI      ULCI
## constant    5.6246    0.0871   64.5535    0.0000    5.4521    5.7970
## protest     0.4926    0.1872    2.6312    0.0096    0.1221    0.8632
## sexism      0.0961    0.1117    0.8608    0.3910   -0.1249    0.3172
## Int_1       0.8336    0.2436    3.4224    0.0008    0.3515    1.3156
## 
## Product terms key:
## Int_1  :  protest  x  sexism      
## 
## Test(s) of highest order unconditional interaction(s):
##       R2-chng         F       df1       df2         p
## X*W    0.0812   11.7126    1.0000  125.0000    0.0008
## ----------
## Focal predictor: protest (X)
##       Moderator: sexism (W)
## 
## Conditional effects of the focal predictor at values of the moderator(s):
##      sexism    effect        se         t         p      LLCI      ULCI
##     -0.8670   -0.2301    0.2775   -0.8291    0.4086   -0.7792    0.3191
##      0.0030    0.4951    0.1872    2.6443    0.0092    0.1245    0.8657
##      0.7790    1.1420    0.2710    4.2141    0.0000    0.6056    1.6783
## 
## ******************** ANALYSIS NOTES AND ERRORS ************************ 
## 
## Level of confidence for all confidence intervals in output: 95
## 
## W values in conditional tables are the 16th, 50th, and 84th percentiles.
##  
## NOTE: The following variables were mean centered prior to analysis: 
##          sexism protest

Moderated Mediation

In Chapter 11, Hayes uses this same data set to test a moderated mediation. This is his explanation of the data.

“In Chapter 7, I introduced a study by Garcia et al. (2010) in which participants evaluated an attorney (Catherine) subjected to sexual discrimination in her law firm. Recall that some of the participants in this study were told that following news that she lost a promotion to a less qualified male attorney at the firm, Catherine protested the decision by approaching the partners and making a case as to why it was unfair and that she should have been promoted. But some participants were told that she chose not to protest and, instead, accepted the decision, kept quiet, and went about her job at the firm. Of interest in that study was how Catherine would be perceived as a function of her decision to protest the discrimination rather than remaining silent. It turned out that different people perceived her differently, depending on their beliefs about the pervasiveness of sex discrimination in society. In this section, I expand on these findings as well as the discussion of conditional process analysis in Chapter 10 by introducing a new variable to the model that Garcia et al. (2010) proposed as a potential mediator of the effect of Catherine’s behavior on how she was evaluated. After reading the vignette describing Catherine’s fate and how she responded, the participants were asked a series of questions, the responses to which were aggregated into a measure of perceived appropriateness of the response (RESPAPPR in the PROTEST data file, with higher scores reflecting a perception that her response was relatively more appropriate). Garcia et al. (2010) argued that how Catherine is evaluated would depend on the extent to which her action was deemed appropriate for the circumstance. More specifically, insofar as protesting was perceived as an appropriate response to an apparent act of sex discrimination, she would be perceived more favorably.
Of course, not everyone is expected to perceive her decision to protest as appropriate, but, on average, the expectation was that protesting would be seen as a more appropriate response than accepting the decision and remaining silent. This line of reasoning argues that perceived appropriateness of her response functions as mediator of the effect of her chosen course of action on how she was perceived.”

This is the proposed model (Model 8): “Example with Model 8”

I would like to write that we have always had a way to test this model in R. The truth is yes and no. Yes, of course: since Hayes's models can be expressed as structural equation models, we could set up the system of equations that estimates this model in the SEM package lavaan. However, that is not so simple. Estimating the simple mediation model (equivalent to Hayes's Model 4) was already complicated, and this model would be even more so. Until the end of last year, we had two unofficial packages (processR and processr) plus a number of packages that covered only some of the models, all with different syntaxes and their own quirks. In early 2021, Hayes released PROCESS for R. “Rejoice”… at least life is easier now. Let's test Model 8 below.

hayes11 <- hayes7
process(hayes11, y = "liking", x = "protest", m = "respappr", w = "sexism", 
        model = 8, boot=10000, progress = 0)

## 
## ***************** PROCESS for R Version 3.5.3 beta0.6 ***************** 
##  
##            Written by Andrew F. Hayes, Ph.D.  www.afhayes.com              
##    Documentation available in Hayes (2018). www.guilford.com/p/hayes3   
##  
## *********************************************************************** 
##                 
## Model : 8       
##     Y : liking  
##     X : protest 
##     M : respappr
##     W : sexism  
## 
## Sample size: 129
## 
## Random seed: 425247
## 
## 
## *********************************************************************** 
## Outcome Variable: respappr
## 
## Model Summary: 
##           R      R-sq       MSE         F       df1       df2         p
##      0.5442    0.2962    1.3098   17.5340    3.0000  125.0000    0.0000
## 
## Model: 
##              coeff        se         t         p      LLCI      ULCI
## constant    6.5667    1.2095    5.4295    0.0000    4.1730    8.9604
## protest    -2.6866    1.4515   -1.8509    0.0665   -5.5594    0.1861
## sexism     -0.5290    0.2359   -2.2426    0.0267   -0.9959   -0.0622
## Int_1       0.8100    0.2819    2.8732    0.0048    0.2520    1.3679
## 
## Product terms key:
## Int_1  :  protest  x  sexism      
## 
## Test(s) of highest order unconditional interaction(s):
##       R2-chng         F       df1       df2         p
## X*W    0.0465    8.2551    1.0000  125.0000    0.0048
## ----------
## Focal predictor: protest (X)
##       Moderator: sexism (W)
## 
## Conditional effects of the focal predictor at values of the moderator(s):
##      sexism    effect        se         t         p      LLCI      ULCI
##      4.2500    0.7558    0.3212    2.3533    0.0202    0.1202    1.3914
##      5.1200    1.4605    0.2167    6.7386    0.0000    1.0315    1.8894
##      5.8960    2.0890    0.3137    6.6601    0.0000    1.4682    2.7098
## 
## *********************************************************************** 
## Outcome Variable: liking
## 
## Model Summary: 
##           R      R-sq       MSE         F       df1       df2         p
##      0.5323    0.2833    0.8152   12.2551    4.0000  124.0000    0.0000
## 
## Model: 
##              coeff        se         t         p      LLCI      ULCI
## constant    5.3471    1.0607    5.0412    0.0000    3.2477    7.4465
## protest    -2.8075    1.1607   -2.4188    0.0170   -5.1047   -0.5102
## respappr    0.3593    0.0706    5.0916    0.0000    0.2196    0.4989
## sexism     -0.2824    0.1898   -1.4882    0.1392   -0.6581    0.0932
## Int_1       0.5426    0.2296    2.3628    0.0197    0.0881    0.9970
## 
## Product terms key:
## Int_1  :  protest  x  sexism      
## 
## Test(s) of highest order unconditional interaction(s):
##       R2-chng         F       df1       df2         p
## X*W    0.0323    5.5830    1.0000  124.0000    0.0197
## ----------
## Focal predictor: protest (X)
##       Moderator: sexism (W)
## 
## Conditional effects of the focal predictor at values of the moderator(s):
##      sexism    effect        se         t         p      LLCI      ULCI
##      4.2500   -0.5016    0.2589   -1.9373    0.0550   -1.0140    0.0109
##      5.1200   -0.0296    0.1996   -0.1480    0.8826   -0.4247    0.3656
##      5.8960    0.3915    0.2880    1.3592    0.1766   -0.1786    0.9615
## 
## *********************************************************************** 
## Bootstrapping in progress. Please wait.
## 
## **************** DIRECT AND INDIRECT EFFECTS OF X ON Y ****************
## 
## Conditional direct effect(s) of X on Y:
##      sexism    effect        se         t         p      LLCI      ULCI
##      4.2500   -0.5016    0.2589   -1.9373    0.0550   -1.0140    0.0109
##      5.1200   -0.0296    0.1996   -0.1480    0.8826   -0.4247    0.3656
##      5.8960    0.3915    0.2880    1.3592    0.1766   -0.1786    0.9615
## 
## Conditional indirect effects of X on Y:
## 
## INDIRECT EFFECT:
## 
## protest    ->    respappr    ->    liking
## 
##      sexism    Effect    BootSE  BootLLCI  BootULCI
##      4.2500    0.2715    0.1355    0.0302    0.5657
##      5.1200    0.5247    0.1312    0.2800    0.7997
##      5.8960    0.7505    0.2019    0.3700    1.1673
## 
##      Index of moderated mediation:
##            Index    BootSE  BootLLCI  BootULCI
## sexism    0.2910    0.1387    0.0303    0.5738
## 
## ---
## 
## ******************** ANALYSIS NOTES AND ERRORS ************************ 
## 
## Level of confidence for all confidence intervals in output: 95
## 
## Number of bootstraps for percentile bootstrap confidence intervals: 10000
## 
## W values in conditional tables are the 16th, 50th, and 84th percentiles.
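
The conditional indirect effects and the index of moderated mediation in the output above can be checked by hand. In Model 8 the indirect effect of X at a given value of W is (a1 + a3·W)·b, and the index of moderated mediation is a3·b. The sketch below simply plugs in the rounded coefficients printed by PROCESS; the names `a1`, `a3`, `b`, and `w` are my own labels, not objects returned by `process()`. Because the inputs are rounded to four decimals, the results should agree with the table only up to small differences in the last decimal place.

```r
# Coefficients copied (rounded) from the PROCESS output above
a1 <- -2.6866  # protest -> respappr
a3 <-  0.8100  # protest x sexism -> respappr (Int_1 in the first model)
b  <-  0.3593  # respappr -> liking

# Values of sexism probed by PROCESS (16th, 50th, 84th percentiles)
w <- c(4.2500, 5.1200, 5.8960)

# Conditional effect of protest on respappr at each value of sexism
effect_on_m <- a1 + a3 * w

# Conditional indirect effect of protest on liking through respappr
indirect <- effect_on_m * b

# Index of moderated mediation: change in the indirect effect
# for a one-unit increase in sexism
imm <- a3 * b

print(round(data.frame(sexism = w, effect_on_m, indirect), 4))
print(round(imm, 4))
```

Note that this only reproduces the point estimates. The bootstrap standard errors and confidence intervals require resampling the data, which is what the `boot = 10000` argument asks PROCESS to do.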

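Hayes's version of PROCESS for R is still in beta, and some functions are not available. For example, Hayes estimates the effects by quantiles, but that function is not available; consequently, it reverts to the default of the mean, -1 SD, and +1 SD for the indirect effects. Note, however, that the analysis notes in the output above state that the W values in the conditional tables are the 16th, 50th, and 84th percentiles, so it is worth checking which values your version of PROCESS actually used.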
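
For comparison, this is roughly what the lavaan route mentioned earlier would look like for Model 8. It is a minimal sketch under stated assumptions, not a replacement for the PROCESS analysis: the data are simulated (the data frame `sim` and the product column `protest_x_sexism` are hypothetical names of mine), the generating coefficients are only loosely based on the output above, and the number of bootstrap draws is kept small so that it runs quickly.

```r
library(lavaan)

# Simulated stand-in for the PROTEST data (an assumption, not the real file)
set.seed(1)
n <- 129
sim <- data.frame(
  protest = rbinom(n, 1, 0.5),
  sexism  = rnorm(n, mean = 5, sd = 0.8)
)
# lavaan has no interaction operator for observed variables here,
# so the product term is built by hand
sim$protest_x_sexism <- sim$protest * sim$sexism
sim$respappr <- 6.5 - 2.7 * sim$protest - 0.5 * sim$sexism +
  0.8 * sim$protest_x_sexism + rnorm(n)
sim$liking <- 5.3 - 2.8 * sim$protest - 0.3 * sim$sexism +
  0.5 * sim$protest_x_sexism + 0.36 * sim$respappr + rnorm(n)

model8 <- '
  # mediator model: W moderates the X -> M path
  respappr ~ a1 * protest + a2 * sexism + a3 * protest_x_sexism
  # outcome model: W also moderates the direct X -> Y path
  liking ~ c1 * protest + c2 * sexism + c3 * protest_x_sexism + b * respappr
  # index of moderated mediation
  imm := a3 * b
  # conditional indirect effect at sexism = 5 (an arbitrary illustrative value)
  ind_w5 := (a1 + a3 * 5) * b
'

fit <- sem(model8, data = sim, se = "bootstrap", bootstrap = 500)

# Percentile bootstrap intervals for the defined parameters
pe <- parameterEstimates(fit, boot.ci.type = "perc")
print(pe[pe$op == ":=", c("lhs", "est", "se", "ci.lower", "ci.upper")])
```

Every conditional effect has to be written out as its own `:=` line, which is part of why this approach gets tedious as the models grow.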

Experimental Methods: Data Analysis - Part 2

Leonardo Nicolao