For today’s lab, we’ll be using the data set called disagreeableness.csv on Canvas.

Example: Let’s say a researcher is interested in whether people’s relationship statuses are related to how disagreeable they are. The researcher asks 162 participants whether they are single (1), dating (2), or seriously dating (3) and also measures their disagreeableness (sample item: “I tend to find fault with others.”).

One of the researcher’s hypotheses that they would like to test is that people who are single will score significantly differently on disagreeableness compared to people who are not single.

Import the data

data <- import("disagreeableness.csv")

Data Cleaning

First, let’s inspect each variable’s measure type & correct any, if needed.

str(data)

## 'data.frame':    162 obs. of  3 variables:
##  $ ID                : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ RelationshipStatus: int  3 3 2 3 3 3 2 2 2 2 ...
##  $ Disagreeableness  : int  5 4 7 6 5 6 6 5 5 5 ...

ID & RelationshipStatus should be transformed into factors

data <- data %>%
  mutate(ID = as.factor(ID),
         RelationshipStatus = factor(RelationshipStatus, labels = c("Single", "Dating", "Seriously Dating")))

We can use descriptive statistics to check for any data entry errors.

describe(data)

##                     vars   n  mean    sd median trimmed   mad min max range
## ID*                    1 162 81.50 46.91   81.5   81.50 60.05   1 162   161
## RelationshipStatus*    2 162  1.88  0.88    2.0    1.85  1.48   1   3     2
## Disagreeableness       3 162  5.29  1.18    6.0    5.38  1.48   1   7     6
##                      skew kurtosis   se
## ID*                  0.00    -1.22 3.69
## RelationshipStatus*  0.24    -1.68 0.07
## Disagreeableness    -0.93     1.10 0.09

tab1(data$RelationshipStatus)

## data$RelationshipStatus : 
##                  Frequency Percent Cum. percent
## Single                  74    45.7         45.7
## Dating                  34    21.0         66.7
## Seriously Dating        54    33.3        100.0
##   Total                162   100.0        100.0

hist(data$Disagreeableness)

Everything looks good! The data is also already set up the way we need it to be for the analysis (the IV and DV should each be their own columns).

Descriptive Statistics

Let’s start with descriptively examining the M and SD on disagreeableness across the three levels of RelationshipStatus.

descriptives <- data %>%
  group_by(RelationshipStatus) %>%
  summarize(M = mean(Disagreeableness, na.rm = TRUE),
            SD = sd(Disagreeableness, na.rm = TRUE))

# Making the table look a little nicer
descriptives %>%
  knitr::kable(digits = 5, col.names = c("Relationship Status", "Mean", "SD"), caption = "Descriptive Statistics for Disagreeableness Across Each Relationship Status", format.args = list(nsmall = 2))

Descriptive Statistics for Disagreeableness Across Each Relationship Status
Relationship Status	Mean	SD
Single	5.16216	1.23895
Dating	5.70588	0.87141
Seriously Dating	5.20370	1.23441

# And we can plot the descriptives
ggplot(descriptives, aes(x = RelationshipStatus, y = M, fill = RelationshipStatus)) +
  geom_bar(stat = "identity") +
  ggtitle("Average Disagreeableness Across Relationship Statuses") +
  labs(x = "Relationship Status",
       y = "Mean Disagreeableness") +
  theme(plot.title = element_text(hjust = 0.5))

Coding the Categorical Predictor

Recall that the number of codes that need to be included in the model is equal to m-1, or the number of levels of the categorical predictor minus 1. In this case, RelationshipStatus has three levels, thus, we need to construct 2 codes to represent it in the model. We will go with the recommended contrast coding method.

Recall that the researcher had a hypothesis that people who are single will score significantly differently on disagreeableness compared to people who are not single. To test this hypothesis, let’s use contrast codes that compare the mean of the Single condition to the average across the Dating and Seriously Dating conditions. For instance, we could use:

RelCode1: Single = 2/3, Dating = -1/3, Seriously Dating = -1/3

Then, we just need to construct RelCode2 so that it meets the rules of contrast coding in combination with RelCode1. One option would be:

RelCode2 = Single = 0, Dating = 1/2, Seriously Dating = -1/2

RelCode1 <- c(2/3, -1/3, -1/3)
RelCode2 <- c(0, 1/2, -1/2)

contrasts(data$RelationshipStatus) <- cbind(RelCode1, RelCode2)

contrasts(data$RelationshipStatus)

##                    RelCode1 RelCode2
## Single            0.6666667      0.0
## Dating           -0.3333333      0.5
## Seriously Dating -0.3333333     -0.5

Note that RelCode1 makes the comparison that the researcher would like to test. Knowing this, we can construct our desired model comparison.

Testing a Single Predictor

The model corresponding to the null hypothesis, Model C, is:

\[Model C: Disagreeableness_i = \beta_0 + \beta_2*RelCode2 + \epsilon_i\]

The model corresponding to the alternative hypothesis, Model A, is: \[Model A: Disagreeableness_i = \beta_0 + \beta_1*RelCode1 + \beta_2*RelCode2 + \epsilon_i\]

Hypotheses

The null hypothesis is that the model parameter corresponding to RelCode1 will be equal to zero.

\[H_0: \beta_1\ = 0\] The alternative hypothesis is that the model parameter corresponding to RelCode1 will not be equal to zero.

\[H_1: \beta_1\ \neq 0\]

Fit the Model

Let’s fit the model predicting Disagreebleness from RelationshipStatus, which we have just made custom contrast codes for.

model <- lm(Disagreeableness ~ RelationshipStatus, data = data)

Check whether model assumptions were met

Checking whether errors are normally distributed

augment_model <- augment(model)
augment_model

## # A tibble: 162 × 8
##    Disagreeableness RelationshipStatus .fitted .resid   .hat .sigma  .cooksd
##               <int> <fct>                <dbl>  <dbl>  <dbl>  <dbl>    <dbl>
##  1                5 Seriously Dating      5.20 -0.204 0.0185   1.17 0.000194
##  2                4 Seriously Dating      5.20 -1.20  0.0185   1.17 0.00678 
##  3                7 Dating                5.71  1.29  0.0294   1.17 0.0127  
##  4                6 Seriously Dating      5.20  0.796 0.0185   1.17 0.00297 
##  5                5 Seriously Dating      5.20 -0.204 0.0185   1.17 0.000194
##  6                6 Seriously Dating      5.20  0.796 0.0185   1.17 0.00297 
##  7                6 Dating                5.71  0.294 0.0294   1.17 0.000657
##  8                5 Dating                5.71 -0.706 0.0294   1.17 0.00378 
##  9                5 Dating                5.71 -0.706 0.0294   1.17 0.00378 
## 10                5 Dating                5.71 -0.706 0.0294   1.17 0.00378 
## # ℹ 152 more rows
## # ℹ 1 more variable: .std.resid <dbl>

ggplot(data = augment_model, aes(x = .resid)) + 
  geom_density(fill = "purple") + 
  stat_function(linetype = 2, 
                fun = dnorm, 
                args = list(mean = mean(augment_model$.resid), 
                                sd   =   sd(augment_model$.resid)))

The residuals are somewhat negatively skewed, but recall that this is a robust assumption.

ggplot(model) +
  geom_abline() + 
  stat_qq(aes(sample = .stdresid))

The QQ-plot also suggests some degree of violation of the normality assumption, but we will proceed since this is a robust assumption.

Checking independence of errors

data$ID <- c(1:nrow(data))

augment_model$ID <- data$ID

ggplot(data = augment_model, aes(x = ID, y = .resid)) + 
  geom_point() +  
  geom_smooth(se = F) +
  geom_hline(yintercept = 0)

## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'

Good - there doesn’t look to be a systematic pattern between ID number of the model’s errors.

Checking homogeneity of variances

plot(model, 1)

Remember that we are looking to see that the amount of spread in the data at each level of condition is approximately the same. If you can’t tell visually, you can also run Levene’s test.

leveneTest(Disagreeableness ~ RelationshipStatus, data = data)

## Levene's Test for Homogeneity of Variance (center = median)
##        Df F value  Pr(>F)  
## group   2  2.7548 0.06666 .
##       159                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Good - levene’s test for homogeneity of variances is non-significant (suggesting the variability in scores is not significantly different across conditions).

Interpreting the Model Output

First, let’s pass the model to the summary() function.

summary(model)

## 
## Call:
## lm(formula = Disagreeableness ~ RelationshipStatus, data = data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.2037 -0.7059  0.2941  0.8378  1.8378 
## 
## Coefficients:
##                            Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                 5.35725    0.09672  55.389   <2e-16 ***
## RelationshipStatusRelCode1 -0.29263    0.18691  -1.566   0.1194    
## RelationshipStatusRelCode2  0.50218    0.25628   1.960   0.0518 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.171 on 159 degrees of freedom
## Multiple R-squared:  0.03324,    Adjusted R-squared:  0.02108 
## F-statistic: 2.734 on 2 and 159 DF,  p-value: 0.06803

Question: Can you interpret the meaning of each parameter estimate in the context of the current research scenario?

Estimate
- b0 = 5.36
  - Interpretation:
- b1 = -0.29
  - Interpretation:
- b2 = 0.50
  - Interpretation:

Question: Is the parameter estimate corresponding to a test of the difference between the mean of the Single condition and the average of the Dating and Seriously Dating conditions significant or non-significant?

Testing Multiple Predictors

We might also be interested in whether RelationshipStatus overall mattered for predicting scores on Disagreeableness. In other words, we could also wish to examine the following model comparison:

\[Model C: Disagreeableness_i = \beta_0 + \epsilon_i\]

\[Model A: Disagreeableness_i = \beta_0 + \beta_1*RelCode1 + \beta_2*RelCode2 + \epsilon_i\]

Null Hypothesis

The null hypothesis would be that \(\beta_1\) and \(\beta_2\) are both equal to zero.

\[H_0: \beta_1\ = \beta_2\ = 0\]

We can test this model comparison by looking at our model output using the anova() function:

anova(model)

## Analysis of Variance Table
## 
## Response: Disagreeableness
##                     Df  Sum Sq Mean Sq F value  Pr(>F)  
## RelationshipStatus   2   7.492  3.7460  2.7338 0.06803 .
## Residuals          159 217.872  1.3703                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Overall, relationship status did not significantly predict people’s disagreeableness, F(2, 159) = 2.73, p = .068.

Effect Size & Confidence Intervals

Effect Size

Remember that the significance of the findings does not speak to the size of the findings. Effect sizes are important to report to give the audience an idea of the practical importance of the effect or relationship being discussed, as well as for future researchers who may want to include your study in a meta-analysis.

# Eta-Squared for relationship status
etaSquared(model)

##                        eta.sq eta.sq.part
## RelationshipStatus 0.03324424  0.03324424

Relationship status accounted for only 3.32% of the variability in people’s disagreeableness scored.

# Cohen's d for specific group comparisons
means <- emmeans(model, ~RelationshipStatus)

eff_size(means, sigma = sigma(model), edf = df.residual(model))

##  contrast                  effect.size    SE  df lower.CL upper.CL
##  Single - Dating               -0.4645 0.209 159 -0.87690  -0.0521
##  Single - Seriously Dating     -0.0355 0.179 159 -0.38898   0.3180
##  Dating - Seriously Dating      0.4290 0.220 159 -0.00599   0.8640
## 
## sigma used for effect sizes: 1.171 
## Confidence level used: 0.95

The largest standardized difference was between the Single and Dating conditions.

# Obtaining Cohen's d for Single vs the combination of the Dating and Seriously Dating conditions
single <- c(1,0,0)
dating <- c(0,1,0)
serious <- c(0,0,1)

dating_and_serious <- (dating + serious)/2

eff_size(means, method = list("Single vs Not single" = single - dating_and_serious), sigma = sigma(model), edf = df.residual(model))

##  contrast             effect.size   SE  df lower.CL upper.CL
##  Single vs Not single       -0.25 0.16 159   -0.567   0.0666
## 
## sigma used for effect sizes: 1.171 
## Confidence level used: 0.95

See this post (https://aosmith.rbind.io/2019/04/15/custom-contrasts-emmeans/#the-contrast-function-for-custom-comparisons) for further elaboration on how to make custom comparisons using the emmeans package.

Confidence Intervals

confint(model)

##                                   2.5 %     97.5 %
## (Intercept)                 5.166226117 5.54827270
## RelationshipStatusRelCode1 -0.661782925 0.07652119
## RelationshipStatusRelCode2 -0.003964717 1.00832202

For b0: 95%CI[5.17, 5.55]
For b1: 95%CI[-0.66, 0.08]
For b2: 95%CI[-0.00, 1.01]

APA-Style Summary

In this study, we examined whether relationship status predicted people’s disagreeableness. Specifically, we wanted to test whether people who were single had significantly different disagreeableness compared to people who were dating or seriously dating.

Using a linear regression analysis, we found that people who were single did not score significantly differently on disagreeableness compared to people who were dating or seriously dating, t(159) = -1.57, p = .119, d = 0.25, b1 = -0.29, 95%CI[-0.66, 0.08].

In fact, overall, relationship status did not significantly predict people’s disagreeableness, F(2, 159) = 2.73, p = .068, \(R^2\) = 0.03.

Linear Regression with a Single Categorical Predictor with 2+ Independent Levels