ANCOVA (Analysis of Covariance) compares group means on an outcome while adjusting for a continuous covariate. This guide walks through an ANCOVA workflow in RStudio, from data generation to model fitting.

Before we start, make sure you have the following packages installed: ‘car’, ‘emmeans’, ‘ggplot2’, and ‘heplots’.

Loading Essential Packages

The analysis relies on functions from three packages: ‘car’ for Anova, ‘emmeans’ for emmeans and contrast, and ‘ggplot2’ for plotting the data. The ‘heplots’ package is used later for Box's M-test.
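The loading step might look like the sketch below. It assumes the four packages named in this guide are the full set needed; the check for missing packages is a convenience added here, not part of the original code.

```r
# Packages named in this guide; heplots is used later for Box's M-test.
pkgs <- c("car", "emmeans", "ggplot2", "heplots")

# Record which packages are installed instead of failing midway.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
if (any(!installed)) {
  message("Install first: ", paste(pkgs[!installed], collapse = ", "))
}

# Attach the packages that are available.
for (p in pkgs[installed]) {
  library(p, character.only = TRUE)
}
```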

Generating Synthetic Data

##    id group pretest posttest1 posttest2
## 1   1     0      53        62        72
## 2   2     0      50        59        54
## 3   3     0      50        57        60
## 4   4     1      64        74        93
## 5   5     0      48        52        53
## 6   6     1      65        78        81
## 7   7     1      35        58        81
## 8   8     1      56        72        84
## 9   9     0      51        61        54
## 10 10     0      52        55        73

This section generates synthetic data for the ANCOVA: 100 observations with an id, a binary group indicator (0 or 1), a pretest score, and two posttest scores. The first ten rows are shown above. Check the structure of the data before running any statistical procedure.
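The original generating code is not shown, so the following is only a sketch of how a data set with the same columns could be simulated. The seed, the group probability, the means, and the effect sizes are all assumptions, so the values will not match the printed output.

```r
set.seed(123)  # arbitrary seed, chosen only for reproducibility
n <- 100

id      <- 1:n
group   <- rbinom(n, size = 1, prob = 0.5)   # binary group indicator (0/1)
pretest <- round(rnorm(n, mean = 50, sd = 10))

# Posttests depend on the pretest plus a group effect (assumed effect sizes).
posttest1 <- round(10 + 0.8 * pretest + 10 * group + rnorm(n, sd = 6))
posttest2 <- round(20 + 0.7 * pretest + 20 * group + rnorm(n, sd = 8))

data <- data.frame(id, group, pretest, posttest1, posttest2)
head(data, 10)
```

Building the posttests from the pretest plus a group shift gives the covariate and the group effect something to explain, which is the situation ANCOVA is designed for.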

Exploring Data Statistics

##        id             group         pretest        posttest1    
##  Min.   :  1.00   Min.   :0.00   Min.   :27.00   Min.   :35.00  
##  1st Qu.: 25.75   1st Qu.:0.00   1st Qu.:43.00   1st Qu.:50.75  
##  Median : 50.50   Median :0.00   Median :49.50   Median :59.00  
##  Mean   : 50.50   Mean   :0.43   Mean   :49.45   Mean   :59.23  
##  3rd Qu.: 75.25   3rd Qu.:1.00   3rd Qu.:55.00   3rd Qu.:65.00  
##  Max.   :100.00   Max.   :1.00   Max.   :72.00   Max.   :94.00  
##    posttest2     
##  Min.   : 41.00  
##  1st Qu.: 55.00  
##  Median : 68.00  
##  Mean   : 68.56  
##  3rd Qu.: 78.25  
##  Max.   :111.00
## 'data.frame':    100 obs. of  5 variables:
##  $ id       : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ group    : num  0 0 0 1 0 1 1 1 0 0 ...
##  $ pretest  : num  53 50 50 64 48 65 35 56 51 52 ...
##  $ posttest1: num  62 59 57 74 52 78 58 72 61 55 ...
##  $ posttest2: num  72 54 60 93 53 81 81 84 54 73 ...

Descriptive statistics provide a quick overview of the central tendency and spread of the data. The ‘summary()’ function offers insights into key statistical measures, while ‘str()’ reveals the data structure.
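As a sketch, the two calls can be run as follows. The data frame here is a hypothetical stand-in with assumed effect sizes, and the per-group means at the end are an addition that is often useful before an ANCOVA.

```r
# Hypothetical stand-in for the guide's data (assumed effect sizes).
set.seed(123)
n <- 100
group   <- rbinom(n, 1, 0.5)
pretest <- round(rnorm(n, 50, 10))
data <- data.frame(
  id = 1:n, group, pretest,
  posttest1 = round(10 + 0.8 * pretest + 10 * group + rnorm(n, sd = 6)),
  posttest2 = round(20 + 0.7 * pretest + 20 * group + rnorm(n, sd = 8))
)

summary(data)  # min, quartiles, mean, and max for each column
str(data)      # column types and first few values

# Mean of each score within each group.
group_means <- aggregate(cbind(pretest, posttest1, posttest2) ~ group,
                         data = data, FUN = mean)
print(group_means)
```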

Visualizing Data Distribution

Histograms show the spread and shape of each variable within each group.
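One way to draw such a histogram with ‘ggplot2’ is sketched below for posttest1. The data frame is a hypothetical stand-in, and the bin count and faceting layout are assumptions rather than the original plot settings.

```r
library(ggplot2)

# Hypothetical stand-in for the guide's data (assumed effect sizes).
set.seed(123)
n <- 100
group   <- rbinom(n, 1, 0.5)
pretest <- round(rnorm(n, 50, 10))
data <- data.frame(
  id = 1:n, group, pretest,
  posttest1 = round(10 + 0.8 * pretest + 10 * group + rnorm(n, sd = 6)),
  posttest2 = round(20 + 0.7 * pretest + 20 * group + rnorm(n, sd = 8))
)

# A factor copy of group gives discrete fill colours and facet labels.
data$group_f <- factor(data$group, labels = c("group 0", "group 1"))

p_hist <- ggplot(data, aes(x = posttest1, fill = group_f)) +
  geom_histogram(bins = 20, alpha = 0.7) +
  facet_wrap(~ group_f) +
  labs(x = "posttest1", y = "Count", fill = "Group")
print(p_hist)
```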

Analyzing Covariate Relationships

These scatterplots with regression lines show how the pretest score relates to posttest1 and posttest2, the covariate relationships that ANCOVA adjusts for.
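A scatterplot of this kind could be built as in the sketch below, shown for posttest1. The data frame is a hypothetical stand-in, and fitting a separate line per group is an assumption about the original figure.

```r
library(ggplot2)

# Hypothetical stand-in for the guide's data (assumed effect sizes).
set.seed(123)
n <- 100
group   <- rbinom(n, 1, 0.5)
pretest <- round(rnorm(n, 50, 10))
data <- data.frame(
  id = 1:n, group, pretest,
  posttest1 = round(10 + 0.8 * pretest + 10 * group + rnorm(n, sd = 6)),
  posttest2 = round(20 + 0.7 * pretest + 20 * group + rnorm(n, sd = 8))
)

# Points coloured by group, with one least-squares line per group.
p_scatter <- ggplot(data, aes(x = pretest, y = posttest1,
                              colour = factor(group))) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(colour = "group")
print(p_scatter)
```

Roughly parallel lines in the two groups suggest that the pretest slope is similar across groups, which is what the ANCOVA model without an interaction term assumes.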

Statistical Testing

## 
##  Shapiro-Wilk normality test
## 
## data:  data$posttest1
## W = 0.98581, p-value = 0.3625
## 
##  Shapiro-Wilk normality test
## 
## data:  data$posttest2
## W = 0.98185, p-value = 0.1847
## 
##  Pearson's product-moment correlation
## 
## data:  data$pretest and data$posttest1
## t = 11.56, df = 98, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.6618677 0.8318580
## sample estimates:
##       cor 
## 0.7595431
## 
##  Pearson's product-moment correlation
## 
## data:  data$pretest and data$posttest2
## t = 5.7956, df = 98, p-value = 8.284e-08
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.3428313 0.6383036
## sample estimates:
##       cor 
## 0.5052282
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  1  0.1936 0.6609
##       98
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  1  1.2911 0.2586
##       98

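These tests check assumptions that matter for ANCOVA. The Shapiro-Wilk tests give no evidence against normality for posttest1 (p = 0.36) or posttest2 (p = 0.18). The Pearson correlations show that pretest is linearly associated with both outcomes (r = 0.76 and r = 0.51). Both Levene's tests are non-significant (p = 0.66 and p = 0.26), so there is no evidence that the variances differ between groups.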
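The output above is consistent with calls like the ones sketched here. The data frame is a hypothetical stand-in, so the statistics will differ from those printed in this guide, and the exact arguments used originally are an assumption.

```r
library(car)  # provides leveneTest()

# Hypothetical stand-in for the guide's data (assumed effect sizes).
set.seed(123)
n <- 100
group   <- rbinom(n, 1, 0.5)
pretest <- round(rnorm(n, 50, 10))
data <- data.frame(
  id = 1:n, group, pretest,
  posttest1 = round(10 + 0.8 * pretest + 10 * group + rnorm(n, sd = 6)),
  posttest2 = round(20 + 0.7 * pretest + 20 * group + rnorm(n, sd = 8))
)
data$group_f <- factor(data$group)  # leveneTest() needs a factor

# Normality of each outcome.
sw1 <- shapiro.test(data$posttest1)
sw2 <- shapiro.test(data$posttest2)

# Linear association between the covariate and each outcome.
ct1 <- cor.test(data$pretest, data$posttest1)
ct2 <- cor.test(data$pretest, data$posttest2)

# Homogeneity of variance across groups (median-centred by default).
lev1 <- leveneTest(posttest1 ~ group_f, data = data)
lev2 <- leveneTest(posttest2 ~ group_f, data = data)

print(sw1); print(sw2); print(ct1); print(ct2); print(lev1); print(lev2)
```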

Homogeneity of Covariances

## 
##  Box's M-test for Homogeneity of Covariance Matrices
## 
## data:  Y
## Chi-Sq (approx.) = 10.337, df = 3, p-value = 0.01591

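Box's M-test from the ‘heplots’ package checks whether the covariance matrices of the two posttests are equal across groups. Here p = 0.016, so equality is rejected at the 0.05 level. Box's M is sensitive to departures from normality, however, and is often judged against a stricter threshold such as 0.001, under which this result would not be flagged.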
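A call of the following form would produce output labelled "data: Y" with 3 degrees of freedom, as above. This is a sketch on hypothetical stand-in data, and the choice of the two posttests as the response matrix is inferred from the degrees of freedom rather than stated in the guide.

```r
library(heplots)  # provides boxM()

# Hypothetical stand-in for the guide's data (assumed effect sizes).
set.seed(123)
n <- 100
group   <- rbinom(n, 1, 0.5)
pretest <- round(rnorm(n, 50, 10))
data <- data.frame(
  id = 1:n, group, pretest,
  posttest1 = round(10 + 0.8 * pretest + 10 * group + rnorm(n, sd = 6)),
  posttest2 = round(20 + 0.7 * pretest + 20 * group + rnorm(n, sd = 8))
)

# Response matrix holding both outcomes.
Y <- as.matrix(data[, c("posttest1", "posttest2")])

# Test equality of the 2 x 2 covariance matrices across the two groups.
box_res <- boxM(Y, factor(data$group))
print(box_res)
```

With two response variables and two groups, the test has 2 × 3 × 1 / 2 = 3 degrees of freedom, matching the output shown in this section.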

Repeated Measures ANCOVA Model

## 
## Error: id
##  Response posttest1 :
##         Df Sum Sq Mean Sq
## pretest  1 324.45  324.45
## 
##  Response posttest2 :
##         Df Sum Sq Mean Sq
## pretest  1 228.66  228.66
## 
## 
## Error: Within
##  Response posttest1 :
##           Df Sum Sq Mean Sq F value    Pr(>F)    
## pretest    1 7638.1  7638.1 216.968 < 2.2e-16 ***
## group      1 2425.6  2425.6  68.901 6.538e-13 ***
## Residuals 96 3379.6    35.2                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Response posttest2 :
##           Df Sum Sq Mean Sq F value    Pr(>F)    
## pretest    1 5479.5  5479.5  76.583 7.142e-14 ***
## group      1 9733.6  9733.6 136.039 < 2.2e-16 ***
## Residuals 96 6868.8    71.6                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

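This section fits the repeated measures ANCOVA model. In the Within stratum, the pretest covariate and group are both significant for each outcome. For posttest1 the group effect is F(1, 96) = 68.90, and for posttest2 it is F(1, 96) = 136.04 (both p < 0.001). Because pretest enters the model before group, these group effects are adjusted for pretest scores.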
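The "Error: id" and "Error: Within" strata and the per-response tables are consistent with an aov() call of the form sketched below. This is inferred from the output, not taken from the original code, and the data frame is a hypothetical stand-in.

```r
# Hypothetical stand-in for the guide's data (assumed effect sizes).
set.seed(123)
n <- 100
group   <- rbinom(n, 1, 0.5)
pretest <- round(rnorm(n, 50, 10))
data <- data.frame(
  id = 1:n, group, pretest,
  posttest1 = round(10 + 0.8 * pretest + 10 * group + rnorm(n, sd = 6)),
  posttest2 = round(20 + 0.7 * pretest + 20 * group + rnorm(n, sd = 8))
)

# Both posttests as responses, pretest as covariate, id as error term.
fit <- aov(cbind(posttest1, posttest2) ~ pretest + group + Error(id),
           data = data)
summary(fit)
```

Note that id is numeric here, as in the str() output, so the id stratum carries a single degree of freedom rather than one per subject.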

Corrected ANOVA

## 
## Type III MANOVA Tests: Pillai test statistic
##             Df test stat approx F num Df den Df    Pr(>F)    
## (Intercept)  1   0.18153   10.646      2     96 6.671e-05 ***
## pretest      1   0.70294  113.586      2     96 < 2.2e-16 ***
## group        1   0.58200   66.832      2     96 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

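The corrected analysis uses the ‘car’ package, whose Anova function reports Type III multivariate tests. Using Pillai's trace, both pretest (0.703, approx. F(2, 96) = 113.59) and group (0.582, approx. F(2, 96) = 66.83) are significant at p < 0.001.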
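A Type III MANOVA table of this kind can be obtained by passing a multivariate linear model to Anova(), as sketched below. The model formula is an assumption inferred from the terms and degrees of freedom in the output, and the data frame is a hypothetical stand-in.

```r
library(car)  # provides Anova()

# Hypothetical stand-in for the guide's data (assumed effect sizes).
set.seed(123)
n <- 100
group   <- rbinom(n, 1, 0.5)
pretest <- round(rnorm(n, 50, 10))
data <- data.frame(
  id = 1:n, group, pretest,
  posttest1 = round(10 + 0.8 * pretest + 10 * group + rnorm(n, sd = 6)),
  posttest2 = round(20 + 0.7 * pretest + 20 * group + rnorm(n, sd = 8))
)

# Multivariate linear model with both posttests as responses.
mlm_fit <- lm(cbind(posttest1, posttest2) ~ pretest + group, data = data)

# Type III tests; Pillai's trace is the default test statistic.
manova_res <- Anova(mlm_fit, type = "III")
print(manova_res)
```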

Conclusion

This guide walked through an ANCOVA workflow in RStudio: generating data, exploring and plotting it, checking assumptions, fitting the repeated measures model, and running Type III tests. Each step supports the interpretation of the group effect after adjusting for pretest scores.