Satterthwaite based confidence intervals for total standard deviation (SD) demonstrated using CLSI EP05-A2 and R package VCA

Introduction

Demonstrate agreement in Satterthwaite based confidence interval calculation for total SD in EP05-A2 Appendix B Page 27 and that from a very useful R package, VCA.

Set up environment

      rm(list = ls()) 
      startTime <- proc.time()
      library(knitr)
      options(width = 120)
      opts_chunk$set(comment = "", warning = FALSE, message = FALSE, 
                     echo = TRUE, tidy = FALSE, size = "tiny",  
                     cache = F, progress = TRUE,
                      cache.path = 'Satterthwaite_Cache/', fig.path = 'figure/') 
      
      knitr::knit_hooks$set(inline = function(x) {
        knitr:::format_sci(x, 'md')
      })

Set working directory

      where <- "home" 
       
      path <- "DEVELOPMENT/VARIANCE COMPONENTS CI FOR LINEAR COMBINATIONS"   
       
      work <-     paste("X:/", path, sep = "")
      nonwork <-  paste("~/X/", path, sep = "")
       
      if (where=="home") {wd <-  nonwork} else {wd <- work}
       
      work2 <-     paste("X:/", path, sep = "")
      nonwork2 <-  paste("~/X/", path, sep = "")
        
      if (where=="home") {wd2 <-  nonwork2} else {wd2 <- work2}
       
      work3 <-     paste("X:/FUNCTIONS/R", sep = "")
      nonwork3 <-  paste("~/X/FUNCTIONS/R", sep = "")
       
      if (where=="home") {wd3 <-  nonwork3} else {wd3 <- work3}
       
      setwd(wd)
      opts_knit$set(root.dir = wd)      ##THIS SETS YOUR WORKING DIRECTORY

Load required packages (plus some functions that may or may not be used)

      list.of.packages <- c("lme4","VCA" )
      
      new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
      if(length(new.packages)) install.packages(new.packages)
      
      sapply(X = list.of.packages, require, character.only = TRUE)

lme4  VCA 
TRUE TRUE

      p1x <- function(x) {print(formatC(x, format = "f", digits = 1),quote = FALSE)}
      p2x <- function(x) {print(formatC(x, format = "f", digits = 2),quote = FALSE)}
      p3x <- function(x) {print(formatC(x, format = "f", digits = 3),quote = FALSE)}
      p4x <- function(x) {print(formatC(x, format = "f", digits = 4),quote = FALSE)}

The VCA package contains the EP05-A2 Glucose dataset

         data(Glucose, package="VCA") #be specific, nlme also has a Glucose dataset, see 'data()'
         head(Glucose)

  day run result
1   1   1    242
2   1   1    246
3   1   2    245
4   1   2    246
5   2   1    243
6   2   1    242

         tail(Glucose)

   day run result
75  19   2    247
76  19   2    245
77  20   1    247
78  20   1    240
79  20   2    245
80  20   2    242

Care is required using R package ‘lme4’ function ‘lmer’ as the VCA Glucose dataset ‘run’ variable is not explicitly coded (see above) to reflect nesting in day. Model f0 therefore gives erroneous results.

        f0 <- lmer(result ~ (1 | day) + (1| run), data = Glucose, 
                 REML = TRUE, na.action = "na.omit")

Correctly specified model under the circumstances

        f <- lmer(result ~ (1 | day/run), data = Glucose, 
                 REML = TRUE, na.action = "na.omit")

Calculate the total SD

        sd <- ((VarCorr(f)$`day`[1]) +
                    (VarCorr(f)$`run`[1]) +
                       as.data.frame(VarCorr(f))[3,5]^2)^0.5

Run ANOVA using function ‘aov’ on the balanced dataset to obtain Mean Squares

        foo <- aov(result ~ day/run, data = Glucose)
        foo <- as.matrix(anova(foo))

Now duplicate the calculation in EP05-A2 Appendix B Page 27

        MD <- foo[1,3]               # MS Day
        MR <- foo[2,3]               # MS Run
        ME <- foo[3,3]               # Error variance 
        I <- nlevels(Glucose$day)    # Number of days in experiment  
        top <- I*((2*ME+MR+MD)^2)
        bottom <- (2*ME*ME)+MR*MR+((I/(I-1))*MD*MD)
        df <- top/bottom

Present the Satterthwaite total degrees of freedom and calculate the total SD 95% confidence limits

        alpha = 0.05
        crupp <- qchisq(alpha/2, df)
        crlow <- qchisq(1-(alpha/2),df)
        c(df,  sqrt((df)*(sd*sd)/(crlow)),  sqrt((df)*(sd*sd)/(crupp)) )

[1] 64.777320  3.069590  4.342976

Compare with VCA package

        fit <- anovaVCA(result~day/run, Glucose)
        VCAinference(fit, VarVC=TRUE, ci.method="satt")




Inference from (V)ariance (C)omponent (A)nalysis
------------------------------------------------

> VCA Result:
-------------

  Name    DF      SS    MS      VC      %Total  SD     CV[%]  Var(VC)
1 total   64.7773               12.9336 100     3.5963 1.4727        
2 day     19      415.8 21.8842 1.9586  15.1432 1.3995 0.5731 4.3845 
3 day:run 20      281   14.05   3.075   23.7754 1.7536 0.7181 5.7152 
4 error   40      316   7.9     7.9     61.0814 2.8107 1.151  3.1205 

Mean: 244.2 (N = 80) 

Experimental Design: balanced


> VC:
-----
        Estimate      DF CI LCL   CI UCL One-Sided LCL One-Sided UCL
total    12.9336 64.7773 9.4224  18.8614        9.9071       17.7278
day       1.9586  1.7497 0.5010 121.7586        0.6234       54.6295
day:run   3.0750  3.3089 1.0260  35.2063        1.2195       22.4658
error     7.9000 40.0000 5.3251  12.9333        5.6673       11.9203

> SD:
-----
        Estimate      DF CI LCL  CI UCL One-Sided LCL One-Sided UCL
total     3.5963 64.7773 3.0696  4.3430        3.1476        4.2104
day       1.3995  1.7497 0.7078 11.0344        0.7896        7.3912
day:run   1.7536  3.3089 1.0129  5.9335        1.1043        4.7398
error     2.8107 40.0000 2.3076  3.5963        2.3806        3.4526

> CV[%]:
--------
        Estimate      DF CI LCL CI UCL One-Sided LCL One-Sided UCL
total     1.4727 64.7773 1.2570 1.7785        1.2889        1.7242
day       0.5731  1.7497 0.2899 4.5186        0.3233        3.0267
day:run   0.7181  3.3089 0.4148 2.4298        0.4522        1.9410
error     1.1510 40.0000 0.9450 1.4727        0.9749        1.4138


95% Confidence Level  
Satterthwaite methodology used for computing CIs

Variability Chart for Hierarchical Models

        varPlot(result~day/run, Glucose, type=3,
                Title=list(list(main="Variability Chart"),
                list(main="Plot of SD Values")))

Reference

Clinical and Laboratory Standards Institute. 2004. Evaluation of Precision Performance of Quantitative Measurement Methods; Approved Guideline - Second Edition (EP05-A2) http://shop.clsi.org/site/Sample_pdf/EP5A2_sample.pdf

Computing Environment

    opts_knit$set(root.dir = wd)   ##THIS SETS YOUR WORKING DIRECTORY
    sessionInfo()

R version 3.2.2 (2015-08-14)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 8 x64 (build 9200)

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252   
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C                           
[5] LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] VCA_1.2.1    lme4_1.1-10  Matrix_1.2-2 knitr_1.11  

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.2       lattice_0.20-33   digest_0.6.8      MASS_7.3-45       grid_3.2.2        nlme_3.1-122     
 [7] formatR_1.2.1     magrittr_1.5      evaluate_0.8      stringi_1.0-1     minqa_1.2.4       nloptr_1.0.4     
[13] rmarkdown_0.8.1   splines_3.2.2     tools_3.2.2       stringr_1.0.0     numDeriv_2014.2-1 yaml_2.1.13      
[19] htmltools_0.2.6

    print(wd)

[1] "~/X/DEVELOPMENT/VARIANCE COMPONENTS CI FOR LINEAR COMBINATIONS"

Executed on 07:27:35, Tue, Dec 15 2015. This took 3.84 seconds to execute.