Testing (Contd.)

Significance Tests Comparing Means

Example: Comparing a Therapy to a Control Group

Anor<-read.table("http://stat4ds.rwth-aachen.de/data/Anorexia.dat", header=TRUE)
cogbehav<-Anor$after[Anor$therapy=="cb"]-Anor$before[Anor$therapy=="cb"]
control<-Anor$after[Anor$therapy=="c"]-Anor$before[Anor$therapy=="c"]

t.test(cogbehav,control,var.equal=TRUE)

    Two Sample t-test

data:  cogbehav and control
t = 1.676, df = 53, p-value = 0.09963
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.680137  7.593930
sample estimates:
mean of x mean of y 
 3.006897 -0.450000 
t.test(cogbehav,control)

    Welch Two Sample t-test

data:  cogbehav and control
t = 1.6677, df = 50.971, p-value = 0.1015
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.7044632  7.6182563
sample estimates:
mean of x mean of y 
 3.006897 -0.450000 

Quick Summarization

## Bayesian Comparison of Therapy and Control Groups
fit<-lm(after-before~factor(therapy), data=Anor[-(30:46),]) #deletingobs's30-46
summary(fit)

Call:
lm(formula = after - before ~ factor(therapy), data = Anor[-(30:46), 
    ])

Residuals:
    Min      1Q  Median      3Q     Max 
-12.107  -4.678  -1.307   3.500  17.893 

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)  
(Intercept)         -0.450      1.498  -0.300   0.7650  
factor(therapy)cb    3.457      2.063   1.676   0.0996 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 7.637 on 53 degrees of freedom
Multiple R-squared:  0.05033,   Adjusted R-squared:  0.03241 
F-statistic: 2.809 on 1 and 53 DF,  p-value: 0.09963
library(MCMCpack)
Loading required package: coda
Loading required package: MASS
##
## Markov Chain Monte Carlo Package (MCMCpack)
## Copyright (C) 2003-2024 Andrew D. Martin, Kevin M. Quinn, and Jong Hee Park
##
## Support provided by the U.S. National Science Foundation
## (Grants SES-0350646 and SES-0350613)
##
fit.bayes<-MCMCregress(after-before ~ factor(therapy), mcmc=10000000, b0=0, B0=10^{-15}, c0=10^{-15}, d0=10^{-15}, data=Anor[-(30:46),])
# mean has normal prior dist. with mean b0=0, variance 1/B0(std.dev.>31million)
# variance has highly disperse inverse gamma prior distribution (tiny c0 and d0)
summary(fit.bayes)

Iterations = 1001:10001000
Thinning interval = 1 
Number of chains = 1 
Sample size per chain = 1e+07 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

                     Mean     SD  Naive SE Time-series SE
(Intercept)       -0.4494  1.526 0.0004827      0.0004824
factor(therapy)cb  3.4564  2.102 0.0006647      0.0006647
sigma2            60.6091 12.242 0.0038714      0.0040186

2. Quantiles for each variable:

                     2.5%    25%     50%     75%  97.5%
(Intercept)       -3.4514 -1.467 -0.4501  0.5676  2.553
factor(therapy)cb -0.6794  2.056  3.4562  4.8565  7.591
sigma2            41.2108 51.924 59.0672 67.5765 88.874
mean(fit.bayes[,2]<=0)
[1] 0.049846

Significance Tests Comparing Proportions

Example: Comparing Prayer and Non-Prayer Surgery Patients

prop.test(c(315,304),c(604,597),correct=FALSE)

    2-sample test for equality of proportions without continuity correction

data:  c(315, 304) out of c(604, 597)
X-squared = 0.18217, df = 1, p-value = 0.6695
alternative hypothesis: two.sided
95 percent confidence interval:
 -0.04421536  0.06883625
sample estimates:
   prop 1    prop 2 
0.5215232 0.5092127 
# Bayesian Inference for Comparing Two Proportions
pi1=rbeta(10000000,316,290); pi2 =rbeta(10000000,305,294)
quantile(pi1-pi2,c(0.025,0.975))
       2.5%       97.5% 
-0.04415002  0.06861699 
mean(pi1<pi2);mean(pi1> pi2) #approximateposteriorP(pi1<pi2),P(pi1>pi2)
[1] 0.3347285
[1] 0.6652715

Chi-Squared Tests for Multiple Proportions in Contingency Tables

Pearson’s chi-squared test - Wikipedia

Example: Happiness and Marital Status

Happy<-read.table("http://stat4ds.rwth-aachen.de/data/Happy.dat", header=TRUE)
# To construct contingency tables, define variables as factors
Happiness<-factor(Happy$happiness);Marital<-factor(Happy$marital)
levels(Happiness)<-c("Veryhappy","Prettyhappy","Not too happy")
levels(Marital)<-c("Married","Divorced/Separated","Nevermarried")

table(Marital,Happiness) # forms contingency table
                    Happiness
Marital              Veryhappy Prettyhappy Not too happy
  Married                  432         504            61
  Divorced/Separated        92         282           103
  Nevermarried             124         409           135
prop.table(table(Marital,Happiness),1) # proportions within rows (each row total=1)
                    Happiness
Marital               Veryhappy Prettyhappy Not too happy
  Married            0.43329990  0.50551655    0.06118355
  Divorced/Separated 0.19287212  0.59119497    0.21593291
  Nevermarried       0.18562874  0.61227545    0.20209581
chisq.test(Marital,Happiness)$expected # expected frequencies under H0: independence
                    Happiness
Marital              Veryhappy Prettyhappy Not too happy
  Married             301.6134    556.2162     139.17040
  Divorced/Separated  144.3025    266.1134      66.58403
  Nevermarried        202.0840    372.6704      93.24556
chisq.test(Marital,Happiness) # chi-squared test of independence

    Pearson's Chi-squared test

data:  Marital and Happiness
X-squared = 197.41, df = 4, p-value < 2.2e-16

Quick Summarization

Standardized Residuals: Describing the Nature of an Association

stdres1 <- chisq.test(Marital, Happiness)$stdres
stdres1 # standardized residuals
                    Happiness
Marital              Veryhappy Prettyhappy Not too happy
  Married            12.295576   -4.554333     -9.770639
  Divorced/Separated -5.913202    1.661245      5.457032
  Nevermarried       -7.928512    3.411881      5.619486
levels(Happiness)<-c("Very", "Pretty", "Nottoo")
levels(Marital)<-c("Married", "Div/Sep", "Never")

#library(vcd)
#mosaic(table(Marital, Happiness), gp=shading_Friendly, residuals=stdres1, residuals_type="Standardized\nresiduals", labeling=labeling_residuals)

Significance Test Decisions and Errors

Type I and Type II Errors

Power of a Test

Exercises