Calculating a p-value from Confidence Intervals

Jared Parrish
2020-10-07

Future methods groups

We will be reviewing the online modules from BU

BU Online MPH Learning Modules

We will first start with the Epidemiology and then move to the Biostatistics Modules.

  • We invite you to review the modules prior to our group meeting
  • We will send out a schedule
  • We will be looking for volunteers to cover a section

Confidence Interval overlap

Confidence intervals are very informative: they quantify the degree of uncertainty in an estimate and make comparisons easier.

When used to compare estimates between strata/groups as a “crude” homogeneity test, the general rules are:

  1. Two 95% confidence intervals that do NOT overlap: P < 0.05
  2. Two 95% confidence intervals that overlap, but neither interval contains the other point estimate: P is close to 0.05 (inconclusive)
  3. Two 95% confidence intervals where at least one includes the other point estimate: P > 0.05
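
The three rules above can be written as a small helper. This is only a minimal sketch: the function name `ci_overlap_rule` and its arguments are hypothetical, and the inputs are the rounded 95% limits from the preterm birth table used later in this talk.

```r
# Hypothetical helper: classify two 95% CIs with the crude overlap rules.
ci_overlap_rule <- function(est1, lo1, hi1, est2, lo2, hi2) {
  # Intervals overlap when each lower limit is below the other's upper limit
  overlap <- lo1 <= hi2 && lo2 <= hi1
  # Does either interval contain the other group's point estimate?
  contains <- (est2 >= lo1 && est2 <= hi1) || (est1 >= lo2 && est1 <= hi2)
  if (!overlap) {
    "no overlap: P < 0.05"
  } else if (!contains) {
    "overlap, neither contains the other estimate: inconclusive"
  } else {
    "overlap, an interval contains the other estimate: P > 0.05"
  }
}

# Each earlier year compared with 2019 (0.09694, 0.09110 to 0.10278)
rule_2012 <- ci_overlap_rule(0.07613, 0.07122, 0.08104, 0.09694, 0.09110, 0.10278)
rule_2015 <- ci_overlap_rule(0.08918, 0.08393, 0.09443, 0.09694, 0.09110, 0.10278)
rule_2018 <- ci_overlap_rule(0.09259, 0.08694, 0.09824, 0.09694, 0.09110, 0.10278)
```

Under these rules, 2012 falls under rule 1, 2015 under rule 2, and 2018 under rule 3.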

The Problem

The visual “test of homogeneity” comes at a cost

  • Reduced ability to detect differences (type II error)
  • We're comparing the CIs of the estimates in each group, as opposed to a CI for the difference between the point estimates of interest.

  • Goldstein and Healy (1995) suggest using 83% confidence intervals for each group, so that non-overlap corresponds approximately to a significant difference between the two estimates at the 5% level.
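
Where the roughly 83% figure comes from can be sketched in a few lines. This assumes the two estimates are independent and have equal standard errors (an assumption made here for illustration, not stated on the slide); the variable names are hypothetical.

```r
# Assumption: independent estimates with equal standard errors (se1 == se2 == se).
# The z-test rejects at the 5% level when |diff| > 1.96 * sqrt(2) * se.
# Two intervals of half-width z_each * se stop overlapping when |diff| > 2 * z_each * se.
z95 <- qnorm(0.975)            # critical value of the two-sided 5% test
z_each <- z95 / sqrt(2)        # half-width multiplier that makes the two criteria agree
level <- 2 * pnorm(z_each) - 1 # confidence level implied by that multiplier
round(level, 3)
```

With unequal standard errors the matching level shifts, so the 83% interval is itself an approximation.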

A better solution

If we only have the point estimate and confidence interval for each group, we can use the CIs to back-calculate the standard error for each group and conduct a homogeneity test of the form:

\[ \mathbf{Z}_{homog} = \frac{\lvert \hat{\mu}_1 - \hat{\mu}_2 \rvert}{\sqrt{S^2_1 + S^2_2}} \]

  • Two-sided p-value = 2 x the upper-tail area of the standard normal (Z) distribution beyond the observed Z
  • The Z-test has a single critical value (1.96 for a 5% two-tailed test)
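
As a rough sketch of the test above on the natural (untransformed) scale: the helper names `se_from_ci` and `z_homog` are hypothetical, and the 95% CIs are assumed to be symmetric Wald intervals, so each standard error is the CI width divided by 2 x 1.96.

```r
# Hypothetical helper: back-calculate a standard error from a symmetric 95% CI.
se_from_ci <- function(lower, upper, z = 1.96) {
  (upper - lower) / (2 * z)
}

# Hypothetical helper: homogeneity z statistic and two-sided p-value.
z_homog <- function(est1, se1, est2, se2) {
  z <- abs(est1 - est2) / sqrt(se1^2 + se2^2)
  list(z = z, p = 2 * pnorm(z, lower.tail = FALSE))
}

# 2018 vs 2019 preterm birth proportions from the worked example
se_2018 <- se_from_ci(0.08694165, 0.09823621)
se_2019 <- se_from_ci(0.09109831, 0.10277718)
res <- z_homog(0.09258893, se_2018, 0.09693774, se_2019)
res$p
```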

A quick refresh - normal distribution

[Figure: standard normal distribution]

  • p = 0.025 for each side of the distribution
  • 2*round(pnorm(1.96, lower.tail = FALSE), 3) = 0.05

A worked example

These data are taken from a preterm birth report that WCFH is working on.

Year PretermBirths NonPretermBirths Births proportion lower upper conf.level
2010 944 10560 11504 0.08206 0.07704 0.08707 0.95
2011 1010 10457 11467 0.08808 0.08289 0.09327 0.95
2012 853 10352 11205 0.07613 0.07122 0.08104 0.95
2013 973 10498 11471 0.08482 0.07972 0.08992 0.95
2014 966 10462 11428 0.08453 0.07943 0.08963 0.95
2015 1010 10315 11325 0.08918 0.08393 0.09443 0.95
2016 1007 10238 11245 0.08955 0.08427 0.09483 0.95
2017 946 9550 10496 0.09013 0.08465 0.09561 0.95
2018 937 9183 10120 0.09259 0.08694 0.09824 0.95
2019 956 8906 9862 0.09694 0.09110 0.10278 0.95

Credit: Rachel Gallegos - PHAP fellow assigned to WCFH/MCH-Epidemiology

Trendline

[Figure: trendline plot of the yearly data]

Compare 2019 to 2018

[Figure: plot for the 2019 vs 2018 comparison]

Compare 2019 to 2018 cont.

$data
      NonPretermBirths PretermBirths Total
2018              9183           937 10120
2019              8906           956  9862
Total            18089          1893 19982

$measure
                        NA
risk ratio with 95% C.I. estimate     lower   upper
                    2018 1.000000        NA      NA
                    2019 1.046969 0.9609563 1.14068

$p.value
         NA
two-sided midp.exact fisher.exact chi.square
     2018         NA           NA         NA
     2019  0.2940939    0.2989286  0.2939509

$correction
[1] FALSE

attr(,"method")
[1] "Unconditional MLE & normal approximation (Wald) CI"

CI for proportion

Remember that for a proportion, the CI is calculated as approximately:

\[ \hat{p} \pm z^{*} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]
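
As a quick check of this formula, here is a minimal sketch that recomputes the 2019 row of the table from its counts (956 preterm births out of 9862 births). The function name `wald_ci` is hypothetical.

```r
# Hypothetical helper: Wald (normal approximation) CI for a proportion.
wald_ci <- function(x, n, conf.level = 0.95) {
  p <- x / n
  z <- qnorm(1 - (1 - conf.level) / 2) # 1.96 for a 95% interval
  se <- sqrt(p * (1 - p) / n)
  c(proportion = p, lower = p - z * se, upper = p + z * se)
}

# 2019: 956 preterm births out of 9862 births
ci_2019 <- wald_ci(956, 9862)
round(ci_2019, 5)
```

The result should agree with the 2019 row of the table (0.09694, 0.09110, 0.10278) to the precision shown there.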

Note: proportions are bounded between 0 and 1, and rates between 0 and infinity.

So we compare the estimates on the log scale and adjust our z formula to:

\[ \mathbf{Z}_{homog} = \frac{\lvert \log{(\hat{\mu}_1)} - \log{(\hat{\mu}_2)} \rvert}{\sqrt{S^2_1 + S^2_2}} \]

where each S is back-calculated from that group's own 95% CI limits (s0 and s1 in the R code):

\[ S_{1} = \frac{\log{(upperCI_1)} - \log{(lowerCI_1)}}{1.96*2} \]

\[ S_{2} = \frac{\log{(upperCI_2)} - \log{(lowerCI_2)}}{1.96*2} \]
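
The log-scale calculation above can be wrapped in one function. This is a minimal sketch; the name `p_from_ci_log` and its arguments are hypothetical, and it assumes 95% CIs and independent groups.

```r
# Hypothetical helper: two-sided p-value from two estimates and their 95% CIs,
# compared on the log scale.
p_from_ci_log <- function(est1, lo1, hi1, est2, lo2, hi2, z = 1.96) {
  # Back-calculate each standard error on the log scale
  s1 <- (log(hi1) - log(lo1)) / (2 * z)
  s2 <- (log(hi2) - log(lo2)) / (2 * z)
  # Homogeneity z statistic
  zh <- abs(log(est1) - log(est2)) / sqrt(s1^2 + s2^2)
  2 * pnorm(zh, lower.tail = FALSE)
}

# 2018 vs 2019 preterm birth proportions
p_2018_2019 <- p_from_ci_log(0.09258893, 0.08694165, 0.09823621,
                             0.09693774, 0.09109831, 0.10277718)
p_2018_2019
```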

Given we had this level of precision

   Year proportion      lower      upper
9  2018 0.09258893 0.08694165 0.09823621
10 2019 0.09693774 0.09109831 0.10277718
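# Back-calculate each SE on the log scale: CI width / (1.96 * 2)
s0 <- (log(0.09823621) - log(0.08694165)) / 3.92  # 2018
s1 <- (log(0.1027772) - log(0.09109831)) / 3.92   # 2019

# Homogeneity z statistic and two-sided p-value
zh <- abs(log(0.09258893) - log(0.09693774)) / sqrt(s0^2 + s1^2)
2 * pnorm(zh, lower.tail = FALSE)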
[1] 0.2945752
         NA
two-sided midp.exact fisher.exact chi.square
     2018         NA           NA         NA
     2019  0.2940939    0.2989286  0.2939509

Compare 2019 to 2015

[Figure: plot for the 2019 vs 2015 comparison]

Compare 2019 to 2015 cont.

$data
      NonPretermBirths PretermBirths Total
2015             10315          1010 11325
2019              8906           956  9862
Total            19221          1966 21187

$measure
                        NA
risk ratio with 95% C.I. estimate     lower    upper
                    2015  1.00000        NA       NA
                    2019  1.08695 0.9991566 1.182458

$p.value
         NA
two-sided midp.exact fisher.exact chi.square
     2015         NA           NA         NA
     2019 0.05251294   0.05451793 0.05232018

$correction
[1] FALSE

attr(,"method")
[1] "Unconditional MLE & normal approximation (Wald) CI"

Compare 2019 to 2015 cont.

t2p <- dat1[c(6,10),c(1,5:7)]
t2p 
   Year proportion      lower      upper
6  2015 0.08918322 0.08393411 0.09443234
10 2019 0.09693774 0.09109831 0.10277718
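# Back-calculate each SE on the log scale: CI width / (1.96 * 2)
s0 <- (log(0.09443234) - log(0.08393411)) / 3.92  # 2015
s1 <- (log(0.10277718) - log(0.09109831)) / 3.92  # 2019

# Homogeneity z statistic and two-sided p-value
zh <- abs(log(0.08918322) - log(0.09693774)) / sqrt(s0^2 + s1^2)
2 * pnorm(zh, lower.tail = FALSE)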
[1] 0.052615
         NA
two-sided midp.exact fisher.exact chi.square
     2015         NA           NA         NA
     2019 0.05251294   0.05451793 0.05232018

Finally, let's compare 2019 to 2011

[Figure: plot for the 2019 vs 2011 comparison]

Compare 2019 to 2011 cont.

$data
      NonPretermBirths PretermBirths Total
2011             10457          1010 11467
2019              8906           956  9862
Total            19363          1966 21329

$measure
                        NA
risk ratio with 95% C.I. estimate    lower    upper
                    2011 1.000000       NA       NA
                    2019 1.100579 1.011659 1.197315

$p.value
         NA
two-sided midp.exact fisher.exact chi.square
     2011         NA           NA         NA
     2019 0.02590005   0.02725046 0.02575092

$correction
[1] FALSE

attr(,"method")
[1] "Unconditional MLE & normal approximation (Wald) CI"

Compare 2019 to 2011 cont.

t3p <- dat1[c(2,10),c(1,5:7)]
t3p
   Year proportion      lower      upper
2  2011 0.08807883 0.08289158 0.09326609
10 2019 0.09693774 0.09109831 0.10277718
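# Back-calculate each SE on the log scale: CI width / (1.96 * 2)
s0 <- (log(0.09326609) - log(0.08289158)) / 3.92  # 2011
s1 <- (log(0.10277718) - log(0.09109831)) / 3.92  # 2019

# Homogeneity z statistic and two-sided p-value
zh <- abs(log(0.08807883) - log(0.09693774)) / sqrt(s0^2 + s1^2)
2 * pnorm(zh, lower.tail = FALSE)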
[1] 0.02594365
         NA
two-sided midp.exact fisher.exact chi.square
     2011         NA           NA         NA
     2019 0.02590005   0.02725046 0.02575092

Wrap-up

We can't always rely on confidence interval overlap when making statements about “significance”

  • This is a simple method for testing when only point estimates and CIs are provided
  • It works with weighted estimates as well (we're back-calculating the weighted SE)
  • Caution: published point estimates are often rounded, so our p-values won't match as closely as in my examples
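
To illustrate the rounding caution, here is a minimal sketch that repeats the 2018 vs 2019 comparison with the estimates and limits rounded to three decimals, as a published table might report them. The helper `p_from_ci_log` is a hypothetical name, and the three-decimal rounding is an assumption chosen for illustration.

```r
# Hypothetical helper: two-sided p-value from two estimates and their 95% CIs,
# compared on the log scale.
p_from_ci_log <- function(est1, lo1, hi1, est2, lo2, hi2, z = 1.96) {
  s1 <- (log(hi1) - log(lo1)) / (2 * z)
  s2 <- (log(hi2) - log(lo2)) / (2 * z)
  zh <- abs(log(est1) - log(est2)) / sqrt(s1^2 + s2^2)
  2 * pnorm(zh, lower.tail = FALSE)
}

# Full precision, as in the worked example
p_full <- p_from_ci_log(0.09258893, 0.08694165, 0.09823621,
                        0.09693774, 0.09109831, 0.10277718)

# Same inputs rounded to three decimals (illustrative assumption)
p_rounded <- p_from_ci_log(round(0.09258893, 3), round(0.08694165, 3), round(0.09823621, 3),
                           round(0.09693774, 3), round(0.09109831, 3), round(0.10277718, 3))

c(full = p_full, rounded = p_rounded)
```

The rounded inputs give a noticeably different p-value, so results near 0.05 deserve extra care when only rounded estimates are available.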

Comments/Questions?