Calculating a p-value from Confidence Intervals

Jared Parrish
2020-10-07

Future methods groups

We will be reviewing the online modules from BU

We will first start with the Epidemiology and then move to the Biostatistics Modules.

We invite you to review the modules prior to our group meeting
We will send out a schedule
We will be looking for volunteers to cover a section

Confidence Interval overlap

Confidence intervals are very informative: they quantify the degree of uncertainty in an estimate and make comparisons easier.

When used to compare estimates between strata/groups as a “crude” homogeneity test, the general rules are:

Two 95% confidence intervals that do NOT overlap: P < 0.05
Two 95% confidence intervals that overlap, but neither contains the other point estimate: P close to 0.05 (inconclusive)
Two 95% confidence intervals where at least one contains the other point estimate: P > 0.05
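
The three rules can be written as a small helper. This is only a sketch: the function name, its arguments, and the example numbers are made up for illustration and are not from the preterm birth report.

```r
# Classify a pair of 95% CIs using the visual overlap rules.
# classify_overlap() and its arguments are hypothetical names.
classify_overlap <- function(est1, lo1, hi1, est2, lo2, hi2) {
  # Two intervals overlap when each starts before the other ends
  overlap <- lo1 <= hi2 && lo2 <= hi1
  # Does either interval contain the other group's point estimate?
  contains_other <- (est2 >= lo1 && est2 <= hi1) ||
    (est1 >= lo2 && est1 <= hi2)
  if (!overlap) {
    "no overlap: P < 0.05"
  } else if (!contains_other) {
    "overlap, neither contains the other estimate: P close to 0.05"
  } else {
    "an interval contains the other estimate: P > 0.05"
  }
}

# Hypothetical estimates with their lower and upper limits
r1 <- classify_overlap(10, 8, 12, 15, 13, 17)
r2 <- classify_overlap(10, 8, 12, 13, 11, 15)
r3 <- classify_overlap(10, 8, 12, 11.5, 9.5, 13.5)
c(r1, r2, r3)
```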

The Problem

The visual “test of homogeneity” comes at a cost

Reduced ability to detect differences (increased type II error)
We're comparing the CIs for the estimates in each group, as opposed to testing the difference between the point estimates of interest.
Goldstein and Healy (1995) suggest using 83% confidence intervals for each group, so that non-overlap corresponds approximately to a significant difference at the 5% level.
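
A rough way to see where a figure near 83% comes from, assuming two independent estimates with equal standard errors (a simplification made here for illustration): two intervals just touch when the difference equals twice the interval half-width, while the z-test rejects at the 5% level when the difference exceeds 1.96 times the SE of the difference.

```r
# Assumption: both estimates have the same standard error (se).
# Intervals est +/- m*se just touch when the difference is 2*m*se.
# The z-test rejects at the 5% level when the difference exceeds
# 1.96 * sqrt(se^2 + se^2) = 1.96 * sqrt(2) * se.
z_crit <- qnorm(0.975)        # two-sided 5% critical value, about 1.96
m <- z_crit * sqrt(2) / 2     # per-interval multiplier that matches the test
level <- 2 * pnorm(m) - 1     # confidence level implied by that multiplier
round(level, 3)
```

With unequal standard errors the matching level shifts, so this is a guide rather than an exact rule.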

A better solution

If we only have the point estimate and confidence interval for each group, we can use the CIs to back-calculate the standard error (S) for each group and conduct a homogeneity test of the form:

\[ \mathbf{Z}_{homog} = \frac{\lvert \hat{\mu}_1 - \hat{\mu}_2 \rvert}{\sqrt{S^2_1 + S^2_2}} \]

Two-sided P-value = 2 × the upper tail area of the standard normal (Z) distribution beyond the Z statistic
The Z-test has a single critical value (1.96 for a 5% two-tailed test)
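
A minimal sketch of this test on the natural scale, assuming independent estimates and symmetric (Wald-type) intervals. The function name and the example means are hypothetical. The example is chosen so that the two 95% CIs overlap while the test still gives P < 0.05.

```r
# Back-calculate each SE from its CI, then test the difference.
# p_from_ci() is a hypothetical helper, not part of any package.
p_from_ci <- function(est1, lo1, hi1, est2, lo2, hi2, conf.level = 0.95) {
  z_star <- qnorm(1 - (1 - conf.level) / 2)   # about 1.96 for a 95% CI
  se1 <- (hi1 - lo1) / (2 * z_star)           # CI width = 2 * z_star * SE
  se2 <- (hi2 - lo2) / (2 * z_star)
  z <- abs(est1 - est2) / sqrt(se1^2 + se2^2)
  2 * pnorm(z, lower.tail = FALSE)            # two-sided p-value
}

# Hypothetical means: 10 (8 to 12) and 13 (11 to 15); the CIs overlap
p_val <- p_from_ci(10, 8, 12, 13, 11, 15)
p_val
```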

A quick refresh - normal distribution

[Figure: plot of the normal distribution]

p = 0.025 for each side of the distribution
2*round(pnorm(1.96, lower.tail = FALSE), 3) = 0.05

A worked example

These data are taken from a preterm birth report that WCFH is working on.

Year	PretermBirths	NonPretermBirths	Births	proportion	lower	upper	conf.level
2010	944	10560	11504	0.08206	0.07704	0.08707	0.95
2011	1010	10457	11467	0.08808	0.08289	0.09327	0.95
2012	853	10352	11205	0.07613	0.07122	0.08104	0.95
2013	973	10498	11471	0.08482	0.07972	0.08992	0.95
2014	966	10462	11428	0.08453	0.07943	0.08963	0.95
2015	1010	10315	11325	0.08918	0.08393	0.09443	0.95
2016	1007	10238	11245	0.08955	0.08427	0.09483	0.95
2017	946	9550	10496	0.09013	0.08465	0.09561	0.95
2018	937	9183	10120	0.09259	0.08694	0.09824	0.95
2019	956	8906	9862	0.09694	0.09110	0.10278	0.95

Credit: Rachel Gallegos - PHAP fellow assigned to WCFH/MCH-Epidemiology

Trendline

[Figure: trendline plot]

Compare 2019 to 2018

[Figure: plot comparing 2019 to 2018]

Compare 2019 to 2018 cont.

$data
      NonPretermBirths PretermBirths Total
2018              9183           937 10120
2019              8906           956  9862
Total            18089          1893 19982

$measure
                        NA
risk ratio with 95% C.I. estimate     lower   upper
                    2018 1.000000        NA      NA
                    2019 1.046969 0.9609563 1.14068

$p.value
         NA
two-sided midp.exact fisher.exact chi.square
     2018         NA           NA         NA
     2019  0.2940939    0.2989286  0.2939509

$correction
[1] FALSE

attr(,"method")
[1] "Unconditional MLE & normal approximation (Wald) CI"
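
The call that produced this output is not shown in the slides. As a cross-check in base R (a sketch, not necessarily how the output above was generated), the uncorrected chi-square p-value and the Wald risk ratio interval can be recomputed directly from the counts:

```r
# Counts from the 2018 and 2019 rows of the table
preterm <- c(937, 956)
births  <- c(10120, 9862)

# Two-sample test of proportions without continuity correction
# (equivalent to the Pearson chi-square test on the 2x2 table)
chi_p <- prop.test(preterm, births, correct = FALSE)$p.value

# Risk ratio (2019 vs 2018) with a Wald CI built on the log scale
p <- preterm / births
rr <- p[2] / p[1]
se_log_rr <- sqrt(sum(1 / preterm) - sum(1 / births))
rr_ci <- exp(log(rr) + c(-1, 1) * qnorm(0.975) * se_log_rr)

chi_p
c(estimate = rr, lower = rr_ci[1], upper = rr_ci[2])
```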

CI for proportion

Remember that for a proportion, the CI is calculated as approximately:

\[ \hat{p} \pm z^{*} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]
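
As a quick check of this formula, the 2019 row of the table can be recomputed from its counts. The variable names are illustrative.

```r
# 2019 counts from the preterm birth table
x <- 956     # preterm births
n <- 9862    # total births

p_hat <- x / n
se <- sqrt(p_hat * (1 - p_hat) / n)           # SE of a proportion
ci <- p_hat + c(-1, 1) * qnorm(0.975) * se    # Wald 95% CI
round(c(proportion = p_hat, lower = ci[1], upper = ci[2]), 5)
```

The result should agree with the proportion, lower, and upper columns for 2019.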

Note: proportions are bounded between 0 and 1, and rates between 0 and infinity,

so we work on the log scale and adjust our Z formula to:

\[ \mathbf{Z}_{homog} = \frac{\lvert \log{\hat{\mu}_1} - \log{\hat{\mu}_2} \rvert}{\sqrt{S^2_1 + S^2_2}} \]

where each S is the standard error on the log scale, back-calculated from that group's 95% CI (s0 and s1 in the R code):

\[ S_{1} = \frac{\log{(upperCI_1)} - \log{(lowerCI_1)}}{2 \times 1.96} \]

\[ S_{2} = \frac{\log{(upperCI_2)} - \log{(lowerCI_2)}}{2 \times 1.96} \]
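
The log-scale version can be wrapped in a small function. This is a sketch under the same assumptions as the formulas (independent groups, 95% CIs). The function name is hypothetical, and the inputs are the 2018 and 2019 rows as printed in the table.

```r
# p_from_ci_log() is a hypothetical helper, not part of any package.
p_from_ci_log <- function(est1, lo1, hi1, est2, lo2, hi2, conf.level = 0.95) {
  z_star <- qnorm(1 - (1 - conf.level) / 2)     # about 1.96 for a 95% CI
  se1 <- (log(hi1) - log(lo1)) / (2 * z_star)   # log-scale SE, group 1
  se2 <- (log(hi2) - log(lo2)) / (2 * z_star)   # log-scale SE, group 2
  z <- abs(log(est1) - log(est2)) / sqrt(se1^2 + se2^2)
  2 * pnorm(z, lower.tail = FALSE)              # two-sided p-value
}

# 2018 vs 2019, using the 5-decimal values from the table
p_2018_2019 <- p_from_ci_log(0.09259, 0.08694, 0.09824,
                             0.09694, 0.09110, 0.10278)
p_2018_2019
```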

Given we had this level of precision

   Year proportion      lower      upper
9  2018 0.09258893 0.08694165 0.09823621
10 2019 0.09693774 0.09109831 0.10277718

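# Back-calculate the log-scale SEs from the 95% CIs (3.92 = 1.96*2)
s0 <- (log(0.09823621) - log(0.08694165))/3.92  # 2018
s1 <- (log(0.1027772) - log(0.09109831))/3.92   # 2019

# Z statistic on the log scale and its two-sided p-value
zh <- abs(log(0.09258893) - log(0.09693774))/(sqrt(s0^2 + s1^2))
2*pnorm(zh, lower.tail = FALSE)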

[1] 0.2945752

         NA
two-sided midp.exact fisher.exact chi.square
     2018         NA           NA         NA
     2019  0.2940939    0.2989286  0.2939509

Compare 2019 to 2015

[Figure: plot comparing 2019 to 2015]

Compare 2019 to 2015 cont.

$data
      NonPretermBirths PretermBirths Total
2015             10315          1010 11325
2019              8906           956  9862
Total            19221          1966 21187

$measure
                        NA
risk ratio with 95% C.I. estimate     lower    upper
                    2015  1.00000        NA       NA
                    2019  1.08695 0.9991566 1.182458

$p.value
         NA
two-sided midp.exact fisher.exact chi.square
     2015         NA           NA         NA
     2019 0.05251294   0.05451793 0.05232018

$correction
[1] FALSE

attr(,"method")
[1] "Unconditional MLE & normal approximation (Wald) CI"

Compare 2019 to 2015 cont.

t2p <- dat1[c(6,10),c(1,5:7)]
t2p

   Year proportion      lower      upper
6  2015 0.08918322 0.08393411 0.09443234
10 2019 0.09693774 0.09109831 0.10277718

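# Back-calculate the log-scale SEs from the 95% CIs (3.92 = 1.96*2)
s0 <- (log(0.09443234) - log(0.08393411))/3.92  # 2015
s1 <- (log(0.10277718) - log(0.09109831))/3.92  # 2019

# Z statistic on the log scale and its two-sided p-value
zh <- abs(log(0.08918322) - log(0.09693774))/(sqrt(s0^2 + s1^2))
2*pnorm(zh, lower.tail = FALSE)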

[1] 0.052615

         NA
two-sided midp.exact fisher.exact chi.square
     2015         NA           NA         NA
     2019 0.05251294   0.05451793 0.05232018

Finally, let's compare 2019 to 2011

[Figure: plot comparing 2019 to 2011]

Compare 2019 to 2011 cont.

$data
      NonPretermBirths PretermBirths Total
2011             10457          1010 11467
2019              8906           956  9862
Total            19363          1966 21329

$measure
                        NA
risk ratio with 95% C.I. estimate    lower    upper
                    2011 1.000000       NA       NA
                    2019 1.100579 1.011659 1.197315

$p.value
         NA
two-sided midp.exact fisher.exact chi.square
     2011         NA           NA         NA
     2019 0.02590005   0.02725046 0.02575092

$correction
[1] FALSE

attr(,"method")
[1] "Unconditional MLE & normal approximation (Wald) CI"

Compare 2019 to 2011 cont.

t3p <- dat1[c(2,10),c(1,5:7)]
t3p

   Year proportion      lower      upper
2  2011 0.08807883 0.08289158 0.09326609
10 2019 0.09693774 0.09109831 0.10277718

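# Back-calculate the log-scale SEs from the 95% CIs (3.92 = 1.96*2)
s0 <- (log(0.09326609) - log(0.08289158))/3.92  # 2011
s1 <- (log(0.10277718) - log(0.09109831))/3.92  # 2019

# Z statistic on the log scale and its two-sided p-value
zh <- abs(log(0.08807883) - log(0.09693774))/(sqrt(s0^2 + s1^2))
2*pnorm(zh, lower.tail = FALSE)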

[1] 0.02594365

         NA
two-sided midp.exact fisher.exact chi.square
     2011         NA           NA         NA
     2019 0.02590005   0.02725046 0.02575092

Wrap-up

We can't always rely on confidence interval overlap when making statements about “significance”

This is a simple method for testing when only point estimates and CIs are provided
It will work with weighted estimates as well (we're back-calculating the weighted SE)
Caution: published point estimates are often rounded, so our p-values won't always be as close as in my examples
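
To get a feel for the rounding caveat, here is a sketch that repeats the 2018 vs 2019 calculation with the estimates and limits rounded to three decimals, as they might appear in a published table. The helper name and the choice of three decimals are assumptions for illustration.

```r
# est, lo, hi: length-2 numeric vectors, one element per group.
# p_log_scale() is a hypothetical helper.
p_log_scale <- function(est, lo, hi) {
  se <- (log(hi) - log(lo)) / (2 * 1.96)        # log-scale SEs
  z <- abs(diff(log(est))) / sqrt(sum(se^2))
  2 * pnorm(z, lower.tail = FALSE)
}

# 2018 and 2019 values at full precision
est <- c(0.09258893, 0.09693774)
lo  <- c(0.08694165, 0.09109831)
hi  <- c(0.09823621, 0.10277718)

p_full    <- p_log_scale(est, lo, hi)
p_rounded <- p_log_scale(round(est, 3), round(lo, 3), round(hi, 3))
c(full = p_full, rounded = p_rounded)
```

The two p-values differ noticeably here, so it helps to use as many decimals as the source provides.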

Comments/Questions?