Confidence intervals in R

Install the DescTools package and load it:

install.packages("DescTools")

library(DescTools)

Confidence intervals for proportions

First, let us consider an abstract example that shows how the sample size and the confidence level affect a confidence interval. Suppose we tossed a coin 20 times and got 4 heads.

nheads <- 4 # number of heads
n1 <- 20  # total number of tosses

Now let’s calculate a 95% confidence interval for the proportion of heads in such an experiment.

BinomCI(nheads, n1) # 95% by default

##      est     lwr.ci    upr.ci
## [1,] 0.2 0.08065766 0.4160174
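
By default BinomCI uses the Wilson score method (method = "wilson"), which is why the interval above is not symmetric around 0.2. As a rough cross-check, the same limits can be reproduced in base R. The helper name wilson_ci is introduced here for illustration only and is not part of DescTools.

```r
# Wilson score interval computed by hand (base R only).
# wilson_ci is a hypothetical helper written for this illustration.
wilson_ci <- function(x, n, conf.level = 0.95) {
  p <- x / n                               # sample proportion
  z <- qnorm(1 - (1 - conf.level) / 2)     # normal quantile, about 1.96 for 95%
  denom  <- 1 + z^2 / n
  centre <- (p + z^2 / (2 * n)) / denom    # interval centre, pulled towards 0.5
  half   <- z * sqrt(p * (1 - p) / n + z^2 / (4 * n^2)) / denom
  c(est = p, lwr.ci = centre - half, upr.ci = centre + half)
}

wilson_res <- wilson_ci(4, 20)
print(wilson_res)
```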

Calculate the width of the confidence interval:

ci.95 <- BinomCI(nheads, n1)
ci.95[3] - ci.95[2]

## [1] 0.3353598

Increase the number of tosses (the number of heads remains the same):

n2 <- 40 # now 40 tosses
ci.95.2 <- BinomCI(nheads, n2)
ci.95.2[3] - ci.95.2[2]  # it shrank

## [1] 0.1909382

Keep the number of tosses equal to 20, but increase the confidence level:

ci.99 <- BinomCI(nheads, n1, conf.level = 0.99)
ci.99[3] - ci.99[2] # it widened

## [1] 0.4263411
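
Both effects can be put side by side by tabulating the interval width for the three settings used above. The sketch below relies on base R's prop.test() with correct = FALSE, which also returns a Wilson score interval and so should agree with the BinomCI() default. The helper name ci_width is introduced here for illustration, and the number of heads is kept at 4 throughout, as in the example.

```r
# Width of the Wilson interval for x successes in n trials (base R).
# ci_width is a hypothetical helper, not a DescTools function.
ci_width <- function(x, n, conf.level = 0.95) {
  ci <- prop.test(x, n, conf.level = conf.level, correct = FALSE)$conf.int
  ci[2] - ci[1]
}

# The three settings from the text: more tosses, then a higher confidence level
widths <- data.frame(
  n          = c(20, 40, 20),
  conf.level = c(0.95, 0.95, 0.99)
)
widths$width <- mapply(ci_width, 4, widths$n, widths$conf.level)
print(widths)
```

A larger sample narrows the interval, while a higher confidence level widens it.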

Now let’s proceed to real data and work with the Verses data set.

verses <- read.csv("https://raw.githubusercontent.com/LingData2019/LingData/master/data/poetry_last_in_lines.csv", sep = "\t")
str(verses) # recall which variables are there

## 'data.frame':    364 obs. of  6 variables:
##  $ Decade      : Factor w/ 2 levels "1820s","1920s": 1 1 1 1 1 1 1 1 1 1 ...
##  $ RhymedNwords: int  1 1 1 1 1 1 1 1 1 1 ...
##  $ RhymedNsyl  : int  1 1 1 1 1 1 1 2 2 2 ...
##  $ UPoS        : Factor w/ 11 levels "ADJ","ADP","ADV",..: 6 6 6 6 9 9 9 1 1 1 ...
##  $ LineText    : Factor w/ 364 levels "-- Воронский, Воронский в подряснике старом,",..: 7 36 63 172 178 275 310 59 66 209 ...
##  $ Author      : Factor w/ 364 levels "А. А. Ахматова",..: 97 202 122 299 61 149 6 24 219 209 ...

Calculate a confidence interval for the proportion of nouns at the end of lines:

nnouns <- nrow(verses[verses$UPoS == "NOUN", ])
total <- nrow(verses)

BinomCI(nnouns, total)

##            est    lwr.ci    upr.ci
## [1,] 0.6098901 0.5588825 0.6586025

Confidence intervals for means

Now let’s work with the data set on Icelandic from our previous class.

phono <- read.csv("http://math-info.hse.ru/f/2018-19/ling-data/icelandic.csv")

Choose aspirated and non-aspirated cases again:

asp <- phono[phono$aspiration == "yes", ]
nasp <- phono[phono$aspiration == "no", ]

Calculate confidence intervals for mean values of vowel duration in each group:

MeanCI(asp$vowel.dur)

##     mean   lwr.ci   upr.ci 
## 78.75772 76.68274 80.83270

MeanCI(nasp$vowel.dur)

##     mean   lwr.ci   upr.ci 
## 94.69124 92.15292 97.22957
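
By default MeanCI() builds the classical t-based interval: the sample mean plus or minus a t quantile times the standard error. A minimal sketch of that calculation in base R follows. The vector x is made-up data used only for illustration (not the Icelandic measurements), and mean_ci is a helper name introduced here.

```r
# Classical t-based confidence interval for a mean (base R).
# mean_ci is a hypothetical helper; x is made-up illustration data.
mean_ci <- function(x, conf.level = 0.95) {
  n  <- length(x)
  se <- sd(x) / sqrt(n)                            # standard error of the mean
  tq <- qt(1 - (1 - conf.level) / 2, df = n - 1)   # t quantile with n - 1 df
  c(mean = mean(x), lwr.ci = mean(x) - tq * se, upr.ci = mean(x) + tq * se)
}

x <- c(72, 85, 91, 78, 80, 95, 69, 88, 76, 83)
mean_res <- mean_ci(x)
print(mean_res)

# t.test() reports the same limits
print(t.test(x)$conf.int)
```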

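Plot the group means with error bars using sciplot (note that by default lineplot.CI draws the mean plus or minus one standard error, not a 95% confidence interval):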

# install.packages("sciplot")
library(sciplot)
lineplot.CI(data = phono, 
            response = vowel.dur, 
            x.factor = aspiration)

Confidence intervals and significance of differences

If two CIs for a population parameter (proportion, mean, median, etc.) do not overlap, the true values of the population parameters are significantly different.
If two CIs for a population parameter overlap, no definite conclusion follows: the true values may be equal, but they may also differ significantly. For example, if two confidence intervals for means overlap, a more accurate test (a t-test) is required. So, in general, comparing confidence intervals (with the same confidence level, of course) is not equivalent to hypothesis testing.
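
The reason overlap is inconclusive is that each interval is built from its own standard error, while a test of the difference uses the standard error of the difference, sqrt(se1^2 + se2^2), which is smaller than se1 + se2. A rough numeric sketch with invented summary statistics (means 0 and 3.5, both standard errors equal to 1), using a normal approximation instead of the t distribution for simplicity:

```r
# Overlapping 95% CIs together with a significant difference.
# All numbers are invented; a normal approximation is assumed.
m1 <- 0;   se1 <- 1
m2 <- 3.5; se2 <- 1
z  <- qnorm(0.975)

ci1 <- c(m1 - z * se1, m1 + z * se1)   # about (-1.96, 1.96)
ci2 <- c(m2 - z * se2, m2 + z * se2)   # about ( 1.54, 5.46)
overlap <- ci1[2] > ci2[1]             # TRUE: the two intervals overlap

se_diff <- sqrt(se1^2 + se2^2)         # standard error of the difference
p_value <- 2 * pnorm(-abs(m2 - m1) / se_diff)
ci_diff <- (m2 - m1) + c(-1, 1) * z * se_diff

print(overlap)
print(p_value)   # below 0.05
print(ci_diff)   # does not contain zero
```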

Consider a case where two CIs for means overlap, but the population means are significantly different. Let’s select only cases with aspirated consonants and compare the average vowel duration for round and unrounded vowels.

w1 <- phono[phono$aspiration == 'yes' & phono$roundness == "round", ]
w2 <- phono[phono$aspiration == 'yes' & phono$roundness == "unrounded", ]

Do the CIs overlap?

MeanCI(w1$vowel.dur)

##     mean   lwr.ci   upr.ci 
## 81.74052 77.89567 85.58537

MeanCI(w2$vowel.dur)

##     mean   lwr.ci   upr.ci 
## 76.90839 74.54499 79.27179

Can we conclude that mean vowel duration is different for round and unrounded vowels?

Now perform a formal test, a two-sample t-test. By default, t.test runs Welch’s version, which does not assume equal variances in the two groups.

# reject or not reject H0
t.test(w1$vowel.dur, w2$vowel.dur)

## 
##  Welch Two Sample t-test
## 
## data:  w1$vowel.dur and w2$vowel.dur
## t = 2.1134, df = 269.27, p-value = 0.03549
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.3304964 9.3337590
## sample estimates:
## mean of x mean of y 
##  81.74052  76.90839

The null hypothesis should be rejected at the 5% significance level (p-value = 0.035), so we conclude that the population means are different.

Actually, testing the hypothesis that two population means are equal is equivalent to checking whether a CI for the difference of means includes zero.

# CI for difference between means
MeanDiffCI(w1$vowel.dur, w2$vowel.dur)

##  meandiff    lwr.ci    upr.ci 
## 4.8321277 0.3304964 9.3337590
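
The limits printed by MeanDiffCI() coincide with the confidence interval in the t.test() output above, that is, with the Welch interval. A sketch of that calculation by hand, including the Welch-Satterthwaite degrees of freedom, is given below. The vectors a and b are made-up data, and welch_diff_ci is a name introduced for illustration.

```r
# Welch confidence interval for a difference of two means (base R).
# welch_diff_ci is a hypothetical helper; a and b are made-up data.
welch_diff_ci <- function(a, b, conf.level = 0.95) {
  va <- var(a) / length(a)                 # squared standard error, group a
  vb <- var(b) / length(b)                 # squared standard error, group b
  se <- sqrt(va + vb)                      # standard error of the difference
  df <- (va + vb)^2 /
    (va^2 / (length(a) - 1) + vb^2 / (length(b) - 1))   # Welch-Satterthwaite df
  tq <- qt(1 - (1 - conf.level) / 2, df)
  d  <- mean(a) - mean(b)
  c(meandiff = d, lwr.ci = d - tq * se, upr.ci = d + tq * se)
}

a <- c(82, 90, 75, 88, 79, 93, 85, 77)
b <- c(74, 70, 81, 68, 77, 72, 79)
diff_res <- welch_diff_ci(a, b)
print(diff_res)
print(t.test(a, b)$conf.int)   # should show the same limits
```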

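So, overlap of the CIs for means (or for any population parameters) \(\not\Rightarrow\) the CI for the difference includes zero \(\Leftrightarrow\) \(H_0\) about equality should not be rejected. In words: overlapping CIs do not imply that the CI for the difference includes zero, and only the latter is equivalent to not rejecting \(H_0\).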