Import the BES2015 dataset. Use either read.csv() or load(), depending on which file you are importing.
load("C:/Users/poppr/Dropbox (CEMAP)/POL1041_SOC1041/lab9/BES2015.Rda")
Information about the data: http://www.britishelectionstudy.com/data-object/post-election-wave-6-of-the-2014-2017-british-election-study-internet-panel/
Suppose we want to know whether attitudes toward the duty to vote (dutyToVote2) differ between men and women (gender).
H0: There is no difference between men's and women's assessment of the duty to vote in the population.
H0: mu(men) - mu(women) = 0
H1: There is a difference between men's and women's assessment of the duty to vote in the population.
H1: mu(men) - mu(women) != 0
Step 1: specify alpha
alpha = 0.05
Step 2: calculate the test statistic and look up the resulting p-value
# t = ((x_bar1 - x_bar2) - 0) / sqrt(s1^2/n1 + s2^2/n2)
# t =
We need the degrees of freedom
# df = (s1^2/n1 + s2^2/n2)^2 / ((1/(n1-1))*(s1^2/n1)^2 + (1/(n2-1))*(s2^2/n2)^2)
# df =
Lastly, we look up the p-value
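The hand calculation can also be scripted. The sketch below is a minimal illustration, not part of the lab files: welch_by_hand() is a hypothetical helper, and the two groups are simulated stand-ins rather than BES2015 values. It computes the Welch t statistic, the degrees of freedom, and the two-sided p-value from the formulas above, then compares them with t.test().

```r
# Hypothetical helper (not part of the lab files): Welch t statistic,
# Welch-Satterthwaite degrees of freedom, and two-sided p-value.
welch_by_hand <- function(x, y) {
  x <- x[!is.na(x)]            # drop missing values, as t.test() does
  y <- y[!is.na(y)]
  n1 <- length(x)
  n2 <- length(y)
  v1 <- var(x) / n1            # s1^2 / n1
  v2 <- var(y) / n2            # s2^2 / n2
  t_stat <- (mean(x) - mean(y) - 0) / sqrt(v1 + v2)
  df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
  p_value <- 2 * pt(-abs(t_stat), df)   # two-sided: both tails
  list(t = t_stat, df = df, p_value = p_value)
}

# Simulated data standing in for two groups (assumption: not BES2015 values)
set.seed(1)
g1 <- rnorm(200, mean = 4.3, sd = 1.0)
g2 <- rnorm(250, mean = 4.4, sd = 1.2)

by_hand <- welch_by_hand(g1, g2)
built_in <- t.test(g1, g2)

print(unlist(by_hand))
print(c(built_in$statistic, built_in$parameter, p = built_in$p.value))
```

With the lab data loaded, the same helper could be called on the two gender groups, for example welch_by_hand(BES2015$dutyToVote2[BES2015$gender == 1], BES2015$dutyToVote2[BES2015$gender == 2]), assuming gender is coded 1 and 2 as the group labels in the t.test() output suggest.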
Step 3: Draw your conclusion based on the p-value
Is the p-value less than alpha?
Alternatively, use t.test()
t.test(BES2015$dutyToVote2 ~ BES2015$gender)
##
## Welch Two Sample t-test
##
## data: BES2015$dutyToVote2 by BES2015$gender
## t = -9.8707, df = 18336, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.13465320 -0.09003539
## sample estimates:
## mean in group 1 mean in group 2
## 4.311263 4.423608
Imagine now that we want to test if men’s belief that voting is a civic duty is weaker than women’s belief in the population (dutyToVote2, gender).
H0: mu(men) - mu(women) >= 0
H1: mu(men) - mu(women) < 0
Step 1: specify alpha
alpha = 0.05
Step 2: calculate the test statistic and look up the resulting p-value
# t = ((x_bar1 - x_bar2) - 0) / sqrt(s1^2/n1 + s2^2/n2)
# t =
We need the degrees of freedom
# df = (s1^2/n1 + s2^2/n2)^2 / ((1/(n1-1))*(s1^2/n1)^2 + (1/(n2-1))*(s2^2/n2)^2)
# df =
Lastly, we look up the p-value
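For a one-sided alternative the p-value is a single tail area instead of two. As a rough sketch, using the t statistic and degrees of freedom reported by t.test() in this lab (the variable names are illustrative), the lower-tail lookup is:

```r
# Values copied from the t.test() output in this lab
t_stat <- -9.8707
df <- 18336

# H1: mu(men) - mu(women) < 0, so the p-value is the lower-tail area
p_less <- pt(t_stat, df)

# For comparison, the two-sided p-value counts both tails
p_two_sided <- 2 * pt(-abs(t_stat), df)

print(c(less = p_less, two_sided = p_two_sided))
```

When the observed t falls on the side predicted by H1, as here, the one-sided p-value is half the two-sided one.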
Step 3: Draw your conclusion based on the p-value
Is the p-value less than alpha?
Alternatively, use t.test()
t.test(BES2015$dutyToVote2 ~ BES2015$gender, alternative="less")
##
## Welch Two Sample t-test
##
## data: BES2015$dutyToVote2 by BES2015$gender
## t = -9.8707, df = 18336, p-value < 2.2e-16
## alternative hypothesis: true difference in means is less than 0
## 95 percent confidence interval:
## -Inf -0.09362237
## sample estimates:
## mean in group 1 mean in group 2
## 4.311263 4.423608
Imagine that you want to know if there is a difference between younger and older voters regarding the level of trust towards MPs.
H0: mu(old) - mu(young) = 0
H1: mu(old) - mu(young) != 0
Generate a new variable called young that equals 1 if Age is less than or equal to 34 and 0 otherwise.
BES2015$young <- ifelse(BES2015$Age <= 34, 1, 0)
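It can help to check a recode before testing with it. The sketch below uses a toy Age vector (illustrative values, not BES2015 data) to show where the cut-off falls and that ifelse() passes missing ages through as NA.

```r
# Toy ages standing in for BES2015$Age (assumption: illustrative values only)
Age <- c(18, 34, 35, 70, NA)

# Same rule as in the lab: 1 if Age <= 34, 0 otherwise
young <- ifelse(Age <= 34, 1, 0)
print(young)

# Cross-tabulate to confirm the cut-off and see how many ages are missing
print(table(Age, young, useNA = "ifany"))
```

On the lab data, table(BES2015$young, useNA = "ifany") would give the group sizes that t.test() then works with.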
Step 1: specify alpha
alpha = 0.05
Step 2: calculate the test statistic and look up the resulting p-value
# t = ((x_bar1 - x_bar2) - 0) / sqrt(s1^2/n1 + s2^2/n2)
# t =
We need the degrees of freedom
# df = (s1^2/n1 + s2^2/n2)^2 / ((1/(n1-1))*(s1^2/n1)^2 + (1/(n2-1))*(s2^2/n2)^2)
# df =
Lastly, we look up the p-value
Step 3: Draw your conclusion based on the p-value
Is the p-value less than alpha?
Alternatively, use t.test()
t.test(BES2015$trustMPs~BES2015$young)
##
## Welch Two Sample t-test
##
## data: BES2015$trustMPs by BES2015$young
## t = -0.39597, df = 4700.9, p-value = 0.6921
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.06728258 0.04467073
## sample estimates:
## mean in group 0 mean in group 1
## 3.638149 3.649455
Time magazine reported the results of a telephone poll of 800 adults on cigarette taxes. The respondents were asked whether the tax on cigarettes should be raised to pay for health care reform. The results of the survey were:
non-smokers: 605 respondents, 351 said ‘yes’, p1 = 0.58
smokers: 195 respondents, 41 said ‘yes’, p2 = 0.21
H0: p1 - p2 = 0 or p1 = p2
H1: p1 - p2 != 0 or p1 != p2
Step 1: specify alpha
alpha = 0.05
We need a pooled sample proportion
# p_hat = (41+351)/(195+605)
# p_hat = 392/800
# p_hat = 0.49
Then we calculate the pooled standard error under the null hypothesis
# se = sqrt(p_hat * (1 - p_hat) * (1/n1 + 1/n2))
# se = sqrt(0.49 * 0.51 * (1/605 + 1/195))
# se = sqrt(0.2499 * (0.001652893 + 0.005128205))
# se = sqrt(0.2499 * 0.006781098)
# se = 0.04116547
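The same arithmetic can be run directly in R. This is a small sketch using the counts from the poll; the variable names are illustrative.

```r
# Counts from the poll: 'yes' answers and group sizes
x1 <- 351; n1 <- 605   # non-smokers
x2 <- 41;  n2 <- 195   # smokers

# Pooled sample proportion under H0: p1 = p2
p_hat <- (x1 + x2) / (n1 + n2)

# Pooled standard error of the difference in proportions
se <- sqrt(p_hat * (1 - p_hat) * (1 / n1 + 1 / n2))

print(c(p_hat = p_hat, se = se))
```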
Step 2: calculate the test statistic and look up the resulting p-value
# (p1 - p2) / se
# (0.58 - 0.21) / se
# 0.37 / se
# 0.37 / 0.04116547
The test statistic is 8.988116
Lastly, we look up the p-value. H1 is two-sided, so we double the upper-tail area
2 * (1 - pnorm(8.988116))
## [1] 0
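The printed 0 is a floating-point artifact: pnorm(8.988116) is so close to 1 that the subtraction cancels to exactly zero. As a sketch, computing the tail area directly avoids the cancellation and returns a tiny but non-zero two-sided p-value (variable names are illustrative).

```r
se <- 0.04116547           # pooled standard error from the hand calculation
z <- (0.58 - 0.21) / se    # rounded proportions, as in the hand calculation

# Subtracting from 1 loses all precision this far out in the tail
p_naive <- 2 * (1 - pnorm(z))

# Asking for the tail area directly keeps the small value
p_tail <- 2 * pnorm(-abs(z))

print(c(z = z, p_naive = p_naive, p_tail = p_tail))
```

The conclusion is the same either way, since both values are far below alpha = 0.05.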
Step 3: Draw your conclusion based on the p-value
Is the p-value less than alpha?
Alternatively, use prop.test(). Here x holds the number of ‘yes’ answers in each group and n holds the group sizes, with non-smokers first
prop.test(x = c(351, 41), n = c(605, 195), correct = FALSE)
With correct = FALSE, the X-squared statistic that prop.test() reports (df = 1) is the square of the z statistic from the pooled two-proportion test, so both approaches lead to the same conclusion. The hand calculation above used proportions rounded to two decimals, so its squared z differs slightly from the reported X-squared.
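As a cross-check, the sketch below passes the ‘yes’ counts as x and the group sizes as n (non-smokers first), then confirms that the X-squared statistic from prop.test() without continuity correction equals the squared z statistic of the pooled test. The variable names are illustrative.

```r
yes <- c(351, 41)    # 'yes' answers: non-smokers, smokers
n   <- c(605, 195)   # group sizes:   non-smokers, smokers

res <- prop.test(x = yes, n = n, correct = FALSE)

# z statistic by hand, using unrounded proportions
p <- yes / n
p_hat <- sum(yes) / sum(n)
z <- (p[1] - p[2]) / sqrt(p_hat * (1 - p_hat) * sum(1 / n))

print(c(z_squared = z^2, X_squared = unname(res$statistic)))
print(res$p.value)
```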