Research on peer pressure has shown that the mean influence score for a scale of peer pressure is 520 with a standard deviation of 80. An investigator would like to show that a minor change in conditions will produce scores with a mean of only 500, and he plans to run a t test to compare his sample mean with a population mean of 520. He plans to collect a sample of size n = 100. * a. What is the effect size? * b. What is the power of the test? * c. What sample size is needed to raise the power to .70, .80, .90?
library(pwr)
x1<-pwr.t.test(d=(520-500)/80,n=100,sig.level=0.05, type="paired",alternative="two.sided")
Effect size - 100
Power of the test - 0.6969
Sample size needed to raise power to .7,.8,.9 -
x2<-pwr.t.test(d=(520-500)/80,power=0.7,sig.level=0.05, type="paired",alternative="two.sided")
x3<-pwr.t.test(d=(520-500)/80,power=0.8,sig.level=0.05, type="paired",alternative="two.sided")
x4<-pwr.t.test(d=(520-500)/80,power=0.9,sig.level=0.05, type="paired",alternative="two.sided")
We can see we need sample size = 100,127,170 respectively. The below plot illustrates this further.
samplesizes<-c(x2$n,x3$n,x4$n)
powers<-c(x2$power,x3$power,x4$power)
plot(y=samplesizes,x=powers,main = "Plot of samplesize vs. power")
library(MASS)
library(ggplot2)
library(dplyr)
df1<-birthwt %>% select(race,smoke,ptl,ht,ui,ftv)
df1[,] <- lapply(df1[,] , factor)
birthwt$ui<-as.logical(birthwt$ui)
birthwt$ht<-as.logical(birthwt$ht)
We load the dataset birthwt and extract two variables - race,smoke, hypertension(ht) and urinary infection(UI) for consideration.
ggplot(birthwt, aes(race,bwt))+geom_boxplot(fill="firebrick")+facet_wrap(~race, scales = "free")+labs(x="Mother's race (1 = white, 2 = black, 3 = other)", y="Birth Weight (gms)")
As visible from the plot, the median birth weight for all three categories - white, black, other are quite similar. The maximum and minimum spread is the most for white race.
We also examine the relation between birth weight and urinary infection.
ggplot(birthwt, aes(ui,bwt))+geom_boxplot(fill="chartreuse4")+facet_wrap(~ui)+labs(x="Uterine Irritability", y="Birth Weight (gms)")
As obvious, birth weight is higher for mothers without any urinary infection.
ggplot(birthwt, aes(ht,bwt))+geom_boxplot(fill="blue")+facet_wrap(~ht)+labs(x="Hyper Tension", y="Birth Weight (gms)")
From above plot, we say that birth weight is directly related to wether or not the mother was pre diagnosed for hypertension.
t.test(birthwt$bwt~df1$ui)
##
## Welch Two Sample t-test
##
## data: birthwt$bwt by df1$ui
## t = 3.8615, df = 35.696, p-value = 0.000455
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 275.8913 886.6553
## sample estimates:
## mean in group 0 mean in group 1
## 3030.702 2449.429
Looking at the above results, the p value (<0.05) indicates that we can reject the null hypothesis. The mean weight of the baby when no urinary infection in mother = 3030.702 gms as compared to a low 2449.429 gms from a mother with diagnosed urinary infection.
t.test(birthwt$bwt~df1$ht)
##
## Welch Two Sample t-test
##
## data: birthwt$bwt by df1$ht
## t = 1.6118, df = 11.909, p-value = 0.1332
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -153.6751 1024.4717
## sample estimates:
## mean in group 0 mean in group 1
## 2972.232 2536.833
From the above t-test, we can say with a high p-value, we cannot reject the null hypothesis, hence we cannot determine the mean difference between the two groups.
Conducting test on Urinary Infection data:
z<-sd(birthwt$ui)
pwr.t.test(d=(3030.702-2449.429)/z,power=1,sig.level=0.05)
##
## Two-sample t test power calculation
##
## n = 2
## d = 1631.917
## sig.level = 0.05
## power = 1
## alternative = two.sided
##
## NOTE: n is number in *each* group