STAT200 HW 8

Problem 1

Part 1

To study the effect of a new treatment, researchers designed a randomized comparative experiment. Twenty patients were randomly assigned to two groups: 10 received the treatment, while the other 10 formed the control group and received a placebo pill that looked identical. The response variable is a health index. Here are the data:

##    Control Treatment
## 1    48.86     48.88
## 2    50.60     52.63
## 3    51.02     52.55
## 4    47.99     50.94
## 5    54.20     53.02
## 6    50.66     50.66
## 7    45.91     47.78
## 8    48.79     48.44
## 9    47.76     48.92
## 10   51.13     51.63

Compute the sample mean and sample standard deviation for each group.

MeanC = mean(data1$Control)
MeanC

## [1] 49.692

SDC =  sd(data1$Control)
SDC

## [1] 2.317896

MeanT = mean(data1$Treatment)
MeanT

## [1] 50.545

SDT =  sd(data1$Treatment)
SDT

## [1] 1.92436

Compute the two-sample t statistic, degrees of freedom and P-value for the two-sided alternative.

Neither sample shows strong skewness or outliers, so the two-sample t-test is appropriate.

t.test(data1$Control, data1$Treatment)

## 
##  Welch Two Sample t-test
## 
## data:  data1$Control and data1$Treatment
## t = -0.89538, df = 17.411, p-value = 0.3828
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -2.859351  1.153351
## sample estimates:
## mean of x mean of y 
##    49.692    50.545
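The Welch statistic, degrees of freedom and P-value reported above can be reproduced by hand. The sketch below re-enters the data from the table in Part 1; the vector names `control` and `treatment` are chosen here for illustration and are not the `data1` object used in the original code.

```r
# Data re-entered from the Part 1 table (vector names are illustrative)
control   <- c(48.86, 50.60, 51.02, 47.99, 54.20, 50.66, 45.91, 48.79, 47.76, 51.13)
treatment <- c(48.88, 52.63, 52.55, 50.94, 53.02, 50.66, 47.78, 48.44, 48.92, 51.63)

n1 <- length(control)
n2 <- length(treatment)

# Each group's contribution to the squared standard error
v1 <- var(control) / n1
v2 <- var(treatment) / n2

# Unpooled (Welch) standard error and t statistic
se     <- sqrt(v1 + v2)
t_stat <- (mean(control) - mean(treatment)) / se

# Welch-Satterthwaite approximation to the degrees of freedom
df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))

# Two-sided P-value from the t distribution
p_val <- 2 * pt(-abs(t_stat), df)

round(c(t = t_stat, df = df, p = p_val), 4)
```

The fractional degrees of freedom come from the Welch-Satterthwaite formula, which `t.test` uses by default because it does not assume equal variances.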

Part 2

To study the effect of a new treatment, researchers conducted an experiment on 10 patients. The data below show a health index for each patient, measured before and after the treatment.

##    Before After
## 1   48.86 48.88
## 2   50.60 52.63
## 3   51.02 52.55
## 4   47.99 50.94
## 5   54.20 53.02
## 6   50.66 50.66
## 7   45.91 47.78
## 8   48.79 48.44
## 9   47.76 48.92
## 10  51.13 51.63

(Hint: Consider the matched pairs t-test.)

Compute the sample mean and variance of the differences.

MeanB = mean(data2$Before)
MeanA = mean(data2$After)

MeanD = MeanB-MeanA
MeanD

## [1] -0.853

SDB = sd(data2$Before)
SDA = sd(data2$After)

SDB

## [1] 2.317896

SDA

## [1] 1.92436

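The variance of the differences has to be computed from the ten paired differences themselves (Before - After), not from SDB and SDA, because the two measurements on the same patient are correlated. Calculated by hand from the differences, the variance is 1.6107 (standard deviation 1.2691).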
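The mean and variance of the differences can also be computed directly in R. This is a minimal sketch that re-enters the Part 2 data; the names `before`, `after` and `d` are illustrative and stand in for the `data2` columns.

```r
# Data re-entered from the Part 2 table (vector names are illustrative)
before <- c(48.86, 50.60, 51.02, 47.99, 54.20, 50.66, 45.91, 48.79, 47.76, 51.13)
after  <- c(48.88, 52.63, 52.55, 50.94, 53.02, 50.66, 47.78, 48.44, 48.92, 51.63)

# One difference per patient, in the same direction as t.test(Before, After)
d <- before - after

mean_d <- mean(d)  # sample mean of the differences
var_d  <- var(d)   # sample variance of the differences (divisor n - 1)
sd_d   <- sd(d)    # sample standard deviation of the differences

round(c(mean = mean_d, var = var_d, sd = sd_d), 4)
```

Working with `d` keeps the pairing intact, which is what distinguishes this analysis from the two-sample one in Part 1.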

Compute the t statistic, degrees of freedom and P-value for the two-sided alternative.

t.test(data2$Before, data2$After, paired = TRUE)

## 
##  Paired t-test
## 
## data:  data2$Before and data2$After
## t = -2.1254, df = 9, p-value = 0.06248
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1.76087438  0.05487438
## sample estimates:
## mean of the differences 
##                  -0.853
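The paired t statistic above is a one-sample t statistic applied to the differences. The sketch below reproduces it by hand, again with illustrative vector names and the data re-entered from the Part 2 table.

```r
# Data re-entered from the Part 2 table (vector names are illustrative)
before <- c(48.86, 50.60, 51.02, 47.99, 54.20, 50.66, 45.91, 48.79, 47.76, 51.13)
after  <- c(48.88, 52.63, 52.55, 50.94, 53.02, 50.66, 47.78, 48.44, 48.92, 51.63)

d <- before - after
n <- length(d)

# One-sample t statistic on the differences, testing mean difference = 0
t_stat <- mean(d) / (sd(d) / sqrt(n))
df     <- n - 1

# Two-sided P-value
p_val <- 2 * pt(-abs(t_stat), df)

round(c(t = t_stat, df = df, p = p_val), 5)
```

The same mean difference of -0.853 gives a much larger |t| here than in Part 1, because the standard error is based on the variability of the differences rather than on the two group variances.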

Problem 2

Refer to Problem 1 Part 2. Now we would like to use the sign test to answer the question “Does the new treatment increase patients’ health index?”

Let \(p\) be the probability that a randomly chosen patient has a higher health index after the treatment. State the null and alternative hypotheses in terms of \(p\).

\(H_0: p = 1/2\)

\(H_a: p > 1/2\)

Count the number of patients who have higher health index after the treatment. Also count the number of patients who have no change in the health index.

Number of patients who have a higher health index after treatment: 7

Number of patients who have no change in health index: 1

Compute the P-value of the sign test. (Hint: You may use the function “binom.test” or “pbinom” in R.)

binom.test(7, 10, p=.5, alternative = "greater")

## 
##  Exact binomial test
## 
## data:  7 and 10
## number of successes = 7, number of trials = 10, p-value = 0.1719
## alternative hypothesis: true probability of success is greater than 0.5
## 95 percent confidence interval:
##  0.3933758 1.0000000
## sample estimates:
## probability of success 
##                    0.7
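A common convention for the sign test is to drop tied pairs before counting, which here leaves 9 patients rather than 10. Under that assumption (it is a convention, not something stated in the problem), the P-value is P(X >= 7) for X ~ Binomial(9, 1/2), which is 46/512, or about 0.0898, instead of 0.1719. A minimal sketch, with the data re-entered and illustrative names:

```r
# Data re-entered from the Part 2 table (vector names are illustrative)
before <- c(48.86, 50.60, 51.02, 47.99, 54.20, 50.66, 45.91, 48.79, 47.76, 51.13)
after  <- c(48.88, 52.63, 52.55, 50.94, 53.02, 50.66, 47.78, 48.44, 48.92, 51.63)

d <- after - before

n_pos <- sum(d > 0)            # patients whose index increased
n_tie <- sum(d == 0)           # patients with no change
n_eff <- length(d) - n_tie     # assumption: ties are dropped from the test

# P(X >= n_pos) for X ~ Binomial(n_eff, 1/2)
p_val <- pbinom(n_pos - 1, size = n_eff, prob = 0.5, lower.tail = FALSE)

# The same P-value from binom.test, as a cross-check
p_check <- binom.test(n_pos, n_eff, p = 0.5, alternative = "greater")$p.value

c(n_pos = n_pos, n_tie = n_tie, n_eff = n_eff, p = p_val)
```

Either way the P-value is above 0.05, so the conclusion at the 5% level does not change.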

Problem 3

In a survey of 1430 undergraduate students, 1087 reported that they had one or more credit cards.

Give a 90% confidence interval for the proportion of all college students who have at least one credit card.

90% confidence interval for the proportion of college students with at least one credit card: (.7415, .7787)

This was calculated using a z* value of 1.645, a standard error of .0113, and a p hat of .7601 (1087/1430).
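Written out with the rounded values quoted above, the calculation follows the usual large-sample formula for a single proportion (the formula is the standard one and is not stated in the problem itself):

\[
\hat{p} \pm z^{*}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}
= .7601 \pm 1.645\sqrt{\frac{(.7601)(.2399)}{1430}}
\approx .7601 \pm 1.645(.0113)
\approx .7601 \pm .0186
\]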

Convert your confidence interval in part (a) to percents.

74.15% to 77.87%

Would a 95% confidence interval be wider or narrower than the one you found in part (a)? Verify your results by computing the interval.

It will be wider than the interval in part (a) because the z* value is larger (1.96 instead of 1.645) while the standard error stays the same. 95% confidence interval for the proportion of college students with at least one credit card: (.7380, .7822)
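The effect of the confidence level on the width can be checked with a small helper. This is a sketch only: `wald_ci` is a hypothetical function name introduced here, and it uses exact `qnorm` critical values rather than the rounded 1.645 and 1.96, so the last digit may differ slightly from the hand calculations.

```r
# Hypothetical helper: large-sample (Wald) interval for one proportion
wald_ci <- function(x, n, level) {
  p_hat  <- x / n
  z_star <- qnorm(1 - (1 - level) / 2)           # two-sided critical value
  margin <- z_star * sqrt(p_hat * (1 - p_hat) / n)
  c(lower = p_hat - margin, upper = p_hat + margin, width = 2 * margin)
}

# 1087 of 1430 students reported having at least one credit card
ci90 <- wald_ci(1087, 1430, 0.90)
ci95 <- wald_ci(1087, 1430, 0.95)

round(rbind(ci90, ci95), 4)
```

Only `z_star` changes between the two calls, which is why the 95% interval is wider.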

Problem 4

Exergames are active video games such as rhythmic dancing games, virtual bicycles, balance board simulators and virtual sports simulators that require a screen and a console. A study of exergaming practiced by students in grades 10 to 11 examined many factors related to participation in exergaming. Of the 358 students who reported that they stressed about their health, 29.9% said that they were interested in exergaming. Of the 851 students who reported that they did not stress about their health, 20.8% said that they were interested in exergaming.

Define the two populations to be compared for this exercise.

The two populations being compared in this exercise are students in grades 10 to 11 who stress about their health and students in grades 10 to 11 who do not stress about their health.

Complete the following table by replacing “xxx” with your answer.

Population	Sample Size	Sample Proportion
Students who stress about their health	358	.299
Students who do not stress about their health	851	.208

Find the 90% confidence interval for the difference in proportions.

90% confidence interval for the difference in proportions (p1 - p2) = (.0451, .1369)

This was calculated using a z* value of 1.645, a SE of .0279, and a difference in sample proportions of .091 (.299 - .208).
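The interval above uses the unpooled standard error for a difference of two proportions. A minimal sketch of that calculation, with illustrative variable names and the sample sizes and proportions given in the problem:

```r
# Sample sizes and proportions as given in the problem
n1 <- 358; p1 <- 0.299   # students who stress about their health
n2 <- 851; p2 <- 0.208   # students who do not stress about their health

diff_hat <- p1 - p2

# Unpooled standard error: each sample keeps its own proportion
se <- sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# 90% interval: qnorm(0.95) is the exact value of z* (about 1.645)
ci90 <- diff_hat + c(-1, 1) * qnorm(0.95) * se

round(c(diff = diff_hat, se = se, lower = ci90[1], upper = ci90[2]), 4)
```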

Use a significance test to compare the proportions. State the null and alternative hypotheses, compute the test statistic and the P-value. State your conclusion at the 5% significance level.

H0: p1 - p2 = 0

Ha: p1 - p2 ≠ 0

Test statistic: z = 3.41 (.091/.0267, where .0267 is the standard error based on the pooled proportion). P-value = 2P(Z >= 3.41) = 2(1 - .9997) = .0006

Since our P-value of .0006 is smaller than the chosen significance level of .05, we reject the null hypothesis and conclude that the proportion of students interested in exergaming differs between students who stress about their health and students who do not.
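The test statistic uses a pooled standard error, because under the null hypothesis both groups share one proportion. The sketch below works from the reported proportions and sample sizes; variable names are illustrative, and the P-value comes from `pnorm` rather than a printed table, so it may differ from .0006 in the last digit.

```r
# Sample sizes and proportions as given in the problem
n1 <- 358; p1 <- 0.299   # students who stress about their health
n2 <- 851; p2 <- 0.208   # students who do not stress about their health

# Pooled proportion: total interested students over total students
p_pool <- (n1 * p1 + n2 * p2) / (n1 + n2)

# Standard error under H0: p1 = p2
se_pool <- sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

z     <- (p1 - p2) / se_pool
p_val <- 2 * pnorm(-abs(z))   # two-sided P-value

round(c(p_pool = p_pool, se_pool = se_pool, z = z, p = p_val), 4)
```

The pooled standard error (about .0267) differs from the unpooled one used for the confidence interval (about .0279), which is why the two calculations quote different SE values.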

Problem 5

One study examined whether or not a sample of children consumed an adequate amount of calcium based on the guidelines provided by the Institute of Medicine. Since there are different guidelines for children aged 5 to 10 years and those aged 11 to 13 years, the children were classified into these two age groups. Each child’s calcium intake was classified as meeting or not meeting the guideline. There were 2029 children in the study. Here are the data:

Met requirement	5 to 10 years	11 to 13 years
No	194	557
Yes	861	417

We want to compare the extent to which the two age groups of children met the calcium intake requirement. Complete the following table to identify the populations, the sample sizes, the counts and the sample proportions.

Population	Sample Size	Count of Successes	Sample Proportion
5-10 years	1055	861	.8161
11-13 years	974	417	.4281

Do you think it is appropriate to construct large sample confidence interval for comparing the two population proportions? Justify your answer.

To decide whether a large-sample confidence interval is appropriate, we check three conditions: random, independent, and normal. Normal: the counts of successes and failures in each sample (861 and 194 for ages 5-10; 417 and 557 for ages 11-13) are all far greater than 5, so the normal condition is met. Independent: the two age groups consist of different children, so the two samples are independent of each other. Random: the problem does not say how the children were selected. If we assume each group is an SRS from its population, we can go ahead and construct the large-sample confidence interval; if they are not SRSs, it would not be appropriate.

Use the 99% confidence interval for the comparison.

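99% confidence interval for p1 - p2 (ages 5-10 minus ages 11-13) = (.3369, .4391)

This was calculated using sample proportions of 861/1055 = .8161 and 417/974 = .4281 (taking “Yes” as a success), a z* value of 2.576, and a SE of .0198.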
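As a check on the interval, the calculation can be run directly from the counts in the data table. This sketch assumes a "success" means meeting the requirement (the "Yes" row); the variable names are illustrative.

```r
# Counts read from the data table; a success is a "Yes" (met requirement)
x1 <- 861; n1 <- 194 + 861   # ages 5 to 10
x2 <- 417; n2 <- 557 + 417   # ages 11 to 13

p1 <- x1 / n1
p2 <- x2 / n2

# Unpooled standard error for the difference in proportions
se <- sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# 99% interval: qnorm(0.995) is the exact value of z* (about 2.576)
ci99 <- (p1 - p2) + c(-1, 1) * qnorm(0.995) * se

round(c(p1 = p1, p2 = p2, se = se, lower = ci99[1], upper = ci99[2]), 4)
```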

Use a significance test to make the comparison. Compute the test statistic and the P-value. State your conclusion at the 5% significance level.

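H0: p1 = p2, Ha: p1 ≠ p2, where p1 and p2 are the proportions of children aged 5-10 and 11-13 who meet the requirement.

Test statistic: z = 18.08 (.3880/.02146, where .3880 = 861/1055 - 417/974 and .02146 is the standard error based on the pooled proportion 1278/2029 = .6299). P-value = 2P(Z > 18.08), which is nearly 0.

Since our P-value is nearly zero, which is less than the significance level of .05, we reject the null hypothesis and conclude that the proportion of children who meet the required calcium intake differs between children aged 5-10 and children aged 11-13.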
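The pooled two-proportion z test can be computed from the counts in the data table and cross-checked against `prop.test`. This is a sketch with illustrative names, assuming a "success" means meeting the requirement; `prop.test` with `correct = FALSE` reports a chi-squared statistic that should equal the square of z.

```r
# Counts read from the data table; a success is a "Yes" (met requirement)
x1 <- 861; n1 <- 1055   # ages 5 to 10
x2 <- 417; n2 <- 974    # ages 11 to 13

# Pooled proportion and standard error under H0: p1 = p2
p_pool  <- (x1 + x2) / (n1 + n2)
se_pool <- sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

z     <- (x1 / n1 - x2 / n2) / se_pool
p_val <- 2 * pnorm(-abs(z))   # two-sided P-value

# Cross-check: chi-squared statistic without continuity correction
chisq <- unname(prop.test(c(x1, x2), c(n1, n2), correct = FALSE)$statistic)

c(z = z, z_squared = z^2, chisq = chisq, p = p_val)
```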