Confidence Interval - Lab 5

Tam, Richie, Nick

TVAScore <- c(20,27,32,35,40,44,49,53,54,54,55,57,61,64,64,66,69,70,72,72,73,75,77,77,78,79,79,82,85,86,90,97,98,101,107,114,115,115,119,120)

We plotted the data set as a histogram (below). The distribution appears approximately normal: it is roughly symmetric and bell-shaped.

hist(TVAScore)
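
With only 40 observations, a histogram can look bell-shaped or not depending on the bins, so a second check may be useful. The sketch below is a possible supplement and not part of the original lab: it draws a normal Q-Q plot and runs a Shapiro-Wilk test with base R's `qqnorm()`, `qqline()`, and `shapiro.test()`.

```r
# Same scores as above, repeated so this block runs on its own
TVAScore <- c(20,27,32,35,40,44,49,53,54,54,55,57,61,64,64,66,69,70,72,72,
              73,75,77,77,78,79,79,82,85,86,90,97,98,101,107,114,115,115,119,120)

# Points close to the reference line suggest approximate normality
qqnorm(TVAScore)
qqline(TVAScore)

# Shapiro-Wilk test: the null hypothesis is that the data are normal
normality <- shapiro.test(TVAScore)
normality$p.value
```

A large p-value means the test found no evidence against normality; it does not prove that the data are normal.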

Calculate the mean and standard deviation for the sample.

# calculate the mean and the standard deviation
library(dplyr)  # provides %>% and summarise()
data.frame(TVAScore = TVAScore) %>% 
  summarise(mean = mean(TVAScore),
            sd = sd(TVAScore))

##     mean       sd
## 1 73.125 25.88454

The results show that the mean of the sample is 73 and the SD of the sample is 26.

If the true standard deviation for the population from which this sample was drawn is 12, compute the standard deviation of the sampling distribution.

The SD of the sampling distribution (the standard error of the mean) is computed by dividing the population SD (12) by the square root of the sample size (40):

sdSample <- 12/(sqrt(40))

The standard deviation of the sampling distribution is approximately 1.9.
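
The same calculation could be wrapped in a small helper so that it can be reused for other sample sizes. This is a minimal sketch; `std_error` is a hypothetical name introduced here for illustration.

```r
# Standard deviation of the sampling distribution of the mean,
# assuming the population SD (sigma) is known
std_error <- function(sigma, n) {
  stopifnot(sigma > 0, n > 0)
  sigma / sqrt(n)
}

se_40 <- std_error(sigma = 12, n = 40)
round(se_40, 2)
```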

Calculate a 95% confidence interval for the true population mean of these scores. In other words, within what range of scores would you be 95% confident that the true mean score for the population lies?

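The 95% confidence interval for the true population mean is given by the formula: xbar +/- z*(σ/sqrt(n)), where z = 1.960, σ = 12, and n = 40. Because sdSample already equals σ/sqrt(n), the margin of error is:

1.960*sdSample

## [1] 3.718839

From the result above, the 95% confidence interval is 73.1 +/- 3.7, or roughly 69.4 to 76.8.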
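
As a sketch of the formula in code, the function below (hypothetical name `z_interval`) builds the interval from the sample mean, the known population SD, and the sample size. It looks up the critical value with `qnorm()` instead of hard-coding 1.960, and it divides by sqrt(n) only once, since σ/sqrt(n) is already the standard error.

```r
# z-based confidence interval for a mean when sigma is known
z_interval <- function(xbar, sigma, n, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)   # two-sided critical value
  margin <- z * sigma / sqrt(n)     # z times the standard error
  c(lower = xbar - margin, upper = xbar + margin, margin = margin)
}

ci_95 <- z_interval(xbar = 73.125, sigma = 12, n = 40)
round(ci_95, 2)
```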

Are your results trustworthy? Why or why not?

Our sample size (40) is reasonably large, which makes the estimate of the mean more trustworthy. However, the sample standard deviation (about 26) is more than twice the assumed population standard deviation (12). This disparity suggests that either the sample was not drawn from that population or the assumed value of 12 is wrong, which makes our calculated results less trustworthy.
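
One way to probe this concern is to see what the interval looks like when σ is not assumed. The sketch below is an alternative added for comparison, not the method the lab asks for: base R's `t.test()` builds a 95% interval from the sample SD and the t distribution, and the result is compared with the z-based interval that assumes σ = 12.

```r
# Same scores as above, repeated so this block runs on its own
TVAScore <- c(20,27,32,35,40,44,49,53,54,54,55,57,61,64,64,66,69,70,72,72,
              73,75,77,77,78,79,79,82,85,86,90,97,98,101,107,114,115,115,119,120)

# t-based interval: uses the sample SD instead of an assumed sigma
t_ci <- as.numeric(t.test(TVAScore, conf.level = 0.95)$conf.int)

# z-based interval: assumes the population SD is 12
z_margin <- qnorm(0.975) * 12 / sqrt(length(TVAScore))
z_ci <- mean(TVAScore) + c(-1, 1) * z_margin

# Widths of the two intervals
c(t_width = diff(t_ci), z_width = diff(z_ci))
```

Because the sample SD is larger than 12, the t-based interval comes out wider than the z-based one, so the conclusion depends on whether σ = 12 can be trusted.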

Now you give the TVA to a larger sample of 500 young adults. The mean score of your new, larger sample is 70.

If the true standard deviation for the population is still 12, compute the standard deviation of the sampling distribution.

The standard deviation of the sampling distribution is given by the formula: σ/sqrt(n), where σ = 12 and n = 500.

12/(sqrt(500))

## [1] 0.5366563

Since the sample size has increased, the standard deviation of the sampling distribution has decreased, from about 1.9 (n = 40) to about 0.54 (n = 500).
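
To illustrate how the standard deviation of the sampling distribution depends on sample size, the sketch below evaluates σ/sqrt(n) for several values of n. Only n = 40 and n = 500 come from the lab; the other sizes are illustrative assumptions.

```r
sigma <- 12                    # population SD given in the lab
n <- c(40, 100, 500, 1000)     # 100 and 1000 are illustrative only
se <- sigma / sqrt(n)

data.frame(n = n, se = round(se, 3))
```

The value shrinks with the square root of n, so quadrupling the sample size only halves it.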

Give the 90% confidence interval for the true population mean of these scores.

The 90% confidence interval for the true population mean is given by the formula: xbar +/- z*(σ/sqrt(n)), where

C.I.: 90% -> z = 1.645 σ = 12 n = 500

1.645*(12/(sqrt(500)))

## [1] 0.8827996

Therefore, our confidence interval is equal to 70 +/- 0.88.

Now give a 95% confidence interval. Make sure your new interval has an appropriate width when compared with the previous question.

Using the same formula as above, where

C.I.: 95% -> z = 1.960 σ = 12 n = 500

1.960*(12/(sqrt(500)))

## [1] 1.051846

Therefore, our confidence interval is now equal to 70 +/- 1.05. Compared to the previous question, our margin of error has increased, which makes sense given that the confidence level has also increased.
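
The relationship between confidence level and interval width can be checked by computing the margin of error at several levels for the same sample. This is a sketch assuming σ = 12 and n = 500 as stated in the lab; the 99% level is an extra, illustrative value.

```r
conf_levels <- c(0.90, 0.95, 0.99)        # 0.99 is illustrative only
z <- qnorm(1 - (1 - conf_levels) / 2)     # two-sided critical values
margin <- z * 12 / sqrt(500)              # z times sigma / sqrt(n)

data.frame(level = conf_levels, z = round(z, 3), margin = round(margin, 2))
```

A higher confidence level requires a larger critical value, so the interval widens even though the data are unchanged.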

Would you feel confident about using the results of this test as a standard against which to compare the scores of any group within the population (e.g., the scores of a sample of women, a sample of Asian Americans, etc.)?

We feel confident about using the results as a standard, with some caveats. With a sample of 500, our 95% confidence interval has a small margin of error, so we are fairly certain of the population mean. However, the test may be misleading if used for samples outside the category of “young adults”, as the cognitive functions of youth, adults, and elderly people differ. If this mean score is used to compare subgroups within the category “young adults”, our results could be useful. We are also unsure how random the 500-person sample is. For example, if the sample was not representative with respect to socio-cultural characteristics (like race or sex), it would not be fair to use our results as a standard when comparing groups defined by those characteristics.

Compare the results of the two samples. What do they tell you?

The 95% confidence intervals are 73.1 +/- 3.7 for the sample of 40 and 70 +/- 1.05 for the sample of 500. With the larger sample (500), the margin of error is much smaller than with the smaller sample (40). This is reasonable, because a larger sample gives a more precise estimate of the population mean.
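
A side-by-side view of the two 95% intervals, sketched under the assumption that σ = 12 applies to both samples, also shows whether they overlap:

```r
# Sample sizes and sample means from the two parts of the lab
samples <- data.frame(n = c(40, 500), xbar = c(73.125, 70))

# 95% margin of error and interval endpoints, assuming sigma = 12
samples$margin <- qnorm(0.975) * 12 / sqrt(samples$n)
samples$lower <- samples$xbar - samples$margin
samples$upper <- samples$xbar + samples$margin

# Two intervals overlap when the larger lower bound
# does not exceed the smaller upper bound
overlap <- max(samples$lower) <= min(samples$upper)

samples
overlap
```

In this calculation the intervals overlap, which is consistent with both samples estimating the same population mean.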