Introduction

This investigation examines a stereotype I have heard many times: that women have better short-term memory than men. To assess the validity of this stereotype, we had adults in the fictional town of Takazaki take a memory quiz and recorded their scores.

Our population parameter of interest is the difference between the average memory quiz score of the women and the average memory quiz score of the men in the town. For the memory quiz, each participant studied a set of 20 cards for 1 minute, waited 30 seconds without seeing the cards, and was then asked to recall as many of the cards as possible. The score is the number of cards recalled correctly out of the 20 shown.

Our research question is: Do women in Takazaki have better short-term memories than men in Takazaki? This means I am trying to determine whether the parameter is positive (women have higher average memory scores), negative (men have higher average memory scores), or zero (men and women have the same average memory score). Before I saw any data, I expected the women to have a higher average memory score, based on personal observation. After glancing at the data, my conjecture was still that the parameter is positive, because the men seemed to have a lower average quiz score.

Data Collection Methods

The town of Takazaki was the setting for our investigation into short-term memory. We looked at men and women over the age of 18. The observational units are the adults of Takazaki, and the variables are gender (categorical) and memory test score (quantitative). We recorded each score as a proportion between 0 and 1 (cards recalled correctly divided by 20) to make the data easier to read.

To get a random sample, we chose 40 random houses with a random number generator and asked all the adults in those houses for their consent to be in the study. Of the 67 adults living in those houses, 52 agreed to participate. We then picked more houses, one at a time, with the random number generator until we reached 32 observational units in the women's group and 32 in the men's group. When people in a selected house declined to participate, we moved on to another randomly chosen household.
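
The household-selection step could be sketched in R roughly as follows. This is only an illustration, not the code used in the study: the total of 500 houses and the seed are assumptions made up for the example, since the report does not state how many houses Takazaki has.

```r
# Hypothetical: assume the town's houses are numbered 1..500 (not stated in the report).
n_houses <- 500
set.seed(1)  # arbitrary seed so the example is reproducible

# Draw 40 distinct house numbers at random, without replacement.
first_wave <- sample(seq_len(n_houses), size = 40)

# If more participants are needed, draw additional houses one at a time
# from those not yet visited.
remaining <- setdiff(seq_len(n_houses), first_wave)
next_house <- sample(remaining, size = 1)
```

Sampling without replacement ensures that no house is visited twice.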

Analysis of Results

  1. Define the population(s) and parameter(s) (again) in words

The population is the adults who live in the town of Takazaki. The parameter is the difference between the average memory score of the women in the town and the average memory score of the men in the town.

  1. State the null and alternative hypotheses in symbols and in words

\(\mu_{w}:\) average memory score of the women in the town Takazaki. \(\mu_{m}:\) average memory score of the men in the town Takazaki.

\(H_0:\mu_{w}=\mu_{m}\) The null hypothesis states that the average memory scores of men and women are equal. \(H_a:\mu_{w}>\mu_{m}\) The alternative hypothesis states that women have a higher average memory score compared to men.

  1. State what a type I and a type II error would represent in this setting

A type I error would mean that we rejected the null hypothesis, that the average memory scores of men and women are equal, when in reality the null hypothesis is true.

A type II error would mean that we failed to reject the null hypothesis when, in reality, the women of Takazaki have a higher average memory score than the men.
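
To make the idea of a type I error concrete, the sketch below simulates many studies in which the null hypothesis is true and counts how often a one-sided t-test rejects it anyway. The population mean (0.6), standard deviation (0.2), normal shape, and the 0.05 cutoff are all assumptions chosen for illustration; none of them come from the Takazaki data.

```r
# Hypothetical population: both groups share the same mean, so the null is true.
set.seed(42)
n_sims <- 2000
alpha <- 0.05  # assumed significance level

rejections <- replicate(n_sims, {
  men   <- rnorm(32, mean = 0.6, sd = 0.2)  # made-up mean and SD
  women <- rnorm(32, mean = 0.6, sd = 0.2)
  # One-sided test of H_a: mu_w > mu_m
  t.test(women, men, alternative = "greater")$p.value < alpha
})

# Proportion of simulated studies that commit a type I error.
type1_rate <- mean(rejections)
type1_rate
```

The proportion should land close to the chosen cutoff, which is what it means for the significance level to control the type I error rate.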

  1. Discuss/justify whether or not your measurements can reasonably be considered a representative sample from the population(s) of interest

My sample can reasonably be considered representative of the population because we selected households through random sampling. However, some of the selected adults declined to participate, and the sample is smaller than we would like.

  1. Use a theory-based approach and appropriate R code to
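library(readr)
# mosaic supplies stat(), pval(), and the formula interface to mean() used below
library(mosaic)
# Read the recorded scores; the CSV file must be in the working directory
Islands_Mini_Project_Sheet1 <- 
  read_csv("Islands Mini Project - Sheet1.csv")
head(Islands_Mini_Project_Sheet1, n=2)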
  1. Find an appropriate test statistic and comment on appropriate validity conditions

Because our data consist of one categorical variable and one quantitative variable, the appropriate test is a two-sample t-test, and the test statistic is the t-statistic. The validity conditions for a two-sample t-test are that the quantitative variable has a symmetric distribution in both groups, or that there are at least 20 observations in each group and the sample distributions are not strongly skewed.

We can see that there are at least 20 observational units in both the men's and women's groups (32 each), but the sample distributions look skewed to the left, and neither group has a symmetric distribution. We therefore do not consider the validity conditions for the two-sample t-test to be met, so the theory-based results that follow should be read with caution.

histogram(~score | Gender, data = Islands_Mini_Project_Sheet1, width = .1, layout = c(0, 1))
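
Alongside the histograms, the group sizes and the direction of skew can be checked numerically. The sketch below uses base R only and a small made-up data frame as a stand-in for the real scores, so the numbers it prints say nothing about the Takazaki sample; the `skewness` helper is written here for illustration and is not from any package.

```r
# Hypothetical stand-in data; the real scores live in the project's CSV file.
scores <- data.frame(
  Gender = rep(c("Man", "Woman"), each = 5),
  score  = c(0.40, 0.55, 0.60, 0.65, 0.70,
             0.50, 0.70, 0.75, 0.80, 0.85)
)

# Number of observations in each group (the condition asks for at least 20).
group_n <- table(scores$Gender)

# Sample skewness: third central moment over the cubed standard deviation.
# Negative values indicate a longer left tail.
skewness <- function(x) mean((x - mean(x))^3) / (mean((x - mean(x))^2))^1.5
group_skew <- tapply(scores$score, scores$Gender, skewness)

group_n
group_skew
```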

stat(t.test(score~Gender, data = Islands_Mini_Project_Sheet1))
##         t 
## -2.159962
diff(mean(score~Gender, data = Islands_Mini_Project_Sheet1))
##     Woman 
## 0.1078125

The standardized statistic, or t-statistic, is -2.16. R computes the difference as men minus women, so the sample difference is -0.108, which lies 2.16 standard errors below the hypothesized difference of zero.
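
The t-statistic can also be computed by hand, which shows where its sign comes from. The sketch below uses base R and made-up scores (not the study's data), and it assumes the unpooled (Welch) standard error that `t.test()` uses by default.

```r
# Hypothetical stand-in data (not the study's scores).
men   <- c(0.40, 0.55, 0.60, 0.65, 0.70, 0.45)
women <- c(0.50, 0.70, 0.75, 0.80, 0.85, 0.65)

# Difference in sample means, in the same order t.test() uses for
# score ~ Gender with groups "Man" and "Woman": first group minus second.
diff_means <- mean(men) - mean(women)

# Unpooled standard error of the difference.
se <- sqrt(var(men) / length(men) + var(women) / length(women))

# t-statistic: how many standard errors the difference lies from zero.
t_by_hand <- diff_means / se

# Cross-check against R's built-in test (Welch by default).
t_builtin <- unname(t.test(men, women)$statistic)
c(by_hand = t_by_hand, builtin = t_builtin)
```

Because the groups are ordered alphabetically, a negative statistic here corresponds to the women scoring higher than the men.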

  1. Find the p-value corresponding to your alternative hypothesis
pval(t.test(score~Gender, data = Islands_Mini_Project_Sheet1))
##    p.value 
## 0.03475947

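The p-value of about 0.035 is the probability, assuming the null hypothesis is true, of obtaining a difference in sample means at least as extreme as the one we observed. Such a result would be unlikely if the null hypothesis were true. Note that t.test() reports a two-sided p-value by default. Because our alternative hypothesis is one-sided and the sample difference is in the hypothesized direction, the corresponding one-sided p-value is half of this, about 0.017.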
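
A one-sided p-value can be requested directly through the `alternative` argument of `t.test()`. The sketch below uses base R and made-up scores rather than the study's data. Since R subtracts women from men, the alternative \(\mu_{w}>\mu_{m}\) corresponds to `alternative = "less"`.

```r
# Hypothetical stand-in data (not the study's scores).
men   <- c(0.40, 0.55, 0.60, 0.65, 0.70, 0.45)
women <- c(0.50, 0.70, 0.75, 0.80, 0.85, 0.65)

# Default: two-sided p-value.
two_sided <- t.test(men, women)$p.value

# H_a: mu_w > mu_m is the same as mu_m - mu_w < 0, hence "less".
one_sided <- t.test(men, women, alternative = "less")$p.value

c(two_sided = two_sided, one_sided = one_sided)
```

When the sample difference falls in the direction of the alternative, as it does here, the one-sided p-value is exactly half of the two-sided one.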

  1. Indicate what statistical decision this p-value leads you to draw about the null hypothesis

This p-value is below the 0.05 significance level, so we reject the null hypothesis.

  1. State your conclusion in the context of the problem

We reject the null hypothesis that the difference in mean memory scores between the men and women of Takazaki is zero. We have strong evidence that women in Takazaki have a higher mean memory score than men.

  1. Use R to find an appropriate confidence interval to describe the plausible values of your population parameter
confint(t.test(score~Gender, data = Islands_Mini_Project_Sheet1))

The 95% confidence interval for the difference in mean scores (men minus women) is -0.208 to -0.008.

  1. Interpret the confidence interval in the context of the problem.

We are 95% confident that the difference in mean memory scores between the men and women of Takazaki (men minus women) is between -0.208 and -0.008. In other words, we are 95% confident that women score, on average, between 0.008 and 0.208 higher than men on the 0 to 1 scale. Zero is not in this interval, so zero is not a plausible value for the difference. This agrees with our decision to reject the null hypothesis in favor of the alternative that women have a higher average memory score than men.
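
The interval is negative only because of the order of subtraction. The sketch below, which uses base R and made-up scores rather than the study's data, shows that swapping the groups negates and reverses the endpoints without changing the conclusion.

```r
# Hypothetical stand-in data (not the study's scores).
men   <- c(0.40, 0.55, 0.60, 0.65, 0.70, 0.45)
women <- c(0.50, 0.70, 0.75, 0.80, 0.85, 0.65)

# 95% confidence interval for mu_m - mu_w (the order R uses for Man, Woman).
ci_m_minus_w <- as.numeric(t.test(men, women, conf.level = 0.95)$conf.int)

# Reversing the order of subtraction negates and swaps the endpoints.
ci_w_minus_m <- as.numeric(t.test(women, men, conf.level = 0.95)$conf.int)

rbind(men_minus_women = ci_m_minus_w, women_minus_men = ci_w_minus_m)
```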

Conclusion

In this study, we found strong evidence that women in Takazaki have a higher average memory quiz score than men in Takazaki. In our sample, the women's mean score was 0.108 higher than the men's. To judge whether a difference this large could plausibly be due to chance alone, we used a theory-based two-sample t-test, which describes how the difference in sample means would vary if the null hypothesis were true. The resulting p-value of 0.035 (two-sided, or about 0.017 for our one-sided alternative) shows that a result like ours would be unlikely if the null hypothesis were true, so we rejected the null hypothesis. The 95% confidence interval for the difference in means does not contain zero, which agrees with that decision.

The data behaved as I expected, and I would tentatively say that it is appropriate to generalize this result to the adults of Takazaki. The generalization is tentative because the sample distributions were skewed and the sample was fairly small. If I could do this study again, I would collect a much larger sample to be more confident that it is representative of the entire town.

To build on these results, someone could give the same observational units a different kind of memory test, such as a long-term memory test, to see whether this finding holds for other types of memory.