Recalling Words

Background

Many teachers and other educators are interested in understanding how to best deliver new content to students. In general, they have two choices of how to do this.

The Meshed Approach
- Deliver new content while simultaneously reviewing previously understood content.
The Before Approach
- Deliver new content after fully reviewing previously understood content.

A study was performed to determine whether the Meshed or Before approach to delivering content had any benefit for memory recall.

The Experiment

The Data

The results from the study can be found in the Friendly data set in R after loading library(car).

library(car); library(DT)   # car provides Friendly (via carData); DT provides datatable()
datatable(Friendly, options = list(lengthMenu = c(3, 10, 30)))

Hypothesis Tests

The purpose of this analysis is to determine whether the Meshed or Before method has any benefit for memory recall. To do this, three separate tests are set up below, each comparing two of the three conditions (Meshed, Before, and the SFR control). Running several tests raises the multiple comparisons problem: the probability of at least one Type I error across the tests grows beyond the significance level of any single test. To keep that overall probability within the desired limit of 0.05, the Bonferroni correction is used. It divides \(\alpha\) by m, the number of tests being performed. In this case that yields \(\alpha = 0.05/3 \approx 0.0167\), but for simplicity \(\alpha\) is set to 0.01. This ensures that across all three tests the total probability of a Type I error is less than 0.05.
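The Bonferroni arithmetic described above can be checked with a few lines of base R. This is only a sketch; the variable names (alpha_family, m, alpha_bonferroni, alpha_used, fwer_bound) are illustrative and do not appear in the original analysis.

```r
# Illustrative names; not part of the original analysis code
alpha_family <- 0.05   # desired overall (family-wise) Type I error rate
m <- 3                 # number of pairwise tests performed

alpha_bonferroni <- alpha_family / m   # exact Bonferroni threshold, 0.05 / 3
alpha_used <- 0.01                     # simpler, stricter threshold chosen in the text

# Bonferroni bound on the chance of at least one Type I error across all m tests
fwer_bound <- m * alpha_used

round(alpha_bonferroni, 4)   # 0.0167
fwer_bound                   # 0.03, which is below alpha_family
```

Because 0.01 is smaller than 0.05/3, using it as the per-test threshold is slightly more conservative than the exact Bonferroni value.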

Test 1

\[ H_0: \text{median}_{\text{Meshed}} - \text{median}_{\text{SFR}} = 0 \]
\[ H_a: \text{median}_{\text{Meshed}} - \text{median}_{\text{SFR}} \neq 0 \]

Test 2

\[ H_0: \text{median}_{\text{Before}} - \text{median}_{\text{SFR}} = 0 \]
\[ H_a: \text{median}_{\text{Before}} - \text{median}_{\text{SFR}} \neq 0 \]

Test 3

\[ H_0: \text{median}_{\text{Meshed}} - \text{median}_{\text{Before}} = 0 \]
\[ H_a: \text{median}_{\text{Meshed}} - \text{median}_{\text{Before}} \neq 0 \]

Analysis

boxplot(correct ~ condition, data = Friendly, col = "firebrick",
        main = "Number of words recalled",
        names = c("Before", "Meshed", "SFR (control)"))
# Overlay the individual observations on top of the boxes
stripchart(correct ~ condition, data = Friendly, vertical = TRUE,
           method = "stack", add = TRUE, pch = 19)

Summary Stats

library(dplyr); library(pander)   # dplyr for the pipe and summaries; pander for the table
Friendly %>%
  group_by(condition) %>%
  summarise(min = min(correct), median = median(correct), mean = mean(correct),
            max = max(correct), sd = sd(correct), `Number of Trials` = n()) %>%
  pander(caption = "Number of words recalled")

Number of words recalled
condition	min	median	mean	max	sd	Number of Trials
Before	24	39	36.6	40	5.337	10
Meshed	30	36.5	36.6	40	3.026	10
SFR	21	27	30.3	39	7.334	10

Here it can be seen that all three conditions had the same number of observations, 10 each. The Before method has the highest median and a mean equal to that of the Meshed method. It is important to note that the Meshed method has the least spread in the data, with a standard deviation considerably smaller than those of the other two conditions.

Test Results
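
The test calls in this section rely on three data frames, Meshed_SFR, Before_SFR, and Before_Meshed, whose construction is not shown. A minimal sketch of how they might have been built follows, assuming each one simply keeps the rows of Friendly for two of the three conditions. The use of subset() and droplevels() is an assumption, not the author's confirmed code.

```r
library(car)   # loads carData, which provides the Friendly data set

# Assumed reconstruction: each subset keeps two of the three conditions.
# droplevels() removes the unused factor level so exactly two groups remain.
Meshed_SFR    <- droplevels(subset(Friendly, condition %in% c("Meshed", "SFR")))
Before_SFR    <- droplevels(subset(Friendly, condition %in% c("Before", "SFR")))
Before_Meshed <- droplevels(subset(Friendly, condition %in% c("Before", "Meshed")))

# Each subset should hold 20 observations, 10 per remaining condition
table(Meshed_SFR$condition)
```

With the formula interface, wilcox.test() compares the first factor level with the second, so for Before_Meshed the estimated shift is Before minus Meshed. For a two-sided test the p-value is the same in either direction.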

Test 1

pander(wilcox.test(correct ~ condition, data = Meshed_SFR, mu = 0, alternative = "two.sided", conf.level = 0.99),
       caption = "Meshed - SFR", split.table=Inf)

Meshed - SFR *Insufficient evidence to reject the null
Test statistic	P value	Alternative hypothesis
72	0.1015	two.sided

Test 2

pander(wilcox.test(correct ~ condition, data = Before_SFR, mu = 0, alternative = "two.sided", conf.level = 0.99),
       caption = "Before - SFR", split.table=Inf)

Before - SFR *Insufficient evidence to reject the null at the adjusted \(\alpha\) of 0.01 (p-value is below 0.05 only)
Test statistic	P value	Alternative hypothesis
76.5	0.04555 *	two.sided

Test 3

pander(wilcox.test(correct ~ condition, data = Before_Meshed, mu = 0, alternative = "two.sided", conf.level = 0.99),
       caption = "Meshed - Before", split.table=Inf)

Meshed - Before *Insufficient evidence to reject the null
Test statistic	P value	Alternative hypothesis
62	0.378	two.sided
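
An equivalent way to apply the Bonferroni correction is to adjust the p-values rather than \(\alpha\), and then compare them with the overall 0.05 level. The sketch below feeds the three p-values reported above into base R's p.adjust(); the vector names p_raw and p_adj are illustrative. Note that this corresponds to the exact threshold of 0.05/3 rather than the stricter 0.01 used in this analysis.

```r
# P values copied from the three test tables above
p_raw <- c(Meshed_SFR = 0.1015, Before_SFR = 0.04555, Meshed_Before = 0.378)

# Bonferroni adjustment multiplies each p-value by the number of tests (capped at 1)
p_adj <- p.adjust(p_raw, method = "bonferroni")
p_adj          # 0.3045, 0.13665, 1
p_adj < 0.05   # TRUE would mark a comparison significant at the family-wise 0.05 level
```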

Interpretation

After performing the three tests, none was found significant at the Bonferroni-adjusted \(\alpha\) of 0.01. Test 2, in which the Before group was compared to the control (SFR) group, came closest, with a p-value of 0.04555. That would be significant at an unadjusted level of 0.05, so the data hint that students who were taught with the Before method fared better than those in the control, but the evidence is not strong enough to conclude this once the multiple comparisons are accounted for. When the Before and Meshed methods were tested against each other, there was insufficient evidence to reject the null and conclude that one method was better than the other. Further experiments could be carried out in the future to allow for a better analysis of whether the Before or Meshed method is best when compared to each other. In the end, with the data given, the Before method shows the most promise, but a benefit over no method at all is not established at the chosen significance level.

Appendix

It was determined that the Wilcoxon Rank Sum test was the appropriate test for this data set because the sample is small, with only 30 observations (10 per condition). Also, when the data were plotted on QQ plots to check for normality, the plots showed that the data were not normally distributed. The QQ plots can be seen below.

qqPlot(correct ~ condition, data = Friendly, ylab = "Number of words recalled correctly")     # data is not normally distributed
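
As a numeric complement to the QQ plots, a Shapiro-Wilk test can be run within each condition. This check is not part of the original analysis and is offered only as a sketch; with 10 observations per group its power is limited, so the plots remain the primary evidence. The name shapiro_p is illustrative.

```r
library(car)   # loads carData, which provides the Friendly data set

# Shapiro-Wilk p-value for each condition; small values suggest non-normality
shapiro_p <- tapply(Friendly$correct, Friendly$condition,
                    function(x) shapiro.test(x)$p.value)
round(shapiro_p, 4)
```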