This project asks whether a person in a good mood is more likely to be helpful. A 1975 study examined whether a positive mood led to increased optimism and to an increase in helpfulness. The researchers, Dr. Paula F. Levin and Dr. Alice M. Isen, designed an experiment to see how often a stranger would mail a sealed envelope found in a phone booth. Sometimes, the envelope was stamped. Sometimes, the subject also found a dime in the coin return slot of the telephone. At the time of the study, a first class stamp cost $0.08, and a dime ($0.10) would cover the price of a phone call.
The dataset was found on openintro.org and is titled mail_me (1). Additional background information came from the 1975 article in Sociometry where Levin and Isen discuss their study (2). Because Levin and Isen state in the 1975 article that “there were no differences between the sexes on this measure”, I am not including gender in my analyses (4).
Data Analysis
The existing data frame consists of 4 character columns. For further analysis, I will duplicate the data frame and convert the found_coin, mailed_letter, and stamped columns to numeric. I will also explore the summary statistics for the dataset and create a simple visualization of the data.
Rows: 42 Columns: 4
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (4): stamped, found_coin, gender, mailed_letter
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# duplicate data set, convert observations to numeric
num_mail_me <- mail_me %>%
  mutate(
    mailed_letter = case_match(mailed_letter, "no" ~ 0, "yes" ~ 1),
    found_coin = case_match(found_coin, "no_coin" ~ 0, "coin" ~ 1),
    stamped = case_match(stamped, "no" ~ 0, "yes" ~ 1)
  )
head(num_mail_me)
# A tibble: 6 × 4
stamped found_coin gender mailed_letter
<dbl> <dbl> <chr> <dbl>
1 0 1 male 1
2 0 1 male 1
3 0 1 male 1
4 0 1 male 1
5 0 1 male 0
6 0 1 female 1
# perform summary statistics
summary(num_mail_me)
stamped found_coin gender mailed_letter
Min. :0.0000 Min. :0.0000 Length:42 Min. :0.0000
1st Qu.:0.0000 1st Qu.:0.0000 Class :character 1st Qu.:0.0000
Median :1.0000 Median :0.0000 Mode :character Median :1.0000
Mean :0.5714 Mean :0.4524 Mean :0.5238
3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:1.0000
Max. :1.0000 Max. :1.0000 Max. :1.0000
The summary suggests that though a coin was found less than half the time (mean = 0.4524), the envelope was mailed more than half the time (mean = 0.5238).
# how many letters were stamped?
table(mail_me$stamped)
no yes
18 24
# how many letters were mailed?
table(mail_me$mailed_letter)
no yes
20 22
# how many found coins?
table(mail_me$found_coin)
coin no_coin
19 23
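Because each recoded column holds only 0s and 1s, its mean is simply the proportion of "yes" rows. As a quick cross-check, the sketch below recomputes those proportions from the counts printed by table() above. The variable names are introduced here for illustration only.

```r
# Counts copied from the table() output above; 42 rows in total
n_total   <- 42
n_stamped <- 24  # stamped == "yes"
n_coin    <- 19  # found_coin == "coin"
n_mailed  <- 22  # mailed_letter == "yes"

# The mean of a 0/1 column equals (number of 1s) / (number of rows)
props <- c(stamped = n_stamped, found_coin = n_coin, mailed_letter = n_mailed) / n_total
print(round(props, 4))
```

The rounded values should agree with the Mean row of the summary() output (0.5714, 0.4524, 0.5238).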
For the simple visualization, I will create grouped bar charts. For consistency, I have to recode the observations for the found_coin variable (coin/no_coin) to match the yes/no labels used by stamped and mailed_letter.
I also need to convert the data set to long format.
long_mail_me_2 <- mail_me_2 %>%
  pivot_longer(cols = everything(), names_to = "variable", values_to = "response")
bar1 <- ggplot(long_mail_me_2, aes(x = variable, fill = response)) +
  geom_bar(position = "dodge") +
  labs(
    title = "Response Frequency For Coin Study",
    x = "Variable",
    y = "Count",
    caption = "Source: OpenIntro",
    fill = "Response"
  )
bar1
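The recoding step that produces mail_me_2 is described but not shown. The following is a minimal sketch of what it might look like, run on a made-up three-row stand-in rather than the real data. The names toy_mail_me, toy_mail_me_2, and toy_long are hypothetical, and dropping gender before pivoting is an assumption based on the earlier decision to leave gender out of the analysis.

```r
library(dplyr)
library(tidyr)

# Toy stand-in with the same columns and labels as mail_me (values are made up)
toy_mail_me <- tibble(
  stamped       = c("no", "yes", "yes"),
  found_coin    = c("coin", "no_coin", "coin"),
  gender        = c("male", "female", "male"),
  mailed_letter = c("yes", "no", "yes")
)

# Assumed recode: drop gender, then relabel found_coin as yes/no
toy_mail_me_2 <- toy_mail_me %>%
  select(-gender) %>%
  mutate(found_coin = case_match(found_coin, "no_coin" ~ "no", "coin" ~ "yes"))

# Long format: one row per (variable, response) pair, ready for geom_bar()
toy_long <- toy_mail_me_2 %>%
  pivot_longer(cols = everything(), names_to = "variable", values_to = "response")

print(toy_long)
```

With all three variables sharing the same yes/no labels, a single fill legend can serve every group of bars.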
Statistical Analysis
My null hypothesis is that a stamped letter without a dime was as likely to be mailed as an unstamped letter found with a dime. My alternative hypothesis is that a stamped letter without a dime is more likely to be mailed than an unstamped letter with a dime.
PS - proportion of yes-stamp, no-dime letters mailed
PD - proportion of no-stamp, yes-dime letters mailed
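In symbols, the hypotheses can be written as below. The hats, the counts x_S and x_D, and the sample sizes n_S and n_D are notation introduced here for illustration. The pooled z statistic is the textbook form of a two-proportion test, shown as background rather than quoted from the study; without continuity correction, its square corresponds to the X-squared value that prop.test reports.

```latex
\[
\begin{aligned}
H_0 &: P_S = P_D \\
H_A &: P_S > P_D \\
z &= \frac{\hat{P}_S - \hat{P}_D}
          {\sqrt{\hat{P}\,(1-\hat{P})\left(\frac{1}{n_S} + \frac{1}{n_D}\right)}},
\qquad
\hat{P} = \frac{x_S + x_D}{n_S + n_D}
\end{aligned}
\]
```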
tm__no_stamps_yes_coin <- 7
# correct = FALSE disables continuity correction, which can be a concern with such a small sample size
results <- prop.test(
  c(tm_yes_stamps_no_coin, tm__no_stamps_yes_coin),
  c(total_yes_stamps_no_coin, total_no_stamps_yes_coin),
  alternative = "greater",
  correct = FALSE
)
Warning in prop.test(c(tm_yes_stamps_no_coin, tm__no_stamps_yes_coin),
c(total_yes_stamps_no_coin, : Chi-squared approximation may be incorrect
results
2-sample test for equality of proportions without continuity correction
data: c(tm_yes_stamps_no_coin, tm__no_stamps_yes_coin) out of c(total_yes_stamps_no_coin, total_no_stamps_yes_coin)
X-squared = 6.3899, df = 1, p-value = 0.9943
alternative hypothesis: greater
95 percent confidence interval:
-0.8524793 1.0000000
sample estimates:
prop 1 prop 2
0.3076923 0.8750000
Warning in prop.test(c(tm_yes_stamps_no_coin, tm__no_stamps_yes_coin),
c(total_yes_stamps_no_coin, : Chi-squared approximation may be incorrect
results2
2-sample test for equality of proportions with continuity correction
data: c(tm_yes_stamps_no_coin, tm__no_stamps_yes_coin) out of c(total_yes_stamps_no_coin, total_no_stamps_yes_coin)
X-squared = 4.3179, df = 1, p-value = 0.9811
alternative hypothesis: greater
95 percent confidence interval:
-0.9534408 1.0000000
sample estimates:
prop 1 prop 2
0.3076923 0.8750000
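Wow. That is an incredibly high p-value, so I do NOT reject the null hypothesis. The sample estimates show why: the observed difference (0.31 versus 0.88) runs in the opposite direction from my alternative hypothesis.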
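To see where the test statistic and the high p-value come from, the uncorrected test can be reproduced by hand. This is a sketch under one assumption: the counts 4 of 13 and 7 of 8 are inferred from the printed sample estimates (0.3076923 and 0.875) and from tm__no_stamps_yes_coin being 7; they are not read from the original code. The variable names are illustrative.

```r
# Counts inferred from the printed sample estimates (assumption)
x_s <- 4; n_s <- 13  # stamped, no dime: mailed / total
x_d <- 7; n_d <- 8   # unstamped, dime:  mailed / total

p_s <- x_s / n_s
p_d <- x_d / n_d

# Pooled proportion under the null hypothesis of equal proportions
p_pool <- (x_s + x_d) / (n_s + n_d)
se <- sqrt(p_pool * (1 - p_pool) * (1 / n_s + 1 / n_d))

z <- (p_s - p_d) / se                      # negative: p_s is below p_d
x_squared <- z^2                           # compare with X-squared = 6.3899
p_greater <- pnorm(z, lower.tail = FALSE)  # upper tail, compare with 0.9943

print(c(z = z, x_squared = x_squared, p_greater = p_greater))
```

Because z is negative, almost all of the normal curve lies above it, which is why the "greater" alternative yields a p-value close to 1.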
But what if I reverse the alternative hypothesis? What if I predict that no-stamp/yes-dime letters are more likely to be mailed?
# correct = FALSE disables continuity correction, which can be a concern with such a small sample size
results3 <- prop.test(
  c(tm_yes_stamps_no_coin, tm__no_stamps_yes_coin),
  c(total_yes_stamps_no_coin, total_no_stamps_yes_coin),
  alternative = "less",
  correct = FALSE
)
Warning in prop.test(c(tm_yes_stamps_no_coin, tm__no_stamps_yes_coin),
c(total_yes_stamps_no_coin, : Chi-squared approximation may be incorrect
results3
2-sample test for equality of proportions without continuity correction
data: c(tm_yes_stamps_no_coin, tm__no_stamps_yes_coin) out of c(total_yes_stamps_no_coin, total_no_stamps_yes_coin)
X-squared = 6.3899, df = 1, p-value = 0.005738
alternative hypothesis: less
95 percent confidence interval:
-1.0000000 -0.2821361
sample estimates:
prop 1 prop 2
0.3076923 0.8750000
Look at the tiny p-value!
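The tiny p-value and the earlier huge one are two tails of the same statistic. The sketch below, again assuming the inferred counts 4 of 13 and 7 of 8, computes the lower tail, its complement, and the doubled tail used by a two-sided test. Names such as p_less and p_two_sided are illustrative.

```r
# Same pooled z statistic as the uncorrected prop.test (inferred counts, an assumption)
p_pool <- (4 + 7) / (13 + 8)
z <- (4 / 13 - 7 / 8) / sqrt(p_pool * (1 - p_pool) * (1 / 13 + 1 / 8))

p_less      <- pnorm(z)            # lower tail, compare with 0.005738
p_greater   <- 1 - p_less          # upper tail, compare with 0.9943
p_two_sided <- 2 * pnorm(-abs(z))  # both tails, twice the smaller one

print(c(less = p_less, greater = p_greater, two_sided = p_two_sided))
```

The one-sided p-values always sum to 1 here, so choosing the direction after seeing the data simply swaps a large p-value for a small one.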
If I had hypothesized that the two proportions were not equal, I would also be able to reject the null hypothesis.
# correct = FALSE disables continuity correction, which can be a concern with such a small sample size
results <- prop.test(
  c(tm_yes_stamps_no_coin, tm__no_stamps_yes_coin),
  c(total_yes_stamps_no_coin, total_no_stamps_yes_coin),
  correct = FALSE
)
Warning in prop.test(c(tm_yes_stamps_no_coin, tm__no_stamps_yes_coin),
c(total_yes_stamps_no_coin, : Chi-squared approximation may be incorrect
results
2-sample test for equality of proportions without continuity correction
data: c(tm_yes_stamps_no_coin, tm__no_stamps_yes_coin) out of c(total_yes_stamps_no_coin, total_no_stamps_yes_coin)
X-squared = 6.3899, df = 1, p-value = 0.01148
alternative hypothesis: two.sided
95 percent confidence interval:
-0.9071106 -0.2275048
sample estimates:
prop 1 prop 2
0.3076923 0.8750000
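Every prop.test call above triggers the warning "Chi-squared approximation may be incorrect", which R raises when expected cell counts are small. One common alternative for small 2x2 tables is Fisher's exact test, available in base R as fisher.test. This is a swapped-in technique, not part of the original analysis, and the cell counts below are inferred from the printed sample estimates (4 of 13 and 7 of 8 mailed), so treat it as a sketch.

```r
# 2x2 table of inferred counts (assumption): rows are groups, columns are outcomes
counts <- matrix(
  c(4, 9,   # stamped, no dime: mailed, not mailed
    7, 1),  # unstamped, dime:  mailed, not mailed
  nrow = 2, byrow = TRUE,
  dimnames = list(
    group  = c("stamp_no_coin", "no_stamp_coin"),
    mailed = c("yes", "no")
  )
)

# Exact two-sided test; it does not rely on the chi-squared approximation
exact_result <- fisher.test(counts)
print(exact_result$p.value)
```

The printed value can be compared with the two-sided p-value of 0.01148 from prop.test.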
Tableau Plot
Visit https://public.tableau.com/app/profile/annet.isa/viz/DATA101/Sheet1 to see an interactive plot.
Screenshot of Visualization
Conclusion
The statistical analysis is eye-opening. Even though my initial premise (that there is no difference in the proportion of letters mailed in the yes-stamp/no-dime and no-stamp/yes-dime scenarios) is incorrect, I was not able to reject it in my first prop.test. As I gain familiarity with statistical testing, I will make it a habit to phrase my alternative hypothesis in different ways.
I am surprised people in 1975 were more likely to buy a stamp to mail a letter than drop an already-stamped letter in a mailbox.
An interesting aspect of this vintage 1975 study is that the study leads expected someone to be happy or to be in a better mood if they found a dime. Between inflation (a 1974 dime was $0.57 in 2023, a first class stamp $0.66 (3)) and the lack of payphones in 2024, the study would be difficult to replicate today.
Citations
(1) “Influence of a Good Mood on Helpfulness” https://www.openintro.org/data/index.php?data=mail_me
(2) “Further Studies on the Effect of Feeling Good on Helping” https://www.jstor.org/stable/2786238