I. Initialization Block

Initializing RStudio

The primary data set is Data3350, which was produced in 2015 during an undergraduate research project on personality and humor. The VarsData3350 PDF file describes each variable in the Data3350 file. Both are available for download in D2L. Put the Data3350 file in your R folder in Documents, and set your working directory to that same folder (Session menu). The code block below uses the library function to load the mosaic and readxl packages, then imports the data frame used in this module: Data3350.

library(mosaic)
library(readxl)
Data3350 = read_excel("Data3350.xlsx")
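
If the import fails, the usual cause is a working directory that does not contain the file. The following is a minimal sketch of a pre-flight check, not part of the original module. The names data_file and file_found are hypothetical, and the file name is taken from the text above.

```r
# Hypothetical pre-flight check before importing the data.
data_file <- "Data3350.xlsx"   # file name as given in the text above

# Show where R is currently looking for files.
cat("Working directory:", getwd(), "\n")

file_found <- file.exists(data_file)
if (file_found) {
  # read_excel() comes from the readxl package
  Data3350 <- readxl::read_excel(data_file)
  str(Data3350)
} else {
  message("Cannot find ", data_file, " in ", getwd(),
          "; set the working directory from the Session menu or with setwd().")
}
```

If the message appears, move the file or change the working directory, then rerun the import.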

II. Exercises

  1. Use the thrill-seeking variable Thrill from the Data3350 data frame to test for a significant difference between younger students and older students (G21 variable, with “Y” meaning yes, 21 or older). Test whether thrill-seeking levels are higher for younger students at the 0.05 level of significance.

  2. Use the toxic relationship beliefs variable TxRel from the Data3350 data frame to test for a significant difference between females and males (Sex variable). Use the 0.05 level of significance.

  3. Use the Anxiety variable Anx from the Data3350 data frame to test for a significant difference between females and males (Sex variable). Use the 0.05 level of significance.

  4. Use the Neuroticism variable Neuro from the Data3350 data frame to test for a significant difference between students who are involved in social Greek fraternities and sororities and those who are not. Use the 0.05 level of significance.

  5. Use the Neuroticism variable Neuro from the Data3350 data frame to test for a significant difference in levels of Neuroticism based upon Primary Humor Style (PHS variable). Use the 0.05 level of significance, and conduct a post hoc Tukey HSD test if needed, including an mplot.

  6. Use the weight variable from the built-in R data frame ChickWeight to test for a significant difference in the growth of baby chicks based upon the grouping variable Diet. Use the 0.01 level of significance, and conduct a post hoc Tukey HSD test if needed, including an mplot.

  7. Use the Self-Esteem variable SE from the Data3350 data frame to test for a significant difference in levels of Self-Esteem based upon Primary Humor Style (PHS variable). Use the 0.05 level of significance, and conduct a post hoc Tukey HSD test if needed, including an mplot.

  8. Use the count variable from the built-in R data frame InsectSprays to test for a significant difference in the number of insects counted in a certain area based on the type of insecticide used. The variable spray is the grouping variable. Use the 0.01 level of significance, and conduct a post hoc Tukey HSD test if needed, including an mplot.

  9. Use the Adult Playfulness variable Play from the Data3350 data frame to test for a significant difference in levels of Playfulness based upon the Friend-making variable (Friends), which indicates whether the individual is most comfortable making friends with members of the same sex or the opposite sex, or has no preference. Use the 0.05 level of significance, and conduct a post hoc Tukey HSD test if needed, including an mplot.

  10. Use the subset command, as shown in the Code Blocks section, for the following hypothesis test. Males are more likely than females to use aggressive humor, and younger people are more likely to use it than older people. Find the overall aggressive humor average using the HSAG variable in the Data3350 data frame. Subset a group of young men who are less than 20 years old, and test the hypothesis that this subpopulation has a higher mean than the overall population mean.
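
Several of the exercises above call for a one-sided two-sample test, where the direction depends on the order of the grouping variable's levels. The sketch below illustrates this with the built-in mtcars data as a stand-in, not with Data3350. The names cars, res, alpha, and reject are hypothetical, and stats::t.test is called explicitly so the block runs without mosaic.

```r
# Stand-in data: miles per gallon (mpg) by transmission type (am).
# am is coded 0 = automatic, 1 = manual in mtcars.
cars <- transform(mtcars, am = factor(am, labels = c("auto", "manual")))

# The first level listed is the first group in the test,
# so the difference being tested is mean(auto) - mean(manual).
levels(cars$am)

# Alternative hypothesis: mean(auto) < mean(manual),
# i.e. first level minus second level is less than 0.
res <- stats::t.test(mpg ~ am, data = cars, alternative = "less")
res$estimate   # the two group means
res$p.value

# Decision at the chosen level of significance.
alpha <- 0.05
reject <- res$p.value < alpha
reject
```

Checking levels() before choosing "less" or "greater" avoids testing the opposite direction by accident.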

III. Code Blocks

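# Overall summary of the aggressive humor scores
favstats(~ HSAG, data = Data3350)
# Females only, keeping just the Age and HSAG columns (printed, then saved)
subset(Data3350, Sex == "F", c(Age,HSAG))
fem = subset(Data3350, Sex == "F", c(Age,HSAG))
# Females younger than 20, keeping only HSAG
yFem = subset(fem, Age < 20, HSAG)
yFem
# One-sample t-test: is the subgroup mean less than the hypothesized mean of 29?
t.test(~HSAG, data = yFem,
       mu = 29,
       alternative = "less")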
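# Paired data: Stress1 (Pre) and Stress2 (Post) for the same individuals
Pre = Data3350$Stress1
Post = Data3350$Stress2
Gain = Post - Pre
favstats(Gain)
histogram(Gain, width = 2)
boxplot(Gain, horizontal = TRUE)
# One-sample test on the differences: is the mean gain greater than 0?
t.test(Gain, alternative = "greater")
# The same test written as a paired t-test on the two vectors
t.test(Post, Pre,
       paired = TRUE,
       alternative = "greater")
# For comparison: the two vectors treated as independent samples
t.test(Post, Pre,
       paired = FALSE,
       alternative = "greater")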
# Group counts and summaries for Caff and Sleep by G21
tally(Caff ~ G21, data = Data3350)
favstats(Caff ~ G21, data = Data3350)
favstats(Sleep ~ G21, data = Data3350)
# Histograms (one panel per group) and side-by-side boxplots
histogram(~ Caff | G21, data = Data3350, layout = c(1,2))
boxplot(Caff ~ G21, data = Data3350, horizontal = TRUE)
histogram(~ Sleep | G21, data = Data3350, layout = c(1,2))
boxplot(Sleep ~ G21, data = Data3350, horizontal = TRUE)
# One-sided two-sample t-tests (first level of G21 minus second level)
t.test(Caff ~ G21, data = Data3350,
       alternative = "less")
t.test(Sleep ~ G21, data = Data3350,
       alternative = "greater")
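# Group means of Sleep by G21 and the observed difference (N minus Y)
obs = mean(Sleep ~ G21, data = Data3350)
obs
obs[["N"]] - obs[["Y"]]
# Permutation test: shuffling Sleep breaks any link between Sleep and G21
tally(shuffle(Sleep) ~ G21, data = Data3350)
shuf = do(1000) * mean(shuffle(Sleep) ~ G21, data = Data3350)
shuf$diff = shuf$N - shuf$Y
shuf
# Count the shuffled differences larger than .705, the difference computed above
sum(shuf$diff > .705)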
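# Two-sided two-sample t-test
t.test(OCD ~ Sex, data = Data3350)
# One-sided tests: the alternative refers to the first level minus the second
t.test(Play ~ G21, data = Data3350,
       alternative = "greater")
t.test(CHS ~ Sex, data = Data3350,
       alternative = "less")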
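# One-way ANOVA: does mean Neuro differ across the PHS groups?
favstats(Neuro ~ PHS, data = Data3350)
mod = lm(Neuro ~ PHS, data = Data3350)
anova(mod)
# Post hoc pairwise comparisons and their interval plot
TukeyHSD(mod, conf.level = 0.95)
mplot(TukeyHSD(mod, conf.level = 0.95))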
mod2 = lm(GPA ~ SitClass, data = Data3350)
favstats(GPA ~ SitClass, data = Data3350)
anova(mod2)
mod3 = lm(Opt ~ PHS, data = Data3350)
favstats(Opt ~ PHS, data = Data3350)
anova(mod3)
TukeyHSD(mod3)
mplot(TukeyHSD(mod3))
mod4 = lm(TxRel ~ Friends, data = Data3350)
favstats(TxRel ~ Friends, data = Data3350)
anova(mod4)
mplot(TukeyHSD(mod4, conf.level = .9))
mod5 = lm(Narc ~ PHS, data = Data3350)
favstats(Narc ~ PHS, data = Data3350)
anova(mod5)
TukeyHSD(mod5)
mplot(TukeyHSD(mod5))
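
The ANOVA blocks above all follow one pattern: summarize by group, fit the model, read the overall F-test, and run Tukey HSD only if that test is significant. The sketch below shows that decision flow in base R on the built-in PlantGrowth data as a stand-in for Data3350. It uses aov rather than lm because base R's TukeyHSD expects an aov fit (the lm version above relies on mosaic being loaded). The names fit, tab, p_overall, alpha, tukey, and sig_pairs are hypothetical.

```r
# Stand-in data: plant weight under three conditions (ctrl, trt1, trt2).
fit <- aov(weight ~ group, data = PlantGrowth)

# Overall F-test: pull the p-value out of the ANOVA table.
tab <- summary(fit)[[1]]
p_overall <- tab[["Pr(>F)"]][1]
p_overall

alpha <- 0.05
if (p_overall < alpha) {
  # Post hoc comparisons are only needed when the overall test is significant.
  tukey <- TukeyHSD(fit, conf.level = 1 - alpha)
  print(tukey)
  # Names of the pairs whose adjusted p-value falls below alpha.
  sig_pairs <- rownames(tukey$group)[tukey$group[, "p adj"] < alpha]
} else {
  sig_pairs <- character(0)
}
sig_pairs
```

With mosaic loaded, mplot(tukey) draws the same intervals that the mplot calls above produce; in base R, plot(tukey) is the closest equivalent.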
