Pre-Midterm Review

library(readr)
library(readxl)
library(tidyverse)
library(ggplot2)

Manipulate dataframes using tidyverse

(e.g., filter, mutate, ifelse, case_when, group_by)

filtered_mtcars <- mtcars %>%
  filter(mpg > 20) # Select cars with mpg > 20

mutated_mtcars <- mtcars %>%
  mutate(liters_per_100km = (235.21 / mpg)) # Add a new column to convert mpg to liters per 100 km

ifelse_mtcars <- mtcars %>%
  mutate(efficiency = ifelse(mpg > 25, "High", "Low")) # Categorize cars as "High" or "Low" efficiency based on their mpg

case_when_mtcars <- mtcars %>%
  mutate(engine_size = case_when(
    disp <= 100 ~ "Small",
    disp > 100 & disp <= 200 ~ "Medium",
    disp > 200 ~ "Large"
  )) # Categorize engine sizes based on displacement (disp) values

grouped_mtcars <- mtcars %>%
  mutate(engine_size = case_when(
    disp <= 100 ~ "Small",
    disp > 100 & disp <= 200 ~ "Medium",
    disp > 200 ~ "Large"
  )) %>%
  group_by(engine_size) %>%
  summarise(mean_mpg = mean(mpg), count = n()) # Calculate the mean mpg and count of cars in each engine size category
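Because case_when() checks its conditions in order and uses the first one that matches, the middle condition does not need to repeat the lower bound. The following is a minimal sketch of the same categorization, assuming there are no missing values in disp (the catch-all TRUE line would also label a missing value as "Large"). The name ordered_sizes is only for illustration.

```r
library(dplyr)

# Conditions are tested top to bottom; the first TRUE wins.
ordered_sizes <- mtcars %>%
  mutate(engine_size = case_when(
    disp <= 100 ~ "Small",
    disp <= 200 ~ "Medium", # only reached when disp > 100
    TRUE ~ "Large"          # catch-all for everything else
  ))

# Count the cars in each category
table(ordered_sizes$engine_size)
```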

Plots

histograms

ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(binwidth = 2) +
  labs(title = "Histogram of Miles per Gallon", x = "Miles per Gallon", y = "Count") +
  theme_bw()

boxplots

ggplot(mtcars, aes(x = factor(cyl), y = mpg, fill = factor(cyl))) +
  geom_boxplot() +
  labs(title = "Boxplot of Miles per Gallon by Cylinders", x = "Number of Cylinders", y = "Miles per Gallon") +
  theme_bw()

scatterplots

ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point() +
  labs(title = "Scatterplot of Miles per Gallon vs Horsepower", x = "Horsepower", y = "Miles per Gallon") +
  theme_bw()

Test for normality

shapiro.test(mtcars$mpg)

## 
##  Shapiro-Wilk normality test
## 
## data:  mtcars$mpg
## W = 0.94756, p-value = 0.1229

For the Shapiro-Wilk normality test, the null and alternative hypotheses are as follows:

Null hypothesis (H0): The data are normally distributed.

Alternative hypothesis (H1): The data are not normally distributed.

In this case, the p-value is 0.1229, which is greater than a significance level of 0.05. Therefore, we fail to reject the null hypothesis: there is no evidence that the mpg variable in the mtcars dataset departs from normality, so we proceed as if it is normally distributed.
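The decision rule can also be written out in code. This is a small sketch, assuming a significance level of 0.05; the names sw and alpha are introduced here for illustration, and the Q-Q plot is an optional visual check of the same assumption.

```r
# Run the test once and keep the result object
sw <- shapiro.test(mtcars$mpg)
alpha <- 0.05 # significance level used in the text

# TRUE means "fail to reject H0" (no evidence against normality)
sw$p.value > alpha

# A Q-Q plot gives a visual check of the same assumption
qqnorm(mtcars$mpg)
qqline(mtcars$mpg)
```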

Test for homogeneity of variance

library(readxl)
Modality_df <- read_excel("Modality.xlsx")

library(car)

## Loading required package: carData

## 
## Attaching package: 'car'

## The following object is masked from 'package:dplyr':
## 
##     recode

## The following object is masked from 'package:purrr':
## 
##     some

leveneTest(Final~Modality,data=Modality_df)

## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.

## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  2   0.534 0.5948
##       19

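The null hypothesis is that the variances in the different groups are equal, so a significant result from Levene’s test would indicate a problem. With three modalities, H0: σ1² = σ2² = σ3² and Ha: at least one group variance differs. Here p = 0.5948, which is greater than 0.05, so we fail to reject H0 and treat the variances as equal.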
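To see what the test is doing, it may help to know that Levene’s test with center = median amounts to a one-way ANOVA on the absolute deviations of each score from its group median. The sketch below uses only base R and the built-in mtcars data (mpg by cyl) as a stand-in, since Modality.xlsx is not included here; the variable names are illustrative.

```r
# Stand-in data: mpg grouped by number of cylinders
y <- mtcars$mpg
g <- factor(mtcars$cyl)

# Absolute deviation of each observation from its own group median
group_medians <- tapply(y, g, median)
abs_dev <- abs(y - group_medians[as.character(g)])

# One-way ANOVA on the deviations gives the Levene F statistic and p-value
levene_by_hand <- anova(lm(abs_dev ~ g))
levene_by_hand
```

If the car package is loaded, leveneTest(mpg ~ factor(cyl), data = mtcars) should report the same F value and degrees of freedom.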

Chi-square Test Example

You are interested to see if there is a relationship between being easily angered and heart disease. You get data from a random sample of 8,474 people. You had the subjects take a Spielberger Trait Anger Scale, which measures how prone a person is to sudden anger and then you collected data on whether they had CHD (“coronary heart disease”). The results are as follows: Of those with CHD, 53 were considered “low anger”, 110 had “moderate anger” and 27 had “high anger”. Of those without CHD, 3057 had “low anger”, 4621 had “moderate anger” and 606 had “high anger”.

This data appears to be most appropriate for the chi-squared test. I start by creating a contingency table.

cont_table <- data.frame(CHD =c(53, 110, 27), noCHD=c(3057, 4621, 606))
cont_table

##   CHD noCHD
## 1  53  3057
## 2 110  4621
## 3  27   606

Please run the appropriate analysis to determine if CHD is associated with propensity to anger easily.

These data contain independent measures. The expected frequency for each cell is the row total times the column total divided by the grand total. I can just run the chisq.test() function, because a warning message will appear if the expected frequency is below 5 in any cell.
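The expected frequencies can also be inspected directly rather than relying on the warning. This is a short sketch of the row-total times column-total rule; the names observed and expected are introduced here for illustration.

```r
cont_table <- data.frame(CHD = c(53, 110, 27), noCHD = c(3057, 4621, 606))
observed <- as.matrix(cont_table)

# Expected count = (row total * column total) / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
round(expected, 2)

# The assumption holds when every expected count is at least 5
all(expected >= 5)
```

The same matrix is available from the test object as chisq.test(cont_table)$expected.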

chisq.test(cont_table)

## 
##  Pearson's Chi-squared test
## 
## data:  cont_table
## X-squared = 16.077, df = 2, p-value = 0.0003228

Please be sure to test the assumptions of the model and comment on whether our data violate the assumptions of the model.

There was no warning message, so the expected frequency assumption is met. The table is larger than 2x2 (df = 2), so there is no need for Yates’ continuity correction.

Interpret the results of your analysis. What would you conclude?

sum(cont_table$CHD)

## [1] 190

sum(cont_table$noCHD)

## [1] 8284

X-squared = 16.077, df = 2, p-value = 0.0003228

Of the 8,474 people in this survey, 190 individuals had coronary heart disease (CHD), while 8,284 did not. A chi-square test was conducted to test whether there was an association between anger level and CHD. Results show a significant association between anger level and CHD, χ2(2, N = 8,474) = 16.08, p = 0.0003228.

I need to determine the odds ratio next. Here, I calculate the odds ratios for each pair of groups (low vs. moderate, low vs. high, and moderate vs. high)

CHD_low<-53
CHD_mod<-110
CHD_high<-27 
noCHD_low<-3057
noCHD_mod<-4621
noCHD_high<-606
OR_low_mod <- (CHD_low / noCHD_low) / (CHD_mod / noCHD_mod)
OR_low_high <- (CHD_low / noCHD_low) / (CHD_high / noCHD_high)
OR_mod_high <- (CHD_mod / noCHD_mod) / (CHD_high / noCHD_high)

cat("Odds Ratio (low vs. moderate):", round(OR_low_mod, 3), "\n")

## Odds Ratio (low vs. moderate): 0.728

cat("Odds Ratio (low vs. high):", round(OR_low_high, 3), "\n")

## Odds Ratio (low vs. high): 0.389

cat("Odds Ratio (moderate vs. high):", round(OR_mod_high, 3), "\n")

## Odds Ratio (moderate vs. high): 0.534

Based on the odds ratios, the odds of suffering from CHD in the low anger group were 0.728 times the odds in the moderate anger group, and 0.389 times the odds in the high anger group. The odds of suffering from CHD in the moderate anger group were 0.534 times the odds in the high anger group. Together with the significant chi-square test, these odds ratios suggest that the odds of coronary heart disease increase with anger level.
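An odds ratio on its own does not show how precise the estimate is. One common option, which is not part of the original analysis, is a Wald 95% confidence interval computed on the log scale. The sketch below does this for the low vs. high comparison; all variable names are illustrative.

```r
# Counts for the low and high anger groups (from the contingency table)
chd_low <- 53
no_chd_low <- 3057
chd_high <- 27
no_chd_high <- 606

or_low_high <- (chd_low / no_chd_low) / (chd_high / no_chd_high)

# Wald standard error of log(OR): square root of the summed reciprocal counts
se_log_or <- sqrt(1 / chd_low + 1 / no_chd_low + 1 / chd_high + 1 / no_chd_high)

# 95% interval on the log scale, then back-transformed to the odds ratio scale
ci <- exp(log(or_low_high) + c(-1, 1) * qnorm(0.975) * se_log_or)
round(c(OR = or_low_high, lower = ci[1], upper = ci[2]), 3)
```

An interval that excludes 1 would be consistent with a difference in odds between the two groups.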

Fisher’s Exact Test

new_cont_table <- data.frame(CHD = c(2, 8), noCHD = c(15, 5))
new_cont_table

##   CHD noCHD
## 1   2    15
## 2   8     5

This contingency table should fail the assumption that every expected frequency is at least 5.

chisq.test(new_cont_table)

## Warning in chisq.test(new_cont_table): Chi-squared approximation may be
## incorrect

## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  new_cont_table
## X-squared = 6.1256, df = 1, p-value = 0.01332

There is a warning, “Chi-squared approximation may be incorrect”, which indicates that Fisher’s Exact Test will be more appropriate.
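The reason for the warning can be confirmed by looking at the expected counts. This is a small sketch; expected_small is a name introduced here, and suppressWarnings() is used only to keep the output clean while the object is inspected.

```r
new_cont_table <- data.frame(CHD = c(2, 8), noCHD = c(15, 5))

# Extract the expected counts without re-printing the warning
expected_small <- suppressWarnings(chisq.test(new_cont_table))$expected
round(expected_small, 2)

# Any cell below 5 means the chi-squared approximation is doubtful
any(expected_small < 5)
```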

fisher.test(new_cont_table)

## 
##  Fisher's Exact Test for Count Data
## 
## data:  new_cont_table
## p-value = 0.006887
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
##  0.007216891 0.665096413
## sample estimates:
## odds ratio 
## 0.09230336

In a smaller sample of 30 individuals, data on coronary heart disease (CHD) and propensity to anger were collected. Of these individuals, 10 had CHD, and 20 did not have CHD. A 2x2 contingency table was created to analyze the association between sudden anger levels and the likelihood of suffering from coronary heart disease. Due to the small sample size and the expected frequency being below 5 in at least one cell, Fisher’s Exact Test was employed instead of the chi-square test.

Fisher’s Exact Test revealed a significant association between sudden anger levels and the likelihood of suffering from coronary heart disease (p-value = 0.006887). The odds ratio of 0.0923 suggests that the odds of suffering from CHD in one group are considerably different from the other group. However, it is important to interpret these results with caution, considering the small sample size, and further research with larger samples is recommended to validate these findings.
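The odds ratio printed by fisher.test() is a conditional maximum likelihood estimate, so it will not exactly match the simple cross-product ratio computed from the four cells. The sketch below compares the two for this table; the row labels "1" and "2" are placeholders because the original table does not name its rows.

```r
tab <- matrix(c(2, 8, 15, 5), nrow = 2,
              dimnames = list(row = c("1", "2"), col = c("CHD", "noCHD")))

# Sample (cross-product) odds ratio: (a * d) / (b * c)
sample_or <- (tab[1, 1] * tab[2, 2]) / (tab[1, 2] * tab[2, 1])

# fisher.test() reports the conditional maximum likelihood estimate instead
fisher_or <- unname(fisher.test(tab)$estimate)

c(sample = sample_or, fisher = fisher_or)
```

Either way, the ratio compares the odds of CHD in row 1 with the odds of CHD in row 2.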

Independent T-test

invisibility <- read_csv("Invisibility.csv")

# Get summary statistics for the continuous column for each category
invisibility_summary <-invisibility %>%
  group_by(Cloak) %>%
  summarise(n = n(),
    mean_mischief = mean(Mischief),
    sd_mischief = sd(Mischief),
    median_mischief = median(Mischief))
invisibility_summary

## # A tibble: 2 × 5
##   Cloak     n mean_mischief sd_mischief median_mischief
##   <dbl> <int>         <dbl>       <dbl>           <dbl>
## 1     0    12          3.75        1.91               4
## 2     1    12          5           1.65               5

In an experiment conducted to determine if having an invisibility cloak increases mischief, data was collected from 24 individuals. Half of the participants (n = 12) were provided with an invisibility cloak, while the other half (n = 12) did not have a cloak. On average, the participants with invisibility cloaks engaged in a higher number of mischievous acts (M = 5, SD = 1.65) compared to those without a cloak (M = 3.75, SD = 1.91). This suggests that individuals with invisibility cloaks may be more prone to mischief than those without cloaks.

# Create a box plot of the Mischief variable
ggplot(invisibility, aes(x = factor(Cloak), y = Mischief, fill = factor(Cloak))) +
  geom_boxplot() +
  labs(title = "Box plot of Mischief by Cloak", x = "Cloak", y = "Mischief") +
  theme_bw()

# Create a histogram of the Mischief variable
ggplot(invisibility, aes(x = Mischief, fill = factor(Cloak))) +
  geom_histogram(position = "dodge", bins = 10) +
  facet_wrap(~ factor(Cloak)) +
  labs(title = "Histogram of Mischief by Cloak", x = "Mischief", y = "Count") +
  theme_bw()

Assumptions of the Independent T-test

Data are measured at the interval level
Scores in different treatments are independent
The sampling distribution is normally distributed. We test this by seeing if the two independent samples are normally distributed.

We want to test that it is normal within group.

nocloak <- invisibility %>%
  filter(Cloak == 0) # Cloak is numeric, so compare with a number
cloak <- invisibility %>%
  filter(Cloak == 1)
shapiro.test(nocloak$Mischief)

## 
##  Shapiro-Wilk normality test
## 
## data:  nocloak$Mischief
## W = 0.91276, p-value = 0.2314

shapiro.test(cloak$Mischief)

## 
##  Shapiro-Wilk normality test
## 
## data:  cloak$Mischief
## W = 0.97262, p-value = 0.9362

From the histograms and box plots, the data appear to be normal. Based on the Shapiro-Wilk normality test results, the data for both the cloak group, W = 0.97262, p-value = 0.9362, and the no-cloak group, W = 0.91276, p-value = 0.2314, appear to be approximately normally distributed. Since the p-values are both greater than 0.05, we fail to reject the null hypothesis that the data are normally distributed. Therefore, we can assume that the data for both levels are normally distributed, which supports the assumption of the t-test.

Homogeneity of variance. The variance is the same in both groups.

leveneTest(Mischief ~ as.factor(Cloak), data=invisibility)

## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  1  0.2698 0.6087
##       22

The test statistic is 0.2698, and the p-value is 0.6087, which is greater than the common alpha level of 0.05. This indicates that we fail to reject the null hypothesis that there is no difference in the variances of the Mischief scores between the two groups. Therefore, we can assume that the variances of Mischief scores are approximately equal for both Cloak levels, which supports the assumption of the t-test.

t.test(Mischief ~ Cloak, data=invisibility, var.equal=TRUE)

## 
##  Two Sample t-test
## 
## data:  Mischief by Cloak
## t = -1.7135, df = 22, p-value = 0.1007
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  -2.7629284  0.2629284
## sample estimates:
## mean in group 0 mean in group 1 
##            3.75            5.00

In an experiment to determine if having an invisibility cloak increases mischief, data was collected from 24 individuals, with 12 having an invisibility cloak (Cloak = 1) and the other 12 not having a cloak (Cloak = 0). The mean mischief score for those with a cloak (M = 5.00, SD = 1.65) was higher compared to those without a cloak (M = 3.75, SD = 1.91). An independent samples t-test was conducted to compare the means between the two groups. The test revealed that the difference in mischief scores between the two groups was not statistically significant, t(22) = -1.7135, p = 0.1007, with a 95% confidence interval for the mean difference ranging from -2.76 to 0.26. Based on these results, we cannot conclude that having an invisibility cloak significantly increases mischief.
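The t statistic can be rebuilt from the summary statistics, which shows where the reported value comes from. This sketch uses the rounded means and standard deviations from the summary table, so the result only approximately matches the t.test() output; all variable names are illustrative.

```r
# Summary statistics reported above (rounded, so results are approximate)
n0 <- 12; m0 <- 3.75; s0 <- 1.91 # no cloak
n1 <- 12; m1 <- 5.00; s1 <- 1.65 # cloak

# Pooled standard deviation and standard error of the difference
sp <- sqrt(((n0 - 1) * s0^2 + (n1 - 1) * s1^2) / (n0 + n1 - 2))
se <- sp * sqrt(1 / n0 + 1 / n1)

# t statistic, degrees of freedom, and two-sided p-value
t_stat <- (m0 - m1) / se
df <- n0 + n1 - 2
p_value <- 2 * pt(-abs(t_stat), df)

round(c(t = t_stat, df = df, p = p_value), 3)
```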

Welch Test

The Welch’s test allows you to compare the means of two groups without requiring homogeneity of variance. Regardless of whether your homogeneity of variance assumption was met, run the Welch test and interpret the results.

t.test(Mischief ~ Cloak, data=invisibility)

## 
##  Welch Two Sample t-test
## 
## data:  Mischief by Cloak
## t = -1.7135, df = 21.541, p-value = 0.101
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  -2.764798  0.264798
## sample estimates:
## mean in group 0 mean in group 1 
##            3.75            5.00

In an experiment to determine if having an invisibility cloak increases mischief, data was collected from 24 individuals, with 12 having an invisibility cloak (Cloak = 1) and the other 12 not having a cloak (Cloak = 0). The mean mischief score for those with a cloak (M = 5.00, SD = 1.65) was higher compared to those without a cloak (M = 3.75, SD = 1.91). Although Levene’s test indicated that the assumption of homogeneity of variance was met, a Welch’s t-test, which does not require equal variances, was also conducted to compare the means between the two groups. The test revealed that the difference in mischief scores between the two groups was not statistically significant, t(21.541) = -1.7135, p = 0.101, with a 95% confidence interval for the mean difference ranging from -2.76 to 0.26. Based on these results, we cannot conclude that having an invisibility cloak significantly increases mischief.

The Welch’s t-test, like the independent t-test, uses the mean rather than the median. It is designed to compare the means of two groups when the assumption of homogeneity of variances is not met.
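The fractional degrees of freedom in the Welch output come from the Welch-Satterthwaite approximation. The sketch below reproduces them from the rounded standard deviations in the summary table, so the value is approximate; the variable names are illustrative.

```r
# Group sizes and (rounded) standard deviations from the summary table
n0 <- 12; s0 <- 1.91 # no cloak
n1 <- 12; s1 <- 1.65 # cloak

# Variance of each group mean
v0 <- s0^2 / n0
v1 <- s1^2 / n1

# Welch-Satterthwaite degrees of freedom
welch_df <- (v0 + v1)^2 / (v0^2 / (n0 - 1) + v1^2 / (n1 - 1))
welch_df
```

Because the two standard deviations are similar and the groups are the same size, the result is close to the 22 degrees of freedom of the pooled test.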

Wilcoxon rank-sum

A psychologist collects data to investigate the depressant effects of two recreational drugs: ecstasy and alcohol. She tested 20 clubbers; on a Saturday evening 10 were given an ecstasy tablet to take and 10 drank alcohol. Levels of depression were measured using the Beck Depression Inventory (BDI) the day after (Sunday). The data are as follows:

Participant <- c(1:20)
Drug <- c(rep("Ecstasy", 10), rep("Alcohol", 10))
Depression_Index_Score <- c(15, 35, 16, 18, 19, 17, 27, 16, 13, 20, 16, 15, 20, 15, 16, 13, 14, 19, 18, 18)

clubbers_df <- data.frame(Participant, Drug, Depression_Index_Score)

This data has one categorical variable with two levels (Drug) and one continuous variable (Depression_Index_Score). I will use the Independent t-test flow chart.

Create a boxplot of the scores for Alcohol & Ecstasy. Interpret your figure: does it look like one drug results in higher depression scores than the other?

# Descriptive statistics
data_sum <- clubbers_df %>%
  group_by(Drug) %>%
  summarise(n = n(),
    mean_dep = mean(Depression_Index_Score),
    sd_dep = sd(Depression_Index_Score))
data_sum

## # A tibble: 2 × 4
##   Drug        n mean_dep sd_dep
##   <chr>   <int>    <dbl>  <dbl>
## 1 Alcohol    10     16.4   2.27
## 2 Ecstasy    10     19.6   6.60

Now I create the boxplot:

ggplot(clubbers_df, aes(x = factor(Drug), y = Depression_Index_Score)) +
  geom_boxplot()

The individuals who took Ecstasy scored slightly higher on the Depression Index Score and there are a couple of outliers in this group as well.

Determine the appropriate statistical test and list the assumptions of the test.

T-Test; Independence, normality of groups, homogeneity of variances

Conduct test(s) to determine if this data meets the assumptions of the parametric statistical test.

To determine the appropriate test, I will first test whether both groups are normally distributed.

Alcohol<-clubbers_df%>%
  filter(Drug =="Alcohol")
Ecstasy<-clubbers_df%>%
  filter(Drug =="Ecstasy")
shapiro.test(Alcohol$Depression_Index_Score)

## 
##  Shapiro-Wilk normality test
## 
## data:  Alcohol$Depression_Index_Score
## W = 0.95947, p-value = 0.7798

shapiro.test(Ecstasy$Depression_Index_Score)

## 
##  Shapiro-Wilk normality test
## 
## data:  Ecstasy$Depression_Index_Score
## W = 0.81064, p-value = 0.01952

One of the p-values for the Shapiro-Wilk tests (Ecstasy, p = 0.01952) is smaller than alpha at 0.05. Therefore, I reject the null, as there is sufficient evidence that the Ecstasy scores are not normally distributed.

Since the data in one group is not normal, I need to conduct a nonparametric test. I will use the Wilcoxon rank-sum test. To interpret this test as a comparison of medians, the two groups should have distributions of similar shape. I already created box plots of the data. The two distributions do not look identical in shape, but it is good enough for this demonstration of my skills. I calculate the medians so that I might use them in my communication statement.

median(Alcohol$Depression_Index_Score)

## [1] 16

median(Ecstasy$Depression_Index_Score)

## [1] 17.5

Depending on your result from step c, run the appropriate statistical test.

wilcox.test(Depression_Index_Score~Drug, data=clubbers_df, paired = FALSE, exact=FALSE)

## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  Depression_Index_Score by Drug
## W = 35.5, p-value = 0.2861
## alternative hypothesis: true location shift is not equal to 0

A Wilcoxon rank-sum test was conducted to test whether the distribution of depression scores was significantly different between the two drug groups. The test statistic was W = 35.5, and the p-value was p = 0.2861. Based on this result, we cannot conclude that there is a significant difference in depression levels between the clubbers who consumed ecstasy and those who consumed alcohol.
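The W statistic can be reproduced by hand from the ranks, which makes it clearer what the test measures. This sketch re-creates the two data vectors from above so that it runs on its own; ranks, n_alcohol, and W_by_hand are names introduced here.

```r
Drug <- c(rep("Ecstasy", 10), rep("Alcohol", 10))
Depression_Index_Score <- c(15, 35, 16, 18, 19, 17, 27, 16, 13, 20,
                            16, 15, 20, 15, 16, 13, 14, 19, 18, 18)

# Rank all 20 scores together; tied scores receive the average rank
ranks <- rank(Depression_Index_Score)

# R's W is based on the first factor level (alphabetically, "Alcohol"):
# the sum of that group's ranks minus its smallest possible rank sum
n_alcohol <- sum(Drug == "Alcohol")
W_by_hand <- sum(ranks[Drug == "Alcohol"]) - n_alcohol * (n_alcohol + 1) / 2
W_by_hand
```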

Interpret your results in light of the original question.

In an experiment to examine the depressant effects of two recreational drugs, ecstasy and alcohol, a psychologist collected data from 20 clubbers. Participants were assigned to two conditions: 10 received an ecstasy tablet, while the other 10 consumed alcohol. Levels of depression were measured using the Beck Depression Inventory (BDI) the day after (Sunday). Since the scores in the ecstasy group were not normally distributed, the median depression score was used for comparison. The median depression score was lower for the alcohol group (Md = 16) than for the ecstasy group (Md = 17.5). Descriptively, the clubbers who consumed ecstasy reported slightly higher levels of depression the following day than those who consumed alcohol. However, a Wilcoxon rank-sum test of whether the distribution of depression scores differed between the two drug groups was not significant, W = 35.5, p = 0.2861. Based on this result, we cannot conclude that there is a significant difference in depression levels between the clubbers who consumed ecstasy and those who consumed alcohol.

Dependent T-test

load("chico.Rdata")

This data represents 20 students who took 2 tests; test1 and test2. We want to know if students’ scores changed from test1 to test2.

Descriptive Statistics

# Get summary statistics for the test1 and test2 columns.
chico_summary <-chico %>%
  summarise(n = n(),
    mean_test1 = mean(grade_test1),
    mean_test2 = mean(grade_test2), 
    sd_test1 = sd(grade_test1),
    sd_test2 = sd(grade_test2))
chico_summary

##    n mean_test1 mean_test2 sd_test1 sd_test2
## 1 20      56.98     58.385 6.616137 6.405612

Create a chart representing this data (your choice which chart you create).

# Create a histogram of the test1 and test2
hist(chico$grade_test1)

hist(chico$grade_test2)

Assumptions of the Dependent T-test

There are a few assumptions of the dependent t-test:

• The data must be dependent (nothing to test)

• The data must be at least interval (nothing to test)

• When we are doing a paired test, the differences should be normally distributed.

Are the difference scores normally distributed? You will have to compute the differences of our scores and then check to see if that new variable is normally distributed. Have we violated the assumption of our t-test?

diff <- chico$grade_test2 - chico$grade_test1
mean(diff)

## [1] 1.405

shapiro.test(diff)

## 
##  Shapiro-Wilk normality test
## 
## data:  diff
## W = 0.9664, p-value = 0.6778

The Shapiro-Wilk normality test was conducted to assess the normality of the distribution of the difference in test scores between test1 and test2. The test resulted in a W value of 0.9664 and a p-value of 0.6778. Since the p-value is greater than the typical significance level of 0.05, we fail to reject the null hypothesis that the data is normally distributed. This indicates that the differences in test scores follow a normal distribution, which satisfies the assumption of normality required for conducting a dependent t-test.

Please run the dependent t-test and interpret the results.

t.test(chico$grade_test1, chico$grade_test2, paired=TRUE)

## 
##  Paired t-test
## 
## data:  chico$grade_test1 and chico$grade_test2
## t = -6.4754, df = 19, p-value = 3.321e-06
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -1.8591314 -0.9508686
## sample estimates:
## mean difference 
##          -1.405

On average, students scored lower on test 1 (M = 56.98, SD = 6.6) than the same students did on test 2 (M = 58.38, SD = 6.4). A dependent t-test was conducted to test whether the mean difference score was significantly different from 0. The mean difference (test1 minus test2) of -1.405 was significant, t(19) = -6.48, p = 3.321e-06. Based on this result, we can conclude that students’ mean test scores increased on the second test.

In an experiment to compare the performance of students on two tests, test1 and test2, a paired t-test was conducted on the test scores of 20 students. On average, the students scored 56.98 (SD = 6.6161) on test1 and 58.385 (SD = 6.4056) on test2. The mean difference in test scores (test2 minus test1) was 1.405. The paired t-test resulted in a t-value of -6.4754 with 19 degrees of freedom, and a highly significant p-value of 3.321e-06. The 95 percent confidence interval for the mean difference, computed as test1 minus test2 (-1.8591314 to -0.9508686), does not include 0, so we can conclude that there is a significant difference in the students’ performance between test1 and test2, with the students performing better on test2.
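A paired t-test is the same calculation as a one-sample t-test on the difference scores. Since chico.Rdata is not included here, the sketch below uses R’s built-in sleep data (10 patients measured under two drugs) as a stand-in; the object names are illustrative.

```r
# Built-in paired data: the same 10 patients measured under two drugs
x <- sleep$extra[sleep$group == 1]
y <- sleep$extra[sleep$group == 2]

paired_result <- t.test(x, y, paired = TRUE)
one_sample_result <- t.test(x - y, mu = 0)

# Same statistic by hand: mean difference divided by its standard error
d <- x - y
t_by_hand <- mean(d) / (sd(d) / sqrt(length(d)))

c(paired = unname(paired_result$statistic),
  one_sample = unname(one_sample_result$statistic),
  by_hand = t_by_hand)
```

All three values agree, which is why only the differences need to be checked for normality.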

Wilcoxon signed-rank test

You are interested in how fostering kittens influences one’s happiness. You recruit 20 subjects who undergo two conditions: 1) fostering a kitten for 2 weeks, where happiness is measured on day 7 and 2) a control condition with no fostered kittens. Your experiment is counterbalanced, meaning that 10 of your subjects undergo condition 1 and then 2 while the rest of your subjects undergo condition 2 and then 1. You then compare the happiness scores (on a scale of 0 to 100). The data are in the file: Kittens.xlsx.
This data is paired and therefore is dependent. I will go through the Paired t-test flow chart.

Kittens <- read_excel("Kittens.xlsx")

head(Kittens)

## # A tibble: 6 × 3
##   Subject Kitten No_kitten
##     <dbl>  <dbl>     <dbl>
## 1       1     42        23
## 2       2     60        63
## 3       3     59        14
## 4       4     46        37
## 5       5     83        82
## 6       6     44        47

Create a graphical visualization of the data

hist(Kittens$Kitten)

hist(Kittens$No_kitten)

mean(Kittens$Kitten)

## [1] 59.8

sd(Kittens$Kitten)

## [1] 19.48441

mean(Kittens$No_kitten)

## [1] 54.95

sd(Kittens$No_kitten)

## [1] 25.71192

The paired t-test requires that we look at the difference between the conditions.

diff <-Kittens$Kitten - Kittens$No_kitten
mean(diff)

## [1] 4.85

sd(diff)

## [1] 12.58769

On average, happiness scores were higher for participants when fostering kittens (M = 59.8, SD = 19.48) compared to the control condition with no kittens (M = 54.95, SD = 25.71). The mean difference in happiness scores between the two conditions was 4.85 (SD = 12.59), indicating that, on average, participants experienced greater happiness while fostering kittens.

Determine the appropriate statistical test and list the assumptions of the test

The Paired t-test requires that the data be paired, and that the difference between the groups is normal.

Conduct test(s) to determine if this data meets the assumptions of the parametric statistical test.

shapiro.test(diff)

## 
##  Shapiro-Wilk normality test
## 
## data:  diff
## W = 0.86632, p-value = 0.01013

The Shapiro-Wilk normality test was conducted on the difference in happiness scores between the two conditions to determine if the normality assumption was met. The test resulted in a W value of 0.86632 and a p-value of 0.01013. Since the p-value is less than the typical significance level (e.g., 0.05), the normality assumption is not met, indicating that the distribution of the differences in happiness scores is not normal. Consequently, a non-parametric test, such as the Wilcoxon signed-rank test, will be used for further analysis.

Depending on your results from step c, run the appropriate statistical test.

median(Kittens$Kitten)

## [1] 61

median(Kittens$No_kitten)

## [1] 61

median(diff)

## [1] 4.5

wilcox.test(Kittens$Kitten, Kittens$No_kitten, paired = TRUE, exact=FALSE)

## 
##  Wilcoxon signed rank test with continuity correction
## 
## data:  Kittens$Kitten and Kittens$No_kitten
## V = 141.5, p-value = 0.06372
## alternative hypothesis: true location shift is not equal to 0

A Wilcoxon signed-rank test was conducted to determine whether there was a significant difference in happiness scores between the kitten fostering and control conditions. The median difference in happiness scores between the two conditions was 4.5. The test resulted in a V value of 141.5 and a p-value of 0.06372. Although the p-value is slightly above the typical significance level of 0.05, the results suggest a trend towards increased happiness in the kitten fostering condition compared to the control condition. However, this difference was not statistically significant at the 0.05 level, and I cannot conclusively determine that fostering kittens leads to a significant increase in happiness scores.
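The V statistic is the sum of the ranks of the positive differences. The sketch below illustrates the calculation using only the six rows printed by head(Kittens) above, so it is a demonstration of the method and not a re-analysis of the full data; the vector names are illustrative.

```r
# Only the six rows shown by head(Kittens); not the full data set
kitten    <- c(42, 60, 59, 46, 83, 44)
no_kitten <- c(23, 63, 14, 37, 82, 47)

d <- kitten - no_kitten
d <- d[d != 0] # zero differences are dropped before ranking

# Rank the absolute differences, then add up the ranks of the positive ones
V_by_hand <- sum(rank(abs(d))[d > 0])
V_by_hand
```

Running wilcox.test(kitten, no_kitten, paired = TRUE, exact = FALSE) on these six pairs reports the same V.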

Interpret your results in light of the original question.

The median happiness score was the same when participants fostered kittens (Median = 61) as in the control condition with no kittens (Median = 61), although the median of the paired differences was 4.5, indicating that participants tended to report somewhat greater happiness while fostering kittens. Because the difference scores were not normally distributed, a Wilcoxon signed-rank test was conducted to determine whether there was a significant difference in happiness scores between the kitten fostering and control conditions. The test resulted in a V value of 141.5 and a p-value of 0.06372. Although the p-value is slightly above the typical significance level of 0.05, the results suggest a trend towards increased happiness in the kitten fostering condition compared to the control condition. However, this difference was not statistically significant at the 0.05 level, and I cannot conclusively determine that fostering kittens leads to a significant increase in happiness scores.

ANOVA

Imagine that you are interested in how different teaching modalities affect a student’s knowledge of course content. You decide to measure final exam scores based on three different teaching modalities: face-to-face, hybrid, on-line. You have ~8 students in each modality. Data is in Modality.xlsx.

The first step is to become familiar with the data set.

head(Modality_df)

## # A tibble: 6 × 2
##   Modality Final
##   <chr>    <dbl>
## 1 f2f         79
## 2 f2f         42
## 3 f2f         95
## 4 f2f         76
## 5 f2f         68
## 6 f2f         62

It looks like the column Modality should be a factor class. I also need to know what the factor levels are.

#Convert the "Modality" column to a factor
Modality_df$Modality <- as.factor(Modality_df$Modality)

#Determine the names of the factors
levels(Modality_df$Modality)

## [1] "f2f"    "hybrid" "online"

Based on these results, I determine: f2f = face-to-face, hybrid = hybrid, online = on-line.

Next, I check that the data has 8 students in each modality.

table(Modality_df$Modality)

## 
##    f2f hybrid online 
##      6      8      8

It does not! Wow. I was not expecting this. Perhaps ~8 means approximately.

This data set has one categorical variable with three levels and one continuous variable. I will now proceed through the ANOVA flow chart.

I am interested in how different teaching modalities affect a student’s knowledge of course content and I will seek to do this by comparing the means (or median).

Please create a boxplot for each of our teaching modalities.

ggplot(data = Modality_df, aes(x = Modality, y = Final)) +
  geom_boxplot() +
  labs(title = "Box Plot of Modalities", x = "Modality", y = "Final") +
  theme_minimal()

# Descriptive statistics
data_sum <- Modality_df %>%
  group_by(Modality) %>%
  summarise(n = n(),
    mean_dep = mean(Final),
    median_dep = median(Final),
    sd_dep = sd(Final))
data_sum

## # A tibble: 3 × 5
##   Modality     n mean_dep median_dep sd_dep
##   <fct>    <int>    <dbl>      <dbl>  <dbl>
## 1 f2f          6     70.3       72     17.9
## 2 hybrid       8     51.6       53.5   12.2
## 3 online       8     51.6       52     11.4

Test the assumptions of the model and comment on whether our data violate the assumptions of your model.

I test for normality of the final exam scores within each group.

# for loop through "Modality", run shapiro.test() on the "Final" column
for (Modality_types in levels(Modality_df$Modality)) {
  Modality_subset <- subset(Modality_df, Modality == Modality_types)
  test_result <- shapiro.test(Modality_subset$Final)
  cat(paste("Modality_types:", Modality_types, "- W = ", round(test_result$statistic, 3), ", p-value = ", round(test_result$p.value, 3), "\n"))
}

## Modality_types: f2f - W =  0.982 , p-value =  0.961 
## Modality_types: hybrid - W =  0.97 , p-value =  0.895 
## Modality_types: online - W =  0.977 , p-value =  0.944

All the p-values for the Shapiro-Wilk tests are larger than alpha at 0.05. Therefore, I fail to reject the null, as there is insufficient evidence that the scores in any group are not normal.

I now run Levene’s Test to check that the variances are the same across groups.

leveneTest(Final~Modality,data=Modality_df)

## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  2   0.534 0.5948
##       19

The p-value is greater than 0.05 (p-value = 0.5948), therefore I do not have evidence to conclude that the variances across the groups are different.

All the assumptions that need to be tested before running the ANOVA have been tested, and they have been met.

Please run the appropriate analysis to determine if different teaching methods affect student’s learning.

myanova<-aov(Final~Modality,data=Modality_df)
summary(myanova)

##             Df Sum Sq Mean Sq F value Pr(>F)  
## Modality     2   1527   763.6   4.097 0.0332 *
## Residuals   19   3541   186.4                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

An ANOVA test to compare means across groups was conducted. There was a significant effect of teaching modality on final exam scores, F(2, 19) = 4.097, p = 0.0332.
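The F value and p-value can be rebuilt from the sums of squares in the ANOVA table. This sketch uses the rounded table values, so it is approximate; it also computes eta squared, a common effect size that is not part of the original output. All variable names are illustrative.

```r
# Values copied from the ANOVA table above (rounded)
ss_between <- 1527; df_between <- 2
ss_within  <- 3541; df_within  <- 19

# F is the ratio of the two mean squares
f_value <- (ss_between / df_between) / (ss_within / df_within)
p_value <- pf(f_value, df_between, df_within, lower.tail = FALSE)

# Eta squared: share of total variation explained by Modality
eta_sq <- ss_between / (ss_between + ss_within)

round(c(F = f_value, p = p_value, eta_sq = eta_sq), 4)
```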

I need to further test the assumptions of the ANOVA test using the residuals.

#extract the residuals
myresiduals <- residuals(object=myanova)

#Test for normality assumption of the residuals
hist(myresiduals)

qqnorm(y=myresiduals)

#Shapiro-Wilk on residuals:
shapiro.test(myresiduals)

## 
##  Shapiro-Wilk normality test
## 
## data:  myresiduals
## W = 0.99183, p-value = 0.9993

The histogram and QQ plot look normal, and this is supported by the results of our Shapiro-Wilk test (W = 0.99183, p-value = 0.9993), which finds no indication that the normality assumption is violated.

I now conduct Post hoc comparisons.

pairwise.t.test(Modality_df$Final, Modality_df$Modality, p.adj="holm")

## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  Modality_df$Final and Modality_df$Modality 
## 
##        f2f  hybrid
## hybrid 0.06 -     
## online 0.06 1.00  
## 
## P value adjustment method: holm

Post hoc comparisons using pairwise t-tests with the Holm correction indicated that the mean score for the face-to-face condition was not significantly different from the hybrid (p = 0.06) or on-line (p = 0.06) conditions, although both comparisons were close to the 0.05 level. The hybrid and on-line conditions did not differ from each other (p = 1.00).
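The Holm correction multiplies the smallest p-value by the number of comparisons, the next smallest by one fewer, and so on, while keeping the adjusted values in order. The sketch below shows this on hypothetical unadjusted p-values (the raw p-values for the modality comparisons are not shown in the output above) and checks the result against p.adjust().

```r
# Hypothetical unadjusted p-values for three pairwise comparisons
raw_p <- c(0.02, 0.03, 0.90)

# Holm by hand: sort, multiply by (m, m - 1, ..., 1),
# keep the adjusted values non-decreasing, and cap at 1
m <- length(raw_p)
ord <- order(raw_p)
holm_sorted <- pmin(1, cummax(raw_p[ord] * (m:1)))
holm_by_hand <- holm_sorted[order(ord)] # back to the original order

holm_by_hand
p.adjust(raw_p, method = "holm")
```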

Interpret the results of your analysis. What would you conclude?

In an investigation on how different teaching modalities affect a student’s knowledge of course content, measured by final exam scores, 22 people were assigned to 3 conditions: face-to-face (6), hybrid (8), on-line (8). The mean final exam score was lowest, and equal, for the on-line group (M = 51.62, SD = 11.36) and the hybrid group (M = 51.62, SD = 12.21), and highest for the face-to-face group (M = 70.33, SD = 17.85). An ANOVA test to compare means across groups found a significant overall effect of teaching modality, F(2, 19) = 4.097, p = 0.0332. However, post hoc comparisons using the Holm correction indicated that the face-to-face condition was only marginally different from the hybrid and on-line conditions (both p = 0.06), so no pair of conditions differed significantly from one another.

Kruskal-Wallis test

I pretend that the Modality_df data did not meet an assumption and run the Kruskal-Wallis rank sum test for practice.

kruskal.test(Final~Modality, data=Modality_df )

## 
##  Kruskal-Wallis rank sum test
## 
## data:  Final by Modality
## Kruskal-Wallis chi-squared = 5.3961, df = 2, p-value = 0.06734

In an investigation on how different teaching modalities affect a student’s knowledge of course content, measured by final exam scores, 22 people were assigned to 3 conditions: face-to-face (6), hybrid (8), and online (8). The median final exam scores were lowest for the online group (Mdn = 52.0) and the hybrid group (Mdn = 53.5), and highest for the face-to-face group (Mdn = 72.0). A Kruskal-Wallis rank sum test was conducted to compare the final exam scores across groups, treating the data for practice as if they did not meet the assumptions for an ANOVA test. The difference between groups was not statistically significant at the 0.05 level, H(2) = 5.3961, p = 0.06734.
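The Kruskal-Wallis p-value comes from comparing the test statistic to a chi-squared distribution with (number of groups - 1) degrees of freedom. A short sketch using the values from the output above; the variable names are illustrative.

```r
kw_stat <- 5.3961 # Kruskal-Wallis chi-squared from the output above
kw_df <- 2        # number of groups minus 1

# Upper-tail probability of the chi-squared distribution
kw_p <- pchisq(kw_stat, df = kw_df, lower.tail = FALSE)
round(kw_p, 5)

# TRUE would mean significant at the 0.05 level
kw_p < 0.05
```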

Midterm Review

Lorraine Gaudio

2023-03-14