library(readr)
library(readxl)
library(haven)
library(tidyverse)
load("chico.Rdata")
load("clinicaltrial.Rdata")
pay_and_security <- read_csv("pay and security.csv")
ATV_riders_1 <- read_excel("ATV_riders-1.xlsx")
Drug_df <- read_sav("Drug.sav")
We will use the data chico.Rdata. This data represents 20 students who took two tests: test1 and test2. We want to know if students’ scores changed from test1 to test2.
There are a variety of ways that you can explore this data. Let’s do two:
Please determine the means and standard deviations of test1 and test2
head(chico)
## id grade_test1 grade_test2
## 1 student1 42.9 44.6
## 2 student2 51.8 54.0
## 3 student3 71.7 72.3
## 4 student4 51.6 53.4
## 5 student5 63.5 63.8
## 6 student6 58.0 59.3
# Get summary statistics for the test1 and test2 columns.
chico_summary <- chico %>%
summarise(n = n(),
mean_test1 = mean(grade_test1),
mean_test2 = mean(grade_test2),
sd_test1 = sd(grade_test1),
sd_test2 = sd(grade_test2))
chico_summary
## n mean_test1 mean_test2 sd_test1 sd_test2
## 1 20 56.98 58.385 6.616137 6.405612
Create a chart representing this data (your choice which chart you create).
# Create a histogram of the test1 and test2
hist(chico$grade_test1)
hist(chico$grade_test2)
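As an alternative chart (a sketch that reuses the tidyverse loaded above), the two test columns can be reshaped to long format and compared side by side in a single boxplot:
# Sketch: reshape to long format and compare the two tests in one plot
chico %>%
  pivot_longer(cols = c(grade_test1, grade_test2),
               names_to = "test", values_to = "grade") %>%
  ggplot(aes(x = test, y = grade)) +
  geom_boxplot() +
  labs(x = "Test", y = "Grade")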
There are a few assumptions of the dependent t-test:
• The data must be dependent (nothing to test)
• The data must be at least interval (nothing to test)
• When we are doing a paired test, the differences should be normally distributed.
Please check to see if the difference scores are normally distributed. You will have to compute the differences of our scores and then check to see if that new variable is normally distributed. Are the differences normally distributed? Have we violated the assumption of our t-test?
shapiro.test(chico$grade_test1)
##
## Shapiro-Wilk normality test
##
## data: chico$grade_test1
## W = 0.9713, p-value = 0.782
shapiro.test(chico$grade_test2)
##
## Shapiro-Wilk normality test
##
## data: chico$grade_test2
## W = 0.96968, p-value = 0.7482
diff <- chico$grade_test2 - chico$grade_test1
mean(diff)
## [1] 1.405
shapiro.test(diff)
##
## Shapiro-Wilk normality test
##
## data: diff
## W = 0.9664, p-value = 0.6778
test1: p > 0.05 (p-value = 0.782); test2: p > 0.05 (p-value = 0.748); difference scores: p > 0.05 (p-value = 0.678). I fail to reject the null hypothesis that the data are normally distributed. There is insufficient evidence of a non-normal distribution for test1, test2, or the difference scores, and the histograms appear roughly normal. Since the differences can be assumed to be normally distributed, we have not violated the assumption of the paired t-test.
Please run the dependent t-test and interpret the results.
t.test(chico$grade_test1, chico$grade_test2, paired=TRUE)
##
## Paired t-test
##
## data: chico$grade_test1 and chico$grade_test2
## t = -6.4754, df = 19, p-value = 3.321e-06
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## -1.8591314 -0.9508686
## sample estimates:
## mean difference
## -1.405
Is this test significant? What does that mean?
The test is significant with a very low p-value of 3.321e-06. This means that we can reject the null hypothesis that there is no difference between the mean scores of two tests taken by the same group of students. The alternative hypothesis that the true mean difference is not equal to 0 is supported by the data, indicating that there is a statistically significant difference between the mean scores of the two tests. The 95 percent confidence interval for the mean difference between the two tests is between -1.8591314 and -0.9508686, which means we can be 95 percent confident that the true difference between the two tests falls within this range. Overall, this suggests that the students performed differently on the two tests, and further investigation may be needed to understand the reasons for this difference.
To communicate results, we want to remember our blocks from last week:
Start with some descriptive statistics.
The description tells you what the null hypothesis being tested is
A “stat” block is included
The results are interpreted
On average, students scored lower on test 1 (M = 56.98, SD = 6.6) than the same students did on test 2 (M = 58.38, SD = 6.4). A dependent t-test was conducted to test whether the mean difference scores were significantly different from 0. This difference, 1.4, was significant, t(19) = -6.48, p = 3.321e-06. Based on this result, we can conclude that students’ mean test scores increased on the second test.
Write a communication block for your findings
For this example, we will use the dataset:
pay and security.csv. This dataset has data on people’s
desire for job security or job pay (measured from 1 – no concern to 10 –
ultimate concern). We want to know if people care more about job
security or job pay. Since we have data from 30 individuals who
responded to both questions, we can run a dependent t-test. But before
we do that, let’s look at descriptive plots and test our
assumptions.
Create a chart to compare people’s preferences for pay and security.
head(pay_and_security)
## # A tibble: 6 × 3
## id security pay
## <dbl> <dbl> <dbl>
## 1 1 3 4
## 2 2 1 9
## 3 3 6 8
## 4 4 7 5
## 5 5 6 4
## 6 6 5 6
hist(pay_and_security$security)
hist(pay_and_security$pay)
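If a single comparative chart is preferred, a minimal base-R sketch is a side-by-side boxplot of the two ratings:
# Sketch: compare security and pay ratings in one plot
boxplot(pay_and_security$security, pay_and_security$pay,
        names = c("security", "pay"),
        ylab = "Concern (1 = no concern, 10 = ultimate concern)")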
Determine whether the data meets the assumption of the dependent t-test.
Assumptions of the dependent t-test:
• The data must be dependent (nothing to test)
• The data must be at least interval (nothing to test)
• When we are doing a paired test, the differences should be normally distributed.
diff <- pay_and_security$security - pay_and_security$pay
mean(diff)
## [1] -1.133333
shapiro.test(diff)
##
## Shapiro-Wilk normality test
##
## data: diff
## W = 0.92528, p-value = 0.03685
diff: p < 0.05 (p-value = 0.03685)
Based on the shapiro.test() output, I reject the null hypothesis that the differences are normally distributed. There is sufficient evidence that the differences between security and pay are not normally distributed. The histograms also do not appear normal. Because the data cannot be assumed to be normally distributed, the normality assumption of the dependent t-test is violated, so we will use a non-parametric test instead.
Assumptions of the Wilcoxon signed-rank test:
• The data must be dependent; here the id column shows that the same individuals answered both questions.
• The distribution of the differences should be symmetric around the median (a quick check is sketched below).
• The test does not assume normality or equal variances.
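A quick way to eyeball the symmetry assumption (a sketch reusing the diff vector computed above) is a histogram of the differences with the median marked:
# Sketch: check that the security-minus-pay differences look symmetric around their median
hist(diff, main = "Security minus pay", xlab = "Difference")
abline(v = median(diff), col = "red", lwd = 2) # median marked in red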
Please run a Wilcoxon Signed-Rank Test to determine if people prefer job security or job pay. What is the p-value? What would you conclude?
median(pay_and_security$security)
## [1] 4.5
median(pay_and_security$pay)
## [1] 5
wilcox.test(pay_and_security$security, pay_and_security$pay, paired=TRUE, exact=FALSE)
##
## Wilcoxon signed rank test with continuity correction
##
## data: pay_and_security$security and pay_and_security$pay
## V = 92, p-value = 0.03269
## alternative hypothesis: true location shift is not equal to 0
Write a communication block of your findings.
People rated their desire for job security (Mdn = 4.5) lower than their desire for job pay (Mdn = 5). A Wilcoxon signed-rank test was conducted to test whether the ranked difference scores were significantly different from 0. This difference, -1.13, was significant, V = 92, p = 0.03269. Based on this result, we can conclude that the medians differ and that people care more about job pay than job security.
Sometimes you have more than two means you want to compare; this is where ANOVA comes in. Let’s use the data clinicaltrial.Rdata. This fictional data looks at the impact of a new antidepressant drug, Joyzepam. We have three conditions: a placebo, an existing antidepressant (Anxifree), and the new drug (Joyzepam). The outcome is the overall improvement in each person’s mood on a scale from -5 to +5.
head(clin.trial)
## drug therapy mood.gain
## 1 placebo no.therapy 0.5
## 2 placebo no.therapy 0.3
## 3 placebo no.therapy 0.1
## 4 anxifree no.therapy 0.6
## 5 anxifree no.therapy 0.4
## 6 anxifree no.therapy 0.2
table(clin.trial$drug)
##
## placebo anxifree joyzepam
## 6 6 6
#See how many people we have in each group
xtabs( ~drug,clin.trial)
## drug
## placebo anxifree joyzepam
## 6 6 6
Let’s start by running some descriptive statistics of our data. First, identify the mean of each group.
# Get summary statistics for the groups.
clin.trial %>%
group_by(drug, therapy) %>%
summarise(mean = mean(mood.gain),
sd = sd(mood.gain),
n = n())
## `summarise()` has grouped output by 'drug'. You can override using the
## `.groups` argument.
## # A tibble: 6 × 5
## # Groups: drug [3]
## drug therapy mean sd n
## <fct> <fct> <dbl> <dbl> <int>
## 1 placebo no.therapy 0.3 0.2 3
## 2 placebo CBT 0.6 0.3 3
## 3 anxifree no.therapy 0.4 0.2 3
## 4 anxifree CBT 1.03 0.208 3
## 5 joyzepam no.therapy 1.47 0.208 3
## 6 joyzepam CBT 1.5 0.265 3
Or, as the textbook suggests, we can use the aggregate() function to calculate means and standard deviations for mood.gain.
aggregate( mood.gain ~ drug, clin.trial, mean)
## drug mood.gain
## 1 placebo 0.4500000
## 2 anxifree 0.7166667
## 3 joyzepam 1.4833333
The column mood.gain in the output from using the
function aggregate() gives the means for each drug
classification.
aggregate( mood.gain ~ drug, clin.trial, sd)
## drug mood.gain
## 1 placebo 0.2810694
## 2 anxifree 0.3920034
## 3 joyzepam 0.2136976
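If we want the mean and standard deviation in a single call, aggregate() also accepts a small anonymous summary function (a sketch; the two separate calls above are equally valid):
# Sketch: mean and sd of mood.gain per drug in one aggregate() call
aggregate(mood.gain ~ drug, clin.trial, function(x) c(mean = mean(x), sd = sd(x)))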
Creating a visual representation of our data – separated by the three groups.
ggplot(clin.trial, aes(x = drug, y = mood.gain, fill = therapy)) +
geom_boxplot() +
labs(x = "Drug", y = "Mood Gain", fill = "Therapy")
Or, as Navarro recommends in chapter 14, we can use the plotmeans() function from the gplots package.
# install.packages("gplots")
library(gplots)
##
## Attaching package: 'gplots'
## The following object is masked from 'package:stats':
##
## lowess
plotmeans( formula = mood.gain ~ drug, # plot mood.gain by drug
data = clin.trial, # the data frame
xlab = "Drug Administered", # x-axis label
ylab = "Mood Gain", # y-axis label
n.label = FALSE # don't display sample size
)
The average mood gain for the three conditions (Joyzepam, Anxifree, and placebo) is plotted above, and the error bars represent the 95% confidence intervals. The plot shows that the Joyzepam group experiences a larger improvement in mood than the Anxifree group and the placebo group. The Anxifree group also had a slightly larger mood gain than the placebo group, but that difference is not as large as the difference between Joyzepam and the other two groups. We will need to conduct an ANOVA to determine whether these differences are significant or due to chance alone.
The assumptions of ANOVA include:
• Homogeneity of variance across groups
• Observations should be independent (cannot test)
• The dependent variable needs to be measured on at least an interval scale (cannot test)
• Within groups, the distributions are normal.
We should test for homogeneity of variance before we get started. Levene’s test assesses whether the variances of the groups are the same or different.
Determine whether there is homogeneity across groups. Please run the appropriate test. Is this test significant? What would we conclude (are the variances of the groups significantly different or the same)?
library(car)
## Loading required package: carData
##
## Attaching package: 'car'
## The following object is masked from 'package:dplyr':
##
## recode
## The following object is masked from 'package:purrr':
##
## some
leveneTest(clin.trial$mood.gain ~ as.factor(clin.trial$drug))
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 2 1.4672 0.2618
## 15
leveneTest(clin.trial$mood.gain ~ as.factor(clin.trial$therapy))
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 1 0.3592 0.5573
## 16
The p-value for the drug type variable is greater than 0.05 (p-value = 0.2618), therefore I do not have evidence to conclude that the variances are different. The p-value for the therapy type is greater than 0.05 (p-value = 0.5573), therefore I do not have evidence to conclude that the variances are different.
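Another homogeneity-of-variance check (included only as a sketch; Levene’s test above is the one we rely on here) is Bartlett’s test, which is more sensitive to departures from normality:
# Sketch: Bartlett's test of homogeneity of variances across the drug groups
bartlett.test(mood.gain ~ drug, data = clin.trial)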
Determine whether the distributions are normal within group. Please run a test (or look at distributions) to determine if the distributions are normal within groups. If you ran the test, is this test significant? What do you conclude?
You have to test for normality for each group.
# Loop through each type of the "drug" variable and create a histogram of the "mood.gain" column
for (drug_type in levels(clin.trial$drug)) {
drug_subset <- subset(clin.trial, drug == drug_type)
hist(drug_subset$mood.gain)
}
# for loop through each level of the "drug" variable and run shapiro.test() on the "mood.gain" column
for (drug_type in levels(clin.trial$drug)) {
drug_subset <- subset(clin.trial, drug == drug_type)
test_result <- shapiro.test(drug_subset$mood.gain)
cat(paste("Drug type:", drug_type, "- W = ", round(test_result$statistic, 3), ", p-value = ", round(test_result$p.value, 3), "\n"))
}
## Drug type: placebo - W = 0.961 , p-value = 0.83
## Drug type: anxifree - W = 0.956 , p-value = 0.79
## Drug type: joyzepam - W = 0.824 , p-value = 0.096
The histograms don’t look great but this is a really small sample and just a simple demonstration. In a real world sample, we would have more data points.
For “placebo” in drug type, the p-value is greater than 0.05 (p-value = 0.83), therefore I do not have evidence to conclude that this data is not normal. I fail to reject the null that the data is normal.
For “anxifree” in drug type, the p-value is greater than 0.05 (p-value = 0.79), therefore I do not have evidence to conclude that this data is not normal. I fail to reject the null that the data is normal.
For “joyzepam” in drug type, the p-value is greater than 0.05 (p-value = 0.096), therefore I do not have evidence to conclude that this data is not normal. I fail to reject the null that the data is normal.
And for the other column:
# Loop through each type of the "therapy" variable and create a histogram of the "mood.gain" column
for (therapy_type in levels(clin.trial$therapy)) {
therapy_subset <- subset(clin.trial, therapy == therapy_type)
hist(therapy_subset$mood.gain)
}
# for loop through each level of the "therapy" variable and run shapiro.test() on the "mood.gain" column
for (therapy_type in levels(clin.trial$therapy)) {
therapy_subset <- subset(clin.trial, therapy == therapy_type)
test_result <- shapiro.test(therapy_subset$mood.gain)
cat(paste("Therapy type:", therapy_type, "- W = ", round(test_result$statistic, 3), ", p-value = ", round(test_result$p.value, 3), "\n"))
}
## Therapy type: no.therapy - W = 0.87 , p-value = 0.122
## Therapy type: CBT - W = 0.994 , p-value = 1
For “no.therapy” in therapy type , the p-value is greater than 0.05 (p-value = 0.122), therefore I do not have evidence to conclude that this data is not normal. I fail to reject the null that the data is normal.
For “CBT” in therapy type, the p-value is greater than 0.05 (p-value = 1), therefore I do not have evidence to conclude that this data is not normal. I fail to reject the null that the data is normal.
Please run the ANOVA and interpret the results.
ANOVA <- aov(mood.gain ~ drug, data = clin.trial)
model <- aov(mood.gain ~ drug + therapy, data = clin.trial)
summary(ANOVA)
## Df Sum Sq Mean Sq F value Pr(>F)
## drug 2 3.453 1.7267 18.61 8.65e-05 ***
## Residuals 15 1.392 0.0928
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(model)
## Df Sum Sq Mean Sq F value Pr(>F)
## drug 2 3.453 1.7267 26.149 1.87e-05 ***
## therapy 1 0.467 0.4672 7.076 0.0187 *
## Residuals 14 0.924 0.0660
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
“R doesn’t use the names ‘between-group’ and ‘within-group’. Instead, it tries to assign more meaningful names: in our particular example, the between-groups variance corresponds to the effect that the drug has on the outcome variable; and the within-groups variance corresponds to the ‘leftover’ variability, so it calls that the residuals.”
From summary(ANOVA): this model tests only the three drug groups. The F-value is 18.61 and the p-value is 8.65e-05; the *** flags how significant the p-value is. There is a significant effect of drug on mood change, F(2, 15) = 18.61, p = 8.65e-05. I reject the null hypothesis and have evidence to conclude that there are differences between the groups; at least one group mean differs from the others.
I know that we are only doing a one-way ANOVA, but the data also includes a therapy factor, so I tried to learn how to run a two-way ANOVA. A two-way ANOVA was conducted to examine the effects of drug (placebo, anxifree, and joyzepam) and therapy (no.therapy and CBT) on mood gain. The main effect of drug was significant, F(2, 14) = 26.149, p = 1.87e-05, as was the main effect of therapy, F(1, 14) = 7.076, p = .0187.
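If we also wanted to know whether the drug effect depends on therapy, a natural extension (a sketch, beyond what this assignment asks for) is to include the interaction term:
# Sketch: two-way ANOVA with the drug-by-therapy interaction
model_int <- aov(mood.gain ~ drug * therapy, data = clin.trial)
summary(model_int)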
One problem with ANOVA is that while it can tell you that at least one of the means is significantly different, you don’t know which means are different from which other means.
We can do post-hoc tests to determine which means are significantly different from the others. There are many options, but let’s focus on the following:
• Bonferroni is the simplest adjustment where we simply divide the alpha level by the number of comparisons. Unfortunately, this correction is probably a bit conservative – meaning that we will not find significant results even when evidence may be strong enough to reject. Because of this, most people use different corrections.
• The Holm correction is more often used and has more power to detect differences without increasing the Type 1 error rate (a short p.adjust() sketch follows this list).
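To make the difference between the two corrections concrete, here is a tiny sketch using base R’s p.adjust() on three made-up p-values (illustrative values only, not from our data):
# Illustrative only: three hypothetical unadjusted p-values
p_raw <- c(0.010, 0.020, 0.040)
p.adjust(p_raw, method = "bonferroni") # every p is multiplied by 3, the number of comparisons
p.adjust(p_raw, method = "holm") # step-down: the smallest p is multiplied by 3, the next by 2, the largest by 1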
Run a post-hoc test with a Holm correction.
From lecture:
pairwise.t.test(clin.trial$mood.gain, clin.trial$drug, p.adj="holm")
##
## Pairwise comparisons using t tests with pooled SD
##
## data: clin.trial$mood.gain and clin.trial$drug
##
## placebo anxifree
## anxifree 0.1502 -
## joyzepam 9.1e-05 0.0011
##
## P value adjustment method: holm
Or, from the textbook:
library(lsr)
posthocPairwiseT(ANOVA, p.adjust.method="holm")
##
## Pairwise comparisons using t tests with pooled SD
##
## data: mood.gain and drug
##
## placebo anxifree
## anxifree 0.1502 -
## joyzepam 9.1e-05 0.0011
##
## P value adjustment method: holm
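Tukey’s HSD is another widely used post-hoc option; as a sketch (the assignment asks for the Holm correction), it can be run directly on the aov object fitted above:
# Sketch: Tukey's Honest Significant Difference on the one-way model
TukeyHSD(ANOVA)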
Which groups have significantly different means?
Post hoc pairwise t-tests using a Holm correction for multiple comparisons were conducted to compare mean mood gain between the drug groups. The results revealed no significant difference in mood gain between anxifree and placebo (p = 0.1502). There was a significant difference in mood gain between joyzepam and placebo (p = 9.1e-05) and between joyzepam and anxifree (p = 0.0011).
Write a communication block of your findings.
In an experiment to examine the effect that different drugs have on mood, 18 people were assigned to three conditions: placebo, anxifree, and joyzepam. The mean mood change was lowest for the placebo group (M = 0.45, SD = 0.28), higher for the anxifree condition (M = 0.71, SD = 0.39), and highest for the joyzepam condition (M = 1.48, SD = 0.21). A one-way ANOVA was conducted to examine the effect of these drugs on mood gain. The main effect of drug was significant, F(2, 15) = 18.61, p = 8.65e-05. I reject the null hypothesis and conclude that there are differences between the groups. Post-hoc t-tests with a Holm adjustment revealed that the mean mood gain for anxifree was not significantly higher than that for placebo (p = .150). However, the mean mood gain for joyzepam was significantly higher than that for anxifree (p = .001) and placebo (p < .001). These results suggest that joyzepam produces a significantly larger mood gain than the other drugs tested.
If the assumption of normality is not met, we can run a Kruskal-Wallis test, which works like our other non-parametric tests: we use the ranks of the data instead of the values themselves. We will use the same data (clinicaltrial.Rdata) as in the previous section.
sapply(clin.trial, class)
## drug therapy mood.gain
## "factor" "factor" "numeric"
Please run a non-parametric version of ANOVA:
kruskal.test(mood.gain ~ drug, data=clin.trial)
##
## Kruskal-Wallis rank sum test
##
## data: mood.gain by drug
## Kruskal-Wallis chi-squared = 12.076, df = 2, p-value = 0.002386
K(2) = 12.076, p = 0.002386
Run a non-parametric post-hoc test with Holm correction.
pairwise.wilcox.test(clin.trial$mood.gain, clin.trial$drug, p.adjust.method="holm", exact=FALSE)
##
## Pairwise comparisons using Wilcoxon rank sum test with continuity correction
##
## data: clin.trial$mood.gain and clin.trial$drug
##
## placebo anxifree
## anxifree 0.261 -
## joyzepam 0.015 0.015
##
## P value adjustment method: holm
Which groups are significantly different from one another?
The median mood gain for joyzepam was significantly higher than that for placebo (p = 0.015) and anxifree (p = 0.015).
In an experiment to examine the effect that different drugs
have on mood, 18 people were assigned to three conditions: placebo,
anxifree, and joyzepam. The mean mood change was lowest for the placebo
group (M = 0.45, SD = 0.28), higher for anxifree condition (M = 0.71, SD
= 0.39) and the highest for the joyzepam condition (M = 1.48, SD =
0.21). A Kruskal-Wallis rank sum test was conducted to examine the effect of these drugs on mood gain. The main effect of drug was significant, K(2) = 12.076, p = 0.002386. I reject the null
and have evidence to conclude that there are differences between the
groups. Post-hoc pairwise comparisons using Wilcoxon rank sum test with
continuity correction revealed that the median mood gain for anxifree
was not significantly higher than that for placebo (p = .261), and the
median mood gain for joyzepam was significantly higher than that for
placebo (p = 0.015) and anxifree (p = 0.015). These results suggest that
there are significant differences in mood gain among the three drugs
tested.
When we learned about the independent t-test, we learned about the Welch test, which can be used when the data do not meet the homogeneity of variance assumption. We have a similar option for the ANOVA test. Read section 14.8 in your text to learn about this alternative test.
What R code can you use to test ANOVA if the homogeneity of variance assumption is not met?
The Welch one-way test, which is run with oneway.test() (it does not assume equal variances).
Run this code and interpret the results (even if the data meet the assumption in this case).
oneway.test(mood.gain ~ drug, data = clin.trial)
##
## One-way analysis of means (not assuming equal variances)
##
## data: mood.gain and drug
## F = 26.322, num df = 2.0000, denom df = 9.4932, p-value = 0.000134
Download data from canvas titled ATV_riders.xlsx. This
data shows the type of ATV rider (ATV_type) and the number of years the
rider has been riding (Yrs_riding), this data is based off values in the
Waight & Bath (2014) article Recreation Specialization Among ATV
Users and Its Relationship to Environmental Attitudes and Management
Preferences on the Island of Newfoundland. While this data is based on
it, it is not their actual data (Dr. Snopkowski made it up!).
# Investigate the data
sapply(ATV_riders_1, class)
## ATV_type Yrs_riding
## "character" "numeric"
Data Cleaning:
# Change column ATV_type to factor class
ATV_riders_1$ATV_type <- factor(ATV_riders_1$ATV_type)
Study Details:
length(ATV_riders_1$ATV_type)
## [1] 77
In a study to examine ATV rider type (ATV_type) and the number of years the rider has been riding (Yrs_riding), 77 people were classified into one of three categories: Active, Casual, and Dedicated.
Please create a graph of the data so that you can visually see the mean and its associated error of the three groups.
plotmeans( formula = Yrs_riding ~ ATV_type, # plot Yrs_riding by ATV_type
data = ATV_riders_1, # the data frame
xlab = "ATV Type", # x-axis label
ylab = "Years Riding", # y-axis label
n.label = FALSE # don't display sample size
)
aggregate(Yrs_riding ~ ATV_type, ATV_riders_1, mean)
## ATV_type Yrs_riding
## 1 Active 15.160558
## 2 Casual 8.322483
## 3 Dedicated 21.253532
aggregate(Yrs_riding ~ ATV_type, ATV_riders_1, sd)
## ATV_type Yrs_riding
## 1 Active 4.145183
## 2 Casual 4.096048
## 3 Dedicated 3.715762
The mean years riding was lowest for the Casual classification (M = 8.32, SD = 4.10), higher for Active classification (M = 15.16, SD = 4.15) and the highest for the Dedicated classification (M = 21.25, SD = 3.72).
Does the data meet the assumptions of ANOVA (hint: you may need to run some tests)
The data is independent
leveneTest(ATV_riders_1$Yrs_riding ~ (ATV_riders_1$ATV_type))
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 2 0.4482 0.6405
## 74
The p-value for the ATV type variable is greater than 0.05 (p-value = 0.6405), therefore I do not have evidence to conclude that the variances are different.
# Loop through each level of the "ATV_type" variable and create a histogram of the "Yrs_riding" column
# (the loop variable atv_level is distinct from the ATV_type column, so subset() filters correctly)
for (atv_level in levels(ATV_riders_1$ATV_type)) {
ATV_subset <- subset(ATV_riders_1, ATV_type == atv_level)
hist(ATV_subset$Yrs_riding)
}
# Loop through each level of the "ATV_type" variable and run shapiro.test() on the "Yrs_riding" column
for (atv_level in levels(ATV_riders_1$ATV_type)) {
ATV_subset <- subset(ATV_riders_1, ATV_type == atv_level)
test_result <- shapiro.test(ATV_subset$Yrs_riding)
cat(paste("ATV type:", atv_level, "- W = ", round(test_result$statistic, 3), ", p-value = ", round(test_result$p.value, 3), "\n"))
}
## ATV type: Active - W = 0.956 , p-value = 0.01
## ATV type: Casual - W = 0.956 , p-value = 0.01
## ATV type: Dedicated - W = 0.956 , p-value = 0.01
The histograms don’t look very normal.
For “Active” in ATV type, the p-value is less than 0.05 (p-value = 0.01), therefore I have evidence to conclude that this data is not normal. I reject the null in favor of the alternative that the data is not normal.
For “Casual” in ATV type, the p-value is less than 0.05 (p-value = 0.01), therefore I have evidence to conclude that this data is not normal. I reject the null in favor of the alternative that the data is not normal.
For “Dedicated” in ATV type, the p-value is less than 0.05 (p-value = 0.01), therefore I have evidence to conclude that this data is not normal. I reject the null in favor of the alternative that the data is not normal.
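A loop-free alternative (a sketch using base R’s by() function) runs the Shapiro-Wilk test within each ATV type and avoids any risk of a loop variable shadowing the ATV_type column inside subset():
# Sketch: Shapiro-Wilk test of Yrs_riding within each ATV type, no manual loop needed
by(ATV_riders_1$Yrs_riding, ATV_riders_1$ATV_type, shapiro.test)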
Run the appropriate test and interpret the output.
The Kruskal-Wallis rank sum test is a non-parametric test for three or more groups. This test is appropriate here because it does not assume that the data are normally distributed.
kruskal.test(Yrs_riding ~ ATV_type, data=ATV_riders_1)
##
## Kruskal-Wallis rank sum test
##
## data: Yrs_riding by ATV_type
## Kruskal-Wallis chi-squared = 51.014, df = 2, p-value = 8.363e-12
A Kruskal-Wallis rank sum test was conducted to examine whether years of riding differ by ATV rider type. The main effect of ATV rider type was significant, K(2) = 51.014, p-value = 8.36e-12. I reject the null hypothesis that the medians are equal in favor of the alternative: at least one ATV rider type has a median that differs from the other classifications. In other words, I have evidence to conclude that there are differences between the medians of the groups.
Please run a post-hoc test to determine which of the means are significantly different from each other.
pairwise.wilcox.test(ATV_riders_1$Yrs_riding, ATV_riders_1$ATV_type, p.adjust.method="holm", exact=FALSE)
##
## Pairwise comparisons using Wilcoxon rank sum test with continuity correction
##
## data: ATV_riders_1$Yrs_riding and ATV_riders_1$ATV_type
##
## Active Casual
## Casual 6.5e-06 -
## Dedicated 6.5e-06 5.5e-09
##
## P value adjustment method: holm
Post-hoc pairwise comparisons using Wilcoxon rank sum tests with continuity correction revealed that the median years riding for Dedicated was significantly higher than that for Casual (p = 5.5e-09) and Active (p = 6.5e-06). The median years riding for Active was also significantly higher than that for Casual (p = 6.5e-06). These results suggest that there are significant differences in years riding among the three types of ATV riders.
Write a communication block of your results
In a study to examine ATV rider type (ATV_type) and the number of years the rider has been riding (Yrs_riding), 77 people were classified into one of three categories: Active, Casual, and Dedicated. The mean years riding was lowest for the Casual classification (M = 8.32, SD = 4.10), higher for the Active classification (M = 15.16, SD = 4.15), and highest for the Dedicated classification (M = 21.25, SD = 3.72). A Kruskal-Wallis rank sum test was conducted to examine whether years of riding differ by ATV rider type. The main effect of ATV rider type was significant, K(2) = 51.014, p-value = 8.36e-12. I reject the null hypothesis that the medians are equal in favor of the alternative: at least one ATV rider type has a median that differs from the other classifications. In other words, I have evidence to conclude that there are differences between the medians of the groups. Post-hoc pairwise comparisons using Wilcoxon rank sum tests with continuity correction revealed that the median years riding for Dedicated was significantly higher than that for Casual (p = 5.5e-09) and Active (p = 6.5e-06). The median years riding for Active was also significantly higher than that for Casual (p = 6.5e-06). These results suggest that there are significant differences in years riding among the three types of ATV riders.
A neurologist might be interested in the depressant effects of certain recreational drugs. She tested 20 individuals: 10 were given ecstasy to take on a Saturday night and 10 were allowed to drink alcohol only. Levels of depression are measured using the Beck Depression Inventory (BDI) the day after and midweek.
names(Drug_df)
## [1] "Drug" "Sunday_BDI" "Wednesday_BDI"
Data Cleaning:
# Replace 1 with Ecstasy and 2 with Alcohol
Drug_df$Drug <- ifelse(Drug_df$Drug == 1, "Ecstasy", "Alcohol")
# Check the class of the Drug column
if (class(Drug_df$Drug) != "factor") {
# Convert the Drug column to a factor
Drug_df$Drug <- factor(Drug_df$Drug)
}
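As a quick sanity check on the recode (a sketch, not required by the assignment), we can tabulate the new factor; the study description says 10 people received each drug:
# Sketch: counts per recoded drug group (we expect 10 Alcohol and 10 Ecstasy)
table(Drug_df$Drug)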
In a study to examine the depressant effects of certain recreational drugs, 20 people were assigned to one of two treatments, “Ecstasy” or “Alcohol”, and their depression was then measured using the Beck Depression Inventory (BDI) the day after and midweek.
Since the columns Sunday_BDI and Wednesday_BDI are measurements taken on the same people the day after drug treatment and midweek, the data are dependent. I will need to run a statistical test to determine whether depression scores change between the two days. The Wilcoxon signed-rank test, the non-parametric counterpart of the paired t-test, should work.
Be sure to present a descriptive chart of the depression scores for Sunday and Wednesday. Check assumptions of the model and run the appropriate test. (Note: alcohol drinkers are not used in this question).
# Get summary statistics for the Sunday_BDI and Wednesday_BDI columns.
Drug_summary <- Drug_df %>%
summarise(n = n(),
mean_Sunday = mean(Sunday_BDI),
mean_Wednesday = mean(Wednesday_BDI),
sd_Sunday = sd(Sunday_BDI),
sd_Wednesday = sd(Wednesday_BDI))
Drug_summary
## # A tibble: 1 × 5
## n mean_Sunday mean_Wednesday sd_Sunday sd_Wednesday
## <int> <dbl> <dbl> <dbl> <dbl>
## 1 20 18 21.0 5.08 12.9
In a study to examine the depressant effects of certain recreational drugs, 20 people received one of two drug treatments, “Ecstasy” or “Alcohol”, and were then measured using the Beck Depression Inventory (BDI) the day after and midweek. The mean BDI was lower on Sunday (M = 18, SD = 5.08) and higher on Wednesday (M = 21.05, SD = 12.92).
Are the assumptions of the test met?
The Wilcoxon signed-rank test does not assume normality or equal variances. Its assumptions include:
• Dependent data
• A roughly symmetric distribution around the median, which can be checked with a histogram of the difference scores.
# Create a vector of the difference between day after and midweek BDI scores.
diff <- Drug_df$Wednesday_BDI - Drug_df$Sunday_BDI
# Create histogram of diff
hist(diff)
# Add median line to histogram
abline(v = median(diff), col = "red", lwd = 2)
The histogram appears to have a symmetric distribution around the median. All assumptions are met.
We only focus on “Ecstasy” for this demonstration.
ecstasy <- Drug_df %>%
filter(Drug == "Ecstasy")
# Get summary statistics for the Sunday_BDI and Wednesday_BDI columns for the ecstasy group.
ecstasy_summary <- ecstasy %>%
summarise(n = n(),
mean_Sunday = mean(Sunday_BDI),
mean_Wednesday = mean(Wednesday_BDI),
sd_Sunday = sd(Sunday_BDI),
sd_Wednesday = sd(Wednesday_BDI))
ecstasy_summary
## # A tibble: 1 × 5
## n mean_Sunday mean_Wednesday sd_Sunday sd_Wednesday
## <int> <dbl> <dbl> <dbl> <dbl>
## 1 10 19.6 32 6.60 4.78
# Create a vector of the difference between day after and midweek BDI scores.
diff_e <- ecstasy$Wednesday_BDI - ecstasy$Sunday_BDI
# Create histogram of diff
hist(diff_e)
# Add median line to histogram
abline(v = median(diff_e), col = "red", lwd = 2)
median(diff_e)
## [1] 14
Do the depression scores of ecstasy users change from Sunday to Wednesday? Run the appropriate test.
wilcox.test(ecstasy$Sunday_BDI, ecstasy$Wednesday_BDI, paired=TRUE, exact=FALSE)
##
## Wilcoxon signed rank test with continuity correction
##
## data: ecstasy$Sunday_BDI and ecstasy$Wednesday_BDI
## V = 0, p-value = 0.01403
## alternative hypothesis: true location shift is not equal to 0
A Wilcoxon signed-rank test was conducted to test whether the ranked difference scores were significantly different from 0. This median difference, 14, was significant, V = 0, p = 0.01403.
Write a communication block of your results
In a study examining the depressant effects of recreational drugs, a neurologist tested 20 individuals, 10 of whom were given ecstasy on a Saturday night and 10 who were allowed to drink alcohol only. Levels of depression were measured using the Beck Depression Inventory (BDI) the day after (Sunday) and midweek (Wednesday). For the participants who took ecstasy, the mean BDI scores were 19.6 (SD = 6.603) on Sunday and 32 (SD = 4.78) on Wednesday. The median difference in BDI scores between Sunday and Wednesday was 14.
A Wilcoxon signed-rank test was conducted to determine if there was a significant difference in BDI scores between Sunday and Wednesday for individuals who took ecstasy. The test yielded a V-value of 0 and a significant p-value of 0.01403. Based on these results, we can conclude that there is a significant difference in the depression levels of individuals who took ecstasy between Sunday and Wednesday, with depression levels being higher on Wednesday.
Create flowcharts for the dependent t-test & ANOVA tests.
Paired t-test
ANOVA