knitr::opts_chunk$set(
echo = TRUE,
message = FALSE,
warning = FALSE
)
Previous sections:
Note: This analysis is for personal study purposes. In this section, I will summarize some basic commands for assessing normality, based on several online sources.
R Packages:
# More packages will be shown during the analysis process
# Load the required library
library(tidyverse) # Data Wrangling
library(conflicted) # Dealing with conflict package
library(readxl) # Read Excel files
Dealing with Conflicts
There are a lot of packages here, and sometimes individual functions are
in conflict. R’s default conflict resolution system gives precedence to
the most recently loaded package. This can make it hard to detect
conflicts, particularly when they are introduced by an update to an
existing package.
conflict_prefer("filter", "dplyr")
conflict_prefer("select", "dplyr")
conflict_prefer("Predict", "rms")
conflict_prefer("impute_median", "simputation")
conflict_prefer("summarize", "dplyr")
Data used in these notes
fakestroke.csv
(Source: https://github.com/THOMASELOVE/432-data/blob/master/data/fakestroke.csv)
Loading the data and specifying the type of each column:
# Loading the data and specifying the type of each column
data <- read_excel("D:/Statistics/R/R data/fakestroke.xlsx",
col_types = c("text", "text", "numeric",
"text", "numeric", "text", "text",
"numeric", "numeric", "text", "numeric",
"text", "numeric", "numeric", "text",
"numeric", "numeric", "numeric"))
df = data
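Since the source file on GitHub is a csv, it can also be loaded directly with readr (attached via tidyverse), avoiding the local Excel copy. A minimal sketch, assuming the standard raw-file URL for the repository and the same 18-column order as the read_excel() call above (c = character/text, d = double/numeric):
# Alternative: read the csv directly from GitHub (raw URL assumed from the blob link above)
data <- read_csv("https://raw.githubusercontent.com/THOMASELOVE/432-data/master/data/fakestroke.csv",
                 col_types = "ccdcdccddcdcddcddd")
df = data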
The fakestroke.csv file contains the following 18 variables for 500 patients:
# To make a table in R markdown: 1st row: header; 2nd row: alignment; the remaining rows: content
## |Variable | Description |
## |:------- | :---------- |
## |studyid | Study ID # (z001 through z500) |
| Variable | Description |
|---|---|
| studyid | Study ID # (z001 through z500) |
| trt | Treatment group (Intervention or Control) |
| age | Age in years |
| sex | Male or Female |
| nihss | NIH Stroke Scale Score (can range from 0-42; higher scores indicate more severe neurological deficits) |
| location | Stroke Location - Left or Right Hemisphere |
| hx.isch | History of Ischemic Stroke (Yes/No) |
| afib | Atrial Fibrillation (1 = Yes, 0 = No) |
| dm | Diabetes Mellitus (1 = Yes, 0 = No) |
| mrankin | Pre-stroke modified Rankin scale score (0, 1, 2 or > 2) indicating functional disability - complete range is 0 (no symptoms) to 6 (death) |
| sbp | Systolic blood pressure, in mm Hg |
| iv.altep | Treatment with IV alteplase (Yes/No) |
| time.iv | Time from stroke onset to start of IV alteplase (minutes) if iv.altep=Yes |
| aspects | Alberta Stroke Program Early Computed Tomography score, which measures extent of stroke from 0 - 10; higher scores indicate fewer early ischemic changes |
| ia.occlus | Intracranial arterial occlusion, based on vessel imaging - five categories |
| extra.ica | Extracranial ICA occlusion (1 = Yes, 0 = No) |
| time.rand | Time from stroke onset to study randomization, in minutes |
| time.punc | Time from stroke onset to groin puncture, in minutes (only if Intervention) |
First, I will add a table of parametric and non-parametric tests for independent groups. This table will give you an overview of which approach is suitable for your variables:
It is very important that you check the assumptions before deciding which statistical test is appropriate. Most parametric tests based on the normal distribution have four basic assumptions that must be met for the test to be accurate. Field et al. 2012 indicated that:
* Normally distributed data: In short, the rationale behind hypothesis testing relies on having something that is normally distributed.
* Homogeneity of variance: The variances should be the same throughout the data. In designs in which you test several groups of participants, this assumption means that each of these samples comes from populations with the same variance. In correlational designs, this assumption means that the variance of one variable should be stable at all levels of the other variable.
* Interval data: Data should be measured at least at the interval level. This assumption is tested by common sense and so won’t be discussed further.
* Independence: This assumption, like that of normality, differs depending on the test you’re using. In some cases it means that data from different participants are independent, i.e. the behaviour of one participant does not influence the behaviour of another. In repeated-measures designs (in which participants are measured in more than one experimental condition), we expect scores in the experimental conditions to be non-independent for a given participant, but behaviour between different participants should be independent.
According to thomaselove.github, data are well approximated by a Normal distribution if the shape of the data’s distribution is a good match for a Normal distribution with mean and standard deviation equal to the sample statistics: The data are symmetrically distributed about a single peak, located at the sample mean.
Several tools for assessing Normality of a single batch of data:
* A histogram with a superimposed Normal distribution.
* Histogram variants (like the boxplot) which provide information on the center, spread and shape of a distribution.
* The Empirical Rule for interpreting a standard deviation (see the sketch after this list).
* A specialized normal Q-Q plot (also called a normal probability plot or normal quantile-quantile plot), designed to reveal differences between a sample distribution and what we might expect from a Normal distribution of a similar number of values with the same mean and standard deviation.
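As promised above, here is a quick illustration of the Empirical Rule, which says that roughly 68%, 95% and 99.7% of Normally distributed values fall within 1, 2 and 3 standard deviations of the mean. A minimal sketch checking the observed coverage for sbp:
# Empirical Rule check: proportion of sbp values within k SDs of the mean
m = mean(df$sbp, na.rm = TRUE)
s = sd(df$sbp, na.rm = TRUE)
sapply(1:3, function(k) mean(abs(df$sbp - m) <= k * s, na.rm = TRUE))
# For approximately Normal data we expect roughly 0.68, 0.95 and 0.997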
According to thomaselove.github, most of the time, when we want to understand whether our data are well approximated by a Normal distribution, we will use a graph to aid in the decision.
One option is to build a histogram with a Normal density function (with the same mean and standard deviation as our data) superimposed. This is one way to help visualize deviations between our data and what might be expected from a Normal distribution.
Package: tidyverse
Command: ggplot(dataframe, aes(x = variable)) + geom_histogram(aes(y = ..density..)) + stat_function(fun = dnorm, args = list(mean = mean(variable, na.rm = TRUE), sd = sd(variable, na.rm = TRUE)))
Let’s examine the distribution of sbp and age:
ggplot(df, aes(sbp)) +
geom_histogram(aes(y = ..density..), binwidth = 1, colour = "black", fill = "white") +
stat_function(fun = dnorm,
args = list(mean = mean( df$sbp , na.rm = T), sd = sd( df$sbp , na.rm = T)),
lwd = 1, col = "blue") +
labs(x = "sbp (mmHg)", y = "Density") +
theme_bw()
ggplot(df, aes(age)) +
geom_histogram(aes(y = ..density..), binwidth = 1, colour = "black", fill = "white") +
stat_function(fun = dnorm,
args = list(mean = mean( df$age, na.rm = T), sd = sd( df$age, na.rm = T)),
lwd = 1, col = "blue") +
labs(x = "age (years)", y = "Density") +
theme_bw()
Discuss
sbp: The histogram is single-peaked, roughly symmetric and bell-shaped, and tracks the superimposed Normal density curve closely.
=> We can say that the sbp distribution appears approximately Normal.
age: The distribution looks a little left-skewed, which suggests a non-Normal distribution.
However, the density plot is not really helpful for understanding tail behavior. I advise drawing a normal Q-Q plot before reaching any further conclusion.
In the Q-Q plot, the data are ranked and sorted. Each value is compared to the expected value that a score in that position would have in a normal distribution, and the two are plotted against one another. If the data are normally distributed, the actual scores will have the same distribution as the scores we expect from a normal distribution, and you’ll get a lovely straight diagonal line.
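To see this mechanism directly, here is a minimal sketch (base R, for illustration only) that builds the Q-Q coordinates by hand for sbp: sort the data, compute the matching theoretical Normal quantiles with qnorm(ppoints(n)), and plot one against the other.
# Manual Q-Q plot for sbp: sorted sample values vs expected Normal quantiles
x = sort(df$sbp[!is.na(df$sbp)])
theoretical = qnorm(ppoints(length(x))) # expected standard Normal quantiles
plot(theoretical, x, xlab = "Theoretical quantiles", ylab = "sbp")
qqline(x) # reference line through the first and third quartiles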
Let’s draw a Q-Q plot for both the sbp and age variables. There are two types of commands for a Q-Q plot. First, let’s use a quick command:
Package: ggpubr
# Drawing QQ-plot for sbp:
library(ggpubr)
a = ggqqplot(df$sbp, ylab = "sbp")
# Drawing density plot for sbp:
b= ggplot(df, aes(sbp)) +
geom_histogram(aes(y = ..density..), colour = "black", fill = "white") +
stat_function(fun = dnorm,
args = list(mean = mean( df$sbp, na.rm = T), sd = sd( df$sbp, na.rm = T)),
lwd = 1, col = "blue") +
labs(x = "sbp (mmHg)", y = "Density") +
theme_bw()
# Combine plot:
library(ggpubr)
plot = ggarrange(a,b,
ncol=2, nrow=1,
common.legend = FALSE,
legend="right",
labels = c("A","B"))
annotate_figure(plot, bottom = text_grob("The distribution for sbp",
color = "red", face = "bold", size = 12))
# Drawing QQ-plot for age:
library(ggpubr)
a = ggqqplot( df$age, ylab = "age")
# Drawing density plot for age:
b = ggplot(df, aes(age)) +
geom_histogram(aes(y = ..density..), binwidth = 1, colour = "black", fill = "white") +
stat_function(fun = dnorm,
args = list(mean = mean( df$age, na.rm = T), sd = sd( df$age, na.rm = T)),
lwd = 1, col = "blue") +
labs(x = "age (years)", y = "Density") +
theme_bw()
# Combine plot:
library(ggpubr)
plot = ggarrange(a,b,
ncol=2, nrow=1,
common.legend = FALSE,
legend="right",
labels = c("A","B"))
annotate_figure(plot, bottom = text_grob("The distribution for age",
color = "red", face = "bold", size = 12))
Let’s try a more flexible, pure-ggplot2 command:
Package: tidyverse
# Drawing Q-Q plot for sbp
a = ggplot(data = df, aes(sample = sbp)) +
geom_qq() +
geom_qq_line() +
labs( x = "Theoretical", y = "sbp")
# Drawing density plot for sbp:
b= ggplot(df, aes(sbp)) +
geom_histogram(aes(y = ..density..), colour = "black", fill = "white") +
stat_function(fun = dnorm,
args = list(mean = mean(df$sbp, na.rm = T), sd = sd(df$sbp, na.rm = T)),
lwd = 1, col = "blue") +
labs(x = "sbp (mmHg)", y = "Density") +
theme_bw()
# Combine plot:
library(ggpubr)
plot = ggarrange(a,b,
ncol=2, nrow=1,
common.legend = FALSE,
legend="right",
labels = c("A","B"))
annotate_figure(plot, bottom = text_grob("The distribution for sbp",
color = "red", face = "bold", size = 12))
# Drawing QQ-plot for age:
library(ggpubr)
a = ggplot(data = df, aes(sample = age)) +
geom_qq() +
geom_qq_line() +
labs( x = "Theoretical", y = "age")
# Drawing density plot for age:
b = ggplot(df, aes(age)) +
geom_histogram(aes(y = ..density..), binwidth = 1, colour = "black", fill = "white") +
stat_function(fun = dnorm,
args = list(mean = mean(df$age, na.rm = T), sd = sd(df$age, na.rm = T)),
lwd = 1, col = "blue") +
labs(x = "age (years)", y = "Density") +
theme_bw()
# Combine plot:
library(ggpubr)
plot = ggarrange(a,b,
ncol=2, nrow=1,
common.legend = FALSE,
legend="right",
labels = c("A","B"))
annotate_figure(plot, bottom = text_grob("The distribution for age",
color = "red", face = "bold", size = 12))
Discuss
For sbp: The data appear to be well-modeled by the Normal distribution, because the points on the Normal Q-Q plot follow the diagonal reference line.
For age:
The density plot shows left skew, with a longer tail on the left-hand side and the data clustered at the right end of the distribution.
The Q-Q plot shows most of the points falling below the straight line, curving down and away from it in both tails => left skew.
Conversely, if most of the points on a Q-Q plot sit above the straight line, curving up and away from it in both tails => right skew.
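A quick simulation makes these two patterns easy to see. The sketch below (illustrative only) draws a right-skewed sample and a left-skewed sample and shows their Normal Q-Q plots side by side:
# Simulated skewed samples to illustrate the Q-Q patterns described above
set.seed(431)
right_skew = rexp(500)  # long right tail
left_skew = -rexp(500)  # long left tail
par(mfrow = c(1, 2))
qqnorm(right_skew, main = "Right skew"); qqline(right_skew)
qqnorm(left_skew, main = "Left skew"); qqline(left_skew)
par(mfrow = c(1, 1))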
From https://thomaselove.github.io/431-notes:
To examine descriptive statistics for variables.
Package: psych
Command:
library(psych)
psych::describe(cbind(df$age, df$sbp))
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 500 64.71 17.05 65.75 65.49 15.94 23 96 73 -0.33 -0.40 0.76
## X2 2 499 145.48 25.14 145.00 145.44 25.20 78 231 153 0.05 -0.11 1.13
Discuss
psych::describe() reports skew and kurtosis, but no normality-test p-value.
=> We should use another command (see stat.desc() and shapiro.test() below) to calculate a p-value.
To examine descriptive statistics for variables, overall or by group.
Package: psych
Command:
library(psych)
describeBy(cbind(df$age, df$sbp)) # describe.by() is the deprecated name for describeBy()
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 500 64.71 17.05 65.75 65.49 15.94 23 96 73 -0.33 -0.40 0.76
## X2 2 499 145.48 25.14 145.00 145.44 25.20 78 231 153 0.05 -0.11 1.13
describeBy(cbind(df$age, df$sbp), df$trt)
##
## Descriptive statistics by group
## INDICES: Control
## vars n mean sd median trimmed mad min max range skew kurtosis se
## V1 1 267 65.38 16.1 65.7 66.1 15.27 24 94 70 -0.29 -0.32 0.99
## V2 2 266 145.00 24.4 145.0 144.8 24.46 82 231 149 0.15 0.03 1.50
## ------------------------------------------------------------
## INDICES: Intervention
## vars n mean sd median trimmed mad min max range skew kurtosis se
## V1 1 233 63.93 18.09 65.8 64.76 16.01 23 96 73 -0.33 -0.56 1.19
## V2 2 233 146.03 26.00 146.0 146.14 26.69 78 214 136 -0.06 -0.26 1.70
To quickly examine descriptive statistics, including a normality test, for variables.
Package: pastecs
Command:
library(pastecs)
stat.desc(cbind(df$age, df$sbp), basic=F, norm=T, p=0.95)
## V1 V2
## median 6.575000e+01 145.00000000
## mean 6.470580e+01 145.47895792
## SE.mean 7.627075e-01 1.12525785
## CI.mean.0.95 1.498514e+00 2.21083795
## var 2.908613e+02 631.83640373
## std.dev 1.705466e+01 25.13635621
## coef.var 2.635723e-01 0.17278345
## skewness -3.305647e-01 0.04582062
## skew.2SE -1.513328e+00 0.20955860
## kurtosis -3.970448e-01 -0.11045899
## kurt.2SE -9.106278e-01 -0.25308825
## normtest.W 9.744833e-01 0.99840676
## normtest.p 1.183624e-07 0.93364481
Discuss
normtest.W and normtest.p are the Shapiro-Wilk test statistic and its p-value; skew.2SE and kurt.2SE are the skew and kurtosis divided by twice their standard errors (absolute values greater than 1 indicate significant skew or kurtosis at p < .05, per Field et al. 2012).
=> This result should be used in conjunction with the Q-Q plot and the density plot before drawing a conclusion about normality.
To quickly examine descriptive statistics for each group (factor level) within a variable.
Package: pastecs
Command:
library(pastecs)
by(cbind(df$age, df$sbp), df$trt, stat.desc, basic=F, norm=T )
## INDICES: Control
## V1 V2
## median 6.570000e+01 145.00000000
## mean 6.538240e+01 144.99624060
## SE.mean 9.853615e-01 1.49577831
## CI.mean.0.95 1.940100e+00 2.94512211
## var 2.592403e+02 595.13583487
## std.dev 1.610094e+01 24.39540602
## coef.var 2.462580e-01 0.16824854
## skewness -2.927944e-01 0.15427514
## skew.2SE -9.820483e-01 0.51648808
## kurtosis -3.177592e-01 0.02906420
## kurt.2SE -5.348310e-01 0.04882888
## normtest.W 9.699551e-01 0.99675936
## normtest.p 2.132393e-05 0.86710009
## ------------------------------------------------------------
## INDICES: Intervention
## V1 V2
## median 65.800000000 146.0000000
## mean 63.930472103 146.0300429
## SE.mean 1.185100119 1.7032014
## CI.mean.0.95 2.334933953 3.3557189
## var 327.239714000 675.9085763
## std.dev 18.089768213 25.9982418
## coef.var 0.282960029 0.1780335
## skewness -0.331285618 -0.0643649
## skew.2SE -1.038831068 -0.2018326
## kurtosis -0.556078576 -0.2618235
## kurt.2SE -0.875485563 -0.4122128
## normtest.W 0.973327147 0.9958385
## normtest.p 0.000222805 0.7869225
The Shapiro-Wilk test. Mechanism: it compares the scores in the sample to a normally distributed set of scores with the same mean and standard deviation.
Warning: In large samples this test can be significant even when the scores are only slightly different from a normal distribution. Therefore, it should always be interpreted in conjunction with histograms or Q-Q plots, and the values of skew and kurtosis.
Command:
* For a numeric variable: shapiro.test(variable)
* For separate groups within one numeric variable: by(numeric variable, group/categorical variable, shapiro.test)
As a final point, bear in mind to look at the variable within separate groups. If our analysis involves comparing groups, then what matters is not the overall distribution but the distribution within each group.
shapiro.test(df$age)
##
## Shapiro-Wilk normality test
##
## data: df$age
## W = 0.97448, p-value = 1.184e-07
shapiro.test(df$sbp)
##
## Shapiro-Wilk normality test
##
## data: df$sbp
## W = 0.99841, p-value = 0.9336
by(df$age, df$trt, shapiro.test)
## df$trt: Control
##
## Shapiro-Wilk normality test
##
## data: dd[x, ]
## W = 0.96996, p-value = 2.132e-05
##
## ------------------------------------------------------------
## df$trt: Intervention
##
## Shapiro-Wilk normality test
##
## data: dd[x, ]
## W = 0.97333, p-value = 0.0002228
by(df$sbp, df$trt, shapiro.test)
## df$trt: Control
##
## Shapiro-Wilk normality test
##
## data: dd[x, ]
## W = 0.99676, p-value = 0.8671
##
## ------------------------------------------------------------
## df$trt: Intervention
##
## Shapiro-Wilk normality test
##
## data: dd[x, ]
## W = 0.99584, p-value = 0.7869
Discuss
For age:
* age overall: p-value < 0.05 => non-normal distribution.
* age within each trt group: p < 0.05 => non-normal distribution.
For sbp:
* sbp overall: p-value > 0.05 => consistent with a normal distribution.
* sbp within each trt group: p > 0.05 => consistent with a normal distribution.
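The warning above (that the Shapiro-Wilk test flags even slight deviations in large samples) is easy to demonstrate by simulation. A minimal sketch, using a t distribution with 10 degrees of freedom, which is only mildly heavier-tailed than the Normal:
# Large-sample caveat: mild non-Normality yields tiny p-values
set.seed(431)
x = rt(5000, df = 10) # nearly Normal, slightly heavy-tailed
shapiro.test(x)       # with n = 5000 this will typically reject H0
hist(x, breaks = 50)  # yet the histogram looks essentially bell-shaped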
According to Field et al. 2012:
Homogeneity of variance: This assumption means that the variances should be the same throughout the data. In designs in which you test several groups of participants this assumption means that each of these samples comes from populations with the same variance. In correlational designs, this assumption means that the variance of one variable should be stable at all levels of the other variable.
Example (Field et al.’s concert/hearing-loss illustration): if you measured the vertical distance between the lowest score and the highest score at each level of the concert variable, these distances would be fairly similar. Although the means increase, the spread of hearing-loss scores is the same at each level of the concert variable.
There are several tests for the equality (homogeneity) of variance across groups, including:
* F-test (variance ratio test): compares the variances of two samples. The data must be normally distributed.
* Bartlett’s test: compares the variances of k samples, where k can be more than two. The data must be normally distributed.
* Levene’s test: compares the variances of k samples, where k can be more than two. It is an alternative to Bartlett’s test that is less sensitive to departures from normality.
* Fligner-Killeen test: a non-parametric test that is very robust against departures from normality.
When there are only two groups, the test we use to determine whether the variances are the same is called a variance ratio test. The test involves dividing the variance of group one by the variance of group two. If this ratio is close to one, we conclude that the variances of the two groups are the same; if it is far from one, we conclude that they are not. A sketch of the mechanism follows.
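A minimal by-hand sketch for age across the two treatment groups; the ratio should match the F statistic reported by var.test() below:
# Variance ratio by hand: var(first group) / var(second group)
v = tapply(df$age, df$trt, var, na.rm = TRUE)
v
v["Control"] / v["Intervention"] # compare with the F statistic from var.test()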
Whether a variance ratio test should ever be conducted is an open question. Simulation studies have shown that in many cases where the standard t-test performs poorly (and hence we would want the unequal-variance t-test), the variance ratio test has little ability to detect that the group variances actually differ. Conversely, in scenarios where the variance ratio test reliably detects unequal variances, the standard t-test that (incorrectly) assumes equal variances still works very well. Combined, these two results have led some to conclude that it is better never to use a variance ratio test.
This test uses the following null and alternative hypotheses:
H0: The group variances are equal.
HA: The group variances are not equal.
Command:
* var.test(x, y, ratio = 1, alternative = c("two.sided", "less", "greater"), conf.level = 0.95, ...)
* var.test(formula, data, subset, na.action, ...)
var.test(age ~ trt , data = df)
##
## F test to compare two variances
##
## data: age by trt
## F = 0.7922, num df = 266, denom df = 232, p-value = 0.06615
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
## 0.616568 1.015679
## sample estimates:
## ratio of variances
## 0.7922029
var.test(sbp ~ trt , data = df)
##
## F test to compare two variances
##
## data: sbp by trt
## F = 0.8805, num df = 265, denom df = 232, p-value = 0.3155
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
## 0.6851554 1.1291608
## sample estimates:
## ratio of variances
## 0.8804975
Discuss
For age by trt: p-value > 0.05 => We fail to reject the null hypothesis that the group variances are the same.
=> However, this result contradicts Levene’s test and the Fligner-Killeen test (in the following sections), while agreeing with Bartlett’s test.
=> Because age is non-normally distributed, the Fligner-Killeen and Levene’s tests are the more suitable approaches here and their results will be more accurate.
For sbp by trt: p-value > 0.05 => This result matches what we find with Levene’s test and Bartlett’s test.
=> We do not have sufficient evidence to say the variance differs significantly across the groups.
Generally we are interested in testing whether there is a difference in the group means. When testing for differences in group means, the specific test statistic to use depends on whether the group variances are equal, as the sketch below shows.
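In R this choice is the var.equal argument of t.test(). A minimal sketch for sbp by treatment group: since the homogeneity tests in this section do not reject equal variances for sbp, the pooled (Student) version is defensible, while the Welch version (R’s default) does not assume equal variances:
# Pooled-variance t-test: assumes equal group variances
t.test(sbp ~ trt, data = df, var.equal = TRUE)
# Welch t-test (the default): does not assume equal variances
t.test(sbp ~ trt, data = df)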
In statistics, Bartlett’s test is used to test whether k samples come from populations with equal variances. Equal variances across populations is called homoscedasticity or homogeneity of variances. Some statistical tests, for example ANOVA, assume that variances are equal across groups or samples; Bartlett’s test can be used to verify that assumption. It is appropriate for normally distributed data.
This test uses the following null and alternative hypotheses:
H0: The group variances are equal.
HA: At least one group has a variance that is not equal to the rest.
Command: bartlett.test(numeric variable ~ group, dataset)
bartlett.test( age ~ trt , data = df)
##
## Bartlett test of homogeneity of variances
##
## data: age by trt
## Bartlett's K-squared = 3.3654, df = 1, p-value = 0.06658
bartlett.test( sbp ~ trt , data = df)
##
## Bartlett test of homogeneity of variances
##
## data: sbp by trt
## Bartlett's K-squared = 1.0019, df = 1, p-value = 0.3168
Discuss
For age by trt: p-value > 0.05 => We fail to reject the null hypothesis that the group variances are the same.
=> We do not have sufficient evidence to say the variance differs significantly across the groups.
=> However, this result contradicts the Levene’s test result (in the next section). This can be explained by the non-normal distribution of age within the trt groups, which causes Bartlett’s test to produce a different result.
For sbp by trt: p-value > 0.05 => This result is similar to what we find with Levene’s test.
Levene’s test uses the following null and alternative hypotheses:
H0: The variance among each group is equal.
HA: At least one group has a variance that is not equal to the rest.
Package: car
Command: leveneTest(outcome variable, group, center =
median/mean)
Warning: In large samples Levene’s test can be significant even when group variances are not very different. Therefore, it should be interpreted in conjunction with the variance ratio.
Example 1: For age:
library(car)
leveneTest(df$age, df$trt)
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 1 4.0963 0.04351 *
## 498
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
leveneTest(df$age, df$trt, center = mean)
## Levene's Test for Homogeneity of Variance (center = mean)
## Df F value Pr(>F)
## group 1 4.3068 0.03847 *
## 498
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Discuss
The result is significant for age (p-value < 0.05) regardless of whether we center on the median or the mean. We reject the null hypothesis that the group variances are the same.
=> We have sufficient evidence to say the variance differs significantly across the groups.
=> The assumption of homogeneity of variance is violated.
=> However, this result contradicts the Bartlett’s test result (in the previous section). This can be explained by the non-normal distribution of age within the trt groups; Levene’s test is therefore the more suitable approach.
Example 2: For sbp:
library(car)
leveneTest(df$sbp, df$trt)
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 1 1.2385 0.2663
## 497
leveneTest(df$sbp, df$trt, center = mean)
## Levene's Test for Homogeneity of Variance (center = mean)
## Df F value Pr(>F)
## group 1 1.2387 0.2663
## 497
Discuss
The result is non-significant for sbp (p-value > 0.05) regardless of whether we center on the median or the mean. This indicates that the variances are not significantly different.
=> We do not have sufficient evidence to say the variance differs significantly across the groups.
The Fligner-Killeen test is among the tests for homogeneity of variances that are most robust against departures from normality.
This test uses the following null and alternative hypotheses:
H0: The variance among each group is equal.
HA: At least one group has a variance that is not equal to the rest.
Command: fligner.test(numeric variable ~ group, data = dataset)
Example 1: Calculate age by trt
fligner.test(age ~ trt, data = df)
##
## Fligner-Killeen test of homogeneity of variances
##
## data: age by trt
## Fligner-Killeen:med chi-squared = 4.0497, df = 1, p-value = 0.04418
Discuss
The result is significant for age (p-value < 0.05). We reject the null hypothesis that the group variances are the same.
=> We have sufficient evidence to say the variance differs significantly across the groups.
=> The assumption of homogeneity of variance is violated.
=> This result agrees with the Levene’s test result. Because age is non-normally distributed, the Fligner-Killeen and Levene’s tests are the suitable approaches.
Example 2: Calculate sbp by trt
fligner.test(sbp ~ trt, data = df)
##
## Fligner-Killeen test of homogeneity of variances
##
## data: sbp by trt
## Fligner-Killeen:med chi-squared = 1.3563, df = 1, p-value = 0.2442
Discuss: The result is similar to that of Levene’s test.
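To wrap up, here is a compact sketch (a hypothetical helper, not taken from the sources below) that runs the four homogeneity-of-variance tests on the same formula so their p-values can be compared at a glance. Note that leveneTest() will coerce a character grouping variable to a factor:
# Hypothetical helper: p-values from all four homogeneity-of-variance tests
compare_var_tests = function(formula, data) {
  list(
    F.test   = var.test(formula, data = data)$p.value, # two groups only
    Bartlett = bartlett.test(formula, data = data)$p.value,
    Levene   = car::leveneTest(formula, data = data)[1, "Pr(>F)"],
    Fligner  = fligner.test(formula, data = data)$p.value
  )
}
compare_var_tests(age ~ trt, data = df)
compare_var_tests(sbp ~ trt, data = df)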
References:
* https://thomaselove.github.io/431-notes/assessing-normality.html#assessing-normality
* Field, A., Miles, J. and Field, Z., 2012. Discovering Statistics Using R. Sage Publications.
* https://saestatsteaching.tech/section-varianceratio#equal-variance-testing---two-groups
* https://www.geeksforgeeks.org/bartletts-test-in-r-programming/
* http://www.sthda.com/english/wiki/compare-multiple-sample-variances-in-r