Comparing Means in R

Comparing a one-sample mean to a known standard mean:

What is a one-sample t-test?

The one-sample t-test is used to compare the mean of one sample to a known standard (or theoretical/hypothetical) mean (μ).

Formula of one-sample t-test

The t-statistic can be calculated as follows:

t = (m − μ)/(s/sqrt(n))

where,

m is the sample mean

n is the sample size

s is the sample standard deviation with n − 1 degrees of freedom

μ is the theoretical value
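
To make the formula concrete, here is a minimal sketch that computes the t-statistic and its two-sided p-value by hand; the names m, s, n and mu0 mirror the notation above, and x holds the example weights generated later in this section, so the result can be checked against t.test().

# Manual one-sample t-test (sketch; x holds the example weights used below)
x   <- c(17.6, 20.6, 22.2, 15.3, 20.9, 21.0, 18.9, 18.9, 18.9, 18.2)
mu0 <- 25                                   # theoretical mean
m   <- mean(x)                              # sample mean
s   <- sd(x)                                # sample standard deviation (n - 1 df)
n   <- length(x)                            # sample size
t_stat <- (m - mu0) / (s / sqrt(n))         # t-statistic
p_val  <- 2 * pt(-abs(t_stat), df = n - 1)  # two-sided p-value
c(t = t_stat, p = p_val)
# Should agree with t.test(x, mu = 25)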

R function to compute one-sample t-test

To perform a one-sample t-test, the R function t.test() can be used as follows:

# t.test(x, mu = 0, alternative = "two.sided")

Import your data into R

Prepare your data as specified here: Best practices for preparing your data set for R

Save your data in an external .txt (tab-delimited) or .csv file

Import your data into R as follows:

# If .txt tab file, use this
# my_data <- read.delim(file.choose())
# Or, if .csv file, use this
# my_data <- read.csv(file.choose())
set.seed(1234)
my_data <- data.frame(
  name = paste0(rep("M_", 10), 1:10),
  weight = round(rnorm(10, 20, 2), 1)
)

Check your data

# Print the first 10 rows of the data
head(my_data, 10)
##    name weight
## 1   M_1   17.6
## 2   M_2   20.6
## 3   M_3   22.2
## 4   M_4   15.3
## 5   M_5   20.9
## 6   M_6   21.0
## 7   M_7   18.9
## 8   M_8   18.9
## 9   M_9   18.9
## 10 M_10   18.2
# Statistical summaries of weight
summary(my_data$weight)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   15.30   18.38   18.90   19.25   20.82   22.20

Min.: the minimum value

1st Qu.: The first quartile. 25% of values are lower than this.

Median: the median value. Half the values are lower; half are higher.

3rd Qu.: the third quartile. 75% of values are lower than this.

Max.: the maximum value

Visualize your data using box plots

library(ggpubr)
## Loading required package: ggplot2
ggboxplot(my_data$weight, 
          ylab = "Weight (g)", xlab = FALSE,
          ggtheme = theme_minimal())

Preliminary test to check one-sample t-test assumptions

Is this a large sample? - No, because n < 30.

Since the sample size is not large enough (n < 30, so we cannot rely on the central limit theorem), we need to check whether the data follow a normal distribution.

Briefly, it’s possible to use the Shapiro-Wilk normality test and to look at the normality plot.

Shapiro-Wilk test:

Null hypothesis: the data are normally distributed

Alternative hypothesis: the data are not normally distributed

shapiro.test(my_data$weight) # => p-value = 0.6993
## 
##  Shapiro-Wilk normality test
## 
## data:  my_data$weight
## W = 0.9526, p-value = 0.6993

Visual inspection of the data normality using Q-Q plots (quantile-quantile plots). A Q-Q plot draws the correlation between a given sample and the normal distribution.

library("ggpubr")
ggqqplot(my_data$weight, ylab = "Men's weight",
         ggtheme = theme_minimal())

Compute one-sample t-test

# One-sample t-test
res <- t.test(my_data$weight, mu = 25)
# Printing the results
res 
## 
##  One Sample t-test
## 
## data:  my_data$weight
## t = -9.0783, df = 9, p-value = 7.953e-06
## alternative hypothesis: true mean is not equal to 25
## 95 percent confidence interval:
##  17.8172 20.6828
## sample estimates:
## mean of x 
##     19.25

The p-value of the test is 7.953e-06, which is less than the significance level alpha = 0.05. We can conclude that the mean weight of the sample is significantly different from the theoretical mean of 25 g.

Accessing the values returned by the t.test() function

The result of the t.test() function is a list containing the following components:

statistic: the value of the t-test statistic

parameter: the degrees of freedom for the t-test statistic

p.value: the p-value for the test

conf.int: a confidence interval for the mean appropriate to the specified alternative hypothesis

estimate: the estimated mean: the mean of x for a one-sample t-test, the means of the two groups for an independent t-test, or the mean difference for a paired t-test

The format of the R code to use for getting these values is as follows:

# printing the p-value
res$p.value
## [1] 7.953383e-06
# printing the mean
res$estimate
## mean of x 
##     19.25
# printing the confidence interval
res$conf.int
## [1] 17.8172 20.6828
## attr(,"conf.level")
## [1] 0.95

Non-parametric test

What is the one-sample Wilcoxon signed-rank test?

The one-sample Wilcoxon signed-rank test is a non-parametric alternative to the one-sample t-test when the data cannot be assumed to be normally distributed. It's used to determine whether the median of the sample is equal to a known standard value (i.e., a theoretical value).

R function to compute one-sample Wilcoxon test

To perform a one-sample Wilcoxon test, the R function wilcox.test() can be used as follows:

# wilcox.test(x, mu = 0, alternative = "two.sided")

Import your data into R

Prepare your data as specified here: Best practices for preparing your data set for R

Save your data in an external .txt (tab-delimited) or .csv file

Import your data into R as follows:

# If .txt tab file, use this
# my_data <- read.delim(file.choose())
# Or, if .csv file, use this
# my_data <- read.csv(file.choose())
set.seed(1234)
my_data <- data.frame(
  name = paste0(rep("M_", 10), 1:10),
  weight = round(rnorm(10, 20, 2), 1)
)

Check your data

# Print the first 10 rows of the data
head(my_data, 10)
##    name weight
## 1   M_1   17.6
## 2   M_2   20.6
## 3   M_3   22.2
## 4   M_4   15.3
## 5   M_5   20.9
## 6   M_6   21.0
## 7   M_7   18.9
## 8   M_8   18.9
## 9   M_9   18.9
## 10 M_10   18.2
# Statistical summaries of weight
summary(my_data$weight)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   15.30   18.38   18.90   19.25   20.82   22.20

Min.: the minimum value

1st Qu.: The first quartile. 25% of values are lower than this.

Median: the median value. Half the values are lower; half are higher.

3rd Qu.: the third quartile. 75% of values are lower than this.

Max.: the maximum value

Visualize your data using box plots

library(ggpubr)
ggboxplot(my_data$weight, 
          ylab = "Weight (g)", xlab = FALSE,
          ggtheme = theme_minimal())

Compute one-sample Wilcoxon test

# One-sample Wilcoxon test
res <- wilcox.test(my_data$weight, mu = 25)
## Warning in wilcox.test.default(my_data$weight, mu = 25): cannot compute exact
## p-value with ties
# Printing the results
res  
## 
##  Wilcoxon signed rank test with continuity correction
## 
## data:  my_data$weight
## V = 0, p-value = 0.005793
## alternative hypothesis: true location is not equal to 25
# print only the p-value
res$p.value
## [1] 0.005793045

Comparing the means of two independent groups:

What is the unpaired two-samples t-test?

The unpaired two-samples t-test is used to compare the mean of two independent groups.

Formula of unpaired two-samples t-test

1. Classical t-test:

t = (mA − mB)/sqrt(S2/nA + S2/nB)

mA and mB represent the mean values of groups A and B, respectively.

nA and nB represent the sizes of groups A and B, respectively.

S2 is an estimator of the pooled variance of the two groups. It can be calculated as follows (the first sum is taken over group A and the second over group B):

S2 = (sum((x − mA)^2) + sum((x − mB)^2))/(nA + nB − 2)

with degrees of freedom (df): df = nA + nB − 2

2. Welch t-statistic:

t = (mA − mB)/sqrt(SA^2/nA + SB^2/nB)

where SA^2 and SB^2 are the variances of groups A and B, respectively.
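
As a minimal sketch of the classical (pooled-variance) formula above, the t-statistic can be computed by hand and checked against t.test(..., var.equal = TRUE); the two vectors used here are the women/men weights defined later in this section.

# Manual pooled-variance two-samples t-test (sketch)
xA <- c(38.9, 61.2, 73.3, 21.8, 63.4, 64.6, 48.4, 48.8, 48.5)  # group A (women)
xB <- c(67.8, 60, 63.4, 76, 89.4, 73.3, 67.3, 61.3, 62.4)      # group B (men)
mA <- mean(xA); mB <- mean(xB)               # group means
nA <- length(xA); nB <- length(xB)           # group sizes
S2 <- (sum((xA - mA)^2) + sum((xB - mB)^2)) / (nA + nB - 2)    # pooled variance
t_stat <- (mA - mB) / sqrt(S2/nA + S2/nB)    # t-statistic
df     <- nA + nB - 2                        # degrees of freedom
p_val  <- 2 * pt(-abs(t_stat), df = df)      # two-sided p-value
c(t = t_stat, df = df, p = p_val)
# Should agree with t.test(xA, xB, var.equal = TRUE)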

Visualize your data and compute unpaired two-samples t-test in R

R function to compute unpaired two-samples t-test

To perform a two-samples t-test comparing the means of two independent samples (x & y), the R function t.test() can be used as follows:

#t.test(x, y, alternative = "two.sided", var.equal = FALSE)

Import your data into R

Prepare your data as specified here: Best practices for preparing your data set for R

Save your data in an external .txt (tab-delimited) or .csv file

Import your data into R as follows:

# If .txt tab file, use this
# my_data <- read.delim(file.choose())
# Or, if .csv file, use this
# my_data <- read.csv(file.choose())

Here, we’ll use an example data set, which contains the weight of 18 individuals (9 women and 9 men):

# Data in two numeric vectors
women_weight <- c(38.9, 61.2, 73.3, 21.8, 63.4, 64.6, 48.4, 48.8, 48.5)
men_weight <- c(67.8, 60, 63.4, 76, 89.4, 73.3, 67.3, 61.3, 62.4) 
# Create a data frame
my_data <- data.frame( 
                group = rep(c("Woman", "Man"), each = 9),
                weight = c(women_weight,  men_weight)
                )
# Check your data
# Print all data
print(my_data)
##    group weight
## 1  Woman   38.9
## 2  Woman   61.2
## 3  Woman   73.3
## 4  Woman   21.8
## 5  Woman   63.4
## 6  Woman   64.6
## 7  Woman   48.4
## 8  Woman   48.8
## 9  Woman   48.5
## 10   Man   67.8
## 11   Man   60.0
## 12   Man   63.4
## 13   Man   76.0
## 14   Man   89.4
## 15   Man   73.3
## 16   Man   67.3
## 17   Man   61.3
## 18   Man   62.4

Compute summary statistics by groups:

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
group_by(my_data, group) %>%
  summarise(
    count = n(),
    mean = mean(weight, na.rm = TRUE),
    sd = sd(weight, na.rm = TRUE)
  )
## # A tibble: 2 × 4
##   group count  mean    sd
##   <chr> <int> <dbl> <dbl>
## 1 Man       9  69.0  9.38
## 2 Woman     9  52.1 15.6

Visualize your data using box plots

# Plot weight by group and color by group
library("ggpubr")
ggboxplot(my_data, x = "group", y = "weight", 
          color = "group", palette = c("#00AFBB", "#E7B800"),
        ylab = "Weight", xlab = "Groups")

Preliminary test to check independent t-test assumptions

# Shapiro-Wilk normality test for Men's weights
with(my_data, shapiro.test(weight[group == "Man"]))# p = 0.1
## 
##  Shapiro-Wilk normality test
## 
## data:  weight[group == "Man"]
## W = 0.86425, p-value = 0.1066
# Shapiro-Wilk normality test for Women's weights
with(my_data, shapiro.test(weight[group == "Woman"])) # p = 0.6
## 
##  Shapiro-Wilk normality test
## 
## data:  weight[group == "Woman"]
## W = 0.94266, p-value = 0.6101

We'll use an F-test to check the homogeneity of the variances. This can be performed with the function var.test() as follows:

res.ftest <- var.test(weight ~ group, data = my_data)
res.ftest
## 
##  F test to compare two variances
## 
## data:  weight by group
## F = 0.36134, num df = 8, denom df = 8, p-value = 0.1714
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
##  0.08150656 1.60191315
## sample estimates:
## ratio of variances 
##          0.3613398

The p-value of the F-test is 0.1714, which is greater than the significance level 0.05. There is no significant difference between the variances of the two groups, so we can use the classical t-test, which assumes equal variances (var.equal = TRUE).

Compute unpaired two-samples t-test

1) Compute independent t-test - Method 1: The data are saved in two different numeric vectors.

# Compute t-test
res <- t.test(women_weight, men_weight, var.equal = TRUE)
res
## 
##  Two Sample t-test
## 
## data:  women_weight and men_weight
## t = -2.7842, df = 16, p-value = 0.01327
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -29.748019  -4.029759
## sample estimates:
## mean of x mean of y 
##  52.10000  68.98889

2) Compute independent t-test - Method 2: The data are saved in a data frame.

# Compute t-test
res <- t.test(weight ~ group, data = my_data, var.equal = TRUE)
res
## 
##  Two Sample t-test
## 
## data:  weight by group
## t = 2.7842, df = 16, p-value = 0.01327
## alternative hypothesis: true difference in means between group Man and group Woman is not equal to 0
## 95 percent confidence interval:
##   4.029759 29.748019
## sample estimates:
##   mean in group Man mean in group Woman 
##            68.98889            52.10000

Interpretation of the result

The p-value of the test is 0.01327, which is less than the significance level alpha = 0.05. We can conclude that men’s average weight is significantly different from women’s average weight with a p-value = 0.01327.

Accessing the values returned by the t.test() function

The result of the t.test() function is a list containing the following components:

statistic: the value of the t-test statistic

parameter: the degrees of freedom for the t-test statistic

p.value: the p-value for the test

conf.int: a confidence interval for the mean appropriate to the specified alternative hypothesis.

estimate: the means of the two groups being compared (in the case of independent t test) or difference in means (in the case of paired t test).

The format of the R code to use for getting these values is as follows:

# printing the p-value
res$p.value
## [1] 0.0132656
# printing the mean
res$estimate
##   mean in group Man mean in group Woman 
##            68.98889            52.10000
# printing the confidence interval
res$conf.int
## [1]  4.029759 29.748019
## attr(,"conf.level")
## [1] 0.95

Unpaired Two-Samples Wilcoxon Test in R


The unpaired two-samples Wilcoxon test (also known as Wilcoxon rank sum test or Mann-Whitney test) is a non-parametric alternative to the unpaired two-samples t-test, which can be used to compare two independent groups of samples. It’s used when your data are not normally distributed.

Visualize your data and compute Wilcoxon test in R

R function to compute Wilcoxon test

To perform a two-samples Wilcoxon test comparing two independent samples (x & y), the R function wilcox.test() can be used as follows:

# wilcox.test(x, y, alternative = "two.sided")

x,y: numeric vectors

alternative: the alternative hypothesis. Allowed value is one of “two.sided” (default), “greater” or “less”.

Import your data into R

Prepare your data as specified here: Best practices for preparing your data set for R

Save your data in an external .txt (tab-delimited) or .csv file

Import your data into R as follows:

# If .txt tab file, use this
# my_data <- read.delim(file.choose())
# Or, if .csv file, use this
# my_data <- read.csv(file.choose())

Here, we’ll use an example data set, which contains the weight of 18 individuals (9 women and 9 men):

# Data in two numeric vectors
women_weight <- c(38.9, 61.2, 73.3, 21.8, 63.4, 64.6, 48.4, 48.8, 48.5)
men_weight <- c(67.8, 60, 63.4, 76, 89.4, 73.3, 67.3, 61.3, 62.4) 
# Create a data frame
my_data <- data.frame( 
                group = rep(c("Woman", "Man"), each = 9),
                weight = c(women_weight,  men_weight)
                )

Check your data

print(my_data)
##    group weight
## 1  Woman   38.9
## 2  Woman   61.2
## 3  Woman   73.3
## 4  Woman   21.8
## 5  Woman   63.4
## 6  Woman   64.6
## 7  Woman   48.4
## 8  Woman   48.8
## 9  Woman   48.5
## 10   Man   67.8
## 11   Man   60.0
## 12   Man   63.4
## 13   Man   76.0
## 14   Man   89.4
## 15   Man   73.3
## 16   Man   67.3
## 17   Man   61.3
## 18   Man   62.4

Compute summary statistics by groups:

library(dplyr)
group_by(my_data, group) %>%
  summarise(
    count = n(),
    median = median(weight, na.rm = TRUE),
    IQR = IQR(weight, na.rm = TRUE)
  )
## # A tibble: 2 × 4
##   group count median   IQR
##   <chr> <int>  <dbl> <dbl>
## 1 Man       9   67.3  10.9
## 2 Woman     9   48.8  15

Visualize your data:

# Plot weight by group and color by group
library("ggpubr")
ggboxplot(my_data, x = "group", y = "weight", 
          color = "group", palette = c("#00AFBB", "#E7B800"),
          ylab = "Weight", xlab = "Groups")

Compute unpaired two-samples Wilcoxon test

1) Compute two-samples Wilcoxon test - Method 1: The data are saved in two different numeric vectors.

res <- wilcox.test(women_weight, men_weight)
## Warning in wilcox.test.default(women_weight, men_weight): cannot compute exact
## p-value with ties
res
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  women_weight and men_weight
## W = 15, p-value = 0.02712
## alternative hypothesis: true location shift is not equal to 0

2) Compute two-samples Wilcoxon test - Method 2: The data are saved in a data frame.

res <- wilcox.test(weight ~ group, data = my_data,
                   exact = FALSE)
res
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  weight by group
## W = 66, p-value = 0.02712
## alternative hypothesis: true location shift is not equal to 0
# Print the p-value only
res$p.value
## [1] 0.02711657

Comparing the means of paired samples:

What is the paired samples t-test?

The paired samples t-test is used to compare the means of two related groups of samples. In this case, you have two values (i.e., a pair of values) for the same samples. This section describes how to compute the paired samples t-test using R.

Statistical hypotheses

In statistics, we can define the corresponding null hypotheses (H0) as follows, where m is the mean of the differences between the paired values:

H0: m = 0

H0: m <= 0

H0: m >= 0

The corresponding alternative hypotheses (Ha) are, respectively:

Ha: m != 0 (different)

Ha: m > 0 (greater)

Ha: m < 0 (less)
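
These hypotheses map directly onto the alternative argument of t.test(). A minimal sketch, using a small illustrative subset of the mouse weights introduced later in this section:

# Mapping the alternative hypotheses to the alternative argument (sketch)
x <- c(200.1, 190.9, 192.7, 213.0, 241.4)  # e.g., weights before treatment
y <- c(392.9, 393.2, 345.1, 393.0, 434.0)  # e.g., weights after treatment
t.test(x, y, paired = TRUE, alternative = "two.sided")  # Ha: m != 0
t.test(x, y, paired = TRUE, alternative = "less")       # Ha: m < 0
t.test(x, y, paired = TRUE, alternative = "greater")    # Ha: m > 0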

Formula of paired samples t-test

The t-test statistic can be calculated using the following formula:

t = m/(s/sqrt(n))

where,

m is the mean of the differences d between the paired values

n is the sample size (i.e., the number of pairs, the length of d)

s is the standard deviation of d

We can compute the p-value corresponding to the absolute value of the t-test statistic (|t|) for the degrees of freedom (df): df = n − 1.
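
As a minimal sketch of this formula, the paired t-statistic can be computed by hand from the differences d; the before/after vectors used here are the mouse weights defined later in this section, so the result can be checked against t.test(..., paired = TRUE).

# Manual paired t-test (sketch)
before <- c(200.1, 190.9, 192.7, 213, 241.4, 196.9, 172.2, 185.5, 205.2, 193.7)
after  <- c(392.9, 393.2, 345.1, 393, 434, 427.9, 422, 383.9, 392.3, 352.2)
d <- before - after                         # paired differences
m <- mean(d)                                # mean of the differences
s <- sd(d)                                  # standard deviation of the differences
n <- length(d)                              # number of pairs
t_stat <- m / (s / sqrt(n))                 # t-statistic
p_val  <- 2 * pt(-abs(t_stat), df = n - 1)  # two-sided p-value
c(t = t_stat, p = p_val)
# Should agree with t.test(before, after, paired = TRUE)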

Visualize your data and compute paired t-test in R

R function to compute paired t-test

To perform a paired samples t-test comparing the means of two paired samples (x & y), the R function t.test() can be used as follows:

# t.test(x, y, paired = TRUE, alternative = "two.sided")

Import your data into R

1. Prepare your data as specified here: Best practices for preparing your data set for R

2. Save your data in an external .txt (tab-delimited) or .csv file

3. Import your data into R as follows:

# If .txt tab file, use this
# my_data <- read.delim(file.choose())
# Or, if .csv file, use this
# my_data <- read.csv(file.choose())

Here, we’ll use an example data set, which contains the weight of 10 mice before and after the treatment.

# Data in two numeric vectors
# ++++++++++++++++++++++++++
# Weight of the mice before treatment
before <-c(200.1, 190.9, 192.7, 213, 241.4, 196.9, 172.2, 185.5, 205.2, 193.7)
# Weight of the mice after treatment
after <-c(392.9, 393.2, 345.1, 393, 434, 427.9, 422, 383.9, 392.3, 352.2)
# Create a data frame
my_data <- data.frame( 
                group = rep(c("before", "after"), each = 10),
                weight = c(before,  after)
                )

Check your data

# Print all data
print(my_data)
##     group weight
## 1  before  200.1
## 2  before  190.9
## 3  before  192.7
## 4  before  213.0
## 5  before  241.4
## 6  before  196.9
## 7  before  172.2
## 8  before  185.5
## 9  before  205.2
## 10 before  193.7
## 11  after  392.9
## 12  after  393.2
## 13  after  345.1
## 14  after  393.0
## 15  after  434.0
## 16  after  427.9
## 17  after  422.0
## 18  after  383.9
## 19  after  392.3
## 20  after  352.2

Compute summary statistics by groups:

library("dplyr")
group_by(my_data, group) %>%
  summarise(
    count = n(),
    mean = mean(weight, na.rm = TRUE),
    sd = sd(weight, na.rm = TRUE)
  )
## # A tibble: 2 × 4
##   group  count  mean    sd
##   <chr>  <int> <dbl> <dbl>
## 1 after     10  394.  29.4
## 2 before    10  199.  18.5

Visualize your data using box plots

To use R base graphs read this: R base graphs. Here, we’ll use the ggpubr R package for an easy ggplot2-based data visualization.

# Visualize your data:
# Plot weight by group and color by group
library("ggpubr")
ggboxplot(my_data, x = "group", y = "weight", 
          color = "group", palette = c("#00AFBB", "#E7B800"),
          order = c("before", "after"),
          ylab = "Weight", xlab = "Groups")


Box plots show you the increase, but lose the paired information. You can use the paired() and plot() functions [in the PairedData package] to plot paired data (a "before - after" plot), as shown below.

Plot paired data:

# Subset weight data before treatment
before <- subset(my_data,  group == "before", weight,
                 drop = TRUE)

# subset weight data after treatment
after <- subset(my_data,  group == "after", weight,
                 drop = TRUE)

Plot paired data

library(PairedData)
## Loading required package: MASS
## 
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
## 
##     select
## Loading required package: gld
## Loading required package: mvtnorm
## Loading required package: lattice
## 
## Attaching package: 'PairedData'
## The following object is masked from 'package:base':
## 
##     summary
pd <- paired(before, after)
plot(pd, type = "profile") + theme_bw()

Preliminary test to check paired t-test assumptions

Assumption 1: Are the two samples paired?

Yes, since the data were collected by measuring the weight of the same mice twice.

Assumption 2: Is this a large sample?

No, because n < 30. Since the sample size is not large enough (less than 30), we need to check whether the differences of the pairs follow a normal distribution.

How to check the normality?

Use Shapiro-Wilk normality test as described at: Normality Test in R.

Null hypothesis: the data are normally distributed

Alternative hypothesis: the data are not normally distributed

# compute the difference
d <- with(my_data, 
        weight[group == "before"] - weight[group == "after"])
# Shapiro-Wilk normality test for the differences
shapiro.test(d) # => p-value = 0.6141
## 
##  Shapiro-Wilk normality test
## 
## data:  d
## W = 0.94536, p-value = 0.6141

Compute paired samples t-test

Question: Is there any significant change in the weight of the mice after treatment?

1) Compute paired t-test - Method 1: The data are saved in two different numeric vectors.

# Compute t-test
res <- t.test(before, after, paired = TRUE)
res
## 
##  Paired t-test
## 
## data:  before and after
## t = -20.883, df = 9, p-value = 6.2e-09
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -215.5581 -173.4219
## sample estimates:
## mean difference 
##         -194.49

2) Compute paired t-test - Method 2: The data are saved in a data frame.

# Compute t-test
res <- t.test(weight ~ group, data = my_data, paired = TRUE)
res
## 
##  Paired t-test
## 
## data:  weight by group
## t = 20.883, df = 9, p-value = 6.2e-09
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  173.4219 215.5581
## sample estimates:
## mean difference 
##          194.49

Interpretation of the result

The p-value of the test is 6.2e-09, which is less than the significance level alpha = 0.05. We can then reject the null hypothesis and conclude that the average weight of the mice before treatment is significantly different from the average weight after treatment.

Accessing the values returned by the t.test() function

The result of the t.test() function is a list containing the following components:

statistic: the value of the t-test statistic

parameter: the degrees of freedom for the t-test statistic

p.value: the p-value for the test

conf.int: a confidence interval for the mean appropriate to the specified alternative hypothesis.

estimate: the means of the two groups being compared (in the case of independent t test) or difference in means (in the case of paired t test).

The format of the R code to use for getting these values is as follows:

# printing the p-value
res$p.value
## [1] 6.200298e-09
# printing the mean
res$estimate
## mean difference 
##          194.49
# printing the confidence interval
res$conf.int
## [1] 173.4219 215.5581
## attr(,"conf.level")
## [1] 0.95