Assess the strength of evidence for/against a hypothesis; evaluate the data
Inferential statistical methods divide into 2 categories.
Hypothesis Testing: Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is most often used by scientists to test specific predictions, called hypotheses, that arise from theories.
Model Fitting: Model fitting measures how well a statistical model generalizes to data similar to that on which it was trained. A well-fitted model produces more accurate outcomes.
The process of drawing conclusions about population parameters based on a sample taken from the population.
Proposed explanation for a phenomenon.
A hypothesis is an educated guess about something in the world around you. It should be testable, either by experiment or observation.
Proposed explanation
Objectively testable
Singular - hypothesis
Plural - hypotheses
Examples
“If I…(do this to an independent variable)….then (this will happen to the dependent variable).”
Example
A good hypothesis statement should:
A p-value is a statistical measurement used to evaluate a hypothesis against observed data.
A p-value measures the probability of obtaining the observed results, assuming that the null hypothesis is true.
The lower the p-value, the greater the statistical significance of the observed difference.
If p-value > alpha: Fail to reject the null hypothesis (i.e. not significant result).
If p-value <= alpha: Reject the null hypothesis (i.e. significant result).
A commonly used significance level is α = 0.05 (5%).
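In R, this decision rule can be written out directly; the sketch below uses a hypothetical p_value of 0.03 purely to illustrate the comparison against alpha.
# Decision rule sketch (p_value is a hypothetical placeholder, not from a real test)
alpha <- 0.05
p_value <- 0.03
if(p_value > alpha){
print("Fail to reject the null hypothesis (the result is not significant)")
} else {
print("Reject the null hypothesis (the result is significant)")
}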
Statistical tests are either parametric or non-parametric tests:
Situation | Test |
---|---|
1 categorical variable | 1 sample proportion test |
2 categorical variables | chi squared test |
1 numeric variable | t-test |
1 numeric and 1 categorical variable | t-test or ANOVA |
1 numeric and 1 categorical variable with more than 2 groups | ANOVA |
2 numeric variables | correlation test |
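As a rough guide, these situations map onto the R functions demonstrated later in this material (an approximate mapping, not an exhaustive list):
# Situation                                -> R function
# 1 categorical variable                   -> prop.test()
# 2 categorical variables                  -> chisq.test() / fisher.test()
# 1 numeric variable                       -> t.test(x, mu = ...)
# 1 numeric + 1 categorical (2 groups)     -> t.test(x, y)
# 1 numeric + 1 categorical (> 2 groups)   -> aov()
# 2 numeric variables                      -> cor.test()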
The first question we need to ask is whether we are dealing with bivariate analysis or multivariate analysis.
Bivariate analysis: studying the relationship between two variables. For example:
Multivariate (regression modelling/analysis): studying the effect of multiple variables on an outcome variable. For example:
If we are doing bivariate analysis, we have to ask if we are studying a difference or a correlation.
Difference: to study the difference between two or more groups, or two or more conditions. For example:
If we are doing bivariate analysis, we have to ask if we are working with independent data or paired data.
Independent (unpaired): the observations in each sample are not related; there is no relationship between the subjects in each sample.
Dependent (Paired): paired samples include:
Whatever the analysis we are doing, it is important to identify the types of data variables we are studying. The type of data variables is very important in choosing the suitable test. The following chart helps to distinguish between different types of data variables.
Time to event data (survival data): This is a special data type that will be discussed in survival analysis.
It is important to ask if we are comparing two groups (conditions) or more than two groups (conditions). For example:
- Are we comparing two groups (diseased, not diseased), or three groups (normal, osteopenia, osteoporosis)?
- Are we comparing two conditions (pre-test, post-test), or three conditions (before the operation, during the operation, after the operation)?
It is important before doing some statistical tests to determine if a numeric variable is normally distributed or not.
This histogram shows a normally distributed variable.
For some tests, the data needs to be approximately normally distributed.
How to test for normality?
1. Plotting a histogram or QQ plot
2. Using a statistical test
The statistical tests for normality are the Shapiro-Wilk and Kolmogorov-Smirnov tests. We usually do both the graph and the statistical tests.
The hypotheses of the Shapiro-Wilk and Kolmogorov-Smirnov tests:
H0: the variable is normally distributed
H1: the variable is not normally distributed
Homogeneity of variances (similar standard deviations) means that the variable we are studying has the same variance across groups.
We need to test for the equality of variances between groups when using some statistical tests, e.g. Independent t-tests and one-way ANOVA.
Homogeneity of variances is tested using Levene’s test.
Interpretation of the test result: If the p-value is < 0.05 reject H0 and conclude that the assumption of equal variances has not been met.
We fail to reject the null hypothesis (i.e. conclude that the variances are equal) if the p-value is > 0.05.
If the homogeneity of variance assumption was not met, the standard tests cannot be done, and modified tests can be used (will be discussed with the relevant tests).
# Load packages
library(tidyverse)
library(ggplot2)
library(ggpubr)
library(gridExtra)
library(gtsummary)
library(gt)
library(datasets)
Tests whether a data sample has a Gaussian distribution.
Assumptions
Observations in each sample are independent and identically distributed (iid).
Interpretation
- H0: the sample has a Gaussian/normal distribution.
- Ha: the sample does not have a Gaussian/normal distribution.
# Normality Test in R
data <- read.csv("data/500_Person_Gender_Height_Weight_Index.csv")
# examine first few rows
head(data)
Gender Height Weight Index
1 Male 174 96 4
2 Male 189 87 2
3 Female 185 110 4
4 Female 195 104 3
5 Male 149 61 3
6 Male 189 104 3
# Check Distribution of Height
gghistogram(data, x = "Height", add = "mean", fill = "#003f5c")
ggqqplot(data, x = "Height")
# Normality Test
shapiro.test(data$Height)
Shapiro-Wilk normality test
data: data$Height
W = 0.96065, p-value = 2.665e-10
# Interpretation
test <- shapiro.test(data$Height)
# Set the significance level
alpha = 0.05
if(test$p.value > alpha){
print("The sample has a Gaussian/normal distribution(Fail to reject the null hypothesis, the result is not significant)")
} else {
print("The sample does not have a Gaussian/normal distribution(Reject the null hypothesis, the result is significant)")
}
[1] "The sample does not have a Gaussian/normal distribution(Reject the null hypothesis, the result is significant)"
Interpretation of the result: If the p-value is < 0.05 (or another chosen significance level), then there is evidence that the sample does not have a Gaussian/normal distribution.
Reporting significant results: A Shapiro-Wilk test was used to check whether the data sample has a Gaussian distribution. A significant result was found (the sample does not have a Gaussian/normal distribution; p-value < 0.05).
Reporting non-significant results: A Shapiro-Wilk test was used to check whether the data sample has a Gaussian distribution. No significant result was found (the sample has a Gaussian/normal distribution; p > 0.05).
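The Kolmogorov-Smirnov test mentioned earlier can be run in a similar way. A minimal sketch is shown below; note that estimating the mean and SD from the same sample, and the ties present in Height, make this version of the test only approximate.
# Kolmogorov-Smirnov test against a normal distribution with estimated parameters
ks.test(data$Height, "pnorm", mean = mean(data$Height), sd = sd(data$Height))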
Tests whether two or more samples have equal variances.
Assumptions
Observations in each sample are independent and identically distributed (iid).
Interpretation
- H0: the variances of the samples are equal.
- Ha: the variances of the samples are not equal.
library(car)
leveneTest(Height ~ Gender, data = data)
Levene's Test for Homogeneity of Variance (center = median)
Df F value Pr(>F)
group 1 3.2219 0.07327 .
498
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
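Mirroring the interpretation block used for the Shapiro-Wilk test, the p-value returned by leveneTest() can also be checked programmatically (a sketch; the p-value sits in the first row of the Pr(>F) column):
# Interpretation
lev_test <- leveneTest(Height ~ Gender, data = data)
alpha <- 0.05
if(lev_test$`Pr(>F)`[1] > alpha){
print("The samples have equal variances (Fail to reject H0, the result is not significant)")
} else {
print("The samples do not have equal variances (Reject H0, the result is significant)")
}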
Interpretation of the result: If the p-value is < 0.05 (or another chosen significance level), then there is evidence that the samples do not have equal variances.
Reporting significant results: Levene's test was used to check whether the samples have equal variances. A significant result was found (the samples do not have equal variances; p-value < 0.05).
Reporting non-significant results: Levene's test was used to check whether the samples have equal variances. No significant result was found (the samples have equal variances; p > 0.05).
Is there a difference in the number of men and women in the population?
# Frequency Table
table(data$Gender)
Female Male
255 245
# Proportion Table
prop.table(table(data$Gender))
Female Male
0.51 0.49
# 1 Sample proportion test
x <- table(data$Gender)
prop.test(x)
1-sample proportions test with continuity correction
data: x, null probability 0.5
X-squared = 0.162, df = 1, p-value = 0.6873
alternative hypothesis: true p is not equal to 0.5
95 percent confidence interval:
0.4652797 0.5545644
sample estimates:
p
0.51
prop_test <- prop.test(x, p = 0.5)
alpha <- 0.05
if(prop_test$p.value > alpha){
print("There is no difference. (Fail to reject H0, the result is not significant)")
} else{
print("There is a difference. (Reject H0, the result is significant)")
}
[1] "There is no difference. (Fail to reject H0, the result is not significant)"
Interpretation of the result: If p-value < 0.05 (or another chosen significance level), then there is evidence that there is a difference in the number of men and women in the population.
Reporting significant results: A proportion test was used to check whether there is a difference in the number of men and women in the population. A significant result was found (there is a difference in the number of men and women in the population; p-value < 0.05).
Reporting non-significant results: A proportion test was used to check whether there is a difference in the number of men and women in the population. No significant result was found (there is no difference in the number of men and women in the population; p > 0.05).
The unpaired t-test also has two categories: Student's t-test (equal variances) and Welch's t-test (unequal variances).
# One sample t-test
t.test(data$Height)
One Sample t-test
data: data$Height
t = 232.06, df = 499, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
168.5052 171.3828
sample estimates:
mean of x
169.944
# One sample t-test, set the value of mu
t.test(data$Height, mu=169)
One Sample t-test
data: data$Height
t = 1.289, df = 499, p-value = 0.198
alternative hypothesis: true mean is not equal to 169
95 percent confidence interval:
168.5052 171.3828
sample estimates:
mean of x
169.944
# One sample t-test is two tailed test by default
t.test(data$Height, mu=169, alternative = "two.sided")
One Sample t-test
data: data$Height
t = 1.289, df = 499, p-value = 0.198
alternative hypothesis: true mean is not equal to 169
95 percent confidence interval:
168.5052 171.3828
sample estimates:
mean of x
169.944
# To perform one tailed, upper tailed test
t.test(data$Height, mu=169, alternative = "greater")
One Sample t-test
data: data$Height
t = 1.289, df = 499, p-value = 0.09899
alternative hypothesis: true mean is greater than 169
95 percent confidence interval:
168.7372 Inf
sample estimates:
mean of x
169.944
# To perform one tailed, lower tailed test
t.test(data$Height, mu=169, alternative = "less")
One Sample t-test
data: data$Height
t = 1.289, df = 499, p-value = 0.901
alternative hypothesis: true mean is less than 169
95 percent confidence interval:
-Inf 171.1508
sample estimates:
mean of x
169.944
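For consistency with the other tests in this section, the one-sample t-test can also be interpreted programmatically; a minimal sketch reusing the mu = 169 test above:
# Interpretation of the one-sample t-test (mu = 169)
one_t <- t.test(data$Height, mu = 169)
alpha <- 0.05
if(one_t$p.value > alpha){
print("The mean does not differ from 169 (Fail to reject H0, the result is not significant)")
} else {
print("The mean differs from 169 (Reject H0, the result is significant)")
}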
Example:
anorexia <- read.csv("data/anorexia.csv")
head(anorexia)
X Treat Prewt Postwt
1 1 Cont 80.7 80.2
2 2 Cont 89.4 80.1
3 3 Cont 91.8 86.4
4 4 Cont 74.0 86.3
5 5 Cont 78.1 76.1
6 6 Cont 88.3 78.1
x <- subset(anorexia, Treat == "Cont", Prewt, drop =TRUE)
y <- subset(anorexia, Treat == "Cont", Postwt, drop=TRUE)
# Perform paired t-test
t.test(x, y, paired = TRUE)
Paired t-test
data: x and y
t = 0.28723, df = 25, p-value = 0.7763
alternative hypothesis: true mean difference is not equal to 0
95 percent confidence interval:
-2.776708 3.676708
sample estimates:
mean difference
0.45
# Import data
us_mortality = read.csv("data/USRegionalMortality.csv")
head(us_mortality)
X Region Status Sex Cause Rate SE
1 5 HHS Region 01 Urban Male Heart disease 188.2 1.0
2 6 HHS Region 01 Rural Male Heart disease 199.1 2.6
3 7 HHS Region 01 Urban Female Heart disease 115.1 0.6
4 8 HHS Region 01 Rural Female Heart disease 124.5 1.7
5 9 HHS Region 02 Urban Male Heart disease 226.8 0.8
6 10 HHS Region 02 Rural Male Heart disease 248.8 3.3
# Filtering Data
x <- us_mortality %>%
filter(Cause == "Heart disease" & Sex == "Male")
y <- us_mortality %>%
filter(Cause == "Heart disease" & Sex == "Female")
plot(x$Rate, y$Rate)
# Student's t-test
t.test(x$Rate, y$Rate, var.equal = TRUE)
Two Sample t-test
data: x$Rate and y$Rate
t = 9.9475, df = 38, p-value = 3.951e-12
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
63.53616 96.00384
sample estimates:
mean of x mean of y
216.60 136.83
# Student's t-test
std_test <- t.test(x$Rate, y$Rate, var.equal = TRUE)
alpha = 0.05
if(std_test$p.value > alpha) {
print("The means are equal(Fail to reject H0, the result is not significant)")
} else{
print("The means are not equal(Reject H0, the result is significant)")
}
[1] "The means are not equal(Reject H0, the result is significant)"
# Welch's t-test
t.test(x$Rate, y$Rate)
Welch Two Sample t-test
data: x$Rate and y$Rate
t = 9.9475, df = 35.141, p-value = 9.31e-12
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
63.49267 96.04733
sample estimates:
mean of x mean of y
216.60 136.83
welch_test <- t.test(x$Rate, y$Rate)
alpha = 0.05
if(welch_test$p.value > alpha) {
print("The means are equal(Fail to reject H0, the result is not significant)")
} else{
print("The means are not equal(Reject H0, the result is significant)")
}
[1] "The means are not equal(Reject H0, the result is significant)"
# One sample t-test (mu is the known/hypothesised mean, a numeric value)
t.test(x, mu = known_mean)
# Two dependent (paired) samples (Paired t-test)
t.test(x, y, paired = TRUE)
# Two independent samples (Student's t-test)
t.test(x, y, var.equal = TRUE)
# Two independent samples (Welch's t-test)
t.test(x, y)
Also known as Pearson's chi-squared test (test of independence).
The null hypothesis is that no relationship exists between the variables (the variables are independent).
A contingency table is a table with at least two rows and two columns (2x2), and it is used to present categorical data in terms of frequency counts.
If the sample size is small, we have to use Fisher's Exact Test.
Fisher's Exact Test is similar to the chi-squared test, but it is used for small samples.
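As a minimal illustration with made-up counts (not taken from the data used below), Fisher's Exact Test can be applied directly to a small 2x2 table:
# Fisher's Exact Test on a small, hypothetical 2x2 contingency table
small_table <- matrix(c(8, 2, 1, 5), nrow = 2,
                      dimnames = list(Exposure = c("yes", "no"),
                                      Outcome  = c("yes", "no")))
fisher.test(small_table)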
migraine_data <- read.csv("data/KosteckiDillon.csv")
head(migraine_data)
X id time dos hatype age airq medication headache sex
1 1 1 -11 753 Aura 30 9 continuing yes female
2 2 1 -10 754 Aura 30 7 continuing yes female
3 3 1 -9 755 Aura 30 10 continuing yes female
4 4 1 -8 756 Aura 30 13 continuing yes female
5 5 1 -7 757 Aura 30 18 continuing yes female
6 6 1 -6 758 Aura 30 19 continuing yes female
table <- table(migraine_data$sex, migraine_data$headache)
table
no yes
female 1266 2279
male 220 387
chisq.test(table)
Pearson's Chi-squared test with Yates' continuity correction
data: table
X-squared = 0.042688, df = 1, p-value = 0.8363
chisq.test(migraine_data$sex, migraine_data$headache)
Pearson's Chi-squared test with Yates' continuity correction
data: migraine_data$sex and migraine_data$headache
X-squared = 0.042688, df = 1, p-value = 0.8363
ch_test <- chisq.test(table)
alpha = 0.05
if(ch_test$p.value > alpha) {
print("Independent (Fail to reject H0, the result is not significant)")
} else{
print("Dependent (Reject H0, the result is significant)")
}
[1] "Independent (Fail to reject H0, the result is not significant)"
fisher.test(migraine_data$sex, migraine_data$medication)
Fisher's Exact Test for Count Data
data: migraine_data$sex and migraine_data$medication
p-value < 2.2e-16
alternative hypothesis: two.sided
fs_test <- fisher.test(migraine_data$sex, migraine_data$medication)
alpha = 0.05
if(fs_test$p.value > alpha) {
print("Independent (Fail to reject H0, the result is not significant)")
} else{
print("Dependent (Reject H0, the result is significant)")
}
[1] "Dependent (Reject H0, the result is significant)"
Correlation measures whether greater values of one variable correspond to greater values of the other. It is scaled to always lie between +1 and −1.
Pearson’s Test | Spearman’s Test |
---|---|
Parametric correlation | Non-parametric |
Linear relationship | Non-linear relationship |
Continuous variables | Continuous or ordinal variables |
Proportional change | Change not at constant rate |
# Import Iris Dataset
data(iris)
head(iris)
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3.0 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5.0 3.6 1.4 0.2 setosa
6 5.4 3.9 1.7 0.4 setosa
# Drop the Species column so that only numeric columns remain for the correlation
iris$Species <- NULL
# Calculate correlation
cor(iris$Sepal.Length, iris$Sepal.Width)
[1] -0.1175698
cor(iris$Petal.Length, iris$Petal.Width)
[1] 0.9628654
# Calculate correlation using Spearman method
cor(iris$Sepal.Length, iris$Sepal.Width, method = "spearman")
[1] -0.1667777
cor(iris$Petal.Length, iris$Petal.Width, method = "spearman")
[1] 0.9376668
# Plot Sepal.Length vs Sepal.Width
plot(iris$Sepal.Length, iris$Sepal.Width)
# Plot Petal.Length vs Petal.Width
plot(iris$Petal.Length, iris$Petal.Width)
# Calculate correlation matrix
cor(iris)
Sepal.Length Sepal.Width Petal.Length Petal.Width
Sepal.Length 1.0000000 -0.1175698 0.8717538 0.8179411
Sepal.Width -0.1175698 1.0000000 -0.4284401 -0.3661259
Petal.Length 0.8717538 -0.4284401 1.0000000 0.9628654
Petal.Width 0.8179411 -0.3661259 0.9628654 1.0000000
# Calculate correlation matrix using spearman method
cor(iris, method = "spearman")
Sepal.Length Sepal.Width Petal.Length Petal.Width
Sepal.Length 1.0000000 -0.1667777 0.8818981 0.8342888
Sepal.Width -0.1667777 1.0000000 -0.3096351 -0.2890317
Petal.Length 0.8818981 -0.3096351 1.0000000 0.9376668
Petal.Width 0.8342888 -0.2890317 0.9376668 1.0000000
# Correlation Test
cor.test(iris$Sepal.Length, iris$Petal.Length)
Pearson's product-moment correlation
data: iris$Sepal.Length and iris$Petal.Length
t = 21.646, df = 148, p-value < 2.2e-16
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.8270363 0.9055080
sample estimates:
cor
0.8717538
# Pearson's Correlation: Interpretation
pearson_cor <- cor.test(iris$Sepal.Length, iris$Petal.Length)
alpha = 0.05
if(pearson_cor$p.value > alpha) {
print("Independent(Fail to reject H0, the result is not significant)")
} else{
print("Dependent (Reject H0, the result is significant)")
}
[1] "Dependent (Reject H0, the result is significant)"
Tests whether two samples have a monotonic relationship.
Assumptions
- Observations in each sample are independent and identically distributed (iid).
- Observations in each sample can be ranked.
Interpretation
- H0: the two samples are independent.
- Ha: there is a dependency between the samples.
# Spearman's Correlation: Interpretation
spearman_cor <- cor.test(iris$Sepal.Length, iris$Petal.Length, method = "spearman")
alpha = 0.05
if(spearman_cor$p.value > alpha) {
print("Independent(Fail to reject H0, the result is not significant)")
} else{
print("Dependent (Reject H0, the result is significant)")
}
[1] "Dependent (Reject H0, the result is significant)"
Kendall's rank correlation tests whether two samples have a monotonic relationship.
Assumptions
- Observations in each sample are independent and identically distributed (iid).
- Observations in each sample can be ranked.
Interpretation
- H0: the two samples are independent.
- Ha: there is a dependency between the samples.
# Kendall's Correlation: Interpretation
kendall_cor <- cor.test(iris$Sepal.Length, iris$Petal.Length, method = "kendall")
alpha = 0.05
if(kendall_cor$p.value > alpha) {
print("Independent(Fail to reject H0, the result is not significant)")
} else{
print("Dependent (Reject H0, the result is significant)")
}
[1] "Dependent (Reject H0, the result is significant)"
One-way ANOVA tests whether the means of two or more independent samples are significantly different.
Assumptions
- Observations in each sample are independent and identically distributed (iid).
- Observations in each sample are normally distributed.
- Observations in each sample have the same variance.
Interpretation
- H0: the means of the samples are equal.
- Ha: one or more of the means of the samples are unequal.
# Effect of cadmium on growth of green alga
alga <- read.csv("data/S.capricornutum.csv")
head(alga)
X conc count
1 1 0 120.9
2 2 0 118.0
3 3 0 134.0
4 4 5 121.2
5 5 5 118.6
6 6 5 120.4
# Structure
str(alga)
'data.frame': 18 obs. of 3 variables:
$ X : int 1 2 3 4 5 6 7 8 9 10 ...
$ conc : int 0 0 0 5 5 5 10 10 10 20 ...
$ count: num 121 118 134 121 119 ...
# Dependent ~ single independent variable (as factor)
one_way <- aov(count ~ as.factor(conc), data=alga)
summary(one_way)
Df Sum Sq Mean Sq F value Pr(>F)
as.factor(conc) 5 40069 8014 217.6 2.44e-11 ***
Residuals 12 442 37
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
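As with the other tests, the overall F-test p-value can be extracted from the summary and compared against alpha before moving on to the post-hoc comparisons (a sketch; the p-value is the first entry of the Pr(>F) column):
# Interpretation of the one-way ANOVA F-test
p_val <- summary(one_way)[[1]][["Pr(>F)"]][1]
alpha <- 0.05
if(p_val > alpha){
print("The group means are equal (Fail to reject H0, the result is not significant)")
} else {
print("At least one group mean differs (Reject H0, the result is significant)")
}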
TukeyHSD(one_way)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = count ~ as.factor(conc), data = alga)
$`as.factor(conc)`
diff lwr upr p adj
5-0 -4.233333 -20.87689 12.410223 0.9505229
10-0 -48.633333 -65.27689 -31.989777 0.0000051
20-0 -80.233333 -96.87689 -63.589777 0.0000000
40-0 -110.266667 -126.91022 -93.623111 0.0000000
80-0 -119.856667 -136.50022 -103.213111 0.0000000
10-5 -44.400000 -61.04356 -27.756444 0.0000135
20-5 -76.000000 -92.64356 -59.356444 0.0000000
40-5 -106.033333 -122.67689 -89.389777 0.0000000
80-5 -115.623333 -132.26689 -98.979777 0.0000000
20-10 -31.600000 -48.24356 -14.956444 0.0003906
40-10 -61.633333 -78.27689 -44.989777 0.0000004
80-10 -71.223333 -87.86689 -54.579777 0.0000001
40-20 -30.033333 -46.67689 -13.389777 0.0006220
80-20 -39.623333 -56.26689 -22.979777 0.0000434
80-40 -9.590000 -26.23356 7.053556 0.4279326
pig_data <- read.csv("data/ToothGrowth.csv")
head(pig_data)
X len supp dose
1 1 4.2 VC 0.5
2 2 11.5 VC 0.5
3 3 7.3 VC 0.5
4 4 5.8 VC 0.5
5 5 6.4 VC 0.5
6 6 10.0 VC 0.5
# Dependent ~ multiple independent variables (as factor)
two_way <- aov(len ~ as.factor(supp)+as.factor(dose), data=pig_data)
summary(two_way)
Df Sum Sq Mean Sq F value Pr(>F)
as.factor(supp) 1 205.4 205.4 14.02 0.000429 ***
as.factor(dose) 2 2426.4 1213.2 82.81 < 2e-16 ***
Residuals 56 820.4 14.7
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
TukeyHSD(two_way, which = "as.factor(dose)")
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = len ~ as.factor(supp) + as.factor(dose), data = pig_data)
$`as.factor(dose)`
diff lwr upr p adj
1-0.5 9.130 6.215909 12.044091 0e+00
2-0.5 15.495 12.580909 18.409091 0e+00
2-1 6.365 3.450909 9.279091 7e-06