2024-09-20

Slide 1: Title Slide

Please load these libraries

library(ggplot2)
library(knitr)
library(tinytex)
library(plotly)

Slide 2: Introduction

What is hypothesis testing?

  • A statistical method used to assess the validity of a claim (hypothesis) about a population parameter.
  • It involves collecting data, analyzing it, and drawing conclusions about the population based on the sample.

Slide 3: Load the Dataset for Orange

Load the following data:

  Tree  age circumference
1    1  118            30
2    1  484            58
3    1  664            87
4    1 1004           115
5    1 1231           120
6    1 1372           142

The dataset describes orange trees: Tree identifies the tree, age is the age of the tree (in days), and circumference is the trunk circumference (in mm).
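
The table above appears to be the first six rows of R's built-in Orange dataset. A minimal sketch of how it could be loaded and inspected, assuming the `datasets` package that ships with base R:

```r
# Orange ships with base R (package 'datasets'), so no download is needed.
data(Orange)

# First six rows, matching the table shown on this slide.
head(Orange)

# Structure of the data: 35 observations of Tree, age, and circumference.
str(Orange)
```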

Slide 4: Steps for Hypothesis Testing

  1. State the null and alternative hypotheses.
  2. Set the significance level (α).
  3. Choose the appropriate statistical test.
  4. Collect data from a random sample.
  5. Calculate the test statistic.
  6. Determine the p-value (the probability, assuming H₀ is true, of observing data at least as extreme as the sample).
  7. Make a decision based on the p-value.
  • Reject H₀ if p-value < α (evidence against the null hypothesis).
  • Fail to reject H₀ if p-value ≥ α (not enough evidence to reject the null hypothesis).
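
The decision rule in step 7 can be sketched as a small function. The name `decide` and its default `alpha = 0.05` are hypothetical, introduced here only for illustration:

```r
# Hypothetical helper (not part of the original slides): applies the
# decision rule from step 7 to a p-value and a significance level.
decide <- function(p_value, alpha = 0.05) {
  if (p_value < alpha) {
    "Reject H0"
  } else {
    "Fail to reject H0"
  }
}

decide(0.01)  # p-value below alpha
decide(0.20)  # p-value at or above alpha
```

A p-value exactly equal to α falls in the "fail to reject" branch, as in the rule stated above.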

Slide 5: Examining the Effect of Age on Orange Tree Circumference (Data example)

  • Null hypothesis (H₀): There is no relationship between age and orange tree circumference.
  • Alternative hypothesis (H₁): There is a positive relationship between age and orange tree circumference.
  • Significance level (α) = 0.05

Slide 6: Orange Circumference by Age

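# Scatter plot of circumference against age with a fitted regression line.
# In the Orange dataset, age is recorded in days and circumference in mm.
ggplot(Orange, aes(x = age, y = circumference)) +
  geom_point(color = "purple", shape = 22) +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  labs(title = "Orange Tree Circumference by Age",
       x = "Age (days)", y = "Circumference (mm)")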

Slide 7: Steps in Hypothesis Testing

Hypotheses:
  • H₀: The slope of the regression line is zero (no relationship between age and circumference).
  • H₁: The slope of the regression line is not zero (there is a relationship between age and circumference).
Statistical test: linear regression
Formula for the t-statistic:
$$t = \frac{\hat{\beta}_1}{SE(\hat{\beta}_1)}$$
Where:
  • $\hat{\beta}_1$ is the estimated slope coefficient
  • $SE(\hat{\beta}_1)$ is the standard error of the slope coefficient
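
To make the formula concrete, the following sketch fits the regression with `lm()` and recomputes the t-statistic by hand from the estimated slope and its standard error. The variable names (`fit`, `coefs`, `t_manual`, and so on) are illustrative choices, not taken from the original slides:

```r
# Fit circumference as a linear function of age on the built-in Orange data.
fit <- lm(circumference ~ age, data = Orange)

# Coefficient table: Estimate, Std. Error, t value, Pr(>|t|).
coefs <- summary(fit)$coefficients

# Slope estimate and its standard error for the 'age' term.
beta1_hat <- coefs["age", "Estimate"]
se_beta1  <- coefs["age", "Std. Error"]

# t-statistic computed by hand, as in the formula on this slide.
t_manual <- beta1_hat / se_beta1

# t-statistic reported by summary(); the two should agree.
t_reported <- coefs["age", "t value"]
all.equal(t_manual, t_reported)
```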

Slide 8: Conclusion

Perform the linear regression and calculate the p-value associated with the slope coefficient.
Interpret the results:
  • If p-value < α, reject H₀ (there is a significant relationship between age and circumference).
  • If p-value ≥ α, fail to reject H₀ (no significant relationship between age and circumference).
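
A minimal sketch of this step, assuming the same `lm()` fit on the Orange data. `summary()` reports a two-sided p-value, which matches the hypotheses on Slide 7. The one-sided version for the positive-relationship alternative on Slide 5 is computed separately with `pt()`. All variable names are illustrative:

```r
alpha <- 0.05

# Linear regression of circumference on age.
fit   <- lm(circumference ~ age, data = Orange)
coefs <- summary(fit)$coefficients

# Two-sided p-value for H1: slope != 0 (as reported by summary()).
p_two_sided <- coefs["age", "Pr(>|t|)"]

# One-sided p-value for H1: slope > 0, from the upper tail of the
# t distribution with the residual degrees of freedom.
t_stat      <- coefs["age", "t value"]
p_one_sided <- pt(t_stat, df = df.residual(fit), lower.tail = FALSE)

# Decision: TRUE means reject H0 at the chosen significance level.
p_two_sided < alpha
p_one_sided < alpha
```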

Slide 9: Histogram of Circumference

Based on the histogram, the smaller circumferences occur with the highest frequency, which suggests that there was a lot of new growth.
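
The histogram itself is not reproduced here. One way it could be drawn with ggplot2 is sketched below. The bin count, colors, and labels are assumptions rather than the original settings, and the shape of the histogram depends on the number of bins chosen:

```r
library(ggplot2)

# Histogram of trunk circumference (mm) in the Orange dataset.
# bins = 10 is an assumed value, not taken from the original slide.
hist_plot <- ggplot(Orange, aes(x = circumference)) +
  geom_histogram(bins = 10, fill = "purple", color = "white") +
  labs(title = "Histogram of Orange Tree Circumference",
       x = "Circumference (mm)", y = "Frequency")

hist_plot
```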

Slide 10: Scatter Plot of Orange Tree Circumference by Age