Consider our model :
\[ y_{ij} = \mu_i + \epsilon_{ij} \\ \text{For :} \\ \text{i = Group} \\ \text{j = Observation} \]
Here’s what we know :
\[ \Delta_{\bar{x}} = 2.35; \\ \text{df} = 18; \\ t_{o}=2.01; \\ P_{\text{value}} = 2.98\% \]
Recall :
\[ t_o = \frac{\Delta_{\bar{x}}}{SE(\Delta_{\bar{x}})} \\ \therefore \\ SE(\Delta_{\bar{x}}) = \frac{\Delta_{\bar{x}}}{t_o} \\ \approx 1.169 \]
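As a quick check of that back-calculation in R :
delta_xbar <- 2.35   # observed difference in sample means
t_o <- 2.01          # reported t statistic
delta_xbar / t_o     # SE of the difference, ~1.169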
Consider the following R-Snippet:
pt(2.01, df = 18, lower.tail = FALSE)
## [1] 0.02983103
pt(-2.01, df = 18, lower.tail = TRUE)
## [1] 0.02983103
In other words, \(P(T > t_o) \approx 3\%\) and \(P(T < -t_o) \approx 3\%\); therefore, \(P(|T| > t_o) \approx 6\%\).
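Doubling the one-tailed probability gives the corresponding two-sided P-value directly :
2 * pt(2.01, df = 18, lower.tail = FALSE)   # ~0.0597, i.e. about 6%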
Since the reported P-value of 2.98% matches the one-tailed probability rather than the two-tailed one, this t-test must be one-sided; so we are testing either :
\[ H_a : \mu_1 > \mu_2 \text{ or, } H_a : \mu_1 < \mu_2 \]
Reference ( pt() function explained )
\[ P_{\text{Value}} < \alpha \implies \text{Reject Null Hyp.} \]
Recall the general form :
\[ \text{Pt. Est.} \pm (\text{Crit-Val})(\text{SE}) \]
So,
\[ \Delta_{\bar{x}} \pm t_{\frac{\alpha}{2},df}SE(\Delta_{\bar{x}}) \\ \text{For: } \alpha = 10\% ; \text{df} = 18 \]
alpha <- .1
conf_coef <- 1-alpha/2
qt(conf_coef, df = 18)
## [1] 1.734064
Therefore the \(90\%-CI\) is :
\[ 2.35 \pm 1.734064*1.169 \\ \text{or, }\\ [0.322, 4.377] \]
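The same interval can be computed in R (a short sketch, reusing the back-calculated SE from above) :
se <- 2.35 / 2.01                          # SE of the difference, ~1.169
2.35 + c(-1, 1) * qt(0.95, df = 18) * se   # 90% CI, roughly (0.32, 4.38)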
According to the results at \(\alpha = 10\%\), we are 90% confident, based upon our data, that the true difference in means lies between about 0.3 and 4.4.
Reference ( Lab 7: pt() versus qt() in R )
\[ P_{\text{Value}} < \alpha \implies \text{Reject H}_o \]
In other words, based upon our random sample, there appears to be statistically significant evidence to conclude that the two population means are not the same.
Well in the computer-software output it states : “T-Test of Difference 0 (vs not =) : T-Value = -3.47”, indicating that this is a two-sided test.
\[ H_o : \mu_1 - \mu_2 = 2 \\ H_a : \mu_1 - \mu_2 \ne 2 \]
At \(\alpha = 5\%\), do we reject \(H_o\)?
\[ t_o = \frac{\Delta_{\bar{x}} - \Delta_{\mu}}{SE(\Delta_{\bar{x}})} \\ = \frac{\Delta_{\bar{x}} - 2}{SE(\Delta_{\bar{x}})} \\ \approx \frac{-2.33 - 2}{SE(\Delta_{\bar{x}})} \]
Consider that the calculation of \(SE(\Delta_{\bar{x}})\) differs depending on whether or not the variances are equal ( \(\sigma_1=\sigma_2\) ). In our computer output it states, “Both use Pooled Std. Dev”, therefore indicating we should use the calculation that assumes equal variances.
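For reference, the pooled standard deviation combines the two sample variances, weighted by their degrees of freedom :
\[ S_p = \sqrt{\frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}} \]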
\[ SE(\Delta_{\bar{x}}) = S_p * \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \\ \approx 2.1277 * \sqrt{\frac{2}{20}} \\ \approx 0.6728 \]
Therefore,
\[ t_o \approx -6.440 \]
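As a quick R check of this statistic (a sketch, taking \(n_1 = n_2 = 20\), which is consistent with the \(\sqrt{2/20}\) term above and df = 38 used below) :
sp <- 2.1277                  # pooled standard deviation
se <- sp * sqrt(1/20 + 1/20)  # SE of the difference, ~0.673
(-2.33 - 2) / se              # t statistic, roughly -6.44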
With such a large \(|t_o|\), it’s clear that we reject \(H_o\), indicating that it is extremely unlikely that the true difference is 2, that is, that \(\mu_1\) is 2 larger than \(\mu_2\). We should have suspected this result, since the observed difference of -2.33 sits about two pooled standard deviations below the hypothesized center of 2, making it quite unlikely.
\[ H_o : \mu_1 - \mu_2 = 2 \\ H_a : \mu_1 - \mu_2 < 2 \]
At \(\alpha = 5\%\), do we reject \(H_o\)?
Okay, so there is no need to calculate anything here. We already know from earlier that \(H_o\) is unconvincing, and we also see from our sample that \(\Delta_{\bar{x}}\approx -2.33\) with \(S_d = 2.12\), which suggests that the true difference is below 2.
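If we did want a number, the corresponding one-sided P-value is a one-liner (using the \(t_o\) from above) :
pt(-6.440, df = 38)   # P(T < t_o), about 7e-08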
Recall :
\[ \Delta_{\bar{x}} + t_{\alpha,\text{df}}*SE(\Delta_{\bar{x}}) \]
Notice it’s \(\alpha\) and not \(\frac{\alpha}{2}\), and that only the upper bound is needed, as this is one-sided.
So, calc \(t_{\alpha,\text{df}}\) :
qt(1 - .05, df = 38)
## [1] 1.685954
-2.33 + 1.685954 * 0.6728
## [1] -1.19569
\[ (-\infty,\ -2.33 + 1.685954 * 0.6728\,] \\ \text{or, }\\ (-\infty,\ -1.19569\,] \]
So the one-sided 95% confidence bound extends up to about -1.2; in other words, we are 95% confident that the true difference is at most about -1.2, well below the hypothesized value of 2.
\[ H_o : \mu_1 - \mu_2 = 2 \\ H_a : \mu_1 - \mu_2 \ne 2 \]
2*pt(-6.4405, df = 38)
## [1] 1.418823e-07
So, as indicated before, such a result would be extremely unlikely under \(H_o\).
Okay, so what we would like to determine :
ARE BOTH MACHINES OUTPUTTING THE SAME QUANTITY?
To determine this we have the following hypotheses :
\[ H_o : \mu_1 = \mu_2 \\ H_a : \mu_1 \ne \mu_2 \]
Note that this problem is different in that we know the population distributions and variances.
\[ \text{Let } X = \text{Dist. of net bottle volume:} \\ \text{Machine1} \implies X \sim N(\mu_1 = \ ?,\ \sigma_1^2 = 0.020^2) \\ \text{Machine2} \implies X \sim N(\mu_2 = \ ?,\ \sigma_2^2 = 0.025^2) \]
All this to say, the \(\text{Z-Test}\) is most appropriate :
\[ Z_o = \frac{\bar{x}_1 - \bar{x}_2}{SE(\bar{x}_1 - \bar{x}_2)} \\ \text{For} : \\ SE(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \]
x1 <- mean(df$Machine1)          # sample mean, machine 1
x2 <- mean(df$Machine2)          # sample mean, machine 2
a <- (.015^2)/10                 # variance of the sample mean, machine 1
b <- (.018^2)/10                 # variance of the sample mean, machine 2
z_score <- (x1 - x2)/sqrt(a+b)   # z statistic for the difference in means
p_value <- 2 * pnorm(z_score, lower.tail = FALSE); p_value   # two-sided p-value
## [1] 0.1771356
Therefore since \(\alpha=5\%\) :
\[ \text{P-val} \approx 18\% > \alpha \]
We fail to reject the null hypothesis; in other words, we do not find statistically significant evidence of a difference in mean net volume between the machines.
95%-CI for \(\Delta_{\mu} = \mu_1 - \mu_2\)
\[ \Delta_{\bar{x}} \pm Z_{\frac{\alpha}{2}}*SE(\Delta_{\bar{x}}) \]
z_crit <- qnorm(0.975) # for 95% CI
me <- sqrt(a + b)
lower <- (x1 - x2) - z_crit*me
upper <- (x1 - x2) + z_crit*me
c(lower, upper)
## [1] -0.004522262 0.024522262
Notice that the interval includes 0, indicating that the data are consistent with no difference in means; as stated above, we lack evidence for the alternative hypothesis.
Based upon the QQ-plot, our data appear to be approximately normal; however, likely due to the small sample size, the fit doesn’t look great.
\[ H_o : \Delta_\mu = 0 \\ H_a : \Delta_\mu \ne 0 \]
The most appropriate test is clearly a paired t-test, since the observations come in pairs :
t.test(df$BirthOrder1, df$BirthOrder2, paired = TRUE, conf.level = 0.95)
##
## Paired t-test
##
## data: df$BirthOrder1 and df$BirthOrder2
## t = -0.36577, df = 9, p-value = 0.723
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## -0.3664148 0.2644148
## sample estimates:
## mean difference
## -0.051
As indicated by the R-output :
-0.3664148 0.2644148
is our 95% CI for \(\Delta_{\mu}\). Furthermore,
p-value = 0.723
indicating clearly that our alternative hypothesis is statistically unconvincing, which makes sense, because it would be silly for birth order to matter.
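As a sanity check (a small sketch, assuming the same df columns used above), a paired t-test is equivalent to a one-sample t-test on the within-pair differences :
d <- df$BirthOrder1 - df$BirthOrder2   # within-pair differences
t.test(d, mu = 0, conf.level = 0.95)   # reproduces the paired-t output above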
Consider the QQ-Plot for Formulation1
Consider the QQ-Plot for Formulation2
If you notice, for both there appears to be a staircase pattern ( i.e., the points alternately over- and under-shoot the reference line ); furthermore, each distribution clearly has some outliers and the tails aren’t looking good. However, I do see that the two samples tend to over- and under-shoot at approximately the same locations and to approximately the same degree.
Therefore, the variances appear to be approximately equal; however, they are pushing it when it comes to normality.
Additionally, consider the samples’ numerical evidence :
sd(df$Formulation1)^2
## [1] 103.5455
sd(df$Formulation2)^2
## [1] 98.99242
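One quick numerical summary is the ratio of the two sample variances (values near 1 support the equal-variance assumption) :
sd(df$Formulation1)^2 / sd(df$Formulation2)^2   # ~1.05, quite close to 1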
Nonetheless, I wouldn’t consider this a good sample size; it looks like we may be capturing noise, so we should gather more data.
\[ \text{Let } \mu_1 = \text{Avg. of Formulation 1} \\ H_o : \mu_1 = \mu_2 \\ H_a : \mu_1 > \mu_2 \]
So this is a one-sided t-test, with assumed equal variances :
t.test(df$Formulation1, df$Formulation2, var.equal = TRUE, alternative = "greater")
##
## Two Sample t-test
##
## data: df$Formulation1 and df$Formulation2
## t = 0.34483, df = 22, p-value = 0.3667
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
## -5.637883 Inf
## sample estimates:
## mean of x mean of y
## 194.5000 193.0833
Based upon our P-value :
p-value = 0.3667
And our sample Estimates :
sample estimates:
mean of x mean of y
194.5000 193.0833
And as we know the sample variances from earlier :
[1] 103.5455
[1] 98.99242
It is clear that at the \(\alpha = 5\%\) level, we fail to reject the null hypothesis.
t.test(machine1, machine2, var.equal = FALSE, alternative = "two.sided")
##
## Welch Two Sample t-test
##
## data: machine1 and machine2
## t = 0.79894, df = 17.493, p-value = 0.435
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.01635123 0.03635123
## sample estimates:
## mean of x mean of y
## 16.015 16.005
Consider that for this example, the conclusion doesn’t change whether the population variances are treated as known or unknown :
p-value = 0.435
Therefore, both indicate that we don’t find statistically significant evidence to reject the null hypothesis. In other words, we don’t find a convincing difference in the averages.