Introduction

This report investigates the effect of ultrasound on oil deabsorption using the OilDeapsorbtion dataset from the Stat2Data package. Using visualizations, interaction analysis, and statistical modeling, it assesses the impact of two explanatory variables, oil amount and ultrasound time, on the response variable Diff, which represents the difference in oil removed between experimental and control runs. The report includes dot plots, an interaction plot, model fitting, and confidence interval construction.

Section A: Dot Plots

First, dot plots were created to investigate whether each of the two explanatory variables might have an effect on the response variable. The plots are shown below.

The dot plots show distinct clustering. The data points form two clear clusters, which suggests there are at least two distinct outcomes in how much oil is removed under the tested conditions. The patterns do not appear random with respect to either Oil Amount or Ultrasound Time, so it seems likely that a two-way ANOVA will show significant differences associated with these factors.

Section B: Interaction Plot

Next, an interaction plot was created to see whether the interaction term was worth including in the model. The plot is shown below.

The two lines have noticeably different slopes, which means that the effect of increasing Oil Amount on mean Diff depends on which Ultrasound level is being observed. Because the lines are not parallel, the effect of Oil Amount is not the same at each Ultrasound setting, so the interaction term should be included in the model.
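As a rough numerical companion to the plot reading above, the sketch below computes the change in mean Diff when Oil goes from 5 to 10 at each Ultrasound level. The cell means are copied from the "Tables of means" output in Section E; the variable names are illustrative and not part of the original analysis.

```r
# Cell means of Diff copied from the "Tables of means" output in Section E
# (rows: Oil = 5, 10; columns: Ultra = 5, 10).
cell_means <- matrix(c(0.50, 0.80,
                       1.55, 1.10),
                     nrow = 2, byrow = TRUE,
                     dimnames = list(Oil = c("5", "10"), Ultra = c("5", "10")))

# Simple effect of Oil (Oil = 10 minus Oil = 5) at each Ultrasound level.
# These are the rises of the two lines in the interaction plot.
oil_effect <- cell_means["10", ] - cell_means["5", ]
print(oil_effect)

# Interaction contrast: how much the Oil effect changes between Ultra levels.
# A value of 0 would correspond to perfectly parallel lines.
interaction_contrast <- unname(oil_effect["10"] - oil_effect["5"])
print(interaction_contrast)
```

The Oil effect is 1.05 at Ultra = 5 and 0.30 at Ultra = 10, which is the numerical form of the non-parallel lines. Whether that gap is larger than chance variation is a question for the ANOVA table, not the plot.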

Section C: Fit the Model and Check Conditions

Next, the conditions for a two-way ANOVA model with interaction are checked. The main assumptions are independence of observations, normality of residuals, and constant variance. The three diagnostic plots used to assess them are shown below.

## hat values (leverages) are all = 0.1
##  and there are no factor predictors; no plot no. 5

First, the Residuals vs Fitted plot shows that the residuals hover fairly consistently around the horizontal line at 0. The smoothing line is almost flat, which shows no strong curvature or pattern. The assumption of linearity is therefore reasonable, and there is no glaring issue so far.

The next plot is the Q-Q plot. The points are fairly close to the diagonal line through the middle of the distribution. The residuals at the far ends deviate noticeably, but the rest of the pattern is relatively straight, so the normality assumption can be considered adequately met.

The last plot is the Scale-Location plot. The red line is relatively flat, and the spread of points does not change drastically with the fitted values. This indicates that the assumption of constant variance is reasonably met.

In summary, normality is mostly acceptable, with a Q-Q plot that is not perfect but good enough. Constant variance looks acceptable in both the Residuals vs Fitted and the Scale-Location plots, and there are no extreme patterns in the residuals. The assumptions for running a two-way ANOVA model are fairly well met.

Section D: Run the Model and Show Summary

Next, the two-way ANOVA model is run, and the results are interpreted below.

##             Df Sum Sq Mean Sq F value  Pr(>F)   
## Oil          1  4.556   4.556   8.760 0.00542 **
## Ultra        1  0.056   0.056   0.108 0.74417   
## Oil:Ultra    1  1.406   1.406   2.704 0.10883   
## Residuals   36 18.725   0.520                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

For Oil, the p-value (0.00542) is less than 0.05, so the effect of Oil on the response variable is statistically significant. This suggests that the mean difference in oil removed varies significantly with the amount of oil used. For Ultra, the p-value (0.74417) is large, indicating no statistically significant effect of Ultrasound time on the difference in oil removed under the conditions tested. For the interaction term Oil:Ultra, the p-value (0.10883) is larger than 0.05, so it is not statistically significant. There is therefore no evidence that the effect of oil amount on Diff changes depending on ultrasound time.
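The F values and p-values in the table can be reproduced from the sums of squares alone, which makes the interpretation above easier to audit. The sketch below copies the Df and Sum Sq columns from the output and recomputes the rest with base R; the data frame and its column names are illustrative, and small differences from the printed table come from the rounded inputs.

```r
# Sums of squares and degrees of freedom copied from the ANOVA table above.
anova_tab <- data.frame(
  term   = c("Oil", "Ultra", "Oil:Ultra"),
  df     = c(1, 1, 1),
  sum_sq = c(4.556, 0.056, 1.406)
)
resid_ss <- 18.725
resid_df <- 36

# Mean square error: residual sum of squares over residual df.
mse <- resid_ss / resid_df

# Each F value is the term's mean square divided by the MSE.
anova_tab$mean_sq <- anova_tab$sum_sq / anova_tab$df
anova_tab$f_value <- anova_tab$mean_sq / mse

# Upper-tail probability of the F distribution gives Pr(>F).
anova_tab$p_value <- pf(anova_tab$f_value, anova_tab$df, resid_df,
                        lower.tail = FALSE)

print(anova_tab, digits = 3)
```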

Section E: Construct 95% Confidence Intervals

Lastly, 95% confidence intervals were constructed. From the Residuals row shown above, MSE = 0.520 with df = 36. Next, the marginal means needed to calculate the confidence intervals were found; they are shown below.

## Tables of means
## Grand mean
##        
## 0.9875 
## 
##  Oil 
## Oil
##     5    10 
## 0.650 1.325 
## 
##  Ultra 
## Ultra
##     5    10 
## 1.025 0.950 
## 
##  Oil:Ultra 
##     Ultra
## Oil  5    10  
##   5  0.50 0.80
##   10 1.55 1.10
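The marginal means in this output can be recovered from the four cell means, which is a quick way to confirm how the tables relate to each other. The sketch below assumes a balanced design (equal observations per cell), so each marginal mean is a plain average of the corresponding cell means; the variable names are illustrative.

```r
# Cell means copied from the Oil:Ultra table above
# (rows: Oil = 5, 10; columns: Ultra = 5, 10).
cell_means <- matrix(c(0.50, 0.80,
                       1.55, 1.10),
                     nrow = 2, byrow = TRUE,
                     dimnames = list(Oil = c("5", "10"), Ultra = c("5", "10")))

# With equal cell sizes (assumed), marginal means are simple averages.
oil_means   <- rowMeans(cell_means)  # averages over Ultra
ultra_means <- colMeans(cell_means)  # averages over Oil
grand_mean  <- mean(cell_means)

print(oil_means)
print(ultra_means)
print(grand_mean)
```

The results agree with the Oil, Ultra, and Grand mean entries in the output, which is consistent with the balanced-design assumption used for the confidence intervals.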

For Oil, the difference in means is d = 1.325 - 0.650 = 0.675. Assuming the experiment is balanced, n1 = 20 (for Oil = 5) and n2 = 20 (for Oil = 10). Plugging these numbers into the standard error equation gives SE = sqrt(MSE * (1/n1 + 1/n2)) = sqrt(0.520 * (1/20 + 1/20)) ≈ 0.228. The t critical value for 95% confidence with df = 36 is t ≈ 2.028. The 95% confidence interval is then d ± t * SE = 0.675 ± 2.028 * 0.228, with a margin of error of 2.028 * 0.228 ≈ 0.463 (computed before rounding). So the 95% confidence interval is (0.675 - 0.463, 0.675 + 0.463) = (0.212, 1.138). Since the confidence interval does not include 0, it suggests a statistically significant increase in the response when the oil amount is increased from 5 ml to 10 ml.
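The hand calculation above can be checked with a few lines of base R. The helper `ci_diff` below is a hypothetical function written for this illustration, not part of any package; it assumes the pooled MSE from the ANOVA table and 20 observations per Oil level, as stated above.

```r
# Hypothetical helper: confidence interval for a difference in two marginal
# means, using the pooled MSE and residual df from the ANOVA table.
ci_diff <- function(d, mse, df, n1, n2, level = 0.95) {
  se     <- sqrt(mse * (1 / n1 + 1 / n2))  # standard error of the difference
  t_crit <- qt(1 - (1 - level) / 2, df)    # two-sided t critical value
  margin <- t_crit * se
  c(estimate = d, se = se, t_crit = t_crit,
    lower = d - margin, upper = d + margin)
}

# MSE taken as residual Sum Sq / residual Df from the ANOVA table.
mse <- 18.725 / 36

# Oil: difference between the marginal means at Oil = 10 and Oil = 5,
# assuming a balanced design with 20 observations per level.
oil_ci <- ci_diff(d = 1.325 - 0.650, mse = mse, df = 36, n1 = 20, n2 = 20)
print(round(oil_ci, 3))
```

Using the unrounded MSE rather than 0.520 shifts the margin of error only in the third decimal place, so the interval agrees with the hand calculation to the reported precision.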

Using the same method for Ultrasound, the difference in means is 1.025 - 0.950 = 0.075, and the SE is again 0.228. The 95% CI for the Ultrasound difference is therefore 0.075 ± 2.028 * 0.228 = 0.075 ± 0.463, which gives (0.075 - 0.463, 0.075 + 0.463) = (-0.388, 0.538). Because this interval includes 0, there is no statistically significant difference in the response between 5 and 10 minutes of ultrasound.
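The two intervals lead to the same conclusions as the main-effect F tests in Section D, and this is not a coincidence: in a balanced design with two levels per factor, the squared t statistic for a marginal difference equals the corresponding F value. The sketch below illustrates this using only numbers reported in this document; the variable names are illustrative, and the design is assumed balanced with 20 observations per level.

```r
# MSE and residual df from the ANOVA table in Section D.
mse      <- 18.725 / 36
resid_df <- 36

# Standard error of a difference between two marginal means (n = 20 each).
se <- sqrt(mse * (1 / 20 + 1 / 20))

# Differences in marginal means from the tables of means.
d <- c(Oil = 1.325 - 0.650, Ultra = 1.025 - 0.950)

# t statistics and their squares; the squares should match the F values
# for Oil and Ultra in the ANOVA table, up to rounding.
t_stat   <- d / se
f_from_t <- t_stat^2
print(round(f_from_t, 3))

# Two-sided p-values from the t distribution match Pr(>F) for the same reason.
p_from_t <- 2 * pt(abs(t_stat), df = resid_df, lower.tail = FALSE)
print(round(p_from_t, 4))
```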

Appendix

Code for dot plot/setting up program

# One-time setup: install Stat2Data from GitHub (requires devtools)
install.packages("devtools")
devtools::install_github("statmanrobin/Stat2Data")

library(Stat2Data)
library(emmeans)
data(package = "Stat2Data")  # list the datasets shipped with Stat2Data

data(OilDeapsorbtion)

par(mfrow = c(1, 2))
stripchart(Diff ~ Oil, data = OilDeapsorbtion,
           method = "jitter", pch = 19,
           main = "Difference in Oil Removed by Oil Amount",
           xlab = "Oil Amount", ylab = "Diff")
stripchart(Diff ~ Ultra, data = OilDeapsorbtion,
           method = "jitter", pch = 19,
           main = "Difference in Oil Removed by Ultrasound Time",
           xlab = "Ultrasound Time", ylab = "Diff")

Code for interaction plot

par(mfrow = c(1, 1))
with(OilDeapsorbtion, interaction.plot(Oil, Ultra, Diff,
                                       main = "Interaction Plot",
                                       xlab = "Oil Amount", ylab = "Mean Diff",
                                       trace.label = "Ultrasound"))

Code for checking conditions for two-way ANOVA interaction

model <- aov(Diff ~ Oil * Ultra, data = OilDeapsorbtion)
par(mfrow = c(2, 2))
plot(model)

Code for running ANOVA model

summary(model)

Code for finding marginal means

model <- aov(Diff ~ Oil * Ultra, data = OilDeapsorbtion)
model.tables(model, type = "means")