We investigate how two variables influence the difference between the
amount of oil cleaned in an oil spill and the amount cleaned in the
control runs. To do so, we use a two-way ANOVA model. The two
explanatory variables are the amount of oil present in the spill and the
duration for which the oil to be cleaned has been exposed to ultrasound.
The distributions of Diff differ visibly between the two ultrasound
exposure levels, with the lower exposure showing a more spread-out
distribution. This makes Ultra a reasonable explanatory variable to
analyze. Next we look at how Diff responds to the volume of oil.
Similarly, when more milliliters of oil are introduced (as shown by the
comparison of Oil plots), the distribution changes noticeably; this
time, the Diff values appear shifted right and slightly “flattened”.
The interaction plot suggests an interaction between ultrasound exposure
and oil volume: the lines are not parallel, indicating that the effect
of ultrasound may vary depending on the amount of oil present. We
therefore include an interaction term in the model and test it formally.
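An interaction plot of this kind can be drawn with base R's `interaction.plot`. The sketch below is illustrative only: the real `oilData` is not reproduced here, so it builds a hypothetical stand-in data frame with the same column names (`Ultra`, `Oil`, `Diff`) and simulated values, and the axis labels are assumptions.

```r
# Hypothetical stand-in for oilData: 2 x 2 design, 10 runs per cell, simulated Diff.
set.seed(1)
oilData <- expand.grid(Ultra = factor(c(5, 10)), Oil = factor(c(5, 10)), rep = 1:10)
oilData$Diff <- rnorm(nrow(oilData), mean = 0.5, sd = 0.7)

# Mean Diff in each Ultra x Oil cell; these are the points the plot connects.
cellMeans <- tapply(oilData$Diff, list(oilData$Ultra, oilData$Oil), mean)
print(cellMeans)

# One line per oil volume; clearly non-parallel lines hint at an interaction.
interaction.plot(x.factor = oilData$Ultra,
                 trace.factor = oilData$Oil,
                 response = oilData$Diff,
                 xlab = "Ultrasound exposure (min)",
                 ylab = "Mean Diff",
                 trace.label = "Oil (ml)")
```

Non-parallel lines are only a visual cue; whether the interaction is statistically significant is decided by the F test for the interaction term in the fitted model.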
To estimate the effects of ultrasound exposure and spilled oil quantity
on the difference in oil removed, we construct a two-way ANOVA model
with an interaction term. We do this by letting
## [1] "model = aov(Diff ~ Ultra * Oil, data = oilData)"
Doing so yields the following:
##                           Df Sum Sq Mean Sq F value  Pr(>F)
## oilData$Ultra              1  0.056   0.056   0.108 0.74417
## oilData$Oil                1  4.556   4.556   8.760 0.00542 **
## oilData$Ultra:oilData$Oil  1  1.406   1.406   2.704 0.10883
## Residuals                 36 18.725   0.520
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
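A table of this shape can be produced roughly as follows. This is a minimal sketch, not the original analysis code: the data frame is a hypothetical stand-in with simulated `Diff` values, so the sums of squares and p-values will differ from those reported above. Only the structure (three 1-df terms and 36 residual degrees of freedom) carries over.

```r
# Hypothetical stand-in for oilData: 40 runs in a balanced 2 x 2 design.
set.seed(1)
oilData <- expand.grid(Ultra = c(5, 10), Oil = c(5, 10), rep = 1:10)
oilData$Diff <- rnorm(nrow(oilData), mean = 0.5, sd = 0.7)

# Treat the two-valued numeric predictors as categorical factors.
oilData$Ultra <- factor(oilData$Ultra)
oilData$Oil <- factor(oilData$Oil)

# Ultra * Oil expands to both main effects plus the Ultra:Oil interaction.
model <- aov(Diff ~ Ultra * Oil, data = oilData)

# summary() returns a list; its first element is the ANOVA table.
anovaTable <- summary(model)[[1]]
print(anovaTable)
```

Passing `data = oilData` rather than writing `oilData$Ultra` inside the formula keeps the row labels in the table short (`Ultra`, `Oil`, `Ultra:Oil`).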
The residuals vs. fitted plot shows that the residuals are roughly
scattered around 0, suggesting that the model largely captures the
variation in the data. Additionally, the Q-Q plot is broadly consistent
with normality of the residuals, although the residuals are still
slightly skewed.
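The two diagnostic plots described here can be drawn with base R graphics. The sketch below again uses a hypothetical, simulated stand-in for `oilData`, so the plots it draws illustrate the method rather than the actual residual pattern.

```r
# Hypothetical stand-in for oilData, fitted with the same model formula.
set.seed(1)
oilData <- expand.grid(Ultra = factor(c(5, 10)), Oil = factor(c(5, 10)), rep = 1:10)
oilData$Diff <- rnorm(nrow(oilData), mean = 0.5, sd = 0.7)
model <- aov(Diff ~ Ultra * Oil, data = oilData)

modelResiduals <- resid(model)
modelFitted <- fitted(model)

# Residuals vs. fitted: look for even scatter around the zero line.
plot(modelFitted, modelResiduals,
     xlab = "Fitted values", ylab = "Residuals",
     main = "Residuals vs. fitted")
abline(h = 0, lty = 2)

# Normal Q-Q plot: points near the reference line support normality.
qqnorm(modelResiduals)
qqline(modelResiduals)
```

With two factors at two levels each, the fitted values take only four distinct values (one per cell), so the residuals vs. fitted plot appears as four vertical strips.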
It is
worth noting that fitting an ANOVA model to this data differs slightly
from the usual setup, as all of the data is quantitative. However,
because Ultra and Oil each take only one of two values, they lend
themselves to being treated as categorical factors.
We find that the interaction term is not statistically significant (p =
0.109, above 0.05), so we can go on to examine the main effects. The
ultrasound variable is also not significant (p = 0.744), which suggests
that exposure time has no clear effect on the difference in cleaned oil.
However, the volume of oil is statistically significant (p = 0.005).
This result suggests that ultrasound exposure does not have a
significant impact on cleaning efficiency, while the volume of oil plays
a significant role.
We can find a 95% confidence interval for
the average difference between oil cleaned after ultrasound and oil
cleaned in the control groups by calculating the means of the
differences with 5 and 10 ml of oil, calculating the t-value for 95%
confidence with 36 degrees of freedom, and finding the standard error.
Computing this, we obtain a 95% confidence interval of -2.112 to 3.462
ml for the average difference in oil cleaned between the 5 ml and 10 ml
oil volumes. By the same method, we obtain a 95% confidence interval of
-2.862 to 2.712 ml for the average difference in oil cleaned between the
5-minute and 10-minute ultrasound exposures.
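The steps just described (group means, a t critical value on 36 degrees of freedom, and a standard error) can be sketched as below. The standard error formula used for the reported intervals is not shown in the text, so this sketch assumes the pooled residual mean square from the ANOVA fit; that is an assumption, and it also runs on a hypothetical simulated stand-in for `oilData`, so it is not expected to reproduce the endpoints reported above.

```r
# Hypothetical stand-in for oilData, fitted with the same model formula.
set.seed(1)
oilData <- expand.grid(Ultra = factor(c(5, 10)), Oil = factor(c(5, 10)), rep = 1:10)
oilData$Diff <- rnorm(nrow(oilData), mean = 0.5, sd = 0.7)
model <- aov(Diff ~ Ultra * Oil, data = oilData)

# Mean Diff and number of runs at each oil volume.
groupMeans <- tapply(oilData$Diff, oilData$Oil, mean)
groupSizes <- as.numeric(table(oilData$Oil))
meanDiff <- unname(groupMeans["10"] - groupMeans["5"])

# Assumed standard error: pooled residual mean square from the ANOVA fit.
residualDf <- df.residual(model)
mse <- sum(resid(model)^2) / residualDf
standardError <- sqrt(mse * sum(1 / groupSizes))

# Two-sided 95% critical value on the residual degrees of freedom.
tCrit <- qt(0.975, df = residualDf)

ci <- c(lower = meanDiff - tCrit * standardError,
        upper = meanDiff + tCrit * standardError)
print(ci)
```

Replacing `oilData$Oil` with `oilData$Ultra` in the `tapply` and `table` calls gives the corresponding interval for ultrasound exposure.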
The 95% confidence
interval for oil volume suggests that, on average, the difference in
Diff (oil cleaned after ultrasound treatment minus oil cleaned in the
control) between the 5 ml and 10 ml oil volumes lies between -2.112 and
3.462 ml. Similarly, the confidence interval for ultrasound exposure
suggests that the corresponding difference between the 5- and 10-minute
exposures lies between -2.862 and 2.712 ml.