Introduction

This project explores the “OilDeapsorbtion” dataset to investigate the effect of ultrasound time and oil amount on the deapsorption of oil from sand. We use a two-way ANOVA with interaction to analyze the data and construct confidence intervals. Visualizations are employed to reveal patterns and assess model assumptions.

Data Preparation

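The dataset is part of the “Stat2Data” package, which is installed here from GitHub. Below, we install and load the package, then inspect the data structure. The data also contain a Salt indicator, which this two-way analysis does not use.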

# Install and load the package
if(!require(devtools)) install.packages("devtools")
## Loading required package: devtools
## Loading required package: usethis
devtools::install_github("statmanrobin/Stat2Data")
## Skipping install of 'Stat2Data' from a github remote, the SHA1 (3fe987c7) has not changed since last install.
##   Use `force = TRUE` to force installation
library(Stat2Data)
data(OilDeapsorbtion)

# Inspect the dataset
str(OilDeapsorbtion)
## 'data.frame':    40 obs. of  4 variables:
##  $ Salt : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Ultra: int  5 5 5 5 5 10 10 10 10 10 ...
##  $ Oil  : int  5 5 5 5 5 5 5 5 5 5 ...
##  $ Diff : num  0.5 0.5 0.5 -0.5 0 -0.5 0 1.5 1 0.5 ...
summary(OilDeapsorbtion)
##       Salt         Ultra           Oil            Diff        
##  Min.   :0.0   Min.   : 5.0   Min.   : 5.0   Min.   :-1.0000  
##  1st Qu.:0.0   1st Qu.: 5.0   1st Qu.: 5.0   1st Qu.: 0.5000  
##  Median :0.5   Median : 7.5   Median : 7.5   Median : 1.0000  
##  Mean   :0.5   Mean   : 7.5   Mean   : 7.5   Mean   : 0.9875  
##  3rd Qu.:1.0   3rd Qu.:10.0   3rd Qu.:10.0   3rd Qu.: 1.5000  
##  Max.   :1.0   Max.   :10.0   Max.   :10.0   Max.   : 3.0000

Exploratory Data Analysis

We begin with dot plots to visualize the relationship between the explanatory variables (ultrasound time and oil amount) and the response variable (Diff).

Dot Plot of Oil Amount vs Diff

library(ggplot2)
# Use fill (not color) so the dots themselves are colored by ultrasound time;
# dodge the two groups and set binwidth explicitly to avoid the default-width message.
ggplot(OilDeapsorbtion, aes(x = factor(Oil), y = Diff, fill = factor(Ultra))) +
  geom_dotplot(binaxis = "y", stackdir = "center",
               position = "dodge", binwidth = 0.1) +
  labs(title = "Dot Plot of Oil Amount vs Diff",
       x = "Oil Amount (ml)",
       y = "Difference in Oil Removed (Diff)",
       fill = "Ultrasound Time (min)") +
  theme_minimal()

The dot plot suggests that Diff tends to be higher for samples with 10 ml of oil than for those with 5 ml. The 10 ml samples may also show somewhat more variability in the response.

Dot Plot of Ultrasound Time vs Diff

# Use fill (not color) so the dots themselves are colored by oil amount;
# dodge the two groups and set binwidth explicitly to avoid the default-width message.
ggplot(OilDeapsorbtion, aes(x = factor(Ultra), y = Diff, fill = factor(Oil))) +
  geom_dotplot(binaxis = "y", stackdir = "center",
               position = "dodge", binwidth = 0.1) +
  labs(title = "Dot Plot of Ultrasound Time vs Diff",
       x = "Ultrasound Time (min)",
       y = "Difference in Oil Removed (Diff)",
       fill = "Oil Amount (ml)") +
  theme_minimal()

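The dot plot gives a first look at how Diff varies with ultrasound time. Any overall shift between the 5- and 10-minute groups is small compared with the spread within each group (the cell means reported later average about 1.03 at 5 minutes and 0.95 at 10 minutes), so the plot alone does not establish an ultrasound effect.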
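The two dot plots can be summarized numerically by the marginal mean of Diff at each factor level. The base R sketch below assumes a balanced design, so that each marginal mean is simply the average of the relevant cell means. It uses the four cell means from the emmeans output later in this report. The object names (`cell_means`, `oil_means`, `ultra_means`) are illustrative and not part of the original analysis.

```r
# Cell means copied from the emmeans output in this report.
# matrix() fills column by column: the first column is Ultra = 5 min,
# the second is Ultra = 10 min; rows are Oil = 5 ml and Oil = 10 ml.
cell_means <- matrix(c(0.50, 1.55,
                       0.80, 1.10),
                     nrow = 2,
                     dimnames = list(Oil = c("5", "10"),
                                     Ultra = c("5", "10")))

# With equal cell sizes (assumed), marginal means are plain averages.
oil_means   <- rowMeans(cell_means)  # averaged over ultrasound time
ultra_means <- colMeans(cell_means)  # averaged over oil amount
grand_mean  <- mean(cell_means)

print(oil_means)
print(ultra_means)
print(grand_mean)
```

The oil marginal means differ by roughly 0.68, whereas the ultrasound marginal means differ by less than 0.1, which matches the visual impression that oil amount matters more than ultrasound time.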

Interaction Analysis

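An interaction plot helps visualize whether the effect of oil amount on the response differs between ultrasound times. Non-parallel lines suggest an interaction, but the plot by itself cannot show whether that interaction is statistically significant.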

interaction.plot(OilDeapsorbtion$Oil, OilDeapsorbtion$Ultra, OilDeapsorbtion$Diff,
                 xlab = "Oil Amount (ml)",
                 ylab = "Difference (Diff)",
                 trace.label = "Ultrasound Time",
                 col = c("blue", "red"))

The interaction plot indicates a potential interaction between oil amount and ultrasound time: the difference in the response between the two oil amounts appears to depend on the level of ultrasound exposure. Whether this pattern is more than noise is tested formally in the ANOVA below.
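One way to put a number on what the interaction plot shows is a difference of differences: the oil effect at 5 minutes minus the oil effect at 10 minutes. The sketch below is a minimal illustration using the cell means from the emmeans output later in this report. The variable names are hypothetical.

```r
# Cell means from the emmeans output, named as oil amount _ ultrasound time.
m <- c(o5_u5 = 0.50, o10_u5 = 1.55, o5_u10 = 0.80, o10_u10 = 1.10)

# Simple effect of oil amount (10 ml minus 5 ml) at each ultrasound time.
oil_effect_u5  <- m[["o10_u5"]]  - m[["o5_u5"]]
oil_effect_u10 <- m[["o10_u10"]] - m[["o5_u10"]]

# Interaction contrast: how much the oil effect changes between the two times.
interaction_contrast <- oil_effect_u5 - oil_effect_u10

cat("Oil effect at 5 min: ", oil_effect_u5, "\n")
cat("Oil effect at 10 min:", oil_effect_u10, "\n")
cat("Interaction contrast:", interaction_contrast, "\n")
```

Parallel lines would correspond to a contrast of zero. Here the oil effect is about 1.05 at 5 minutes and about 0.30 at 10 minutes, a contrast of about 0.75, and the F-test for the Oil:Ultra term assesses whether that contrast is distinguishable from sampling variability.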

Checking ANOVA Assumptions

Before running the ANOVA model, we check for normality and homogeneity of variances.

Normality Check

par(mfrow = c(1, 1))
qqnorm(OilDeapsorbtion$Diff)
qqline(OilDeapsorbtion$Diff)

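The Q-Q plot shows that the data points generally follow a straight line, suggesting that the normality assumption is reasonable. Because this plot uses the raw Diff values, which pool all four treatment groups, it is only a rough first check.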
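Since the ANOVA normality assumption concerns the model residuals, the same kind of check can also be applied to the residuals of a fitted model. The sketch below wraps this in a hypothetical helper, `check_residual_normality()`, and demonstrates it on R's built-in `ToothGrowth` data so that it runs without Stat2Data. In this report, one would instead pass the fitted `aov` object for `Diff ~ Oil * Ultra`.

```r
# Hypothetical helper: Q-Q plot and Shapiro-Wilk test on model residuals.
check_residual_normality <- function(fit) {
  res <- residuals(fit)
  qqnorm(res, main = "Normal Q-Q Plot of Residuals")
  qqline(res)
  shapiro.test(res)  # returns an object of class "htest"
}

# Demonstration on built-in data (NOT the oil data): a two-way ANOVA
# with interaction, analogous in structure to Diff ~ Oil * Ultra.
demo_fit <- aov(len ~ supp * factor(dose), data = ToothGrowth)
sw <- check_residual_normality(demo_fit)
print(sw)
```

A small Shapiro-Wilk p-value would flag a departure from normality, although with modest sample sizes the Q-Q plot is usually the more informative of the two.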

Homogeneity of Variances

# Convert Oil and Ultra to factors
OilDeapsorbtion$Oil <- as.factor(OilDeapsorbtion$Oil)
OilDeapsorbtion$Ultra <- as.factor(OilDeapsorbtion$Ultra)

# Perform Levene's test
library(car)
## Loading required package: carData
leveneTest(Diff ~ Oil * Ultra, data = OilDeapsorbtion)
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  3  1.6933 0.1857
##       36

Levene’s test gives F(3, 36) = 1.69 with p = 0.186, so there is no evidence against the homogeneity of variances assumption at the 0.05 level.

Two-Way ANOVA with Interaction

We fit a two-way ANOVA model and summarize the results.

model <- aov(Diff ~ Oil * Ultra, data = OilDeapsorbtion)
summary(model)
##             Df Sum Sq Mean Sq F value  Pr(>F)   
## Oil          1  4.556   4.556   8.760 0.00542 **
## Ultra        1  0.056   0.056   0.108 0.74417   
## Oil:Ultra    1  1.406   1.406   2.704 0.10883   
## Residuals   36 18.725   0.520                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

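The main effect of oil amount is statistically significant (F = 8.76, p = 0.005). Neither the main effect of ultrasound time (F = 0.11, p = 0.744) nor the Oil × Ultra interaction (F = 2.70, p = 0.109) is significant at the 0.05 level.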
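To see where the numbers in the table come from, the F statistics and p-values can be recomputed from the printed sums of squares. The sketch below does this and adds partial eta-squared as a rough effect-size measure. It works from the rounded values in the table, so it agrees with the output only up to rounding. Partial eta-squared is an addition here and is not reported in the original output.

```r
# Sums of squares and degrees of freedom copied from the ANOVA table.
ss <- c(Oil = 4.556, Ultra = 0.056, `Oil:Ultra` = 1.406)
df <- c(Oil = 1, Ultra = 1, `Oil:Ultra` = 1)
ss_resid <- 18.725
df_resid <- 36

mse     <- ss_resid / df_resid   # residual mean square
f_value <- (ss / df) / mse       # F = MS_effect / MS_residual
p_value <- pf(f_value, df, df_resid, lower.tail = FALSE)

# Partial eta-squared: SS_effect / (SS_effect + SS_residual).
partial_eta_sq <- ss / (ss + ss_resid)

print(round(data.frame(F = f_value, p = p_value, partial_eta_sq), 4))
```

On these figures, oil amount accounts for roughly a fifth of the variation left after the other terms are removed, while the ultrasound main effect accounts for well under one percent.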

Confidence Intervals

We construct 95% confidence intervals for the mean of Diff in each Oil × Ultra combination (the estimated marginal means) using the emmeans package.

library(emmeans)
## Welcome to emmeans.
## Caution: You lose important information if you filter this package's results.
## See '? untidy'
## 
## Attaching package: 'emmeans'
## The following object is masked from 'package:devtools':
## 
##     test
emmeans_results <- emmeans(model, ~ Oil * Ultra)
confint(emmeans_results)
##  Oil Ultra emmean    SE df lower.CL upper.CL
##  5   5       0.50 0.228 36   0.0375    0.963
##  10  5       1.55 0.228 36   1.0875    2.013
##  5   10      0.80 0.228 36   0.3375    1.263
##  10  10      1.10 0.228 36   0.6375    1.563
## 
## Confidence level used: 0.95

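The confidence intervals show how precisely each cell mean is estimated. All four intervals lie above zero. The interval for 10 ml of oil with 5 minutes of ultrasound (1.09 to 2.01) lies entirely above the interval for 5 ml with 5 minutes (0.04 to 0.96), while every other pair of intervals overlaps.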
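Each interval in the table is a cell mean plus or minus a t-quantile times the standard error, where the standard error comes from the residual mean square of the ANOVA. The following sketch reproduces the table from the reported values. It assumes 10 observations per Oil × Ultra cell, which is consistent with the 40 observations and the printed standard error of 0.228 but is not stated explicitly in the output.

```r
# Values copied from the ANOVA and emmeans output in this report.
cell_means <- c(`Oil 5, Ultra 5`   = 0.50,
                `Oil 10, Ultra 5`  = 1.55,
                `Oil 5, Ultra 10`  = 0.80,
                `Oil 10, Ultra 10` = 1.10)
mse      <- 18.725 / 36  # residual mean square
df_resid <- 36
n_cell   <- 10           # ASSUMED number of observations per cell

se     <- sqrt(mse / n_cell)    # standard error of a cell mean
t_crit <- qt(0.975, df_resid)   # two-sided 95% critical value

ci <- data.frame(emmean = cell_means,
                 lower  = cell_means - t_crit * se,
                 upper  = cell_means + t_crit * se)
print(round(ci, 4))
```

Because every cell shares the same standard error and degrees of freedom, all four intervals have the same half-width of about 0.46.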

Discussion

The dot plots suggest that Diff is higher with 10 ml of oil, with little overall difference between ultrasound times. The interaction plot hints at an interaction, which motivates including the interaction term in the model, although that term is not statistically significant (p = 0.109). Model assumptions were checked and found to be reasonably satisfied. The ANOVA shows a significant main effect of oil amount (p = 0.005) but no significant main effect of ultrasound time (p = 0.744). The confidence intervals for the cell means show how precisely the mean response is estimated for each treatment combination.

Conclusion

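This analysis finds that oil amount, rather than ultrasound exposure time, is associated with a difference in Diff in these data. Neither the ultrasound main effect nor its interaction with oil amount reaches significance at the 0.05 level, although the interaction plot leaves open the possibility of an interaction that this sample is too small to detect. The findings can inform future studies or practical applications aimed at improving the removal of oil from sand.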