PROBLEM:
The engineer is interested in a particular gas (C2F6) and gap (0.80 cm) and wants to test four power settings: 160 W, 180 W, 200 W, and 220 W. The engineer decided to test five wafers at each level of power, so the experiment is replicated five times; runs are made in random order.
# Create a vector for the Power levels
power <- c(160, 160, 160, 160, 160,  # Five entries for power level 160
           180, 180, 180, 180, 180,  # Five entries for power level 180
           200, 200, 200, 200, 200,  # Five entries for power level 200
           220, 220, 220, 220, 220)  # Five entries for power level 220
# Create a vector for the corresponding observations
observation <- c(575, 542, 530, 539, 570,  # Observations corresponding to power 160
                 565, 593, 590, 579, 610,  # Observations corresponding to power 180
                 600, 651, 610, 637, 629,  # Observations corresponding to power 200
                 725, 700, 715, 685, 710)  # Observations corresponding to power 220
# Combine the power and observation vectors into a data frame
data <- data.frame(Power = power, Observation = observation)
# Print the data frame to verify its structure
data
## Power Observation
## 1 160 575
## 2 160 542
## 3 160 530
## 4 160 539
## 5 160 570
## 6 180 565
## 7 180 593
## 8 180 590
## 9 180 579
## 10 180 610
## 11 200 600
## 12 200 651
## 13 200 610
## 14 200 637
## 15 200 629
## 16 220 725
## 17 220 700
## 18 220 715
## 19 220 685
## 20 220 710
The data frame has 20 rows, one per wafer. Power records the power setting in watts (160, 180, 200, or 220), and Observation records the response measured on that wafer. Each power level has five observations, one for each replicate.
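Before summarizing the data, it can help to confirm that the design is balanced. The sketch below rebuilds the same data compactly with rep() so that it runs on its own, then counts the runs per power level. The PowerLevel column is an assumption added here for illustration; it is not part of the original data frame.

```r
# Rebuild the data compactly so this block runs on its own
power <- rep(c(160, 180, 200, 220), each = 5)
observation <- c(575, 542, 530, 539, 570,
                 565, 593, 590, 579, 610,
                 600, 651, 610, 637, 629,
                 725, 700, 715, 685, 710)
data <- data.frame(Power = power, Observation = observation)

# Count the runs at each power level; a balanced design shows equal counts
runs_per_level <- table(data$Power)
print(runs_per_level)

# Inspect the column types; Power is numeric here, which suits the scatterplot
str(data)

# Assumption: a categorical copy of Power (hypothetical column name)
# for analyses that treat the power settings as groups
data$PowerLevel <- factor(data$Power)
print(levels(data$PowerLevel))
```

Keeping Power numeric and adding a separate factor column leaves both views available: numeric for a regression line, categorical for group-wise summaries.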
EXPLORATORY DATA ANALYSIS
Descriptive Statistics:
# Load the dplyr library for data manipulation
library(dplyr)
# Calculate descriptive statistics grouped by Power levels
summary_stats <- data %>%
  group_by(Power) %>%                  # Group the data by the 'Power' column
  summarize(
    Mean   = mean(Observation),        # Mean of 'Observation' for each power level
    Median = median(Observation),      # Median of 'Observation' for each power level
    Min    = min(Observation),         # Minimum 'Observation' for each power level
    Max    = max(Observation),         # Maximum 'Observation' for each power level
    SD     = sd(Observation),          # Standard deviation of 'Observation'
    Q1     = quantile(Observation, 0.25),  # 1st quartile (25th percentile)
    Q3     = quantile(Observation, 0.75),  # 3rd quartile (75th percentile)
    IQR    = IQR(Observation)          # Interquartile range (IQR)
  )
# Print the summary statistics table
print(summary_stats)
## # A tibble: 4 × 9
## Power Mean Median Min Max SD Q1 Q3 IQR
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 160 551. 542 530 575 20.0 539 570 31
## 2 180 587. 590 565 610 16.7 579 593 14
## 3 200 625. 629 600 651 20.5 610 637 27
## 4 220 707 710 685 725 15.2 700 715 15
The table above shows that the mean and median both rise with power, which points to a positive relationship between power and the observed response. At 180 W, 200 W, and 220 W the mean lies within about 4 units of the median, so those groups look roughly symmetric; at 160 W the mean (551.2) exceeds the median (542) by about 9 units, a sign of mild right skew. The standard deviations (15.2 to 20.5) and IQRs (14 to 31) vary from level to level but show no consistent trend with power, and with only five observations per level, differences of this size are weak evidence of unequal variability.
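The symmetry and variability readings can be checked numerically. The following is a minimal base-R sketch, not part of the original analysis: it computes the gap between mean and median at each level as a rough skewness indicator, and the ratio of the largest to the smallest group standard deviation. The variable names are illustrative.

```r
# Rebuild the data so this block runs on its own
power <- rep(c(160, 180, 200, 220), each = 5)
observation <- c(575, 542, 530, 539, 570,
                 565, 593, 590, 579, 610,
                 600, 651, 610, 637, 629,
                 725, 700, 715, 685, 710)
data <- data.frame(Power = power, Observation = observation)

# Group-wise mean, median, and standard deviation
group_mean   <- tapply(data$Observation, data$Power, mean)
group_median <- tapply(data$Observation, data$Power, median)
group_sd     <- tapply(data$Observation, data$Power, sd)

# A positive gap suggests right skew, a negative gap left skew
mean_minus_median <- group_mean - group_median
print(round(mean_minus_median, 1))

# Ratio of largest to smallest SD; values near 1 mean similar spread
sd_ratio <- max(group_sd) / min(group_sd)
print(round(sd_ratio, 2))
```

With five observations per group these are rough indicators only; they describe the sample and do not test anything formally.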
Boxplot of Observations by Power Level
# Create a boxplot of Observations grouped by Power levels
boxplot(Observation ~ Power, data = data,
        main = "Boxplot of Observations by Power Level",  # Title of the boxplot
        xlab = "Power (W)",      # Label for the x-axis (Power levels)
        ylab = "Observation",    # Label for the y-axis (Observation)
        col = "lightgreen",      # Set the fill color of the boxplots to light green
        border = "darkgreen")    # Set the border color of the boxplots to dark green
From the boxplot, we can see that the boxes shift upward with each
increase in power, and the interquartile boxes do not overlap. The
observations span 530–575 at 160 W and 565–610 at 180 W (a range of
45 in both cases), 600–651 at 200 W (a range of 51), and 685–725 at
220 W (a range of 40). The spread is therefore similar at all four
levels; what changes with power is the location of the observations.
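The spread at each power level can also be read off numerically rather than from the plot. This is a small sketch that computes the range (maximum minus minimum) of the observations in each group; group_range is an illustrative name.

```r
# Rebuild the data so this block runs on its own
power <- rep(c(160, 180, 200, 220), each = 5)
observation <- c(575, 542, 530, 539, 570,
                 565, 593, 590, 579, 610,
                 600, 651, 610, 637, 629,
                 725, 700, 715, 685, 710)
data <- data.frame(Power = power, Observation = observation)

# Range (max - min) of the observations at each power level
group_range <- tapply(data$Observation, data$Power,
                      function(x) diff(range(x)))
print(group_range)
```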
Scatterplot of Power vs Observation:
# Plot a scatterplot of Power vs Observation
plot(data$Power, data$Observation,
     main = "Scatterplot of Power vs Observation",  # Title of the plot
     xlab = "Power (W)",      # Label for the x-axis (Power)
     ylab = "Observation",    # Label for the y-axis (Observation)
     pch = 19,                # Use solid circles as the plotting symbol
     col = "darkgreen")       # Set the color of the points to dark green
# Add a regression line to the scatterplot
abline(lm(Observation ~ Power, data = data),  # Fit a linear model (lm) and add its line
       col = "red")                           # Set the color of the regression line to red
The scatterplot indicates a positive relationship: as the power
increases, the observation also increases. The fitted regression
line (red) follows the points reasonably well, so the relationship
is approximately linear. The mean rises more between 200 W and
220 W (about 82 units) than between the lower levels (about 36 to
38 units), which hints at some curvature.
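To put numbers on the trend line, the same linear model can be fitted and inspected directly. The sketch below extracts the slope and R-squared, then compares the observed group means with the fitted values at each power level as an informal check of linearity. It is an illustration under the assumption that a straight line is a reasonable first description, not a formal lack-of-fit test.

```r
# Rebuild the data so this block runs on its own
power <- rep(c(160, 180, 200, 220), each = 5)
observation <- c(575, 542, 530, 539, 570,
                 565, 593, 590, 579, 610,
                 600, 651, 610, 637, 629,
                 725, 700, 715, 685, 710)
data <- data.frame(Power = power, Observation = observation)

# Fit the same simple linear regression used for the trend line
fit <- lm(Observation ~ Power, data = data)
slope <- unname(coef(fit)["Power"])    # Change in observation per watt
r_squared <- summary(fit)$r.squared    # Share of variation explained by the line
print(round(c(slope = slope, r_squared = r_squared), 3))

# Observed group means versus fitted values at the four power levels
levels_w <- c(160, 180, 200, 220)
group_mean <- tapply(data$Observation, data$Power, mean)
fitted_at_levels <- predict(fit, newdata = data.frame(Power = levels_w))
mean_minus_fit <- setNames(as.numeric(group_mean) - as.numeric(fitted_at_levels),
                           names(group_mean))
print(round(mean_minus_fit, 2))
```

If the differences between group means and fitted values follow a pattern (positive at the ends, negative in the middle), that suggests curvature a straight line does not capture.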
Conclusion:
Overall, the descriptive statistics show that the central tendency of the observations rises with power, and the fitted line in the scatterplot suggests a positive, roughly linear relationship between power and observation. The boxplot shows similar variability at all four power levels rather than a spread that grows with power.