PROBLEM: An engineer is studying a particular gas (C2F6) at a fixed gap (0.80 cm) and wants to compare four power settings: 160 W, 180 W, 200 W, and 220 W. Five wafers are tested at each power level, giving five replicates per level and 20 runs in total, and the runs are made in random order.
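
The random run order described in the problem could be generated along the following lines. This is only a sketch: the seed value and the names `levels_w`, `runs`, and `run_order` are assumptions made for illustration and are not part of the original experiment.

```r
# Arbitrary seed, assumed here only so the shuffle is reproducible
set.seed(1)

# Four power settings, five wafers (replicates) per setting
levels_w <- c(160, 180, 200, 220)
runs <- data.frame(POWER = rep(levels_w, each = 5))

# Shuffle the 20 rows to obtain a completely randomized run order
run_order <- runs[sample(nrow(runs)), , drop = FALSE]
rownames(run_order) <- NULL
run_order$Run <- seq_len(nrow(run_order))

print(run_order)
```

Shuffling all 20 runs at once, rather than running each power level as a block, is what keeps time-related drift from being confounded with the power setting.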

DATA

# Creating the data frame
data <- data.frame(
  POWER = rep(c(160, 180, 200, 220), each = 5),
  Observation = c(575, 542, 530, 539, 570,
                  565, 593, 590, 579, 610,
                  600, 651, 610, 637, 629,
                  725, 700, 715, 685, 710)
)

# Display the data
print(data)
##    POWER Observation
## 1    160         575
## 2    160         542
## 3    160         530
## 4    160         539
## 5    160         570
## 6    180         565
## 7    180         593
## 8    180         590
## 9    180         579
## 10   180         610
## 11   200         600
## 12   200         651
## 13   200         610
## 14   200         637
## 15   200         629
## 16   220         725
## 17   220         700
## 18   220         715
## 19   220         685
## 20   220         710

The data frame above holds the 20 observations: five wafers (replicates) tested at each of the four power settings, 160 W, 180 W, 200 W, and 220 W.
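
As a quick check on the layout described above, the rows at each power level can be counted. This is a minimal sketch; the data frame is rebuilt under the hypothetical name `d` so the snippet runs on its own.

```r
# Rebuild the data so this snippet is self-contained (same values as above)
d <- data.frame(
  POWER = rep(c(160, 180, 200, 220), each = 5),
  Observation = c(575, 542, 530, 539, 570,
                  565, 593, 590, 579, 610,
                  600, 651, 610, 637, 629,
                  725, 700, 715, 685, 710)
)

# Number of observations at each power level
counts <- table(d$POWER)
print(counts)

# TRUE when every level has the same number of replicates (balanced design)
balanced <- length(unique(as.vector(counts))) == 1
print(balanced)
```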

DESCRIPTIVE STATISTICS

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
# Summary statistics for each power level: mean, median, standard deviation, minimum, maximum, 1st and 3rd quartiles, and interquartile range
summary_stats <- data %>%
  group_by(POWER) %>%
  summarise(
    Mean = mean(Observation),
    Median = median(Observation),
    Std_Dev = sd(Observation),
    Min = min(Observation),
    Max = max(Observation),
    Q1 = quantile(Observation, 0.25),
    Q3 = quantile(Observation, 0.75),
    IQR = IQR(Observation)
  )

print(summary_stats)
## # A tibble: 4 × 9
##   POWER  Mean Median Std_Dev   Min   Max    Q1    Q3   IQR
##   <dbl> <dbl>  <dbl>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1   160  551.    542    20.0   530   575   539   570    31
## 2   180  587.    590    16.7   565   610   579   593    14
## 3   200  625.    629    20.5   600   651   610   637    27
## 4   220  707     710    15.2   685   725   700   715    15

The table summarizes the observations at each power level (160 W, 180 W, 200 W, and 220 W). Each row corresponds to one power level and reports the mean, median, standard deviation (SD), minimum, maximum, first quartile (Q1), third quartile (Q3), and interquartile range (IQR). Both measures of central tendency rise steadily with power: the mean goes from about 551 at 160 W to 707 at 220 W, and the median from 542 to 710. The minimum, maximum, and quartiles shift upward as well. The within-group variability, in contrast, shows no clear pattern: the SD ranges from 15.2 to 20.5 and the IQR from 14 to 31, with the 180 W and 220 W groups less spread out than the 160 W and 200 W groups. Higher power is therefore associated with higher observed values, but not with a systematic change in their spread.
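
To put numbers on how much the mean changes from one power level to the next, the group means can be differenced. The sketch below uses base R only and rebuilds the data under the hypothetical name `d`; the names `group_means`, `mean_steps`, and `group_sds` are likewise illustrative.

```r
# Rebuild the data so this snippet is self-contained (same values as above)
d <- data.frame(
  POWER = rep(c(160, 180, 200, 220), each = 5),
  Observation = c(575, 542, 530, 539, 570,
                  565, 593, 590, 579, 610,
                  600, 651, 610, 637, 629,
                  725, 700, 715, 685, 710)
)

# Mean and standard deviation of the observations at each power level
by_power <- split(d$Observation, d$POWER)
group_means <- sapply(by_power, mean)
group_sds <- sapply(by_power, sd)

# Change in the mean for each 20 W increase in power
mean_steps <- diff(group_means)

print(round(group_means, 1))
print(round(mean_steps, 1))
print(round(group_sds, 1))
```

The mean rises by 36.2 and 38.0 over the first two 20 W steps and by 81.6 between 200 W and 220 W, so the increase is not uniform across the range.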

BOX-PLOT

# Boxplot of Observations by Power Level

boxplot(Observation ~ POWER, data = data,
        main = "Boxplot of Observations by Power Level",
        xlab = "Power (W)",
        ylab = "Observation",
        col = "lightgreen", border = "purple")

The boxplot shows the distribution of the observations at each power level (160 W, 180 W, 200 W, and 220 W). In each box, the central box spans the interquartile range (IQR), the line inside it marks the median, and the whiskers extend to the most extreme observations that lie within 1.5 times the IQR of the box. As power increases from 160 W to 220 W, the boxes shift consistently upward and the median rises at every step, indicating a clear positive trend. The spread does not change steadily with power: the boxes at 180 W and 220 W (IQR 14 and 15) are narrower than those at 160 W and 200 W (IQR 31 and 27), so there is no consistent trend in variability.
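
The 1.5 × IQR whisker rule can be checked directly with `boxplot.stats()`, which returns any points falling outside the whiskers in its `out` component. This is a minimal sketch; the data are rebuilt under the hypothetical name `d`, and `outliers` and `n_outliers` are illustrative names.

```r
# Rebuild the data so this snippet is self-contained (same values as above)
d <- data.frame(
  POWER = rep(c(160, 180, 200, 220), each = 5),
  Observation = c(575, 542, 530, 539, 570,
                  565, 593, 590, 579, 610,
                  600, 651, 610, 637, 629,
                  725, 700, 715, 685, 710)
)

# For each power level, collect points beyond 1.5 * IQR from the box
outliers <- lapply(split(d$Observation, d$POWER),
                   function(x) boxplot.stats(x, coef = 1.5)$out)

# Number of flagged points per power level
n_outliers <- lengths(outliers)
print(n_outliers)
```

No observation is flagged at any power level, so each whisker in the plot ends at the group minimum or maximum.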

SCATTERPLOT

# Scatterplot of Observations by Power Level
plot(Observation ~ POWER, data = data,
     main = "Scatterplot of Observations by Power Level",
     xlab = "Power (W)",
     ylab = "Observation",
     pch = 19, col = "purple")

abline(lm(Observation ~ POWER, data = data), col = "lightgreen")

The scatterplot shows a positive association between power level (in watts) and the observed values. As power increases from 160 W to 220 W, the observed values rise, and the fitted least-squares line, which slopes upward through the points, reinforces this pattern. Because power takes only four distinct values, the points fall in four vertical columns; the line summarizes the overall trend across those groups rather than describing behavior between the tested settings.
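
The trend line drawn by `abline()` can also be summarized numerically through its slope and the correlation between the two variables. The sketch below is illustrative only: it rebuilds the data under the hypothetical name `d` and treats power as a numeric predictor, which is an assumption of the straight-line summary rather than something the data require.

```r
# Rebuild the data so this snippet is self-contained (same values as above)
d <- data.frame(
  POWER = rep(c(160, 180, 200, 220), each = 5),
  Observation = c(575, 542, 530, 539, 570,
                  565, 593, 590, 579, 610,
                  600, 651, 610, 637, 629,
                  725, 700, 715, 685, 710)
)

# Least-squares line: the same model passed to abline() in the plot
fit <- lm(Observation ~ POWER, data = d)
slope <- unname(coef(fit)["POWER"])

# Pearson correlation between power and the observations
r <- cor(d$POWER, d$Observation)

print(coef(fit))
print(round(r, 2))
```

The fitted slope is about 2.53 units per watt and the correlation is about 0.94. A straight line is only a rough summary here, since the group means increase more sharply between 200 W and 220 W than between the lower settings.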

CONCLUSION

This exploratory data analysis indicates a positive relationship between power setting and observed value: each increase in power (160 W, 180 W, 200 W, 220 W) is accompanied by higher observations, with the largest increase between 200 W and 220 W. The plots do not show variability growing with power; the within-group spread differs from level to level (standard deviations between about 15 and 21) without a consistent trend.