Cadutdut_STAT55A

library(shiny)

Problem: An engineer is studying a particular gas (C2F6) and gap (0.80 cm) and wants to test four power settings: 160 W, 180 W, 200 W, and 220 W. Five wafers are tested at each power setting, so the experiment is replicated 5 times, and the runs are made in random order.
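The layout described above (four power settings, five wafers per setting, runs in random order) can be sketched in R. This is only an illustration of one way to randomize the run order, not the engineer's actual procedure; the names `levels_w`, `n_rep`, `design`, and `RunOrder`, and the seed value, are assumptions introduced here.

```r
# Hypothetical sketch of a randomized run order for the 4 x 5 layout
set.seed(55)                          # arbitrary seed, only for reproducibility
levels_w <- c(160, 180, 200, 220)     # power settings in watts
n_rep    <- 5                         # wafers tested at each power setting

# One row per run, five rows per power setting
design <- data.frame(Power = rep(levels_w, each = n_rep))

# Assign each run a position in a random permutation of 1..20
design$RunOrder <- sample(nrow(design))

# Sort so the rows appear in the order the runs would be made
design <- design[order(design$RunOrder), ]
head(design)
```

Randomizing over all 20 runs at once, rather than within each power setting, keeps any drift over time from lining up with a single power level.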

# Encoding the data into R

# Creating a vector 'power' that contains power levels in watts
power <- c(160, 160, 160, 160, 160,  # Five entries with power level 160 watts
           180, 180, 180, 180, 180,  # Five entries with power level 180 watts
           200, 200, 200, 200, 200,  # Five entries with power level 200 watts
           220, 220, 220, 220, 220)  # Five entries with power level 220 watts

# Creating a vector 'observation' that contains the observed measurements corresponding to each power level
observation <- c(575, 542, 530, 539, 570,  # Observations for power level 160 watts
                 565, 593, 590, 579, 610,  # Observations for power level 180 watts
                 600, 651, 610, 637, 629,  # Observations for power level 200 watts
                 725, 700, 715, 685, 710)  # Observations for power level 220 watts

# Combining the vectors 'power' and 'observation' into a data frame 'data'
data <- data.frame(Power = power, Observation = observation)

# Displaying the data frame
data

##    Power Observation
## 1    160         575
## 2    160         542
## 3    160         530
## 4    160         539
## 5    160         570
## 6    180         565
## 7    180         593
## 8    180         590
## 9    180         579
## 10   180         610
## 11   200         600
## 12   200         651
## 13   200         610
## 14   200         637
## 15   200         629
## 16   220         725
## 17   220         700
## 18   220         715
## 19   220         685
## 20   220         710

The data frame holds 20 observations: five measurements of the response at each of the four power settings (160 W, 180 W, 200 W, and 220 W). Power is the controlled factor, set in watts; Observation is the response measured on each wafer.

Exploratory Data Analysis

# Descriptive Statistics

# Loading the 'dplyr' library for data manipulation and summarization
library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

# Calculating summary statistics for each power level in the data
summary_stats <- data %>%
  
# Grouping the data by the 'Power' variable
  group_by(Power) %>%
  
# Summarizing the data by calculating various statistics for 'Observation' within each power group
  summarize(
    
# Calculating the mean of the observations for each power level
    Mean = mean(Observation),
    
# Calculating the median of the observations for each power level
    Median = median(Observation),
    
# Finding the minimum observation value for each power level
    Min = min(Observation),
    
# Finding the maximum observation value for each power level
    Max = max(Observation),
    
# Calculating the standard deviation of the observations for each power level
    SD = sd(Observation),
    
# Calculating the first quartile (25th percentile) of the observations for each power level
    Q1 = quantile(Observation, 0.25),
    
# Calculating the third quartile (75th percentile) of the observations for each power level
    Q3 = quantile(Observation, 0.75),
    
# Calculating the interquartile range (IQR) of the observations for each power level
    IQR = IQR(Observation)
  )

# Printing the summary statistics
print(summary_stats)

## # A tibble: 4 × 9
##   Power  Mean Median   Min   Max    SD    Q1    Q3   IQR
##   <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1   160  551.    542   530   575  20.0   539   570    31
## 2   180  587.    590   565   610  16.7   579   593    14
## 3   200  625.    629   600   651  20.5   610   637    27
## 4   220  707     710   685   725  15.2   700   715    15

As power increases, both the mean and the median observation rise, from a mean of 551 at 160 W to 707 at 220 W, so the response is positively associated with power. At 180 W, 200 W, and 220 W the mean and median lie within about 4 units of each other, which suggests roughly symmetric observations; at 160 W the mean (551) exceeds the median (542) by about 9 units, which points to a slight right skew. With only five observations per level, these readings of shape are tentative. Spread is broadly similar across levels: the standard deviation ranges from 15.2 (220 W) to 20.5 (200 W) and the IQR from 14 (180 W) to 31 (160 W), with no consistent trend as power increases.
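The comparison of mean against median, and of spread across power levels, can be reproduced with a few lines of base R. The sketch below re-enters the data so that it runs on its own; `group_mean`, `group_median`, `gap`, and `group_sd` are names introduced here for illustration.

```r
# Data re-entered from the table above so the sketch is self-contained
power <- rep(c(160, 180, 200, 220), each = 5)
observation <- c(575, 542, 530, 539, 570,
                 565, 593, 590, 579, 610,
                 600, 651, 610, 637, 629,
                 725, 700, 715, 685, 710)

# Per-level mean, median, and standard deviation
group_mean   <- tapply(observation, power, mean)
group_median <- tapply(observation, power, median)
group_sd     <- tapply(observation, power, sd)

# Mean minus median: a positive gap hints at right skew, a negative gap at left skew
gap <- group_mean - group_median

round(cbind(Mean = group_mean, Median = group_median, Gap = gap, SD = group_sd), 1)
```

A gap that is small relative to the standard deviation gives little evidence of skewness, especially with only five values per level.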

# Boxplot of Observations by Power level

# Creating a boxplot to visualize the distribution of observations for each power level
boxplot(data$Observation ~ data$Power,
  
  # Setting the main title of the boxplot
  main = "Boxplot of Observations by Power Level",
  
  # Labeling the x-axis with the name 'Power (W)' to represent power levels in watts
  xlab = "Power (W)",
  
  # Labeling the y-axis with the name 'Observation' to represent the observed measurements
  ylab = "Observation",
  
  # Setting the color of the boxplot to 'darkblue'
  col = "darkblue",
  
  # Setting the color of the box borders to 'lightblue'
  border = "lightblue"
)

The boxplot shows the observations shifting upward with power: 530–575 at 160 W, 565–610 at 180 W, 600–651 at 200 W, and 685–725 at 220 W. The ranges are similar in width (45, 45, 51, and 40 units, respectively), so the spread is slightly widest at 200 W and narrowest at 220 W. The 220 W observations do not overlap with those of any lower power level.
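The widths read off the boxplot can be checked numerically. The following sketch, which re-enters the data so that it runs alone, computes the range at each power level and asks `boxplot()` for its statistics without drawing anything; `level_range` and `bp` are names introduced here.

```r
# Data re-entered from the table above so the sketch is self-contained
power <- rep(c(160, 180, 200, 220), each = 5)
observation <- c(575, 542, 530, 539, 570,
                 565, 593, 590, 579, 610,
                 600, 651, 610, 637, 629,
                 725, 700, 715, 685, 710)

# Width of the observed range (max minus min) at each power level
level_range <- tapply(observation, power, function(x) diff(range(x)))
level_range   # 45, 45, 51, 40 for 160, 180, 200, 220 W

# Boxplot statistics without plotting:
# rows of bp$stats are lower whisker, lower hinge, median, upper hinge, upper whisker
bp <- boxplot(observation ~ power, plot = FALSE)
bp$stats
bp$out        # observations beyond the whiskers (none for these data)
```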

# Scatterplot of Observations vs Power

# Creating a scatterplot to visualize the relationship between power and observations
plot(data$Power, data$Observation,
  
  # Setting the main title of the scatterplot
  main = "Scatterplot of Power vs Observation",
  
  # Labeling the x-axis with the name 'Power (W)' to represent power levels in watts
  xlab = "Power (W)", 
  
  # Labeling the y-axis with the name 'Observation' to represent the observed measurements
  ylab = "Observation",
  
  # Setting the plotting character to solid circles
  pch = 19, 
  
  # Setting the color of the points to 'darkgreen'
  col = "darkgreen"
)

# Adding a regression line to the scatterplot to represent the linear relationship between power and observations
abline(lm(Observation ~ Power, data = data), col = "red")

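The scatterplot shows a positive relationship: as power increases, the observations also tend to rise. The fitted regression line (in red) indicates that the relationship is roughly linear, although the increase in the mean from 200 W to 220 W (about 82 units) is about twice the increase between the lower adjacent levels (about 36 and 38 units).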
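To put numbers on the trend line, the same `lm()` fit can be inspected directly. This is a minimal sketch with the data re-entered so that it runs alone; `fit` and `increments` are names introduced here, and the fit is the simple straight-line model drawn in the scatterplot, not a claim that a straight line is the best model.

```r
# Data re-entered from the table above so the sketch is self-contained
power <- rep(c(160, 180, 200, 220), each = 5)
observation <- c(575, 542, 530, 539, 570,
                 565, 593, 590, 579, 610,
                 600, 651, 610, 637, 629,
                 725, 700, 715, 685, 710)

# Straight-line fit of observation on power (the line drawn by abline())
fit <- lm(observation ~ power)
coef(fit)                  # intercept about 137.62, slope about 2.527 per watt

# Proportion of variation explained by the straight line
summary(fit)$r.squared

# Change in the level means between adjacent power settings
increments <- diff(tapply(observation, power, mean))
increments                 # 36.2, 38.0, 81.6
```

Unequal increments between equally spaced power settings are a simple sign that the response may curve upward at the highest setting.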

Conclusion: Overall, the descriptive statistics indicate that as power increases, the central tendency of the observations also rises. The scatterplot’s trend line suggests a positive, roughly linear relationship between power and observation. Variability, by contrast, does not increase with power: the boxplot and the standard deviations show a similar spread at all four levels, with the smallest standard deviation at 220 W.

Ferbon Cadutdut

2024-08-30