Author: Tafadzwa Banga
Course: IE 5344
Date: 16/03/2025
Semiconductor manufacturing relies heavily on precise control of thin film deposition processes. In this project, we investigate the relationship between film thickness and electrical resistance in semiconductor wafers. The goal is to determine whether film thickness is a strong predictor of resistance, which has implications for process control and quality improvement in semiconductor fabrication.
The dataset consists of 100 observations collected from a wafer fabrication process. The variables include:
- Film Thickness (X): Measured in nanometers (nm).
- Electrical Resistance (Y): Measured in milliohms (mΩ), as indicated by the column name Electrical_Resistance_mOhm.
The dataset is available as a CSV file at the GitHub URL used in the download step below.
We begin by examining the distribution of electrical resistance using a histogram and box plot.
# Load necessary libraries
library(ggplot2)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
# URL of the dataset
url <- "https://raw.githubusercontent.com/tmatis12/datafiles/refs/heads/main/semiconductor_SLR_dataset.csv"
# Download the data from the URL
download.file(url, destfile = "semiconductor_SLR_dataset.csv")
# Load the dataset
semiconductor_SLR_dataset <- read.csv("semiconductor_SLR_dataset.csv")
# View the first rows
head(semiconductor_SLR_dataset)
## Film_Thickness_nm Electrical_Resistance_mOhm
## 1 87.45 15.118
## 2 145.07 23.601
## 3 123.20 19.904
## 4 109.87 16.103
## 5 65.60 12.901
## 6 65.60 13.278
# Histogram of electrical resistance
ggplot(semiconductor_SLR_dataset, aes(x = Electrical_Resistance_mOhm)) +
  geom_histogram(binwidth = 5, fill = "blue", color = "black") +
  labs(title = "Histogram of Electrical Resistance", x = "Resistance (mΩ)", y = "Frequency")
# Box plot of electrical resistance
ggplot(semiconductor_SLR_dataset, aes(y = Electrical_Resistance_mOhm)) +
  geom_boxplot(fill = "orange") +
  labs(title = "Box Plot of Electrical Resistance", y = "Resistance (mΩ)")
# Scatterplot of resistance against film thickness
ggplot(semiconductor_SLR_dataset, aes(x = Film_Thickness_nm, y = Electrical_Resistance_mOhm)) +
  geom_point(color = "darkgreen") +
  labs(title = "Scatterplot of Resistance vs. Film Thickness", x = "Film Thickness (nm)", y = "Resistance (mΩ)")
For the simple linear regression, we fit a model of electrical resistance as a function of film thickness. The goal is to quantify how much of the variation in resistance the model explains, which helps validate its usefulness.
# Fit the linear regression model
model <- lm(Electrical_Resistance_mOhm ~ Film_Thickness_nm, data = semiconductor_SLR_dataset)
summary(model)
##
## Call:
## lm(formula = Electrical_Resistance_mOhm ~ Film_Thickness_nm,
## data = semiconductor_SLR_dataset)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.27640 -0.75508 -0.08631 0.70422 2.69671
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.870489 0.356848 13.65 <2e-16 ***
## Film_Thickness_nm 0.122954 0.003518 34.95 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.041 on 98 degrees of freedom
## Multiple R-squared: 0.9257, Adjusted R-squared: 0.925
## F-statistic: 1221 on 1 and 98 DF, p-value: < 2.2e-16
The positive slope indicates that as film thickness increases, electrical resistance increases. The R-squared value suggests that 92.57% of the variation in resistance is explained by film thickness.
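As a quick consistency check, the statistics in the summary output can be reproduced from one another. The sketch below uses only the numbers printed above; the variable names are illustrative and are not part of the original analysis.

```r
# Values copied from the summary(model) output above
slope_est <- 0.122954   # estimate for Film_Thickness_nm
slope_se  <- 0.003518   # its standard error
df_resid  <- 98         # residual degrees of freedom

# t statistic for the slope: estimate divided by its standard error
t_slope <- slope_est / slope_se

# In simple linear regression, the F statistic equals the squared slope t statistic
f_stat <- t_slope^2

# R-squared can be recovered from F and the residual degrees of freedom
r_squared <- f_stat / (f_stat + df_resid)

# Two-sided p-value for the slope
p_slope <- 2 * pt(abs(t_slope), df = df_resid, lower.tail = FALSE)

round(c(t = t_slope, F = f_stat, R2 = r_squared), 4)
```

Up to rounding of the printed inputs, the results should agree with the reported t value, F-statistic, and multiple R-squared.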
The regression equation for this model is:
\[ \text{Resistance} = \beta_0 + \beta_1 \times \text{Thickness} + \varepsilon \]
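As a worked example, substituting the estimates from the summary output (rounded to three decimals here) gives the fitted line:

\[ \widehat{\text{Resistance}} = 4.870 + 0.123 \times \text{Thickness} \]

For a 100 nm film, the unrounded coefficients give \( 4.870489 + 0.122954 \times 100 \approx 17.166 \), in the units of the response column.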
To check the linear regression assumptions of normally distributed residuals and homoscedasticity, we start by plotting the regression line over the data.
ggplot(semiconductor_SLR_dataset, aes(x = Film_Thickness_nm, y = Electrical_Resistance_mOhm)) +
geom_point() +
stat_smooth(method = "lm", col = "red")
## `geom_smooth()` using formula = 'y ~ x'
We check the normality of residuals using a Q-Q plot, and the constant-variance assumption using a residuals-versus-fitted plot.
qqnorm(resid(model))
qqline(resid(model))
plot(model, which = 1)
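The graphical checks above can be complemented with formal tests. The following is a minimal sketch, not part of the original analysis: it runs a Shapiro-Wilk test for normality and a studentized Breusch-Pagan style check for constant variance using base R only. So that it runs on its own, it uses simulated stand-in data (`sim`, `sim_model`) whose parameters are assumptions loosely based on the fitted model; in practice the same calls would be applied to `model` and `semiconductor_SLR_dataset`.

```r
# Stand-in data so the sketch runs on its own (hypothetical values that only
# loosely mimic the fitted model); in the analysis, use semiconductor_SLR_dataset.
set.seed(1)
sim <- data.frame(Film_Thickness_nm = runif(100, 50, 150))
sim$Electrical_Resistance_mOhm <- 4.87 + 0.123 * sim$Film_Thickness_nm +
  rnorm(100, sd = 1.04)
sim_model <- lm(Electrical_Resistance_mOhm ~ Film_Thickness_nm, data = sim)

# Normality of residuals: Shapiro-Wilk test (null hypothesis: residuals are normal)
sw <- shapiro.test(resid(sim_model))

# Constant variance: regress the squared residuals on the predictor and use
# n * R^2 as a chi-squared statistic with 1 degree of freedom
aux <- lm(resid(sim_model)^2 ~ Film_Thickness_nm, data = sim)
bp_stat <- nrow(sim) * summary(aux)$r.squared
bp_p <- pchisq(bp_stat, df = 1, lower.tail = FALSE)

c(shapiro_p = sw$p.value, bp_p = bp_p)
```

Large p-values in both tests would be consistent with the assumptions, while small ones would suggest looking more closely at the diagnostic plots.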
We calculate the 95% confidence interval (CI) and prediction interval (PI) for resistance at a film thickness of 100 nm.
The analysis includes a scatterplot with the regression line, CI, and PI.
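One way such a scatterplot with both bands could be drawn is sketched below. This is an illustration rather than the original plotting code: it evaluates `predict()` on a grid of thickness values and draws the two intervals as ribbons. The data are simulated stand-ins (`sim`, `sim_model`, with assumed parameters) so the block runs by itself; with the real data, `sim` would be replaced by `semiconductor_SLR_dataset` and `sim_model` by `model`.

```r
library(ggplot2)

# Stand-in data (hypothetical values); replace with semiconductor_SLR_dataset
set.seed(1)
sim <- data.frame(Film_Thickness_nm = runif(100, 50, 150))
sim$Electrical_Resistance_mOhm <- 4.87 + 0.123 * sim$Film_Thickness_nm +
  rnorm(100, sd = 1.04)
sim_model <- lm(Electrical_Resistance_mOhm ~ Film_Thickness_nm, data = sim)

# Evaluate both intervals on a grid of thickness values
grid <- data.frame(Film_Thickness_nm = seq(min(sim$Film_Thickness_nm),
                                           max(sim$Film_Thickness_nm),
                                           length.out = 50))
ci <- predict(sim_model, grid, interval = "confidence", level = 0.95)
pi_band <- predict(sim_model, grid, interval = "prediction", level = 0.95)
bands <- data.frame(grid, fit = ci[, "fit"],
                    ci_lwr = ci[, "lwr"], ci_upr = ci[, "upr"],
                    pi_lwr = pi_band[, "lwr"], pi_upr = pi_band[, "upr"])

# Scatterplot with the fitted line, the CI band (darker) and the PI band (lighter)
p <- ggplot(sim, aes(x = Film_Thickness_nm, y = Electrical_Resistance_mOhm)) +
  geom_ribbon(data = bands, aes(x = Film_Thickness_nm, ymin = pi_lwr, ymax = pi_upr),
              inherit.aes = FALSE, fill = "grey80", alpha = 0.5) +
  geom_ribbon(data = bands, aes(x = Film_Thickness_nm, ymin = ci_lwr, ymax = ci_upr),
              inherit.aes = FALSE, fill = "grey50", alpha = 0.5) +
  geom_point(color = "darkgreen") +
  geom_line(data = bands, aes(x = Film_Thickness_nm, y = fit),
            inherit.aes = FALSE, color = "red") +
  labs(title = "Regression Line with 95% CI and PI",
       x = "Film Thickness (nm)", y = "Resistance")
p
```

The prediction band is always wider than the confidence band because it also accounts for the scatter of individual observations around the mean.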
new_data <- data.frame(Film_Thickness_nm = 100)
predict(model, new_data, interval = "confidence")
## fit lwr upr
## 1 17.16585 16.95815 17.37354
predict(model, new_data, interval = "prediction")
## fit lwr upr
## 1 17.16585 15.08893 19.24276
At 100 nm thickness, the expected resistance is 17.16585 mΩ, with a 95% confidence interval of [16.95815, 17.37354] mΩ for the mean response. The 95% prediction interval is [15.08893, 19.24276] mΩ, which is the range within which a single future observation at this thickness is expected to fall.
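The two intervals are linked: the prediction interval adds the residual variance to the uncertainty of the estimated mean. The sketch below reconstructs the prediction half-width from the confidence interval and the residual standard error, using only numbers printed in the output above. Variable names are illustrative, and a small discrepancy is expected because the residual standard error is printed rounded to 1.041.

```r
# Values copied from the outputs above
fit    <- 17.16585   # predicted resistance at 100 nm
ci_upr <- 17.37354   # upper 95% confidence limit
pi_upr <- 19.24276   # upper 95% prediction limit
sigma  <- 1.041      # residual standard error (rounded in the printout)
df     <- 98         # residual degrees of freedom

# Two-sided 95% critical value of the t distribution
t_crit <- qt(0.975, df)

# Standard error of the estimated mean response, backed out of the CI
se_mean <- (ci_upr - fit) / t_crit

# A new observation adds the residual variance to the variance of the mean
se_pred <- sqrt(se_mean^2 + sigma^2)
pi_half_width <- t_crit * se_pred

c(reported = pi_upr - fit, reconstructed = pi_half_width)
```

The reconstructed half-width should match the reported one to within rounding, which shows that almost all of the prediction interval's width comes from the residual scatter rather than from uncertainty in the fitted line.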
The results from the simple linear regression model indicate that film thickness is a statistically significant predictor of electrical resistance: as film thickness increases, electrical resistance increases. The R-squared value of 0.9257 indicates that film thickness alone explains approximately 92.57% of the variability in electrical resistance. This is a strong result, suggesting that film thickness is a key factor influencing resistance in this context.
The statistical significance of the model is further confirmed by the p-value of the slope coefficient (p < 2e-16), which is far below common significance thresholds (e.g., 0.001, 0.01, 0.05, and 0.1). This provides strong evidence against the null hypothesis of a zero slope, supporting a linear relationship between film thickness and resistance.
Additionally, the 95% confidence and prediction intervals calculated for a film thickness of 100 nm, a critical value in semiconductor manufacturing, provide valuable insights for process control. These intervals suggest that the process is predictable at this thickness, with resistance values expected to fall within a fairly narrow range. This is crucial for ensuring that semiconductor devices meet design specifications and perform reliably.
From a theoretical perspective, these findings do not quite align with the fundamental principles of electrical conduction at the macro scale, where thicker films generally provide more pathways for current flow and therefore lower resistance. The discrepancy may be understandable, since materials can behave differently at the nanoscale. Within this data set, however, the positive relationship is consistent across the observed thickness range, which supports the reliability of the linear model as an empirical description of the process.
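For reference, the macroscale expectation mentioned above can be written for a uniform rectangular film. The symbols are introduced here for illustration and do not appear in the data set: \( \rho \) is the resistivity, \( L \) and \( W \) are the length and width of the film, and \( t \) is its thickness.

\[ R = \frac{\rho L}{W t} \]

Under this relation, resistance is inversely proportional to thickness for fixed \( \rho \), \( L \), and \( W \), which is the opposite direction to the positive slope estimated from the data.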