Semiconductor manufacturing is an extremely precise process; even the smallest deviations can significantly impact product performance. One critical step in fabricating semiconductor devices is depositing thin films on silicon wafers. These films serve multiple purposes, such as insulating layers, conductive interconnects, or diffusion barriers. Therefore, the quality and uniformity of these films play an important role in ensuring that semiconductor devices function correctly and reliably. Film thickness is typically measured in nanometers (nm), and it influences the film's electrical properties, such as resistance, capacitance, and leakage current. These properties must meet design specifications.
In this analysis, the film deposition method is sputter deposition, which is commonly used to deposit metal films in semiconductor manufacturing. Small deviations in deposition rate, chamber pressure, or plasma intensity can lead to varying film thickness, which in turn affects electrical resistance. The objective is to determine whether a linear relationship exists between film thickness (x) and electrical resistance (y) in the collected data.
Exploratory data analysis is used to understand the dataset before modeling. Histograms, box plots, and scatter plots reveal the data’s characteristics, patterns, and relationships. Histograms and box plots display the distribution of a single variable, while a scatter plot shows the relationship between two variables.
A histogram and box plot were used to examine the distribution of the response variable, Electrical Resistance (mOhm). The histogram indicates that the data are right-skewed, with most observations concentrated at the lower electrical resistance values. The box plot confirms the right skew, since its upper whisker is longer than its lower whisker. The scatter plot indicates a positive correlation between Film Thickness (nm) and Electrical Resistance (mOhm): as film thickness increases, electrical resistance also increases. The points also follow a linear pattern, which suggests that a linear regression model could describe the relationship between the variables.
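The three plots described above could be produced along the following lines. This is a minimal sketch, not the report's actual code: the column names are taken from the model output in this report, but the data frame `film` and its simulated values are hypothetical stand-ins, since the original dataset is not reproduced here.

```r
# Hypothetical stand-in data: the report's dataset is not reproduced here.
set.seed(1)
film <- data.frame(Film_Thickness_nm = runif(100, 50, 150))
film$Electrical_Resistance_mOhm <- 5 + 0.12 * film$Film_Thickness_nm +
  rnorm(100, sd = 1)

op <- par(mfrow = c(1, 3))  # three panels side by side
hist(film$Electrical_Resistance_mOhm,
     main = "Histogram", xlab = "Electrical Resistance (mOhm)")
boxplot(film$Electrical_Resistance_mOhm,
        main = "Box plot", ylab = "Electrical Resistance (mOhm)")
plot(Electrical_Resistance_mOhm ~ Film_Thickness_nm, data = film,
     main = "Scatter plot",
     xlab = "Film Thickness (nm)", ylab = "Electrical Resistance (mOhm)")
par(op)  # restore the previous plotting layout

# Numeric companion to the scatter plot: Pearson correlation coefficient
r <- cor(film$Film_Thickness_nm, film$Electrical_Resistance_mOhm)
```

A correlation coefficient close to 1 supports what the scatter plot shows visually, although it does not replace the residual checks performed later.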
Least Squares Estimation (LSE) is used in regression analysis to find the best-fitting line for the model below by minimizing the sum of squared differences between the observed values of the response variable and the values predicted by the line.
\[ y=\beta_{0}+\beta_{1}x+\varepsilon \]
where
\[ y=\text{response variable, Electrical Resistance (mOhm)} \]
\[ x=\text{predictor variable, Film Thickness (nm)} \]
\[ \beta_{0}=\text{intercept (value of } y \text{ when } x=0 \text{)} \]
\[ \beta_{1}=\text{slope (change in } y \text{ for a one-unit increase in } x \text{)} \]
\[ \varepsilon=\text{error term (random variation not explained by } x \text{)} \]
The least squares estimates of the intercept and slope minimize the sum of squared errors:
\[ SSE=\sum_{i=1}^{n}(y_{i}-\hat{y}_{i})^{2} \]
where
\[ y_{i}=\text{observed values} \]
\[ \hat{y}_{i}=\text{predicted values from the regression line} \]
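For reference, minimizing the sum of squared residuals with respect to the intercept and slope gives the standard closed-form estimates for simple linear regression. The symbols \(\bar{x}\) and \(\bar{y}\) (sample means) and \(n\) (number of observations) are introduced here and do not appear elsewhere in this report.
\[ \hat{\beta}_{1}=\frac{\sum_{i=1}^{n}(x_{i}-\bar{x})(y_{i}-\bar{y})}{\sum_{i=1}^{n}(x_{i}-\bar{x})^{2}} \]
\[ \hat{\beta}_{0}=\bar{y}-\hat{\beta}_{1}\bar{x} \]
\[ \hat{y}_{i}=\hat{\beta}_{0}+\hat{\beta}_{1}x_{i} \]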
The lm() function in R fits a linear model of Electrical Resistance on Film Thickness by minimizing the sum of squared residuals.
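A minimal sketch of such a fit is shown here. The formula matches the call printed in this report, but the simulated `data` values are a hypothetical stand-in for the real measurements. The sketch also computes the intercept and slope by hand from the closed-form least squares formulas, to show that lm() returns the same estimates.

```r
# Hypothetical stand-in data (the report's dataset is not reproduced here)
set.seed(42)
data <- data.frame(Film_Thickness_nm = runif(100, 50, 150))
data$Electrical_Resistance_mOhm <- 5 + 0.12 * data$Film_Thickness_nm +
  rnorm(100, sd = 1)

# Fit the simple linear regression by least squares
model <- lm(Electrical_Resistance_mOhm ~ Film_Thickness_nm, data = data)

# Closed-form least squares estimates, computed by hand for comparison
x <- data$Film_Thickness_nm
y <- data$Electrical_Resistance_mOhm
slope_hat <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
intercept_hat <- mean(y) - slope_hat * mean(x)

coef(model)                  # estimates from lm()
c(intercept_hat, slope_hat)  # the same values from the formulas
summary(model)               # coefficient table, R-squared, F-statistic
```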
The summary() function presents an overview of the statistical properties of the fitted model that indicate how well Film Thickness predicts Electrical Resistance. Key properties include the estimates of the intercept and slope, the p-value for each coefficient, and the multiple R-squared. A p-value is the probability of obtaining a result at least as extreme as the one observed, assuming the null hypothesis is true. For each coefficient, the null hypothesis is that its true value is zero; for the slope, this means there is no linear relationship between the variables in the population. The multiple R-squared value measures the proportion of variance in the dependent variable explained by the independent variable. The summary statistics show that both p-values are less than 0.05, providing sufficient evidence to reject the null hypotheses, and the multiple R-squared value is 0.93. Although these metrics indicate that the linear regression model fits well, model adequacy checks must be performed to further assess its reliability.
##
## Call:
## lm(formula = Electrical_Resistance_mOhm ~ Film_Thickness_nm,
## data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.27640 -0.75508 -0.08631 0.70422 2.69671
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.870489 0.356848 13.65 <2e-16 ***
## Film_Thickness_nm 0.122954 0.003518 34.95 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.041 on 98 degrees of freedom
## Multiple R-squared: 0.9257, Adjusted R-squared: 0.925
## F-statistic: 1221 on 1 and 98 DF, p-value: < 2.2e-16
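The quantities in this output are related by simple arithmetic, which the following sketch reproduces using only the numbers printed above. Each t value is the estimate divided by its standard error; with a single predictor, the F-statistic is the square of the slope's t value; and the multiple R-squared can be recovered from F and the residual degrees of freedom. The variable names are illustrative.

```r
# Values copied from the printed summary above
est <- c(intercept = 4.870489, slope = 0.122954)
se  <- c(intercept = 0.356848, slope = 0.003518)
df_resid <- 98

# t statistic = estimate / standard error
t_value <- est / se
# Two-sided p-value from the t distribution with 98 degrees of freedom
p_value <- 2 * pt(abs(t_value), df_resid, lower.tail = FALSE)

# With one predictor, F equals the square of the slope's t statistic
f_stat <- unname(t_value["slope"])^2
# R-squared recovered from F and the residual degrees of freedom
r_squared <- f_stat / (f_stat + df_resid)

round(t_value, 2)
round(r_squared, 4)
```

Because the inputs are rounded as printed, the recomputed F-statistic agrees with the reported 1221 only to within rounding.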
Model adequacy checking assesses whether the model is appropriate for the data and whether its assumptions hold. The Normal Q-Q and Fitted vs Residuals plots are used for this assessment. With the exception of a small tail in the Normal Q-Q plot, most of the points fall along a straight line, which suggests the residuals are approximately normal. The Fitted vs Residuals plot shows a pattern in the spread of the residuals, indicating a lack of constant variance. To address the lack of constant variance, a Box-Cox transformation is performed.
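Both diagnostic plots are available through the plot() method for lm objects. The sketch below is illustrative only: the simulated data are hypothetical and deliberately built with noise that grows with thickness, so that the Fitted vs Residuals panel shows the kind of widening spread described above.

```r
# Hypothetical data whose noise grows with thickness (non-constant variance)
set.seed(7)
data <- data.frame(Film_Thickness_nm = runif(100, 50, 150))
data$Electrical_Resistance_mOhm <- 5 + 0.12 * data$Film_Thickness_nm +
  rnorm(100, sd = 0.01 * data$Film_Thickness_nm)
model <- lm(Electrical_Resistance_mOhm ~ Film_Thickness_nm, data = data)

op <- par(mfrow = c(1, 2))
plot(model, which = 1)  # residuals against fitted values: look for funnels or curves
plot(model, which = 2)  # normal Q-Q plot: points near the line suggest normality
par(op)                 # restore the previous plotting layout
```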
The Box-Cox procedure searches for a power, λ, to apply to the response variable. Typical values of λ range from -2 to 2. A value of 1 indicates that no transformation is needed, a value of 0 corresponds to a logarithmic transformation, values less than 0 correspond to inverse power transformations, and values greater than 1 stretch the upper end of the data. Using the boxcox() function, the value of λ that maximizes the log-likelihood is -0.83, as shown in the output below. This value is used as the power for transforming Electrical Resistance in order to stabilize the variance.
## [1] 30
## [1] -0.8282828
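The two numbers printed above are consistent with selecting the position of the highest log-likelihood on a grid of 100 candidate values from -2 to 2: the 30th point of such a grid is -0.8282828. The sketch below shows one way this selection could be done with boxcox() from the MASS package. The exact call used for this report is not shown in the source, so the grid, the data frame `film`, and the variable names are assumptions.

```r
library(MASS)  # provides boxcox()

# Hypothetical stand-in data; Box-Cox requires a strictly positive response
set.seed(3)
film <- data.frame(Film_Thickness_nm = runif(100, 50, 150))
film$Electrical_Resistance_mOhm <- 5 + 0.12 * film$Film_Thickness_nm +
  rnorm(100, sd = 1)

# Profile log-likelihood evaluated on an assumed grid of 100 lambda values
grid <- seq(-2, 2, length.out = 100)
bc <- boxcox(Electrical_Resistance_mOhm ~ Film_Thickness_nm, data = film,
             lambda = grid, plotit = FALSE)

best_index <- which.max(bc$y)  # position of the highest log-likelihood
lambda <- bc$x[best_index]     # lambda at that position

# Apply the power transformation and refit.
# This 100-point grid never contains exactly 0, so y^lambda is always usable here.
film2 <- film
film2$Electrical_Resistance_mOhm <- film$Electrical_Resistance_mOhm^lambda
model2 <- lm(Electrical_Resistance_mOhm ~ Film_Thickness_nm, data = film2)
```

On the simulated data the selected λ will differ from -0.83, since that value depends on the real measurements.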
After applying the power transformation with λ = -0.83 to the Electrical Resistance variable, the p-values in the summary statistics remain below 2e-16, and the multiple R-squared value increases from 0.93 to 0.95. The Normal Q-Q plot shows some deviation from the expected straight-line pattern, but the Fitted vs Residuals plot now shows an improvement in constant variance. The last plot shows the fitted regression line with confidence and prediction intervals. On the transformed scale, the relationship between Film Thickness and Electrical Resistance can be described as follows, where the slope is negative because raising the response to a negative power reverses its ordering:
\[ y^{-0.83}=0.156-0.00057x \]
##
## Call:
## lm(formula = Electrical_Resistance_mOhm ~ Film_Thickness_nm,
## data = data2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.0108661 -0.0019272 -0.0002009 0.0024227 0.0091628
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.555e-01 1.300e-03 119.59 <2e-16 ***
## Film_Thickness_nm -5.747e-04 1.282e-05 -44.84 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.003793 on 98 degrees of freedom
## Multiple R-squared: 0.9535, Adjusted R-squared: 0.9531
## F-statistic: 2011 on 1 and 98 DF, p-value: < 2.2e-16
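Because this model is fitted on the transformed scale, a prediction has to be raised to the power 1/λ to return to mOhm. The sketch below does this with the coefficients and λ printed above. The function name `predict_resistance` and the thickness values are hypothetical, chosen only for illustration, and the result is a back-transformed point estimate rather than an interval.

```r
# Coefficients and lambda copied from the output above
lambda <- -0.8282828
b0 <- 1.555e-01
b1 <- -5.747e-04

# Predict on the transformed scale, then undo the power transformation
predict_resistance <- function(thickness_nm) {
  transformed <- b0 + b1 * thickness_nm  # estimate of y^lambda
  transformed^(1 / lambda)               # back to the original scale (mOhm)
}

# Hypothetical thickness values in nm, chosen only for illustration
predict_resistance(c(80, 100, 120))
```

The back-transformation is only defined while the transformed prediction stays positive, so it should not be applied far outside the range of observed thicknesses.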