Film Thickness vs Electrical Resistance Analysis

Introduction

Semiconductors are a key component in most electronics and are used across a wide variety of industries. This analysis examines the connection between the thickness of the film deposited on silicon wafers and the electrical resistance of that film. Sputter deposition can produce minor variations in film thickness, which have caused problems with process reliability. If these variations predict electrical resistance, engineers can assess the quality of the semiconductors more efficiently and potentially find a way to lower the costs of production.

In the code integrated throughout the report, “x” refers to the film thickness and “y” refers to the electrical resistance. Variations of these names, such as “y2”, are versions of the same variables with a transformation applied. The variable “datafrm” refers to the original data provided. All data remain in the units provided: nanometers (nm) for film thickness and milliohms (mOhm) for resistance.
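As a minimal sketch of this naming convention, the assignments might look like the following. The six rows are copied from the head(datafrm) output in the next section and stand in for the full data set. The assignment lines themselves are an assumption, since the report does not show them.

```r
# Stand-in for the full data set: the six rows printed by head(datafrm).
datafrm <- data.frame(
  Film_Thickness_nm = c(87.45, 145.07, 123.20, 109.87, 65.60, 65.60),
  Electrical_Resistance_mOhm = c(15.118, 23.601, 19.904, 16.103, 12.901, 13.278)
)

# Assumed assignments (not shown in the report):
# x is the predictor, y is the response.
x <- datafrm$Film_Thickness_nm
y <- datafrm$Electrical_Resistance_mOhm

range(x)  # film thickness range in nm
range(y)  # resistance range in mOhm
```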

Exploratory Data Analysis

Exploratory Data Analysis, or EDA, is where we set up the data and do some preliminary analysis. It is used to scope out the data and decide what kinds of analysis the project may need. It also serves as a reference later on for identifying errors in the code. For example, knowing the range of each variable helps when a graph does not come out as expected.

Summary of the Data Set

The head function shows the first six rows of the provided data and gives us a snapshot of what we are working with as we set up the analysis. The summary function reports the minimum, quartiles, median, mean, and maximum of each variable, which gives a rough picture of the range and center of the data.

head(datafrm)

##   Film_Thickness_nm Electrical_Resistance_mOhm
## 1             87.45                     15.118
## 2            145.07                     23.601
## 3            123.20                     19.904
## 4            109.87                     16.103
## 5             65.60                     12.901
## 6             65.60                     13.278

summary(datafrm)

##  Film_Thickness_nm Electrical_Resistance_mOhm
##  Min.   : 50.55    Min.   :11.68             
##  1st Qu.: 69.32    1st Qu.:13.60             
##  Median : 96.42    Median :15.74             
##  Mean   : 97.02    Mean   :16.80             
##  3rd Qu.:123.02    3rd Qu.:19.82             
##  Max.   :148.69    Max.   :25.74

Exploratory Graphing

Starting with the histogram below, the resistance values loosely follow a right-skewed shape resembling a \(\chi^2\) distribution, with a few outlying bars, but the shape is not clear enough to draw conclusions about the distribution type. The hump at the lower resistances could reflect similar low resistances occurring at multiple film thicknesses. If this is the ideal resistance for the semiconductor, then it could allow for a less precise manufacturing process and could save the client money. The outlying bars that keep this graph from more closely resembling a \(\chi^2\) distribution could indicate thicknesses with abnormal resistance, due either to data collection error or to some chemical abnormality appearing at these thicknesses.

The box plot below gives some interesting insight into the data provided. The median, the line inside the box, is not in the middle of the box. Its position could indicate that resistance stays low across multiple film thicknesses. This is also reflected in the histogram, where higher resistance values occur less frequently. The absence of any dots or asterisks beyond the whiskers, the lines extending from the box, shows that there are no outliers in the data.

The scatter plot below shows the relationship between the resistance and the film thickness for every data point in the provided data. It shows a positive trend, indicating a general relationship between the film thickness and the resistance. On closer inspection, the relationship is not quite linear, as there is a slight curve to the trend. As indicated by the box plot above, there are no drastic outliers in the scatter plot. If there were an outlier, it would appear as a data point far from the others along the positive trend.

Simple Linear Regression

The data is put into a linear model using the “lm” function in R. As there is only one predictor and one response variable, simple linear regression was performed using least squares estimates for the slope (\(\beta_1\)) and the intercept (\(\beta_0\)). The random error (\(\epsilon\)) represents the difference between the linear model and the actual data. The mathematical equation for the linear model is below:

\[ Y = \beta_0 + \beta_1 x + \epsilon \]

The intercept (\(\beta_0\)) is estimated using the equation \(\hat{\beta_0} = \bar{y} -\hat{\beta_1}\bar{x}\). The slope (\(\beta_1\)) is estimated with the equation \(\hat{\beta_1} = \frac{\Sigma(x_i -\bar{x})(y_i-\bar{y})}{\Sigma(x_i - \bar{x})^2}\). The summary of the model reports the estimates of \(\beta_0\) and \(\beta_1\), the \(R^2\), and an F-statistic. The summary also reports the residuals, which will be analyzed with the residual versus fitted values plot further down.
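The two least squares equations can be checked by hand against lm. The sketch below does this on the six rows printed by head(datafrm), used here only as stand-in data, so the estimates will differ from those of the full model. The names b0_hat, b1_hat and fit are illustrative.

```r
# Stand-in data: the six rows printed by head(datafrm), not the full data set.
x <- c(87.45, 145.07, 123.20, 109.87, 65.60, 65.60)
y <- c(15.118, 23.601, 19.904, 16.103, 12.901, 13.278)

# Least squares estimates from the closed-form equations.
b1_hat <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b0_hat <- mean(y) - b1_hat * mean(x)

# lm() should return the same two numbers.
fit <- lm(y ~ x)
rbind(by_hand = c(b0_hat, b1_hat), lm = unname(coef(fit)))
```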

model<-lm(y~x)
summary(model)

## 
## Call:
## lm(formula = y ~ x)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.27640 -0.75508 -0.08631  0.70422  2.69671 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 4.870489   0.356848   13.65   <2e-16 ***
## x           0.122954   0.003518   34.95   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.041 on 98 degrees of freedom
## Multiple R-squared:  0.9257, Adjusted R-squared:  0.925 
## F-statistic:  1221 on 1 and 98 DF,  p-value: < 2.2e-16

Based on the calculations in the summary, the equation for the linear model is below: \[Y = 4.87 + 0.123x\]

Here the intercept \(\hat{\beta_0}\) is 4.87 and the slope \(\hat{\beta_1}\) is 0.123. Taken literally, the intercept suggests about 4.87 mOhm of resistance that does not depend on film thickness, which may be worth investigating depending on the design parameters of the semiconductors. However, the thinnest film measured is about 50 nm, so the intercept is an extrapolation and should be read with caution. The slope indicates that each additional 1 nm of film material increases the resistance by an average of 0.123 mOhm. This gives a starting point when changing film thickness to meet a specific design parameter.

The statistical significance of the model indicates whether film thickness is an important predictor of resistance, and it is assessed with the F-statistic. The F-statistic is \(F_{1,98} \approx 1221\), which gives a p-value of less than \(2.2 \times 10^{-16}\), far below the commonly used significance levels of 0.05 and 0.01. The \(R^2\) is 0.93, which is close to 1 and means that approximately 93% of the variance in resistance is explained by the model.
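With a single predictor, the F-statistic is the square of the slope's t value, so the two tests in the summary carry the same information. The sketch below checks this using the rounded values printed in the summary, so small differences are due to rounding. The variable names are illustrative.

```r
# Values copied from the model summary (rounded as printed).
t_slope <- 34.95
f_stat  <- 1221
df1 <- 1
df2 <- 98

# In simple linear regression, F equals t^2 up to rounding.
t_slope^2

# The upper-tail probability of the F distribution gives the p-value.
p_value <- pf(f_stat, df1, df2, lower.tail = FALSE)
p_value < 2.2e-16
```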

Checking Assumptions

Residuals

The residuals summarized in the model output and shown in the plot below reveal whether any error remains unexplained. The residual median is close to zero (-0.086), which suggests no systematic bias in the model. The residual standard error is 1.041 mOhm, which is small relative to the 11.68 to 25.74 mOhm range of the measured resistances.

If the errors were homoscedastic, the above Residuals vs Fitted scatter plot would show a random band of constant width. Instead, there is a slight funnel shape, so the homoscedasticity assumption fails: the variance of the residuals is not constant, and a transformation is required.

plot(model,2,
     main = 'Q-Q Plot',
     pch = 20,
     col = 'skyblue')

The above Q-Q plot shows how closely the residuals follow a normal distribution, which is another assumption of the linear model. While some points deviate visibly from the reference line, the majority lie close to it. There should be minimal change in this plot after transformation.

Model Transformation

Box-Cox Transformation

The Box-Cox procedure does not identify which assumption failed. Instead, it searches for the power transformation of the response, indexed by \(\lambda\), that makes the residuals of the linear model most nearly normal with constant variance (most likely to turn them into a bell curve). The plot below shows the log-likelihood, a measure of how well each candidate \(\lambda\) achieves this, across a range of \(\lambda\) values before any transformation is applied. Another plot will be used to check the transformation later.

library(MASS)

## 
## Attaching package: 'MASS'

## The following object is masked from 'package:dplyr':
## 
##     select

trans_results<-boxcox(model, plotit = TRUE)

The \(\lambda\) value is chosen from the confidence interval on the above graph, marked by the vertical dashed lines on either side of the peak. In this plot, the confidence interval is nearly centered around -1, which indicates that some form of inverse power transformation will be required. Applying such a transformation requires a specific \(\lambda\).

An optimal \(\lambda\) value can be determined from this region by finding the x value (\(\lambda\)) associated with the maximum y value (log-likelihood) in the Box-Cox data produced by the code. Other values from the confidence interval on the plot could be used, but when tried, they did not have the desired effect. The optimal \(\lambda\) is calculated below.

Opt_Lambda<-trans_results$x[which.max(trans_results$y)]
cat('Optimal Lambda:', Opt_Lambda, '\n')

## Optimal Lambda: -0.8282828
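The dashed lines on a Box-Cox plot mark the 95% confidence interval for \(\lambda\), which can also be read from the returned values: it is the set of \(\lambda\) whose log-likelihood lies within qchisq(0.95, 1)/2 of the maximum. The sketch below shows this on synthetic stand-in data, generated only so the example runs on its own. The names x_demo, y_demo, bc, opt and lambda_ci are hypothetical and are not part of the report's code.

```r
library(MASS)

# Synthetic stand-in data (assumption): positive response, spread grows with x.
set.seed(1)
x_demo <- seq(50, 150, length.out = 100)
y_demo <- (5 + 0.12 * x_demo) * exp(rnorm(100, sd = 0.06))

# Profile log-likelihood over a grid of lambda values, without plotting.
bc <- boxcox(lm(y_demo ~ x_demo), lambda = seq(-2, 2, by = 0.01), plotit = FALSE)

# Lambda with the highest log-likelihood.
opt <- bc$x[which.max(bc$y)]

# 95% interval: lambdas whose log-likelihood is within
# qchisq(0.95, 1) / 2 of the peak.
cutoff <- max(bc$y) - qchisq(0.95, df = 1) / 2
lambda_ci <- range(bc$x[bc$y >= cutoff])

opt
lambda_ci
```

Any \(\lambda\) inside this interval is statistically compatible with the data, which is why rounded values such as -1 or -0.5 are often considered alongside the exact maximum.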

As the optimal lambda (\(\lambda_{opt}\)) is closer to -1 than to -0.5, the transformation resembles an inverse transformation more than an inverse square root transformation. The \(\lambda_{opt}\) is -0.8282828 and is used in the equation below to calculate a new, transformed resistance for every known film thickness. In the code embedded in the report, the transformed data is represented by y2.

\[ Y_2 = \frac{y^{\lambda_{opt}} - 1}{\lambda_{opt}} \]

#Applying the Box-Cox transformation with the optimal lambda (about -0.83),
#which is an inverse fractional power
y2<-(y^(Opt_Lambda)-1)/Opt_Lambda

trans_model<-lm(y2~x)
summary(trans_model)

## 
## Call:
## lm(formula = y2 ~ x)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0110860 -0.0029355  0.0002433  0.0023299  0.0131530 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 1.019e+00  1.574e-03  647.46   <2e-16 ***
## x           6.957e-04  1.551e-05   44.85   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.004592 on 98 degrees of freedom
## Multiple R-squared:  0.9535, Adjusted R-squared:  0.9531 
## F-statistic:  2011 on 1 and 98 DF,  p-value: < 2.2e-16

#boxcox(trans_model, plotit = TRUE)

The transformed linear model follows the below equation:

\[ Y_2 = 1.019+0.0006957x \]

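While this equation looks very different from the original model, it is just as statistically significant. Note that it predicts the transformed resistance \(Y_2\), not the resistance in mOhm, so a predicted value must be converted back to the original units before use. Because this model fits better and satisfies the regression assumptions more closely than the first, engineers should use it to estimate the resistance for a measured film thickness.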
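Because the model works on the transformed scale, a prediction has to be converted back to mOhm before it is compared with a specification. Inverting the transformation applied in the code, y2 = (y^lambda - 1)/lambda, gives y = (lambda * y2 + 1)^(1/lambda). The sketch below applies this using the rounded coefficients and \(\lambda_{opt}\) printed above. The 100 nm thickness and the helper name inv_boxcox are illustrative assumptions.

```r
# Values copied from the report output (rounded as printed).
lambda <- -0.8282828
b0 <- 1.019
b1 <- 6.957e-04

# Inverse of y2 = (y^lambda - 1) / lambda.
inv_boxcox <- function(y2, lambda) (lambda * y2 + 1)^(1 / lambda)

thickness_nm <- 100                 # illustrative thickness (assumption)
y2_hat <- b0 + b1 * thickness_nm    # prediction on the transformed scale
y_hat  <- inv_boxcox(y2_hat, lambda)
y_hat                               # prediction back in mOhm

# Round trip check: transforming and then inverting returns the original value.
inv_boxcox((15.118^lambda - 1) / lambda, lambda)
```

Because the coefficients are rounded, the back-transformed value is approximate. Using coef(trans_model) and Opt_Lambda directly would avoid that rounding.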

The transformation increased the \(R^2\) value from 0.93 to 0.95, meaning the model now explains a larger share of the variance in the transformed response, which is another reason for engineers to use this model. While the F-statistic changed, \(F_{1,98} \approx 2011\) instead of 1221, the p-value remained below \(2.2 \times 10^{-16}\).

Seeing the Box Cox Transformation Applied to the Residuals

plot(trans_model,1,
     main = 'Transformed Residuals vs Fitted',
     pch = 20,
     col = 'plum3')

The above residuals vs. fitted scatter plot is more random than the untransformed one earlier in the report. The typical funnel shape is no longer evident, and any remaining narrowing is minimal and off-center. Any patterns seen in the above plot are either so minor that they are insignificant or are not actually there.

plot(trans_model,2,
     main = 'Q-Q Plot',
     pch = 20,
     col = 'plum2')

This Q-Q plot is relatively linear, even though the points do not follow the dashed line as closely as in the first Q-Q plot. Whether the residuals depart from normality is checked with the Shapiro-Wilk test, performed below. If the p-value for the transformed model is above 0.05, there is no significant evidence that the residuals are non-normal, even if the Q-Q plot makes them appear otherwise.

shapiro.test(resid(trans_model))

## 
##  Shapiro-Wilk normality test
## 
## data:  resid(trans_model)
## W = 0.98733, p-value = 0.4601

The calculated p-value of 0.46 is well above 0.05. Therefore the transformation improved homoscedasticity without significantly disturbing the normality of the residuals.

Confidence and Prediction Interval

Calculating the Intervals

The confidence interval, a set of upper and lower limits, describes how precisely the model estimates the mean resistance for a given film thickness. Calculating the confidence interval for a specific film thickness \(x_j\) by hand requires several equations. The main one is provided below:

\[ \hat{y}_{j,conf} = (\hat{\beta_0} + \hat{\beta_1}x_j)\pm t_{\frac{\alpha}{2},n-2}\sqrt{MSE\left(\frac{1}{n} + \frac{(x_j - \bar{x})^2}{\Sigma(x_i-\bar{x})^2}\right)} \]

\(\hat{\beta_0}\) and \(\hat{\beta_1}\) are calculated by hand with the previously provided equations. The critical t value is harder to calculate by hand but can be found in a t-table if \(\alpha\) and n are known. The mean squared error, or MSE, in the above equation is an estimate of the error variance. MSE can be calculated by hand using the equation \(MSE = \frac{SSE}{n-2}\) where SSE can be found using \(SSE = \Sigma (y_i - \hat{y}_i)^2\). Combining the two gives the equation \(MSE = \frac{\Sigma(y_i - \hat{y}_i)^2}{n-2} = \hat{\sigma}^2\) where \(\hat{\sigma}^2\) is the estimated error variance. At the \(\pm\) part of the equation, adding and subtracting the value to its right gives the upper and lower confidence limits for the mean resistance at \(x_j\).

The prediction interval, another set of upper and lower limits, describes how precisely the model can predict the resistance of an individual semiconductor with a new film thickness. Calculating the prediction interval for a specific film thickness by hand requires several equations. The main one, similar to the confidence interval equation, is provided below:

\[ \hat{y}_{j,pred} = (\hat{\beta_0} + \hat{\beta_1}x_j)\pm t_{\frac{\alpha}{2},n-2}\sqrt{MSE\left(1+\frac{1}{n} + \frac{(x_j - \bar{x})^2}{\Sigma(x_i - \bar{x})^2}\right)} \]

As all the variables are the same as in the confidence interval calculation, please refer back to that section if needed. As with the confidence interval, the calculation using the above equation must be done twice for a specific film thickness: once for the upper limit, by adding everything right of the \(\pm\), and once for the lower limit, by subtracting it.
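Both intervals can be computed by hand and compared with predict. The sketch below uses the standard error terms \(\sqrt{MSE(\frac{1}{n} + \frac{(x_j - \bar{x})^2}{\Sigma(x_i - \bar{x})^2})}\) for the mean response and the same expression with an extra 1 inside the parentheses for a new observation. It runs on the six rows printed by head(datafrm) as stand-in data, untransformed, so the numbers are illustrative only. The thickness of 100 nm and the variable names are assumptions.

```r
# Stand-in data: the six rows printed by head(datafrm), not the full data set.
x <- c(87.45, 145.07, 123.20, 109.87, 65.60, 65.60)
y <- c(15.118, 23.601, 19.904, 16.103, 12.901, 13.278)
fit <- lm(y ~ x)

n      <- length(x)
x_j    <- 100                              # illustrative thickness (assumption)
y_hat  <- sum(coef(fit) * c(1, x_j))       # fitted value at x_j
mse    <- sum(resid(fit)^2) / (n - 2)      # SSE / (n - 2)
sxx    <- sum((x - mean(x))^2)
t_crit <- qt(1 - 0.05 / 2, df = n - 2)     # alpha = 0.05

# Standard errors for the mean response and for a new observation.
se_mean <- sqrt(mse * (1 / n + (x_j - mean(x))^2 / sxx))
se_new  <- sqrt(mse * (1 + 1 / n + (x_j - mean(x))^2 / sxx))

ci_hand <- y_hat + c(-1, 1) * t_crit * se_mean
pi_hand <- y_hat + c(-1, 1) * t_crit * se_new

# predict() should reproduce both intervals.
ci_r <- predict(fit, data.frame(x = x_j), interval = "confidence")
pi_r <- predict(fit, data.frame(x = x_j), interval = "prediction")
rbind(ci_hand, ci_r = ci_r[1, c("lwr", "upr")],
      pi_hand, pi_r = pi_r[1, c("lwr", "upr")])
```

The prediction interval is always the wider of the two because of the extra 1 under the square root, which accounts for the scatter of an individual measurement around the mean.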

The R code below does all the calculations with the transformed model and avoids the human error of a hand calculation. The matrices conf and pred contain the fitted line and the upper and lower limits for the transformed model.

xnew<-seq(min(x), max(x), length.out = 100)

conf<-predict(trans_model, data.frame(x = xnew), interval = 'confidence')
pred<-predict(trans_model, data.frame(x = xnew), interval = 'prediction')

Plotting the Confidence and Prediction Intervals

In the above plot, the data are shown as plum-colored dots in a scatter plot similar to the resistance vs film thickness plot from earlier. The biggest difference is that the above plot uses the transformed resistance values instead of the measured ones. The sky blue line is the regression line of the transformed model. The dark magenta lines mark the upper and lower confidence limits, and the navy lines mark the prediction limits. Three dots fall barely outside the prediction limits. With 100 observations and 95% intervals, about five points would be expected to fall outside by chance, so these three do not warrant a further transformation of the model.
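The plotting code is not shown in the report. A minimal base R sketch that would draw a figure of this kind, using the colors described above, is given below. It runs on synthetic stand-in data, generated only so the example is self-contained, and the names x_demo, y2_demo and demo_model are hypothetical. In the report, x, y2 and trans_model would take their place.

```r
# Synthetic stand-in data (assumption), roughly on the transformed scale.
set.seed(42)
x_demo  <- runif(100, 50, 150)
y2_demo <- 1.02 + 0.0007 * x_demo + rnorm(100, sd = 0.0045)
demo_model <- lm(y2_demo ~ x_demo)

# Intervals over an evenly spaced grid of thickness values.
xnew <- seq(min(x_demo), max(x_demo), length.out = 100)
conf <- predict(demo_model, data.frame(x_demo = xnew), interval = "confidence")
pred <- predict(demo_model, data.frame(x_demo = xnew), interval = "prediction")

plot(x_demo, y2_demo, pch = 20, col = "plum3",
     xlab = "Film thickness (nm)", ylab = "Transformed resistance",
     main = "Confidence and Prediction Intervals")
lines(xnew, conf[, "fit"], col = "skyblue", lwd = 2)                    # regression line
matlines(xnew, conf[, c("lwr", "upr")], col = "darkmagenta", lty = 2)   # confidence limits
matlines(xnew, pred[, c("lwr", "upr")], col = "navy", lty = 3)          # prediction limits

# Count observations outside the 95% prediction limits at their own x values.
own <- predict(demo_model, data.frame(x_demo = x_demo), interval = "prediction")
n_outside <- sum(y2_demo < own[, "lwr"] | y2_demo > own[, "upr"])
n_outside
```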

Conclusion

This analysis was done to determine what kind of relationship, if any, exists between the electrical resistance and the film thickness on semiconductors. Starting with EDA, various plots were constructed to look for trends and outliers. Then the simple linear regression model was fitted and checked. When the homoscedasticity assumption failed, the model was transformed using an inverse fractional power transformation, which was identified with the Box-Cox procedure. The linear regression assumptions were checked again with the transformed model, and they all passed. Finally, confidence intervals and prediction intervals were calculated for the model.

The transformed linear model shows a positive relationship between the film thickness and the electrical resistance. The model is more precise when estimating the average resistance for a given film thickness than when predicting the resistance of an individual new film, as the narrower confidence intervals show. This remaining uncertainty could be due to a multitude of factors, many of which cannot be controlled without severe measures being taken. It does not undermine the model, as shown throughout the report by the \(R^2\) values, F-statistics and p-values, the various graphs and plots, the Shapiro-Wilk normality test, and the reasonable confidence and prediction intervals. Together, these results indicate that the film thickness is a strong predictor of electrical resistance in a semiconductor.

The final linear model is best suited to determining an average electrical resistance for a range of film thicknesses, as shown by the confidence intervals. It can also comfortably be used to predict the resistance for a given film thickness, as few data points fell outside the prediction intervals. One application would be in meeting a client's specific resistance or film thickness specifications when designing a semiconductor. Another would be in the manufacturing process itself. Quality control can be improved by applying the model to monitor the stability of the process that applies the film to the semiconductors. This is the recommended application for this model, as it can also reduce costs by reducing the man-hours associated with process stability monitoring.