Homework 7 Problem 33 of Section 7.3

Kellie Stilson

The article “Measuring and Understanding the Aging of Kraft Insulating Paper in Power Transformers” contained the following observations on degree of polymerization for paper specimens for which viscosity times concentration fell in a certain middle range:

x <- c(418, 421, 421, 422, 425, 427, 431, 434, 437, 439, 446, 447, 448, 453, 
    454, 463, 465)

(a) Create a boxplot of the data and comment on any interesting features.

boxplot(x, main = "Boxplot of Degree of Polymerization", col = "green", outline = TRUE)

[Figure: boxplot of degree of polymerization]

As the boxplot shows, there are no outliers in the data set, which is unsurprising since the specimens were selected from a middle range of viscosity times concentration. Overall the data appear slightly positively skewed (the upper whisker is longer than the lower one), but the median line falls almost exactly in the middle of the box, suggesting that the middle half of the data is nearly symmetric. A summary of the data is given below.

summary(x)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     418     425     437     438     448     465
sd(x)
## [1] 15.14
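
The claim that there are no outliers can also be checked numerically. The following is a minimal sketch, assuming the default 1.5 × IQR rule that boxplot() applies to the hinges; the names `stats`, `hinges`, `fences`, and `outliers` are introduced here for illustration and are not part of the original solution.

```r
x <- c(418, 421, 421, 422, 425, 427, 431, 434, 437, 439, 446, 447, 448, 453,
       454, 463, 465)

# boxplot.stats() returns the same five-number summary that boxplot() draws
stats <- boxplot.stats(x)

# Elements 2 and 4 of stats$stats are the lower and upper hinges
hinges <- stats$stats[c(2, 4)]

# Points beyond 1.5 times the hinge spread are flagged as outliers
fences <- hinges + c(-1.5, 1.5) * diff(hinges)
fences

# Observations flagged as outliers (empty if there are none)
outliers <- stats$out
length(outliers)
```

With hinges at 425 and 448, the fences work out to 390.5 and 482.5, and every observation lies between them.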

(b) Is it plausible that the given sample observations were selected from a normal distribution? This can be checked by examining the QQ plot below:

qqnorm(x)
qqline(x)

[Figure: normal QQ plot of x with reference line]

As the plot shows, the points in the middle of the data fall close to the reference line, but in the upper tail they start to drift away from it. It is plausible that the data came from a normal distribution, but the plot alone does not settle the question.
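
A formal test can complement the visual check. The sketch below uses the Shapiro-Wilk test, which is an addition here and not something the original problem asks for; the name `sw` is hypothetical. With only 17 observations the test has limited power, so a large p-value should be read as "no evidence against normality" rather than as proof of it.

```r
x <- c(418, 421, 421, 422, 425, 427, 431, 434, 437, 439, 446, 447, 448, 453,
       454, 463, 465)

# Shapiro-Wilk test: the null hypothesis is that the sample is normal
sw <- shapiro.test(x)

# W statistic (values near 1 are consistent with normality) and p-value
sw$statistic
sw$p.value
```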

(c) Calculate a two-sided 95% confidence interval for true average degree of polymerization (as did the authors of the article). Does the interval suggest that 440 is a plausible value for the true average degree of polymerization? What about 450?

avg <- mean(x)
avg
## [1] 438.3
deg <- t.test(x, conf.level = 0.95)

Specifically, the confidence interval is

deg$conf.int
## [1] 430.5 446.1
## attr(,"conf.level")
## [1] 0.95

The sample mean and standard deviation are

deg$estimate
## mean of x 
##     438.3
sd(x)
## [1] 15.14

The 95% confidence interval is (430.5077, 446.0805), which includes the value 440 but does not include the value 450. Thus 440 is a plausible value for the true average degree of polymerization and 450 is not.
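
The interval can be reproduced by hand from the one-sample t formula, mean ± t(0.975, n − 1) × s / sqrt(n). This is a minimal sketch, assuming the same data vector as above; the names `n`, `se`, `tcrit`, `ci`, and `plausible` are introduced here for illustration.

```r
x <- c(418, 421, 421, 422, 425, 427, 431, 434, 437, 439, 446, 447, 448, 453,
       454, 463, 465)

n <- length(x)
se <- sd(x) / sqrt(n)            # estimated standard error of the mean
tcrit <- qt(0.975, df = n - 1)   # two-sided 95% critical value, 16 df

# Lower and upper limits of the interval
ci <- mean(x) + c(-1, 1) * tcrit * se
ci

# Check whether each candidate value lies inside the interval
plausible <- c(440, 450) >= ci[1] & c(440, 450) <= ci[2]
plausible
```

The limits agree with deg$conf.int from t.test(), and the final check is TRUE for 440 and FALSE for 450.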