Final Examination - Stat and DOE Using R

Question 1

The amount of time that it takes to complete a certain job is known to be Normally distributed with a mean of 10 minutes and a standard deviation of 1 minute.

Plot the probability density function corresponding to job completion times.

What is the probability that a randomly selected job will be completed in less than 11 minutes?

curve(pnorm(x, 10, 1), 5, 15)  # cumulative distribution function, plotted over 5 to 15 minutes

curve(dnorm(x,10, 1),5,15)

pnorm(11, 10, 1)

## [1] 0.8413447

Answer 1

The probability that a randomly selected job will be completed in less than 11 minutes is 0.8413.
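
As a cross-check of this value, the same probability can be obtained by standardizing to a z-score or by numerically integrating the density. The following is a minimal sketch in base R; the variable names (`mu`, `sigma`, `p_z`, `p_int`) are illustrative and not part of the original solution.

```r
# Cross-check of P(X < 11) for X ~ Normal(mean = 10, sd = 1)
mu <- 10      # mean completion time (minutes)
sigma <- 1    # standard deviation (minutes)

# Route 1: standardize, then use the standard Normal CDF
z <- (11 - mu) / sigma
p_z <- pnorm(z)

# Route 2: integrate the density from -Inf up to 11
p_int <- integrate(dnorm, lower = -Inf, upper = 11, mean = mu, sd = sigma)$value

print(c(standardized = p_z, integrated = p_int))
```

Both routes should agree with `pnorm(11, 10, 1)` up to numerical integration error.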

Question 2

A critical measurement on the diameter of a part that is used in a subassembly is assumed to have a mean of 10 mm. Management would like to test this hypothesis against the alternative that the mean is not equal to 10 mm at an alpha = 0.05 level of significance. To this end, they collected a random sample of n = 100 parts and measured their diameters; the data are contained in the file https://raw.githubusercontent.com/tmatis12/datafiles/main/diameter.csv

Generate a histogram and boxplot of the collected measurements

State the null and alternative hypothesis, perform the test, and state conclusions.

dat <- read.csv("https://raw.githubusercontent.com/tmatis12/datafiles/main/diameter.csv")
diameter <- dat$Diameter  # the 100 measured diameters (mm)
boxplot(diameter, main = "Boxplot", xlab = "Samples", ylab = "Diameter (mm)")

hist(diameter, breaks = 10, main = "Histogram", xlab = "Diameter (mm)")

\[ H_0: \mu_{\text{diameter}} = 10\ \text{mm} \] \[ H_a: \mu_{\text{diameter}} \ne 10\ \text{mm} \]

str(dat$Diameter)

##  num [1:100] 10.3 10 10.2 10.2 10.4 ...

t.test(dat$Diameter, mu=10, alternative = "two.sided")

## 
##  One Sample t-test
## 
## data:  dat$Diameter
## t = 7.6839, df = 99, p-value = 1.134e-11
## alternative hypothesis: true mean is not equal to 10
## 95 percent confidence interval:
##  10.12638 10.21438
## sample estimates:
## mean of x 
##  10.17038

Answer 2

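Since the p-value (1.134e-11) is far smaller than alpha = 0.05, we reject H0 at the 0.05 level of significance. The data indicate that the mean diameter is not 10 mm (sample mean 10.17 mm, 95% confidence interval 10.126 mm to 10.214 mm).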
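
The decision can also be stated in terms of the critical value. The sketch below assumes only the t statistic and degrees of freedom reported in the output above; the variable names are illustrative.

```r
# Decision rule for the two-sided one-sample t-test, using the reported output
t_stat <- 7.6839   # t statistic from the t.test output
df <- 99           # n - 1 = 100 - 1
alpha <- 0.05

t_crit <- qt(1 - alpha / 2, df)      # two-sided critical value
p_value <- 2 * pt(-abs(t_stat), df)  # two-sided p-value

reject_h0 <- abs(t_stat) > t_crit    # TRUE means H0 is rejected at level alpha
print(c(t_crit = t_crit, p_value = p_value))
print(reject_h0)
```

Comparing `abs(t_stat)` with `t_crit` is equivalent to comparing the p-value with alpha, so both rules lead to the same conclusion.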

Question 3

Researchers at a textile production facility would like to test the hypothesis that the mean breaking strength of abraded fabric is different from that of unabraded fabric at an alpha = 0.10 level of significance.
To this end, they conducted an experiment in which they measured the breaking force of 8 samples of each type of fabric; the collected data may be found in the file https://raw.githubusercontent.com/tmatis12/datafiles/main/Fabric.csv Assume the populations are approximately Normally distributed and use a two-sample t-test with a pooled variance.

Generate a side-by-side boxplot of the collected measurements on the breaking strength of abraded and unabraded fabric.

State the null and alternative hypotheses, perform the test, and state conclusions.

dat2 <-read.csv("https://raw.githubusercontent.com/tmatis12/datafiles/main/Fabric.csv")


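# Fabric type: 1 = abraded, 2 = unabraded (8 samples each)
fabric <- rep(c("1", "2"), each = 8)
# Breaking strengths entered by hand, in the same order as fabric
strength <- c(28.5, 20, 46, 34.5, 36.5, 52.5, 26.5, 46.5, 36.4, 55.0, 51.5, 38.7, 43.2, 48.4, 25.6, 49.8)

dat3 <- data.frame(fabric, strength)

AA <- dat3[dat3$fabric == "1", ]  # abraded samples
BB <- dat3[dat3$fabric == "2", ]  # unabraded samples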

boxplot(AA$strength, BB$strength, names = c("Abraded", "Unabraded"), ylab = "Breaking strength")

\[ H_0: \mu_{\text{abraded}} = \mu_{\text{unabraded}} \] \[ H_a: \mu_{\text{abraded}} \ne \mu_{\text{unabraded}} \]

t.test(AA$strength,BB$strength,var.equal=TRUE)

## 
##  Two Sample t-test
## 
## data:  AA$strength and BB$strength
## t = -1.3729, df = 14, p-value = 0.1914
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -18.448196   4.048196
## sample estimates:
## mean of x mean of y 
##    36.375    43.575

Answer 3

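Since the p-value (0.1914) is greater than alpha = 0.10, we fail to reject H0 at the 0.10 level of significance. There is not enough evidence to conclude that the mean breaking strengths of the two fabrics differ.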
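
To make the pooled-variance calculation explicit, the t statistic can be rebuilt from its definition using the sixteen strength values listed above. This is a sketch in base R; the variable names (`abraded`, `unabraded`, `sp2`, `se`) are illustrative and not part of the original solution.

```r
# Pooled-variance two-sample t statistic computed from its definition
abraded <- c(28.5, 20, 46, 34.5, 36.5, 52.5, 26.5, 46.5)
unabraded <- c(36.4, 55.0, 51.5, 38.7, 43.2, 48.4, 25.6, 49.8)
n1 <- length(abraded)
n2 <- length(unabraded)
df <- n1 + n2 - 2

# Pooled variance: weighted average of the two sample variances
sp2 <- ((n1 - 1) * var(abraded) + (n2 - 1) * var(unabraded)) / df
se <- sqrt(sp2 * (1 / n1 + 1 / n2))  # standard error of the mean difference

t_stat <- (mean(abraded) - mean(unabraded)) / se
p_value <- 2 * pt(-abs(t_stat), df)  # two-sided p-value

alpha <- 0.10
t_crit <- qt(1 - alpha / 2, df)      # two-sided critical value at alpha = 0.10
reject_h0 <- abs(t_stat) > t_crit
print(c(t = t_stat, p = p_value, t_crit = t_crit))
```

The statistic and p-value should match the `t.test(..., var.equal=TRUE)` output above (t = -1.3729, p-value = 0.1914). Note that the decision uses alpha = 0.10, as stated in the question, even though the printed confidence interval is at the default 95% level.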

Question 4

Consider a designed experiment in which the crop yield was measured at 2 levels of crop density/spacing (1=dense, 2=sparse) and 3 levels of fertilizer (1=typeA, 2=typeB, 3=typeC). A total of 96 observations were collected. A colleague of yours did some preliminary analysis of the data in R using the following code (you may copy and paste this code).

library(GAD)

## Loading required package: matrixStats

## Loading required package: R.methodsS3

## R.methodsS3 v1.8.1 (2020-08-26 16:20:06 UTC) successfully loaded. See ?R.methodsS3 for help.

dat4<-read.csv("https://raw.githubusercontent.com/tmatis12/datafiles/main/cropdata2.csv")
str(dat4)

## 'data.frame':    96 obs. of  3 variables:
##  $ density   : int  1 2 1 2 1 2 1 2 1 2 ...
##  $ fertilizer: int  1 1 1 1 1 1 1 1 1 1 ...
##  $ yield     : num  177 178 176 178 177 ...

dat4$density<-as.fixed(dat4$density)
dat4$fertilizer<-as.fixed(dat4$fertilizer)
interaction.plot(dat4$fertilizer,dat4$density,dat4$yield)

mod<-lm(yield~density+fertilizer+density*fertilizer,dat4)
gad(mod)

## Analysis of Variance Table
## 
## Response: yield
##                    Df  Sum Sq Mean Sq F value    Pr(>F)    
## density             1  5.1217  5.1217 15.1945 0.0001864 ***
## fertilizer          2  6.0680  3.0340  9.0011 0.0002732 ***
## density:fertilizer  2  0.4278  0.2139  0.6346 0.5325001    
## Residual           90 30.3367  0.3371                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

mod<-lm(yield~density+fertilizer,dat4)
gad(mod)

## Analysis of Variance Table
## 
## Response: yield
##            Df  Sum Sq Mean Sq F value    Pr(>F)    
## density     1  5.1217  5.1217 15.3162 0.0001741 ***
## fertilizer  2  6.0680  3.0340  9.0731 0.0002533 ***
## Residual   92 30.7645  0.3344                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

summary(dat4)

##  density fertilizer     yield      
##  1:48    1:32       Min.   :175.4  
##  2:48    2:32       1st Qu.:176.5  
##          3:32       Median :177.1  
##                     Mean   :177.0  
##                     3rd Qu.:177.4  
##                     Max.   :179.1

Answer 4

Is the interaction significant (alpha=0.05)?

No. At alpha = 0.05 the density:fertilizer interaction is not significant, since its p-value (0.5325) is greater than 0.05.

Are the main effects significant (alpha=0.05)?

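Yes, both main effects are significant at alpha = 0.05: density (p = 0.0001741) and fertilizer (p = 0.0002533) in the reduced model without the interaction.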

Regardless of how dense the crops are planted, which fertilizer would give you the greatest yield? (justify your answer)

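Fertilizer type C (level 3) would give the greatest yield. Because the density:fertilizer interaction is not significant, the fertilizers can be compared on their main effect, without conditioning on density.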

Suppose that you had to use fertilizer typeA, would you have a greater yield planting the crop dense or sparse? (justify your answer)

Sparse (density 2): with fertilizer type A, the mean yield for sparse planting is higher than the mean yield for dense planting.
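
Both justifications rest on comparing mean yields, which can be tabulated directly. The helper `group_means` below is hypothetical (it is not part of GAD or of the original analysis) and assumes a data frame with the columns `density`, `fertilizer`, and `yield`, as in `dat4`.

```r
# Hypothetical helper: mean yield for each level (or combination of levels)
# of the grouping columns named in `by`
group_means <- function(d, by) {
  tapply(d$yield, d[by], mean)
}

# Intended usage with the exam data (dat4 as loaded above):
# group_means(dat4, "fertilizer")                # fertilizer means, averaged over density
# group_means(dat4, c("density", "fertilizer"))  # cell means; column "1" is typeA
```

The first call would support the choice of fertilizer, and the typeA column of the second would show whether dense or sparse planting yields more under that fertilizer.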

Question 5

Consider designing an experiment in which we wish to test whether there is a difference in the mean between 4 levels of a single factor (i.e. between 4 populations).
Specifically, this is to be a Completely Randomized Design that will be analyzed using ANOVA.
We would like to collect a sufficient number of samples such that the test with an alpha=0.05 level of significance would be able to detect with a power of 85% a mean difference that is 50% of the standard deviation. Determine the number of samples to be collected and propose a randomized data collection table for this experiment.

library(pwr)
library(agricolae)

pwr.anova.test(k=4,n=NULL,f=0.5,sig.level=0.05, power=0.85)

## 
##      Balanced one-way analysis of variance power calculation 
## 
##               k = 4
##               n = 13.32146
##               f = 0.5
##       sig.level = 0.05
##           power = 0.85
## 
## NOTE: n is number in each group

Answer 5

According to the power calculation, n = 13.32 per group is required, which rounds up to 14 samples for each group (56 observations in total).
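
The rounding step can be checked by evaluating the power at n = 13 and n = 14. The sketch below uses the noncentral F distribution in base R with noncentrality parameter k * n * f^2; the function name `anova_power` is illustrative and not part of the pwr package.

```r
# Power of the one-way ANOVA F test for k groups of n observations each,
# given Cohen's effect size f (noncentrality parameter = k * n * f^2)
anova_power <- function(n, k = 4, f = 0.5, alpha = 0.05) {
  df1 <- k - 1
  df2 <- k * (n - 1)
  f_crit <- qf(1 - alpha, df1, df2)  # rejection threshold under H0
  pf(f_crit, df1, df2, ncp = k * n * f^2, lower.tail = FALSE)
}

n_required <- ceiling(13.32146)  # round the reported n up to a whole sample
print(n_required)
print(c(n13 = anova_power(13), n14 = anova_power(14)))
```

With 13 samples per group the power falls short of the 85% target, while 14 per group meets it, which is why the fractional n is rounded up rather than to the nearest integer.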

trt1<-c("lvl1","lvl2","lvl3","lvl4")
design<-design.rcbd(trt1,14,seed=2743536)

#Proposed Randomized Data Collection

design$book

##    plots block trt1
## 1    101     1 lvl2
## 2    102     1 lvl3
## 3    103     1 lvl1
## 4    104     1 lvl4
## 5    201     2 lvl2
## 6    202     2 lvl1
## 7    203     2 lvl3
## 8    204     2 lvl4
## 9    301     3 lvl3
## 10   302     3 lvl1
## 11   303     3 lvl2
## 12   304     3 lvl4
## 13   401     4 lvl4
## 14   402     4 lvl1
## 15   403     4 lvl3
## 16   404     4 lvl2
## 17   501     5 lvl3
## 18   502     5 lvl2
## 19   503     5 lvl4
## 20   504     5 lvl1
## 21   601     6 lvl4
## 22   602     6 lvl2
## 23   603     6 lvl3
## 24   604     6 lvl1
## 25   701     7 lvl2
## 26   702     7 lvl4
## 27   703     7 lvl3
## 28   704     7 lvl1
## 29   801     8 lvl2
## 30   802     8 lvl4
## 31   803     8 lvl1
## 32   804     8 lvl3
## 33   901     9 lvl4
## 34   902     9 lvl1
## 35   903     9 lvl3
## 36   904     9 lvl2
## 37  1001    10 lvl1
## 38  1002    10 lvl2
## 39  1003    10 lvl4
## 40  1004    10 lvl3
## 41  1101    11 lvl1
## 42  1102    11 lvl3
## 43  1103    11 lvl4
## 44  1104    11 lvl2
## 45  1201    12 lvl4
## 46  1202    12 lvl2
## 47  1203    12 lvl1
## 48  1204    12 lvl3
## 49  1301    13 lvl1
## 50  1302    13 lvl3
## 51  1303    13 lvl2
## 52  1304    13 lvl4
## 53  1401    14 lvl1
## 54  1402    14 lvl3
## 55  1403    14 lvl4
## 56  1404    14 lvl2
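
The question asks for a Completely Randomized Design, whereas the table above comes from `design.rcbd`, which randomizes the four levels separately within each of 14 blocks. A minimal base-R sketch of an unblocked randomization follows; the names `levels_trt`, `n_per_level`, and `crd` are illustrative, and the seed is reused only for reproducibility. The agricolae package also provides `design.crd` for this purpose.

```r
# Completely randomized run order: 4 levels x 14 replicates, no blocking
levels_trt <- c("lvl1", "lvl2", "lvl3", "lvl4")
n_per_level <- 14

set.seed(2743536)  # same seed value as above, reused only for reproducibility
crd <- data.frame(
  run = seq_len(length(levels_trt) * n_per_level),
  trt = sample(rep(levels_trt, each = n_per_level))  # shuffle all 56 runs at once
)

head(crd)
table(crd$trt)  # each level should appear 14 times
```

Here every one of the 56 runs is equally likely to receive any remaining treatment label, so the order is not constrained to contain each level once per group of four.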

Final Examination - Stat and DOE Using R - CR

Jéssica Montero

2021-05-28