Problem 1 The percentage of cotton in material used to manufacture men’s shirts follows.
- Compute the sample mean, variance and median.
library(openxlsx)
dat= read.xlsx('/Users/seventails/Downloads/Stat/Week 5/HW4_data.xlsx')
mean(dat$Cotton_Percentage)
## [1] 34.79844
var(dat$Cotton_Percentage)
## [1] 1.860791
median(dat$Cotton_Percentage)
## [1] 34.7
- Construct a stem-and-leaf display for the data.
stem(dat$Cotton_Percentage)
##
## The decimal point is at the |
##
## 32 | 156789
## 33 | 11456666688
## 34 | 011122355666667777779
## 35 | 00111234456789
## 36 | 2346888
## 37 | 13689
- Construct a frequency distribution and histogram for the cotton content.
#Creating a numeric vector of cotton percentages as input to the cut method
percentages= dat$Cotton_Percentage
#Creating bins to sort the percentages into
bins= seq(32, 39, by= 1)
#sort the values into the bins created above
freq= cut(percentages, bins)
#view the factor created above as a table.
transform(table(freq))
## freq Freq
## 1 (32,33] 6
## 2 (33,34] 12
## 3 (34,35] 22
## 4 (35,36] 12
## 5 (36,37] 7
## 6 (37,38] 5
## 7 (38,39] 0
#Create the histogram
hist(percentages, main = "Histogram for Cotton Content", xlab = "cotton percentage")
(d) Construct a box plot of the data and comment on the information in this display.
boxplot(dat$Cotton_Percentage)

## The data range from about 32 to 38.
## The median lies between 34.5 and 35. There are no outliers in the data.
Problem 2. Suppose X has a Bernoulli distribution with parameter p. Show that the sample mean \(\overline{X}\) is an MVUE of p.
Let \(\sf{X_{1}, X_{2},..., X_{n}}\) be Bernoulli trials with success parameter p and set the estimator for \(p\) to be \(d(X)= \overline{X}\), the sample mean. Then,
\[\begin{aligned}
E_{p} \overline{X}&= \frac{1}{n}\left(EX_{1}+EX_{2}+\dots+EX_{n}\right)\\
&=\frac{1}{n}(p + p+...+ p)\\
&=p
\end{aligned}\]
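Unbiasedness alone does not establish minimum variance. One standard way to complete the argument, sketched here under the assumption that the trials are independent, is to compare the variance of \(\overline{X}\) with the Cramér-Rao lower bound, where \(I(p)\) denotes the Fisher information of a single observation:
\[\begin{aligned}
Var_{p}(\overline{X})&= \frac{1}{n^{2}}\sum_{i=1}^{n}Var(X_{i})= \frac{p(1-p)}{n}\\
I(p)&= E_{p}\left[\left(\frac{\partial}{\partial p}\log{p^{X}(1-p)^{1-X}}\right)^{2}\right]= E_{p}\left[\frac{(X-p)^{2}}{p^{2}(1-p)^{2}}\right]= \frac{1}{p(1-p)}\\
\frac{1}{nI(p)}&= \frac{p(1-p)}{n}= Var_{p}(\overline{X})
\end{aligned}\]
Since \(\overline{X}\) is unbiased and its variance attains the lower bound, no unbiased estimator of \(p\) has smaller variance, so \(\overline{X}\) is an MVUE of \(p\).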
Problem 3 Use the MLE method to build estimators for the parameters of the following distributions:
- Bernoulli
\[\begin{aligned}
\text{The likelihood function is}\\
L(p)&= \prod_{i = 1}^{n} p^{X_{i}}(1-p)^{1-X_{i}}\\
\text{The log-likelihood function is}\\
LL(p)&= \sum_{i= 1}^{n}\log\left[ p^{X_{i}}(1-p)^{1-X_{i}}\right]\\
&=\sum_{i= 1}^{n} \left[X_{i}\log{p} + (1-X_{i}) \log{(1-p)}\right]\\
&= Y\log{p}+ (n-Y)\log{(1-p)} &\text{where } Y= \sum_{i= 1}^{n} X_{i}
\end{aligned}\]
Now finding the first derivative of the function and setting it to 0 to find the MLE:
\[\begin{aligned}
\frac{\partial LL(p)}{\partial p}&= \frac{Y}{p} - \frac{n-Y}{1-p}= 0\\
Y(1-p)&= (n-Y)p\\
\hat{p}&= \frac{Y}{n}= \frac{\sum_{i= 1}^{n} X_{i}}{n}\\
\text{The estimator is the sample mean.}
\end{aligned}\]
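The closed-form result can be checked numerically. The following is a minimal sketch, not part of the original solution: the data are simulated, and the seed, sample size, and true value of p are illustrative assumptions.

```r
# Hypothetical simulated Bernoulli sample (seed, size, and true p are assumptions)
set.seed(1)
x <- rbinom(200, size = 1, prob = 0.3)
n <- length(x)
y <- sum(x)

# Bernoulli log-likelihood written in terms of Y = sum(x)
loglik <- function(p) y * log(p) + (n - y) * log(1 - p)

# Maximize numerically over the open interval (0, 1)
fit <- optimize(loglik, interval = c(1e-6, 1 - 1e-6), maximum = TRUE)
p_hat_numeric <- fit$maximum

# Closed-form MLE: the sample mean
p_hat_closed <- mean(x)

c(numeric = p_hat_numeric, closed_form = p_hat_closed)
```

The two values should agree up to the tolerance of `optimize`.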
- Exponential
\[\begin{aligned}
\text{The likelihood function is}\\
L(\lambda; X_{1},X_{2}, ..., X_{n})&= \prod_{i = 1}^{n}\lambda e^{-\lambda X_{i}}\\
&= \lambda ^{n}e^{-\lambda \sum_{i= 1}^{n}X_{i}}\\
\text{The log-likelihood function is}\\
LL(\lambda)&= n\log{\lambda}-\lambda \sum_{i= 1}^{n}X_{i}\\
&=n\log{\lambda}-\lambda Y &\text{where } Y= \sum_{i= 1}^{n} X_{i}\\
\frac{\partial LL(\lambda)}{\partial \lambda}&= \frac{n}{\lambda} - Y= 0\\
\hat{\lambda} &= \frac{n}{Y}= \frac{n}{\sum_{i=1}^{n}X_{i}}\\
\text{The estimator is the reciprocal of the sample mean.}
\end{aligned}\]
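As a sanity check, the exponential log-likelihood can be maximized numerically and compared with the closed form. This is a minimal sketch on simulated data; the seed, sample size, true rate, and search interval are assumptions made only for illustration.

```r
# Hypothetical simulated exponential sample (seed, size, and true rate are assumptions)
set.seed(2)
x <- rexp(200, rate = 2)
n <- length(x)
y <- sum(x)

# Exponential log-likelihood: n*log(lambda) - lambda*Y
loglik <- function(lambda) n * log(lambda) - lambda * y

# Maximize numerically over an assumed search interval for lambda
fit <- optimize(loglik, interval = c(1e-6, 50), maximum = TRUE)
lambda_hat_numeric <- fit$maximum

# Closed-form MLE: reciprocal of the sample mean
lambda_hat_closed <- n / y

c(numeric = lambda_hat_numeric, closed_form = lambda_hat_closed)
```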
- Lognormal
\[\begin{aligned}
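\text{With } \theta_{0}=\mu \text{ and } \theta_{1}=\sigma^{2} \text{, the likelihood function is}\\
L(\theta) &= \prod_{i = 1}^{n}\frac{1}{X_{i}\sqrt{2\pi\theta_{1}}} e^{-\frac{(\log X_{i}-\theta_{0})^2}{2\theta_{1}}}\\
\text{The log-likelihood function is}\\
LL(\theta) &= -\frac{n}{2}\log{(2\pi\theta_{1})}-\sum_{i = 1}^{n}\log{X_{i}}-\frac{1}{2\theta_{1}}\sum_{i = 1}^{n}(\log X_{i}-\theta_{0})^2\\
\text{Setting both partial derivatives to 0 and solving for } \theta_{0} \text{ and } \theta_{1}\\
\hat{\mu}= \hat{\theta_{0}}&= \frac{1}{n}\sum_{i=1}^{n}\log X_{i}\\
\hat{\sigma^2}= \hat{\theta_{1}}&= \frac{1}{n}\sum_{i=1}^{n}(\log X_{i}-\hat{\theta_{0}})^2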
\end{aligned}\]
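The lognormal MLE can also be checked numerically. The sketch below assumes \(\theta_{0}\) and \(\theta_{1}\) denote the mean and variance of \(\log X\), and compares the sample mean and the divide-by-n variance of the logged data with a direct numerical maximization. The data are simulated, and the seed, sample size, true parameters, and starting values are illustrative assumptions.

```r
# Hypothetical simulated lognormal sample (seed, size, and true parameters are assumptions)
set.seed(3)
x <- rlnorm(500, meanlog = 1, sdlog = 0.5)

# Closed-form estimates: mean and divide-by-n variance of log(x)
theta0_closed <- mean(log(x))
theta1_closed <- mean((log(x) - theta0_closed)^2)

# Negative log-likelihood; the variance is parameterized on the log scale
# so that the optimizer cannot propose a negative variance
negloglik <- function(par) {
  -sum(dlnorm(x, meanlog = par[1], sdlog = sqrt(exp(par[2])), log = TRUE))
}

# Minimize numerically from an assumed starting point
fit <- optim(c(0, 0), negloglik, method = "BFGS")
theta0_numeric <- fit$par[1]
theta1_numeric <- exp(fit$par[2])

rbind(closed_form = c(theta0_closed, theta1_closed),
      numeric = c(theta0_numeric, theta1_numeric))
```

Note that the variance estimate divides by n, so it differs from `var(log(x))`, which divides by n - 1.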
Problem 6 (24 points total)
- Calculate the least squares estimates of the slope and intercept. Graph the regression line. (12 points)
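#Summary statistics: the sigma_* variables hold the sums of y, y^2, x, x^2 and xy over the n observations
n= 20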
sigma_Y= 12.7
sigma_Y2= 8.8
sigma_X=1487
sigma_X2=143215
sigma_XY= 1083
S_XX= sigma_X2- (sigma_X^2/n)
S_XY= sigma_XY-((sigma_Y*sigma_X)/n)
b1= S_XY/S_XX
b1
## [1] 0.004248918
b0= sigma_Y/n- (b1*(sigma_X/n))
b0
## [1] 0.319093
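#Regression line as a function of surface temperature x
L= function(x){
b0+b1*(x)
}
#Plot the fitted line; the 0 to 100 range is an assumed plotting window, not taken from the data
plot(L, from= 0, to= 100, xlab= "surface temperature (F)", ylab= "pavement deflection")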
(b) Use the equation of the fitted line to predict what pavement deflection would be observed when the surface temperature is 85°F. (4 points)
b0+(b1*85)
## [1] 0.680251
- What is the mean pavement deflection when the surface temperature is 90°F? (4 points)
b0+(b1*90)
## [1] 0.7014956
- What change in mean pavement deflection would be expected for a 1°F change in surface temperature? (4 points)
b1
## [1] 0.004248918