You can complete this homework by filling in the rest of the .Rmd document. When you click the Knit button in RStudio, a document is generated that includes both the content and the output of any embedded R code chunks.
The NAEP music assessment scores for eighth-grade students are approximately \(N(150,35)\). Find z-scores by standardizing the following scores: 150, 140, 100, 180, 230.
mean = 150
sd = 35
zscore1 = (150-150)/35
zscore1
## [1] 0
zscore2 = (140-150)/35
zscore2
## [1] -0.2857143
zscore3 = (100-150)/35
zscore3
## [1] -1.428571
zscore4= (180-150)/35
zscore4
## [1] 0.8571429
zscore5= (230-150)/35
zscore5
## [1] 2.285714
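The five standardizations can also be done in one vectorized step. This is a minimal sketch; the names `mu`, `sigma`, `scores`, and `z` are chosen here for illustration and do not appear in the original homework.

```r
# Parameters of the N(150, 35) model given in the problem
mu <- 150
sigma <- 35

# Scores to standardize
scores <- c(150, 140, 100, 180, 230)

# z = (x - mean) / sd, applied to every score at once
z <- (scores - mu) / sigma
round(z, 4)
# 0.0000 -0.2857 -1.4286 0.8571 2.2857
```

Using named parameters instead of retyping 150 and 35 in each line makes it harder to subtract the wrong value by accident.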
Many random number generators allow users to specify the range of the random numbers to be produced. Suppose that you specify that the outcomes are to be distributed uniformly between 0 and 4. Then the density curve of the outcomes has constant height between 0 and 4, and height 0 elsewhere.
grid = seq(0,4)
height = dunif(grid, min=0, max=4) # uniform density on [0, 4], not the normal density
a = 0.25 # height of the density curve between 0 and 4: 1/(4-0)
plot(a,type="n",xlim=c(-1,5), ylim=c(0,a),xlab="Values",ylab="Density")
segments(-1,0,0,0,col="red")
segments(0,0,0,a,col="red")
segments(0,a,4,a,col="red")
segments(4,a,4,0,col="red")
segments(4,0,5,0,col="red")
25% of outcomes will be less than 1: the area under the curve from 0 to 1 is 1 × 0.25 = 0.25.
50% of outcomes will be between 0.5 and 2.5: the area is (2.5 - 0.5) × 0.25 = 0.5.
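These two areas can be checked with `punif()`, the uniform cumulative distribution function in base R. This is a small sketch; the variable names are illustrative.

```r
# P(X < 1) for X uniform on [0, 4]
p_less_than_1 <- punif(1, min = 0, max = 4)

# P(0.5 < X < 2.5) as a difference of two cumulative probabilities
p_between <- punif(2.5, min = 0, max = 4) - punif(0.5, min = 0, max = 4)

c(p_less_than_1, p_between)
# 0.25 0.50
```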
\(Z>1.6\): the proportion of observations is 1 - 0.9452 = 0.0548, or 5.48%.
\(-1.6 \leq Z<1.8\): the proportion of observations is 0.9641 - 0.0548 = 0.9093, or 90.93%.
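The table lookups for the standard normal proportions can be reproduced with `pnorm()`. This is a sketch with illustrative variable names.

```r
# P(Z > 1.6): upper-tail area beyond 1.6
p_above <- pnorm(1.6, lower.tail = FALSE)

# P(-1.6 <= Z < 1.8): area below 1.8 minus area below -1.6
p_between <- pnorm(1.8) - pnorm(-1.6)

round(c(p_above, p_between), 4)
# 0.0548 0.9093
```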
-.52
.31
Osteoporosis is a condition where bones become weak. Exercise is one way to produce strong bones and to prevent osteoporosis. Since we use our dominant arm (the right arm for most people) more than our nondominant arm, we expect the bone in our dominant arm to be stronger than the bone in our nondominant arm. By comparing the strengths, we can get an idea of the effect that exercise can have on bone strength. Here are some data on the strength of bones, measured in \(cm^4/1000\), for the arms of 15 young men:
bonestrength <- data.frame(
  Nondominant=c(15.7, 25.2, 17.9, 19.1, 12.0, 20.0, 12.3, 14.4, 15.9, 13.7, 17.7, 15.5, 14.4, 14.1, 12.3),
  Dominant=c(16.3, 26.9, 18.7, 22.0, 14.8, 19.8, 13.1, 17.5, 20.1, 18.7, 18.7, 15.2, 16.2, 15.0, 12.9))
print(bonestrength)
## Nondominant Dominant
## 1 15.7 16.3
## 2 25.2 26.9
## 3 17.9 18.7
## 4 19.1 22.0
## 5 12.0 14.8
## 6 20.0 19.8
## 7 12.3 13.1
## 8 14.4 17.5
## 9 15.9 20.1
## 10 13.7 18.7
## 11 17.7 18.7
## 12 15.5 15.2
## 13 14.4 16.2
## 14 14.1 15.0
## 15 12.3 12.9
plot(bonestrength, xlab = "Nondominant arm", ylab = "Dominant arm")
Describe the overall pattern in the scatterplot and any striking deviations from the pattern.
The overall pattern is that as strength in the nondominant arm increases, so does strength in the dominant arm, so the two variables have a positive association. The relationship looks moderately linear. There is one clear outlier (nondominant 25.2, dominant 26.9) near the upper limits of the graph, far beyond the rest of the data.
Describe the form, direction, and strength of the relationship.
The form is a linear relationship.
The direction is positive.
The strength is moderate.
We know that there is a positive correlation by looking at the scatterplot. I would rate the strength of the correlation as moderate. However, the correlation could be misleading if it is read as causal, because correlation is not causation: we cannot be certain that strength in the nondominant arm leads to a stronger dominant arm. Other factors can be at play!
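The visual rating of the strength can be checked numerically with `cor()`. The sketch below rebuilds the data frame from the values printed above so that it runs on its own; the name `r_bone` is illustrative.

```r
# Bone strength data for the 15 young men, as printed above
bonestrength <- data.frame(
  Nondominant = c(15.7, 25.2, 17.9, 19.1, 12.0, 20.0, 12.3, 14.4,
                  15.9, 13.7, 17.7, 15.5, 14.4, 14.1, 12.3),
  Dominant    = c(16.3, 26.9, 18.7, 22.0, 14.8, 19.8, 13.1, 17.5,
                  20.1, 18.7, 18.7, 15.2, 16.2, 15.0, 12.9)
)

# Pearson correlation between the two arms
r_bone <- cor(bonestrength$Nondominant, bonestrength$Dominant)
r_bone
```

A value close to 1 would indicate a stronger linear relationship than the scatterplot alone might suggest, so it is worth comparing the number with the visual impression.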
The following 20 observations on \(Y\) and \(X\) were generated by a computer program.
## Y X
## 1 25.66 22.06
## 2 19.53 19.88
## 3 20.59 18.83
## 4 20.50 22.09
## 5 22.65 17.19
## 6 21.88 20.72
## 7 18.25 18.10
## 8 16.96 18.01
## 9 19.52 18.69
## 10 20.52 18.05
## 11 16.80 17.75
## 12 21.35 19.96
## 13 21.04 17.87
## 14 22.73 20.20
## 15 22.02 20.65
## 16 19.12 20.32
## 17 25.19 21.37
## 18 16.72 17.31
## 19 23.59 23.50
## 20 19.76 22.02
plot(gendata$X, gendata$Y, xlab = "X", ylab = "Y")
I think that the relationship between X and Y is linear and positive, but weak to moderate in strength: the points scatter widely around any straight line.
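You would use these types of calculations to find the least-squares regression line (LSRL).
# logbody and logbrain stand in for the explanatory (x) and response (y) variables
xbar = mean(logbody)
sx = sd(logbody)
ybar = mean(logbrain)
sy = sd(logbrain)
r = cor(logbody, logbrain)
slope = r*sy/sx
intercept = ybar - slope*xbar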
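The slope and intercept formulas above can be applied directly to the generated data. This sketch assumes `gendata` holds exactly the 20 printed observations (rounded to two decimals) and treats X as the explanatory variable and Y as the response.

```r
# The 20 generated observations, as printed above
gendata <- data.frame(
  Y = c(25.66, 19.53, 20.59, 20.50, 22.65, 21.88, 18.25, 16.96, 19.52, 20.52,
        16.80, 21.35, 21.04, 22.73, 22.02, 19.12, 25.19, 16.72, 23.59, 19.76),
  X = c(22.06, 19.88, 18.83, 22.09, 17.19, 20.72, 18.10, 18.01, 18.69, 18.05,
        17.75, 19.96, 17.87, 20.20, 20.65, 20.32, 21.37, 17.31, 23.50, 22.02)
)

# Summary statistics for x (explanatory) and y (response)
xbar <- mean(gendata$X)
sx   <- sd(gendata$X)
ybar <- mean(gendata$Y)
sy   <- sd(gendata$Y)
r    <- cor(gendata$X, gendata$Y)

# Least-squares slope and intercept for predicting Y from X
slope     <- r * sy / sx
intercept <- ybar - slope * xbar
c(intercept = intercept, slope = slope)
```

Because the line always passes through the point of means, the intercept follows from the slope once `xbar` and `ybar` are known.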
plot(gendata$X, gendata$Y, xlab = "X", ylab = "Y")
cor(gendata$X,gendata$Y)
## [1] 0.5967597
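# Note: this formula regresses X on Y. To predict Y from X, the formula would be gendata$Y ~ gendata$X.
line <- lm(gendata$X ~ gendata$Y)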
summary(line)
##
## Call:
## lm(formula = gendata$X ~ gendata$Y)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.3838 -0.5777 -0.1679 0.5304 2.7113
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 10.6585 2.8949 3.682 0.00171 **
## gendata$Y 0.4378 0.1387 3.155 0.00547 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.532 on 18 degrees of freedom
## Multiple R-squared: 0.3561, Adjusted R-squared: 0.3204
## F-statistic: 9.956 on 1 and 18 DF, p-value: 0.005475
abline(line, col = "red", lwd = 3)
cor(gendata$X,gendata$Y)^2
## [1] 0.3561222
35.6% of the variability in Y is explained by the linear relationship between X and Y.
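Predicting y from x requires the regression of Y on X, whose slope is r*sy/sx, about 0.8135, and whose intercept is ybar - slope*xbar, about 4.67. If we were given a new x value of 21, the predicted y value (y-hat) would be about 21.75. Plugging 21 into the X ~ Y coefficients above gives 19.85, but that is a predicted x for y = 21, not a predicted y.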
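The prediction step can be sketched with `lm()` and `predict()`. This assumes `gendata` holds exactly the 20 printed observations and that Y is the response and X the explanatory variable; the names `fit` and `y_hat` are illustrative.

```r
# The 20 generated observations, as printed above
gendata <- data.frame(
  Y = c(25.66, 19.53, 20.59, 20.50, 22.65, 21.88, 18.25, 16.96, 19.52, 20.52,
        16.80, 21.35, 21.04, 22.73, 22.02, 19.12, 25.19, 16.72, 23.59, 19.76),
  X = c(22.06, 19.88, 18.83, 22.09, 17.19, 20.72, 18.10, 18.01, 18.69, 18.05,
        17.75, 19.96, 17.87, 20.20, 20.65, 20.32, 21.37, 17.31, 23.50, 22.02)
)

# Regress the response Y on the explanatory variable X
fit <- lm(Y ~ X, data = gendata)

# Predicted Y for a new observation with X = 21
y_hat <- predict(fit, newdata = data.frame(X = 21))
y_hat
```

Writing the formula as `Y ~ X` with a `data` argument, rather than `gendata$Y ~ gendata$X`, is what lets `predict()` match the column name `X` in `newdata`.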