Range: 93-55 = 38
Variance:
\[ \bar{X} = \frac{(66 + 75 + 72 + 71 + 55 + 56 + 72 + 93 + 73 + 72 + 72 + 73 + 91 + 66 + 71 + 56 + 59)}{17} = \frac{1193}{17} = 70.176\]
\[ \Sigma(X-\overline{X})^2 = 1800.471\]
\[N-1 = 17-1 = 16\]
\[s^2 = \frac{1800.471}{16} = 112.529\]
\[s = \sqrt{112.529} = 10.608\]
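The steps above can be reproduced with a short Python sketch. It uses only the 17 scores listed in the mean calculation; the variable names are illustrative and not part of the original solution.

```python
# Scores copied from the mean calculation above.
scores = [66, 75, 72, 71, 55, 56, 72, 93, 73, 72, 72, 73, 91, 66, 71, 56, 59]

n = len(scores)
data_range = max(scores) - min(scores)        # largest minus smallest score
x_bar = sum(scores) / n                       # sample mean
ss = sum((x - x_bar) ** 2 for x in scores)    # sum of squared deviations
s2 = ss / (n - 1)                             # sample variance (divides by N - 1)
s = s2 ** 0.5                                 # sample standard deviation

print(data_range, round(x_bar, 3), round(ss, 3), round(s2, 3), round(s, 3))
```

Dividing by N - 1 rather than N matches the sample formulas used throughout this answer key.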
Data sets
Original data: mean = 18.143, sd = 11.682 \[\bar{X} = \frac{(12 + 15 + 34 + 28 + 27 + 9 + 2)}{7} = 18.143\] \[ \Sigma(X-\overline{X})^2 = 818.857\] \[N-1 = 7-1 = 6\] \[s^2 = \frac{818.857}{6} = 136.476\] \[s = \sqrt{136.476} = 11.682\]
Original + 3: mean = 21.143, sd = 11.682 \[\bar{X} = \frac{(15 + 18 + 37 + 31 + 30 + 12 + 5)}{7} = 21.143\] \[ \Sigma(X-\overline{X})^2 = 818.857\] \[N-1 = 7-1 = 6\] \[s^2 = \frac{818.857}{6} = 136.476\] \[s = \sqrt{136.476} = 11.682\]
Original - 3: mean = 15.143, sd = 11.682 \[\bar{X} = \frac{(9 + 12 + 31 + 25 + 24 + 6 + (-1))}{7} = 15.143\] \[ \Sigma(X-\overline{X})^2 = 818.857\] \[N-1 = 7-1 = 6\] \[s^2 = \frac{818.857}{6} = 136.476\] \[s = \sqrt{136.476} = 11.682\]
As these transformations show, adding or subtracting a constant k shifts the mean by exactly k (the new mean is mean + k for addition and mean - k for subtraction). However, because every observation is shifted by the same k, the deviations from the mean do not change, so the variability (variance and standard deviation) of the transformed dataset is the same as that of the original dataset.
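A minimal sketch of the same check in Python, assuming the standard-library `statistics` module (whose `stdev` uses the N - 1 denominator). The list names are illustrative.

```python
from statistics import mean, stdev

original = [12, 15, 34, 28, 27, 9, 2]
k = 3

plus_k = [x + k for x in original]    # every observation shifted up by k
minus_k = [x - k for x in original]   # every observation shifted down by k

for label, data in [("original", original), ("+3", plus_k), ("-3", minus_k)]:
    # The mean moves by k; the standard deviation stays the same.
    print(label, round(mean(data), 3), round(stdev(data), 3))
```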
Data sets
Original multiplied by 3: mean = 54.429, sd = 35.047 \[\bar{X} = \frac{(36 + 45 + 102 + 84 + 81 + 27 + 6)}{7} = 54.429\] \[ \Sigma(X-\overline{X})^2 = 7369.714\] \[N-1 = 7-1 = 6\] \[s^2 = \frac{7369.714}{6} = 1228.286\] \[s = \sqrt{1228.286} = 35.047\]
Original divided by 3: mean = 6.048, sd = 3.894 \[\bar{X} = \frac{(4 + 5 + 11.333 + 9.333 + 9 + 3 + 0.667)}{7} = 6.048\] \[ \Sigma(X-\overline{X})^2 = 90.984\] \[N-1 = 7-1 = 6\] \[s^2 = \frac{90.984}{6} = 15.164\] \[s = \sqrt{15.164} = 3.894\]
As these transformations show, multiplying or dividing every observation in a dataset by a constant k multiplies or divides the standard deviation by k, respectively. \[ 11.682 \times 3 \approx 35.047\] \[ \frac{11.682}{3} = 3.894\] Similarly, multiplying or dividing every observation by a constant k multiplies or divides the mean by k, respectively. \[18.143 \times 3 = 54.429\] \[\frac{18.143}{3} = 6.048\]
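The scaling rule can be checked the same way. This is a sketch using the standard-library `statistics` module; the list names are illustrative.

```python
from statistics import mean, stdev

original = [12, 15, 34, 28, 27, 9, 2]
k = 3

times_k = [x * k for x in original]   # 36, 45, 102, 84, 81, 27, 6
div_k = [x / k for x in original]     # 4.0, 5.0, 11.333..., 9.333..., 9.0, 3.0, 0.667...

for label, data in [("original", original), ("x3", times_k), ("/3", div_k)]:
    # Both the mean and the standard deviation scale by the same factor.
    print(label, round(mean(data), 3), round(stdev(data), 3))
```

Because the standard deviation scales by k, the variance scales by k squared, which is why the sum of squared deviations goes from 818.857 to 7369.714 (a factor of 9) when the data are multiplied by 3.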
Original data set: {5, 8, 3, 8, 6, 9, 9, 7}
\[\bar{X} = \frac{(5 + 8 + 3 + 8 + 6 + 9 + 9 + 7)}{8} = 6.875\] \[ \Sigma(X-\overline{X})^2 = 30.875\] \[N-1 = 8-1 = 7\] \[s^2 = \frac{30.875}{7}= 4.411\] \[s = \sqrt{4.411} = 2.1002\]
Transformed data set: \[ \frac{\text{observation}}{s} = (2.381, 3.809, 1.429, 3.809, 2.857, 4.285, 4.285, 3.333)\] \[ \Sigma(X-\overline{X})^2 = 7\] \[N-1 = 8-1 = 7\] \[s^2 = \frac{7}{7}= 1\] \[s = \sqrt{1} = 1\]
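As a quick check that dividing each observation by the sample standard deviation gives a dataset with a standard deviation of 1, here is a sketch using the standard-library `statistics` module. The variable names are illustrative.

```python
from statistics import mean, stdev

data = [5, 8, 3, 8, 6, 9, 9, 7]

s = stdev(data)                       # sample standard deviation of the original data
scaled = [x / s for x in data]        # each observation divided by s

scaled_mean = mean(scaled)
scaled_ss = sum((x - scaled_mean) ** 2 for x in scaled)   # sum of squared deviations
scaled_sd = stdev(scaled)

print(round(s, 4), round(scaled_ss, 3), round(scaled_sd, 3))
```

The sum of squared deviations of the scaled data equals N - 1 = 7, so the variance is 7/7 = 1.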
Question #1
Question #2
\(\frac{99.4-98.6}{0.7} = 1.143\)
\(\frac{101.1-98.6}{0.7} = 3.571\)
\(\frac{98.5-98.6}{0.7} = -0.143\)
\(\frac{94.1-98.6}{0.7} = -6.429\)
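The four z-scores above follow the same pattern, sketched here in Python. The helper name `z_score` is hypothetical; the mean (98.6) and standard deviation (0.7) are taken from the calculations above.

```python
def z_score(x, mean, sd):
    """Return how many standard deviations x lies from the mean."""
    return (x - mean) / sd

MEAN = 98.6   # mean used in the calculations above
SD = 0.7      # standard deviation used in the calculations above

values = [99.4, 101.1, 98.5, 94.1]
z_scores = [round(z_score(x, MEAN, SD), 3) for x in values]
print(z_scores)
```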
Question #3
Variance is based on squared deviations and is less easy to interpret; taking the square root gives a measure of variability that is in the units of the data.
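We square the deviation scores so that negative deviations become positive; otherwise the positive and negative deviations would cancel each other out and always sum to zero.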
Question #4
\(\frac{(6 + 3 + 5 + 0 + 6)}{5} = 4\)
Because the sum of the first five deviation scores is -1, and the sum of the deviations from the mean in a complete data set always equals zero, the missing deviation value must be equal to 1. Thus (-3, 1, 1, -4, 4, 1) is the complete set of deviation scores.
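The missing deviation can be recovered with a one-line calculation, sketched below. The list name `known_deviations` is illustrative.

```python
# The five deviation scores given in the problem.
known_deviations = [-3, 1, 1, -4, 4]

# Deviations from the mean must sum to zero, so the missing value
# is whatever cancels the sum of the known ones.
missing = -sum(known_deviations)

complete = known_deviations + [missing]
print(missing, complete, sum(complete))
```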
Question #5
a)
There appears to be a reasonably strong positive linear correlation between the number of hours spent studying and the score earned on the math test. Stated differently, students who spent more hours studying tended to earn higher math test scores.
\(N-1 = 10-1 = 9\)
\(\text{Covariance} = \frac{\Sigma{((\text{Time} - \overline{\text{Time}})(\text{Score} - \overline{\text{Score}}))}}{9}\)
\(r = \frac{\text{Covariance}(\text{time}, \text{score})}{(s_{time})(s_{score})}\)
\(cov = \frac{41.95}{9} = 4.661\)
\(s_{time} = \sqrt{\frac{9.8}{9}} = \sqrt{1.089} = 1.044\)
\(s_{score} = \sqrt{\frac{253.301}{9}} = \sqrt{28.145} = 5.305\)
\(r = \frac{4.661}{(1.044)(5.305)} = 0.842\)
This correlation coefficient \(r = 0.842\) is considered to be a strong positive correlation, as previously estimated.
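Since the raw data are not repeated here, the following sketch computes r from the three sums used above: the sum of cross-products (41.95) and the two sums of squared deviations (9.8 and 253.301). The function name `pearson_r_from_sums` is hypothetical.

```python
import math

def pearson_r_from_sums(sum_cross, ss_x, ss_y, n):
    """Pearson r from the sum of cross-products and the sums of squares."""
    cov = sum_cross / (n - 1)          # sample covariance
    s_x = math.sqrt(ss_x / (n - 1))    # sample standard deviation of x
    s_y = math.sqrt(ss_y / (n - 1))    # sample standard deviation of y
    return cov / (s_x * s_y)

r_study = pearson_r_from_sums(41.95, 9.8, 253.301, n=10)
print(round(r_study, 3))
```

The N - 1 terms cancel, so the same value comes from dividing the sum of cross-products by the square root of the product of the two sums of squares.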
Question #6
a)
\(N-1 = 10-1 = 9\)
\(\text{Covariance} = \frac{\Sigma{((\text{Mood} - \overline{\text{Mood}})(\text{Pos.Actions} - \overline{\text{Pos.Actions}}))}}{9}\)
\(r = \frac{\text{Covariance}(\text{mood}, \text{pos.actions})}{(s_{mood})(s_{pos.actions})}\)
\(cov = \frac{-46.44}{9} = -5.16\)
\(s_{mood} = \sqrt{\frac{31.329}{9}} = \sqrt{3.481} = 1.866\)
\(s_{pos.actions} = \sqrt{\frac{86.4}{9}} = \sqrt{9.6} = 3.098\)
\(r = \frac{-5.16}{(1.866)(3.098)} = -0.893\)
b)
Mood Z-scores:
\(s_{mood} = 1.866\)
\(\overline{mood} = 5.99\)
\(\text{Z-transformation} = \frac{\text{mood score} - \overline{mood}}{s_{mood}} = \frac{\text{mood score} - 5.99}{1.866} = (1.131, -0.799, 0.649, 0.059, -1.71, -1.281, 0.756, -0.316, 1.131, 0.381)\)
Positive action Z-scores:
\(s_{pos.action} = 3.098\)
\(\overline{pos.action} = 10.66\)
\(\text{Z-transformation} = \frac{\text{pos.action} - \overline{pos.action}}{s_{pos.action}} = \frac{\text{pos.action} - 10.66}{3.098} = (-0.516, 1.097, -1.162, 0.129, 2.066, 0.452, -0.516, -0.194, -1.162, -0.194)\)
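One way to sanity-check the two sets of z-scores is to use the identity that Pearson's r equals the sum of the paired z-score products divided by N - 1. The sketch below applies it to the rounded z-scores listed above, so the result only approximately matches the r found in part a). The variable names are illustrative.

```python
# Rounded z-scores copied from the two transformations above.
z_mood = [1.131, -0.799, 0.649, 0.059, -1.71, -1.281, 0.756, -0.316, 1.131, 0.381]
z_pos = [-0.516, 1.097, -1.162, 0.129, 2.066, 0.452, -0.516, -0.194, -1.162, -0.194]

n = len(z_mood)

# r is the sum of paired z-score products divided by N - 1.
r_from_z = sum(zm * zp for zm, zp in zip(z_mood, z_pos)) / (n - 1)

# Each set of z-scores should have a mean near 0.
mean_z_mood = sum(z_mood) / n
mean_z_pos = sum(z_pos) / n

print(round(r_from_z, 3), round(mean_z_mood, 3), round(mean_z_pos, 3))
```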
Question #7
Charlie is correct, because the z-scores are normally distributed. Sasha’s graph shows a negatively skewed distribution.