Mean_Rate | Median_Rate | Quart_1 | Quart_3 |
---|---|---|---|
159 | 161 | 138 | 168 |
The mean and median hotel rates are very close, indicating that the distribution is roughly symmetric. We shouldn’t see much skew on either side of the distribution.
Mean_Cost | Median_Cost | First_Quart | Third_Quart | P_90 |
---|---|---|---|---|
165 | 152 | 115 | 184 | 237 |
SD | Upper | Lower |
---|---|---|
65 | 360 | -30.4 |
a. Compute the mean, median, and mode. From this information do you believe the distribution to be symmetrical or skewed?
The mean ($165) is somewhat larger than the median ($152), which suggests a small amount of right skew.
b. Compute the first and third quartiles.
c. Compute and interpret the 90th Percentile.
d. Compute the sample standard deviation and interpret the results.
A sample standard deviation of $65 suggests a moderate to small amount of variation.
e. Create a histogram for the variable cost. Describe the shape of the distribution. Is this consistent with your response from above using the mean and median? Explain this result.
The histogram suggests there is some right skew in the distribution. Most observations fall between 0 and 200, with a few observations in the right tail.
f. Determine the value for cost that would be considered an outlier both above and below the mean.
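As a rough check on the thresholds for part f, the z-score criterion (three standard deviations from the mean) can be applied to the rounded summary values in the tables above (mean 165, SD 65). This is only a sketch: the function name `z_outlier_bounds` and the cutoff parameter `k` are illustrative, and the reported bounds (360 and -30.4) were presumably computed from unrounded statistics, so the results can differ slightly.

```python
def z_outlier_bounds(mean, sd, k=3.0):
    """Return (lower, upper) thresholds k standard deviations from the mean."""
    return mean - k * sd, mean + k * sd


# Rounded summary statistics for cost, taken from the tables above.
lower, upper = z_outlier_bounds(mean=165, sd=65)
print(lower, upper)
```

Because the lower threshold is negative, no cost of zero or more can be flagged as an outlier from below under this criterion.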
Mean_Price | Median_Price | SDev_Price |
---|---|---|
85659 | 72000 | 49650 |
a. Calculate the mean and median for the variable price. Discuss the difference in values between the two and how it relates to the symmetry in the distribution.
The mean and median are approximately $13,700 apart, with the mean above the median. This suggests the distribution is right skewed. There are likely a small number of expensive houses pulling the mean to the right.
b. Report the standard deviation for the variable price. How would you characterize the variation in prices across houses? That is, is there a lot of variation or a little?
The standard deviation of $49,650 indicates that there is a lot of variation in prices across houses. This makes sense because houses are typically very different from one another.
c. Calculate the coefficient of variation for the variable price. Interpret this value in terms of the level of variation for house prices.
Coefficient of Variation = 0.58. This supports our previous answer that there is a lot of variation in price. A C.V. of 58% means that the standard deviation is 58% of the mean, which indicates a lot of variation in price.
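The coefficient of variation is the standard deviation divided by the mean. A minimal sketch using the rounded summary values from the price table (the variable names are illustrative):

```python
# Rounded summary statistics for price, taken from the table above.
mean_price = 85659
sd_price = 49650

# Coefficient of variation: standard deviation relative to the mean.
cv = sd_price / mean_price
print(round(cv, 2))
```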
d. What is the z-score for a house that is priced at $125,000?
z = 0.792
e. What is the z-score for a house that is priced at $60,000?
z = -0.517
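The two z-scores above follow from z = (x - mean) / SD. A short sketch, assuming the rounded mean and standard deviation from the price table (the helper name `z_score` is illustrative):

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


# Rounded summary statistics for price, taken from the table above.
MEAN_PRICE = 85659
SD_PRICE = 49650

z_125k = z_score(125_000, MEAN_PRICE, SD_PRICE)
z_60k = z_score(60_000, MEAN_PRICE, SD_PRICE)
print(round(z_125k, 3), round(z_60k, 3))
```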
f. What is the z-score for the minimum priced house? Using the z-score criteria, would this be considered an outlier? Explain.
Zmin = -1.705. This would not be considered an outlier because it is within three standard deviations of the mean.
g. Using the z-score criteria, calculate the upper boundary above the mean price that would constitute an outlier. Use this value to determine if there are any outliers for the variable price.
Upper Bound = 234609.946, Max Price = 271000. Since the maximum house price is greater than our upper bound outlier criteria, there are outliers above the mean in the data.
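The upper bound is the mean plus three standard deviations. A sketch of the check, assuming the rounded summary values from the price table and the maximum price quoted above (the result differs from the reported bound only by rounding):

```python
# Rounded summary statistics and the reported maximum price.
MEAN_PRICE = 85659
SD_PRICE = 49650
MAX_PRICE = 271000

# z-score criterion: anything more than 3 SDs above the mean is an outlier.
upper_bound = MEAN_PRICE + 3 * SD_PRICE
has_upper_outlier = MAX_PRICE > upper_bound
print(upper_bound, has_upper_outlier)
```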
h. Assuming house prices are fairly symmetric, approximately what percentage of prices would lie within ± two standard deviations from the mean?
Approximately 95% using the empirical rule.
Use the file SO2Auction to analyze the distribution of permit prices in the Sulfur Dioxide auction market.
Min | First_Quartile | Second_Quartile | Third_Quartile | Max |
---|---|---|---|---|
127 | 131 | 136 | 144 | 250 |
The distances between the minimum, first quartile, second quartile, and third quartile are fairly consistent, suggesting that nearly three-fourths of our data lie relatively close together. The difference between the max and the third quartile is substantially larger, suggesting that there is some right skew in the distribution. There are a few high bids pulling the average to the right.
b. Calculate the interquartile range for the Price Per Permit variable. Interpret this value in relation to the amount of variability in permit prices.
IQR = 12.159. The IQR represents the range for the middle 50% of the data. In this case the range is only about $12, suggesting a small amount of variation between the first and third quartiles.
c. Using the quartile criteria for determining outliers, what upper threshold level of permit prices would be considered an outlier? Are there values that would be considered outliers from above?
Upper Outlier = 161.739, Max = 250. Since the max auction bid is greater than the upper outlier condition, there are outliers above the median.
d. Using the quartile criteria for determining outliers, what lower threshold level of permit prices would be considered an outlier? Are there values that would be considered an outlier from below?
Lower Outlier = 113.102, Min = 127.296. Since the minimum auction bid is not lower than the lower outlier condition, there are no outliers below the median.
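The quartile criterion places the thresholds 1.5 × IQR below the first quartile and above the third quartile. The sketch below applies it to the rounded quartiles from the five-number summary table (131 and 144), so its thresholds differ somewhat from the reported 113.102 and 161.739, which were presumably computed from unrounded quartiles. The function name `iqr_fences` and the parameter `k` are illustrative.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return (lower, upper) outlier thresholds based on the IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Rounded quartiles, min, and max from the five-number summary table.
lower, upper = iqr_fences(q1=131, q3=144)
outlier_below = 127 < lower
outlier_above = 250 > upper
print(lower, upper, outlier_below, outlier_above)
```

Even with rounded inputs the conclusions match: the maximum exceeds the upper threshold, and the minimum does not fall below the lower one.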
Table 1: Accident Reports by Age
| | Less than 85 years old | At least 85 years old | Total |
---|---|---|---|
Accident | 8% | 24% | 32% |
No Accident | 64% | 4% | 68% |
Total | 72% | 28% | 100% |
8%
28%
32%
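The totals in Table 1 are marginal probabilities, obtained by summing the joint probabilities across a row or down a column. A small sketch of that bookkeeping, with the four joint cells copied from Table 1 (the dictionary layout and the helper name `marginal` are illustrative):

```python
# Joint probabilities from Table 1, keyed by (outcome, age group).
joint = {
    ("Accident", "Less than 85"): 0.08,
    ("Accident", "At least 85"): 0.24,
    ("No Accident", "Less than 85"): 0.64,
    ("No Accident", "At least 85"): 0.04,
}


def marginal(table, position, label):
    """Sum the joint probabilities whose key matches label at position."""
    return sum(p for key, p in table.items() if key[position] == label)


p_accident = marginal(joint, 0, "Accident")
p_at_least_85 = marginal(joint, 1, "At least 85")
print(round(p_accident, 2), round(p_at_least_85, 2))
```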
With the trend line and scatter diagram it is evident that there is a fairly strong positive relationship between the two variables.
Compute the sample covariance between Apple and Netflix stock prices.
COV = 70.18
Compute the correlation coefficient between Apple and Netflix stock prices. Interpret this value in terms of the sign (\(\pm\)) and strength of the relationship. Is this result consistent with your
CORR = 0.257
The correlation coefficient is positive, so the two stock prices tend to move in the same direction, but at 0.257 the relationship is weak to moderate.
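The sample covariance and the correlation coefficient are linked: the correlation is the covariance divided by the product of the two sample standard deviations. The sketch below shows both calculations on a handful of made-up prices. The numbers are hypothetical and are not the Apple and Netflix data, so the output will not reproduce the values reported above.

```python
from math import sqrt


def sample_cov(x, y):
    """Sample covariance with the n - 1 divisor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def sample_corr(x, y):
    """Correlation = covariance / (SD of x * SD of y)."""
    return sample_cov(x, y) / sqrt(sample_cov(x, x) * sample_cov(y, y))


# Hypothetical prices for illustration only.
stock_a = [150.0, 152.0, 149.0, 155.0, 158.0]
stock_b = [400.0, 410.0, 395.0, 405.0, 420.0]

print(sample_cov(stock_a, stock_b), sample_corr(stock_a, stock_b))
```

Unlike the covariance, the correlation has no units and always lies between -1 and 1, which is why it is the measure used to judge the strength of the relationship.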