- Use the file HotelRate to analyze nightly hotel rates in the
U.S.
| Mean_Rate | Median_Rate | Quart_1 | Quart_3 |
|---|---|---|---|
| 159 | 161 | 138 | 168 |

The mean ($159) and median ($161) hotel rates are close to each other,
indicating that the distribution is roughly symmetric. We should not see
much skew on either side of the distribution.
- Use the file TaxPrep to answer the following questions about the
cost of tax preparation from a sample of tax returns in the U.S.
| Mean_Cost | Median_Cost | First_Quart | Third_Quart | P_90 |
|---|---|---|---|---|
| 165 | 152 | 115 | 184 | 237 |

| SD | Upper | Lower |
|---|---|---|
| 65 | 360 | -30.4 |

- Compute the mean, median, and mode. From this information do you
believe the distribution to be symmetrical or skewed?
The mean ($165) is somewhat larger than the median ($152), which
suggests a small amount of right skew.
- Compute the first and third quartiles.
- Compute and interpret the 90th Percentile.
- Compute the sample standard deviation and interpret the
results.
A sample standard deviation of $65, relative to a mean cost of $165,
suggests a small to moderate amount of variation.
- Create a histogram for the variable cost. Describe the shape of the
distribution. Is this consistent with your response from above using the
mean and median? Explain this result.
The histogram suggests there is some right skew in the distribution.
Most observations fall between $0 and $200, with a few observations in
the right tail. This is consistent with the mean being larger than the
median.
- Determine the value for cost that would be considered an outlier
both above and below the mean.
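The outlier cutoffs for this question can be sketched with the three-standard-deviation rule. The snippet below is a minimal illustration that assumes the rounded mean and standard deviation from the summary table; the function name `three_sd_bounds` is hypothetical and not part of the original exercise.

```python
def three_sd_bounds(mean, sd, k=3.0):
    """Return (lower, upper) cutoffs located k standard deviations from the mean."""
    return mean - k * sd, mean + k * sd


# Rounded summary values for tax preparation cost (assumed inputs).
cost_lower, cost_upper = three_sd_bounds(mean=165, sd=65)
print(f"lower = {cost_lower:.1f}, upper = {cost_upper:.1f}")
```

With rounded inputs the lower cutoff comes out at -30.0 rather than the reported -30.4, which suggests the reported value was computed from unrounded statistics. Because a cost cannot be negative, only the upper cutoff is useful in practice.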
- Use the data file J-Ville-House to answer the following questions.
The data contains a sample from houses in the Jacksonville area.
| Mean_Price | Median_Price | SDev_Price |
|---|---|---|
| 85659 | 72000 | 49650 |
- Calculate the mean and median for the variable price. Discuss the
difference in values between the two and how it relates to the symmetry
in the distribution.
The mean ($85,659) and median ($72,000) are approximately $13,700
apart, with the mean above the median. This suggests the distribution is
right skewed. There are likely a small number of expensive houses that
are pulling the mean to the right.
- Report the standard deviation for the variable price. How would you
characterize the variation in prices across houses? That is, is there a
lot of variation or a little?
The standard deviation of $49,650 indicates that there is a lot of
variation in prices across houses. This makes sense because houses are
typically very different from one another.
- Calculate the coefficient of variation for the variable price.
Interpret this value in terms of the level of variation for house
prices.
Coefficient of Variation = 0.58. This supports our previous answer
that there is a lot of variation in price. A C.V. of 58% means that the
standard deviation is 58% of the mean, which indicates a lot of
variation in price.
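As a quick check of the coefficient of variation, here is a minimal sketch that divides the reported standard deviation by the reported mean. The function name is illustrative, and the inputs are the rounded values from the summary table rather than the raw data.

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation expressed as a fraction of the mean."""
    return sd / mean


# Rounded summary values for house price (assumed inputs).
cv = coefficient_of_variation(sd=49650, mean=85659)
print(f"CV = {cv:.2f}")
```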
- What is the z-score for a house that is priced at $125,000?
z = 0.792
- What is the z-score for a house that is priced at $60,000?
z = -0.517
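The two z-scores above follow from z = (x - mean) / sd. The sketch below assumes the rounded mean and standard deviation from the summary table; the constant and function names are illustrative only.

```python
MEAN_PRICE = 85659  # rounded sample mean from the summary table
SD_PRICE = 49650    # rounded sample standard deviation from the summary table


def z_score(x, mean=MEAN_PRICE, sd=SD_PRICE):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


z_high = z_score(125000)
z_low = z_score(60000)
print(f"z(125000) = {z_high:.3f}")
print(f"z(60000)  = {z_low:.3f}")
```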
- What is the z-score for the minimum priced house? Using the z-score
criteria, would this be considered an outlier? Explain.
Zmin = -1.705. This would not be considered an outlier because it lies
within three standard deviations of the mean.
- Using the z-score criteria, calculate the upper boundary above the
mean price that would constitute an outlier. Use this value to determine
if there are any outliers for the variable price.
Upper Bound = 234609.946, Max Price = 271000. Since the maximum house
price is greater than our upper-bound outlier criterion, there is at
least one outlier above the mean in the data.
- Assuming house prices are fairly symmetric, approximately what
percentage of prices would lie within ± two standard deviations from the
mean?
Approximately 95% using the empirical rule.
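The 95% figure comes from the empirical rule, which is based on the normal distribution. As a hedged illustration, the sketch below computes the exact share of a normal distribution within k standard deviations of the mean; it assumes prices are approximately normal, which the exercise only asks us to suppose.

```python
import math


def prob_within_k_sd(k):
    """P(|Z| < k) for a standard normal variable Z."""
    return math.erf(k / math.sqrt(2))


p_two_sd = prob_within_k_sd(2)
print(f"Share within 2 standard deviations: {p_two_sd:.4f}")
```

The exact normal value is slightly above 95%, which is why the empirical rule is stated as an approximation.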
- Use the file SO2Auction to analyze the distribution of permit prices
in the Sulfur Dioxide auction market.
- For the variable Price Per Permit, report the Five Number Summary.
From this information, discuss the shape of the distribution for permit
prices.
| Min | First_Quartile | Second_Quartile | Third_Quartile | Max |
|---|---|---|---|---|
| 127 | 131 | 136 | 144 | 250 |
The distances between the minimum value, the first quartile, the
second quartile, and the third quartile are fairly consistent,
suggesting that nearly three-fourths of our data lie relatively close
together. The difference between the max and the third quartile is
substantially larger, suggesting that there is some right skew in the
distribution. There are a few high bids pulling the average to the
right.
- Calculate the interquartile range for the Price Per Permit variable.
Interpret this value in relation to the amount of variability in permit
prices.
IQR = 12.159. The IQR represents the range for the middle 50% of the
data. In this case the range is only about $12, suggesting a small
amount of variation between the first and third quartiles.
- Using the quartile criteria for determining outliers, what upper
threshold level of permit prices would be considered an outlier? Are
there values that would be considered outliers from above?
Upper Outlier = 161.739, Max = 250. Since the max auction bid is
greater than the upper outlier threshold, there is at least one outlier
on the high side.
- Using the quartile criteria for determining outliers, what lower
threshold level of permit prices would be considered an outlier? Are
there values that would be considered an outlier from below?
Lower Outlier = 113.102, Min = 127.296. Since the minimum auction bid
is not lower than the lower outlier threshold, there are no outliers on
the low side.
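The quartile criterion used in the last two answers places the thresholds 1.5 × IQR beyond the first and third quartiles. The following is a minimal sketch that assumes the rounded quartiles from the five number summary; the function name `iqr_fences` is hypothetical.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return (lower, upper) outlier thresholds at k * IQR beyond the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Rounded quartiles, min, and max from the five number summary (assumed inputs).
lower_fence, upper_fence = iqr_fences(q1=131, q3=144)
has_high_outlier = 250 > upper_fence
has_low_outlier = 127 < lower_fence
print(f"fences = ({lower_fence}, {upper_fence})")
print(f"high outlier: {has_high_outlier}, low outlier: {has_low_outlier}")
```

Because the inputs are rounded, the fences (111.5 and 163.5) differ slightly from the reported 113.102 and 161.739, but the conclusions are the same: the maximum is an outlier and the minimum is not.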
- The following table categorizes car accidents by age. Specifically,
it is tabulating the percentage of accidents on record from a sample of
insurance customers who are classified as above 85 or below 85. Use the
table to answer the following questions.
Table 1: Accident Reports by Age

| | Below 85 | 85 or Above | Total |
|---|---|---|---|
| Accident | 8% | 24% | 32% |
| No Accident | 64% | 4% | 68% |
| Total | 72% | 28% | 100% |
- What is the joint probability that someone is less than 85 years old
and is also in an accident?
8%
- What percentage of people in the sample are at least 85 years
old?
28%
- What percentage of people in the sample were in an accident?
32%
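The three answers above read one joint probability and two marginal probabilities off the table. As an illustrative sketch, the table can be stored as a dictionary of joint probabilities and the marginals recovered by summing over the other variable. The labels and the helper `marginal` are assumptions made for this example.

```python
# Joint probabilities from Table 1, keyed by (accident status, age group).
joint = {
    ("accident", "under_85"): 0.08,
    ("accident", "85_plus"): 0.24,
    ("no_accident", "under_85"): 0.64,
    ("no_accident", "85_plus"): 0.04,
}


def marginal(table, position, label):
    """Sum the joint probabilities whose key matches label at the given position."""
    return sum(p for key, p in table.items() if key[position] == label)


p_joint = joint[("accident", "under_85")]
p_85_plus = marginal(joint, 1, "85_plus")
p_accident = marginal(joint, 0, "accident")
print(f"P(accident and under 85) = {p_joint:.0%}")
print(f"P(85 or older)           = {p_85_plus:.0%}")
print(f"P(accident)              = {p_accident:.0%}")
```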
- Use the file StockPrice to examine the relationship between the
price of Apple stock and the price of Netflix Stock. The data contains a
sample of 10 observations for closing price for each stock over the last
year.

- Create a scatter diagram for Apple and Netflix stock prices. Place
Apple on the x-axis and Netflix on the y-axis. Add a trend line to the
scatter plot. Visually interpret the sign (\(\pm\)) and the strength of the
relationship.
With the trend line and scatter diagram it is evident that there is a
fairly strong positive relationship between the two variables.
- Compute the sample covariance between Apple and Netflix stock
prices.
COV = 70.18
- Compute the correlation coefficient between Apple and Netflix stock
prices. Interpret this value in terms of the sign (\(\pm\)) and strength
of the relationship. Is this result consistent with your response from
above?
CORR = 0.257
The correlation coefficient is positive, so the two stock prices tend
to move together, but a value of 0.257 indicates that the linear
relationship is only weak to moderate.
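The covariance and correlation reported above use the usual sample formulas, with the correlation equal to the covariance divided by the product of the two sample standard deviations. The sketch below shows those formulas in plain Python. The price lists are hypothetical placeholders, not the StockPrice data, so the printed values will not match the reported COV and CORR.

```python
import math


def sample_covariance(x, y):
    """Sample covariance with the n - 1 divisor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def sample_correlation(x, y):
    """Covariance divided by the product of the two sample standard deviations."""
    sd_x = math.sqrt(sample_covariance(x, x))
    sd_y = math.sqrt(sample_covariance(y, y))
    return sample_covariance(x, y) / (sd_x * sd_y)


# Hypothetical closing prices for illustration only (NOT the StockPrice file).
apple = [150.0, 152.5, 149.0, 155.0, 158.5]
netflix = [340.0, 352.0, 338.0, 349.0, 361.0]

print(f"COV  = {sample_covariance(apple, netflix):.2f}")
print(f"CORR = {sample_correlation(apple, netflix):.3f}")
```

Unlike the covariance, the correlation is unit-free and always lies between -1 and 1, which is why it is the better measure for judging the strength of the relationship.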