Name:

1. Major League Baseball has introduced new rules to speed up baseball games. The following data set comes from a sample of 50 games played during the 2017 and 2018 seasons. Game times are reported in hours. Use the worksheet Baseball Time to answer the following questions.

a. Specify the null and alternative hypotheses that would indicate that there was a reduction in game times between the 2017 and 2018 seasons.

\[H_{0}: \mu_{2017} = \mu_{2018} \\ H_{a}: \mu_{2017} > \mu_{2018} \]

b. Historical data indicate a population standard deviation of 0.5 hours is a reasonable assumption for both years. Conduct the hypothesis test and report the p-value. At α = .05, what is the conclusion of your test? Report the estimated difference in game times along with the p-value.

The p-value for the upper-tail test is 0.039, which is less than the specified significance level of α = 0.05, so we reject the null hypothesis in favor of the alternative. There is enough statistical evidence to conclude that average baseball game times were reduced from 2017 to 2018.

  1. Approximately how many minutes were baseball game times reduced from 2017 to 2018? From these results, will baseball game times continue to be an issue in future years? Explain.

The average game time was reduced by about 10.5 minutes. This might not be enough to make a substantial difference in terms of increasing viewership.
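The upper-tail test above can be sketched as a two-sample z-test with known population standard deviations. This is only an illustration under stated assumptions: the sample size of 50 games per season is one reading of the problem statement, the difference is the rounded 10.5 minutes from the answer above, and the function name `upper_tail_z_test` is hypothetical. Because of that rounding, the result lands near 0.04 rather than exactly on the reported 0.039.

```python
import math

def upper_tail_z_test(diff, sigma1, sigma2, n1, n2):
    """Two-sample z-test (known sigmas) for H_a: mu_1 > mu_2.

    diff is the sample mean of group 1 minus the sample mean of group 2.
    Returns the z statistic and the upper-tail p-value.
    """
    standard_error = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z = diff / standard_error
    # Upper-tail area of the standard normal: P(Z > z)
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value

# Assumptions (not confirmed by the worksheet): 50 games per season,
# and a difference of about 10.5 minutes converted to hours.
diff_hours = 10.5 / 60
z_stat, p_value = upper_tail_z_test(diff_hours, 0.5, 0.5, 50, 50)
print(f"z = {z_stat:.2f}, upper-tail p-value = {p_value:.3f}")
```

With the unrounded sample means from the worksheet, the same calculation should reproduce the reported p-value more closely.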

  1. Two of the larger distribution centers for Amazon are located in Texas and Ohio. The worksheet Distribution Center provides data on the weight of goods shipped per day (in tons) from these two centers. Data are measured in thousands of tons.
  1. Report the sample mean, sample standard deviations for each distribution center, and the estimated difference between the two population means.
## Texas Sample Mean  Ohio Sample Mean 
##               9.8               4.9
## Texas St. Dev  Ohio St. Dev 
##           2.6           1.5
## Est. Diff 
##       4.9
  1. At \(\alpha = 0.05\) formulate a hypothesis test that the weight of goods between the two distribution centers is different. Specify the null and alternative hypothesis, report the test statistic and the p-value, and the result of the test. Is it clear that one distribution center moves more goods than the other? Explain.

\[ H_{0}: \mu_{Texas} = \mu_{Ohio} \\ H_{a}: \mu_{Texas} \neq \mu_{Ohio} \]

The p-value for the two-tail test is approximately 0, so we reject the null in favor of the alternative: there is a statistically significant difference between the two distribution centers. Because the Texas sample mean (9.8) exceeds the Ohio sample mean (4.9), there is enough information to claim that Texas ships more goods than Ohio.
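The question also asks for a test statistic. A minimal sketch of how one could be computed from the summary statistics is shown here, using Welch's unequal-variance t statistic. The worksheet's sample sizes are not reported in this answer, so `N_PER_CENTER = 30` is a purely hypothetical placeholder, and the resulting numbers are illustrative only, not the worksheet's actual test statistic. The function name `welch_t` is likewise an assumption.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's two-sample t statistic and approximate degrees of freedom."""
    v1 = sd1**2 / n1  # estimated variance of the first sample mean
    v2 = sd2**2 / n2  # estimated variance of the second sample mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

# HYPOTHETICAL sample size: replace with the row counts from the worksheet.
N_PER_CENTER = 30

t_stat, df = welch_t(9.8, 2.6, N_PER_CENTER, 4.9, 1.5, N_PER_CENTER)
print(f"t = {t_stat:.2f}, df = {df:.1f}")
```

A large t statistic on this many degrees of freedom would correspond to a two-tail p-value near 0, which is in line with the conclusion above.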

  1. The quality of a school district can have a significant effect on housing prices in the surrounding area. The worksheet School District contains a random sample of housing prices in each of three school districts. Use this information to answer the following questions. House prices are measured in $1,000’s.
  1. Specify the null and alternative hypotheses to determine if at least one average housing price in one district is different.

\[ H_{0}: \mu_{1} = \mu_{2} = \mu_{3}\\ H_{a}: \text{At least one pop. mean is different} \]

  1. Use \(\alpha = 0.05\) and ANOVA to test whether at least one population mean house price is different.

The p-value for the ANOVA test is 0.515, which is greater than the specified significance level of 0.05, so we fail to reject the null hypothesis. There is not enough evidence to suggest that average housing prices are different across the three school districts.
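For reference, the F statistic behind a one-way ANOVA can be sketched as follows. The house prices in the example are made-up placeholders, not values from the School District worksheet, and the function name `one_way_anova_f` is hypothetical. The sketch stops at the F statistic; converting it to a p-value needs the F distribution (for example Excel's `F.DIST.RT`).

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    k = len(groups)            # number of groups
    n_total = len(all_values)  # total number of observations

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within, k - 1, n_total - k

# HYPOTHETICAL prices in $1,000s, for illustration only.
district_1 = [200, 220, 240]
district_2 = [210, 230, 250]
district_3 = [205, 225, 245]

f_stat, df_between, df_within = one_way_anova_f([district_1, district_2, district_3])
print(f"F = {f_stat:.4f} on ({df_between}, {df_within}) degrees of freedom")
```

A small F statistic means the group means differ little relative to the spread within each group, which is the situation a large p-value such as 0.515 reflects.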

  1. The worksheet Wages contains a random sample of 30 workers’ wages and education levels from 1980. Use this data to answer the following questions.
  1. Given the two variables, which would be considered the independent variable and which would be considered the dependent variable?

Education would be considered the independent variable and wages would be considered the dependent variable.

  1. Use Excel to estimate the linear regression equation. Interpret the slope and intercept coefficients.

The intercept - \(\hat{\beta_{0}} = -2.91\). For a person with no years of education, the predicted wage would be -2.91 per hour. This is an unrealistic estimate since we wouldn’t expect anyone to work for a negative wage. It’s more likely that we don’t have any observations of people with 0 years of education to provide a good estimate of the intercept.

The coefficient estimate - \(\hat{\beta_{1}} = 0.72\). The predicted change in wages for a one-year increase in education is 0.72: for every additional year of education, a person’s wage is predicted to rise by 0.72 per hour.
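The interpretation of the two coefficients can be checked with a short sketch of the fitted line. The coefficients come from the answer above; the function name `predicted_wage` and the example education levels are illustrative assumptions.

```python
INTERCEPT = -2.91  # estimated intercept from the regression output
SLOPE = 0.72       # estimated slope on years of education

def predicted_wage(years_of_education):
    """Predicted hourly wage from the fitted line."""
    return INTERCEPT + SLOPE * years_of_education

# One extra year of education changes the prediction by the slope.
one_year_effect = predicted_wage(13) - predicted_wage(12)

# Education level at which the predicted wage crosses zero.
break_even_years = -INTERCEPT / SLOPE

print(f"Predicted wage at 12 years: {predicted_wage(12):.2f}")
print(f"Effect of one more year:    {one_year_effect:.2f}")
print(f"Prediction turns positive after about {break_even_years:.1f} years")
```

The predicted wage is negative for anyone with fewer than roughly four years of education, which illustrates why the intercept should not be read literally outside the range of the observed data.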

  1. Perform a hypothesis test on the coefficient to determine if it has an effect different from zero. Report the p-values for the test. Let \(\alpha = 0.05\).

\[H_{0}: \beta_{1} = 0\\ H_{a}: \beta_{1} \neq 0 \]

The p-value for the two-tail test to determine if the effect is different from 0 is 0.006. This is less than the specified level of significance. This suggests that the effect is statistically different from 0.

  1. Report the adjusted \(R^{2}\) and interpret this value.

The adjusted \(R^{2}\) is 0.21. After adjusting for the number of predictors in the model, about 21% of the variation in wages can be explained by education.
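The relationship between the adjusted and unadjusted \(R^{2}\) can be sketched with the standard adjustment formula, using the sample size of 30 workers and the single predictor (education) from the problem. The function names are hypothetical, and the unadjusted value is backed out from the rounded 0.21, so it is approximate.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for n observations and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

def r2_from_adjusted(adj_r2, n, k):
    """Invert the adjustment to recover the unadjusted R^2."""
    return 1 - (1 - adj_r2) * (n - k - 1) / (n - 1)

N_WORKERS = 30    # sample size stated in the problem
N_PREDICTORS = 1  # education is the only independent variable

implied_r2 = r2_from_adjusted(0.21, N_WORKERS, N_PREDICTORS)
print(f"Implied unadjusted R^2: {implied_r2:.3f}")
```

With only one predictor and 30 observations the adjustment is small, so the adjusted and unadjusted values tell essentially the same story.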