Identify hypotheses, Part II. Write the null and alternative hypotheses in words and using symbols for each of the following situations.
\(H_O:\) There is no change in the average calorie intake for diners.
\(H_A:\) There is a difference in the average calorie intake for diners.
\(H_O: \mu = 1100\)
\(H_A: \mu \ne 1100\)
\(H_O:\) There is no change in the average VRS.
\(H_A:\) There is an increase in the average VRS.
\(H_O: \mu = 462\)
\(H_A: \mu > 462\)
Age at first marriage, Part II. Exercise 4.14 presents the results of a 2006 - 2010 survey showing that the average age of women at first marriage is 23.44. Suppose a researcher believes that this value has increased in 2012, but he would also be interested if he found a decrease. Below is how he set up his hypotheses. Indicate any errors you see.
\(H_O: \bar x = 23.44\) years old
\(H_A: \bar x > 23.44\) years old
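The hypotheses should be stated about the population mean \(\mu\), not the sample mean \(\bar x\), and the alternative should be two-sided (\(\ne\)) because a decrease would also be of interest.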
Thanksgiving spending, Part II. Exercise 4.12 provides a 95% confidence interval for the average spending by American adults during the six-day period after Thanksgiving 2009: ($80.31, $89.11).
His claim is not plausible. Our interval estimate is ($80.31, $89.11), and his estimate is well above the upper bound.
No. Lowering the confidence level makes the interval narrower, so the value of $100 would still fall outside it.
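The narrowing can be checked by rescaling the 95% interval to a 90% level. This is a minimal sketch that assumes the interval was built with the normal model (point estimate ± z* × SE); the variable names are illustrative and not from the exercise.

```r
# Recover the point estimate and standard error from the 95% interval,
# ASSUMING it was built as: estimate +/- z* x SE (normal model).
lower95 <- 80.31
upper95 <- 89.11
estimate <- (lower95 + upper95) / 2            # midpoint of the interval
se <- (upper95 - lower95) / 2 / qnorm(0.975)   # margin of error / z*

# Rebuild the interval at the 90% confidence level.
ci90 <- estimate + c(-1, 1) * qnorm(0.95) * se
ci90

# TRUE when $100 is above the upper bound of the 90% interval.
100 > ci90[2]
```

Because the 90% interval sits inside the 95% interval, any value excluded at 95% is also excluded at 90%.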
Gifted children, Part I. Researchers investigating characteristics of gifted children collected data from schools in a large city on a random sample of thirty-six children who were identified as gifted children soon after they reached the age of four. The following histogram shows the distribution of the ages (in months) at which these children first counted to 10 successfully. Also provided are some sample statistics.
n 36
min 21
mean 30.69
sd 4.31
max 39
Are conditions for inference satisfied?
Yes. The sample is random, the sample size is larger than 30 (n = 36), and the histogram is roughly symmetric.
Suppose you read on a parenting website that children first count to 10 successfully when they are 32 months old, on average. Perform a hypothesis test to evaluate if these data provide convincing evidence that the average age at which gifted children first count to 10 successfully is less than the general average of 32 months. Use a significance level of 0.10. (\(\alpha = 0.10\))
\(H_O:\) The mean age at which gifted children first count to 10 is 32 months.
\(H_A:\) The mean age at which gifted children first count to 10 is less than 32 months.
\(H_O: \mu = 32\)
\(H_A: \mu < 32\)
Decision rule: Reject the null if the p-value is less than 0.10.
Test Statistic: Calculate Z
(30.69-32)/(4.31/sqrt(36))
## [1] -1.823666
Calculate the p-value:
pnorm(-1.823666,0,1)
## [1] 0.03410129
Decision: Reject the null, since the p-value of 0.034 is less than 0.10.
There is significant evidence that gifted children first count to 10 at an earlier average age than 32 months.
30.69-1.64*4.31/sqrt(36)
## [1] 29.51193
30.69+1.64*4.31/sqrt(36)
## [1] 31.86807
Our 90% CI is (29.51, 31.87).
Yes, the results agree. The value of 32 months lies outside our CI, so it is not a plausible value for the mean age at which gifted children first count to 10.
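The interval can also be computed with `qnorm()` instead of the rounded critical value 1.64. This is a minimal sketch; `z_interval` is a hypothetical helper name and is not part of base R.

```r
# Confidence interval for a mean under the normal model.
# z_interval is a hypothetical helper, not a base R function.
z_interval <- function(xbar, s, n, conf_level = 0.90) {
  z_star <- qnorm(1 - (1 - conf_level) / 2)   # critical value for the level
  xbar + c(-1, 1) * z_star * s / sqrt(n)
}

ci <- z_interval(xbar = 30.69, s = 4.31, n = 36)
round(ci, 2)

# TRUE when the hypothesized mean of 32 months lies outside the interval.
32 < ci[1] || 32 > ci[2]
```

Using the unrounded critical value changes the bounds only in the third decimal place, so the conclusion is the same.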
Testing for food safety. A food safety inspector is called upon to investigate a restaurant with a few customer reports of poor sanitation practices. The food safety inspector uses a hypothesis testing framework to evaluate whether regulations are not being met. If he decides the restaurant is in gross violation, its license to serve food will be revoked.
Identify hypotheses, Part I. Write the null and alternative hypotheses in words and then symbols for each of the following situations.
\(H_O:\) New Yorkers sleep 8 hours per night on average.
\(H_A:\) New Yorkers sleep less than 8 hours per night on average.
\(H_O: \mu = 8\) hours sleep per night
\(H_A: \mu < 8\) hours sleep per night
\(H_O:\) Employees spend 15 minutes on non-business activities on average.
\(H_A:\) Employees spend more than 15 minutes on non-business activities on average.
\(H_O: \mu = 15\) minutes
\(H_A: \mu > 15\) minutes
Online communication. A study suggests that the average college student spends 2 hours per week communicating with others online. You believe that this is an underestimate and decide to collect your own sample for a hypothesis test. You randomly sample 60 students from your dorm and find that on average they spent 3.5 hours a week communicating with others online. A friend of yours, who offers to help you with the hypothesis test, comes up with the following set of hypotheses. Indicate any errors you see.
\(H_O: \bar x < 2\) hours
\(H_A: \bar x > 3.5\) hours
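The hypotheses should be about the population mean \(\mu\), not the sample mean \(\bar x\). The null should use \(=\) and the alternative should use \(>\), and both should refer to the claimed value of 2 hours: \(H_O: \mu = 2\), \(H_A: \mu > 2\).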
Waiting at an ER, Part II. Exercise 4.11 provides a 95% confidence interval for the mean waiting time at an emergency room (ER) of (128 minutes, 147 minutes).
That would exceed the upper bound of the confidence interval by 33 minutes, so it would warrant a closer look at whatever data the newspaper is looking at.
That is close to the midpoint of the interval (137.5 minutes), so the claim is reasonable.
It would result in a wider interval, so the Dean’s claim would still be reasonable.
Ball bearings. A manufacturer claims that bearings produced by their machine last 7 hours on average under harsh conditions. A factory worker randomly samples 75 ball bearings, and records their lifespans under harsh conditions. He calculates a sample mean of 6.85 hours, and the standard deviation of the data is 1.25 working hours. The following histogram shows the distribution of the lifespans of the ball bearings in this sample. Conduct a formal hypothesis test of this claim. Make sure to check that relevant conditions are satisfied.
\(H_O:\) Bearings last 7 hours under harsh conditions.
\(H_A:\) Bearings last a different amount of time than 7 hours under harsh conditions.
\(H_O: \mu = 7\)
\(H_A: \mu \ne 7\)
\(\bar x\) = 6.85
s = 1.25
n = 75
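Conditions: the bearings were randomly sampled, so the observations are independent; the distribution is roughly symmetric with no outliers; and the sample size is greater than 30.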
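The test itself can be carried out with the same steps used in the other exercises. This is a minimal sketch; the significance level of 0.05 is an assumption, since the exercise text does not state one.

```r
xbar <- 6.85   # sample mean (hours)
s    <- 1.25   # sample standard deviation (hours)
n    <- 75     # sample size
mu0  <- 7      # claimed mean under the null

se <- s / sqrt(n)                 # standard error of the mean
z  <- (xbar - mu0) / se           # test statistic
p_value <- 2 * pnorm(-abs(z))     # two-sided p-value

alpha <- 0.05  # ASSUMPTION: the exercise does not state a level
c(z = z, p_value = p_value)
p_value < alpha   # FALSE means we fail to reject the null
```

The sample mean is only about one standard error below 7, so the p-value is well above 0.05 and the data do not provide convincing evidence against the manufacturer's claim.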
Waiting at an ER, Part III. The hospital administrator mentioned in Exercise 4.11 randomly selected 64 patients and measured the time (in minutes) between when they checked in to the ER and the time they were first seen by a doctor. The average time is 137.5 minutes and the standard deviation is 39 minutes. He is getting grief from his supervisor on the basis that the wait times in the ER increased greatly from last year’s average of 127 minutes. However, the administrator claims that the increase is probably just due to chance.
Are conditions for inference met? Note any assumptions you must make to proceed.
Using a significance level of alpha = 0.05, is the change in wait times statistically significant? Use a two-sided test since it seems the supervisor had to inspect the data before he suggested an increase occurred.
(137.5-127)/(39/sqrt(64))
## [1] 2.153846
2*(1-pnorm(2.15,0,1))
## [1] 0.03155521
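Reject the null: the p-value of 0.032 is less than 0.05, so the change in wait times is statistically significant.
With a significance level smaller than the p-value (for example, 0.01), we would fail to reject the null.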
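The two decisions can be compared side by side. This is a minimal sketch that uses the unrounded test statistic; the stricter level of 0.01 is an illustrative assumption.

```r
xbar <- 137.5  # sample mean wait time (minutes)
s    <- 39     # sample standard deviation (minutes)
n    <- 64     # sample size
mu0  <- 127    # last year's average (minutes)

z <- (xbar - mu0) / (s / sqrt(n))
p_value <- 2 * (1 - pnorm(abs(z)))   # two-sided p-value

# 0.05 comes from the exercise; 0.01 is an ASSUMED stricter level.
alphas <- c(0.05, 0.01)
data.frame(alpha = alphas, reject_null = p_value < alphas)
```

The decision depends on the level chosen before looking at the data: the same p-value leads to rejecting the null at 0.05 but not at 0.01.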
Testing for Fibromyalgia. A patient named Diana was diagnosed with Fibromyalgia, a long-term syndrome of body pain, and was prescribed anti-depressants. Being the skeptic that she is, Diana didn’t initially believe that anti-depressants would help her symptoms. However after a couple months of being on the medication she decides that the anti-depressants are working, because she feels like her symptoms are in fact getting better.
Null: Anti-depressants have no effect on fibromyalgia symptoms.
Alternate: Taking anti-depressants results in a decrease of fibromyalgia symptoms.
A Type 1 error would be to reject the null when it is true: concluding that anti-depressants reduce fibromyalgia symptoms when they actually have no effect.
A Type 2 error would be to fail to reject the null when the alternative is true: concluding that anti-depressants have no effect when they actually reduce fibromyalgia symptoms.
Presuming there are no negative side effects to the antidepressants, a type 1 error would not be harmful.
A type 2 error would lead the patient to refuse drugs that could help her condition.
Ages of pennies, Part I. The histogram below shows the distribution of ages of pennies at a bank.
Most of the pennies are on the newer side, and there are diminishing numbers as the penny age increases.
As n increases, the sampling distributions of the mean look more nearly normal, and I would expect that trend to continue.
Identify distributions, Part I. Four plots are presented below. The plot at the top is a distribution for a population. The mean is 10 and the standard deviation is 3. Also shown below is a distribution of (1) a single random sample of 100 values from this population, (2) a distribution of 100 sample means from random samples with size 5, and (3) a distribution of 100 sample means from random samples with size 25. Determine which plot (A, B, or C) is which and explain your reasoning.
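One way to reason about the three plots is to compare their spreads: sample means vary less than individual values, and their standard deviation shrinks as sd / sqrt(n). This is a minimal simulation sketch. The exercise does not give the shape of the population, so a normal population with mean 10 and standard deviation 3 is assumed purely for illustration, and `draw` is a hypothetical helper.

```r
set.seed(1)  # arbitrary seed so the run is reproducible

# ASSUMPTION: the population shape is not given in the text; a normal
# population with mean 10 and sd 3 is used here only for illustration.
draw <- function(n) rnorm(n, mean = 10, sd = 3)

single_sample <- draw(100)                     # (1) 100 raw values
means_n5  <- replicate(100, mean(draw(5)))     # (2) 100 means, n = 5
means_n25 <- replicate(100, mean(draw(25)))    # (3) 100 means, n = 25

# Observed spreads, followed by the theoretical values sd / sqrt(n).
c(single = sd(single_sample), n5 = sd(means_n5), n25 = sd(means_n25))
c(single = 3, n5 = 3 / sqrt(5), n25 = 3 / sqrt(25))
```

The plot with the widest spread should be the single sample, and the plot with the narrowest spread should be the means from samples of size 25.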
4.43 Spam mail, Part I. The 2004 National Technology Readiness Survey sponsored by the Smith School of Business at the University of Maryland surveyed 418 randomly sampled Americans, asking them how many spam emails they receive per day. The survey was repeated on a new random sample of 499 Americans in 2009. (a) What are the hypotheses for evaluating if the average spam emails per day has changed from 2004 to 2009?
\(H_O: \mu_{2004} - \mu_{2009} = 0\)
\(H_A: \mu_{2004} - \mu_{2009} \ne 0\)
18.5-14.9
## [1] 3.6
Not statistically significant means we failed to reject the null.
Yes