Section 1: True or false (5 points)

Instructions: For each of the following statements, decide whether it is true or false. No justification is required, and no partial credit will be given.

  1. If two events A and B are disjoint, then P(A or B) \(=\) P(A)*P(B)
  2. If two events A and B are independent then P(A and B) \(=\) P(A) + P(B)
  3. A nominal variable is a type of numeric variable
  4. You can use a boxplot to find the median of a variable
  5. The median is a better measure of center when there are large outliers in the data
  6. A probability model is a table containing the disjoint events within a random variable as well as each event’s corresponding probability

Section 2: Free response

Instructions: Carefully read each of the following questions and thoughtfully answer each part of the question using complete sentences and precise notation. Partial credit will be given for responses that clearly justify reasoning and outline procedures.

Problem 1: The distribution of car horsepower (4 points)

The histogram below was created using data collected from 32 different car models. Specifically, the histogram shows the number of cars in each horsepower bin, where each bin is 40 horsepower wide.

  1. Does this distribution appear to exhibit any sort of skew? If so, is it right or left skewed?
  2. In what horsepower range do most cars in this sample fall?
  3. Is the mean or the median larger in this distribution? How do you know?
  4. Would the mean or the median be a better measure of center for this distribution? Justify your answer.

Problem 2: The distribution of horsepower by number of cylinders (4 points)

The boxplots below were constructed using the same data as problem 1. Each boxplot shows the distribution of horsepower by how many cylinders a car has (note that most cars have either 4, 6, or 8 cylinders).

  1. As a car gets more cylinders, in general what happens to the amount of horsepower it can produce?
  2. Write down the 5 number summary for cars that have 8 cylinders.

Problem 3: Diamonds (3 points)

The scatterplot below uses data on 1000 diamonds and plots the price of the diamond vs the number of carats (weight) of the diamond.

  1. Does there seem to be an association between the weight of a diamond in carats and the price of that diamond? If so, is it a positive or negative association?
  2. If there is an association, can we necessarily say that the weight of a diamond causes it to have a high price?
  3. What is a possible confounding variable that might affect the apparent relationship between a diamond’s weight and its price?

Problem 4: Colored marbles (3 points)

In a bag there are 15 marbles: 7 are blue, 5 are red, 2 are green, and 1 is yellow.

  1. If you draw a single marble from the bag what is the probability that it will be green?
  2. Suppose you draw two marbles from the bag with replacement. What is the probability that both of the marbles will be blue?
  3. Suppose you draw two marbles out of the bag without replacement. What is the probability that both of the marbles will be red?
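
As a reminder of how the two sampling schemes differ, here is a minimal Python sketch. It assumes a hypothetical bag of 10 marbles with 4 white ones, not the bag described in this problem, and the variable names are illustrative only.

```python
from fractions import Fraction

# Hypothetical bag (not the bag in this problem): 10 marbles, 4 of them white.
total = 10
white = 4

# With replacement: the bag is unchanged between draws, so the draws are independent.
p_with = Fraction(white, total) * Fraction(white, total)

# Without replacement: after one white marble is removed, 3 white remain out of 9.
p_without = Fraction(white, total) * Fraction(white - 1, total - 1)

print(p_with)     # 4/25
print(p_without)  # 2/15
```

Exact fractions are used so the result can be compared directly with hand arithmetic.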

Problem 5: Doctors, nurses, and probability (3 points)

A hospital has a total of 75 staff members, all of whom are nurses or doctors. 60 of the employees are nurses, 80% of whom are female. The remaining staff are doctors, 40% of whom are male.

  1. If you randomly choose a staff member, what is the probability they will be a doctor?
  2. If you randomly choose a staff member, what is the probability they will be a female?
  3. If you randomly choose a staff member, what is the probability they will be a doctor or a female?

Problem 6: Coins and Dice

Consider the following game. You flip a coin and roll a fair 6-sided die. If you roll a 6 and flip heads, you win $20. If you roll a 1 and flip tails, you lose $40. If you flip heads with any number other than a 6, you win $3. If you flip tails with any number other than a 1, you get $0.

  1. Write down a probability model for this game of chance.
  2. Calculate the expected value of this game. Show your work.
  3. Is it in your best financial interest to play this game? Explain why you do or do not want to play this game using the expected value as your argument.
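
For reference, the general recipe for an expected value (weight each payoff by its probability, then sum) might be sketched in Python as follows. The game here is hypothetical, with made-up payoffs and probabilities, and is not the game described in this problem.

```python
from fractions import Fraction

# Hypothetical game (not the one in this problem): payoff in dollars -> probability.
model = {5: Fraction(1, 4), -1: Fraction(3, 4)}

# A valid probability model lists disjoint outcomes whose probabilities sum to 1.
assert sum(model.values()) == 1

# Expected value: each payoff weighted by its probability, then summed.
expected_value = sum(payoff * p for payoff, p in model.items())
print(expected_value)  # 1/2
```

A positive expected value means the player gains on average over many plays; a negative one means the player loses on average.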

Problem 7: Car accidents (4 points)

The contingency table below shows the number of traffic fatalities in the state of Vermont during 2013, broken down by whether the fatality was a driver or passenger, as well as male or female.

          Drivers   Passengers
Male      1,020     1,400
Female      980     1,319

  1. Extend the contingency table to include the marginal distributions of gender and driver type.
  2. Conditional on a fatality being a female, what is the probability that the fatality was a passenger?
  3. What is the marginal probability that a fatality was a female?
  4. What is the joint probability that a fatality was both a male and a passenger?
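
The three kinds of probability asked about above can be illustrated with a small Python sketch. The counts, the row labels A/B, and the column labels X/Y are all hypothetical and are not taken from the table in this problem.

```python
from fractions import Fraction

# Hypothetical counts (not the table in this problem): rows A/B, columns X/Y.
counts = {("A", "X"): 30, ("A", "Y"): 20, ("B", "X"): 10, ("B", "Y"): 40}
total = sum(counts.values())

# Marginal count for row B: add across its columns.
row_b = sum(n for (row, _), n in counts.items() if row == "B")

p_b = Fraction(row_b, total)                       # marginal P(B)
p_y_given_b = Fraction(counts[("B", "Y")], row_b)  # conditional P(Y | B)
p_a_and_y = Fraction(counts[("A", "Y")], total)    # joint P(A and Y)

print(p_b, p_y_given_b, p_a_and_y)  # 1/2 4/5 1/5
```

Note the denominators: marginal and joint probabilities divide by the grand total, while a conditional probability divides by the total of the row being conditioned on.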

Problem 8: Histograms and probabilities

The barplot below shows 10 outcomes from a single random variable that follows the Poisson distribution. (Note: if you don’t know what the Poisson distribution is, it won’t affect your ability to do this problem.)

barplot(table(rpois(10,2)))

  1. Given the distribution of outcomes, calculate the empirical probability that the outcome of random variable Y is 3.
  2. Given the distribution of outcomes, calculate the empirical probability that the outcome of random variable Y is 0 or 4.
  3. Calculate the expected value of random variable Y.
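
An empirical probability is a relative frequency in the observed sample. The Python sketch below shows the idea on a hypothetical sample of 10 outcomes; these values are invented for illustration and are not the values shown in the barplot.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical sample of 10 outcomes (not the values shown in the barplot).
sample = [0, 1, 1, 2, 2, 2, 3, 3, 4, 5]
freq = Counter(sample)
n = len(sample)

# Empirical probability: relative frequency of each outcome in the sample.
p = {value: Fraction(count, n) for value, count in freq.items()}

p_3 = p[3]              # P(Y = 3)
p_0_or_4 = p[0] + p[4]  # disjoint outcomes, so the probabilities add
mean = sum(value * prob for value, prob in p.items())

print(p_3, p_0_or_4, mean)  # 1/5 1/5 23/10
```

The expected value computed from empirical probabilities equals the ordinary sample mean of the 10 outcomes.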