*Submit your homework to Canvas by the due date and time. Email your instructor if you have extenuating circumstances and need to request an extension.
*If an exercise asks you to use R, include a copy of the code and output. Please edit your code and output to be only the relevant portions.
*If a problem does not specify how to compute the answer, you may use any appropriate method. I may ask you to use R or manual calculations on your exams, so practice accordingly.
*You must include an explanation and/or intermediate calculations for an exercise to be complete.
*Be sure to submit the HWK3 Autograde Quiz which will give you ~20 of your 40 accuracy points.
*50 points total: 40 points accuracy, and 10 points completion
Question 1 A chemical supply company ships a certain solvent in 10-gallon drums. Let X represent the number of drums ordered by a randomly chosen customer. Assume X has the following probability mass function (pmf). The mean and variance of X are \(\mu_X=2.3\) and \(\sigma^2_X=1.81\).
| X | P(X=x) |
|---|---|
| 1 | 0.4 |
| 2 | 0.2 |
| 3 | 0.2 |
| 4 | 0.1 |
| 5 | 0.1 |
- Calculate \(P(X \le 2)\) and describe what it means in the context of the problem.
0.4+0.2
## [1] 0.6
There is a 60% chance that a randomly chosen customer orders 1 or 2 drums (at most 20 gallons) of the solvent.
- Let Y be the number of gallons ordered, so \(Y=10X\). Determine the probability mass function of Y.
| Y | P(Y=y) |
|---|---|
| 10 | 0.4 |
| 20 | 0.2 |
| 30 | 0.2 |
| 40 | 0.1 |
| 50 | 0.1 |
- Calculate the mean number of gallons ordered \(\mu_Y\).
**Weighted average, or the transformation \(\mu_Y = 10\mu_X\): 10*2.3. (For the standard deviation, the transformation is sqrt(1.81*10^2).)**
(10*0.4)+(20*0.2)+(30*0.2)+(40*0.1)+(50*0.1)
## [1] 23
- Calculate the standard deviation of the number of gallons ordered, \(\sigma_Y\).
sqrt((0.4*(10-23)^2)+(0.2*(20-23)^2)+(0.2*(30-23)^2)+(0.1*(40-23)^2)+(0.1*(50-23)^2))
## [1] 13.45362
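As a cross-check, the same mean and standard deviation should follow from the linear transformation rules \(\mu_Y = 10\mu_X\) and \(\sigma_Y = 10\sigma_X\). A minimal sketch, where the vector names `x` and `p` are introduced here only for illustration:

```r
# pmf of X from the table in Question 1 (x and p are illustrative names)
x <- 1:5
p <- c(0.4, 0.2, 0.2, 0.1, 0.1)

mu_X  <- sum(x * p)               # weighted average of the drum counts
var_X <- sum((x - mu_X)^2 * p)    # variance of X

mu_Y    <- 10 * mu_X              # the mean scales by the constant 10
sigma_Y <- 10 * sqrt(var_X)       # the standard deviation scales by |10|

mu_Y
sigma_Y
```

Both values should agree with the weighted-average calculations done directly on the pmf of Y.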
Exercise 2 Prevention after acute myocardial infarction (AMI) is primarily managed through medications. A large cohort study of post-AMI patients >65 years old (*) found only 74% of patients filled all their discharge prescriptions by 120 days after discharge.
A physician at UW has 4 post-AMI patients >65 yo and would like to use 0.74 as his estimate for \(\pi\), the probability of each of his patients filling all of their discharge prescriptions by 120 days after discharge. Define a random variable F, the count of the physician’s four patients who fill all of their discharge prescriptions by 120 days after discharge. Assume that the prescription-filling behavior is independent between the 4 patients and that \(\pi=0.74\).
- Determine the probability distribution of F (write out the pmf) using probability theory.
dbinom(0,4,0.74)
## [1] 0.00456976
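Since the question asks for the whole pmf, the probabilities for the other values of F can be obtained the same way. A short sketch, assuming the Binomial(4, 0.74) model stated in the problem; `f` and `pmf_F` are names chosen here for illustration:

```r
# All possible values of F and their binomial probabilities
f <- 0:4
pmf_F <- dbinom(f, size = 4, prob = 0.74)

# Display the pmf as a small table, rounded for readability
data.frame(f = f, prob = round(pmf_F, 4))

# A valid pmf must sum to 1
sum(pmf_F)
```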
- Compute the probability that F>0. What does this value mean in the context of the scenario?
sum(dbinom(1:4, 4, 0.74))
## [1] 0.9954302
This means there is about a 99.5% chance that at least one of the physician's four patients fills all of their discharge prescriptions by 120 days after discharge.
- What is the expected value for F, \(\mu_F\)? What does that value mean in the context of the scenario?
0.74*4
## [1] 2.96
On average, about 2.96 of the physician's four patients will fill all of their discharge prescriptions by 120 days after discharge.
- What is the standard deviation for F, \(\sigma_F\)?
sqrt(4*0.74*0.26)
## [1] 0.8772685
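The binomial shortcuts \(\mu_F = n\pi\) and \(\sigma_F = \sqrt{n\pi(1-\pi)}\) can be compared against the definitions of mean and standard deviation applied to the pmf. A minimal sketch, with `n`, `p_fill`, and the `_check` names introduced here for illustration:

```r
n <- 4
p_fill <- 0.74

# Shortcut formulas for a binomial random variable
mu_F    <- n * p_fill
sigma_F <- sqrt(n * p_fill * (1 - p_fill))   # square root of the variance

# Same quantities computed from the pmf directly
f <- 0:n
pmf_F <- dbinom(f, n, p_fill)
mu_check    <- sum(f * pmf_F)
sigma_check <- sqrt(sum((f - mu_check)^2 * pmf_F))

c(mu_F, mu_check)
c(sigma_F, sigma_check)
```

Note that `n * p_fill * (1 - p_fill)` on its own is the variance; the standard deviation is its square root.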
- Explain (briefly) how you can use the following simulation to check your answers for part 2a. Some questions to consider: Why did I define FilledPresc as I did? What values are stored into the CountFilled vector? What does the histogram show?
FilledPresc=c(rep(1,74), rep(0,26))
manytimes=100000
CountFilled=rep(0,manytimes)
set.seed(1)
for (i in 1:manytimes){
  samp=sample(FilledPresc,4, replace=TRUE)
  CountFilled[i]=sum(samp)
}
hist(CountFilled, labels=TRUE, ylim=c(0,.5*manytimes), breaks=seq(-0.5, 4.5, 1))
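This simulation can be used to check part a because the relative frequency of each count in the histogram should be close to the binomial probability for that count. FilledPresc holds 74 ones and 26 zeros, so each value sampled with replacement is a 1 (filled) with probability 0.74. Each entry of CountFilled is the number of 1s among 4 sampled values, that is, one simulated value of F. The histogram shows how often each count from 0 to 4 occurred in the 100,000 repetitions; for example, the bar at 0 should be close to 0.00457*100000, or about 457.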
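The comparison between the simulation and the binomial pmf can also be made numerically rather than by reading the histogram. A sketch under the same setup as the simulation code, using `replicate()` in place of the explicit loop; `sim_freq` and `theory` are illustrative names:

```r
FilledPresc <- c(rep(1, 74), rep(0, 26))   # 74% ones (filled), 26% zeros
manytimes <- 100000
set.seed(1)

# Each repetition: sample 4 patients with replacement and count the fills
CountFilled <- replicate(manytimes, sum(sample(FilledPresc, 4, replace = TRUE)))

# Relative frequency of each count 0..4 in the simulation
sim_freq <- as.numeric(table(factor(CountFilled, levels = 0:4))) / manytimes

# Theoretical Binomial(4, 0.74) probabilities
theory <- dbinom(0:4, 4, 0.74)

round(cbind(count = 0:4, simulated = sim_freq, theoretical = theory), 4)
```

The two columns should be close but not identical, since the simulated values vary from run to run.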
- Suppose this physician now has 20 post-AMI patients >65 years and wants to use a Binomial model (n=20, \(\pi=0.74\)) to describe the count of those 20 patients who will get all discharge prescriptions filled within 120 days. What is the probability that exactly 15 of those 20 patients get all discharge prescriptions filled within 120 days?
dbinom(15,20,0.74)
## [1] 0.2012734
Question 3 For each of the following questions, say whether the random variable is reasonably approximated by a binomial RV or not, and explain your answer. If it is not a binomial process, explain what assumptions are not well met. If it is a binomial process, comment on the validity of each of the things that must be true for a process to be a binomial process (ex: identify \(n\), the number of Bernoulli trials, \(\pi\), the probability of success, etc.).
- A fair die is rolled until a 1 appears, and X denotes the number of rolls.
This is not binomial. The number of trials (rolls) is not fixed in advance, because rolling continues until a 1 appears.
- Twenty of the different Badger basketball players each attempt 1 free throw and X is the total number of successful attempts.
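This is not binomial. The number of trials is fixed (n = 20) and each attempt is a success or a failure, but each player's ability to make a free throw differs, so the probability of success is not the same for every trial.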
- A die is rolled 50 times. Let X be the face that lands up.
This is not binomial. A binomial random variable counts successes over trials that each have only two outcomes (success or failure), but here X is the face that lands up, which has 6 possible values. It would be binomial if X were instead the number of rolls, out of 50, that show one particular face.
- In a bag of 10 batteries, I know 2 are old. Let X be the number of old batteries I choose when taking a sample of 4 to put into my calculator.
No, this is not binomial. The number of trials is fixed (n = 4), but the batteries are drawn without replacement from a small bag of 10, so the draws are not independent and the probability of choosing an old battery changes from draw to draw.
- It is reported that 20% of Madison homeowners have installed a home security system. Let X be the number of homes without home security systems installed in a random sample of 100 houses in the Madison city limits.
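This is binomial with n = 100 trials (houses) and probability of success \(\pi = 0.80\), where a success is a home without a security system. Each house either has a system or does not, and because the sample is random and small relative to the number of homes in Madison, the trials are approximately independent with the same \(\pi\).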
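For the battery question, the exact distribution of X when drawing without replacement is hypergeometric rather than binomial. A sketch comparing the two, assuming 2 old and 8 other batteries with 4 drawn; the binomial column is what would apply only if each battery were put back before the next draw:

```r
x <- 0:2   # possible numbers of old batteries in the sample of 4

# Exact probabilities: 2 old (m), 8 not old (n), 4 drawn (k), no replacement
exact <- dhyper(x, m = 2, n = 8, k = 4)

# Hypothetical comparison: 4 independent draws with P(old) = 2/10 each
with_replacement <- dbinom(x, size = 4, prob = 2/10)

round(cbind(x, exact, with_replacement), 4)
```

The gap between the two columns reflects the dependence between draws from such a small bag.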
Exercise 4: Weights of female cats of a certain breed (A) are well approximated by a normal distribution with mean 4.1 kg and standard deviation of 0.6 kg \(W_A~\sim N(4.1, 0.6^2)\).
- What proportion of female cats of that breed (A) have weights between 3.7 and 4.4 kg?
pnorm(4.4,4.1,0.6)- pnorm(3.7,4.1,0.6)
## [1] 0.4389699
- A female cat of that breed (A) has a weight that is 0.5 standard deviations above the mean. What proportion of female cats of that breed (A) are heavier than this one?
1 - pnorm(4.4, 4.1, 0.6)
## [1] 0.3085375
1 - pnorm(0.5)
## [1] 0.3085375
- How heavy is a female cat of this breed whose weight is at the 80th percentile?
qnorm(0.8,4.1,0.6)
## [1] 4.604973
- What is the IQR of weights for female cats of this breed using the normal distribution approximation?
qnorm(0.75,4.1,.6)-qnorm(0.25,4.1,.6)
## [1] 0.8093877
- Females from another breed of cats (breed B) have weights well approximated by a normal distribution with mean 10.6 lb and standard deviation of 0.9 lb \(W_{B.lb}~\sim N(10.6, 0.9^2)\). Transform the weights of cat breed B into kilograms using the conversion: 1 lb \(\approx\) 0.454 kg. You can use the transformation: \(W_{B}=0.454*W_{B.lb}\). Compare the shape, mean, and standard deviation of the two breeds.
10.6*0.453592
## [1] 4.808075
0.9*0.453592
## [1] 0.4082328
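Both distributions are normal in shape, since multiplying a normal random variable by a constant gives another normal random variable. Breed B has the larger mean weight (about 4.81 kg versus 4.1 kg for breed A), while breed A has the larger standard deviation (0.6 kg versus about 0.41 kg for breed B).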
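One way to see that the converted breed B weights keep their normal shape is to check that a probability computed in pounds matches the same probability computed in kilograms. A minimal sketch; the 11 lb cutoff and the names `lb_to_kg`, `mean_B_kg`, and `sd_B_kg` are illustrative choices, not part of the exercise:

```r
lb_to_kg <- 0.453592            # conversion factor used in the calculation above

mean_B_kg <- 10.6 * lb_to_kg    # mean scales by the constant
sd_B_kg   <- 0.9 * lb_to_kg     # standard deviation scales by the constant

# P(weight <= 11 lb) on the pound scale and on the kilogram scale
p_lb <- pnorm(11, 10.6, 0.9)
p_kg <- pnorm(11 * lb_to_kg, mean_B_kg, sd_B_kg)
c(p_lb, p_kg)

# Proportion of each breed heavier than 4.4 kg, for comparison
1 - pnorm(4.4, 4.1, 0.6)              # breed A
1 - pnorm(4.4, mean_B_kg, sd_B_kg)    # breed B
```

The two probabilities in `c(p_lb, p_kg)` should match, because rescaling changes the units but not the z-score.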