There are 10 questions, and each question (or part of a question) is worth 7.5 points. When you are finished, knit the file to .HTML, save it as Test#1_LastName, and submit the .HTML file to the Test #1 assignment link in Canvas.

Due Date: Wednesday November 27, 2019 by 11:59 p.m. EST.

  1. If you have data that is in case form, and you want to use the xtabs() function to construct a crosstabulation of categorical variables, would there be a variable in front of the ‘~’ sign in the xtabs() function? State Yes or No. Explain your response.

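No. In case form each row is a single observation, so there is no count variable to put in front of the ‘~’ sign; xtabs() counts the rows itself. A variable goes in front of the ‘~’ sign only when the data are in frequency form, in which case it is the frequency variable.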
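The difference between the two forms can be sketched with a small made-up example. The data frames `cases` and `freqs` below are hypothetical and are not part of the test; they only illustrate where the left side of ‘~’ is empty and where it holds the count column.

```r
# Hypothetical case-form data: one row per observation
cases <- data.frame(
  sex    = c("F", "F", "M", "M", "M"),
  smoker = c("no", "yes", "no", "no", "yes")
)

# Case form: nothing to the left of ~, xtabs() counts rows
tab_case <- xtabs(~ sex + smoker, data = cases)

# The same data in frequency form: one row per cell, counts in Freq
freqs <- as.data.frame(tab_case)

# Frequency form: the count column goes to the left of ~
tab_freq <- xtabs(Freq ~ sex + smoker, data = freqs)

# Both routes give the same counts
all(tab_case == tab_freq)
```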

  1. This problem uses the DanishWelfare data frame that you used in Homework #1 (#2.4 on p. 61). The code below uses structable() to create a certain formatted table using the Danish Welfare data frame in the vcd library. Run the code below in the code chunk and examine the output that is produced. In the second code chunk, modify the code (still using structable()) so that marital status (Status) is on the columns instead of the rows.
library(vcd)
## Loading required package: grid
#run code below
data("DanishWelfare",package="vcd")

#creating a crosstabulation of alcohol consumption (Alcohol), location (Urban) and
#marital status(Status)
structable(Alcohol ~ Urban + Status, DanishWelfare)
##                         Alcohol <1 1-2 >2
## Urban         Status                     
## Copenhagen    Widow              4   4  4
##               Married            4   4  4
##               Unmarried          4   4  4
## SubCopenhagen Widow              4   4  4
##               Married            4   4  4
##               Unmarried          4   4  4
## LargeCity     Widow              4   4  4
##               Married            4   4  4
##               Unmarried          4   4  4
## City          Widow              4   4  4
##               Married            4   4  4
##               Unmarried          4   4  4
## Country       Widow              4   4  4
##               Married            4   4  4
##               Unmarried          4   4  4
#insert your modified code below
structable(Status ~ Urban + Alcohol, DanishWelfare)
##                       Status Widow Married Unmarried
## Urban         Alcohol                               
## Copenhagen    <1                 4       4         4
##               1-2                4       4         4
##               >2                 4       4         4
## SubCopenhagen <1                 4       4         4
##               1-2                4       4         4
##               >2                 4       4         4
## LargeCity     <1                 4       4         4
##               1-2                4       4         4
##               >2                 4       4         4
## City          <1                 4       4         4
##               1-2                4       4         4
##               >2                 4       4         4
## Country       <1                 4       4         4
##               1-2                4       4         4
##               >2                 4       4         4
  1. View the DanishWelfare data frame. Is the data frame in case or frequency form?

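Frequency form. Each row is one combination of the factor levels, and the Freq column holds the count for that combination.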

  1. Using your answer from part b, if you were to apply xtabs() to the DanishWelfare data frame, what is the name of the variable that would be to the left of the ‘~’ sign in the xtabs() function? If there should be no variable name to the left of the ‘~’ sign, explain why this is the case.

The variable ‘Freq’ goes to the left of the ‘~’ sign in the xtabs() function, because the data frame is in frequency form and Freq holds the counts.
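Every cell in the structable() output shown earlier is 4, which looks like a count of data-frame rows (one per Income level) rather than a sum of Freq. A possible way to get the actual frequencies, offered as a sketch rather than the intended solution, is to build the table with xtabs() first and pass that table to structable(). The name `dw_tab` is an assumption introduced here.

```r
library(vcd)
data("DanishWelfare", package = "vcd")

# Sum Freq within each Alcohol x Urban x Status cell (collapsing over Income)
dw_tab <- xtabs(Freq ~ Alcohol + Urban + Status, data = DanishWelfare)

# Pass the table, not the data frame, so the cells hold summed frequencies
structable(Alcohol ~ Urban + Status, data = dw_tab)
```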

  1. Describe a binomial experiment that would generate binomial data. That is, explain what the experiment is and make a case for how the experiment meets the three criteria for binomial data. Do not use an example from the book or one discussed in class. Specifically, do not use any coin-flipping or dice examples. Come up with your own example! Think about some process/experiment that you deal with on a daily basis (work or personal). Place your response below.

Reminder: Three criteria for a Binomial experiment (from our class notes): 1. n independent trials (state n and explain why the trials are independent) 2. only one of two outcomes; “success” and “failure” (specify what is a “success” and what is a “failure”) 3. the probability of “success” stays the same from trial to trial (state p and why the probability stays the same from trial to trial)

  1. You are going to be working with a researcher assisting him in designing a survey for 100 subjects. One of the questions is ‘Were you vaccinated for the flu this year?’

Is this a binomial experiment? State Yes or No. If Yes, describe the three criteria that make this experiment Binomial. If No, state why this is not a Binomial experiment.

Yes. The three criteria are:

  1. n independent trials. Here n = 100. The 100 trials are independent because whether one person was vaccinated does not affect whether another person was vaccinated.

  2. Each trial has only two outcomes, ‘Yes’ or ‘No’. ‘Yes’ (the “success”) means the person was vaccinated, and ‘No’ (the “failure”) means the person was not.

  3. The probability of ‘Yes’ stays the same from trial to trial. Assuming the subjects are sampled at random from the same population, each has the same probability p of having been vaccinated. The value of p is not given, but it can be estimated by the proportion of ‘Yes’ answers in the survey.

  1. A student answers 10 quiz questions completely at random; the first five are true/false, the second five are multiple choice, with four options each.

Is this a binomial experiment? State Yes or No. If Yes, describe the three criteria that make this experiment Binomial. If No, state why this is not a Binomial experiment.

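No. Each question still has only two outcomes (correct or incorrect), but the probability of success is not the same from trial to trial: a random guess is correct with probability 1/2 on each of the first five (true/false) questions and with probability 1/4 on each of the second five (four-option multiple choice) questions.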
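One way to see why the quiz score is not binomial is to treat it as the sum of two separate binomials, Bin(5, 1/2) for the true/false part and Bin(5, 1/4) for the multiple-choice part, and compare it with a single binomial that uses the average success probability 3/8. This is an illustrative sketch; all variable names are introduced here.

```r
# Number correct = true/false part + multiple-choice part
p_tf <- dbinom(0:5, size = 5, prob = 1/2)   # true/false: p = 1/2
p_mc <- dbinom(0:5, size = 5, prob = 1/4)   # multiple choice: p = 1/4

# Distribution of the total score (0..10) by convolving the two parts
p_total <- sapply(0:10, function(s) {
  k <- max(0, s - 5):min(5, s)              # feasible true/false scores
  sum(p_tf[k + 1] * p_mc[s - k + 1])
})

# A single binomial with the average p = 3/8 has the same mean...
p_single    <- dbinom(0:10, size = 10, prob = 3/8)
mean_total  <- sum(0:10 * p_total)
mean_single <- sum(0:10 * p_single)

# ...but a different variance, so the score is not Bin(10, 3/8)
var_total  <- sum((0:10 - mean_total)^2 * p_total)
var_single <- sum((0:10 - mean_single)^2 * p_single)
c(mean_total, mean_single, var_total, var_single)
```

The two means agree (5/2 + 5/4 = 3.75), while the variances are 5(1/4) + 5(3/16) = 2.1875 for the quiz and 10(3/8)(5/8) = 2.34375 for the single binomial.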

  1. Use the appropriate R function (must be one we discussed in class) and find the probability of 6 successes from a Bin(10,1/4) distribution.
dbinom(6,10,0.25)
## [1] 0.016222
  1. Use the appropriate R function (must be one we discussed in class) and find the probability of 5 or fewer successes from a Bin(10,1/4) distribution.
pbinom(5,10,0.25)
## [1] 0.9802723
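As a cross-check on the two binomial answers, the same probabilities can be computed from the binomial formula directly. This is a minimal sketch; `n`, `p`, and `by_hand` are names introduced here.

```r
n <- 10
p <- 1/4

# P(X = 6) straight from the binomial formula: C(n, 6) p^6 (1 - p)^(n - 6)
by_hand <- choose(n, 6) * p^6 * (1 - p)^(n - 6)
all.equal(by_hand, dbinom(6, n, p))

# P(X <= 5) is the sum of P(X = 0), ..., P(X = 5)
all.equal(sum(dbinom(0:5, n, p)), pbinom(5, n, p))
```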
  1. You are a data analyst working for an insurance company. You are interested in the number of claims that get submitted per day. X = # claims filed in one day. You know the mean number of claims filed per day is 15.
  1. What is the probability that more than 10 claims will be filed in one day?

Use the appropriate R function (must be one we discussed in class) to find the probability.

meanOfClaim = 15
ppois(10,meanOfClaim,lower.tail=FALSE)
## [1] 0.8815356
  1. What is the probability that 10 or more claims will be filed in one day?

Use the appropriate R function (must be one we discussed in class) to find the probability.

ppois(9,meanOfClaim,lower.tail=FALSE)
## [1] 0.9301463
  1. Using the same scenario in #8, find the probability that exactly 10 claims are filed in one day.

Use the appropriate R function (must be one we discussed in class) to find the probability.

dpois(10,meanOfClaim)
## [1] 0.04861075
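The three claim probabilities are tied together, which gives a quick consistency check. The sketch below assumes, as the answers do by using ppois() and dpois(), that X follows a Poisson distribution with mean 15; the variable names are introduced here.

```r
lambda <- 15  # mean claims per day, from the problem statement

p_more_than_10 <- ppois(10, lambda, lower.tail = FALSE)  # P(X > 10)
p_10_or_more   <- ppois(9,  lambda, lower.tail = FALSE)  # P(X >= 10)
p_exactly_10   <- dpois(10, lambda)                      # P(X = 10)

# P(X >= 10) = P(X > 10) + P(X = 10)
all.equal(p_10_or_more, p_more_than_10 + p_exactly_10)

# lower.tail = FALSE gives the complement of the default lower tail
all.equal(p_more_than_10, 1 - ppois(10, lambda))
```

Because ppois(q, ...) returns P(X ≤ q) by default, "more than 10" uses q = 10 with the upper tail, while "10 or more" has to step back to q = 9.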
  1. Read about the space shuttle Challenger explosion on pp. 23-24. The main cause of the explosion was the failure of an O-ring seal. The SpaceShuttle data frame in the vcd package contains data from 24 test flights and how many O-rings failed on each test flight (nFailures).
  1. Create a data frame or table that has two rows and three columns. The first row should be the values of k (number of O-rings that failed) and the second row should be the frequencies for each value of k (\(n_k\)).
library(vcd)
data("SpaceShuttle", package="vcd")
SpaceShuttle.tab = table(SpaceShuttle$nFailures)
SpaceShuttle.tab
## 
##  0  1  2 
## 16  5  2
  1. Combine two vectors using the cbind() function. The first vector should be the values of k and the second vector should be the probabilities of k=0 (no O-ring failures on test flight), k=1 (1 O-ring failure on test flight) and k=2 (2 O-ring failures on test flight).
k = 0 : 2
PK = prop.table(SpaceShuttle.tab)
cbind(k,Prob=round(PK,7))
##   k      Prob
## 0 0 0.6956522
## 1 1 0.2173913
## 2 2 0.0869565
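The proportions can be reproduced from the counts printed by table() without loading the package. The sketch below hard-codes those counts; `n_k` and `prob` are names introduced here. Note that the counts add up to 23 rather than 24. table() drops missing values by default, so this would be consistent with one flight having no recorded nFailures value, which table(SpaceShuttle$nFailures, useNA = "ifany") could confirm.

```r
# Counts copied from the table() output above
n_k <- c("0" = 16, "1" = 5, "2" = 2)
k   <- as.integer(names(n_k))

sum(n_k)                # number of flights actually tabulated
prob <- n_k / sum(n_k)  # same values as prop.table() gives
cbind(k, n_k, Prob = round(prob, 7))
```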