Question 1.

dbinom (10, size=50, prob=0.1)
## [1] 0.01518333
plot(dbinom (0:50, size=50, prob=0.1))

# The probability of 10 or less influenza-related cases out of 50
pbinom(10, size=50, prob=0.1)
## [1] 0.9906454
# The probability of 10 or more influenza-related cases out of 50
1-pbinom(9, size=50, prob=0.1)        # Pr(X>=10) = 1-Pr(X<=9) = 1 – pbinom(9, size=50, prob=0.1)
## [1] 0.02453794

Discussion

If 10 out of 50 visitors in the emergency room had influenza-related cases today, today rate is 20%. But, the probability of 10 influenza-related cases out of 50 is ~1.518%, and the cumulative probability of 10 or more influenza-related cases out of 50 is ~2.454%. Therefore, today rate is way higher than ~1.518% and ~2.454% , and it is very weird situation.

Question 2.

# negative binomial distribution of mean expression  
# n = number of observations
# size = target for number of successful trials
# prob = probability of success in each trial

negativeRandomSample = rnbinom (10000, size=1, prob=0.0001)
# on average, it takes ~10,000 times to have one success
summary(negativeRandomSample) 
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##       0    2882    6970   10020   13830  125300
#  This makes sense because I set the probability to 0.0001  

sorted=sort(negativeRandomSample)
cumulatedSum=cumsum(negativeRandomSample)    # cumulative summation

pnbinom(6926, size=1,prob=0.0001)  # median 6926
## [1] 0.4997937
qnbinom(0.5, size=1,prob=0.0001)   #6931, close to median
## [1] 6931

Discussion

• In the generation of random sample of 10,000 negative binomial variates, as I set “size=1, prob=0.0001”, on average, it takes ~10,000 times to have one success. Therefore, it makes sense that I got mean=~10,040.
• In the 10,000 random samples of negative binomial variates, median was 6,926. It was confirmed when I used CDF; I got the probability ~0.4998. In the quantile function, I got 6,931, which is close to median. It is comparable to the sorted output; the medians for the sample and the nbinom distribution are almost the same

Question 3.

basicProbability=c(0.855,0.027,0.018,0.003,0.095,0.002)


basicProbability
## [1] 0.855 0.027 0.018 0.003 0.095 0.002
oneSample_size100 = rmultinom( n = 1, size =100, prob = basicProbability)
nineSample_size100 = rmultinom( n = 9, size =100, prob = basicProbability)

oneSample_size100 = array(oneSample_size100, dim=c(6,1))
nineSample_size100 = array(nineSample_size100, dim=c(6,9))

tenSample_size100 = cbind(oneSample_size100, nineSample_size100)
tenSample_size100
##      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## [1,]   86   75   83   85   87   87   91   82   85    88
## [2,]    5    7    1    1    3    1    0    4    4     2
## [3,]    1    3    3    3    1    1    3    0    5     3
## [4,]    0    1    0    0    0    0    0    0    0     0
## [5,]    8   13   12   11    9   11    6   14    6     6
## [6,]    0    1    1    0    0    0    0    0    0     1
rownames(tenSample_size100) <- c("negative_healthy","positive_healthy", "indeterminate_healthy",
                "negative_sick","positive_sick","indeterminate_sick")
healthyPositive <- tenSample_size100["positive_healthy", ]
table(healthyPositive)
## healthyPositive
## 0 1 2 3 4 5 7 
## 1 3 1 1 2 1 1
summary(healthyPositive)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     0.0     1.0     2.5     2.8     4.0     7.0
hist(healthyPositive)