Due 9/20/2024 Friday before the class begins.

Problem 1

Four buses carrying 100 students from the same school arrive at a football stadium. The buses carry, respectively, 10, 20, 30, and 40 students. One of the students is randomly selected. Let \(X\) denote the number of students who were on the bus carrying the randomly selected student. Now one of the 4 buses is randomly selected, and let \(Y\) denote the number of students on the bus.

(a) Which of \(\mathbb{E}[X]\) or \(\mathbb{E}[Y]\) do you think is larger? Why?

Solution

\(\mathbb{E}[X]\) is larger than \(\mathbb{E}[Y]\), because \(X\) is more likely to take a large value: a randomly selected student is on the 40-student bus with probability 40/100, whereas a randomly selected bus is the 40-student bus with probability only 1/4.

(b) Compute \(\mathbb{E}[X]\) and \(\mathbb{E}[Y]\).

Solution

\[ \mathbb{E}[X]=10\times \frac{10}{100}+ 20\times \frac{20}{100}+30\times \frac{30}{100}+40\times \frac{40}{100}=30. \] \[ \mathbb{E}[Y]=10\times\frac{1}{4}+20\times\frac{1}{4}+30\times\frac{1}{4}+40\times\frac{1}{4}=25. \]

(c) Explain (without any computation) which of \(\text{Var}(X)\) or \(\text{Var}(Y)\) you think is smaller? Why?

Solution

\(\text{Var}(X)\) is smaller than \(\text{Var}(Y)\). \(Y\) is uniformly distributed over the four values, whereas \(X\) puts most of its probability on the larger values 30 and 40. Hence, \(X\) is less scattered than the evenly spread \(Y\), and \(\text{Var}(X)\) is smaller than \(\text{Var}(Y)\).

(d) Compute \(\text{Var}(X)\) and \(\text{Var}(Y)\) to confirm your guess.

Solution

Note that \[ \mathbb{E}[X]=10\times \frac{10}{100}+ 20\times \frac{20}{100}+30\times \frac{30}{100}+40\times \frac{40}{100}=30. \] \[ \mathbb{E}[Y]=10\times\frac{1}{4}+20\times\frac{1}{4}+30\times\frac{1}{4}+40\times\frac{1}{4}=25. \] Hence, \[\begin{align*} &\text{Var}(X)=\mathbb{E}[(X-\mathbb{E}[X])^{2}]\\ =&(10-30)^{2}\times \frac{10}{100} +(20-30)^{2}\times \frac{20}{100} +(30-30)^{2}\times \frac{30}{100} +(40-30)^{2}\times \frac{40}{100} \\ =&40+20+0+40=100, \end{align*}\] and \[\begin{align*} &\text{Var}(Y)=\mathbb{E}[(Y-\mathbb{E}[Y])^{2}]\\ =&(10-25)^{2}\times \frac{1}{4} +(20-25)^{2}\times \frac{1}{4} +(30-25)^{2}\times \frac{1}{4} +(40-25)^{2}\times \frac{1}{4} \\ =&125. \end{align*}\] Since \(100<125\), this confirms the guess in (c).
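The hand computations in (b) and (d) can be checked numerically. The following is a minimal R sketch, not part of the original solution; the helper names `pmf_mean` and `pmf_var` are hypothetical and introduced only for illustration.

```r
# Bus sizes from the problem statement
sizes <- c(10, 20, 30, 40)

# X: a student is chosen uniformly, so each bus is weighted by its share of students
p_x <- sizes / sum(sizes)
# Y: a bus is chosen uniformly, so each bus has probability 1/4
p_y <- rep(1 / length(sizes), length(sizes))

# Hypothetical helpers: mean and variance of a discrete distribution
pmf_mean <- function(values, probs) sum(values * probs)
pmf_var <- function(values, probs) {
  mu <- pmf_mean(values, probs)
  sum((values - mu)^2 * probs)
}

mean_x <- pmf_mean(sizes, p_x)
mean_y <- pmf_mean(sizes, p_y)
var_x <- pmf_var(sizes, p_x)
var_y <- pmf_var(sizes, p_y)

print(c(mean_x = mean_x, mean_y = mean_y, var_x = var_x, var_y = var_y))
```

Comparing `var_x` with `var_y` in the printed vector shows directly which variable is more spread out.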

(e) On their way back, they need to give a ride to other students in a nearby school and decide to carry 20, 30, 40, and 50 students respectively. One of the 4 buses is randomly selected and let \(Z\) be the number of students on the bus. What is \(\text{Var}(Z)\)?

(Hint: you don’t need any extra computation. Note that \(Z=Y+10\).)

Solution

\[ \text{Var}(Z)=\text{Var}(Y+10)=\text{Var}(Y)=125. \]
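The middle equality holds because adding a constant shifts the mean by the same constant, so the deviations from the mean are unchanged. Written out with the definition of variance used in (d): \[\begin{align*} &\text{Var}(Y+10)=\mathbb{E}[(Y+10-\mathbb{E}[Y+10])^{2}]\\ =&\mathbb{E}[(Y+10-\mathbb{E}[Y]-10)^{2}]\\ =&\mathbb{E}[(Y-\mathbb{E}[Y])^{2}]=\text{Var}(Y). \end{align*}\]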

Problem 2 Poisson Distribution

The number of typos per page of a certain book has a Poisson distribution with rate \(\lambda=5\). What is the probability that there is no typo on a given page?

Solution

Let \(X\) be the number of typos on the page. Then \(X\sim \text{Poisson}(5)\), so
\[ \mathbb{P}(X=0)=e^{-5}\frac{5^{0}}{0!}=e^{-5}\approx 0.0067. \]
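As a quick numerical check, here is a short R sketch (not part of the original solution) that evaluates the same probability in two ways, with the built-in Poisson probability mass function `dpois` and with the closed form \(e^{-\lambda}\). The variable names are illustrative.

```r
lambda <- 5

# P(X = 0) from the Poisson probability mass function
p_zero_pmf <- dpois(0, lambda = lambda)
# P(X = 0) from the closed form exp(-lambda)
p_zero_formula <- exp(-lambda)

print(c(dpois = p_zero_pmf, formula = p_zero_formula))
```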

Problem 3 Poisson Approximation

Mass-produced needles are packed in boxes of 1000. On average, 1 needle in 2000 is defective.

(a) Find the probability that a box contains 2 or more defectives using the binomial distribution (you may want to use R to compute the actual probability.)

Solution

Let \(X\) be the number of defective needles in one box. Then, \(X\sim \text{Binom}(1000, \frac{1}{2000})\). Hence, \[\begin{align*} &\mathbb{P}(X\geq 2) =1-\mathbb{P}(X=0 \text{ or } 1) \\ =&1- \binom{1000}{0}\left(\frac{1}{2000}\right)^{0}\left(\frac{1999}{2000}\right)^{1000} -\binom{1000}{1}\left(\frac{1}{2000}\right)^{1}\left(\frac{1999}{2000}\right)^{999}\\ =&0.09016608. \end{align*}\]

1 - pbinom(1, 1000, 1/2000)
## [1] 0.09016608

(b) Use the Poisson approximation to estimate the probability that 2 or more are defective.

Solution

Note that \(\lambda=np=1000\times \frac{1}{2000}=\frac12\). Hence, \[ \mathbb{P}(X\geq 2)=1-\mathbb{P}(X\leq 1) \approx 1- e^{-1/2}(1+\frac{1}{2})=0.09020401. \]

1 - exp(-1/2) * 1.5
## [1] 0.09020401
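To see how closely the two models agree beyond the single tail probability, the following R sketch (an illustration, not part of the original solution) tabulates the binomial and Poisson probabilities for small counts and recomputes both tail probabilities with `lower.tail = FALSE`. The range `0:4` and the variable names are arbitrary choices.

```r
n <- 1000
p <- 1 / 2000
lambda <- n * p  # Poisson rate matching the binomial mean

# Side-by-side probabilities for k = 0, ..., 4 defectives
k <- 0:4
comparison <- data.frame(
  k = k,
  binomial = dbinom(k, size = n, prob = p),
  poisson = dpois(k, lambda = lambda)
)
comparison$abs_diff <- abs(comparison$binomial - comparison$poisson)
print(comparison)

# P(X >= 2) = P(X > 1) under each model
tail_binom <- pbinom(1, size = n, prob = p, lower.tail = FALSE)
tail_pois <- ppois(1, lambda = lambda, lower.tail = FALSE)
print(c(binomial = tail_binom, poisson = tail_pois))
```

Using `lower.tail = FALSE` avoids the explicit subtraction from 1 and is generally the safer choice when the tail probability is very small.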

Problem 4 Overbooking Problem

Here is an example where the binomial distribution is used in real life: how most airlines use overbooking to maximize their revenue. Watch the video before you proceed. \[ \href{https://www.youtube.com/watch?v=ZFNstNKgEDI}{\text{Why do airlines sell too many tickets? - Nina Klietsch, TED-Ed}}. \] In this problem, you will find the number of airline tickets to sell that maximizes revenue. Here is the setup. On a certain airline route, there are 500 seats and each ticket is sold at $1000. Each bumped passenger costs the airline $2500 for accommodations. For simplicity, we assume that each person travels individually rather than in groups, and that every ticket offered is sold. Each person who purchases a ticket shows up on time at the airport with probability \(p=95\%\).

Theoretical derivation

(a) What is the total revenue (the money that the airline earns) without any overbooking?

Solution

\[ \$1000 \times 500=\$500,000. \]

(b) Assume that the airline sells \(n\) tickets with \(n\geq 500\) (they decide to overbook). Let \(X\) be the number of customers who show up at the airport on time. What is the distribution of \(X\)?

Solution

\[ X\sim \text{Binom}(n,0.95). \]

(c) Let \(Y\) be the amount of money to pay for all bumped passengers. Find an expression for \(Y\) in terms of \(X\).

Solution

\[ Y= \begin{cases} 0 &\text{if } X\leq 500,\\ 2500 \times (X-500) &\text{if } X>500. \end{cases} \]

(d) Find the expression for \(\mathbb{E}[Y]\).

Solution

\[ \mathbb{E}[Y]=2500\times \mathbb{P}(X=501) + 2500\times 2\times \mathbb{P}(X=502) + \cdots + 2500\times (n-500)\times \mathbb{P}(X=n). \]
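Substituting the binomial probabilities from (b), the same expectation can be written compactly as a single sum (for \(n=500\) the sum is empty and \(\mathbb{E}[Y]=0\)): \[ \mathbb{E}[Y]=2500\sum_{k=501}^{n}(k-500)\,\mathbb{P}(X=k) =2500\sum_{k=501}^{n}(k-500)\binom{n}{k}(0.95)^{k}(0.05)^{n-k}. \]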

Monte-Carlo simulation

(a) Suppose that the airline sells 520 tickets. Write code that computes the expected revenue for the airline. Does the airline earn or lose money on average by selling 20 extra tickets?

Solution

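p <- 0.95
# Ticket income when 520 tickets are sold at $1000 each
TicketSale <- 520 * 1000
# Expected payout: $2500 per bumped passenger, weighted by P(X = i) for X ~ Binom(520, p)
AverageCostForBumped <- 0
for (i in 501:520) {
  AverageCostForBumped <- AverageCostForBumped + 2500 * (i - 500) * dbinom(i, 520, p)
}
TotalRevenue <- TicketSale - AverageCostForBumped
TotalRevenue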
## [1] 519411.9

The expected revenue from selling 520 tickets is $519,411.9, so on average the airline earns an extra $19,411.9 by overselling, compared with the $500,000 from selling exactly 500 tickets.
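The calculation above is exact: it sums over the binomial probabilities rather than simulating flights. For comparison, here is a minimal Monte Carlo sketch in R, not part of the original solution. The seed, the number of simulated flights `n_sims`, and all variable names are assumptions made for illustration.

```r
set.seed(1)        # arbitrary seed, chosen only for reproducibility
n_tickets <- 520   # tickets sold
seats <- 500
price <- 1000      # dollars per ticket
bump_cost <- 2500  # dollars per bumped passenger
p_show <- 0.95     # probability that a ticket holder shows up
n_sims <- 100000   # assumed number of simulated flights

# Number of passengers who show up on each simulated flight
shows <- rbinom(n_sims, size = n_tickets, prob = p_show)
# Passengers beyond the seat capacity are bumped
bumped <- pmax(shows - seats, 0)
# Revenue of each simulated flight
revenue <- price * n_tickets - bump_cost * bumped

mean_revenue <- mean(revenue)
se_revenue <- sd(revenue) / sqrt(n_sims)  # Monte Carlo standard error
print(c(estimate = mean_revenue, std_error = se_revenue))
```

The estimate should land within a few standard errors of the exact value $519,411.9; increasing `n_sims` tightens the agreement.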

(b) Now use the previous code to compute the expected revenue for the airline if they sell \(n\) tickets where \(n\) ranges from 500 to 550. Make a chart that shows the expected revenue for each value of \(n\).

p <- 0.95
# TotalRevenue records the expected revenue (ticket income minus expected bumping cost)
TotalRevenue <- NULL

# Loop over 1 to 50 extra tickets, i.e. 501 to 550 tickets sold
for (n in 1:50) {
  Overbooking <- n
  TicketRevenue <- 1000 * (500 + Overbooking)
  BumpAverage <- 0
  for (i in 1:Overbooking) {
    BumpAverage <- BumpAverage + 2500 * i * dbinom(500 + i, 500 + Overbooking, p)
  }
  TotalRevenue <- c(TotalRevenue, TicketRevenue - BumpAverage)
}
print(TotalRevenue)
##  [1] 501000.0 502000.0 503000.0 504000.0 505000.0 506000.0 507000.0 508000.0
##  [9] 508999.9 509999.6 510998.8 511996.8 512992.4 513983.2 514965.2 515932.0
## [17] 516874.4 517779.8 518632.0 519411.9 520097.7 520667.2 521098.9 521374.2
## [25] 521479.2 521405.6 521151.8 520722.3 520127.7 519382.6 518504.9 517513.8
## [33] 516428.7 515267.7 514047.3 512781.6 511482.1 510158.3 508817.2 507464.3
## [41] 506103.3 504737.1 503367.4 501995.6 500622.5 499248.6 497874.3 496499.6
## [49] 495124.8 493749.9
plot(TotalRevenue, col = "red", xlab="Number of Extra Tickets", 
     ylab="Total Revenue")

(c) What number of tickets maximizes the expected revenue? You must provide a full revenue chart to show that your answer maximizes the expected revenue.

Solution

Selling \(n=525\) tickets maximizes the expected revenue, which is $521,479.2.
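The maximizer can also be read off programmatically instead of by inspecting the printed vector. The following R sketch is one possible way to do this, not the original solution's method; the function `expected_revenue` and its argument names are hypothetical.

```r
# Hypothetical helper: expected revenue when n tickets are sold
expected_revenue <- function(n, seats = 500, price = 1000,
                             bump_cost = 2500, p = 0.95) {
  # With no overbooking nobody can be bumped
  if (n <= seats) {
    return(price * n)
  }
  # Sum the bumping cost over every outcome with more shows than seats
  k <- (seats + 1):n
  expected_cost <- bump_cost * sum((k - seats) * dbinom(k, size = n, prob = p))
  price * n - expected_cost
}

n_values <- 500:550
revenues <- sapply(n_values, expected_revenue)

# Number of tickets with the largest expected revenue
best_n <- n_values[which.max(revenues)]
best_revenue <- max(revenues)
print(c(best_n = best_n, best_revenue = best_revenue))
```

Because `n_values` starts at 500, the no-overbooking case from part (a) of the theoretical derivation is included in the comparison.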