Problem 2.6

The probability of getting a sum of 1 is 0: when you roll a pair of fair dice, the smallest possible sum is 2 (a 1 on each die).
A sum of 5 can occur in four ways: (1,4), (2,3), (3,2) and (4,1). With 36 equally likely outcomes, the probability of getting a sum of 5 is 4/36 = 0.1111, or 11.11 percent.
A sum of 12 can occur in only one way, (6,6), so its probability is 1/36 = 0.0278, or 2.78 percent.
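
The counts above can be checked by listing all 36 outcomes. This is a minimal sketch; the helper name `p_sum` is made up for illustration and is not part of the original homework.

```r
# Enumerate all 36 equally likely outcomes of two fair six-sided dice.
rolls <- expand.grid(die1 = 1:6, die2 = 1:6)
sums  <- rolls$die1 + rolls$die2

# Hypothetical helper: probability of a given sum = favourable outcomes / 36.
p_sum <- function(s) mean(sums == s)

p_sum(1)   # no outcome sums to 1
p_sum(5)   # outcomes (1,4), (2,3), (3,2), (4,1)
p_sum(12)  # only (6,6)
```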

Problem 2.8

No. Living below the poverty line and speaking a foreign language at home are not disjoint, because 4.2 percent of Americans fall into both categories.
Venn Diagram

library(VennDiagram)

## Loading required package: grid

## Loading required package: futile.logger

grid.newpage()
draw.pairwise.venn(area1= 14.6, area2= 20.7, cross.area= 4.2, c("Below Poverty Line", "Foreign Language Speakers"))

## (polygon[GRID.polygon.1], polygon[GRID.polygon.2], polygon[GRID.polygon.3], polygon[GRID.polygon.4], text[GRID.text.5], text[GRID.text.6], text[GRID.text.7], text[GRID.text.8], text[GRID.text.9])

Percent of Americans who live below the poverty line and speak only English at home: => 0.146 - 0.042 = 0.104, or 10.4 percent. Of the 14.6 percent below the poverty line, 4.2 percent also speak a foreign language at home, so the rest speak only English.
Percent of Americans who live below the poverty line or speak a foreign language at home: => 0.146 + 0.207 - 0.042 = 0.311, or 31.1 percent.
Percent of Americans who live above the poverty line and speak only English at home: => 1 - 0.311 = 0.689, or 68.9 percent. This is the complement of living below the poverty line or speaking a foreign language at home.
The event that someone lives below the poverty line is NOT independent of the event that the person speaks a foreign language at home. If the two events were independent, the probability of both would be 0.146 x 0.207 = 0.0302, but the observed value is 0.042. Knowing that a foreign language is spoken at home therefore changes the probability that the person lives below the poverty line.
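
As a rough check of Problem 2.8, the sketch below uses only the three percentages given in the problem (14.6, 20.7 and 4.2). It relies on the general addition rule and complements and does not assume independence. The variable names are illustrative.

```r
# Percentages given in the problem, written as proportions.
p_poverty <- 0.146   # below the poverty line
p_foreign <- 0.207   # speak a foreign language at home
p_both    <- 0.042   # both

p_poverty_english <- p_poverty - p_both              # below poverty line, English only
p_either          <- p_poverty + p_foreign - p_both  # general addition rule
p_neither         <- 1 - p_either                    # above poverty line, English only

# Independence would require P(A and B) = P(A) * P(B).
p_if_independent <- p_poverty * p_foreign

round(c(p_poverty_english, p_either, p_neither, p_if_independent), 4)
```

If `p_if_independent` and `p_both` differ noticeably, the two events are not independent.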

Problem 2.20

Probability that a randomly chosen male respondent or his partner has blue eyes:

=> P(A or B) = P(A) + P(B) - P(A & B) = (114 / 204) + (108 / 204) - (78 / 204) = 144 / 204 = 0.70588, or 70.588 percent, is the probability that a male respondent or his partner has blue eyes.

Probability that a randomly chosen male respondent with blue eyes has a partner with blue eyes:

=> 78 / 114 = 0.68421 or 68.421 percent

Probability that a randomly chosen male respondent with brown eyes has a partner with blue eyes:

=> Males with brown eyes = 54, males with brown eyes and a partner with blue eyes = 19 => 19 / 54 = 0.35185 or 35.185 percent is the probability that a randomly chosen male respondent with brown eyes has a partner with blue eyes.

Probability that a randomly chosen male respondent with green eyes has a partner with blue eyes:

=> Males with green eyes = 36, males with green eyes and a partner with blue eyes = 11 => 11 / 36 = 0.30556 or 30.556 percent is the probability that a randomly chosen male respondent with green eyes has a partner with blue eyes.

Does it appear that the eye colors of male respondents and their partners are independent? Explain your reasoning.

The eye colors of male respondents and their partners do not appear to be independent. If they were, the probability that the partner has blue eyes would be the same whatever the male respondent's eye color, and equal to the overall proportion 108 / 204 = 0.529. Instead it is 0.684 for males with blue eyes, 0.352 for males with brown eyes and 0.306 for males with green eyes.
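
The conditional probabilities in Problem 2.20 can be reproduced from the counts quoted above. This sketch uses only those counts (204 couples; 114, 54 and 36 males by eye color; 78, 19 and 11 of them with a blue-eyed partner). The vector names are illustrative.

```r
n_total      <- 204
male_totals  <- c(blue = 114, brown = 54, green = 36)  # males by eye color
partner_blue <- c(blue = 78,  brown = 19, green = 11)  # of those, partner has blue eyes

# P(partner blue | male eye color), one value per male eye color.
p_partner_blue_given_male <- partner_blue / male_totals

# Marginal P(partner blue) = 108 / 204.
p_partner_blue <- sum(partner_blue) / n_total

# P(male blue or partner blue) by the general addition rule.
p_either_blue <- unname(male_totals["blue"] + sum(partner_blue) -
                        partner_blue["blue"]) / n_total

round(p_partner_blue_given_male, 5)
round(c(marginal = p_partner_blue, either = p_either_blue), 5)
```

Under independence the three conditional values would all be close to the marginal value.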

Problem 2.30

Probability of drawing a hardcover book first and then a paperback fiction book second, when drawing without replacement

=> P(Hardcover first) x P(Paperback fiction second) = (28 / 95) x (59 / 94) => 0.18499 = 18.499 %

Probability of drawing a fiction book first and then a hardcover book second, when drawing without replacement

=> P(Fiction) x P(Hardcover) = (72 / 95) x (28 / 94) => 0.225755 = 22.5755 %

Probability of the scenario in part (b), this time with the first book placed back on the bookcase before the second book is drawn

=> P(Fiction) x P(Hardcover) = (72 / 95) x (28 / 95) => 0.22337 = 22.337%

Yes, the two answers are very similar. When the first book is placed back, the denominator for the second draw is 95 instead of 94, which changes the result only slightly, from 22.5755 percent to 22.337 percent. Removing a single book from a bookcase of 95 barely changes the proportions, so the two calculations nearly agree.
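
The comparison between drawing with and without replacement can be sketched as follows. The code mirrors the calculation in the text, which keeps all 28 hardcover books available for the second draw. That is a simplifying assumption, since the first fiction book could itself be a hardcover. The variable names are illustrative.

```r
n_books     <- 95
n_fiction   <- 72
n_hardcover <- 28

# Fiction first, hardcover second, without replacement (94 books remain).
p_without <- (n_fiction / n_books) * (n_hardcover / (n_books - 1))

# Same events, with the first book returned before the second draw.
p_with <- (n_fiction / n_books) * (n_hardcover / n_books)

round(c(without = p_without, with = p_with, difference = p_without - p_with), 5)
```

The difference comes entirely from the second denominator, 94 versus 95.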

Problem 2.38

read.csv(file="Revenue.csv", header=TRUE, sep=",")

##                        i  X  No.bags  X1.bag  X2.bags  Total
## 1                     xi NA   $0.00  $25.00   $60.00      NA
## 2              P(x = xi) NA     0.54    0.34     0.12     NA
## 3           xi x P(x=xi) NA        0     8.5      7.2  15.70
## 4              xi - Mean NA    -15.7     9.3     44.3     NA
## 5           (xi - Mean)2 NA   246.49   86.49  1962.49     NA
## 6 (xi - Mean)2 x P(x=xi) NA 133.1046 29.4066 235.4988 398.01

Expected revenue per passenger: $15.70
Variance: 398.01 (dollars squared)
Standard deviation: $19.95

How much revenue should the airline expect from 120 passengers, and with what standard deviation?

Expected revenue for 120 passengers => 120 x 15.70 = $1884. Variance => 120 x 398.01 = 47761.2, assuming the passengers' baggage fees are independent of one another. Standard deviation => sqrt(47761.2) = $218.54
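
The table and the 120-passenger totals can be recomputed directly from the fees and probabilities shown above. This is a sketch that assumes the passengers' fees are independent, so that variances add. The variable names are illustrative.

```r
fee  <- c(0, 25, 60)          # revenue for 0, 1 and 2 checked bags
prob <- c(0.54, 0.34, 0.12)   # probability of each case

# Per-passenger expected value, variance and standard deviation.
mean_fee <- sum(fee * prob)
var_fee  <- sum((fee - mean_fee)^2 * prob)
sd_fee   <- sqrt(var_fee)

# Totals for n independent passengers: means add and variances add.
n          <- 120
total_mean <- n * mean_fee
total_var  <- n * var_fee
total_sd   <- sqrt(total_var)

round(c(mean_fee, var_fee, sd_fee, total_mean, total_var, total_sd), 2)
```

Note that the standard deviation of the total is sqrt(120) times the per-passenger value, not 120 times it.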

Problem 2.44

Income <- data.frame(c("< 10", "10-14", "15-24", "25-34","35-49", "50 - 64", "65-74", "75-99", "> 100"), c(2.2, 4.7, 15.8, 18.3, 21.2, 13.9, 5.8, 8.4, 9.7))
colnames(Income) <- c("Category", "Percent")

barplot(Income$Percent, names.arg=Income$Category, xlab = "Income Group", ylab = "Population percentage")

Describe the distribution of total personal income.

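The barplot shows a distribution that is roughly unimodal and skewed to the right: the percentages rise to a peak in the $35,000 - $49,999 group, which is the most common category, and then fall off, with a smaller rise in the top two categories. The income groups have unequal widths and the last one is open-ended, so the shape should be read with some caution.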

Probability that a randomly chosen US resident makes less than $50,000 per year.

The probability that a randomly chosen US resident makes less than $50,000 a year is 62.20% (0.022 + 0.047 + 0.158 +0.183 + 0.212)

Probability that a randomly chosen US resident makes less than $50,000 per year and is female

Assuming that income and gender are independent, the probability that a randomly chosen US resident makes less than $50,000 per year and is female is 0.622 x 0.41 = 0.25502 or 25.502 percent

The same data source indicates that 71.8 percent of females make less than $50,000 per year. Use this value to determine whether or not the assumption you made in part (c) is valid.

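If 71.8 percent of females make less than $50,000 per year, then P(less than $50,000 and female) = 0.718 x 0.41 => 0.29438 or 29.438 percent

The assumption made in part (c) is not valid. P(less than $50,000 | female) = 0.718 differs from P(less than $50,000) = 0.622, so income and gender are not independent, and a higher share of females make less than $50,000 a year than was assumed in part (c)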
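
The independence check in Problem 2.44 can be sketched with the figures already quoted: the income percentages, the 41 percent share of females used in part (c), and the 71.8 percent figure from part (d). The variable names are illustrative.

```r
# Income percentages, in the same order as the Income data frame above.
percent <- c(2.2, 4.7, 15.8, 18.3, 21.2, 13.9, 5.8, 8.4, 9.7)

p_under_50k              <- sum(percent[1:5]) / 100  # first five groups
p_female                 <- 0.41                     # share of females, from part (c)
p_under_50k_given_female <- 0.718                    # given in part (d)

# Joint probability under the independence assumption of part (c).
joint_if_independent <- p_under_50k * p_female

# Joint probability implied by the conditional figure in part (d).
joint_actual <- p_under_50k_given_female * p_female

round(c(p_under_50k, joint_if_independent, joint_actual), 5)
```

The gap between the two joint probabilities reflects the gap between 0.718 and 0.622.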

Data 606 - Homework 2

Habib U Khan

February 17, 2019