| dice_rolls | |
|---|---|
| a) | P(getting sum of 1) = 0 |
| b) | P(getting sum of 5) = number of outcomes with sum 5 {(1,4),(2,3),(3,2),(4,1)} / total number of outcomes = 4/36 = 1/9 |
| c) | P(getting sum of 12) = number of outcomes with sum 12 {(6,6)} / total number of outcomes = 1/36 |
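
The three answers above can be cross-checked by enumerating every outcome of two dice. This is a minimal sketch that assumes two fair six-sided dice; the helper name `p_sum` is hypothetical and introduced only for illustration.

```r
# All 36 equally likely outcomes of rolling two fair six-sided dice.
rolls <- expand.grid(die1 = 1:6, die2 = 1:6)
sums  <- rolls$die1 + rolls$die2

# Probability of a given sum = favourable outcomes / total outcomes.
p_sum <- function(target) mean(sums == target)

p_sum(1)    # 0, because the smallest possible sum is 2
p_sum(5)    # 4/36 = 1/9
p_sum(12)   # 1/36
```
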

| | Poverty:Yes | Poverty:No | Total |
|---|---|---|---|
| English:Yes | 0.104 | 0.689 | 0.793 |
| English:No | 0.042 | 0.165 | 0.207 |
| Total | 0.146 | 0.854 | 1.000 |

| is_it_disjoint |
|---|
| No, the events are not disjoint, because 4.2% of people fall into both categories (Poverty:Yes and English:No) |
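
The entries of the contingency table above, and the probabilities that follow from them, can be cross-checked with a short R sketch. The variable names (`joint`, `p_union`, `p_cond`) are hypothetical; only the four joint probabilities come from the table.

```r
# Joint probabilities from the contingency table
# (rows: English Yes/No, columns: Poverty Yes/No).
joint <- matrix(c(0.104, 0.689,
                  0.042, 0.165),
                nrow = 2, byrow = TRUE,
                dimnames = list(English = c("Yes", "No"),
                                Poverty = c("Yes", "No")))

p_english <- rowSums(joint)   # marginals: Yes = 0.793, No = 0.207
p_poverty <- colSums(joint)   # marginals: Yes = 0.146, No = 0.854

# General addition rule: P(Poverty:Yes or English:No).
p_union <- p_poverty[["Yes"]] + p_english[["No"]] - joint["No", "Yes"]

# Conditional probability: P(Poverty:Yes | English:No).
p_cond <- joint["No", "Yes"] / p_english[["No"]]

round(p_union, 3)   # 0.311
round(p_cond, 3)    # 0.203

# Independence would require p_cond to equal the marginal P(Poverty:Yes).
isTRUE(all.equal(p_cond, p_poverty[["Yes"]]))   # FALSE
```

Comparing the conditional probability with the marginal P(Poverty:Yes) = 0.146 is the standard independence check.
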

P(Poverty:Yes and English:Yes) = P(Poverty:Yes) - P(Poverty:Yes and English:No)
= 14.6% - 4.2%
= 10.4% = 0.104

P(Poverty:Yes or English:No) = P(Poverty:Yes) + P(English:No) - P(Poverty:Yes and English:No)
= 14.6% + 20.7% - 4.2%
= 31.1% = 0.311

P(Poverty:No and English:Yes) = 1 - P(Poverty:Yes or English:No)
= 1 - 0.311
= 0.689 = 68.9%

P(Poverty:Yes|English:No) = P(Poverty:Yes and English:No) / P(English:No)
= 0.042 / 0.207
≈ 0.203

Since P(Poverty:Yes|English:No) ≈ 0.203 is not equal to P(Poverty:Yes) = 0.146, the two events are not independent.

In the following, n(·) denotes a count and n(total_population) = 204.

P(Sm_Blue or Pf_Blue) = [n(Sm_Blue) + n(Pf_Blue) - n(Sm_Blue and Pf_Blue)] / n(total_population)
= (114 + 108 - 78) / 204
= 144 / 204
= 0.71

P(Pf_Blue|Sm_Blue) = [n(Sm_Blue and Pf_Blue) / n(total_population)] / [n(Sm_Blue) / n(total_population)]
= (78/204) / (114/204)
= 78 / 114
= 0.684

P(Pf_Blue|Sm_Brown) = [n(Sm_Brown and Pf_Blue) / n(total_population)] / [n(Sm_Brown) / n(total_population)]
= (19/204) / (54/204)
= 19 / 54
= 0.352

P(Pf_Blue|Sm_Green) = [n(Sm_Green and Pf_Blue) / n(total_population)] / [n(Sm_Green) / n(total_population)]
= (11/204) / (36/204)
= 11 / 36
= 0.306
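
The eye-colour calculations can be reproduced from the counts that appear in them. This is a minimal sketch with hypothetical variable names; reading `Sm_` as the male respondent and `Pf_` as his female partner is an assumption based on the labels and denominators used above.

```r
# Counts taken from the worked calculations (204 respondents in total).
n_total   <- 204
n_self    <- c(blue = 114, brown = 54, green = 36)  # male eye-colour group sizes
n_pf_blue <- 108                                    # female partners with blue eyes
# Males of each eye colour whose partner has blue eyes.
n_both    <- c(blue = 78, brown = 19, green = 11)

# Addition rule on counts: blue-eyed male or blue-eyed partner.
p_either_blue <- (n_self[["blue"]] + n_pf_blue - n_both[["blue"]]) / n_total

# Share of blue-eyed partners within each male eye-colour group.
p_pf_blue_given_self <- n_both / n_self

round(p_either_blue, 3)          # 0.706
round(p_pf_blue_given_self, 3)   # blue 0.684, brown 0.352, green 0.306
round(n_pf_blue / n_total, 3)    # 0.529, the unconditional share of blue-eyed partners
```

The three group-wise shares differ from the unconditional share of 0.529, which is what an independence check compares.
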

From c), the probability 11/36 (about 0.31) found for green-eyed males differs from P(Pf_Blue) = 108/204 ≈ 0.53. Because the conditional probability is not equal to the unconditional one, the eye colours of the partners are not independent.

| Type | X | P(X) | X × P(X) | Xi-E(X) | (Xi-E(X))^2 | P(X) × (Xi-E(X))^2 |
|---|---|---|---|---|---|---|
| 1st bag | 25 | 0.34 | 8.5 | 9.3 | 86.49 | 29.4066 |
| 2 bags | 60 | 0.12 | 7.2 | 44.3 | 1962.49 | 235.4988 |
| no luggage | 0 | 0.54 | 0.0 | -15.7 | 246.49 | 133.1046 |

| expected_value | variance | standard_deviation |
|---|---|---|
| 15.7 | 398.01 | 19.95 |
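
The expected value, variance, and standard deviation in the tables above follow from the definitions E(X) = Σ x·P(x) and Var(X) = Σ P(x)·(x − E(X))². A minimal sketch, with hypothetical variable names and the fee values and probabilities taken from the table:

```r
# Baggage fee X (in dollars) and its probability distribution, from the table.
fee  <- c(no_luggage = 0, one_bag = 25, two_bags = 60)
prob <- c(no_luggage = 0.54, one_bag = 0.34, two_bags = 0.12)

# A valid distribution must have probabilities summing to 1.
stopifnot(isTRUE(all.equal(sum(prob), 1)))

expected_value <- sum(fee * prob)                       # E(X)
variance       <- sum(prob * (fee - expected_value)^2)  # Var(X)
std_dev        <- sqrt(variance)                        # SD(X)

round(c(expected_value, variance, std_dev), 2)   # 15.70, 398.01, 19.95
```
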

| population | airline_revenue | revenue_per_passenger | variance | sd |
|---|---|---|---|---|
| 120 | 1884 | 15.7 | 47761.2 | 218.5434 |
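
The figures for 120 passengers scale up the per-passenger values. The sketch below assumes that passengers' baggage choices are independent of one another, which is what allows the variances to be added; the variable names are hypothetical.

```r
# Per-passenger figures from the summary table.
ev_per_passenger  <- 15.7
var_per_passenger <- 398.01
n_passengers      <- 120

# For a sum of independent, identically distributed fees,
# means and variances both add (standard deviations do not).
total_revenue  <- n_passengers * ev_per_passenger    # 1884
total_variance <- n_passengers * var_per_passenger   # 47761.2
total_sd       <- sqrt(total_variance)               # about 218.54
```

Note that the standard deviation grows with the square root of the number of passengers, not linearly: 19.95 × √120 ≈ 218.54.
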

| Income | Total |
|---|---|
| $1 to $9,999 or loss | 2.2% |
| $10,000 to $14,999 | 4.7% |
| $15,000 to $24,999 | 15.8% |
| $25,000 to $34,999 | 18.3% |
| $35,000 to $49,999 | 21.2% |
| $50,000 to $64,999 | 13.9% |
| $65,000 to $74,999 | 5.8% |
| $75,000 to $99,999 | 8.4% |
| $100,000 or more | 9.7% |

| peak_is_in_between |
|---|
| $35,000 to $49,999 |
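
```r
# Income brackets and percentages from the table above; barplot() is base R, so ggplot2 is not needed.
Income <- c("$1 to $9,999 or loss", "$10,000 to $14,999", "$15,000 to $24,999", "$25,000 to $34,999", "$35,000 to $49,999", "$50,000 to $64,999", "$65,000 to $74,999", "$75,000 to $99,999", "$100,000 or more")
Total1 <- c(2.2, 4.7, 15.8, 18.3, 21.2, 13.9, 5.8, 8.4, 9.7)
barplot(Total1, names.arg = Income, main = "Personal Income Distribution", ylab = "Total (%)", col = "lightblue")
```
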
The distribution is skewed to the right, and its peak lies in the $35,000 to $49,999 bracket.

```
## [1] "P(less than $50,000) = 62.2 % = 0.622"
## [1] "P(less than $50,000 and is female) = 0.25502 = 25.502 %"
```

The result in c) assumes that personal income and gender are independent, so that P(less than $50,000 and female) = P(less than $50,000) × P(female) = 0.622 × 0.41 = 0.25502.

We are told that 71.8% of females make less than $50,000, that is, P(less than $50,000|female) = 0.718. Using this value:

P(less than $50,000 and female) = P(less than $50,000|female) × P(female)
= 0.718 × 0.41
= 0.29438

This is not equal to the 0.25502 obtained in c). Equivalently, P(less than $50,000|female) = 0.718 is not equal to P(less than $50,000) = 0.622. Therefore the assumption made in c) is not valid: gender and income are not independent.
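
The income calculations can be checked from the percentage table. In this sketch the variable names are hypothetical, and P(female) is not read from the source directly but backed out of the two printed results (0.25502 / 0.622), which is an assumption about how c) was computed.

```r
# Income shares (in percent) from the table, lowest bracket first.
share <- c(2.2, 4.7, 15.8, 18.3, 21.2, 13.9, 5.8, 8.4, 9.7)

# The first five brackets are those below $50,000.
p_under_50k <- sum(share[1:5]) / 100        # 0.622

# P(female) implied by the joint probability reported in c).
p_female <- 0.25502 / p_under_50k           # 0.41

# Joint probability under independence vs. using the reported 71.8%.
p_joint_indep  <- p_under_50k * p_female    # 0.25502
p_joint_actual <- 0.718 * p_female          # 0.29438

# Independence would require P(< $50,000 | female) to equal P(< $50,000).
isTRUE(all.equal(0.718, p_under_50k))       # FALSE
```
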