dice_rolls | |
---|---|
a) | P(getting sum of 1) = 0, because the smallest possible sum of two dice is 2 |
b) | P(getting sum of 5) = countOfDifferentSum[(1,4),(2,3),(3,2),(4,1)] / totalCountOfDiceNumbers = 4/36 = 1/9 |
c) | P(getting sum of 12) = countOfDifferentSum[(6,6)] / totalCountOfDiceNumbers = 1/36 |
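
The counting argument in the table can be checked by brute force. The following is a minimal sketch, assuming two fair six-sided dice; the helper `p_sum` is a hypothetical name introduced here for illustration only.

```r
# Enumerate all 36 ordered outcomes of two fair six-sided dice (assumption)
rolls <- expand.grid(die1 = 1:6, die2 = 1:6)
sums <- rolls$die1 + rolls$die2

# Hypothetical helper: favourable outcomes divided by total outcomes
p_sum <- function(s) sum(sums == s) / nrow(rolls)

p_sum(1)   # no outcome sums to 1
p_sum(5)   # equals 4/36
p_sum(12)  # equals 1/36
```
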
| | Poverty:Yes | Poverty:No | Total |
---|---|---|---|
English:Yes | 0.104 | 0.689 | 0.793 |
English:No | 0.042 | 0.165 | 0.207 |
Total | 0.146 | 0.854 | 1.000 |
is_it_disjoint |
---|
No, the two groups are not disjoint, because 4.2% fall into both (Poverty:Yes and English:No) |
P(Poverty:Yes and English:Yes) = P(Poverty:Yes) - P(Poverty:Yes and English:No) = 14.6% - 4.2% = 10.4% = 0.104
P(Poverty:Yes or English:No) = P(Poverty:Yes) + P(English:No) - P(Poverty:Yes and English:No) = 14.6% + 20.7% - 4.2% = 31.1% = 0.311
P(Poverty:No and English:Yes) = 1 - P(Poverty:Yes or English:No) = 1 - 0.311 = 0.689 = 68.9%
P(Poverty:Yes|English:No) = P(Poverty:Yes and English:No) / P(English:No) = 0.042 / 0.207 = ~0.203, which is not equal
to P(Poverty:Yes) = 0.146, therefore the two events are not independent.
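
The addition rule and the independence check can be reproduced from the contingency table. This is a rough sketch that copies the four joint probabilities from the table; the variable names are illustrative and not part of the original work. Independence would require the conditional probability to match the marginal P(Poverty:Yes).

```r
# Joint probabilities copied from the contingency table
joint <- matrix(c(0.104, 0.689,
                  0.042, 0.165),
                nrow = 2, byrow = TRUE,
                dimnames = list(English = c("Yes", "No"),
                                Poverty = c("Yes", "No")))

p_poverty    <- sum(joint[, "Yes"])  # marginal P(Poverty:Yes)
p_english_no <- sum(joint["No", ])   # marginal P(English:No)
p_both       <- joint["No", "Yes"]   # P(Poverty:Yes and English:No)

p_union <- p_poverty + p_english_no - p_both  # general addition rule
p_cond  <- p_both / p_english_no              # P(Poverty:Yes | English:No)

round(c(union = p_union, conditional = p_cond, marginal = p_poverty), 3)
```
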
P(Sm_Blue or Pf_Blue) = P(Sm_Blue) + P(Pf_Blue) - P(Sm_Blue and Pf_Blue)
= (114 + 108 - 78) / 204
= 144/204
= 0.706
P(Pf_Blue|Sm_Blue) = P(Sm_Blue and Pf_Blue) / P(Sm_Blue)
= (78/204) / (114/204)
= 78/114
= 0.684
P(Pf_Blue|Sm_Brown) = P(Sm_Brown and Pf_Blue) / P(Sm_Brown)
= (19/204) / (54/204)
= 19/54
= 0.352
P(Pf_Blue|Sm_Green) = P(Sm_Green and Pf_Blue) / P(Sm_Green)
= (11/204) / (36/204)
= 11/36
= 0.306
From c), P(Pf_Blue|Sm_Green) = 0.306, whereas P(Pf_Blue) = 108/204 = 0.529. Because the conditional probability
differs from the marginal probability, the eye colors of Sm and Pf are not independent.
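
The same ratios can be computed directly from the counts. The sketch below assumes that 114, 54 and 36 are the Sm totals for blue, brown and green eyes, that 78, 19 and 11 are the couples in each of those groups where Pf has blue eyes, and that 108 is the Pf_Blue total out of 204 couples; the variable names are illustrative.

```r
# Counts taken from the calculations (assumed to be Sm group totals)
n_total   <- 204
sm_totals <- c(blue = 114, brown = 54, green = 36)
both_blue <- c(blue = 78, brown = 19, green = 11)  # Pf is blue within each Sm group
pf_blue   <- 108                                   # Pf_Blue total

# Share of Pf_Blue within each Sm eye-color group
cond <- both_blue / sm_totals

# Overall share of Pf_Blue, and P(Sm_Blue or Pf_Blue) by the addition rule
p_pf_blue <- pf_blue / n_total
p_either  <- (sm_totals[["blue"]] + pf_blue - both_blue[["blue"]]) / n_total

round(cond, 3)
round(c(p_pf_blue = p_pf_blue, p_either = p_either), 3)
```

If the shares in `cond` all matched `p_pf_blue`, the two eye colors would look independent; here they differ noticeably across groups.
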
Type | X | P(X) | X*P(X) | Xi-E(X) | (Xi-E(X))^2 | P(X)*(Xi-E(X))^2 |
---|---|---|---|---|---|---|
1st bag | 25 | 0.34 | 8.5 | 9.3 | 86.49 | 29.4066 |
2 bags | 60 | 0.12 | 7.2 | 44.3 | 1962.49 | 235.4988 |
no luggage | 0 | 0.54 | 0.0 | -15.7 | 246.49 | 133.1046 |
expected_value | variance | standard_deviation |
---|---|---|
15.7 | 398.01 | 19.95 |
population | airline_revenue | revenue_per_passenger | variance | sd |
---|---|---|---|---|
120 | 1884 | 15.7 | 47761.2 | 218.5434 |
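
The figures in the three tables above can be recomputed in a few lines. This is a minimal sketch using the fees and probabilities from the first table; it assumes the 120 passengers behave independently, so that expected values and variances both scale by 120. Variable names are illustrative.

```r
# Fee per passenger and its probability, from the table
fee  <- c(no_luggage = 0, one_bag = 25, two_bags = 60)
prob <- c(0.54, 0.34, 0.12)

ev      <- sum(fee * prob)           # expected revenue per passenger
v       <- sum(prob * (fee - ev)^2)  # variance per passenger
sd_each <- sqrt(v)                   # standard deviation per passenger

# Totals for n passengers, assuming independence between passengers
n <- 120
total_ev  <- n * ev
total_var <- n * v
total_sd  <- sqrt(total_var)

round(c(ev = ev, var = v, sd = sd_each), 2)
round(c(total_ev = total_ev, total_var = total_var, total_sd = total_sd), 4)
```
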
Income | Total |
---|---|
$1 to $9,999 or loss | 2.2% |
$10,000 to $14,999 | 4.7% |
$15,000 to $24,999 | 15.8% |
$25,000 to $34,999 | 18.3% |
$35,000 to $49,999 | 21.2% |
$50,000 to $64,999 | 13.9% |
$65,000 to $74,999 | 5.8% |
$75,000 to $99,999 | 8.4% |
$100,000 or more | 9.7% |
peak_is_in_between |
---|
$35,000 to $49,999 |
library(ggplot2)
barplot(Total1, names.arg = Income, main="Personal Income Distribution", ylab="Total", col="lightblue")
The distribution is skewed to the right, and its peak lies in the $35,000 to $49,999 bracket.
## [1] "P(less than $50,000) = 62.2 % = 0.622"
## [1] "P(less than $50,000 and is female) = 0.25502 = 25.502 %"
The result in c) assumes that personal income and gender are independent.
The data state that 71.8% of females make less than $50,000, that is, P(less than $50,000|female) = 0.718.
If income and gender were independent, P(less than $50,000|female) would equal P(less than $50,000) = 0.622.
Since 0.718 is not equal to 0.622, the assumption made in c) is not valid: gender and income are not independent.
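
The income calculations can be sketched as follows. The percentages come from the income table; the female share is not stated directly here, so it is backed out from the joint probability reported in c), which is an assumption of this sketch. Variable names are illustrative.

```r
# Percentages from the income table, in bracket order
pct <- c(2.2, 4.7, 15.8, 18.3, 21.2, 13.9, 5.8, 8.4, 9.7)

# The first five brackets lie below $50,000
p_under_50k <- sum(pct[1:5]) / 100

# Joint probability reported in c) under the independence assumption
p_joint_indep <- 0.25502
p_female <- p_joint_indep / p_under_50k  # female share implied by c) (assumption)

# Figure quoted for females, and the joint probability it implies
p_under_50k_given_female <- 0.718
p_joint_actual <- p_under_50k_given_female * p_female

round(c(p_under_50k = p_under_50k, p_female = p_female,
        joint_if_independent = p_joint_indep,
        joint_from_data = p_joint_actual), 4)
```

The gap between the two joint probabilities mirrors the gap between 0.718 and 0.622, which is why the independence assumption does not hold.
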