http://blog.linkedin.com/2014/12/17/the-25-hottest-skills-that-got-people-hired-in-2014/
Monday, November 30, 2015
Coursera: "Generating conclusions about a population from noisy data"
Issues:
P(0) =
P(1) =
P(odd) =
P(any number 1-6) =
In groups, roll the die 12 times. Record the expected and actual counts of rolls for each number 1-6.
# set possible values for die roll
die <- 1:6
# take a sample of 100 die rolls
die_roll <- sample(die, 100, replace = TRUE)
table(die_roll)
## die_roll
##  1  2  3  4  5  6
## 22 15 16 20 13 14
Re-run this simulation with either a two-sided coin or 10-sided die.
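One possible way to re-run the simulation is sketched below. The seed and the object names (`coin`, `coin_flip`, `d10`, `d10_roll`) are arbitrary choices for illustration, not part of the original exercise.

```r
# fix the random seed so the run is repeatable (arbitrary value)
set.seed(1)

# two-sided coin: 100 flips
coin <- c("H", "T")
coin_flip <- sample(coin, 100, replace = TRUE)
table(coin_flip)

# 10-sided die: 100 rolls
d10 <- 1:10
d10_roll <- sample(d10, 100, replace = TRUE)
table(d10_roll)
```

With a fair coin the expected count is 50 per side; with a fair 10-sided die it is 10 per face.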
Suppose 10% of Americans have sleep apnea, and 3% of Americans have restless leg syndrome. What percentage of Americans have at least one of the two sleep problems?
What is the probability that you will draw a spade OR an ace from a standard deck of playing cards?
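A minimal sketch of this question in R, assuming a standard 52-card deck with 13 ranks in each of 4 suits. The `deck` data frame and its column names are made up for illustration. It applies the addition rule P(A or B) = P(A) + P(B) - P(A and B) and then checks the answer by counting cards directly.

```r
# build a 52-card deck: every rank paired with every suit
deck <- expand.grid(rank = c("A", 2:10, "J", "Q", "K"),
                    suit = c("spade", "heart", "diamond", "club"))

p_spade <- mean(deck$suit == "spade")                      # 13/52
p_ace   <- mean(deck$rank == "A")                          # 4/52
p_both  <- mean(deck$suit == "spade" & deck$rank == "A")   # 1/52

# addition rule: subtract the overlap (the ace of spades)
p_rule <- p_spade + p_ace - p_both

# direct count of cards that are a spade OR an ace
p_direct <- mean(deck$suit == "spade" | deck$rank == "A")

c(p_rule, p_direct)   # both equal 16/52
```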
Consider influenza epidemics for two-parent heterosexual families.
Suppose that the probability is 15% that at least one of the parents has contracted the disease. The probability that the father has influenza is 10% and the probability that the mother has influenza is 9%.
What is the probability that both parents contracted influenza?
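One way to work this out is to rearrange the addition rule, P(A or B) = P(A) + P(B) - P(A and B), and solve for the overlap. The variable names below are illustrative.

```r
p_either <- 0.15   # at least one parent has influenza
p_father <- 0.10
p_mother <- 0.09

# P(both) = P(father) + P(mother) - P(at least one)
p_both <- p_father + p_mother - p_either
p_both   # 0.04
```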
Use sample() to create vectors A & B, each containing at least 20 unique numbers. Only use numbers >= 0 and <= 50.
Find the union and intersection for A & B. Hint:
?union
?intersect
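A possible sketch of this exercise follows. It relies on `sample()` drawing without replacement by default, so each vector holds 20 unique values. The seed is arbitrary.

```r
set.seed(2015)   # arbitrary seed for repeatability

# 20 unique integers each, drawn from 0 to 50
A <- sample(0:50, 20)
B <- sample(0:50, 20)

union(A, B)       # values that appear in A or B (or both)
intersect(A, B)   # values that appear in both A and B
```

Because A and B each contain unique values, `length(union(A, B)) + length(intersect(A, B))` should equal 40.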
Can be discrete / categorical (binary) vs. continuous
- coin flip (D)
- die roll (D)
- web site traffic (D)
- BMI (C)
- BMI category (D)
- IQ (C)
A PMF evaluated at a value corresponds to the probability that a discrete random variable takes that value
Rules:
Example: 6-sided die
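As a sketch of the die example, assuming a fair die: the code below checks the two standard PMF requirements, namely that every probability is non-negative and that the probabilities sum to 1. The object names are illustrative.

```r
die <- 1:6

# PMF of a fair die: each face has probability 1/6
pmf <- rep(1/6, length(die))
names(pmf) <- die

all(pmf >= 0)                    # every probability is non-negative
isTRUE(all.equal(sum(pmf), 1))   # probabilities sum to 1

# the PMF also answers event questions, e.g. P(odd)
p_odd <- sum(pmf[die %% 2 == 1])
p_odd   # 0.5
```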
PDF: describes the relative likelihood of values of a continuous random variable (the probability of any single specific value is 0)
Rules:
Areas under PDFs correspond to probabilities for that random variable
Suppose the proportion of calls that are answered in a call center can be represented by f(x) = 2x for 0 < x < 1, and 0 otherwise. Is this a valid PDF?
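One way to check validity numerically is sketched below: a PDF must be non-negative everywhere and integrate to 1. The function name `f` and the evaluation grid are illustrative choices.

```r
# density from the call center example
f <- function(x) ifelse(x > 0 & x < 1, 2 * x, 0)

# non-negative on a grid of test points
all(f(seq(-1, 2, by = 0.01)) >= 0)

# total area under the curve, numerically
total_area <- integrate(f, lower = 0, upper = 1)$value
total_area   # approximately 1
```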
What is the probability that 75% or fewer calls get addressed?
# find area of blue triangle
1.5 * 0.75/2
## [1] 0.5625
# find probability of this outcome
pbeta(0.75, 2, 1)
## [1] 0.5625
CDF of a random variable X returns the probability that X is <= the value x.
F(x) = P(X <= x)
Can be applied to discrete or continuous variables
Probability that random variable X is > the value of x.
S(x) = P(X > x)
S(x) = 1 - F(x)
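Applied to the call center example, where the density f(x) = 2x corresponds to a Beta(2, 1) distribution, the survival function can be sketched in two equivalent ways. The choice of x = 0.75 is for illustration only.

```r
x <- 0.75

# S(x) = 1 - F(x)
s_complement <- 1 - pbeta(x, 2, 1)

# same quantity, asking pbeta() for the upper tail directly
s_upper <- pbeta(x, 2, 1, lower.tail = FALSE)

c(s_complement, s_upper)   # both 0.4375
```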
Use pbeta() to calculate the probability that at most 40%, 50%, and 60% of calls are answered on a given day.
## [1] 0.16 0.25 0.36
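The output above is consistent with a call like the following, which passes all three proportions to `pbeta()` at once. For this density the CDF is F(x) = x^2, so squaring the proportions gives the same values.

```r
props <- c(0.4, 0.5, 0.6)

# P(X <= x) for each proportion under Beta(2, 1)
p_cdf <- pbeta(props, shape1 = 2, shape2 = 1)
p_cdf

# closed form for f(x) = 2x: F(x) = x^2
props^2
```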
Using R to approximate quantiles:
qbeta(0.5, 2, 1)
## [1] 0.7071068
This tells us that for the right triangle with base 1 and height 2, there is a 50% chance of answering about 70.7% of the calls or fewer on a given day
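As a cross-check, the median can also be found by hand: for f(x) = 2x the CDF is F(x) = x^2, so solving x^2 = p gives x = sqrt(p). The sketch below compares that closed form with `qbeta()`.

```r
p <- 0.5

# closed form: solve x^2 = p
q_closed <- sqrt(p)

# quantile from the Beta(2, 1) distribution
q_beta <- qbeta(p, 2, 1)

c(q_closed, q_beta)    # both about 0.7071068

# plugging the quantile back into the CDF recovers p
pbeta(q_beta, 2, 1)
```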
Using qbeta(), determine the percent likelihood of answering calls for a given day with the parameters below:
xkcd: lightning and statistics
http://californiadiver.com/were-not-on-the-menu-california-shark-attacks-down-91-in-past-60-years/
"Abalone divers may be at the greatest risk, statistically. The Stanford study shows abalone diving creates the greatest exposure to shark incidents, followed by surfing, scuba diving and swimming. In 2013, the chances of a shark attack on an abalone diver were one in 1.44 million. For surfers, the chances were one in 17 million, and for scuba divers, one in 136 million. Swimmers had the lowest chance of being attacked by a shark, with one attack for every 738 million beach visits."
P(A | B): P(A) given B has occurred
\[ P(A ~|~ B) = \frac{P(A \cap B)}{P(B)} \]
If A and B are independent, then:
\[ P(A ~|~ B) = \frac{P(A)P(B)}{P(B)} = P(A) \]
A: roll 1
B: roll odd
What is the probability of rolling a 1? Of rolling an odd number?
P(A) = 1/6
P(B) = 1/2
If you know the roll was an odd number, what is the probability the roll was 1?
\[ P(A ~|~ B) = \frac{P(A \cap B)}{P(B)} = \frac{P(A)}{P(B)} \]
(rolling a 1 implies rolling an odd number, so \(A \cap B = A\))
\[ = \frac{1/6}{3/6} = \frac{1}{3} \]
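The same result can be reproduced by enumerating the six equally likely faces of a fair die. This is a sketch with illustrative variable names, not a simulation.

```r
die <- 1:6

A <- die == 1         # event A: roll a 1
B <- die %% 2 == 1    # event B: roll an odd number

p_A_and_B <- mean(A & B)   # 1/6
p_B       <- mean(B)       # 3/6

# conditional probability P(A | B)
p_A_given_B <- p_A_and_B / p_B
p_A_given_B   # 1/3
```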
Useful tool for calculating conditional probabilities
\[ P(B ~|~ A) = \frac{P(A ~|~ B) P(B)}{P(A ~|~ B) P(B) + P(A ~|~ B^c)P(B^c)} \]
Two-Face has one unfair (heads on both sides) and one fair (heads & tails) coin in his pocket.
He takes one coin from his pocket at random, tosses it, and obtains a heads.
What is the probability that Two-Face flipped the fair coin?
\[ P(FC ~|~ H) = \frac{P(H ~|~ FC) P(FC)}{P(H ~|~ FC) P(FC) + P(H ~|~ UFC)P(UFC)} \]
\[ = \frac{.5 \times .5}{.5 \times .5 + 1 \times .5} \]
\[ = \frac{.25}{.25 + .5} = \frac{.25}{.75} = \frac{1}{3} \]
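The same calculation can be written out in R. The variable names are illustrative, with FC the fair coin and UFC the unfair (two-headed) coin.

```r
p_fc  <- 0.5   # P(FC): fair coin is drawn
p_ufc <- 0.5   # P(UFC): unfair coin is drawn

p_h_given_fc  <- 0.5   # P(H | FC)
p_h_given_ufc <- 1     # P(H | UFC): both sides are heads

# Bayes' rule: P(FC | H)
p_fc_given_h <- (p_h_given_fc * p_fc) /
  (p_h_given_fc * p_fc + p_h_given_ufc * p_ufc)
p_fc_given_h   # 1/3
```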
Web demo: http://www.math.ucsd.edu/~crypto/Monty/monty.html
R demo: http://math.ucsd.edu/~crypto/cgi-bin/MontyKnows/monty2?0+4054
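Separately from the linked demos, a rough simulation sketch is given below. It assumes the standard three-door rules: the host always opens an unpicked door hiding a goat, so switching wins exactly when the first pick was wrong. The seed, the number of games, and the variable names are arbitrary.

```r
set.seed(42)   # arbitrary seed
n <- 10000     # number of simulated games

prize <- sample(1:3, n, replace = TRUE)   # door hiding the prize
pick  <- sample(1:3, n, replace = TRUE)   # contestant's first pick

# staying wins when the first pick was right;
# switching wins when the first pick was wrong
stay_wins   <- pick == prize
switch_wins <- pick != prize

c(stay = mean(stay_wins), switch = mean(switch_wins))
```

The win rates should come out near 1/3 for staying and 2/3 for switching.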
\(D^c\) : subject does not have disease
Specificity: \(P(- ~|~ D^c)\)
Prevalence: \(P(D)\)
\[ P(D ~|~ +) = \frac{P(+~|~D)P(D)}{P(+~|~D)P(D) + P(+~|~D^c)P(D^c)} \]
\[ = \frac{P(+~|~D)P(D)}{P(+~|~D)P(D) + \{1-P(-~|~D^c)\}\{1 - P(D)\}} \]
\[ = \frac{.997\times .001}{.997 \times .001 + .015 \times .999} = .062 \]
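The arithmetic above can be reproduced in R. The values are read off the equation: P(+ | D) = .997, specificity 1 - .015 = .985, and prevalence .001. The variable names are illustrative.

```r
sens <- 0.997   # P(+ | D)
spec <- 0.985   # P(- | D^c), so P(+ | D^c) = 0.015
prev <- 0.001   # P(D)

# P(D | +) via Bayes' rule
p_d_given_pos <- (sens * prev) /
  (sens * prev + (1 - spec) * (1 - prev))
round(p_d_given_pos, 3)   # 0.062
```

Even with a highly accurate test, the low prevalence keeps the probability of disease given a positive result small.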
\(A^c\) is independent of \(B\)
\(A\) is independent of \(B^c\)
\(A^c\) is independent of \(B^c\)
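Assuming the premise for these statements is that \(A\) is independent of \(B\), that is \(P(A \cap B) = P(A)P(B)\), the first one can be verified directly:

\[ P(A^c \cap B) = P(B) - P(A \cap B) = P(B) - P(A)P(B) = \{1 - P(A)\}P(B) = P(A^c)P(B) \]

The other two follow from the same argument with the roles of the events swapped.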
What is the probability of getting two consecutive heads?
Steph Curry is currently shooting about 94% from the free throw line.
Assuming his free throw attempts are independent events, what is the probability of Steph making 10 free throws in a row?
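Under the stated independence assumption, the probability of 10 makes in a row is the per-shot probability multiplied by itself 10 times. A minimal sketch:

```r
p_make <- 0.94   # per-shot probability from the text

# independent events: multiply the probabilities
p_ten_in_a_row <- p_make^10
p_ten_in_a_row   # about 0.539
```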
Come up with two examples of: