Part 1 Types of Distribution

pdf & cdf

  • Probability Density Functions (pdfs):

    • For a continuous random variable, the area under the pdf curve between two bounds is the probability that the variable falls within those bounds.
  • Cumulative Distribution Functions (cdfs):

    • The cdf gives the probability that the random variable X is less than or equal to x, that is, \(F(x) = P(X \le x)\).
  • Example:

    • Take a 6-sided die, for example: the probability of each side is \(\frac{1}{6}\).

      • Because a die roll is discrete, the counterpart of the pdf is the probability mass function (pmf). Its value at 1 is the probability of landing on 1, which is \(\frac{1}{6}\) or 16.67%. Its values at 2, 3, 4, 5, and 6 are all the same, \(\frac{1}{6}\).

      • The cumulative distribution at 1 is the probability that the next roll will take a value less than or equal to 1, which is \(\frac{1}{6}\) or 16.67% because the only way to get that is to throw a 1. The cumulative distribution at 2 is \(\frac{1}{6} + \frac{1}{6} = \frac{1}{3}\) or 33.33%, as there are two ways of rolling a 2 or below.

  • PDF & CDF Graphing Examples

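    # Graph pdf of the standard normal distribution
    x <- seq(from = -3, to = 3, length.out = 100)
    plot(x, dnorm(x))

    # Graph cdf of the standard normal distribution
    library(ggplot2)
    plot_range <- data.frame(x = c(-3, 3))  # avoids masking base::sample

    ggplot(plot_range, aes(x = x)) +
      stat_function(fun = pnorm)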
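
The die example above can also be checked numerically. The following is a minimal sketch in base R; the variable names `faces`, `pmf`, and `cdf` are illustrative and assume a fair die.

```r
# Fair six-sided die (assumed): each face has probability 1/6
faces <- 1:6
pmf <- rep(1 / 6, length(faces))  # P(X = k) for k = 1, ..., 6
cdf <- cumsum(pmf)                # P(X <= k) for k = 1, ..., 6

# One row per face, with its point probability and cumulative probability
data.frame(face = faces, pmf = pmf, cdf = cdf)
```

The second entry of `cdf` is 1/3, matching the hand calculation, and the last entry is 1 because every roll is 6 or below.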

Normal Distribution (Gaussian Distribution)

  • Description: A distribution that is symmetric about the mean of the data set; values closer to the mean occur more frequently than values further away from the mean.

  • Parameters: The mean (\(\mu\)) and the standard deviation (\(\sigma\)).

    • About 68% of the data are within +/- 1 standard deviation of the mean

    • About 95% of the data are within +/- 2 standard deviations of the mean

    • About 99.7% of the data are within +/- 3 standard deviations of the mean

  • Examples: individual testosterone levels, SAT scores, shoe sizes, birth weights…
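
The 68 / 95 / 99.7 percentages can be reproduced from the standard normal cdf. This is a small sketch using base R's `pnorm`; the names `k` and `coverage` are illustrative.

```r
# Probability of falling within k standard deviations of the mean,
# computed on the standard normal (mean 0, sd 1)
k <- 1:3
coverage <- pnorm(k) - pnorm(-k)

round(coverage, 4)
## [1] 0.6827 0.9545 0.9973
```

Because any normal distribution can be standardized, the same proportions hold for every choice of \(\mu\) and \(\sigma\).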

Binomial

  • Description: A statistical distribution that summarizes the probability of observing a certain number of successes when performing a series of trials for which there are only two possible outcomes.

  • Parameters: The number of observations (n) and the success rate (p). The model assumes that the total number of observations is fixed, each observation is independent, and each observation can only represent one of two outcomes.

  • Example: a coin flip, or any result that can be answered with true/false.
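
As an illustration of the coin-flip example, the sketch below uses base R's `dbinom` and `pbinom`. The choice of 10 flips of a fair coin is an assumption made for the example, not a value from these notes.

```r
# Hypothetical setup: 10 independent flips of a fair coin
n_flips <- 10
p_heads <- 0.5

# P(exactly 5 heads): the binomial pmf
p_exactly_5 <- dbinom(5, size = n_flips, prob = p_heads)

# P(at most 5 heads): the binomial cdf
p_at_most_5 <- pbinom(5, size = n_flips, prob = p_heads)

c(exactly_5 = p_exactly_5, at_most_5 = p_at_most_5)
```

Here `dbinom` plays the role of the pmf and `pbinom` the role of the cdf described in Part 1.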

Poisson

  • Description: A probability distribution that describes how many times an event is likely to occur over a specified period of time. It is a discrete distribution, which means the variable cannot take every value in a continuous range; it takes only whole-number counts such as 0, 1, 2, 3, 4…

  • Parameters: The average number of events per interval (\(\lambda t\)), where \(\lambda\) is the event rate and t is the length of the interval.

  • Example: The number of cars that pass through an intersection during one traffic-light cycle.
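
To make the traffic-light example concrete, the sketch below uses base R's `dpois` and `ppois`. The rate of 0.2 cars per second and the 60-second cycle are hypothetical numbers chosen only for illustration.

```r
# Hypothetical values: 0.2 cars per second over a 60-second light cycle
rate <- 0.2                      # lambda, events per second (assumed)
cycle_seconds <- 60              # t, length of the interval (assumed)
mean_cars <- rate * cycle_seconds  # lambda * t, expected cars per cycle

# P(exactly 10 cars in one cycle): the Poisson pmf
p_exactly_10 <- dpois(10, lambda = mean_cars)

# P(at most 10 cars in one cycle): the Poisson cdf
p_at_most_10 <- ppois(10, lambda = mean_cars)

c(mean = mean_cars, exactly_10 = p_exactly_10, at_most_10 = p_at_most_10)
```

Only the product \(\lambda t\) enters the calculation, so doubling the rate while halving the interval gives the same probabilities.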

Part 2 Convergence of Distributions

  • We set the following parameters for the problem:

    • Total number of procedures (N): 20

    • Total number of deaths resulting from this procedure (x): 3

    • The "success" rate, which here is the death rate (\(\pi\), named p in the code): 0.45

  • Binomial Model:

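    # Set parameters
    n <- 20    # total number of procedures
    x <- 3     # number of deaths observed
    p <- 0.45  # death rate

    # P(X > x) under Binomial(n, p)
    pbinom(x, n, p, lower.tail = FALSE)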
    ## [1] 0.9950666
  • Poisson Model:

    # Apply the Poisson model with lambda = n * p
    # P(X > x) under Poisson(lambda); q is the observed count x
    ppois(
          q = x,
          lambda = n * p,
          lower.tail = FALSE,
          log.p = FALSE
    )
    ## [1] 0.9787735
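
The two models agree only approximately here because n = 20 is small and the rate 0.45 is large. As a rough sketch of the convergence, the code below holds the mean \(\lambda = n \cdot p = 9\) fixed, lets n grow, and compares the binomial tail probability P(X > 3) with the Poisson one. The grid `n_values` is a hypothetical choice for illustration.

```r
# Mean and observed count taken from the parameters above
lambda <- 20 * 0.45   # n * p = 9
x <- 3

# Hypothetical grid of sample sizes; p shrinks so that n * p stays at lambda
n_values <- c(20, 200, 2000, 20000)
p_values <- lambda / n_values

# P(X > x) under each Binomial(n, p) and under Poisson(lambda)
binom_tail <- pbinom(x, size = n_values, prob = p_values, lower.tail = FALSE)
pois_tail <- ppois(x, lambda = lambda, lower.tail = FALSE)

data.frame(
  n = n_values,
  p = p_values,
  binomial = binom_tail,
  poisson = pois_tail,
  abs_diff = abs(binom_tail - pois_tail)
)
```

The `abs_diff` column shrinks as n increases, which is the sense in which the binomial distribution converges to the Poisson distribution when n is large and p is small.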