Cumulative distributions

This document demonstrates how to calculate cumulative distribution functions by hand and with the base R function cumsum.

##Example 1: Three coins

Here X is the random variable for the number of heads that appear after tossing 3 fair coins.

three_coins <- data.frame(x = c(0, 1, 2, 3),
                          px = c(1/8, 3/8, 3/8, 1/8))
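As a quick sanity check, these probabilities can be compared against the binomial distribution, since the number of heads in 3 fair tosses is Binomial(3, 0.5). The sketch below uses base R's dbinom; the names coin_check and px_binom are illustrative only and are not used elsewhere in this document.

```r
# illustrative check: the number of heads in 3 fair tosses is Binomial(3, 0.5)
coin_check <- data.frame(x = 0:3,
                         px = c(1/8, 3/8, 3/8, 1/8))
px_binom <- dbinom(coin_check$x, size = 3, prob = 0.5)

# compare the hand-written probabilities with dbinom (within numerical tolerance)
print(isTRUE(all.equal(coin_check$px, px_binom)))
# a valid probability mass function must sum to 1
print(isTRUE(all.equal(sum(coin_check$px), 1)))
```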

Now we build the cumulative distribution function, first by hand.

# initialise the cdf column with the first probability (only row 1 keeps this value)
three_coins$cdf <- three_coins$px[1]
# loop over rows i = 2, 3, 4, adding each probability to the running total
for (i in seq(2, nrow(three_coins))) {
  three_coins$cdf[i] <- three_coins$cdf[i - 1] + three_coins$px[i]
}

# plot the result
plot(three_coins$x, three_coins$cdf)

Now we use the base R function cumsum to calculate the cumulative distribution function.

# calculate the cdf using the function
three_coins$cdf2 <- cumsum(three_coins$px)

# plot this identical version
plot(three_coins$x, three_coins$cdf2)

# you can check they are identical by uncommenting the following
#View(three_coins)
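The comparison can also be done programmatically rather than by eye. The following is a minimal, self-contained sketch that rebuilds both versions and compares them with all.equal, which allows for small floating-point differences; the names px, cdf_loop and cdf_cumsum are illustrative only.

```r
# illustrative, self-contained comparison of the loop and cumsum versions
px <- c(1/8, 3/8, 3/8, 1/8)

# running total by hand
cdf_loop <- numeric(length(px))
cdf_loop[1] <- px[1]
for (i in 2:length(px)) {
  cdf_loop[i] <- cdf_loop[i - 1] + px[i]
}

# running total with cumsum
cdf_cumsum <- cumsum(px)

# all.equal compares within a small numerical tolerance
print(isTRUE(all.equal(cdf_loop, cdf_cumsum)))
```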

##Example 2: Uniform distribution

Here X is a random variable that takes each of 20 integer values with equal probability.

# create data for the distribution function
n_values <- 20
uniform_df <- data.frame(x = seq(1, n_values), px = 1 / n_values)

# calculate the cdf using the function
uniform_df$cdf <- cumsum(uniform_df$px)

# plot the result
plot(uniform_df$x, uniform_df$cdf)
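For a discrete uniform distribution on the integers 1 to n, the cdf has the closed form F(k) = k / n, so the plotted points should lie on a straight line. The sketch below checks the cumsum result against that formula; the names n_check, k, cdf_formula and cdf_running are illustrative only.

```r
# illustrative check: for a discrete uniform on 1..n, F(k) = k / n
n_check <- 20
k <- seq(1, n_check)

# closed-form cdf
cdf_formula <- k / n_check
# cdf as a running total of equal probabilities
cdf_running <- cumsum(rep(1 / n_check, n_check))

# the two should agree within numerical tolerance
print(isTRUE(all.equal(cdf_formula, cdf_running)))
```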

##Example 3: Random distribution

Here X is a random variable over 20 integer values whose probabilities are drawn at random. Note that the probabilities must be normalized to sum to 1 before calculating the cdf.

# create data for the distribution function
n_values <- 20
random_df <- data.frame(x = seq(1, n_values), px = runif(n_values))
# normalize
random_df$px <- random_df$px / sum(random_df$px)

# calculate the cdf using the function
random_df$cdf <- cumsum(random_df$px)

# plot the result
plot(random_df$x, random_df$cdf)
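Whatever the random draw, a valid cdf should never decrease and should end at 1. The following sketch checks both properties on a fresh draw; the seed value and the names px_raw, px_norm and cdf_check are arbitrary choices made for this illustration, not part of the example above.

```r
set.seed(1)  # arbitrary seed, only so this sketch is reproducible
n_check <- 20
px_raw <- runif(n_check)

# normalize so the probabilities sum to 1
px_norm <- px_raw / sum(px_raw)
cdf_check <- cumsum(px_norm)

# the cdf should never decrease
print(all(diff(cdf_check) >= 0))
# the final value should be 1, up to floating-point rounding
print(isTRUE(all.equal(cdf_check[n_check], 1)))
```

Skipping the normalization step would leave the final cdf value equal to sum(px_raw), which is generally not 1.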