This document demonstrates how to calculate cumulative distribution functions (cdfs) for discrete random variables, first manually and then using the base R function cumsum.
## Example 1: Three coins
Here X is the random variable for the number of heads that appear after tossing 3 fair coins.
three_coins <- data.frame(x = c(0, 1, 2, 3),
px = c(1/8, 3/8, 3/8, 1/8))
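The probabilities above are those of a binomial distribution with 3 trials and success probability 1/2. As a quick cross-check, here is a minimal sketch that assumes the base R function dbinom (not used elsewhere in this document) to reproduce them; the variable names px_manual and px_binom are illustrative only.

```r
# probabilities entered by hand, as in the data frame above
px_manual <- c(1/8, 3/8, 3/8, 1/8)
# binomial probabilities of 0, 1, 2, 3 heads in 3 fair tosses
px_binom <- dbinom(0:3, size = 3, prob = 0.5)
# TRUE if the two vectors agree up to numerical tolerance
isTRUE(all.equal(px_manual, px_binom))
```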
Now we build the cumulative distribution function, first by hand. Each cdf value is the previous cdf value plus the probability at the current point.
# initialise every entry to the first probability value (only entry i = 1 keeps it)
three_coins$cdf <- three_coins$px[1]
# loop over i = 2, 3, 4, adding each probability to the running total
for (i in seq(2, nrow(three_coins))) {
  three_coins$cdf[i] <- three_coins$cdf[i - 1] + three_coins$px[i]
}
# plot the result
plot(three_coins$x, three_coins$cdf)
Now we use the base R function cumsum to calculate the cumulative distribution function.
# calculate the cdf using the function
three_coins$cdf2 <- cumsum(three_coins$px)
# plot this identical version
plot(three_coins$x, three_coins$cdf2)
# you can check they are identical by uncommenting the following
#View(three_coins)
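Instead of inspecting the data frame by eye, the two columns can also be compared programmatically. The following is a minimal sketch; it rebuilds the data frame so that it runs on its own, and it uses all.equal rather than == so that tiny floating-point differences are tolerated. The name cdfs_match is illustrative only.

```r
three_coins <- data.frame(x = c(0, 1, 2, 3),
                          px = c(1/8, 3/8, 3/8, 1/8))
# cdf by hand: running total built with a loop
three_coins$cdf <- three_coins$px[1]
for (i in seq(2, nrow(three_coins))) {
  three_coins$cdf[i] <- three_coins$cdf[i - 1] + three_coins$px[i]
}
# cdf with cumsum
three_coins$cdf2 <- cumsum(three_coins$px)
# TRUE if both versions agree up to numerical tolerance
cdfs_match <- isTRUE(all.equal(three_coins$cdf, three_coins$cdf2))
cdfs_match
```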
## Example 2: Uniform distribution
Here X is a random variable that takes each of 20 integer values with equal probability.
# create data for the distribution function
n_values <- 20
uniform_df <- data.frame(x = seq(1, n_values), px = 1 / n_values)
# calculate the cdf using the function
uniform_df$cdf <- cumsum(uniform_df$px)
# plot the result
plot(uniform_df$x, uniform_df$cdf)
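For a uniform distribution on the integers 1 to n, the cdf has a simple closed form: the cumulative probability at x = k is k / n, which is why the plot is a straight line. A minimal sketch of that check, rebuilding the data frame so it runs on its own (expected_cdf and uniform_ok are illustrative names):

```r
n_values <- 20
uniform_df <- data.frame(x = seq(1, n_values), px = 1 / n_values)
uniform_df$cdf <- cumsum(uniform_df$px)
# closed form for the uniform cdf: P(X <= k) = k / n
expected_cdf <- uniform_df$x / n_values
# TRUE if cumsum agrees with the closed form up to numerical tolerance
uniform_ok <- isTRUE(all.equal(uniform_df$cdf, expected_cdf))
uniform_ok
```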
## Example 3: Random distribution
Here X is a random variable over 20 integer values whose probabilities are drawn at random. Note that the raw draws must be normalized to sum to 1 before calculating the cdf; otherwise they are not valid probabilities and the cdf would not end at 1.
# create data for the distribution function
n_values <- 20
random_df <- data.frame(x = seq(1, n_values), px = runif(n_values))
# normalize
random_df$px <- random_df$px / sum(random_df$px)
# calculate the cdf using the function
random_df$cdf <- cumsum(random_df$px)
# plot the result
plot(random_df$x, random_df$cdf)
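Because runif draws new values on every run, the plot will differ each time. The sketch below assumes a call to set.seed (the seed value 42 is arbitrary and not part of the original) to make the draw reproducible, and then checks the properties any valid cdf must have: the probabilities sum to 1, the cdf never decreases, and it ends at 1. The names of the three check variables are illustrative only.

```r
set.seed(42)  # arbitrary seed, used only for reproducibility
n_values <- 20
random_df <- data.frame(x = seq(1, n_values), px = runif(n_values))
# normalize so the probabilities sum to 1
random_df$px <- random_df$px / sum(random_df$px)
random_df$cdf <- cumsum(random_df$px)
# sanity checks on the result
sums_to_one <- isTRUE(all.equal(sum(random_df$px), 1))
non_decreasing <- all(diff(random_df$cdf) >= 0)
ends_at_one <- isTRUE(all.equal(random_df$cdf[n_values], 1))
c(sums_to_one, non_decreasing, ends_at_one)
```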