Binomial_Simulation

Demonstration on simulating random numbers from the binomial distribution using the Julia programming language.

using Distributions  # For the Uniform distribution (used in some methods)
using Plots # For plotting (optional, but recommended)

# Method 1: Using the built-in rand() function (Easiest and most efficient)

n = 10  # Number of trials
p = 0.3  # Probability of success

# Generate a single binomial random number
binomial_rn = rand(Binomial(n, p))
println("Single Binomial random number (Method 1): ", binomial_rn)

# Generate an array of binomial random numbers
num_samples = 1000
binomial_rns = rand(Binomial(n, p), num_samples)

# Verify mean and variance (optional)
mean_simulated = mean(binomial_rns)
var_simulated = var(binomial_rns)
println("Mean of simulated (Method 1): ", mean_simulated)
println("Variance of simulated (Method 1): ", var_simulated)


# Method 2: Using the definition (sum of Bernoulli trials)

function generate_binomial_rvs_bernoulli(n, p, num_samples)
    binomial_rvs = zeros(Int, num_samples)  # Preallocate for efficiency
    for i in 1:num_samples
        num_successes = 0
        for j in 1:n
            if rand() < p  # Simulate a Bernoulli trial
                num_successes += 1
            end
        end
        binomial_rvs[i] = num_successes
    end
    return binomial_rvs
end

binomial_rns_bernoulli = generate_binomial_rvs_bernoulli(n, p, num_samples)

# Verify mean and variance (optional)
mean_simulated_bernoulli = mean(binomial_rns_bernoulli)
var_simulated_bernoulli = var(binomial_rns_bernoulli)
println("Mean of simulated (Method 2): ", mean_simulated_bernoulli)
println("Variance of simulated (Method 2): ", var_simulated_bernoulli)


# Method 3: Using the inverse transform method (Less efficient for Binomial)

# Note: The inverse transform method is generally *less* efficient for the binomial distribution
# than the other methods, but I include it for completeness and because it's useful for other
# discrete distributions.

function generate_binomial_rvs_inverse(n, p, num_samples)
    binomial_rvs = zeros(Int, num_samples)
    cumulative_probabilities = zeros(n + 1) # Cumulative probabilities for 0 to n successes

    # Calculate and store cumulative probabilities *once* (important for efficiency)
    for k in 0:n
        cumulative_probabilities[k + 1] = cdf(Binomial(n, p), k) # Using Distributions.cdf
    end

    for i in 1:num_samples
        u = rand()
        k = 0
        while k < n && u > cumulative_probabilities[k + 1]
            k += 1
        end
        binomial_rvs[i] = k
    end
    return binomial_rvs
end

binomial_rns_inverse = generate_binomial_rvs_inverse(n, p, num_samples)

# Verify mean and variance (optional)
mean_simulated_inverse = mean(binomial_rns_inverse)
var_simulated_inverse = var(binomial_rns_inverse)
println("Mean of simulated (Method 3): ", mean_simulated_inverse)
println("Variance of simulated (Method 3): ", var_simulated_inverse)


# Plotting a histogram (optional)
histogram(binomial_rns, bins = -0.5:1:n+0.5,  xlabel="Binomial Outcome", ylabel="Frequency",
          title="Histogram of Binomial Random Variables", normalize=true, label="Simulated")

# Plot the PMF as points (optional, for comparison)
x_vals = 0:n
pmf_vals = pdf.(Binomial(n, p), x_vals) # Using Distributions.pdf and broadcasting
plot!(x_vals, pmf_vals, seriestype = :scatter, label = "PMF", markersize = 5)

savefig("binomial_histogram.png")



# --- Key Improvements and Explanations ---

# 1. Three Methods: The code demonstrates three ways to generate binomial random numbers:
#    - Using the built-in `rand(Binomial(n, p))` (most efficient).
#    - Simulating Bernoulli trials (conceptually clear).
#    - Using the inverse transform method (less efficient for Binomial, but useful to know).

# 2. Efficiency: Preallocation of arrays (`binomial_rvs`) is used for efficiency, especially important for large sample sizes.

# 3. Verification: The code calculates and prints the mean and variance of the generated random numbers to verify that they are close to the expected theoretical values (n*p and n*p*(1-p) respectively).

# 4. Plotting: The plotting code includes both a histogram of the generated samples and the PMF for direct visual comparison. The `xticks` are set to correctly represent the discrete nature of the data. Bins are adjusted to center on integers.

# 5. Clear Comments: The comments explain the different methods and important points.

# 6. Use of Distributions package functions: The code uses `cdf` and `pdf` from the `Distributions` package to calculate and plot the PMF.  This is the recommended way to work with probability distributions in Julia.

Key Points and Recommendations:

Use rand(Binomial(n, p)): For generating binomial random numbers in Julia, the most efficient and recommended method is to use the built-in rand(Binomial(n, p)) function. It’s optimized for this purpose.
Inverse Transform (Generally Less Efficient for Binomial): While the inverse transform method is shown for completeness, it’s generally not the most efficient way to generate binomial random numbers. It’s more useful for other discrete distributions where direct methods might not be available.
Bernoulli Trials: Simulating the binomial distribution as a sum of Bernoulli trials is conceptually clear and can be useful for understanding the underlying process. However, it’s not as efficient as the built-in rand() function.
Plotting: Always plot a histogram of your generated random numbers and compare it to the PMF to verify that the simulation is working correctly. This is a vital step in any simulation.