The Law of Large Numbers (LLN)

Random Number Generation, Sampling, and Simulation with Julia

The Law of Large Numbers (LLN) is a fundamental concept in probability and statistics. It essentially states that as the number of trials or observations increases, the average of the results will converge towards the expected value or true mean. In simpler terms, the more data you have, the more reliable your sample average becomes.

Demonstration

Here’s how we can illustrate the LLN in the context of the Julia programming language with a practical example:

using Plots # For plotting (optional, but highly recommended for visualization)

# Simulate rolling a fair six-sided die

function roll_dice(n_rolls)
    return rand(1:6, n_rolls) # Generates n_rolls random integers between 1 and 6 (inclusive)
end

n_values = [10, 100, 1000, 10000, 100000, 1000000] # Different numbers of rolls

average_rolls = zeros(length(n_values))

for (i, n) in enumerate(n_values)
    rolls = roll_dice(n)
    average_rolls[i] = mean(rolls) # Calculate the average of the rolls
end

# Expected value of a fair six-sided die is 3.5
expected_value = 3.5

# Plotting the results
plot(n_values, average_rolls, 
     xlabel="Number of Rolls (n)", 
     ylabel="Average Roll", 
     title="Law of Large Numbers Demonstration",
     xscale=:log10, # Log scale for x-axis to better visualize the convergence
     label="Simulated Average",
     linewidth=2)

hline!([expected_value], label="Expected Value (3.5)", color=:red, linestyle=:dash) # Horizontal line for expected value

savefig("law_of_large_numbers.png")

println("Number of Rolls | Average Roll")
for (i, n) in enumerate(n_values)
  println("$(n)\t\t\t| $(average_rolls[i])")
end

Explanation and how it relates to Julia:

  1. Simulation: The roll_dice function simulates rolling a fair six-sided die multiple times using rand(1:6, n_rolls). This generates an array of n_rolls random integers between 1 and 6, representing the outcomes of the dice rolls.

  2. Averaging: The code then calculates the average of the dice rolls using the mean() function. This is the sample mean.

  3. Varying Sample Size: The n_values array holds different numbers of rolls (from small to large). The loop iterates through these values, calculating the average for each sample size.

  4. Convergence: As you can see from the plot (and the printed output), as the number of rolls (n) increases, the calculated average gets closer and closer to the expected value of a fair die roll, which is 3.5. This demonstrates the Law of Large Numbers: the sample average converges to the true expected value as the sample size grows.

  5. Julia’s Role: Julia’s ability to efficiently generate random numbers and perform numerical calculations (like mean()) makes it well-suited for demonstrating the LLN. The Plots.jl package allows for easy visualization of the convergence. The logarithmic scale on the x-axis helps to visualize the convergence across orders of magnitude.

Key takeaway: The Law of Large Numbers is a cornerstone of statistical inference. It justifies the use of sample averages to estimate population means, especially when dealing with large datasets. In Julia, we can easily simulate and visualize this principle, gaining a deeper understanding of its implications.