1 Introduction

In probability theory, the law of large numbers (LLN) is a theorem that describes the result of performing the same experiment a large number of times. According to the law, the average of the results obtained from a large number of trials should be close to the expected value, and will tend to become closer as more trials are performed.

The LLN is a fundamental theorem in probability theory and statistics: it describes how the sample mean of a random variable behaves as the sample size increases.

1.1 There are two main versions of the Law of Large Numbers:

  1. Weak Law of Large Numbers (WLLN): The Weak Law of Large Numbers states that as the sample size (the number of observations or trials) increases, the sample mean (average) of those observations converges in probability to the expected value (mean) of the random variable. In other words, as you collect more data, the probability that the sample mean differs from the true population mean by more than any fixed margin shrinks toward zero.

    • Mathematically, if \(X_1, X_2, \dots, X_n\) are independent and identically distributed random variables with a common mean \(\mu\) and finite variance, then as \(n\) (the sample size) approaches infinity:

      \(\frac{1}{n}(X_1 + X_2 + \dots + X_n)\) converges in probability to \(\mu\).

  2. Strong Law of Large Numbers (SLLN): The Strong Law of Large Numbers is a stronger version of the LLN. It states that as the sample size increases, the sample mean converges almost surely (with probability 1) to the expected value of the random variable. Almost-sure convergence implies convergence in probability, so the SLLN implies the WLLN. It does not mean that the sample mean ever exactly equals the population mean at a finite sample size; rather, with probability 1 the sequence of sample means eventually stays within any fixed margin of the population mean.

    • Mathematically, for the same conditions as in the WLLN, the SLLN states that:

      \(\frac{1}{n}(X_1 + X_2 + \dots + X_n)\) converges almost surely to \(\mu\) as \(n\) approaches infinity.
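
For reference, the two modes of convergence named above can be written out explicitly. The symbols \(\bar{X}_n\) (the sample mean) and \(\varepsilon\) (an arbitrary positive margin) are notation introduced here for brevity; they do not appear elsewhere in this document.

\[
\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i
\]

\[
\text{WLLN:}\quad \lim_{n\to\infty} P\left(\left|\bar{X}_n - \mu\right| > \varepsilon\right) = 0 \quad \text{for every } \varepsilon > 0
\]

\[
\text{SLLN:}\quad P\left(\lim_{n\to\infty} \bar{X}_n = \mu\right) = 1
\]

The weak law is a statement about the distribution of \(\bar{X}_n\) at each fixed \(n\), while the strong law is a statement about the entire sequence of sample means.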

The LLN provides a theoretical basis for many statistical techniques. It underscores the idea that as you collect more and more data, your sample statistics (such as the sample mean) become increasingly reliable estimators of the population parameters (such as the population mean).

2 Example: Heads on a fair coin toss

A fair coin flip should come up heads \(50\%\) of the time.
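
To see why 0.5 is the value the running average should approach, assume heads is coded as 1 and tails as 0 (the coding used in the simulation in this example). For a single flip \(X\) of a fair coin, the expected value and variance then work out as:

\[
E[X] = 0\cdot\tfrac{1}{2} + 1\cdot\tfrac{1}{2} = \tfrac{1}{2},
\qquad
\mathrm{Var}(X) = E[X^2] - \left(E[X]\right)^2 = \tfrac{1}{2} - \tfrac{1}{4} = \tfrac{1}{4}
\]

The variance is finite, so the conditions stated for the WLLN above are met with \(\mu = 0.5\).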

2.1 Create Variables for Observations

  • N represents the total number of observations, or in this example, coin flips.

set.seed() fixes the state of the random number generator so that the example is reproducible. Comment it out to get different results on subsequent runs.

N <- 10000  
set.seed(1963)  # added for the example - comment out for test
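
As a minimal sketch of what the seed does, the snippet below draws the same small sample twice, resetting the seed before each draw. The names first_run and second_run and the sample size of 5 are chosen here for illustration only.

```r
# Illustrative check: resetting the same seed reproduces the same draws.
set.seed(1963)
first_run <- sample(x = 0:1, size = 5, replace = TRUE)

set.seed(1963)
second_run <- sample(x = 0:1, size = 5, replace = TRUE)

identical(first_run, second_run)  # TRUE: both draws started from the same seed
```

Without the second call to set.seed(), the two samples would generally differ.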

2.2 Create Variables for Iterations

Now we will set 3 variables to simulate the coin flips.

  • flip_outcome - stores the sample flips as 0 or 1. The number of flips is set by the value of N defined previously.
  • flip_cumulative_sum - stores a running total of the occurrences of the value “1” (heads, say).
  • running_avg - stores the running average after each flip.

?sample  # sample takes a sample of the specified size from the elements of x, either with or without replacement.

?cumsum # Returns a vector whose elements are the cumulative sums, products, minima or maxima of the elements of the argument.

flip_outcome <- sample(x       = 0:1,   # either a vector of one or more elements from which to choose, or a positive integer.
                       size    = N,     # a non-negative integer giving the number of items to choose.
                       replace = TRUE   # should sampling be with replacement?
                       )

Explore what you just created.

flip_outcome[1:10]
##  [1] 1 0 0 1 0 1 0 1 1 0
flip_cumulative_sum <- cumsum(flip_outcome) 
flip_cumulative_sum[1:10]
##  [1] 1 1 1 2 2 3 3 4 5 5
running_avg <- flip_cumulative_sum/(1:N)
running_avg[1:10]
##  [1] 1.0000000 0.5000000 0.3333333 0.5000000 0.4000000 0.5000000 0.4285714
##  [8] 0.5000000 0.5555556 0.5000000
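# Optional check: the same running average computed with seq_along() instead of 1:N
# running_avg2 <- flip_cumulative_sum / seq_along(flip_outcome)
# all.equal(running_avg, running_avg2)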
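
The first ten values say little about long-run behavior, so it can help to look at the running average at a few widely spaced sample sizes. The sketch below regenerates the flips so that it runs on its own; the checkpoints vector and the deviation data frame are names introduced here for illustration, and the checkpoint values are arbitrary.

```r
# Self-contained sketch: regenerate the flips so the block runs on its own.
set.seed(1963)
N <- 10000
flip_outcome <- sample(x = 0:1, size = N, replace = TRUE)
running_avg  <- cumsum(flip_outcome) / seq_len(N)

# Hypothetical checkpoints, chosen only for illustration.
checkpoints <- c(10, 100, 1000, 10000)

# Distance of the running average from the expected value 0.5 at each checkpoint.
deviation <- data.frame(
  n             = checkpoints,
  running_avg   = running_avg[checkpoints],
  abs_deviation = abs(running_avg[checkpoints] - 0.5)
)
print(deviation)
```

The absolute deviation typically shrinks as n grows, though not necessarily at every step: the LLN describes the long-run tendency, not a monotone decrease.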

3 Store the Means

Assign the running statistics gathered so far to the r.stats variable.

Show the results, limiting the display to roughly 100 observations at most.

r.stats <- round(x = cbind(flip_outcome,
                           flip_cumulative_sum,
                           running_avg
                           ),
                 digits =  3)[1:10,]

# The 10 in [1:10,] is the number of observations to display. Keep it at or below about 100 so the printed table stays readable.

print(r.stats)
##       flip_outcome flip_cumulative_sum running_avg
##  [1,]            1                   1       1.000
##  [2,]            0                   1       0.500
##  [3,]            0                   1       0.333
##  [4,]            1                   2       0.500
##  [5,]            0                   2       0.400
##  [6,]            1                   3       0.500
##  [7,]            0                   3       0.429
##  [8,]            1                   4       0.500
##  [9,]            1                   5       0.556
## [10,]            0                   5       0.500

4 Graph the Results

Create a line plot to illustrate how the sample mean approaches the population mean as the sample size grows. The scipen option is set so that the x axis shows large observation counts as whole numbers rather than in scientific notation.

The plot uses line charts to reflect

  1. the running averages of the coin flips and
  2. the expected average of the population (.5).

?options      # Allow the user to set and examine a variety of global options which affect the way in which R computes and displays its results.
options(scipen = 10)    # integer. A penalty to be applied when deciding to print numeric values in fixed or exponential notation. Positive values bias towards fixed and negative towards scientific notation: fixed notation will be preferred unless it is more than scipen digits wider.

plot(x    = running_avg , 
     ylim = c(.30, .70), 
     type = "l", 
     xlab = "Observations",
     ylab = "Probability", 
     lwd  = 2
     )

?lines  # A generic function taking coordinates given in various ways and joining the corresponding points with line segments.

# to add the 50% chance line
lines(x = c(0,N), 
      y = c(.50,.50),
      col="red", 
      lwd = 2
      )
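
A single sequence of flips shows one path toward 0.5. To suggest that the behavior is not specific to that one sequence, the sketch below overlays several independent runs on the same axes. It is an illustration under stated assumptions rather than part of the original example: n_runs and running_avgs are names introduced here, and the choice of 5 runs is arbitrary.

```r
set.seed(1963)   # seed reused from the example; any value works
N      <- 10000
n_runs <- 5      # hypothetical number of independent runs

# Each column holds the running average of one independent sequence of flips.
running_avgs <- replicate(
  n_runs,
  cumsum(sample(x = 0:1, size = N, replace = TRUE)) / seq_len(N)
)

# Draw one line per run (one per column of running_avgs).
matplot(x    = seq_len(N),
        y    = running_avgs,
        type = "l",
        lty  = 1,
        ylim = c(.30, .70),
        xlab = "Observations",
        ylab = "Probability"
        )

# Expected value for a fair coin.
abline(h = .50, col = "red", lwd = 2)
```

Each run starts out differently, but all of them should settle near the red line as the number of observations grows.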

4.1 Zoom In

Plot only the first 100 observations to see the early fluctuations of the running average more clearly.

plot(x    = running_avg[1:100], 
     ylim = c(.30, .70), 
     type = "l", 
     xlab = "Observations",
     ylab = "Probability", 
     lwd  = 2
     )

?lines  # A generic function taking coordinates given in various ways and joining the corresponding points with line segments.

# to add the 50% chance line
lines(x = c(0,N), 
      y = c(.50,.50),
      col="red", 
      lwd = 2
      )