The normal and lognormal distributions are closely related, but the variability of the former is the result of independent additive effects, whereas the variability of the latter is based on independent multiplicative effects. Here we do a simple simulation to explore these two types of effects.
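
To make the connection explicit: a variable \(Y\) is lognormally distributed exactly when \(\ln Y\) is normally distributed. Since the log turns products into sums, \(\ln(X_1 X_2 \cdots X_n) = \ln X_1 + \ln X_2 + \cdots + \ln X_n\), a product of many small independent effects becomes, on the log scale, a sum of many small independent effects, and sums of that kind tend toward a normal distribution.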

The article "Log-normal Distributions across the Sciences: Keys and Clues" suggests the following as a way to get an intuition about the difference between additive and multiplicative effects:

Some basic principles of additive and multiplicative effects can easily be demonstrated with the help of two ordinary dice with sides numbered from 1 to 6. Adding the two numbers, which is the principle of most games, leads to values from 2 to 12, with a mean of 7, and a symmetrical frequency distribution. The total range can be described as 7 plus or minus 5 (that is, 7 ± 5) where, in this case, 5 is not the standard deviation. Multiplying the two numbers, however, leads to values between 1 and 36 with a highly skewed distribution. The total variability can be described as 6 multiplied or divided by 6 (or 6 ×/ 6). In this case, the symmetry has moved to the multiplicative level.
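
Before running any simulations, we can check those two ranges exactly by enumerating all 36 equally likely face pairs. A minimal sketch in base R (the object names are just for illustration):

# all 36 equally likely combinations of two dice
all_sums     <- outer(1:6, 1:6, `+`)
all_products <- outer(1:6, 1:6, `*`)

range(all_sums)      # 2 and 12, i.e. 7 - 5 and 7 + 5
range(all_products)  # 1 and 36, i.e. 6 / 6 and 6 * 6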

Simulation 1: Ordinary dice

library(ggplot2)  # for the plots below
library(dplyr)    # for the pipe operator (%>%)

sum_dice <- function(){
    sample(1:6, 1, replace = TRUE) + sample(1:6, 1, replace = TRUE)
}


multiply_dice <- function(){
    sample(1:6, 1, replace = TRUE) * sample(1:6, 1, replace = TRUE)
}


samples <- 10000

df1.results <- data.frame(sums = replicate(samples, sum_dice()), 
                          products = replicate(samples, multiply_dice()))



# plot the histograms: 
df1.results %>% 
    ggplot(aes(x = sums)) + 
    geom_bar(fill = "steelblue4") + 
    scale_x_continuous(breaks = 1:12) + 
    labs(title = "Distribution of possible outcomes when you sum the values of 2 dice", 
         subtitle = sprintf("Based on %i replications", 
                            samples)) + 
    theme_light() + 
    theme(panel.grid.minor = element_line(colour="grey95"), 
        panel.grid.major = element_line(colour="grey95"))

Note that the distribution is symmetric. That’s not the case when effects are multiplicative:

df1.results %>% 
    ggplot(aes(x = products)) + 
    geom_bar(fill = "steelblue4") + 
    scale_x_continuous(breaks = 1:36) + 
    labs(title = "Distribution of possible outcomes when you multiply the values of 2 dice", 
         subtitle = sprintf("Based on %i replications", 
                            samples)) + 
    theme_light() + 
    theme(panel.grid.minor = element_line(colour="grey95"), 
        panel.grid.major = element_line(colour="grey95"))
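
A quick numeric check of what the two plots show (exact values will vary from run to run): for the sums, the mean and median are close together, while for the products the mean sits noticeably above the median, as we would expect from a right-skewed distribution.

# mean vs. median: roughly equal for the sums, mean > median for the products
with(df1.results, c(mean = mean(sums), median = median(sums)))
with(df1.results, c(mean = mean(products), median = median(products)))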

Simulation 2: “Continuous” dice

The discrete nature of the outcomes of a die slightly obscures the multiplicative effects. Let's repeat the experiment with “continuous” dice, which can return any real number between 1 and 6.

sum_dice_continuous <- function(){
    runif(1, 1, 6) + runif(1, 1, 6)
}


multiply_dice_continuous <- function(){
    runif(1, 1, 6) * runif(1, 1, 6)
}


samples <- 10000

df2.results <- data.frame(sums = replicate(samples, sum_dice_continuous()), 
                          products = replicate(samples, multiply_dice_continuous()))



# plot the histograms: 
df2.results %>% 
    ggplot(aes(x = sums)) + 
    geom_histogram(bins = 30, fill = "steelblue4") +  # 30 bins (the ggplot2 default), set explicitly
    scale_x_continuous(breaks = 1:12) + 
    labs(title = "Distribution of possible outcomes when you sum the values of 2 \"continuous\" dice", 
         subtitle = sprintf("Based on %i replications", 
                            samples)) + 
    theme_light() + 
    theme(panel.grid.minor = element_line(colour="grey95"), 
        panel.grid.major = element_line(colour="grey95"))

And now the multiplicative effects:

df2.results %>% 
    ggplot(aes(x = products)) + 
    geom_histogram(bins = 30, fill = "steelblue4") + 
    scale_x_continuous(breaks = 1:36) + 
    labs(title = "Distribution of possible outcomes when you multiply the values \nof 2 \"continuous\" dice", 
         subtitle = sprintf("Based on %i replications", 
                            samples)) + 
    theme_light() + 
    theme(panel.grid.minor = element_line(colour="grey95"), 
        panel.grid.major = element_line(colour="grey95"))

The skewed shape of the distribution is much clearer here.

Now let's try one more thing: is the distribution of \(\log(\text{product})\) approximately normal?

df2.results %>% 
    ggplot(aes(x = log(products))) + 
    geom_histogram(bins = 30, fill = "steelblue4") + 
    labs(title = "Distribution of the log of the product of 2 \"continuous\" dice", 
         subtitle = sprintf("Based on %i replications", 
                            samples)) + 
    theme_light() + 
    theme(panel.grid.minor = element_line(colour="grey95"), 
        panel.grid.major = element_line(colour="grey95"))
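
Another way to judge this, beyond the histogram, is a normal Q-Q plot. A minimal sketch using base R graphics and the simulated products from above; the closer the points sit to the reference line, the closer the distribution is to normal:

# normal Q-Q plot of log(products)
qqnorm(log(df2.results$products))
qqline(log(df2.results$products))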

Some Interpretations:

Analogous to the regression context, we can think of multiplicative effects as follows: if the total output \(Y = ab\), then the marginal impact of \(a\) is dependent on the level of \(b\). When the latter is held at \(b_1\), then the marginal impact of increasing \(a\) by 1 is to increase \(Y\) by \(b_1\).

However, if \(b\) is held at \(b_2\), then the marginal impact of \(a\) is \(b_2\).

This dependence does not exist in the additive context: if \(Y = a + b\), then increasing \(a\) by 1 increases \(Y\) by 1, regardless of the level of \(b\).
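
A tiny numeric illustration of that difference (the values are arbitrary, chosen only to make the point concrete):

a <- 2

# multiplicative: the effect of a one-unit increase in a depends on b
(a + 1) * 3 - a * 3       # when b = 3, Y increases by 3
(a + 1) * 5 - a * 5       # when b = 5, Y increases by 5

# additive: the effect of a one-unit increase in a is always 1
((a + 1) + 3) - (a + 3)   # when b = 3, Y increases by 1
((a + 1) + 5) - (a + 5)   # when b = 5, Y increases by 1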

Some Implications:

It’s probably a good idea to characterize lognormally distributed data using the median rather than the mean, both because the median fits more intuitively with our idea of a “typical” value, and because it maps directly onto the mean/median of the corresponding normal distribution on the log scale.

For example, if \(Y\) is the median on the untransformed scale, then \(\ln Y\) is the mean/median on the symmetric log scale.
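
We can see both points with the simulated products from Simulation 2: the median carries over cleanly under the log transform (up to a tiny interpolation difference, since the sample size is even), while the mean does not.

# the median commutes with the log transform ...
log(median(df2.results$products))
median(log(df2.results$products))   # essentially the same value

# ... but the mean does not
log(mean(df2.results$products))
mean(log(df2.results$products))     # noticeably different values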