This is the first of the challenge questions for the course. They are meant to be challenging and stretch your skills, but remember you can always ask for help or collaborate with your classmates. There are two articles at the bottom which will help provide broader context for the questions we’re studying here.

The evolution of wealth and income in a simple society

Consider the wealth and labor income distributions for a single generation of a simple society, described in the dataset simple_society.csv. People in this society work, save, and accumulate wealth over generations. For simplicity, we’ll assume that the labor income for a given household is constant over time and that every household saves a fixed portion of their total income. A household’s wealth accumulation in this society is characterized by two equations:

\[\begin{align} \text{(Total income)}~~~~~~~~ Y_t &= rW_t + i \\ \text{(Wealth accumulation)}~~~ W_{t+1} &= sY_t + (1+x_t)W_t, \end{align}\]

where \(Y_t\) is total household income in generation \(t\), \(W_t\) is total household wealth in generation \(t\), \(W_{t+1}\) is household wealth in generation \(t+1\), \(i\) is household labor income, \(r\) is the rate of interest on assets, \(s\) is the savings rate, and \(x_t\) is a random variable representing other factors (we’ll call \(x_t\) “luck”).¹ (The coefficient on \(W_t\) in the second equation is written as \(1+x_t\) so that we can interpret \(x_t\) as the non-savings-related growth or decline in wealth: \(x_t=1\) means wealth grew by 100%, \(x_t=-1\) means 100% of wealth was lost, and \(x_t=0\) means no change.)

The first equation describes a household’s total income for one generation. In words, it says, “total income for the household in generation \(t\) is the sum of capital rents earned on their wealth (\(rW_t\)) and labor income (\(i\))”.

The second equation describes how wealth accumulates. In words, it says, “wealth for the household in generation \(t+1\) is the sum of the income saved by generation \(t\) (\(sY_t\)) and the wealth held in generation \(t\) after any other gains or losses (\((1+x_t)W_t\)).”
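To make the two equations concrete, here is a minimal sketch for a single hypothetical household. The wealth, labor income, and luck values are made up for illustration and do not come from simple_society.csv; the interest and savings rates are the ones used in this exercise.

```r
# Hypothetical household (illustrative values, not from the dataset)
wealth <- 100          # W_t
labor_income <- 10     # i
luck <- 0.5            # x_t: existing wealth grows by 50%
interest_rate <- 0.08  # r
savings_rate <- 0.2    # s

# Total income: Y_t = r * W_t + i
total_income <- interest_rate * wealth + labor_income

# Wealth accumulation: W_{t+1} = s * Y_t + (1 + x_t) * W_t
next_wealth <- savings_rate * total_income + (1 + luck) * wealth

total_income  # 18
next_wealth   # 153.6
```

Here capital rents contribute 8 of the 18 units of income, and saving adds only 3.6 to next-generation wealth; the rest of the change comes from luck.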

We’re going to track this society over two generations and see how the distributions of wealth and total income evolve. For simplicity, the amounts have been normalized so that wealth in the first generation ranges from 50 to 150. The savings rate is 20% (i.e., \(s=0.2\)) and the interest rate is 8% (i.e., \(r=0.08\)). “Luck” (\(x_t\)) is a random variable distributed uniformly over \([-1,1]\), i.e., \(x_t \sim U[-1,1]\).
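Substituting the income equation into the wealth equation gives a single rule for how wealth moves between generations. The following is a sketch of that algebra using the parameter values above; the last line additionally assumes that luck is drawn independently of current wealth, which the setup suggests but does not state outright.

\[\begin{align} W_{t+1} &= s(rW_t + i) + (1+x_t)W_t \\ &= (1 + sr + x_t)W_t + si \\ &= (1.016 + x_t)W_t + 0.2i \\ \mathbb{E}[W_{t+1} \mid W_t] &= 1.016\,W_t + 0.2i \qquad \text{since } \mathbb{E}[x_t] = 0 \text{ for } x_t \sim U[-1,1]. \end{align}\]

Read this way, luck multiplies existing wealth, while saved labor income adds a fixed amount each generation.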

To start, we’ll load tidyverse and cowplot, set the savings rate and interest rate parameters, and set the “seed” for random number generation later:

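library(tidyverse)     # data wrangling (dplyr, readr) and plotting (ggplot2)
library(cowplot)       # plot_grid() for combining plots
interest_rate <- 0.08  # r
savings_rate <- 0.2    # s
set.seed(210)          # fix the seed so random draws are reproducible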

The seed is a parameter that controls random number generation for simulations. By setting the seed to a specific number (the number itself doesn’t matter), we ensure that our results can be replicated later and by others.
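As a small illustration of why the seed matters, the sketch below draws the same random numbers twice by resetting the seed before each draw. The variable names are chosen for this example only.

```r
# Draw three uniform numbers after setting the seed
set.seed(210)
first_draw <- runif(3, min = -1, max = 1)

# Resetting the same seed reproduces exactly the same draws
set.seed(210)
second_draw <- runif(3, min = -1, max = 1)

identical(first_draw, second_draw)  # TRUE
```

Without the second call to set.seed(), the second draw would continue the random sequence and produce different numbers.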

Questions

1. Analyzing the first generation

a. Read in the data using read_csv. If each row is a household, how many households are in this society? Assign the number of households to a variable n_households for later use.

Hint: You can use nrow() to calculate the number of rows.
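For reference, this is how nrow() behaves on a small made-up data frame. The object toy_society and its column names are hypothetical stand-ins, not the contents of simple_society.csv.

```r
# Toy data frame standing in for the real dataset (values are made up)
toy_society <- data.frame(
  wealth = c(60, 90, 120),
  labor_income = c(5, 10, 20)
)

# Each row is one household, so the row count is the number of households
n_toy_households <- nrow(toy_society)
n_toy_households  # 3
```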

b. Calculate total income (\(Y_t\)) for the first generation, then use the summary function to display summary statistics about the first generation.

Hint: Use mutate.
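A minimal sketch of the mutate pattern on made-up data follows. The column name labor_income is an assumption for illustration; check the actual column names in your data before adapting it.

```r
library(dplyr)

interest_rate <- 0.08

# Toy data (made-up values); labor_income is a hypothetical column name
toy_society <- tibble(
  wealth = c(60, 90, 120),
  labor_income = c(5, 10, 20)
)

# Add total income as a new column: Y_t = r * W_t + i
toy_society <- toy_society %>%
  mutate(total_income = interest_rate * wealth + labor_income)

# summary() reports min, quartiles, mean, and max for each column
summary(toy_society)
```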

c. Plot the following and combine the plots into a single output: a histogram of wealth, a histogram of total income, a scatterplot relating wealth (x axis) to labor income (y axis), and a scatterplot relating wealth (x axis) to total income (y axis).

Hint: Use the plot_grid function from the cowplot package.
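The sketch below shows the general plot_grid pattern with two plots built from made-up data; the same approach extends to four plots. The data frame and column names here are hypothetical.

```r
library(ggplot2)
library(cowplot)

# Toy data (made-up values, hypothetical column names)
toy_society <- data.frame(
  wealth = c(60, 75, 90, 105, 120, 140),
  labor_income = c(5, 8, 10, 12, 20, 25)
)

# Build each plot separately and store it in an object
wealth_hist <- ggplot(toy_society, aes(x = wealth)) +
  geom_histogram(bins = 5)

wealth_vs_labor <- ggplot(toy_society, aes(x = wealth, y = labor_income)) +
  geom_point()

# plot_grid() arranges the stored plots in a grid; ncol sets the number of columns
combined <- plot_grid(wealth_hist, wealth_vs_labor, ncol = 2)
combined
```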

d. What amount of wealth qualifies a household to be in the top 10% of the wealth distribution?

Hint: Use the quantile() function.
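As an illustration, quantile() with probs = 0.9 returns the value below which 90% of the observations fall. The wealth vector here is made up.

```r
# Made-up wealth values for eleven households
wealth <- c(50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150)

# The 90th percentile is the cutoff for the top 10%
top_10_cutoff <- quantile(wealth, probs = 0.9)
top_10_cutoff  # 140
```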

e. Calculate the share of wealth owned by the top 10% of the population. How much does the bottom 10% hold?

Hint: Calculate the total amount of wealth in the society, and the amount of wealth owned by the top 10% and bottom 10%.
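One way to turn this hint into code is sketched below on made-up data. It assumes "top 10%" means households at or above the 90th percentile cutoff and "bottom 10%" means households at or below the 10th percentile cutoff; other definitions (for example, ranking households and taking the top tenth) are also reasonable.

```r
# Made-up wealth values for ten households
wealth <- c(10, 20, 30, 40, 50, 60, 70, 80, 90, 100)

total_wealth <- sum(wealth)

# Cutoffs for the top and bottom 10% of the distribution
cutoff_top <- quantile(wealth, probs = 0.9)
cutoff_bottom <- quantile(wealth, probs = 0.1)

# Share of total wealth held by households beyond each cutoff
top_share <- sum(wealth[wealth >= cutoff_top]) / total_wealth
bottom_share <- sum(wealth[wealth <= cutoff_bottom]) / total_wealth

top_share     # about 0.18
bottom_share  # about 0.018
```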

f. Describe the first generation’s distributions of wealth and income. How much inequality do you see?
g. How do labor income and total income relate to wealth in this generation? What accounts for any differences between them?

2. Analyzing the second generation

a. Generate vectors of “luck” (as described above) for each generation. What is the average value of luck in each generation?

Hint: Use runif() to generate luck as draws from a uniform distribution with min -1 and max 1.
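A minimal sketch of drawing from a uniform distribution with runif() is shown here. The number of draws is arbitrary for this example; in the exercise it would be the number of households.

```r
set.seed(210)

# Number of draws (arbitrary here, chosen for illustration)
n_draws <- 1000

# Each draw is uniformly distributed between -1 and 1
luck <- runif(n_draws, min = -1, max = 1)

mean(luck)   # should be close to 0, the mean of U[-1, 1]
range(luck)  # all draws lie between -1 and 1
```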

b. Iterate the society’s wealth and total income forward one generation, then make a two-panel plot. The first panel should overlay the histograms of wealth for the two generations; the second should overlay the scatterplots of wealth (x axis) against total income (y axis) for the two generations. Color the second generation’s wealth and total income in coral.

Hint: To iterate it forward one generation, you’ll need to calculate \(W_{t+1}\) and then \(Y_{t+1}\). Supposing you read the data in as the object data, then you would calculate \(W_{t+1}\) as

data <- data %>% mutate(second_generation_wealth = savings_rate*total_income + (1 + luck)*wealth)

To make the overlays easier to read, set the alpha to 0.25 on the second generation’s plots.
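The overlay technique can be sketched as follows on simulated data. The data frame toy and its columns are hypothetical, and the second column ignores savings for brevity, so this shows only the plotting pattern, not the answer to the question.

```r
library(ggplot2)

set.seed(210)

# Simulated stand-in data (not the real dataset)
toy <- data.frame(first_wealth = runif(200, min = 50, max = 150))
toy$second_wealth <- toy$first_wealth * (1 + runif(200, min = -1, max = 1))

# Two histogram layers on one plot; the second is coral and semi-transparent
overlay <- ggplot(toy) +
  geom_histogram(aes(x = first_wealth), bins = 30) +
  geom_histogram(aes(x = second_wealth), bins = 30,
                 fill = "coral", alpha = 0.25) +
  labs(x = "Wealth", y = "Number of households")
overlay
```

The same layering works for scatterplots by replacing geom_histogram with geom_point and mapping both x and y (using color rather than fill for points).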

c. What share of the total wealth in society does the top 10% hold in the second generation? What about the bottom 10%?
d. Describe the changes between the wealth distribution in the first and second generation. What accounts for these changes?
e. How has the relationship between total income and wealth changed in this generation compared to the previous one? What accounts for the change?

3. Putting it in context

Read the following articles by John Cassidy (pdf version) and Benjamin Wallace-Wells (pdf version). Finally, purely for your edification, check out this article about the magnitude of income inequality in the US over the last few decades.

a. The model in this challenge question is trying to capture some aspects of wealth accumulation and inequality. As with any model, it simplifies some features to try to focus on other relevant features. What relevant features do you feel this model captures well? What relevant features do you feel the model misses?

1. A random variable is just a variable whose outcome is uncertain. For example, the outcome of a coin toss before you flip it is a random variable: it could come up heads, it could come up tails. Or the amount of money you get from a lottery ticket: before you scratch the ticket, you don’t know how much you’ll get.