Brief Introduction

This analysis models the US population growth between 1790 and 1850 using an exponential growth model from Kinetic First Order. The population at a given time t (year since 1970) is described by the equation:

\[ P(t) = \alpha \cdot e^{k \cdot t} \]

where

Steps in the Analysis

  1. Define the Data: Create the data-set with US population data from 1790 to 1850, recorded at 10-year intervals. Population values are in millions.

  2. Simulate Noisy Data: Adding random noise to the population data to simulate real-world variability using a normal distribution with a standard deviation of 0.5.

  3. Visualize the Data: Plot the observed population data points.

  4. Fit the Model Using nls2: Use a brute-force grid search method to fit the exponential growth model to the data.

  5. Extract and Display Fitted Values: Extract the fitted values for \(\alpha\) and \(k\) from the model and display them.

  6. Optimize Parameters Using subplex: Refine the parameter estimates for \(\alpha\) and \(k\) by minimizing the sum of squared residuals.

  7. Plot and Compare the Fitted Curves: Overlay the fitted curves from both methods on the observed data plot.

Load Required Libraries

library(nls2)
## Loading required package: proto
library(subplex)

Step 1: Define the Data

year <- c(0, 10, 20, 30, 40, 50, 60)  # Years since 1790
pop <- c(3.929, 5.308, 7.240, 9.638, 12.866, 17.069, 23.192)  # Population in millions
t <- seq(0, 60, by = 10)  # Time in years since 1790
alpha <- 3.929              # Initial population in millions
k <- 0.03                  # Exponential growth rate

# Add noise to simulate data
set.seed(42)
noise <- rnorm(length(t), mean = 0, sd = 0.5)
y <- alpha * exp(k * t) + noise

Step 2: Simulate Noisy Data

# Data has been defined with added noise
y  # View noisy population data
## [1]  4.614479  5.021246  7.340669  9.980212 13.246874 17.555494 24.524826

Step 2: Simulate Noisy Data

# Data has been defined with added noise
y  # View noisy population data
## [1]  4.614479  5.021246  7.340669  9.980212 13.246874 17.555494 24.524826

Step 3: Visualize the Data

plot(t, y, main = "US Population Growth (1790-1850)", 
     xlab = "Time (years)", ylab = "Population (millions)", 
     pch = 19, col = "blue")

Step 4: Fit the Model Using nls2

kinetic_model <- y ~ alpha * exp(k * t)

fit <- nls2(kinetic_model, 
            start = expand.grid(alpha = seq(3, 5, 0.1), k = seq(0.02, 0.05, 0.001)), 
            algorithm = "brute-force")

# Display the summary of the fitted model
summary(fit)
## 
## Formula: y ~ alpha * exp(k * t)
## 
## Parameters:
##        Estimate Std. Error t value Pr(>|t|)    
## alpha 4.0000000  0.1622733   24.65 2.05e-06 ***
## k     0.0300000  0.0007979   37.60 2.51e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3972 on 5 degrees of freedom
## 
## Number of iterations to convergence: 651 
## Achieved convergence tolerance: NA
# Extract fitted parameters from nls2
fit_coeff <- coef(fit)
alpha_fit <- fit_coeff["alpha"]
k_fit <- fit_coeff["k"]

cat("Fitted Initial Population (alpha):", alpha_fit, "million\n")
## Fitted Initial Population (alpha): 4 million
cat("Fitted Growth Rate (k):", k_fit, "\n")
## Fitted Growth Rate (k): 0.03

Step 5: Extract and Display Fitted Values

cat("Extracted values from nls2 fitting:\n")
## Extracted values from nls2 fitting:
cat("Initial Population (alpha):", alpha_fit, "\n")
## Initial Population (alpha): 4
cat("Growth Rate (k):", k_fit, "\n")
## Growth Rate (k): 0.03

Step 6: Optimize Parameters Using subplex

# Define the sum of squared residuals 
myfun <- function(params) {
  alpha_trial <- params[1]
  k_trial <- params[2]
  sum((y - alpha_trial * exp(k_trial * t))^2)
}

# Perform optimization with subplex
sub_fit <- subplex(par = c(3, 0.03), fn = myfun)
sub_fit
## $par
## [1] 4.00440135 0.03002519
## 
## $value
## [1] 0.7817157
## 
## $counts
## [1] 527
## 
## $convergence
## [1] 0
## 
## $message
## [1] "success! tolerance satisfied"
## 
## $hessian
## NULL
# Extract subplex fitted parameters
alpha_subplex <- sub_fit$par[1]
k_subplex <- sub_fit$par[2]
cat("Subplex Fitted Initial Population (alpha):", alpha_subplex, "million\n")
## Subplex Fitted Initial Population (alpha): 4.004401 million
cat("Subplex Fitted Growth Rate (k):", k_subplex, "\n")
## Subplex Fitted Growth Rate (k): 0.03002519

Step 7: Plot and Compare the Fitted Curves

# Plot the simulated data
plot(t, y, main = "US Population Growth (1790-1850)", 
     xlab = "Time (years)", ylab = "Population (millions)", 
     pch = 19, col = "blue")

# Plot the nls2 fitted curve
curve(alpha_fit * exp(k_fit * x), from = 0, to = 60, add = TRUE, col = "red", lwd = 2)

# Adding legend

legend("topleft", legend = c( "nls2 Fit", "Subplex Fit"), 
       col = c( "blue", "red"), 
       pch = c(19, NA, NA), lty = c(NA, 1, 2), lwd = c(NA, 2, 2))

Results

Using the nls2 brute-force grid search method, the following parameters were fitted:

The plot displays:

The optimized parameters using subplex are:

These values closely match the data and validate the exponential growth model.