Question 2 Revised Report

Report about New Changes and Implementations

Experimenting with Different Proposal Standard Deviations (sigma)

Dear Professor Shestopaloff,

In response to your suggestion to test different proposal standard deviations to improve performance, I made the following changes:

  • 1- Overview of the changes: I changed the code to test several values of the scaling parameter (sigma) of the proposal distribution. The objective is to examine how the scaling of the proposal distribution affects the Effective Sample Size (ESS). The ESS was computed and displayed for every combination of proposal scaling and sample size.

  • 2- Introduction of Experimentation with Proposal Scaling (sigma): The code was updated to include multiple values for the proposal distribution’s standard deviation (sigma). This adjustment permits us to test how the scaling of the proposal distribution affects the performance of the importance sampling procedure. I used the following sigma values for testing:

0.5, 1.0, 1.5, 2.0, 3.0

  • 3- Tracking ESS for Various Proposal Scalings: For every combination of sample size and proposal scaling, the Effective Sample Size (ESS) was calculated. The ESS is a key statistic for assessing the effectiveness of the importance sampling procedure: higher values indicate better sampling efficiency, meaning the importance weights are spread more evenly across the samples and the weighted sample represents the posterior distribution more reliably.
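For reference, the ESS reported throughout is the standard estimate computed from the normalized importance weights w̃ᵢ, matching the line ESS <- 1 / sum(normalized_weights^2) in the code below:

ESS = 1 / Σᵢ w̃ᵢ²

This quantity ranges from 1 (all weight concentrated on a single sample) to N (perfectly uniform weights).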

  • 4- Visualization of Results: The results were visualized using plots that show how ESS changes with respect to the sample size and proposal scaling (sigma).

  • 5- Posterior Mean Calculation: After performing importance sampling, the posterior means of the parameters 𝛽0, 𝛽1, and 𝛽2 were computed and stored for analysis. The posterior means were displayed in histograms for each sample size and proposal scaling (sigma).

Results and Conclusion for this Question:

set.seed(170)

intercept <- 0.1
coef_x1 <- 1.1
coef_x2 <- -0.9

sample_sizes <- c(10, 50, 100)

# Simulate a logistic-regression dataset of size N
create_data <- function(N) {
  x1 <- runif(N, min = -2, max = 2)
  x2 <- runif(N, min = -2, max = 2)
  linear_predictor <- intercept + coef_x1 * x1 + coef_x2 * x2
  probabilities <- 1 / (1 + exp(-linear_predictor))
  response <- rbinom(N, 1, probabilities)
  return(data.frame(x1 = x1, x2 = x2, y = response))
}

simulated_data <- lapply(sample_sizes, create_data)

num_samples <- 20000    # not used in the sigma experiment below, which varies the number of proposal draws directly
num_resamples <- 10000  # number of draws resampled from the weighted proposal samples

# Log-likelihood of the logistic regression model at parameter vector beta
compute_log_likelihood <- function(beta, data) {
  linear_predictor <- beta[1] + beta[2] * data$x1 + beta[3] * data$x2
  probabilities <- 1 / (1 + exp(-linear_predictor))
  log_likelihood_value <- sum(data$y * log(probabilities) + (1 - data$y) * log(1 - probabilities))
  return(log_likelihood_value)
}

# Draw N proposal samples for (beta0, beta1, beta2) from independent N(0, sigma^2)
proposal_distribution <- function(N, sigma = 1.5) {
  matrix(rnorm(3 * N, mean = 0, sd = sigma), ncol = 3)
}

# Unnormalized importance weight: likelihood times a standard normal prior
# (-0.5 * sum(beta^2) is the log of the N(0, I) prior density up to a constant)
compute_importance_weights <- function(beta, data) {
  log_posterior <- compute_log_likelihood(beta, data) - 0.5 * sum(beta^2)
  return(exp(log_posterior))
}
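
# Numerical note: exp() of a very negative log-posterior underflows to zero,
# which can wipe out every weight at once. A standard stabilized alternative
# (a sketch only; the results below use compute_importance_weights() as
# defined above) keeps everything on the log scale and subtracts the maximum
# log-weight before exponentiating, which leaves the normalized weights
# unchanged:
compute_log_weight <- function(beta, data) {
  compute_log_likelihood(beta, data) - 0.5 * sum(beta^2)
}
normalize_from_log <- function(log_weights) {
  shifted <- exp(log_weights - max(log_weights))  # largest weight becomes 1
  shifted / sum(shifted)
}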

# One importance-sampling run: draw proposals, weight, resample, and compute
# the self-normalized posterior-mean estimates and the ESS
importance_sampling <- function(data, N, sigma) {
  proposal_samples <- proposal_distribution(N, sigma)
  weights <- sapply(1:N, function(i) compute_importance_weights(proposal_samples[i, ], data))
  normalized_weights <- weights / sum(weights)
  resample_indices <- sample(1:N, num_resamples, replace = TRUE, prob = normalized_weights)
  resampled_beta <- proposal_samples[resample_indices, ]
  posterior_means <- colSums(proposal_samples * normalized_weights)
  ESS <- 1 / sum(normalized_weights^2)
  return(list(posterior_means = posterior_means, ESS = ESS, resampled_beta = resampled_beta))
}

experiment_with_sigma <- function(data, sample_sizes, sigmas) {
  results <- list()
  for (sigma in sigmas) {
    sampling_results <- lapply(sample_sizes, function(N) importance_sampling(data, N, sigma))
    ESS_results <- lapply(sampling_results, function(result) result$ESS)
    posterior_means_results <- lapply(sampling_results, function(result) result$posterior_means)
    results[[paste("Sigma =", sigma)]] <- list(ESS = ESS_results, posterior_means = posterior_means_results)
  }
  return(results)
}

sigma_values <- c(0.5, 1, 1.5, 2, 3)

# Run the experiment on the first simulated dataset; here sample_sizes is
# reused as the number of proposal draws N for each run
experiment_results <- experiment_with_sigma(simulated_data[[1]], sample_sizes, sigma_values)

par(mfrow = c(3, 3))

for (sigma in sigma_values) {
  ESS_result <- experiment_results[[paste("Sigma =", sigma)]]$ESS
  posterior_means_result <- experiment_results[[paste("Sigma =", sigma)]]$posterior_means
  
  plot(sample_sizes, unlist(ESS_result), type = "b", col = "blue",
       main = paste("ESS for Sigma =", sigma), xlab = "Sample Size", ylab = "ESS")
  
  # One row per sample size, so each histogram below summarizes just three values
  posterior_means_matrix <- do.call(rbind, posterior_means_result)
  
  hist(posterior_means_matrix[, 1], main = paste("β0 (Sigma =", sigma, ")"), xlab = "β0", col = "yellow", breaks = 50)
  hist(posterior_means_matrix[, 2], main = paste("β1 (Sigma =", sigma, ")"), xlab = "β1", col = "lightblue", breaks = 50)
  hist(posterior_means_matrix[, 3], main = paste("β2 (Sigma =", sigma, ")"), xlab = "β2", col = "green", breaks = 50)
}

# Fit standard GLMs on the same datasets for comparison with the
# importance-sampling estimates
glm_models <- lapply(simulated_data, function(data) glm(y ~ x1 + x2, data = data, family = binomial))

glm_coefficients <- lapply(glm_models, function(model) coef(model))
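
To make the comparison with the classical fit explicit, the GLM coefficients can be printed next to the true generating values (a small optional addition; glm_coefficients and sample_sizes are as defined above):

# True generating values: intercept = 0.1, coef_x1 = 1.1, coef_x2 = -0.9
for (i in seq_along(sample_sizes)) {
  cat("GLM fit for N =", sample_sizes[i], "\n")
  print(glm_coefficients[[i]])
}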

Question 3 Revised

  • 1- Actions Performed:

    Testing Various Proposal Standard Deviations (σ): The spread of the proposal distribution is controlled by the proposal standard deviation (σ). A smaller value of σ concentrates the distribution around the mean, while a larger value produces a broader spread.

  • 2- The Effect of σ on ESS:

Larger σ: A wider proposal distribution covers a larger region of the parameter space and can increase the diversity of the samples. However, it also tends to increase the variance of the importance weights, which degrades the convergence of the estimates.

Smaller σ: With a smaller σ, the proposal distribution is concentrated nearer the origin (i.e., around zero), which can yield more precise sampled betas when the posterior mass also lies near zero. But if σ is too small, the proposal fails to explore the posterior distribution adequately; the few samples that land in the high-posterior region absorb almost all of the weight, resulting in a low ESS and insufficient coverage. (A small sketch after the importance_sampling definition below illustrates this tradeoff in the nine-dimensional setting.)

set.seed(170)

intercept <- 0.1
coef_x1 <- 1.1
coef_x2 <- -0.9

additional_betas <- runif(6, min = -1, max = 1)
names(additional_betas) <- paste("beta_add", 1:6, sep = "_")
print("Generated additional beta values:")
## [1] "Generated additional beta values:"
print(additional_betas)
## beta_add_1 beta_add_2 beta_add_3 beta_add_4 beta_add_5 beta_add_6 
##  0.9683449 -0.5339030  0.5368513 -0.8417813  0.1257677 -0.8274445
sample_sizes <- c(10, 50, 100)

# Simulate a logistic dataset with x1, x2, and six additional covariates
create_data <- function(N) {
  x1 <- runif(N, min = -2, max = 2)
  x2 <- runif(N, min = -2, max = 2)
  additional_covariates <- matrix(runif(N * 6, min = -2, max = 2), ncol = 6)
  colnames(additional_covariates) <- paste("x_add", 1:6, sep = "_")
  linear_predictor <- intercept + coef_x1 * x1 + coef_x2 * x2
  for (i in 1:6) {
    linear_predictor <- linear_predictor + additional_covariates[, i] * additional_betas[i]
  }
  probabilities <- 1 / (1 + exp(-linear_predictor))
  response <- rbinom(N, 1, probabilities)
  data <- data.frame(x1 = x1, x2 = x2, y = response, additional_covariates)
  return(data)
}

simulated_data <- lapply(sample_sizes, create_data)

num_samples <- 20000
num_resamples <- 10000

# Log-likelihood of the nine-parameter logistic regression model
compute_log_likelihood <- function(beta, data) {
  linear_predictor <- beta[1] + beta[2] * data$x1 + beta[3] * data$x2
  for (i in 1:6) {
    linear_predictor <- linear_predictor + data[, paste("x_add", i, sep = "_")] * beta[3 + i]
  }
  probabilities <- 1 / (1 + exp(-linear_predictor))
  log_likelihood_value <- sum(data$y * log(probabilities) + (1 - data$y) * log(1 - probabilities))
  return(log_likelihood_value)
}
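
# Note: when the linear predictor is large in magnitude, `probabilities`
# rounds to exactly 0 or 1, so log(0) = -Inf appears and 0 * (-Inf) = NaN in
# R; this is why NaN weights must be patched in importance_sampling() below.
# A sketch of a more robust variant (not used for the reported results)
# evaluates the Bernoulli log-likelihood with dbinom(), which returns -Inf
# instead of NaN, so exp() later maps such samples to weight 0:
compute_log_likelihood_stable <- function(beta, data) {
  linear_predictor <- beta[1] + beta[2] * data$x1 + beta[3] * data$x2
  for (i in 1:6) {
    linear_predictor <- linear_predictor + data[, paste("x_add", i, sep = "_")] * beta[3 + i]
  }
  sum(dbinom(data$y, size = 1, prob = plogis(linear_predictor), log = TRUE))
}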

# Draw N proposal samples for the nine-dimensional parameter vector
proposal_distribution <- function(N, sigma = 1.5) {
  matrix(rnorm(9 * N, mean = 0, sd = sigma), ncol = 9)
}

# Unnormalized importance weight: likelihood times a standard normal prior
compute_importance_weights <- function(beta, data) {
  log_likelihood <- compute_log_likelihood(beta, data)
  log_prior <- -0.5 * sum(beta^2)
  log_posterior <- log_likelihood + log_prior
  return(exp(log_posterior))
}

importance_sampling <- function(data, N) {
  proposal_samples <- proposal_distribution(N)
  weights <- sapply(1:N, function(i) compute_importance_weights(proposal_samples[i, ], data))
  # Patch non-finite weights: 0 * (-Inf) in the log-likelihood produces NaN,
  # and is.na() is TRUE for both NA and NaN, so a single check suffices
  if (any(is.na(weights))) {
    weights[is.na(weights)] <- 1e-10
  }
  normalized_weights <- weights / sum(weights)
  resample_indices <- sample(1:N, num_resamples, replace = TRUE, prob = normalized_weights)
  resampled_beta <- proposal_samples[resample_indices, ]
  posterior_means <- colSums(proposal_samples * normalized_weights)
  ESS <- 1 / sum(normalized_weights^2)
  return(list(posterior_means = posterior_means, ESS = ESS, resampled_beta = resampled_beta))
}
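
# Sketch illustrating the sigma/ESS tradeoff discussed above (ess_for_sigma
# is a hypothetical helper, not part of the reported results). It reuses
# proposal_distribution() and the stabilized log-likelihood sketched above:
ess_for_sigma <- function(data, N, sigma) {
  samples <- proposal_distribution(N, sigma)
  log_weights <- sapply(1:N, function(i)
    compute_log_likelihood_stable(samples[i, ], data) - 0.5 * sum(samples[i, ]^2))
  weights <- exp(log_weights - max(log_weights))  # stabilized weights
  normalized <- weights / sum(weights)
  1 / sum(normalized^2)
}
# Example usage:
# sapply(c(0.5, 1.5, 3), function(s) ess_for_sigma(simulated_data[[1]], 5000, s))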

sampling_results <- lapply(simulated_data, function(data) importance_sampling(data, num_samples))

posterior_means_results <- lapply(sampling_results, function(result) result$posterior_means)
ESS_results <- lapply(sampling_results, function(result) result$ESS)

print("Posterior means for each sample size:")
## [1] "Posterior means for each sample size:"
print(posterior_means_results)
## [[1]]
## [1]  0.31279720  0.64257649 -0.63977290  0.10788616 -0.70143446  0.29467828
## [7]  0.01030248 -0.39261481 -0.83282578
## 
## [[2]]
## [1] -0.007941247  1.540101108 -0.774561409  1.322760275 -0.241152351
## [6]  0.362213366 -0.611630032 -0.218248803 -0.778815135
## 
## [[3]]
## [1]  0.1952008  1.1087517 -1.0680444  0.3095585 -0.3575560 -0.1548563 -0.7879833
## [8]  0.1095985 -1.2033588
print("Effective Sample Size (ESS) for each sample size:")
## [1] "Effective Sample Size (ESS) for each sample size:"
print(ESS_results)
## [[1]]
## [1] 42.81159
## 
## [[2]]
## [1] 2.806032
## 
## [[3]]
## [1] 1.121614
par(mfrow = c(3, 3))
for (i in 1:length(sample_sizes)) {
  resampled_betas <- sampling_results[[i]]$resampled_beta
  hist(resampled_betas[, 1], main = paste("β0 (N =", sample_sizes[i], ")"), xlab = "β0", col = "yellow", breaks = 30)
  hist(resampled_betas[, 2], main = paste("β1 (N =", sample_sizes[i], ")"), xlab = "β1", col = "lightblue", breaks = 30)
  hist(resampled_betas[, 3], main = paste("β2 (N =", sample_sizes[i], ")"), xlab = "β2", col = "green", breaks = 30)
  for (j in 1:6) {
    hist(resampled_betas[, 3 + j], main = paste0("β_add", j, " (N = ", sample_sizes[i], ")"),
         xlab = paste0("β_add", j), col = rainbow(6)[j], breaks = 30)
  }
}

glm_models <- lapply(simulated_data, function(data) glm(y ~ x1 + x2 + x_add_1 + x_add_2 + x_add_3 + x_add_4 + x_add_5 + x_add_6, data = data, family = binomial))
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
glm_coefficients <- lapply(glm_models, function(model) coef(model))
print("GLM coefficients:")
## [1] "GLM coefficients:"
print(glm_coefficients)
## [[1]]
## (Intercept)          x1          x2     x_add_1     x_add_2     x_add_3 
##    6.433204   14.798308   -1.828293  -12.216024  -10.451806    7.680328 
##     x_add_4     x_add_5     x_add_6 
##   -7.121071  -20.632866  -34.417421 
## 
## [[2]]
## (Intercept)          x1          x2     x_add_1     x_add_2     x_add_3 
##  -0.3287673   1.9150312  -0.6886903   1.6157019  -0.7749457   0.2304285 
##     x_add_4     x_add_5     x_add_6 
##  -0.9472955  -0.0231685  -0.8970437 
## 
## [[3]]
## (Intercept)          x1          x2     x_add_1     x_add_2     x_add_3 
##  -0.2661212   0.9181552  -1.2082422   0.7910228  -0.4107330   0.3579283 
##     x_add_4     x_add_5     x_add_6 
##  -1.1341144   0.4847570  -0.8623633