Revised Assignmemt 2

Question 2 Revised Report

Report about New Changes and Implementations

Experimenting with Different Proposal Standard Deviations (`sigma`)

Dear Professor Shestopaloff,

In response to the your suggestion to test with different proposal standard deviations to improve performance, I made the following changes:

1- Overview of the changes: I changed the code to test several values for the scaling parameter (sigma) of the proposal distribution. The objective is to look at the effects of proposal distribution scaling on the Effective Sample Size or ESS. Also, the ESS results were computed and displayed for every combination of proposal scaling and sample size.
2- Introduction of Experimentation with Proposal Scaling (sigma): The code is updated to include multiple values for the proposal distribution’s standard deviation (sigma). This adjustment permit us to test with how the proposal distribution’s scaling affects the performance of the importance sampling process. I used the following sigma values for testing:

0.5 1.0 1.5 2.0 3.0

3- Tracking ESS for various Proposal Scaling: For every combination of sample size and proposal distribution scaling, the Effective Sample Size (ESS) was calculated. One important statistic for assessing the effectiveness of the importance sampling procedure is ESS. Higher values of the ESS indicate better sampling efficiency and show how closely the resampled weights reflect the real posterior distribution.
4- Visualization of Results: The results were visualized using plots that show how ESS changes with respect to the sample size and proposal scaling (sigma).
5- Posterior Mean Calculation: After performing importance sampling, the posterior means of the parameters 𝛽0,𝛽1 and 𝛽2 were computed and stored for analysis. The results for posterior means weree displayed in histograms for each sample size and proposal scaling (sigma).

Results and Conclusion for this Question:

Proposal Scaling and ESS:

Inefficient sampling results from a poorly scaled proposal distribution, which is predicted to have a low ESS. We can see how changing the spread of the proposal distribution affects the ESS by experimenting with different sigma values. If the proposal is too heavily concentrated around zero, a smaller sigma could lead to low ESS (inefficient sampling); conversely, if a bigger sigma does not introduce excessive variation, it may result in greater exploration of the parameter space and higher ESS.
Proposal Distribution Scaling:

To see how it affected the sampling efficiency, the proposal distribution’s scaling (which was run by sigma) was changed. I think I managed the distance that the sampled betas may be from the origin by varying the standard deviation of the proposal distribution. This investigation helps in determining the ideal scale that finds an agreement between parameter space exploration and sampling effectiveness.
Impact on Posterior Means: Histograms of the resampled betas for every sigma value were produced as part of the analysis. These histograms provide greater clarity on the connection between ESS and proposal scaling by illustrating how the posterior distribution of the parameters changes when various proposal scailing are used.

set.seed(170)

intercept <- 0.1
coef_x1 <- 1.1
coef_x2 <- -0.9

sample_sizes <- c(10, 50, 100)

create_data <- function(N) {
  x1 <- runif(N, min = -2, max = 2)
  x2 <- runif(N, min = -2, max = 2)
  linear_predictor <- intercept + coef_x1 * x1 + coef_x2 * x2
  probabilities <- 1 / (1 + exp(-linear_predictor))
  response <- rbinom(N, 1, probabilities)
  return(data.frame(x1 = x1, x2 = x2, y = response))
}

simulated_data <- lapply(sample_sizes, create_data)

num_samples <- 20000  
num_resamples <- 10000  

compute_log_likelihood <- function(beta, data) {
  linear_predictor <- beta[1] + beta[2] * data$x1 + beta[3] * data$x2
  probabilities <- 1 / (1 + exp(-linear_predictor))
  log_likelihood_value <- sum(data$y * log(probabilities) + (1 - data$y) * log(1 - probabilities))
  return(log_likelihood_value)
}

proposal_distribution <- function(N, sigma = 1.5) {
  matrix(rnorm(3 * N, mean = 0, sd = sigma), ncol = 3)
}

compute_importance_weights <- function(beta, data) {
  log_posterior <- compute_log_likelihood(beta, data) - 0.5 * sum(beta^2)
  return(exp(log_posterior))
}

importance_sampling <- function(data, N, sigma) {
  proposal_samples <- proposal_distribution(N, sigma)
  weights <- sapply(1:N, function(i) compute_importance_weights(proposal_samples[i, ], data))
  normalized_weights <- weights / sum(weights)
  resample_indices <- sample(1:N, num_resamples, replace = TRUE, prob = normalized_weights)
  resampled_beta <- proposal_samples[resample_indices, ]
  posterior_means <- colSums(proposal_samples * normalized_weights)
  ESS <- 1 / sum(normalized_weights^2)
  return(list(posterior_means = posterior_means, ESS = ESS, resampled_beta = resampled_beta))
}

experiment_with_sigma <- function(data, sample_sizes, sigmas) {
  results <- list()
  for (sigma in sigmas) {
    sampling_results <- lapply(sample_sizes, function(N) importance_sampling(data, N, sigma))
    ESS_results <- lapply(sampling_results, function(result) result$ESS)
    posterior_means_results <- lapply(sampling_results, function(result) result$posterior_means)
    results[[paste("Sigma =", sigma)]] <- list(ESS = ESS_results, posterior_means = posterior_means_results)
  }
  return(results)
}

sigma_values <- c(0.5, 1, 1.5, 2, 3)

experiment_results <- experiment_with_sigma(simulated_data[[1]], sample_sizes, sigma_values)

par(mfrow = c(3, 3))

for (sigma in sigma_values) {
  ESS_result <- experiment_results[[paste("Sigma =", sigma)]]$ESS
  posterior_means_result <- experiment_results[[paste("Sigma =", sigma)]]$posterior_means
  
  plot(sample_sizes, unlist(ESS_result), type = "b", col = "blue", 
       main = paste("ESS for", paste("Sigma =", sigma)), xlab = "Sample Size", ylab = "ESS")
  
  posterior_means_matrix <- do.call(rbind, posterior_means_result)
  
  hist(posterior_means_matrix[, 1], main = paste("β0 (Sigma =", sigma, ")"), xlab = "β0", col = "yellow", breaks = 50)
  hist(posterior_means_matrix[, 2], main = paste("β1 (Sigma =", sigma, ")"), xlab = "β1", col = "lightblue", breaks = 50)
  hist(posterior_means_matrix[, 3], main = paste("β2 (Sigma =", sigma, ")"), xlab = "β2", col = "green", breaks = 50)
}

glm_models <- lapply(simulated_data, function(data) glm(y ~ x1 + x2, data = data, family = binomial))

glm_coefficients <- lapply(glm_models, function(model) coef(model))

Question 3 Revised

1- Actions Presented:

Testing Various Proposal Standard Deviations (σ): The proposal distribution’s spread is managed by the proposal standard deviation (sigma). Whereas a smaller value of sigma concentrates the distribution around the mean, a bigger value results in a broader spread
2- The Effect of σ on ESS:

Larger σ: The diversity of the samples may be enhanced by a wider proposal distribution that may encompass a larger region of the parameter space. Poorer convergence may result from it, though, as it could increase the important weights’ variance.

Smaller σ: The suggested distribution will be concentrated nearer the origin (i.e., around zero) with a smaller σ, which may result in more precise sampled betas. But if σ is too small, it might not adequately investigate the posterior distribution, which would result in low ESS and not enough coverage.

set.seed(170)

intercept <- 0.1
coef_x1 <- 1.1
coef_x2 <- -0.9

additional_betas <- runif(6, min = -1, max = 1)
names(additional_betas) <- paste("beta_add", 1:6, sep = "_")
print("Generated additional beta values:")

## [1] "Generated additional beta values:"

print(additional_betas)

## beta_add_1 beta_add_2 beta_add_3 beta_add_4 beta_add_5 beta_add_6 
##  0.9683449 -0.5339030  0.5368513 -0.8417813  0.1257677 -0.8274445

sample_sizes <- c(10, 50, 100)

create_data <- function(N) {
  x1 <- runif(N, min = -2, max = 2)
  x2 <- runif(N, min = -2, max = 2)
  additional_covariates <- matrix(runif(N * 6, min = -2, max = 2), ncol = 6)
  colnames(additional_covariates) <- paste("x_add", 1:6, sep = "_")
  linear_predictor <- intercept + coef_x1 * x1 + coef_x2 * x2
  for (i in 1:6) {
    linear_predictor <- linear_predictor + additional_covariates[, i] * additional_betas[i]
  }
  probabilities <- 1 / (1 + exp(-linear_predictor))
  response <- rbinom(N, 1, probabilities)
  data <- data.frame(x1 = x1, x2 = x2, y = response, additional_covariates)
  return(data)
}

simulated_data <- lapply(sample_sizes, create_data)

num_samples <- 20000
num_resamples <- 10000

compute_log_likelihood <- function(beta, data) {
  linear_predictor <- beta[1] + beta[2] * data$x1 + beta[3] * data$x2
  for (i in 1:6) {
    linear_predictor <- linear_predictor + data[, paste("x_add", i, sep = "_")] * beta[3 + i]
  }
  probabilities <- 1 / (1 + exp(-linear_predictor))
  log_likelihood_value <- sum(data$y * log(probabilities) + (1 - data$y) * log(1 - probabilities))
  return(log_likelihood_value)
}

proposal_distribution <- function(N, sigma = 1.5) {
  matrix(rnorm(9 * N, mean = 0, sd = sigma), ncol = 9)
}

compute_importance_weights <- function(beta, data) {
  log_likelihood <- compute_log_likelihood(beta, data)
  log_prior <- -0.5 * sum(beta^2)
  log_posterior <- log_likelihood + log_prior
  return(exp(log_posterior))
}

importance_sampling <- function(data, N) {
  proposal_samples <- proposal_distribution(N)
  weights <- sapply(1:N, function(i) compute_importance_weights(proposal_samples[i, ], data))
  if (any(is.na(weights)) | any(is.nan(weights))) {
    weights[is.na(weights) | is.nan(weights)] <- 1e-10
  }
  normalized_weights <- weights / sum(weights)
  resample_indices <- sample(1:N, num_resamples, replace = TRUE, prob = normalized_weights)
  resampled_beta <- proposal_samples[resample_indices, ]
  posterior_means <- colSums(proposal_samples * normalized_weights)
  ESS <- 1 / sum(normalized_weights^2)
  return(list(posterior_means = posterior_means, ESS = ESS, resampled_beta = resampled_beta))
}

sampling_results <- lapply(simulated_data, function(data) importance_sampling(data, num_samples))

posterior_means_results <- lapply(sampling_results, function(result) result$posterior_means)
ESS_results <- lapply(sampling_results, function(result) result$ESS)

print("Posterior means for each sample size:")

## [1] "Posterior means for each sample size:"

print(posterior_means_results)

## [[1]]
## [1]  0.31279720  0.64257649 -0.63977290  0.10788616 -0.70143446  0.29467828
## [7]  0.01030248 -0.39261481 -0.83282578
## 
## [[2]]
## [1] -0.007941247  1.540101108 -0.774561409  1.322760275 -0.241152351
## [6]  0.362213366 -0.611630032 -0.218248803 -0.778815135
## 
## [[3]]
## [1]  0.1952008  1.1087517 -1.0680444  0.3095585 -0.3575560 -0.1548563 -0.7879833
## [8]  0.1095985 -1.2033588

print("Effective Sample Size (ESS) for each sample size:")

## [1] "Effective Sample Size (ESS) for each sample size:"

print(ESS_results)

## [[1]]
## [1] 42.81159
## 
## [[2]]
## [1] 2.806032
## 
## [[3]]
## [1] 1.121614

par(mfrow = c(3, 3))
for (i in 1:length(sample_sizes)) {
  resampled_betas <- sampling_results[[i]]$resampled_beta
  hist(resampled_betas[, 1], main = paste("β0 (N =", sample_sizes[i], ")"), xlab = "β0", col = "yellow", breaks = 30)
  hist(resampled_betas[, 2], main = paste("β1 (N =", sample_sizes[i], ")"), xlab = "β1", col = "lightblue", breaks = 30)
  hist(resampled_betas[, 3], main = paste("β2 (N =", sample_sizes[i], ")"), xlab = "β2", col = "green", breaks = 30)
  for (j in 1:6) {
    hist(resampled_betas[, 3 + j], main = paste("β_add", j, " (N =", sample_sizes[i], ")"), xlab = paste("β_add", j, sep = ""), col = rainbow(6)[j], breaks = 30)
  }
}

glm_models <- lapply(simulated_data, function(data) glm(y ~ x1 + x2 + x_add_1 + x_add_2 + x_add_3 + x_add_4 + x_add_5 + x_add_6, data = data, family = binomial))

## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred

glm_coefficients <- lapply(glm_models, function(model) coef(model))
print("GLM coefficients:")

## [1] "GLM coefficients:"

print(glm_coefficients)

## [[1]]
## (Intercept)          x1          x2     x_add_1     x_add_2     x_add_3 
##    6.433204   14.798308   -1.828293  -12.216024  -10.451806    7.680328 
##     x_add_4     x_add_5     x_add_6 
##   -7.121071  -20.632866  -34.417421 
## 
## [[2]]
## (Intercept)          x1          x2     x_add_1     x_add_2     x_add_3 
##  -0.3287673   1.9150312  -0.6886903   1.6157019  -0.7749457   0.2304285 
##     x_add_4     x_add_5     x_add_6 
##  -0.9472955  -0.0231685  -0.8970437 
## 
## [[3]]
## (Intercept)          x1          x2     x_add_1     x_add_2     x_add_3 
##  -0.2661212   0.9181552  -1.2082422   0.7910228  -0.4107330   0.3579283 
##     x_add_4     x_add_5     x_add_6 
##  -1.1341144   0.4847570  -0.8623633

Revised Assignmemt 2

Mohammad Soleimani

2024-11-25

Question 2 Revised Report

Report about New Changes and Implementations

Experimenting with Different Proposal Standard Deviations (`sigma`)

Results and Conclusion for this Question:

Question 3 Revised

Revised Assignmemt 2

Mohammad Soleimani

2024-11-25

Question 2 Revised Report

Report about New Changes and Implementations

Experimenting with Different Proposal Standard Deviations (sigma)

Results and Conclusion for this Question:

Question 3 Revised

Experimenting with Different Proposal Standard Deviations (`sigma`)