Problem set 6

Setup environment

#TASK 1

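# Load the packages used throughout the problem set
library(dplyr)           # %>% and filter()
library(ggplot2)         # plots
library(ggrepel)         # geom_label_repel()
library(sjPlot)          # tab_model()
library(marginaleffects) # slopes(), predictions(), datagrid()
library(emmeans)         # emmeans(), emtrends()

# Load the data (the .Rdata file provides an object named corruption)
load("Civil.Rdata")

Civil <- corruption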

# Display the first few rows and structure of the dataset
head(Civil)

##   country_name country_text_id year                           region
## 1       Mexico             MEX 2020  Latin America and the Caribbean
## 2     Suriname             SUR 2020  Latin America and the Caribbean
## 3       Sweden             SWE 2020 Western Europe and North America
## 4  Switzerland             CHE 2020 Western Europe and North America
## 5        Ghana             GHA 2020               Sub-Saharan Africa
## 6 South Africa             ZAF 2020               Sub-Saharan Africa
##   disclose_donations_ord public_sector_corruption polyarchy civil_liberties
## 1                      3                     48.8      64.7            71.2
## 2                      1                     24.8      76.1            87.7
## 3                      2                      1.3      90.8            96.9
## 4                      0                      1.4      89.4            94.8
## 5                      2                     65.2      72.0            90.4
## 6                      1                     57.1      70.3            82.2
##   disclose_donations iso2c population gdp_percapita     capital longitude
## 1               TRUE    MX  128932753      8922.612 Mexico City  -99.1276
## 2              FALSE    SR     586634      7529.614  Paramaribo  -55.1679
## 3              FALSE    SE   10353442     51541.656   Stockholm   18.0645
## 4              FALSE    CH    8636561     85685.290        Bern   7.44821
## 5              FALSE    GH   31072945      2020.624       Accra  -0.20795
## 6              FALSE    ZA   59308690      5659.207    Pretoria   28.1871
##   latitude              income log_gdp_percapita
## 1   19.427 Upper middle income          9.096344
## 2   5.8232 Upper middle income          8.926599
## 3  59.3327         High income         10.850146
## 4   46.948         High income         11.358436
## 5  5.57045 Lower middle income          7.611162
## 6  -25.746 Upper middle income          8.641039

str(Civil)

## 'data.frame':    168 obs. of  17 variables:
##  $ country_name            : chr  "Mexico" "Suriname" "Sweden" "Switzerland" ...
##  $ country_text_id         : chr  "MEX" "SUR" "SWE" "CHE" ...
##  $ year                    : num  2020 2020 2020 2020 2020 2020 2020 2020 2020 2020 ...
##  $ region                  : Factor w/ 6 levels "Eastern Europe and Central Asia",..: 2 2 5 5 4 4 6 6 1 1 ...
##  $ disclose_donations_ord  : num  3 1 2 0 2 1 3 2 3 2 ...
##  $ public_sector_corruption: num  48.8 24.8 1.3 1.4 65.2 57.1 3.7 36.8 70.6 71.2 ...
##  $ polyarchy               : num  64.7 76.1 90.8 89.4 72 70.3 83.2 43.6 26.2 48.5 ...
##  $ civil_liberties         : num  71.2 87.7 96.9 94.8 90.4 82.2 92.8 56.9 43 85.5 ...
##  $ disclose_donations      : logi  TRUE FALSE FALSE FALSE FALSE FALSE ...
##  $ iso2c                   : chr  "MX" "SR" "SE" "CH" ...
##  $ population              : num  1.29e+08 5.87e+05 1.04e+07 8.64e+06 3.11e+07 ...
##   ..- attr(*, "label")= chr "Population, total"
##  $ gdp_percapita           : num  8923 7530 51542 85685 2021 ...
##   ..- attr(*, "label")= chr "GDP per capita (constant 2015 US$)"
##  $ capital                 : chr  "Mexico City" "Paramaribo" "Stockholm" "Bern" ...
##  $ longitude               : chr  "-99.1276" "-55.1679" "18.0645" "7.44821" ...
##  $ latitude                : chr  "19.427" "5.8232" "59.3327" "46.948" ...
##  $ income                  : chr  "Upper middle income" "Upper middle income" "High income" "High income" ...
##  $ log_gdp_percapita       : num  9.1 8.93 10.85 11.36 7.61 ...
##   ..- attr(*, "label")= chr "GDP per capita (constant 2015 US$)"

# Check if the required columns exist
required_columns <- c("public_sector_corruption", "polyarchy")
missing_columns <- setdiff(required_columns, colnames(Civil))

if(length(missing_columns) > 0) {
  stop(paste("The following required columns are missing:", paste(missing_columns, collapse = ", ")))
}

# Clean the data: Remove rows with NA values in the required columns
Civil_clean <- Civil %>%
  filter(!is.na(public_sector_corruption) & !is.na(polyarchy))

# Perform a simple linear regression
model <- lm(public_sector_corruption ~ polyarchy, data = Civil_clean)

# Summarize the regression model
print(summary(model))

## 
## Call:
## lm(formula = public_sector_corruption ~ polyarchy, data = Civil_clean)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -69.498 -14.334   1.448  16.985  44.436 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 89.44444    3.95373   22.62   <2e-16 ***
## polyarchy   -0.82641    0.06786  -12.18   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 21.68 on 166 degrees of freedom
## Multiple R-squared:  0.4718, Adjusted R-squared:  0.4686 
## F-statistic: 148.3 on 1 and 166 DF,  p-value: < 2.2e-16

# Visualize the relationship with a scatter plot and overlay the regression line
plot_data <- Civil_clean %>%
  mutate(highlight = polyarchy == min(polyarchy) | polyarchy == max(polyarchy))

ggplot(plot_data, aes(x = polyarchy, y = public_sector_corruption)) +
  geom_point(aes(color = highlight)) +
  stat_smooth(method = "lm", formula = y ~ x, linewidth = 1, color = "blue") +
  geom_label_repel(data = filter(plot_data, highlight == TRUE), 
                   aes(label = country_name), seed = 1234) +
  scale_color_manual(values = c("grey30", "red"), guide = "none") +
  labs(title = "Relationship between Polyarchy and Public Sector Corruption",
       x = "Polyarchy",
       y = "Public Sector Corruption") +
  theme_minimal()

# Create regression tables using sjPlot
tab_model(model, show.ci = TRUE, show.se = TRUE, show.stat = TRUE, show.df = TRUE)

	public_sector_corruption
Predictors	Estimates	std. Error	CI	Statistic	p	df
(Intercept)	89.44	3.95	-Inf – Inf	22.62	<0.001	166.00
polyarchy	-0.83	0.07	-Inf – Inf	-12.18	<0.001	166.00
Observations	168
R² / R² adjusted	0.472 / 0.469
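
The CI column in the table above shows -Inf – Inf, which carries no information. As a rough sketch, a conventional 95% interval can be recomputed in base R from the estimates, standard errors, and residual degrees of freedom printed by summary(model). The numbers are copied from that output; the object names are illustrative and not part of the original analysis.

```r
# Coefficient table values copied from summary(model) above
est <- c(intercept = 89.44444, polyarchy = -0.82641)
se  <- c(intercept = 3.95373,  polyarchy = 0.06786)
df_resid <- 166  # residual degrees of freedom reported by summary(model)

# Two-sided 95% interval: estimate +/- t critical value * standard error
t_crit <- qt(0.975, df = df_resid)
ci <- cbind(lower = est - t_crit * se, upper = est + t_crit * se)
print(round(ci, 3))
```

With the fitted model object available, confint(model) should give the same intervals up to rounding.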

#INTERPRETATION

The results of the simple linear regression analysis indicate a significant negative relationship between polyarchy and public sector corruption. The model’s intercept is 89.44, meaning that when the polyarchy score is zero, the public sector corruption index is predicted to be 89.44. The coefficient for polyarchy is -0.83, so each one-unit increase in polyarchy is associated with a public sector corruption index that is 0.83 units lower on average. This relationship is statistically significant, as indicated by the p-value of less than 0.001. The model explains approximately 47.2% of the variance in public sector corruption, as shown by the R-squared value of 0.472. These findings suggest that higher levels of polyarchy, indicative of more democratic and inclusive governance, are associated with lower levels of public sector corruption; because the data are a single cross-section of countries, the estimate describes an association rather than a causal effect.
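
To make the slope concrete, the following sketch plugs the reported coefficients into the fitted line at three polyarchy scores. The levels 30, 60, and 90 are chosen for illustration (they match the levels used in Task 2), and the coefficients are copied from the summary output rather than taken from the model object.

```r
# Coefficients copied from summary(model) above
b0 <- 89.44444   # intercept
b1 <- -0.82641   # slope for polyarchy

# Illustrative polyarchy scores (an assumption, chosen to match Task 2)
polyarchy_levels <- c(30, 60, 90)

# Predicted corruption index from the fitted line b0 + b1 * polyarchy
predicted <- b0 + b1 * polyarchy_levels
print(data.frame(polyarchy = polyarchy_levels,
                 predicted_corruption = round(predicted, 2)))
```

Each 30-point step in polyarchy lowers the prediction by the same amount (30 times the slope), which is the restriction the quadratic model in Task 2 relaxes.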

#TASK 2

# Extend the model by adding a quadratic term for polyarchy
model_poly <- lm(public_sector_corruption ~ polyarchy + I(polyarchy^2), data = Civil_clean)

# Summarize the extended model
summary(model_poly)

## 
## Call:
## lm(formula = public_sector_corruption ~ polyarchy + I(polyarchy^2), 
##     data = Civil_clean)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -64.462  -7.475   1.418  14.107  35.187 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    53.116104   7.019792   7.567 2.55e-12 ***
## polyarchy       0.974653   0.305335   3.192  0.00169 ** 
## I(polyarchy^2) -0.017310   0.002874  -6.023 1.08e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 19.69 on 165 degrees of freedom
## Multiple R-squared:  0.567,  Adjusted R-squared:  0.5618 
## F-statistic:   108 on 2 and 165 DF,  p-value: < 2.2e-16

# Visualize the polynomial relationship with ggplot2
ggplot(Civil_clean, aes(x = polyarchy, y = public_sector_corruption)) +
  geom_point() +
  stat_smooth(method = "lm", formula = y ~ x + I(x^2), linewidth = 1, color = "blue") +
  labs(title = "Polynomial Relationship between Polyarchy and Public Sector Corruption",
       x = "Polyarchy",
       y = "Public Sector Corruption") +
  theme_minimal()

# Calculate marginal effects manually at different levels of polyarchy (30, 60, 90)
calculate_marginal_effects <- function(model, level) {
  coef <- coef(model)
  marginal_effect <- coef["polyarchy"] + 2 * coef["I(polyarchy^2)"] * level
  return(marginal_effect)
}

marginal_effects_30 <- calculate_marginal_effects(model_poly, 30)
marginal_effects_60 <- calculate_marginal_effects(model_poly, 60)
marginal_effects_90 <- calculate_marginal_effects(model_poly, 90)

print(paste("Marginal effect at polyarchy 30:", marginal_effects_30))

## [1] "Marginal effect at polyarchy 30: -0.0639275946596716"

print(paste("Marginal effect at polyarchy 60:", marginal_effects_60))

## [1] "Marginal effect at polyarchy 60: -1.10250799728726"

print(paste("Marginal effect at polyarchy 90:", marginal_effects_90))

## [1] "Marginal effect at polyarchy 90: -2.14108839991484"

# Calculate marginal effects using the marginaleffects package
# (slopes() is the current name for the older marginaleffects() function)
marginal_effects_results <- slopes(model_poly, newdata = data.frame(polyarchy = c(30, 60, 90)))

print(marginal_effects_results)

## 
##       Term Estimate Std. Error       z Pr(>|z|)     S 2.5 % 97.5 %
##  polyarchy  -0.0639     0.1408  -0.454     0.65   0.6 -0.34  0.212
##  polyarchy  -1.1025     0.0768 -14.351   <0.001 152.7 -1.25 -0.952
##  polyarchy  -2.1411     0.2268  -9.439   <0.001  67.8 -2.59 -1.697
## 
## Columns: rowid, term, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted, polyarchy, public_sector_corruption 
## Type:  response

#INTERPRETATION

The extended model with a quadratic term for polyarchy reveals a significant non-linear relationship between polyarchy and public sector corruption. The positive coefficient for polyarchy (0.9747) and the negative coefficient for its squared term (-0.0173) describe an inverted-U shape: predicted corruption rises slightly at very low polyarchy scores, peaks at a score of about 28, and falls at an increasing rate thereafter. The model is statistically significant (p-value < 2.2e-16) and explains about 56.7% of the variance in public sector corruption (adjusted R-squared: 0.5618), up from 47.2% in the linear model. The marginal effects at polyarchy levels of 30, 60, and 90 become increasingly negative: -0.0639 at 30 (not statistically distinguishable from zero, p = 0.65), -1.1025 at 60, and -2.1411 at 90 (both p < 0.001). These results indicate that an additional point of polyarchy is associated with almost no change in corruption among the least democratic countries but with progressively larger reductions at higher levels of polyarchy, so the marginal effect grows in magnitude rather than diminishing.
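
The point at which the fitted curve changes direction follows from setting the derivative of the quadratic to zero. The sketch below uses the rounded coefficients printed by summary(model_poly), so the results agree with the reported marginal effects only to about four decimal places; the function name slope_at is illustrative.

```r
# Coefficients copied (rounded) from summary(model_poly) above
b1 <- 0.974653    # polyarchy
b2 <- -0.017310   # I(polyarchy^2)

# Derivative of b0 + b1 * x + b2 * x^2 with respect to x
slope_at <- function(x) b1 + 2 * b2 * x

# The slope is zero where b1 + 2 * b2 * x = 0
turning_point <- -b1 / (2 * b2)

print(round(turning_point, 2))
print(round(slope_at(c(30, 60, 90)), 4))
```

Below the turning point the fitted slope is positive, and above it the slope is negative and steepens linearly with polyarchy.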

#TASK 3

# Fit a logistic regression model
model_logit <- glm(disclose_donations ~ public_sector_corruption + log_gdp_percapita, 
                   data = Civil_clean, 
                   family = binomial)

# Summarize the logistic regression model
summary(model_logit)

## 
## Call:
## glm(formula = disclose_donations ~ public_sector_corruption + 
##     log_gdp_percapita, family = binomial, data = Civil_clean)
## 
## Coefficients:
##                          Estimate Std. Error z value Pr(>|z|)    
## (Intercept)              -0.50466    2.18953  -0.230    0.818    
## public_sector_corruption -0.05964    0.01191  -5.007 5.54e-07 ***
## log_gdp_percapita         0.24907    0.21785   1.143    0.253    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 217.79  on 167  degrees of freedom
## Residual deviance: 131.30  on 165  degrees of freedom
## AIC: 137.3
## 
## Number of Fisher Scoring iterations: 5

# Create regression tables using sjPlot
tab_model(model_logit, show.ci = TRUE, show.se = TRUE, show.stat = TRUE, show.df = TRUE)

	disclose_donations
Predictors	Odds Ratios	std. Error	CI	Statistic	p	df
(Intercept)	0.60	1.32	0.00 – Inf	-0.23	0.818	Inf
public_sector_corruption	0.94	0.01	0.00 – Inf	-5.01	<0.001	Inf
GDP per capita (constant 2015 US$)	1.28	0.28	0.00 – Inf	1.14	0.253	Inf
Observations	168
R² Tjur	0.454

#INTERPRETATION

The coefficient for public_sector_corruption is -0.05964 with a p-value of 5.54e-07, indicating that higher levels of public sector corruption are associated with a significantly lower likelihood of having campaign finance disclosure laws. Specifically, for each one-unit increase in public sector corruption, the log-odds of having disclosure laws decrease by 0.05964, holding log GDP per capita constant; this corresponds to an odds ratio of 0.94, or roughly 6% lower odds per unit. The coefficient for log_gdp_percapita is 0.24907 (odds ratio 1.28) with a p-value of 0.253, so the model finds no statistically significant association between GDP per capita and the presence of these laws once corruption is accounted for. The intercept is not significant. The AIC of 137.3 is informative only relative to other models fitted to the same data. Overall, the findings highlight that higher public sector corruption is a strong predictor of the absence of campaign finance disclosure laws, while economic wealth, as measured by GDP per capita, does not significantly influence the likelihood of these laws being in place.
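
The odds ratios in the sjPlot table are the exponentiated logit coefficients. A short sketch of that conversion, using the estimates copied from summary(model_logit); the vector names are illustrative:

```r
# Logit coefficients copied from summary(model_logit) above
coefs <- c(intercept                = -0.50466,
           public_sector_corruption = -0.05964,
           log_gdp_percapita        =  0.24907)

# Exponentiating a log-odds coefficient gives an odds ratio
odds_ratios <- exp(coefs)

# Percent change in the odds for a one-unit increase in each predictor
pct_change <- 100 * (odds_ratios - 1)

print(round(odds_ratios, 2))
print(round(pct_change[-1], 1))
```

With the fitted model available, exp(coef(model_logit)) performs the same conversion without retyping the numbers.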

#TASK 4

# Fit a logistic regression model
model_logit <- glm(disclose_donations ~ public_sector_corruption + log_gdp_percapita, 
                   data = Civil_clean, 
                   family = binomial)

# Summarize the logistic regression model
summary(model_logit)

## 
## Call:
## glm(formula = disclose_donations ~ public_sector_corruption + 
##     log_gdp_percapita, family = binomial, data = Civil_clean)
## 
## Coefficients:
##                          Estimate Std. Error z value Pr(>|z|)    
## (Intercept)              -0.50466    2.18953  -0.230    0.818    
## public_sector_corruption -0.05964    0.01191  -5.007 5.54e-07 ***
## log_gdp_percapita         0.24907    0.21785   1.143    0.253    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 217.79  on 167  degrees of freedom
## Residual deviance: 131.30  on 165  degrees of freedom
## AIC: 137.3
## 
## Number of Fisher Scoring iterations: 5

# Calculate marginal effects with the marginaleffects package at representative corruption values (20, 50, 80),
# holding log_gdp_percapita at its sample mean
marginal_effects_values <- slopes(
  model_logit,
  newdata = data.frame(
    public_sector_corruption = c(20, 50, 80),
    log_gdp_percapita = mean(Civil_clean$log_gdp_percapita)
  )
)

print(marginal_effects_values)

## 
##                      Term Estimate Std. Error      z Pr(>|z|)    S    2.5 %
##  log_gdp_percapita         0.05939   0.053953  1.101  0.27097  1.9 -0.04635
##  log_gdp_percapita         0.04066   0.037150  1.094  0.27378  1.9 -0.03216
##  log_gdp_percapita         0.00989   0.011758  0.841  0.40036  1.3 -0.01316
##  public_sector_corruption -0.01422   0.002347 -6.059  < 0.001 29.4 -0.01882
##  public_sector_corruption -0.00973   0.001655 -5.881  < 0.001 27.9 -0.01298
##  public_sector_corruption -0.00237   0.000819 -2.890  0.00386  8.0 -0.00397
##     97.5 %
##   0.165139
##   0.113468
##   0.032935
##  -0.009621
##  -0.006490
##  -0.000762
## 
## Columns: rowid, term, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted, public_sector_corruption, log_gdp_percapita, disclose_donations 
## Type:  response

# Compare predicted log odds at representative values (20, 50, 80) using pairwise contrasts from the emmeans package
emm <- emmeans(model_logit, ~ public_sector_corruption, at = list(public_sector_corruption = c(20, 50, 80)))
marginal_effects_emm <- summary(pairs(emm))

print(marginal_effects_emm)

##  contrast                                                estimate    SE  df
##  public_sector_corruption20 - public_sector_corruption50     1.79 0.357 Inf
##  public_sector_corruption20 - public_sector_corruption80     3.58 0.715 Inf
##  public_sector_corruption50 - public_sector_corruption80     1.79 0.357 Inf
##  z.ratio p.value
##    5.007  <.0001
##    5.007  <.0001
##    5.007  <.0001
## 
## Results are given on the log odds ratio (not the response) scale. 
## P value adjustment: tukey method for comparing a family of 3 estimates

# Visualize the predicted probabilities across a range of public_sector_corruption values
public_sector_corruption_values <- seq(min(Civil_clean$public_sector_corruption), max(Civil_clean$public_sector_corruption), length.out = 100)
predicted_probs <- predict(model_logit, newdata = data.frame(public_sector_corruption = public_sector_corruption_values, log_gdp_percapita = mean(Civil_clean$log_gdp_percapita)), type = "response")

plot_data <- data.frame(public_sector_corruption = public_sector_corruption_values, predicted_prob = predicted_probs)

ggplot(plot_data, aes(x = public_sector_corruption, y = predicted_prob)) +
  geom_line(color = "blue") +
  labs(title = "Predicted Probabilities of Having Campaign Finance Disclosure Laws",
       x = "Public Sector Corruption",
       y = "Predicted Probability") +
  theme_minimal()

#RESULTS

The emmeans contrasts compare the predicted log odds of having campaign finance disclosure laws at different corruption levels. The log odds at public_sector_corruption of 20 are 1.79 higher than at 50 (SE = 0.357) and 3.58 higher than at 80 (SE = 0.715); the difference between 50 and 80 is again 1.79 (SE = 0.357). All three contrasts are significant (Tukey-adjusted p < .0001). Because the model is linear on the log-odds scale, each contrast is the corruption coefficient multiplied by the gap between the two levels, which is why the two 30-point contrasts are identical and all three share the same z-ratio (5.007).
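
The contrasts above can be reproduced by hand from the corruption coefficient and its standard error, because log_gdp_percapita is held fixed and cancels out of each difference. A minimal sketch with values copied from summary(model_logit); the object names are illustrative:

```r
# Coefficient and standard error copied from summary(model_logit) above
beta    <- -0.05964
se_beta <-  0.01191

# Differences in corruption level for each pairwise comparison
gaps <- c("20 - 50" = 20 - 50, "20 - 80" = 20 - 80, "50 - 80" = 50 - 80)

# On the log-odds scale the contrast is beta * gap; its SE scales with |gap|
contrast    <- beta * gaps
contrast_se <- abs(gaps) * se_beta
z_ratio     <- contrast / contrast_se

print(round(cbind(estimate = contrast, SE = contrast_se, z = z_ratio), 3))
```

The z-ratio is the same for every pair because both the estimate and its standard error are scaled by the same gap.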

The coefficient for public_sector_corruption is -0.05964 (SE = 0.01191), with a z-value of -5.007 and a p-value < 0.001. This significant negative coefficient indicates that higher levels of public sector corruption are associated with lower odds of having campaign finance disclosure laws. Specifically, for each unit increase in public sector corruption, the log odds of having disclosure laws decrease by 0.05964. The coefficient for log_gdp_percapita is 0.24907 (SE = 0.21785), with a z-value of 1.143 and a p-value of 0.253. This non-significant coefficient suggests that GDP per capita does not have a statistically significant effect on the likelihood of having campaign finance disclosure laws.

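On the probability scale, the marginal effect of public sector corruption (from the marginaleffects output, with log_gdp_percapita held at its mean) is negative at all three levels but shrinks in magnitude as corruption rises: -0.0142 at 20, -0.0097 at 50, and -0.0024 at 80, all statistically significant. A one-unit increase in corruption therefore lowers the predicted probability of having disclosure laws by about 1.4 percentage points at a corruption score of 20 but only about 0.2 percentage points at 80, consistent with the predicted probability already being low at high corruption levels. The marginal effect of log_gdp_percapita is not significant at any of the three levels.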
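
The reason the probability-scale effect changes with the corruption level, even though the log-odds coefficient is constant, can be seen from the derivative of the logistic function. In the notation below (introduced here for illustration), x_1 is public_sector_corruption, x_2 is log_gdp_percapita, and p is the predicted probability of having disclosure laws:

```latex
p = \frac{1}{1 + e^{-\eta}}, \qquad
\eta = \beta_0 + \beta_1 x_1 + \beta_2 x_2, \qquad
\frac{\partial p}{\partial x_1} = \beta_1 \, p \, (1 - p)
```

The factor p(1 - p) is largest at p = 0.5 and approaches zero as p approaches 0 or 1, so the same coefficient produces a smaller change in probability where the predicted probability is already close to zero.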

#TASK 5

# Fit a logistic regression model with interaction term
model_interaction <- glm(disclose_donations ~ public_sector_corruption * region + log_gdp_percapita, 
                         data = Civil_clean, 
                         family = binomial)

# Summarize the logistic regression model
summary(model_interaction)

## 
## Call:
## glm(formula = disclose_donations ~ public_sector_corruption * 
##     region + log_gdp_percapita, family = binomial, data = Civil_clean)
## 
## Coefficients:
##                                                                 Estimate
## (Intercept)                                                      3.21658
## public_sector_corruption                                        -0.06335
## regionLatin America and the Caribbean                           -2.47593
## regionMiddle East and North Africa                              -0.65585
## regionSub-Saharan Africa                                        -1.61845
## regionWestern Europe and North America                          -1.05205
## regionAsia and Pacific                                          -0.92044
## log_gdp_percapita                                                0.01539
## public_sector_corruption:regionLatin America and the Caribbean   0.02499
## public_sector_corruption:regionMiddle East and North Africa     -0.05436
## public_sector_corruption:regionSub-Saharan Africa               -0.02279
## public_sector_corruption:regionWestern Europe and North America -0.03829
## public_sector_corruption:regionAsia and Pacific                 -0.03145
##                                                                 Std. Error
## (Intercept)                                                        3.52849
## public_sector_corruption                                           0.02299
## regionLatin America and the Caribbean                              1.53294
## regionMiddle East and North Africa                                 2.36552
## regionSub-Saharan Africa                                           1.79649
## regionWestern Europe and North America                             1.57290
## regionAsia and Pacific                                             1.85141
## log_gdp_percapita                                                  0.34500
## public_sector_corruption:regionLatin America and the Caribbean     0.03045
## public_sector_corruption:regionMiddle East and North Africa        0.06720
## public_sector_corruption:regionSub-Saharan Africa                  0.03978
## public_sector_corruption:regionWestern Europe and North America    0.07872
## public_sector_corruption:regionAsia and Pacific                    0.04645
##                                                                 z value
## (Intercept)                                                       0.912
## public_sector_corruption                                         -2.755
## regionLatin America and the Caribbean                            -1.615
## regionMiddle East and North Africa                               -0.277
## regionSub-Saharan Africa                                         -0.901
## regionWestern Europe and North America                           -0.669
## regionAsia and Pacific                                           -0.497
## log_gdp_percapita                                                 0.045
## public_sector_corruption:regionLatin America and the Caribbean    0.821
## public_sector_corruption:regionMiddle East and North Africa      -0.809
## public_sector_corruption:regionSub-Saharan Africa                -0.573
## public_sector_corruption:regionWestern Europe and North America  -0.486
## public_sector_corruption:regionAsia and Pacific                  -0.677
##                                                                 Pr(>|z|)   
## (Intercept)                                                      0.36198   
## public_sector_corruption                                         0.00586 **
## regionLatin America and the Caribbean                            0.10628   
## regionMiddle East and North Africa                               0.78158   
## regionSub-Saharan Africa                                         0.36764   
## regionWestern Europe and North America                           0.50358   
## regionAsia and Pacific                                           0.61908   
## log_gdp_percapita                                                0.96442   
## public_sector_corruption:regionLatin America and the Caribbean   0.41188   
## public_sector_corruption:regionMiddle East and North Africa      0.41851   
## public_sector_corruption:regionSub-Saharan Africa                0.56671   
## public_sector_corruption:regionWestern Europe and North America  0.62666   
## public_sector_corruption:regionAsia and Pacific                  0.49833   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 217.79  on 167  degrees of freedom
## Residual deviance: 114.37  on 155  degrees of freedom
## AIC: 140.37
## 
## Number of Fisher Scoring iterations: 7

# Estimate marginal effects (slopes) of public_sector_corruption by region at 20, 50, and 80 using marginaleffects
regions_to_use <- unique(Civil_clean$region)
fitted_values_me <- slopes(model_interaction, 
                           variables = "public_sector_corruption",
                           newdata = datagrid(model = model_interaction, 
                                              public_sector_corruption = c(20, 50, 80),
                                              region = regions_to_use))
print(fitted_values_me)

## 
##                      Term public_sector_corruption
##  public_sector_corruption                       20
##  public_sector_corruption                       20
##  public_sector_corruption                       20
##  public_sector_corruption                       20
##  public_sector_corruption                       20
##  public_sector_corruption                       20
##  public_sector_corruption                       50
##  public_sector_corruption                       50
##  public_sector_corruption                       50
##  public_sector_corruption                       50
##  public_sector_corruption                       50
##  public_sector_corruption                       50
##  public_sector_corruption                       80
##  public_sector_corruption                       80
##  public_sector_corruption                       80
##  public_sector_corruption                       80
##  public_sector_corruption                       80
##  public_sector_corruption                       80
##                            region  Estimate Std. Error      z Pr(>|z|)   S
##  Latin America and the Caribbean  -0.009564   0.005210 -1.836  0.06641 3.9
##  Western Europe and North America -0.024973   0.021434 -1.165  0.24398 2.0
##  Sub-Saharan Africa               -0.021534   0.008314 -2.590  0.00960 6.7
##  Asia and Pacific                 -0.022099   0.008070 -2.739  0.00617 7.3
##  Eastern Europe and Central Asia  -0.006247   0.002719 -2.297  0.02162 5.5
##  Middle East and North Africa     -0.028602   0.013565 -2.109  0.03499 4.8
##  Latin America and the Caribbean  -0.007382   0.003718 -1.986  0.04707 4.4
##  Western Europe and North America -0.005563   0.011577 -0.481  0.63084 0.7
##  Sub-Saharan Africa               -0.005655   0.003460 -1.635  0.10212 3.3
##  Asia and Pacific                 -0.007775   0.004277 -1.818  0.06912 3.9
##  Eastern Europe and Central Asia  -0.015708   0.005697 -2.757  0.00583 7.4
##  Middle East and North Africa     -0.004458   0.004995 -0.893  0.37207 1.4
##  Latin America and the Caribbean  -0.003455   0.001449 -2.385  0.01708 5.9
##  Western Europe and North America -0.000295   0.001369 -0.216  0.82912 0.3
##  Sub-Saharan Africa               -0.000488   0.000698 -0.700  0.48390 1.0
##  Asia and Pacific                 -0.000540   0.000934 -0.578  0.56294 0.8
##  Eastern Europe and Central Asia  -0.008163   0.002924 -2.792  0.00525 7.6
##  Middle East and North Africa     -0.000141   0.000405 -0.348  0.72762 0.5
##      2.5 %    97.5 %
##  -0.019775  6.48e-04
##  -0.066984  1.70e-02
##  -0.037830 -5.24e-03
##  -0.037915 -6.28e-03
##  -0.011577 -9.17e-04
##  -0.055188 -2.02e-03
##  -0.014669 -9.55e-05
##  -0.028253  1.71e-02
##  -0.012436  1.13e-03
##  -0.016158  6.09e-04
##  -0.026873 -4.54e-03
##  -0.014247  5.33e-03
##  -0.006294 -6.16e-04
##  -0.002979  2.39e-03
##  -0.001856  8.79e-04
##  -0.002371  1.29e-03
##  -0.013894 -2.43e-03
##  -0.000935  6.53e-04
## 
## Columns: rowid, term, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, public_sector_corruption, region, predicted_lo, predicted_hi, predicted, log_gdp_percapita, disclose_donations 
## Type:  response

# Create a dataset for visualization
visualization_data <- datagrid(model = model_interaction, 
                               public_sector_corruption = seq(min(Civil_clean$public_sector_corruption), 
                                                              max(Civil_clean$public_sector_corruption), 
                                                              length.out = 100), 
                               region = regions_to_use)

# Calculate predicted probabilities
predicted_probs <- predictions(model_interaction, newdata = visualization_data)

# Convert predictions to a data frame for ggplot
predicted_probs_df <- as.data.frame(predicted_probs)

# Visualize the interaction effects using ggplot2
ggplot(predicted_probs_df, aes(x = public_sector_corruption, y = estimate, color = region)) +
  geom_line() +
  labs(title = "Interaction Effect of Public Sector Corruption and Region on Campaign Finance Disclosure Laws",
       x = "Public Sector Corruption",
       y = "Predicted Probability of Disclosure Laws") +
  theme_minimal()

# Estimate region-specific slopes of public_sector_corruption on the log-odds (link) scale using emtrends() from emmeans
marginal_effects_emm <- model_interaction %>%
  emtrends(~ region, var = "public_sector_corruption", 
           at = list(public_sector_corruption = c(20, 50, 80)))
print(summary(marginal_effects_emm))

##  region                           public_sector_corruption.trend     SE  df
##  Eastern Europe and Central Asia                         -0.0633 0.0230 Inf
##  Latin America and the Caribbean                         -0.0384 0.0217 Inf
##  Middle East and North Africa                            -0.1177 0.0635 Inf
##  Sub-Saharan Africa                                      -0.0861 0.0334 Inf
##  Western Europe and North America                        -0.1016 0.0766 Inf
##  Asia and Pacific                                        -0.0948 0.0427 Inf
##  asymp.LCL asymp.UCL
##    -0.1084  -0.01829
##    -0.0809   0.00414
##    -0.2421   0.00672
##    -0.1515  -0.02076
##    -0.2517   0.04847
##    -0.1785  -0.01113
## 
## Results are averaged over the levels of: public_sector_corruption 
## Confidence level used: 0.95

#RESULTS

The logistic regression model with an interaction between public sector corruption and region yields a negative corruption slope in every region, but it does not provide strong evidence that the slope differs across regions: none of the interaction terms is statistically significant (all p > 0.4), and the AIC (140.37) is slightly higher than that of the model without region (137.3). On the log-odds scale, the slopes estimated by emtrends range from -0.0384 in Latin America and the Caribbean to -0.1177 in the Middle East and North Africa, with -0.1016 for Western Europe and North America, -0.0948 for Asia and Pacific, -0.0861 for Sub-Saharan Africa, and -0.0633 for Eastern Europe and Central Asia. The 95% confidence intervals exclude zero for Eastern Europe and Central Asia, Sub-Saharan Africa, and Asia and Pacific; they include zero for Latin America and the Caribbean, the Middle East and North Africa, and Western Europe and North America, where the estimates are imprecise. On the probability scale, the marginaleffects output shows that the effect of corruption depends on the starting level: in five of the six regions it is largest in magnitude at a corruption score of 20 and shrinks toward zero by 80, while in Eastern Europe and Central Asia it is largest at 50 (-0.0157).
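
The region-specific slopes reported by emtrends can be reproduced from the coefficient table: the slope for the reference region (Eastern Europe and Central Asia, the first factor level) is the main effect of public_sector_corruption, and every other region adds its interaction coefficient. A sketch using the estimates copied from summary(model_interaction); the object names are illustrative:

```r
# Main effect of corruption = slope in the reference region
base_slope <- -0.06335

# Interaction coefficients copied from summary(model_interaction) above
interaction <- c("Latin America and the Caribbean"  =  0.02499,
                 "Middle East and North Africa"     = -0.05436,
                 "Sub-Saharan Africa"               = -0.02279,
                 "Western Europe and North America" = -0.03829,
                 "Asia and Pacific"                 = -0.03145)

# Slope on the log-odds scale for each region
region_slopes <- c("Eastern Europe and Central Asia" = base_slope,
                   base_slope + interaction)

print(round(region_slopes, 4))
```

These are slopes on the log-odds scale, so they are not directly comparable to the probability-scale marginal effects from the marginaleffects output.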

Because the regional differences are estimated imprecisely, any policy implications should be read as tentative. The consistently negative association suggests that, in every region, higher corruption goes together with a lower likelihood of having campaign finance disclosure laws, so anti-corruption measures and campaign finance reforms are plausibly complementary. If the larger point estimates for regions such as the Middle East and North Africa and the Asia and Pacific reflect real differences, policymakers there may need interventions tailored to the specific challenges and institutional weaknesses of those regions, and international support and monitoring may help to establish and enforce disclosure laws effectively. Confirming which regions are genuinely higher risk would require more data or a more parsimonious specification than the one estimated here.

ziyi su

2024-08-07
