Problem Set 6

# Display the first few rows and structure of the dataset
head(corruption)

##   country_name country_text_id year                           region
## 1       Mexico             MEX 2020  Latin America and the Caribbean
## 2     Suriname             SUR 2020  Latin America and the Caribbean
## 3       Sweden             SWE 2020 Western Europe and North America
## 4  Switzerland             CHE 2020 Western Europe and North America
## 5        Ghana             GHA 2020               Sub-Saharan Africa
## 6 South Africa             ZAF 2020               Sub-Saharan Africa
##   disclose_donations_ord public_sector_corruption polyarchy civil_liberties
## 1                      3                     48.8      64.7            71.2
## 2                      1                     24.8      76.1            87.7
## 3                      2                      1.3      90.8            96.9
## 4                      0                      1.4      89.4            94.8
## 5                      2                     65.2      72.0            90.4
## 6                      1                     57.1      70.3            82.2
##   disclose_donations iso2c population gdp_percapita     capital longitude
## 1               TRUE    MX  128932753      8922.612 Mexico City  -99.1276
## 2              FALSE    SR     586634      7529.614  Paramaribo  -55.1679
## 3              FALSE    SE   10353442     51541.656   Stockholm   18.0645
## 4              FALSE    CH    8636561     85685.290        Bern   7.44821
## 5              FALSE    GH   31072945      2020.624       Accra  -0.20795
## 6              FALSE    ZA   59308690      5659.207    Pretoria   28.1871
##   latitude              income log_gdp_percapita
## 1   19.427 Upper middle income          9.096344
## 2   5.8232 Upper middle income          8.926599
## 3  59.3327         High income         10.850146
## 4   46.948         High income         11.358436
## 5  5.57045 Lower middle income          7.611162
## 6  -25.746 Upper middle income          8.641039

str(corruption)

## 'data.frame':    168 obs. of  17 variables:
##  $ country_name            : chr  "Mexico" "Suriname" "Sweden" "Switzerland" ...
##  $ country_text_id         : chr  "MEX" "SUR" "SWE" "CHE" ...
##  $ year                    : num  2020 2020 2020 2020 2020 2020 2020 2020 2020 2020 ...
##  $ region                  : Factor w/ 6 levels "Eastern Europe and Central Asia",..: 2 2 5 5 4 4 6 6 1 1 ...
##  $ disclose_donations_ord  : num  3 1 2 0 2 1 3 2 3 2 ...
##  $ public_sector_corruption: num  48.8 24.8 1.3 1.4 65.2 57.1 3.7 36.8 70.6 71.2 ...
##  $ polyarchy               : num  64.7 76.1 90.8 89.4 72 70.3 83.2 43.6 26.2 48.5 ...
##  $ civil_liberties         : num  71.2 87.7 96.9 94.8 90.4 82.2 92.8 56.9 43 85.5 ...
##  $ disclose_donations      : logi  TRUE FALSE FALSE FALSE FALSE FALSE ...
##  $ iso2c                   : chr  "MX" "SR" "SE" "CH" ...
##  $ population              : num  1.29e+08 5.87e+05 1.04e+07 8.64e+06 3.11e+07 ...
##   ..- attr(*, "label")= chr "Population, total"
##  $ gdp_percapita           : num  8923 7530 51542 85685 2021 ...
##   ..- attr(*, "label")= chr "GDP per capita (constant 2015 US$)"
##  $ capital                 : chr  "Mexico City" "Paramaribo" "Stockholm" "Bern" ...
##  $ longitude               : chr  "-99.1276" "-55.1679" "18.0645" "7.44821" ...
##  $ latitude                : chr  "19.427" "5.8232" "59.3327" "46.948" ...
##  $ income                  : chr  "Upper middle income" "Upper middle income" "High income" "High income" ...
##  $ log_gdp_percapita       : num  9.1 8.93 10.85 11.36 7.61 ...
##   ..- attr(*, "label")= chr "GDP per capita (constant 2015 US$)"

TASK 1 Using the Civil dataset, perform a simple linear regression with public_sector_corruption as the dependent variable and polyarchy as the independent variable. Visualize the relationship with a scatter plot and overlay the regression line. Use the sjPlot package to create regression tables and interpret the results.

# Load necessary libraries
library(ggplot2)
library(sjPlot)

# Perform the linear regression
model <- lm(public_sector_corruption ~ polyarchy, data = corruption)

# Summarize the regression results
summary(model)

## 
## Call:
## lm(formula = public_sector_corruption ~ polyarchy, data = corruption)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -69.498 -14.334   1.448  16.985  44.436 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 89.44444    3.95373   22.62   <2e-16 ***
## polyarchy   -0.82641    0.06786  -12.18   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 21.68 on 166 degrees of freedom
## Multiple R-squared:  0.4718, Adjusted R-squared:  0.4686 
## F-statistic: 148.3 on 1 and 166 DF,  p-value: < 2.2e-16

# Scatter plot with regression line
ggplot(corruption, aes(x = polyarchy, y = public_sector_corruption)) +
  geom_point() +  # Scatter plot
  geom_smooth(method = "lm", col = "blue") +  # Regression line
  labs(title = "Relationship between Polyarchy and Public Sector Corruption",
       x = "Polyarchy",
       y = "Public Sector Corruption")

## `geom_smooth()` using formula = 'y ~ x'

# Regression table
tab_model(model, show.ci = TRUE, show.se = TRUE, show.p = TRUE)

	public_sector_corruption
Predictors	Estimates	std. Error	CI	p
(Intercept)	89.44	3.95	-Inf – Inf	<0.001
polyarchy	-0.83	0.07	-Inf – Inf	<0.001
Observations	168
R² / R² adjusted	0.472 / 0.469

The regression results show an intercept of 89.44 (standard error 3.95, p < 0.001), which is the predicted level of public sector corruption when polyarchy equals zero. Polyarchy, the model’s main predictor, has a coefficient of -0.83 (standard error 0.07, p < 0.001): each one-unit increase in polyarchy is associated with a 0.83-unit decrease in public sector corruption, and the relationship is statistically significant. The R² of 0.472 indicates that the model explains 47.2% of the variance in public sector corruption; the adjusted R² is slightly lower at 0.469. The analysis included 168 observations. However, the table reports the confidence intervals for both the intercept and the polyarchy coefficient as -Inf to Inf. Because the estimates and standard errors are finite, this points to a problem in how the table calculated the intervals rather than to a problem with the model itself; calling confint(model) on the fitted object is one way to obtain the intervals directly.
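As a check on the table, the 95% intervals can be rebuilt by hand from the estimates and standard errors printed by summary(model). This is a minimal sketch that hardcodes those printed values instead of refitting the model; the object names est, se, df_resid, and ci are illustrative and not part of the original analysis.

```r
# Estimates and standard errors copied from the summary(model) output
# (hardcoded here as an assumption; the model is not refit)
est <- c("(Intercept)" = 89.44444, polyarchy = -0.82641)
se  <- c("(Intercept)" = 3.95373,  polyarchy = 0.06786)
df_resid <- 166  # residual degrees of freedom reported by summary(model)

# 95% t-based interval: estimate +/- critical value * standard error
t_crit <- qt(0.975, df = df_resid)
ci <- cbind(lower = est - t_crit * se,
            upper = est + t_crit * se)
round(ci, 2)
```

The polyarchy interval comes out at roughly -0.96 to -0.69, which excludes zero and agrees with the small p-value.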

TASK 2 Extend the model from Task 1 by adding a quadratic term for polyarchy to capture potential non-linear relationships. Visualize the polynomial relationship using ggplot2. Calculate the marginal effects of polyarchy at different levels (30, 60, 90) using both manual calculations and the marginaleffects package. Interpret the results.

# Extend the model with a quadratic term
model_quad <- lm(public_sector_corruption ~ polyarchy + I(polyarchy^2), data = corruption)

# Summarize the extended model
summary(model_quad)

## 
## Call:
## lm(formula = public_sector_corruption ~ polyarchy + I(polyarchy^2), 
##     data = corruption)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -64.462  -7.475   1.418  14.107  35.187 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    53.116104   7.019792   7.567 2.55e-12 ***
## polyarchy       0.974653   0.305335   3.192  0.00169 ** 
## I(polyarchy^2) -0.017310   0.002874  -6.023 1.08e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 19.69 on 165 degrees of freedom
## Multiple R-squared:  0.567,  Adjusted R-squared:  0.5618 
## F-statistic:   108 on 2 and 165 DF,  p-value: < 2.2e-16

# Load necessary libraries
library(ggplot2)
library(marginaleffects)

# Create the scatter plot with polynomial regression line
ggplot(corruption, aes(x = polyarchy, y = public_sector_corruption)) +
  geom_point() +  # Scatter plot
  stat_smooth(method = "lm", formula = y ~ poly(x, 2), col = "blue") +  # Polynomial regression line
  labs(title = "Polynomial Relationship between Polyarchy and Public Sector Corruption",
       x = "Polyarchy",
       y = "Public Sector Corruption")

# Coefficients from the quadratic model
coef_linear <- coef(model_quad)["polyarchy"]
coef_quadratic <- coef(model_quad)["I(polyarchy^2)"]

# Manual marginal effects calculation
marginal_effect_30 <- coef_linear + 2 * coef_quadratic * 30
marginal_effect_60 <- coef_linear + 2 * coef_quadratic * 60
marginal_effect_90 <- coef_linear + 2 * coef_quadratic * 90

# Display the results
marginal_effect_30

##   polyarchy 
## -0.06392759

marginal_effect_60

## polyarchy 
## -1.102508

marginal_effect_90

## polyarchy 
## -2.141088

# Calculate marginal effects using marginaleffects package
marginal_effects <- slopes(model_quad, newdata = data.frame(polyarchy = c(30, 60, 90)))

# Display the marginal effects
summary(marginal_effects)

##      rowid         term              estimate          std.error      
##  Min.   :1.0   Length:3           Min.   :-2.14109   Min.   :0.07681  
##  1st Qu.:1.5   Class :character   1st Qu.:-1.62180   1st Qu.:0.10882  
##  Median :2.0   Mode  :character   Median :-1.10251   Median :0.14083  
##  Mean   :2.0                      Mean   :-1.10251   Mean   :0.14815  
##  3rd Qu.:2.5                      3rd Qu.:-0.58322   3rd Qu.:0.18381  
##  Max.   :3.0                      Max.   :-0.06393   Max.   :0.22680  
##    statistic           p.value          s.value            conf.low      
##  Min.   :-14.3535   Min.   :0.0000   Min.   :  0.6218   Min.   :-2.5856  
##  1st Qu.:-11.8970   1st Qu.:0.0000   1st Qu.: 34.2455   1st Qu.:-1.9193  
##  Median : -9.4405   Median :0.0000   Median : 67.8693   Median :-1.2531  
##  Mean   : -8.0827   Mean   :0.2166   Mean   : 73.7606   Mean   :-1.3929  
##  3rd Qu.: -4.9472   3rd Qu.:0.3249   3rd Qu.:110.3301   3rd Qu.:-0.7965  
##  Max.   : -0.4539   Max.   :0.6499   Max.   :152.7908   Max.   :-0.3399  
##    conf.high        predicted_lo      predicted_hi       predicted      
##  Min.   :-1.6966   Min.   : 0.6361   Min.   : 0.6169   Min.   : 0.6265  
##  1st Qu.:-1.3243   1st Qu.:24.9607   1st Qu.:24.9462   1st Qu.:24.9535  
##  Median :-0.9520   Median :49.2854   Median :49.2755   Median :49.2805  
##  Mean   :-0.8121   Mean   :38.8996   Mean   :38.8897   Mean   :38.8946  
##  3rd Qu.:-0.3699   3rd Qu.:58.0313   3rd Qu.:58.0261   3rd Qu.:58.0287  
##  Max.   : 0.2121   Max.   :66.7773   Max.   :66.7767   Max.   :66.7770  
##    polyarchy  public_sector_corruption
##  Min.   :30   Min.   :48.8            
##  1st Qu.:45   1st Qu.:48.8            
##  Median :60   Median :48.8            
##  Mean   :60   Mean   :48.8            
##  3rd Qu.:75   3rd Qu.:48.8            
##  Max.   :90   Max.   :48.8

The marginal effects calculated manually at polyarchy levels of 30, 60, and 90 show that the slope changes along the curve. At polyarchy = 30, the marginal effect is approximately -0.064, a negligible negative slope; the marginaleffects output gives a p-value of about 0.65 and a confidence interval that includes zero (-0.34 to 0.21), so at low levels the effect is not statistically distinguishable from zero. At polyarchy = 60, the marginal effect is -1.103, a clearly negative relationship at medium levels. At polyarchy = 90, the marginal effect is -2.141, the steepest of the three. The marginaleffects package returns the same three estimates (-0.064, -1.103, -2.141). The marginal effects become more negative as polyarchy rises, so an additional unit of polyarchy is associated with a larger drop in public sector corruption in countries that already score high on polyarchy.
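The manual calculation rests on the derivative of the quadratic fit: for corruption = b0 + b1·polyarchy + b2·polyarchy², the slope at a given polyarchy value is b1 + 2·b2·polyarchy. The sketch below wraps that formula in a small helper and also solves for the point where the slope is zero. The coefficients are copied (rounded) from the summary(model_quad) output, and the names slope_at, slopes_manual, and turning_point are hypothetical.

```r
# Coefficients copied from the summary(model_quad) output
# (rounded as printed, so results can differ in the 4th decimal place)
b1 <- 0.974653    # polyarchy
b2 <- -0.017310   # I(polyarchy^2)

# Derivative of b0 + b1*x + b2*x^2 with respect to x
slope_at <- function(x) b1 + 2 * b2 * x

slopes_manual <- slope_at(c(30, 60, 90))
round(slopes_manual, 3)

# Polyarchy value at which the fitted curve peaks (slope equals zero)
turning_point <- -b1 / (2 * b2)
round(turning_point, 1)
```

The turning point falls at a polyarchy score of about 28, which is why the slope at 30 is close to zero while the slopes at 60 and 90 are clearly negative.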

TASK 3 Using the Civil dataset, fit a logistic regression model predicting the presence of campaign finance disclosure laws (disclose_donations) with public_sector_corruption and log_gdp_percapita as predictors. Use the sjPlot package to create regression tables and interpret the results.

# Fit the logistic regression model
model_logistic <- glm(disclose_donations ~ public_sector_corruption + log_gdp_percapita, 
                      data = corruption, 
                      family = binomial)

# Summarize the model
summary(model_logistic)

## 
## Call:
## glm(formula = disclose_donations ~ public_sector_corruption + 
##     log_gdp_percapita, family = binomial, data = corruption)
## 
## Coefficients:
##                          Estimate Std. Error z value Pr(>|z|)    
## (Intercept)              -0.50466    2.18953  -0.230    0.818    
## public_sector_corruption -0.05964    0.01191  -5.007 5.54e-07 ***
## log_gdp_percapita         0.24907    0.21785   1.143    0.253    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 217.79  on 167  degrees of freedom
## Residual deviance: 131.30  on 165  degrees of freedom
## AIC: 137.3
## 
## Number of Fisher Scoring iterations: 5

# Create the regression table
tab_model(model_logistic, show.ci = TRUE, show.se = TRUE, show.p = TRUE)

	disclose_donations
Predictors	Odds Ratios	std. Error	CI	p
(Intercept)	0.60	1.32	0.00 – Inf	0.818
public_sector_corruption	0.94	0.01	0.00 – Inf	<0.001
GDP per capita (constant 2015 US$)	1.28	0.28	0.00 – Inf	0.253
Observations	168
R² Tjur	0.454

The logistic regression model shows that the intercept has an odds ratio of 0.60, a standard error of 1.32, and a p-value of 0.818. This implies that when both public sector corruption and log GDP per capita are zero, the odds of having campaign finance disclosure laws are below one (0.60), but the intercept is not statistically significant and describes a combination of values far from the observed data. The odds ratio for public sector corruption is 0.94, which means that each one-unit increase in public sector corruption is associated with about a 6% decrease in the odds of having campaign finance disclosure laws, holding log GDP per capita constant. This estimate has a standard error of 0.01 and a p-value below 0.001, so the relationship is statistically significant. The odds ratio for log GDP per capita is 1.28, meaning that a one-unit increase in log GDP per capita is associated with 28% higher odds of disclosure laws; with a standard error of 0.28 and a p-value of 0.253, this predictor is not statistically significant. Tjur’s R² is 0.454, which is the difference between the average predicted probability for countries with disclosure laws and for countries without them; it measures how well the model separates the two groups and is not a share of variance explained. The table reports every confidence interval as 0.00 to Inf. Since the estimates and standard errors are finite, this looks like a problem in how the table calculated the intervals rather than a problem with the model’s fit.
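The odds ratios in the table are the exponentiated log-odds coefficients from summary(model_logistic), and approximate Wald intervals can be obtained the same way by exponentiating the estimate plus or minus 1.96 standard errors. The sketch below does this with hardcoded values copied from the printed output; est, se, and or_table are illustrative names.

```r
# Log-odds estimates and standard errors copied from summary(model_logistic)
# (hardcoded as an assumption; the model is not refit)
est <- c("(Intercept)" = -0.50466,
         public_sector_corruption = -0.05964,
         log_gdp_percapita = 0.24907)
se  <- c("(Intercept)" = 2.18953,
         public_sector_corruption = 0.01191,
         log_gdp_percapita = 0.21785)

# Exponentiate the estimate and the Wald limits to get odds ratios
z_crit <- qnorm(0.975)
or_table <- cbind(odds_ratio = exp(est),
                  lower = exp(est - z_crit * se),
                  upper = exp(est + z_crit * se))
round(or_table, 2)
```

The interval for public sector corruption is roughly 0.92 to 0.96 and excludes one, while the interval for log GDP per capita includes one, in line with the p-values. With the fitted object, confint.default(model_logistic) gives the Wald limits on the log-odds scale; profile-likelihood intervals from confint() may differ slightly.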

TASK 4 Calculate the marginal effects of public_sector_corruption from the logistic regression model in Task 3 at representative values (20, 50, 80). Use the marginaleffects and emmeans packages to compute these effects. Visualize the predicted probabilities of having campaign finance disclosure laws across a range of public_sector_corruption values using ggplot2.

# Evaluation grid: corruption at 20, 50, and 80, with log GDP per capita held at its mean
newdata_for_slopes <- data.frame(public_sector_corruption = c(20, 50, 80),
                                 log_gdp_percapita = mean(corruption$log_gdp_percapita))

# Calculate marginal effects using slopes(); with no `variables` argument
# this returns slopes for both predictors (6 rows in total)
marginal_effects_slopes <- slopes(model_logistic, newdata = newdata_for_slopes)
summary(marginal_effects_slopes)

##      rowid          term              estimate           std.error        
##  Min.   :1.00   Length:6           Min.   :-0.014221   Min.   :0.0008188  
##  1st Qu.:1.25   Class :character   1st Qu.:-0.007893   1st Qu.:0.0018280  
##  Median :2.00   Mode  :character   Median : 0.003760   Median :0.0070514  
##  Mean   :2.00                      Mean   : 0.013936   Mean   :0.0179391  
##  3rd Qu.:2.75                      3rd Qu.: 0.032965   3rd Qu.:0.0307530  
##  Max.   :3.00                      Max.   : 0.059394   Max.   :0.0539726  
##    statistic         p.value             s.value          conf.low        
##  Min.   :-6.060   Min.   :0.0000000   Min.   : 1.321   Min.   :-0.046391  
##  1st Qu.:-5.134   1st Qu.:0.0009582   1st Qu.: 1.876   1st Qu.:-0.028727  
##  Median :-1.025   Median :0.1374871   Median : 4.955   Median :-0.015987  
##  Mean   :-1.966   Mean   :0.1580304   Mean   :11.739   Mean   :-0.021224  
##  3rd Qu.: 1.033   3rd Qu.:0.2724973   3rd Qu.:22.915   3rd Qu.:-0.013022  
##  Max.   : 1.100   Max.   :0.4002592   Max.   :29.451   Max.   :-0.003973  
##    conf.high          predicted_lo      predicted_hi       predicted      
##  Min.   :-0.009621   Min.   :0.04141   Min.   :0.04141   Min.   :0.04142  
##  1st Qu.:-0.005059   1st Qu.:0.08243   1st Qu.:0.08241   1st Qu.:0.08242  
##  Median : 0.016083   Median :0.20546   Median :0.20542   Median :0.20544  
##  Mean   : 0.049096   Mean   :0.28477   Mean   :0.28474   Mean   :0.28476  
##  3rd Qu.: 0.093239   3rd Qu.:0.50692   3rd Qu.:0.50687   3rd Qu.:0.50692  
##  Max.   : 0.165178   Max.   :0.60749   Max.   :0.60743   Max.   :0.60742  
##  public_sector_corruption log_gdp_percapita disclose_donations
##  Min.   :20.0             Min.   :8.567     Mode:logical      
##  1st Qu.:27.5             1st Qu.:8.567     TRUE:6            
##  Median :50.0             Median :8.567                       
##  Mean   :50.0             Mean   :8.567                       
##  3rd Qu.:72.5             3rd Qu.:8.567                       
##  Max.   :80.0             Max.   :8.567

# Load the emmeans package
library(emmeans)

## Welcome to emmeans.
## Caution: You lose important information if you filter this package's results.
## See '? untidy'

# Estimated marginal means at the same three corruption values using emmeans
# (these are predictions on the logit scale, not slopes)
em_marginal_effects <- emmeans(model_logistic, ~ public_sector_corruption, at = list(public_sector_corruption = c(20, 50, 80)))

# Summarize the results
summary(em_marginal_effects)

##  public_sector_corruption emmean    SE  df asymp.LCL asymp.UCL
##                        20  0.436 0.300 Inf    -0.152     1.024
##                        50 -1.353 0.279 Inf    -1.899    -0.806
##                        80 -3.142 0.567 Inf    -4.252    -2.031
## 
## Results are given on the logit (not the response) scale. 
## Confidence level used: 0.95
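Because emmeans reports predictions on the logit scale, they need to be converted before they can be compared with the slopes() results. The sketch below applies the inverse logit to the three printed values and then uses the standard logistic-regression identity that the slope of the probability is b·p·(1 − p). All numbers are copied from the printed output; logit_hat, b_corruption, p_hat, and me_prob are hypothetical names.

```r
# Logit-scale predictions copied from the emmeans output
logit_hat <- c("20" = 0.436, "50" = -1.353, "80" = -3.142)
# Log-odds coefficient on public_sector_corruption from summary(model_logistic)
b_corruption <- -0.05964

# Inverse logit turns log-odds into predicted probabilities
p_hat <- plogis(logit_hat)

# Slope of the probability with respect to corruption: b * p * (1 - p)
me_prob <- b_corruption * p_hat * (1 - p_hat)

round(cbind(probability = p_hat, marginal_effect = me_prob), 4)
```

The probabilities are about 0.61, 0.21, and 0.04, which line up with the predicted column in the slopes() summary, and the slope at corruption = 20 (about -0.014) matches the minimum estimate there. The marginal effect shrinks toward zero at high corruption because the predicted probability is already close to zero.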

# Generate a sequence of public_sector_corruption values for prediction
public_sector_corruption_seq <- data.frame(public_sector_corruption = seq(0, 100, by = 1),
                                           log_gdp_percapita = mean(corruption$log_gdp_percapita))

# Predict probabilities using the logistic model
predicted_probs <- predict(model_logistic, newdata = public_sector_corruption_seq, type = "response")

# Combine the predictions with the sequence of public_sector_corruption values
prediction_data <- cbind(public_sector_corruption_seq, predicted_probs)

# Plot the predicted probabilities using ggplot2
library(ggplot2)
ggplot(prediction_data, aes(x = public_sector_corruption, y = predicted_probs)) +
  geom_line(color = "blue") +
  labs(title = "Predicted Probabilities of Campaign Finance Disclosure Laws",
       x = "Public Sector Corruption",
       y = "Predicted Probability") +
  theme_minimal()

TASK 5 Explore the interaction effect between public_sector_corruption and region in the logistic regression model from Task 3. Use the datagrid() function from the marginaleffects package to create a dataset with representative values for regions. Fit the logistic regression model with the interaction term and visualize the interaction effects using ggplot2. Interpret the results and discuss the implications of the interaction effect.

# Create a dataset with representative values for regions using datagrid()
# (region is not a predictor in model_logistic, which triggers the warning below)
representative_data <- datagrid(model = model_logistic, 
                                region = unique(corruption$region),
                                public_sector_corruption = seq(0, 100, by = 10))

## Warning: Some of the variable names are missing from the model data: region

# Display the first few rows of the representative dataset
head(representative_data)

##   log_gdp_percapita                          region public_sector_corruption
## 1          8.567353 Latin America and the Caribbean                        0
## 2          8.567353 Latin America and the Caribbean                       10
## 3          8.567353 Latin America and the Caribbean                       20
## 4          8.567353 Latin America and the Caribbean                       30
## 5          8.567353 Latin America and the Caribbean                       40
## 6          8.567353 Latin America and the Caribbean                       50
##   rowid
## 1     1
## 2     2
## 3     3
## 4     4
## 5     5
## 6     6

# Fit the logistic regression model with the interaction term
model_interaction <- glm(disclose_donations ~ public_sector_corruption * region + log_gdp_percapita,
                         data = corruption, 
                         family = binomial)

# Summarize the model
summary(model_interaction)

## 
## Call:
## glm(formula = disclose_donations ~ public_sector_corruption * 
##     region + log_gdp_percapita, family = binomial, data = corruption)
## 
## Coefficients:
##                                                                 Estimate
## (Intercept)                                                      3.21658
## public_sector_corruption                                        -0.06335
## regionLatin America and the Caribbean                           -2.47593
## regionMiddle East and North Africa                              -0.65585
## regionSub-Saharan Africa                                        -1.61845
## regionWestern Europe and North America                          -1.05205
## regionAsia and Pacific                                          -0.92044
## log_gdp_percapita                                                0.01539
## public_sector_corruption:regionLatin America and the Caribbean   0.02499
## public_sector_corruption:regionMiddle East and North Africa     -0.05436
## public_sector_corruption:regionSub-Saharan Africa               -0.02279
## public_sector_corruption:regionWestern Europe and North America -0.03829
## public_sector_corruption:regionAsia and Pacific                 -0.03145
##                                                                 Std. Error
## (Intercept)                                                        3.52849
## public_sector_corruption                                           0.02299
## regionLatin America and the Caribbean                              1.53294
## regionMiddle East and North Africa                                 2.36552
## regionSub-Saharan Africa                                           1.79649
## regionWestern Europe and North America                             1.57290
## regionAsia and Pacific                                             1.85141
## log_gdp_percapita                                                  0.34500
## public_sector_corruption:regionLatin America and the Caribbean     0.03045
## public_sector_corruption:regionMiddle East and North Africa        0.06720
## public_sector_corruption:regionSub-Saharan Africa                  0.03978
## public_sector_corruption:regionWestern Europe and North America    0.07872
## public_sector_corruption:regionAsia and Pacific                    0.04645
##                                                                 z value
## (Intercept)                                                       0.912
## public_sector_corruption                                         -2.755
## regionLatin America and the Caribbean                            -1.615
## regionMiddle East and North Africa                               -0.277
## regionSub-Saharan Africa                                         -0.901
## regionWestern Europe and North America                           -0.669
## regionAsia and Pacific                                           -0.497
## log_gdp_percapita                                                 0.045
## public_sector_corruption:regionLatin America and the Caribbean    0.821
## public_sector_corruption:regionMiddle East and North Africa      -0.809
## public_sector_corruption:regionSub-Saharan Africa                -0.573
## public_sector_corruption:regionWestern Europe and North America  -0.486
## public_sector_corruption:regionAsia and Pacific                  -0.677
##                                                                 Pr(>|z|)   
## (Intercept)                                                      0.36198   
## public_sector_corruption                                         0.00586 **
## regionLatin America and the Caribbean                            0.10628   
## regionMiddle East and North Africa                               0.78158   
## regionSub-Saharan Africa                                         0.36764   
## regionWestern Europe and North America                           0.50358   
## regionAsia and Pacific                                           0.61908   
## log_gdp_percapita                                                0.96442   
## public_sector_corruption:regionLatin America and the Caribbean   0.41188   
## public_sector_corruption:regionMiddle East and North Africa      0.41851   
## public_sector_corruption:regionSub-Saharan Africa                0.56671   
## public_sector_corruption:regionWestern Europe and North America  0.62666   
## public_sector_corruption:regionAsia and Pacific                  0.49833   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 217.79  on 167  degrees of freedom
## Residual deviance: 114.37  on 155  degrees of freedom
## AIC: 140.37
## 
## Number of Fisher Scoring iterations: 7

# Create the regression table
tab_model(model_interaction, show.ci = TRUE, show.se = TRUE, show.p = TRUE)

	disclose_donations
Predictors	Odds Ratios	std. Error	CI	p
(Intercept)	24.94	88.01	0.00 – Inf	0.362
public_sector_corruption	0.94	0.02	0.00 – Inf	0.006
region: Latin America and the Caribbean	0.08	0.13	0.00 – Inf	0.106
region: Middle East and North Africa	0.52	1.23	0.00 – Inf	0.782
region: Sub-Saharan Africa	0.20	0.36	0.00 – Inf	0.368
region: Western Europe and North America	0.35	0.55	0.00 – Inf	0.504
region: Asia and Pacific	0.40	0.74	0.00 – Inf	0.619
GDP per capita (constant 2015 US$)	1.02	0.35	0.00 – Inf	0.964
public_sector_corruption:regionLatin America and the Caribbean	1.03	0.03	0.00 – Inf	0.412
public_sector_corruption:regionMiddle East and North Africa	0.95	0.06	0.00 – Inf	0.419
public_sector_corruption:regionSub-Saharan Africa	0.98	0.04	0.00 – Inf	0.567
public_sector_corruption:regionWestern Europe and North America	0.96	0.08	0.00 – Inf	0.627
public_sector_corruption:regionAsia and Pacific	0.97	0.05	0.00 – Inf	0.498
Observations	168
R² Tjur	0.521

# Predict probabilities using the interaction model for the representative data
predicted_probs_interaction <- predict(model_interaction, newdata = representative_data, type = "response")

# Combine the predictions with the representative dataset
interaction_data <- cbind(representative_data, predicted_probs_interaction)

# Plot the interaction effects using ggplot2
library(ggplot2)
ggplot(interaction_data, aes(x = public_sector_corruption, y = predicted_probs_interaction, color = region)) +
  geom_line() +
  labs(title = "Interaction Effect of Public Sector Corruption and Region on Campaign Finance Disclosure Laws",
       x = "Public Sector Corruption",
       y = "Predicted Probability",
       color = "Region") +
  theme_minimal()

In the interaction model, the coefficient on public sector corruption has an odds ratio of 0.94 (p = 0.006). Because the model includes the interaction with region, this is the effect in the reference region, Eastern Europe and Central Asia: there, each one-unit increase in public sector corruption is associated with about a 6% decrease in the odds of having campaign finance disclosure laws. None of the interaction terms between public sector corruption and region is statistically significant (p-values range from 0.41 to 0.63), so the model provides no evidence that the effect of corruption on the likelihood of having disclosure laws differs across regions. The interaction model also has a higher AIC than the Task 3 model (140.37 versus 137.3), which suggests that the added region terms do not improve the model enough to justify the extra parameters.

The visualization of the interaction effects supports these findings. As public sector corruption increases, the predicted probability of having campaign finance disclosure laws decreases, and this trend is observed across all regions. The lines slope downward in a broadly similar way. For example, in Eastern Europe and Central Asia, the probability begins high but gradually declines as corruption increases, similar to the patterns in other regions such as Latin America and the Caribbean and Western Europe and North America. Overall, while corruption is associated with a lower likelihood of disclosure laws, the data do not show that this relationship depends on region.
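One way to test the region terms jointly, instead of reading the individual p-values, is a likelihood-ratio test that compares the residual deviance of the Task 3 model with that of the interaction model. The sketch below uses the deviances and degrees of freedom printed in the two summaries; the object names are illustrative. Note that the interaction model adds both the region main effects and the interaction terms (10 parameters in total), so this tests all of them together, not the interactions alone.

```r
# Residual deviances and degrees of freedom copied from the two summaries
dev_main <- 131.30; df_main <- 165   # model_logistic (Task 3)
dev_int  <- 114.37; df_int  <- 155   # model_interaction (Task 5)

# Likelihood-ratio statistic: drop in deviance from adding the region terms
lr_stat <- dev_main - dev_int
lr_df   <- df_main - df_int

# Upper-tail chi-squared probability
lr_p <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)
c(statistic = lr_stat, df = lr_df, p_value = round(lr_p, 3))
```

The p-value falls between 0.05 and 0.10, so the region terms as a group are not significant at the 5% level. With the fitted objects available, anova(model_logistic, model_interaction, test = "Chisq") carries out the same comparison.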

Anum Peshimam

2024-08-06