Please use this R Markdown template to report your code, output, and written answers in a single document. Remember that you may not collaborate with others on exams. Please ask clarifying questions on CampusWire. Turn in your completed exam by uploading the compiled html or pdf file to Brightspace by 9:30 AM EST on Thursday, April 7. Make sure to comment your code (using the # key). Report results in the correct units of measurement. Do not report more than two digits to the right of the decimal point.

Name: Meghla Srabon

TA: Alejandro

Understanding the Rise of the Nazi Party in Germany

Who brought the Nazis to power? Researchers have attempted to answer this question by analyzing aggregate election data from the 1932 German election.

This exercise is based on the following article: King, Gary, Ori Rosen, Martin Tanner, Alexander F. Wagner. 2008. “Ordinary Economic Voting Behavior in the Extraordinary Election of Adolf Hitler.” Journal of Economic History 68(4): 951-996.

We analyze a simplified version of this article’s data, which records, for each precinct in Germany, the number of eligible voters as well as the number of votes for the Nazi party in 1932. In addition, the data set contains the aggregate occupation statistics for each precinct. The table below presents the variable names and descriptions of the data file nazis.csv. Each observation represents a German precinct.

| Name | Description |
|---|---|
| shareself | Proportion of self-employed potential voters |
| shareblue | Proportion of blue-collar potential voters |
| sharewhite | Proportion of white-collar potential voters |
| sharedomestic | Proportion of domestically employed potential voters |
| shareunemployed | Proportion of unemployed potential voters |
| nvoter | Number of eligible voters |
| nazivote | Number of votes for Nazis |

The goal of the analysis is to investigate which types of voters (based on their occupation category) cast ballots for the Nazis in 1932.

Question 1 (10 pts)

First, load the data. Use the dim(), min(), and max() functions to answer the following questions. How many precincts are there in the data? What are the minimum and maximum observed unemployment rates (in percentage terms)? (Hint: you might want to store these minimum and maximum values in objects so that you can use them later.)

Answer 1

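# Set the working directory and load the precinct-level data
setwd("~/Desktop/POL-UA-850 Files/midterm 2")
nazis <- read.csv("nazis.csv")
# Open the data in the viewer to inspect it
View(nazis)

# Number of rows (precincts) and columns (variables)
dim(nazis)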
## [1] 681   7
getwd()
## [1] "/Users/meghlasrabon/Desktop/POL-UA-850 Files/midterm 2"
# Store the minimum and maximum observed unemployment shares for later use
minunemployed <- min(nazis$shareunemployed)

maxunemployed <- max(nazis$shareunemployed)

There are 681 precincts in the data. The minimum observed unemployment rate is roughly 2 percent, while the maximum is roughly 40 percent.
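The conversion from proportions to percentages can be sketched as follows. The helper name to_percent is hypothetical, and the values 0.019 and 0.396 are assumed stand-ins consistent with the rounded figures reported above, not values read from the data file.

```r
# Hypothetical helper: convert a proportion to a percentage,
# keeping at most two digits after the decimal point.
to_percent <- function(p) round(100 * p, 2)

# Assumed stand-in values for the minimum and maximum unemployment shares.
min_share <- 0.019
max_share <- 0.396

to_percent(min_share)  # 1.9
to_percent(max_share)  # 39.6
```

With the real data, the same helper could be applied to the stored minunemployed and maxunemployed objects.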

Question 2 (10 pts)

One hypothesis is that growing unemployment in Germany in the late 1920s and early 1930s led to the rise of the Nazi party. If this hypothesis is correct, then we should observe higher vote shares for the Nazi party in precincts with higher unemployment. First, make a variable for the Nazi party’s vote share in each precinct by dividing the number of votes for the Nazi party by the number of eligible voters. Next, estimate a linear model where the outcome is the Nazi party’s vote share and the predictor is the proportion of unemployed workers in each precinct. Explain what each coefficient means. What is the predicted change in the Nazi party voteshare for a 10 percentage point increase in a precinct’s unemployment rate? Do these findings support the hypothesis stated at the start of this question?

Answer 2

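# Nazi vote share: votes for the Nazi party divided by eligible voters
racist <- (nazis$nazivote)/(nazis$nvoter)
# Regress the vote share on the proportion of unemployed potential voters
fit1 <- lm(racist~nazis$shareunemployed, data = nazis)
summary(fit1)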
## 
## Call:
## lm(formula = racist ~ nazis$shareunemployed, data = nazis)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.269553 -0.071051  0.007933  0.067755  0.295344 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)            0.466303   0.008066  57.810  < 2e-16 ***
## nazis$shareunemployed -0.368942   0.051077  -7.223 1.37e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1042 on 679 degrees of freedom
## Multiple R-squared:  0.07136,    Adjusted R-squared:  0.06999 
## F-statistic: 52.17 on 1 and 679 DF,  p-value: 1.367e-12

The intercept of 0.47 is the predicted Nazi party vote share (47 percent) in a precinct with no unemployed potential voters. The slope of -0.37 means that a one-unit increase in the proportion of unemployed potential voters, that is, moving from 0 to 100 percent unemployment, is associated with a predicted decrease of 37 percentage points in the Nazi party’s vote share. For a 10 percentage point increase in a precinct’s unemployment rate, the predicted change in the Nazi party vote share is therefore roughly -3.69 percentage points (the unrounded slope multiplied by 0.10). These findings do not support the hypothesis stated at the beginning of the question: higher unemployment is associated with lower, not higher, support for the Nazi party.
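The arithmetic behind the 10 percentage point change can be checked with a short sketch. The slope is copied from the summary(fit1) output above; the object names slope, delta, and change_pp are illustrative only.

```r
# Slope on the unemployment share, copied from the summary(fit1) output.
slope <- -0.368942

# A 10 percentage point increase is a 0.10 increase in the proportion.
delta <- 0.10

# Predicted change in vote share, expressed in percentage points.
change_pp <- round(100 * slope * delta, 2)
change_pp  # -3.69
```

The intercept plays no role here, because it is the same at both unemployment levels and cancels when the two predictions are differenced.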

Question 3 (10 pts)

Use the coef() function to answer the following questions. What is the predicted Nazi party voteshare in a district with the minimum observed unemployment rate? What is the predicted Nazi party voteshare in a district with the maximum observed unemployment rate? (Hint: you might want to also store these predicted values in objects so that you can use them later.)

Answer 3

coef(lm(racist~nazis$shareunemployed==minunemployed, data = nazis))
##                                (Intercept) 
##                                  0.4155210 
## nazis$shareunemployed == minunemployedTRUE 
##                                  0.1050933
coef(lm(racist~nazis$shareunemployed==maxunemployed, data = nazis))
##                                (Intercept) 
##                                 0.41557905 
## nazis$shareunemployed == maxunemployedTRUE 
##                                 0.06557357
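# Predicted vote share from the Question 2 model at the minimum and maximum
# observed unemployment rates: intercept + slope * unemployment share
predictedmin <- coef(fit1)[1] + coef(fit1)[2] * minunemployed

predictedmax <- coef(fit1)[1] + coef(fit1)[2] * maxunemployed

The two regressions above use an indicator for the minimum (or maximum) unemployment rate as the predictor, so the sum of their intercept and coefficient (0.52 and 0.48) is the observed Nazi vote share in the precinct(s) with that unemployment rate, not a prediction from the Question 2 model. Applying the Question 2 coefficients (0.47 - 0.37 × unemployment share) instead, the predicted Nazi party vote share is roughly 46 percent in a precinct with the minimum observed unemployment rate (about 2 percent) and roughly 32 percent in a precinct with the maximum observed unemployment rate (about 40 percent).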
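As a minimal sketch of how coef() yields predicted values, the block below fits a bivariate model to simulated data and computes intercept + slope × x by hand. The data frame sim and every value in it are hypothetical stand-ins, not the contents of nazis.csv; predict() is used only as a cross-check.

```r
# Simulated stand-in data (hypothetical values, not the real precincts).
set.seed(2)
sim <- data.frame(shareunemployed = runif(100, 0.02, 0.40))
sim$voteshare <- 0.45 - 0.3 * sim$shareunemployed + rnorm(100, sd = 0.05)

# Bivariate model with the same structure as the one in Question 2.
fit <- lm(voteshare ~ shareunemployed, data = sim)

# Minimum and maximum observed values of the predictor.
x_min <- min(sim$shareunemployed)
x_max <- max(sim$shareunemployed)

# Prediction by hand: intercept + slope * x, using coef().
pred_min <- unname(coef(fit)[1] + coef(fit)[2] * x_min)
pred_max <- unname(coef(fit)[1] + coef(fit)[2] * x_max)

# Cross-check against predict() at the same two predictor values.
check <- predict(fit, newdata = data.frame(shareunemployed = c(x_min, x_max)))
all.equal(c(pred_min, pred_max), unname(check))
```

Because both points are computed from the fitted coefficients, they fall exactly on the regression line when added to a scatterplot.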

Question 4 (15 pts)

Next, make a scatterplot with the Nazis’ vote share on the y-axis and the share of unemployed workers in each precinct on the x-axis. Scale the x and y axes so that the x-axis runs from 0 to 0.5 and the y-axis runs from 0 to 0.8. Add a title, x-axis label, y-axis label. Add the line for the predicted Nazi vote share from the regression model in Question 2 to the plot. Finally, add the points for predicted Nazi party voteshare for the precincts with the lowest and highest observed unemployment rates. Make these points solid and in a different color from the rest of the points so that they can be seen more easily.

Answer 4

nvoterbelow <- subset(nazis, nvoter<= 35203)
nvoterabove <- subset(nazis, nvoter> 35203)
racistnew <- nvoterabove$nazivote/nvoterabove$nvoter
racistnew2 <- nvoterbelow$nazivote/nvoterbelow$nvoter
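# Scatterplot of Nazi vote share against the unemployment share
plot(nazis$shareunemployed,
     racist,
     ylab = "Nazi vote share",
     ylim = c(0, 0.8),
     xlab = "Share of unemployed workers",
     xlim = c(0, 0.5),
     main = "Unemployment and Nazi vote share")
# Add the fitted line from Question 2 (a separate call, not a plot() argument)
abline(fit1)
# Predicted vote share at the minimum and maximum observed unemployment rates
points(x = minunemployed,
       y = coef(fit1)[1] + coef(fit1)[2] * minunemployed,
       pch = 16,
       col = "coral2")
points(x = maxunemployed,
       y = coef(fit1)[1] + coef(fit1)[2] * maxunemployed,
       pch = 16,
       col = "coral4")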

Question 5 (10 pts)

Precincts with higher unemployment rates may have been different from other precincts. In particular, precincts with higher unemployment rates may have been more likely to be in urban areas. Perhaps other characteristics of urban areas depressed Nazi vote shares, confounding our ability to estimate the relationship between unemployment rates and Nazi voteshares. First, compute the correlation coefficient between the share of the unemployed and the Nazi party voteshare. Interpret the direction and strength of this coefficient. What does the correlation coefficient tell you about whether precinct population is a potential confounder?

Answer 5

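# Shorter names for the two variables of interest
unemployed <- nazis$shareunemployed
nazivote <- racist

# Correlation between the unemployment share and the Nazi vote share
cor(unemployed, nazivote)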
## [1] -0.2671283

The correlation coefficient between the unemployment share and the Nazi vote share is -0.27, which is negative and weak. Precincts with higher unemployment tended to have somewhat lower Nazi vote shares, but the linear relationship is far from tight. This coefficient alone tells us nothing about whether precinct population is a potential confounder, because it involves only unemployment and vote share. To assess confounding, we would need to examine how precinct population relates to both variables.
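A confounder has to be related to both the predictor and the outcome, so the correlations that speak to confounding are those involving precinct population. The sketch below shows one way to inspect them, using simulated data in which larger precincts are assumed to have higher unemployment. All values are hypothetical and do not come from nazis.csv.

```r
# Simulated precincts (hypothetical): larger precincts are assumed to have
# higher unemployment shares and slightly lower vote shares.
set.seed(3)
n <- 300
nvoter <- round(runif(n, 5000, 100000))
shareunemployed <- 0.10 + 2e-06 * nvoter + rnorm(n, sd = 0.02)
voteshare <- 0.5 - 0.3 * shareunemployed - 5e-07 * nvoter + rnorm(n, sd = 0.05)
sim <- data.frame(nvoter, shareunemployed, voteshare)

# Pairwise correlations among all three variables.
cors <- cor(sim)
round(cors, 2)
```

If population were correlated with both unemployment and vote share in the real data, that would be a reason to control for it, as the later questions do.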

Question 6 (15 pts)

If precinct population is a confounder, then we might see a positive relationship between unemployment and support for the Nazi party after controlling for precinct population. Estimate the linear model estimated for Question 2 within the subsets of a) precincts below or equal to the median number of eligible voters, and b) precincts above the median number of eligible voters. Then, add these two estimated regression lines to the plot from Question 4; make each line a different color from the line estimated for Question 2. Interpret the coefficients from each model. For each model, what is the predicted change in the Nazi party voteshare for a 1 percentage point increase in a precinct’s unemployment rate? Do these findings support the hypothesis stated at the start of this question?

Answer 6

summary(nazis$nvoter)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    5969   23404   35203   65118   67606  932831
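# Split precincts at the median number of eligible voters (35203, from summary() above)
nvoterbelow <- subset(nazis, nvoter<= 35203)
nvoterabove <- subset(nazis, nvoter> 35203)
# Fit the model within each subset (these fits also include nvoter as a predictor)
fit2 <- lm(racistnew~nvoterabove$shareunemployed+nvoterabove$nvoter, data = nvoterabove)
fit3 <- lm(racistnew2~nvoterbelow$shareunemployed+nvoterbelow$nvoter, data = nvoterbelow)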
fit2
## 
## Call:
## lm(formula = racistnew ~ nvoterabove$shareunemployed + nvoterabove$nvoter, 
##     data = nvoterabove)
## 
## Coefficients:
##                 (Intercept)  nvoterabove$shareunemployed  
##                   4.882e-01                   -4.251e-01  
##          nvoterabove$nvoter  
##                  -9.666e-08
fit3
## 
## Call:
## lm(formula = racistnew2 ~ nvoterbelow$shareunemployed + nvoterbelow$nvoter, 
##     data = nvoterbelow)
## 
## Coefficients:
##                 (Intercept)  nvoterbelow$shareunemployed  
##                   3.989e-01                   -2.310e-01  
##          nvoterbelow$nvoter  
##                   2.159e-06
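# Scatterplot from Question 4 with the full-sample fitted line
plot(nazis$shareunemployed,
     racist,
     ylab = "Nazi vote share",
     ylim = c(0, 0.8),
     xlab = "Share of unemployed workers",
     xlim = c(0, 0.5),
     main = "Unemployment and Nazi vote share")
abline(fit1)
# Fitted lines within each subset; col is an argument of abline(), not lm()
abline(lm(I(nazivote / nvoter) ~ shareunemployed, data = nvoterabove), col = "red")
abline(lm(I(nazivote / nvoter) ~ shareunemployed, data = nvoterbelow), col = "blue")
# Predicted vote share at the minimum and maximum observed unemployment rates
points(x = minunemployed,
       y = coef(fit1)[1] + coef(fit1)[2] * minunemployed,
       pch = 16,
       col = "coral2")
points(x = maxunemployed,
       y = coef(fit1)[1] + coef(fit1)[2] * maxunemployed,
       pch = 16,
       col = "coral4")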

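In precincts above the median number of eligible voters (fit2), the intercept of 0.49 is the predicted Nazi vote share when the unemployment share and the number of eligible voters are both zero, and the coefficient of -0.43 on the unemployment share means that a one-unit increase in the proportion unemployed is associated with a predicted 43 percentage point decrease in the vote share. In precincts at or below the median (fit3), the intercept is 0.40 and the unemployment coefficient is -0.23. In both models the coefficient on the number of eligible voters is essentially zero per additional voter. For a 1 percentage point increase in a precinct’s unemployment rate, the predicted change in the Nazi party vote share is about -0.43 percentage points in the larger precincts and about -0.23 percentage points in the smaller precincts. Because the relationship remains negative in both subsets, these findings do not support the hypothesis that unemployment and Nazi support are positively related once precinct population is taken into account.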
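The median split and the within-subset fits can be sketched as follows. The data are simulated and every value is hypothetical; the point is only to show the split being computed with median() rather than typed in, and the bivariate Question 2 model being fitted within each half.

```r
# Simulated stand-in for the precinct data (hypothetical values).
set.seed(1)
n <- 200
sim <- data.frame(
  nvoter = round(runif(n, 5000, 100000)),
  shareunemployed = runif(n, 0.02, 0.40)
)
sim$voteshare <- 0.45 - 0.3 * sim$shareunemployed + rnorm(n, sd = 0.05)

# Split at the median computed from the data rather than a typed-in number.
med <- median(sim$nvoter)
below <- subset(sim, nvoter <= med)
above <- subset(sim, nvoter > med)

# Same bivariate model as in Question 2, fitted within each subset.
fit_below <- lm(voteshare ~ shareunemployed, data = below)
fit_above <- lm(voteshare ~ shareunemployed, data = above)

# Predicted change in vote share (in percentage points) for a
# 1 percentage point (0.01) increase in the unemployment share.
100 * coef(fit_below)["shareunemployed"] * 0.01
100 * coef(fit_above)["shareunemployed"] * 0.01
```

Passing the subset through the data argument keeps the formula short and avoids mixing columns from different data frames.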

Question 7 (15 pts)

Next, create a variable that represents the number of eligible voters in a precinct divided by 10,000. Add this variable as a predictor to the linear model from Question 2. Interpret all coefficients. Does adding precinct population (in 10,000s) substantially change the estimated relationship between the unemployment rate and votes for the Nazi party?

Answer 7

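# Number of eligible voters in units of 10,000
divided <- nazis$nvoter/10000
# Question 2 model with precinct population (in 10,000s) added as a predictor
fit4 <- lm(racist~nazis$shareunemployed + divided, data = nazis)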

summary(fit4)
## 
## Call:
## lm(formula = racist ~ nazis$shareunemployed + divided, data = nazis)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.275018 -0.070671  0.007982  0.067270  0.297012 
## 
## Coefficients:
##                         Estimate Std. Error t value Pr(>|t|)    
## (Intercept)            0.4650631  0.0080759  57.586  < 2e-16 ***
## nazis$shareunemployed -0.3133410  0.0586055  -5.347 1.23e-07 ***
## divided               -0.0009812  0.0005103  -1.923   0.0549 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.104 on 678 degrees of freedom
## Multiple R-squared:  0.07639,    Adjusted R-squared:  0.07367 
## F-statistic: 28.04 on 2 and 678 DF,  p-value: 1.995e-12

The intercept of 0.47 is the predicted Nazi party vote share (47 percent) in a precinct where both predictors, the proportion of unemployed potential voters and the number of eligible voters in 10,000s, equal zero. The coefficient of -0.31 on the unemployment share means that, holding the number of eligible voters constant, a one-unit increase in the proportion of unemployed potential voters is associated with a predicted 31 percentage point decrease in the Nazi party vote share. The coefficient on the “divided” variable is about -0.001: holding unemployment constant, each additional 10,000 eligible voters is associated with a predicted decrease of about 0.10 percentage points in the vote share. This coefficient is small and falls just short of statistical significance at the 0.05 level. Adding precinct population (in 10,000s) does not substantially change the estimated relationship between the unemployment rate and votes for the Nazi party: the unemployment coefficient moves only from -0.37 to -0.31 and remains negative.
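Dividing the number of eligible voters by 10,000 changes only the units of that coefficient, not the fit. The sketch below illustrates this on simulated data; the data frame sim and the column nvoter10k are hypothetical names, and none of the values come from nazis.csv.

```r
# Simulated stand-in data (hypothetical values).
set.seed(4)
n <- 250
sim <- data.frame(
  nvoter = round(runif(n, 5000, 100000)),
  shareunemployed = runif(n, 0.02, 0.40)
)
sim$voteshare <- 0.47 - 0.3 * sim$shareunemployed -
  1e-06 * sim$nvoter + rnorm(n, sd = 0.05)

# Population rescaled to units of 10,000 eligible voters.
sim$nvoter10k <- sim$nvoter / 10000

fit_raw <- lm(voteshare ~ shareunemployed + nvoter, data = sim)
fit_10k <- lm(voteshare ~ shareunemployed + nvoter10k, data = sim)

# The unemployment slope is unaffected by the rescaling.
all.equal(unname(coef(fit_raw)["shareunemployed"]),
          unname(coef(fit_10k)["shareunemployed"]))

# The population coefficient is 10,000 times larger after rescaling.
all.equal(unname(coef(fit_raw)["nvoter"]) * 10000,
          unname(coef(fit_10k)["nvoter10k"]))
```

The rescaling is a readability choice: a coefficient per 10,000 voters is easier to report within two decimal places than a coefficient per single voter.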

Question 8 (15 pts)

Another hypothesis about support for the Nazi party is that support came from small self-employed business owners and farmers, who were not in danger of becoming unemployed during deteriorating economic conditions in Germany, but who were losing income during this period. Estimate two linear models, both with Nazi voteshare as the outcome, and with the following predictor variables: 1) share of self-employed workers; 2) share of self-employed workers and number of eligible voters in 10,000s. Interpret all coefficients on both models. What is the predicted change in Nazi party voteshare for a 10 percentage point increase in the share of self-employed in both models? Do the findings support the hypothesis stated at the start of this question?

Answer 8

# Model 1: Nazi vote share regressed on the share of self-employed
new <- lm(racist~nazis$shareself, data = nazis)
summary(new)
## 
## Call:
## lm(formula = racist ~ nazis$shareself, data = nazis)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.27036 -0.07789  0.00679  0.07185  0.30353 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      0.33103    0.01573   21.05  < 2e-16 ***
## nazis$shareself  0.45538    0.08175    5.57 3.67e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1057 on 679 degrees of freedom
## Multiple R-squared:  0.0437, Adjusted R-squared:  0.04229 
## F-statistic: 31.03 on 1 and 679 DF,  p-value: 3.672e-08
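# Intercept and slope from the first model
a.inter <- coef(new)[1]
b.slope <- coef(new)[2]

# Predicted change in vote share for a 10 percentage point (0.10) increase
# in the self-employed share; the intercept is not part of a change
change10 <- b.slope * 0.10
# Model 2: add the number of eligible voters in 10,000s
new2 <- lm(racist~nazis$shareself+divided, data = nazis)
summary(new2)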
## 
## Call:
## lm(formula = racist ~ nazis$shareself + divided, data = nazis)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.279043 -0.073986  0.008526  0.067688  0.297217 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      0.3621736  0.0182835  19.809  < 2e-16 ***
## nazis$shareself  0.3435952  0.0880564   3.902 0.000105 ***
## divided         -0.0015922  0.0004861  -3.276 0.001108 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.105 on 678 degrees of freedom
## Multiple R-squared:  0.05859,    Adjusted R-squared:  0.05582 
## F-statistic:  21.1 on 2 and 678 DF,  p-value: 1.289e-09

First model: The intercept of 0.33 is the predicted Nazi party vote share (33 percent) in a precinct where the share of self-employed potential voters is zero. The coefficient on nazis$shareself is 0.46 and statistically significant: a one-unit increase in the share of self-employed potential voters is associated with a predicted 46 percentage point increase in the Nazi party vote share.

Second model: The intercept of 0.36 is the predicted vote share in a precinct where both the self-employed share and the number of eligible voters are zero. The coefficient of 0.34 on nazis$shareself means that, holding the number of eligible voters constant, a one-unit increase in the share of self-employed potential voters is associated with a predicted 34 percentage point increase in the Nazi party vote share. The coefficient on the number of eligible voters in 10,000s is about -0.002 and statistically significant: holding the self-employed share constant, each additional 10,000 eligible voters is associated with a predicted decrease of about 0.16 percentage points. For a 10 percentage point increase in the share of self-employed, the predicted change in the Nazi party vote share is roughly 4.55 percentage points in the first model and 3.44 percentage points in the second. These findings are consistent with the hypothesis at the beginning of the question: precincts with larger shares of self-employed voters gave the Nazi party higher vote shares, with and without the control for precinct population.
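A predicted change is the difference between two predicted levels, so the intercept cancels. The sketch below illustrates this with the first model’s coefficients, copied from the summary(new) output above. The function pred_at and the evaluation points 0.10 and 0.20 are chosen for illustration only.

```r
# Coefficients copied from the summary(new) output for the first model.
intercept <- 0.33103
slope <- 0.45538

# Hypothetical helper: predicted vote share at a given self-employed share.
pred_at <- function(share) intercept + slope * share

# Two precincts whose self-employed shares differ by 10 percentage points.
change <- pred_at(0.20) - pred_at(0.10)

# The difference equals slope * 0.10; expressed in percentage points:
round(100 * change, 2)  # 4.55
```

The same calculation with the second model’s slope would follow the same pattern, with the population variable held fixed at any common value.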