Here I use the database of DW-NOMINATE scores. At its core, the project uses data about legislators’ choices (in this case, roll-call votes) to “map” legislators’ ideological positions. I am using the dataset to practice exploratory data analysis, basic concepts in regression, correlation, and causation, and running regressions.
I am going to examine the hypothesis that older senators are more conservative.
Data summary: piped data with 537 rows and 25 columns; 2 numeric columns summarized; no group variables.

| Variable | Missing | Complete rate | Mean | SD | p0 | p25 | p50 | p75 | p100 | Histogram |
|---|---|---|---|---|---|---|---|---|---|---|
| nominate_percentile | 0 | 1 | 49.9 | 29.0 | 0 | 24.8 | 50 | 75 | 100 | ▇▇▇▇▇ |
| age | 0 | 1 | 58.9 | 11.8 | 30 | 50 | 59 | 68 | 86 | ▂▅▇▇▂ |
The correlation coefficient between age and nominate_percentile in the 116th Congress is -0.159. The two variables are weakly negatively correlated: older members tend to have slightly lower NOMINATE percentiles, but the association is weak.
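To make the calculation behind a correlation coefficient concrete, here is a minimal sketch in Python using only the standard library. It is not the code used for this analysis, and the toy values are made up for illustration; they are not drawn from the DW-NOMINATE data. The function name `pearson_r` is my own.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-deviations and sums of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up toy values, NOT taken from the DW-NOMINATE data.
toy_age = [35, 42, 50, 58, 63, 71, 80]
toy_percentile = [60, 20, 85, 40, 70, 10, 45]

r = pearson_r(toy_age, toy_percentile)
print(round(r, 3))
```

A value near 0 indicates a weak linear association, while values near -1 or 1 indicate a strong one; the sign gives the direction.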
Effect of Age on DW Nominate Percentile (higher percentile suggests more conservative)

| Variable | Estimate | Lower bound | Upper bound |
|---|---|---|---|
| (Intercept) | 45.7469111 | 37.4458230 | 54.0479993 |
| age | -0.3322965 | -0.4687675 | -0.1958254 |
The regression takes the form nominate_percentile = b0 + b1 * age. The intercept of 45.75 means that in the fictitious case where age is 0, the predicted percent rank of NOMINATE is 45.75. The slope means that each additional year of age is associated with a percent rank that is 0.33 percentile points lower, or about 3.3 points lower for a ten-year difference in age.

This does not represent a definite causal relationship, however, so the slope should not be read as an average treatment effect of aging one year: there can be factors other than age that affect someone’s ideology. A senator might go through more experiences as he or she ages, which can shift his or her ideology from conservative to liberal. The change in ideology, in this case, is caused by the senator’s experience rather than directly by age.

The confidence interval gives a range of plausible values for the regression coefficient, as if we had drawn many bootstrapped resamples of the Congress and refit the regression each time. For age, the interval runs from -0.47 to -0.20, so it excludes zero.
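The idea of refitting the regression on bootstrapped resamples can be sketched as follows. This is an illustration in Python with the standard library, not the code behind the table above, and whether the table's bounds were computed this way is not stated. The data are synthetic (generated from an arbitrary line plus noise), and the names `ols_fit` and `bootstrap_slope_interval` are hypothetical helpers of my own.

```python
import random

def ols_fit(xs, ys):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def bootstrap_slope_interval(xs, ys, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the OLS slope."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    while len(slopes) < n_resamples:
        # Resample rows (x, y pairs) with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # slope is undefined when all resampled x values are equal
        slopes.append(ols_fit(bx, by)[1])
    slopes.sort()
    alpha = (1 - level) / 2
    lower = slopes[int(alpha * (n_resamples - 1))]
    upper = slopes[int((1 - alpha) * (n_resamples - 1))]
    return lower, upper

# Synthetic toy data, NOT the DW-NOMINATE data: an arbitrary line plus noise.
data_rng = random.Random(1)
toy_age = [data_rng.randint(30, 86) for _ in range(200)]
toy_percentile = [70 - 0.3 * a + data_rng.gauss(0, 25) for a in toy_age]

b0, b1 = ols_fit(toy_age, toy_percentile)
low, high = bootstrap_slope_interval(toy_age, toy_percentile)
print(f"intercept={b0:.2f} slope={b1:.3f} interval=({low:.3f}, {high:.3f})")
```

Each resample draws the same number of rows with replacement, so the spread of the refitted slopes approximates how much the estimate would vary from sample to sample.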
The Rubin Causal Model holds that there is “no causation without manipulation.” If we interpreted the coefficient on military causally, we would be saying that the slope from our calculation is the average causal effect of military on the percent rank, that is, the difference in average percent rank between the group that is treated and the group that is not. However, we can never observe both potential outcomes for the same legislator at the same time, and the treatment was not manipulated, so this interpretation would not be truly causal.
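The potential-outcomes reasoning above can be written compactly. The notation is my own assumption, not taken from the original: $Y_i(1)$ is legislator $i$'s percent rank if treated and $Y_i(0)$ the percent rank if not treated.

$$
\tau_i = Y_i(1) - Y_i(0), \qquad \text{ATE} = E\left[\,Y_i(1) - Y_i(0)\,\right]
$$

$$
\hat{\tau} = \bar{Y}_{\text{treated}} - \bar{Y}_{\text{untreated}}
$$

Only one of $Y_i(1)$ and $Y_i(0)$ is ever observed for a given legislator, so $\tau_i$ itself cannot be computed. The difference in group means $\hat{\tau}$ estimates the ATE only when treatment is independent of the potential outcomes, as it would be under random assignment.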