Setup

Packages

library(pacman)
# tibble is added explicitly because tibble() is called below
p_load(dagitty, ggdag, jtools, huxtable, kirkegaard, corrplot, tibble)

Rationale

The whole point of multiple regression… is to try to isolate the effects of the individual regressors by ‘controlling’ on the others. Still, when orthogonality is absent the concept of the contribution of an individual regressor remains inherently ambiguous. (Goldberger, 1964, p. 201)

This is a quick demonstration that controlling for downstream or merely correlated, noncausal variables can produce incorrect results. This applies whether one is trying to estimate \(\beta\) (between-individuals) or \(d\) (within-individuals) effects, among other varieties. The consequences of this fallacy are large, since committing it amounts to accepting a causal model that has no demonstrated validity and, often enough, more reason to be doubted than trusted. There are numerous permutations of this fallacy, given many different names and applied to a wide variety of specific scenarios, but I’ll just focus on partialing in genere.

I may expand this page later, but for now I’ll generate two conditions. In the first, \(Y\) is caused by \(X\), but \(Z\) and \(Z_1\) through \(Z_{10}\) are also caused by \(X\), independently of \(Y\); the graph for this condition has \(X\) pointing to every other variable. In the second, \(X\) still causes \(Y\), but \(Y\) causes the \(Z\) variables. Someone else wrote part of this a while ago and this page just gives it a home.
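Written out as structural equations, the two conditions look as follows. The coefficients (0.2, 1, and 0.71) are taken from the simulation code in the Analysis section; the \(\varepsilon\) notation for the independent standard-normal noise terms is my own shorthand.

\[\begin{aligned}
\text{Condition 1 } (X \rightarrow Y,\ X \rightarrow Z)\text{:} \quad & Y = 0.2X + \varepsilon_Y, \quad Z = X + \varepsilon_Z, \quad Z_i = 0.71X + \varepsilon_i \\
\text{Condition 2 } (X \rightarrow Y \rightarrow Z)\text{:} \quad & Y = 0.2X + \varepsilon_Y, \quad Z = Y + \varepsilon_Z, \quad Z_i = 0.71Y + \varepsilon_i
\end{aligned}\]

Here \(i = 1, \dots, 10\), and \(X\) and every \(\varepsilon\) are drawn independently from \(N(0, 1)\).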

Analysis

XC <- dagitty("dag { 
              X -> Y
              X -> Z
              X -> Z1
              X -> Z2
              X -> Z3
              X -> Z4
              X -> Z5
              X -> Z6
              X -> Z7
              X -> Z8
              X -> Z9
              X -> Z10
              }")

YC <- dagitty("dag { 
              X -> Y
              Y -> Z
              Y -> Z1
              Y -> Z2
              Y -> Z3
              Y -> Z4
              Y -> Z5
              Y -> Z6
              Y -> Z7
              Y -> Z8
              Y -> Z9
              Y -> Z10
              }")

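# Plot both DAGs. `apatheme` is a ggplot2 theme that is not defined in the code
# shown here; it is assumed to already exist in the session.
ggdag(XC, layout = "circle") + apatheme + remove_axes()
ggdag(YC, layout = "circle") + apatheme + remove_axes()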
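The graphs can also be queried directly rather than just plotted. The following is a minimal sketch using reduced versions of the two DAGs (one \(Z\) variable each; the `_small` object names are made up for this illustration). It lists what lies downstream of \(Y\) in each graph and asks dagitty for the minimal adjustment sets for the effect of \(X\) on \(Y\). Since nothing confounds \(X\) and \(Y\) in either graph, no \(Z\) variable should be required in either case.

```r
library(dagitty)

# Reduced versions of the two DAGs, with a single Z variable each
XC_small <- dagitty("dag {
  X -> Y
  X -> Z1
}")
YC_small <- dagitty("dag {
  X -> Y
  Y -> Z1
}")

# Variables downstream of Y (setdiff drops Y itself if it is included)
desc_XC <- setdiff(descendants(XC_small, "Y"), "Y")
desc_YC <- setdiff(descendants(YC_small, "Y"), "Y")
print(desc_XC)
print(desc_YC)

# Minimal sufficient adjustment sets for the effect of X on Y
print(adjustmentSets(XC_small, exposure = "X", outcome = "Y"))
print(adjustmentSets(YC_small, exposure = "X", outcome = "Y"))
```

In the second graph `Z1` is a descendant of the outcome, which is exactly the kind of variable that should not be conditioned on.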

set.seed(69) # For replicability

XCD = tibble(X = rnorm(1000), 
             Y = rnorm(1000) + X * 0.2,
             Z = rnorm(1000) + X)

YCD = tibble(X = rnorm(1000), 
             Y = rnorm(1000) + X * 0.2,
             Z = rnorm(1000) + Y)

# Ten further downstream variables: caused by X in XCD and by Y in YCD
for (i in 1:10) XCD[[paste0("Z", i)]] = XCD$X * 0.71 + rnorm(1000)
for (i in 1:10) YCD[[paste0("Z", i)]] = YCD$Y * 0.71 + rnorm(1000)

# Plot the correlation matrix of each simulated data set
corrplot(cor(XCD)); corrplot(cor(YCD))

# X -> Y + Z models

XCC  <- lm(Y ~ X, XCD)      # Model 1: correct causal direction
XCR  <- lm(X ~ Y, XCD)      # Model 2: reverse-causal
XCW1 <- lm(Y ~ Z, XCD)      # Model 3: Y on a variable downstream of X
XCW2 <- lm(X ~ Z, XCD)      # Model 4: X on its own downstream variable
XCW3 <- lm(Y ~ X + Z, XCD)  # Model 5: Y on X, "controlling" for Z
XCP1 <- lm(Y ~ X + Z1 + Z2 + Z3 + Z4 + Z5 + Z6 + Z7 + Z8 + Z9 + Z10, XCD)  # Model 6
XCP2 <- lm(X ~ Y + Z1 + Z2 + Z3 + Z4 + Z5 + Z6 + Z7 + Z8 + Z9 + Z10, XCD)  # Model 7

export_summs(XCC, XCR, XCW1, XCW2, XCW3, XCP1, XCP2, scale = TRUE, error_format = "[{conf.low}, {conf.high}]", digits = getOption("jtools-digits", 4))
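| | Model 1: Y ~ X | Model 2: X ~ Y | Model 3: Y ~ Z | Model 4: X ~ Z | Model 5: Y ~ X + Z | Model 6: Y ~ X + Z1 to Z10 | Model 7: X ~ Y + Z1 to Z10 |
|---|---|---|---|---|---|---|---|
| (Intercept) | -0.0362 | -0.0266 | -0.0362 | -0.0266 | -0.0362 | -0.0362 | -0.0266 * |
| | [-0.0974, 0.0251] | [-0.0865, 0.0333] | [-0.0979, 0.0256] | [-0.0693, 0.0162] | [-0.0974, 0.0251] | [-0.0973, 0.0250] | [-0.0511, -0.0020] |
| X | 0.1757 *** | | | | 0.1800 *** | 0.0965 | |
| | [0.1144, 0.2369] | | | | [0.0927, 0.2672] | [-0.0558, 0.2488] | |
| Y | | 0.1719 *** | | | | | 0.0159 |
| | | [0.1119, 0.2318] | | | | | [-0.0092, 0.0410] |
| Z | | | 0.1220 *** | 0.6974 *** | -0.0060 | | |
| | | | [0.0603, 0.1838] | [0.6546, 0.7402] | [-0.0933, 0.0812] | | |
| Z1 | | | | | | 0.0214 | 0.1224 *** |
| | | | | | | [-0.0523, 0.0950] | [0.0939, 0.1510] |
| Z2 | | | | | | -0.0449 | 0.1539 *** |
| | | | | | | [-0.1191, 0.0293] | [0.1257, 0.1821] |
| Z3 | | | | | | 0.0430 | 0.1504 *** |
| | | | | | | [-0.0316, 0.1176] | [0.1220, 0.1789] |
| Z4 | | | | | | 0.0176 | 0.1359 *** |
| | | | | | | [-0.0581, 0.0932] | [0.1067, 0.1650] |
| Z5 | | | | | | 0.0393 | 0.1661 *** |
| | | | | | | [-0.0379, 0.1166] | [0.1368, 0.1953] |
| Z6 | | | | | | 0.0239 | 0.1174 *** |
| | | | | | | [-0.0504, 0.0982] | [0.0885, 0.1463] |
| Z7 | | | | | | 0.0593 | 0.1471 *** |
| | | | | | | [-0.0148, 0.1334] | [0.1188, 0.1754] |
| Z8 | | | | | | -0.0562 | 0.1488 *** |
| | | | | | | [-0.1304, 0.0180] | [0.1204, 0.1771] |
| Z9 | | | | | | 0.0576 | 0.1606 *** |
| | | | | | | [-0.0186, 0.1338] | [0.1317, 0.1895] |
| Z10 | | | | | | -0.0273 | 0.1314 *** |
| | | | | | | [-0.1010, 0.0464] | [0.1030, 0.1598] |
| N | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 |
| R² | 0.0307 | 0.0307 | 0.0148 | 0.5063 | 0.0308 | 0.0430 | 0.8389 |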
All continuous predictors are mean-centered and scaled by 1 standard deviation. *** p < 0.001; ** p < 0.01; * p < 0.05.

Models 1 and 2 are the correct and reverse-causal models, respectively. They cannot be differentiated: both have the same \(R^2\). Model 3 regresses \(Y\) on \(Z\), one of the variables caused by \(X\). This model could be considered informative even though \(Z\) does not cause \(Y\) and is only associated with it through its relationship to one of the causes of \(Y\), \(X\). Model 4 shows that regressing \(X\) on one of its own downstream variables, \(Z\), yields a decent model explaining a staggering 51% of the variance in \(X\)! Interpreting this improperly, one could argue that when regressing \(Y\) on \(X\), we ought to control for \(Z\) (the argument probably depending on what \(Z\) is), but this is foolish because \(X\) is a cause of \(Z\) and not the reverse. Model 5 does just this and, though the increase in \(R^2\) isn’t much here, with more complex causal or noncausal relationships in the data there may be more inflation of the model \(R^2\) and more damage to the \(Y\) ~ \(X\) estimate.

Model 6 shows that by controlling for a bevy of variables downstream of \(X\), we can render the association between \(X\) and \(Y\) insignificant and inflate the model \(R^2\) in the process. Moreover, by construction no variable should have a negative sign (the negative coefficients are all insignificant here, but they may not be with a larger n; a larger n will also move the coefficients closer to their true values, but in real-world data with many diverse causes, problems can still abound). Model 7 shows much larger \(R^2\) inflation and coefficients for the \(Z\) variables in the correct direction, but everything is still wrong. This should make it clear that controls do not necessarily make models more accurate.

Regarding reverse causality, there are exceptions where model order does inform us about it, as with Kelley’s paradox. A final note is that it can be very difficult to discern genuine mediation cross-sectionally, though there are a lot of methods to help with this now.
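The loss of precision for \(X\) in Model 6 can be quantified with a variance inflation factor (VIF). The sketch below is a minimal base-R illustration, not the analysis above: it re-simulates the \(X \rightarrow Z\) condition with an arbitrary seed, so the numbers will not match the table exactly, and all object names are made up for the example. With ten \(Z\) variables generated as \(0.71X\) plus unit-variance noise, the population VIF for \(X\) works out to \(1 + 10 \times 0.71^2 \approx 6.04\).

```r
# Sketch: how much does controlling for ten variables downstream of X
# inflate the sampling variance of the X coefficient?
set.seed(1)  # arbitrary seed, not the one used for the tables above
n <- 1000
X <- rnorm(n)
Y <- 0.2 * X + rnorm(n)
Zs <- sapply(1:10, function(i) 0.71 * X + rnorm(n))
colnames(Zs) <- paste0("Z", 1:10)
dat <- data.frame(X = X, Y = Y, Zs)

# VIF for X: 1 / (1 - R^2) from regressing X on the other predictors (Z1..Z10)
r2_x <- summary(lm(X ~ . - Y, data = dat))$r.squared
vif_x <- 1 / (1 - r2_x)

# Standard error of the X coefficient without and with the Z "controls"
se_plain <- coef(summary(lm(Y ~ X, data = dat)))["X", "Std. Error"]
se_ctrl  <- coef(summary(lm(Y ~ ., data = dat)))["X", "Std. Error"]
se_ratio <- se_ctrl / se_plain

print(c(vif = vif_x, sqrt_vif = sqrt(vif_x), se_ratio = se_ratio))
```

The ratio of standard errors should land close to the square root of the VIF, which is in line with the confidence interval for \(X\) being roughly 2.5 times wider in Model 6 than in Model 1.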

# X -> Y -> Z models

YCC  <- lm(Y ~ X, YCD)      # Model 1: correct causal direction
YCR  <- lm(X ~ Y, YCD)      # Model 2: reverse-causal
YCW1 <- lm(Y ~ Z, YCD)      # Model 3: Y on a variable it causes
YCW2 <- lm(X ~ Z, YCD)      # Model 4: X on a variable downstream of Y
YCW3 <- lm(Y ~ X + Z, YCD)  # Model 5: Y on X, "controlling" for Z
YCP1 <- lm(Y ~ X + Z1 + Z2 + Z3 + Z4 + Z5 + Z6 + Z7 + Z8 + Z9 + Z10, YCD)  # Model 6
YCP2 <- lm(X ~ Y + Z1 + Z2 + Z3 + Z4 + Z5 + Z6 + Z7 + Z8 + Z9 + Z10, YCD)  # Model 7

export_summs(YCC, YCR, YCW1, YCW2, YCW3, YCP1, YCP2, scale = TRUE, error_format = "[{conf.low}, {conf.high}]", digits = getOption("jtools-digits", 4))
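| | Model 1: Y ~ X | Model 2: X ~ Y | Model 3: Y ~ Z | Model 4: X ~ Z | Model 5: Y ~ X + Z | Model 6: Y ~ X + Z1 to Z10 | Model 7: X ~ Y + Z1 to Z10 |
|---|---|---|---|---|---|---|---|
| (Intercept) | 0.0034 | -0.0048 | 0.0034 | -0.0048 | 0.0034 | 0.0034 | -0.0048 |
| | [-0.0596, 0.0664] | [-0.0656, 0.0560] | [-0.0408, 0.0476] | [-0.0663, 0.0567] | [-0.0403, 0.0471] | [-0.0217, 0.0284] | [-0.0657, 0.0561] |
| X | 0.2292 *** | | | | 0.1091 *** | 0.0223 | |
| | [0.1662, 0.2922] | | | | [0.0648, 0.1535] | [-0.0034, 0.0481] | |
| Y | | 0.2213 *** | | | | | 0.1368 |
| | | [0.1604, 0.2821] | | | | | [-0.0208, 0.2944] |
| Z | | | 0.7581 *** | 0.1628 *** | 0.7404 *** | | |
| | | | [0.7138, 0.8023] | [0.1013, 0.2243] | [0.6961, 0.7847] | | |
| Z1 | | | | | | 0.1423 *** | -0.0312 |
| | | | | | | [0.1123, 0.1724] | [-0.1074, 0.0449] |
| Z2 | | | | | | 0.1431 *** | -0.0109 |
| | | | | | | [0.1137, 0.1724] | [-0.0854, 0.0637] |
| Z3 | | | | | | 0.1488 *** | 0.0407 |
| | | | | | | [0.1187, 0.1788] | [-0.0357, 0.1171] |
| Z4 | | | | | | 0.1785 *** | 0.0512 |
| | | | | | | [0.1484, 0.2086] | [-0.0268, 0.1292] |
| Z5 | | | | | | 0.1619 *** | 0.0214 |
| | | | | | | [0.1318, 0.1920] | [-0.0557, 0.0986] |
| Z6 | | | | | | 0.1736 *** | -0.0068 |
| | | | | | | [0.1436, 0.2036] | [-0.0844, 0.0709] |
| Z7 | | | | | | 0.1489 *** | 0.0377 |
| | | | | | | [0.1190, 0.1789] | [-0.0385, 0.1138] |
| Z8 | | | | | | 0.1043 *** | 0.0216 |
| | | | | | | [0.0741, 0.1345] | [-0.0535, 0.0967] |
| Z9 | | | | | | 0.1535 *** | 0.0371 |
| | | | | | | [0.1238, 0.1832] | [-0.0388, 0.1130] |
| Z10 | | | | | | 0.1240 *** | -0.0220 |
| | | | | | | [0.0945, 0.1534] | [-0.0960, 0.0520] |
| N | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 |
| R² | 0.0486 | 0.0486 | 0.5314 | 0.0263 | 0.5421 | 0.8510 | 0.0549 |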
All continuous predictors are mean-centered and scaled by 1 standard deviation. *** p < 0.001; ** p < 0.01; * p < 0.05.

Again, we cannot differentiate causality from reverse causality, downstream variables can be predicted by upstream causes and direct causes alike (i.e., mediation), and so on. In this set of models, note that even a single variable downstream of \(Y\) can attenuate the effect of \(X\) on \(Y\) (Model 5), and that the effect of \(X\) on \(Y\) can be blown to smithereens by including things that \(Y\) causes (Model 6). \(X\) should only influence these variables insofar as it affects \(Y\), and it may show no association with them at all: relationships need not be transitive, and since the relationship between \(X\) and \(Y\) is not sufficiently high to guarantee transitivity, this is a real concern. It is noteworthy that including the variables \(Y\) causes in Model 6 yielded significant coefficients for all of them but not for \(X\). This was also coupled with an overinflated \(R^2\); note the contrast with Model 7.
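The attenuation in Model 6 is not a small-sample accident. Under the generating values used here (unit-variance noise, ten \(Z\) variables each equal to \(0.71Y\) plus noise), the population coefficient on \(X\) after conditioning on all ten works out to \(0.2 / (1 + 10 \times 0.71^2) \approx 0.033\) instead of 0.2. The sketch below checks this with a much larger simulated sample; the seed, sample size, and object names are arbitrary choices for the illustration rather than part of the analysis above.

```r
# Sketch: the X coefficient converges to an attenuated value, not to 0.2,
# once ten variables caused by Y are included as "controls".
set.seed(2)  # arbitrary seed
n <- 1e5     # large n so sampling error is negligible
X <- rnorm(n)
Y <- 0.2 * X + rnorm(n)
Zs <- sapply(1:10, function(i) 0.71 * Y + rnorm(n))
colnames(Zs) <- paste0("Z", 1:10)
dat <- data.frame(X = X, Y = Y, Zs)

b_plain <- unname(coef(lm(Y ~ X, data = dat))["X"])  # no controls
b_ctrl  <- unname(coef(lm(Y ~ ., data = dat))["X"])  # X + Z1..Z10

# Population value implied by the data-generating process
b_expected <- 0.2 / (1 + 10 * 0.71^2)

print(c(plain = b_plain, controlled = b_ctrl, expected = b_expected))
```

More data therefore does not rescue the controlled model: the estimate becomes more precise, but it is precise about the wrong quantity.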

I may test more models later with different data structures but right now this is sufficient to justify caution about the hasty use of controls.

Discussion

This issue has been discussed by many people under many names over many years, and there are as many spin-off fallacies as there are examples of the fallacy. A failure to acknowledge this fallacy can have harmful real-world consequences when it impacts policy advice. For example, as Figueredo, Hetherington & Sechrest (1992) showed, if Bingham, Heywood & White’s (1991) results were taken to their logical conclusion, we could expect free lunch participation in second and fifth grade to improve fifth grade achievement, while participation in the free lunch program at all should worsen it. There are many instances where effects contrasted like this and where variables, like missing school, earned the wrong sign thanks to the extent of multicollinearity and the inability of simple multiple regression to handle the indiscriminate introduction of variously covarying covariates. Interpreting these would lead to absurd conclusions, such as that in certain grades, students should be encouraged to miss more days of school to improve performance in later years.

This fallacy is commonly used to justify claims of socioeconomic and cultural influences on children by controlling for family-level variables (cf. Turkheimer, 2000) and to justify interventions aimed, sometimes humorously, at correlates of outcomes (e.g., Cohen et al., 2000). But controls do not provide prima facie evidence of real influence. Designs (like assessing the impact of IQ on crime or attained SES within sibling pairs to avoid controlling for shared variance with SES, as is so often done) and causal thinking (there is a causal calculus and diagramming method) are necessary to avoid these sorts of inferences. In some cases, however, these can be appropriate inferences, like when there is evidence of a control’s causal impact. Even then, it can still be hard to use said control when the causal extent of its impact on the outcome of interest, or its specificity to that outcome (in particular a psychological one), is uncertain and mismeasured relative to the correlational expectation. This is rarely considered although it is probably common (e.g., Hardin’s dictum that “We can never do merely one thing”). To deal with the problem, be creative (Goedel, 1931) and don’t give in to the impulse to control everything. Rohrer (2018) was a great recent publication on this topic, and a classic, underappreciated explanation, which is also the source of the fallacy’s now little-known name, came from Gordon (1967, 1968).

References

Goldberger, A. S. (1964). Econometric Theory (1st edition). John Wiley & Sons Inc.

Figueredo, A. J., Hetherington, J., & Sechrest, L. (1992). Water Under the Bridge: A Response to Bingham, Heywood, and White. Evaluation Review, 16(1). https://doi.org/10.1177/0193841X9201600103

Bingham, R. D., Heywood, J. S., & White, S. B. (1991). Evaluating Schools and Teachers Based On Student Performance: Testing an Alternative Methodology. Evaluation Review, 15(2), 191-218. https://doi.org/10.1177/0193841X9101500203

Turkheimer, E. (2000). Three Laws of Behavior Genetics and What They Mean. Current Directions in Psychological Science, 9(5), 160-164. https://doi.org/10.1111/1467-8721.00084

Cohen, D., Spear, S., Scribner, R., Kissinger, P., Mason, K., & Wildgen, J. (2000). “Broken windows” and the risk of gonorrhea. American Journal of Public Health, 90(2), 230-236.

Goedel, K. (1931). Ueber formal unentscheidbare Saetze der Principia Mathematica und verwandter Systeme I. Monatshefte fuer Mathematik und Physik, 38(1), 173-198. https://doi.org/10.1007/BF01700692

Rohrer, J. M. (2018). Thinking Clearly About Correlations and Causation: Graphical Causal Models for Observational Data. Advances in Methods and Practices in Psychological Science. https://doi.org/10.1177/2515245917745629

Gordon, R. A. (1967). Issues in the ecological study of delinquency. American Sociological Review, 32(6), 927-944.

Gordon, R. A. (1968). Issues in multiple regression. American Journal of Sociology, 73(5), 592-616. https://doi.org/10.1086/224533 (see also http://garfield.library.upenn.edu/classics1987/A1987J744900001.pdf)

Transitivity

Correlations are only guaranteed to be transitive when the entire interval of possible values for \(\rho_{xz}\),

\[\rho_{xy}\rho_{yz} - \sqrt{(1-\rho^2_{xy})(1-\rho^2_{yz})} \leq \rho_{xz} \leq \rho_{xy}\rho_{yz} + \sqrt{(1-\rho^2_{xy})(1-\rho^2_{yz})}\]

is positive (for positive correlations, i.e., \(\rho_{xy} > 0\) and \(\rho_{yz} > 0\)). To be certain \(\rho_{xz} > 0\), satisfy

\[\rho^2_{xy} + \rho^2_{yz} > 1\]
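As a rough numerical check, the bounds can be computed for the population correlations implied by the \(X \rightarrow Y \rightarrow Z\) simulation (\(Y = 0.2X\) plus unit-variance noise and \(Z_i = 0.71Y\) plus unit-variance noise). The helper `cor_bounds()` below is a hypothetical function written for this illustration, not part of any package.

```r
# Sketch: admissible range of rho_xz given rho_xy and rho_yz
cor_bounds <- function(r_xy, r_yz) {
  centre <- r_xy * r_yz
  half_width <- sqrt((1 - r_xy^2) * (1 - r_yz^2))
  c(lower = centre - half_width, upper = centre + half_width)
}

# Population correlations implied by the data-generating process
var_y <- 0.2^2 + 1                         # Var(Y) when Var(X) = 1
r_xy  <- 0.2 / sqrt(var_y)                 # cor(X, Y)
r_yz  <- 0.71 * var_y /
  sqrt(var_y * (0.71^2 * var_y + 1))       # cor(Y, Z_i)

sim_bounds <- cor_bounds(r_xy, r_yz)
print(sim_bounds)
print(r_xy^2 + r_yz^2)                     # compare against 1

# For contrast: two strong correlations, where transitivity is guaranteed
strong_bounds <- cor_bounds(0.8, 0.8)
print(strong_bounds)
```

For the simulated values, the squared correlations sum to well under 1 and the lower bound is negative, so a positive \(\rho_{xz}\) is not guaranteed by the other two correlations alone. With both correlations at 0.8 the sum of squares exceeds 1 and the lower bound is positive.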