Shiny App: Multivariable Regression and "Simpson's Paradox" with Swiss Data

P. Fleer
May 23, 2017

What does the App do?

Objective

Show the effect of “Simpson's Paradox” in multivariable regression:

  • It fits a linear model to R's swiss dataset
  • It plots the unadjusted effect for a set of chosen variables
  • It plots the adjusted effect after regressing out one predictor variable
  • It lets you experiment with different combinations of variables
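As a rough illustration of what "regressing out" a predictor means, the sketch below removes Education from both Fertility and Agriculture and then relates the two sets of residuals. This is an assumption about how an adjusted plot could be built; the app's own code is not shown here, and the names res_y, res_x, adj_slope and full_slope are made up for the example.

```r
# Sketch only: one possible way to "regress out" a predictor.
# Assumption: the app does something similar; its source is not shown here.
data(swiss)

# Residuals of outcome and predictor after removing the effect of Education
res_y <- resid(lm(Fertility ~ Education, data = swiss))
res_x <- resid(lm(Agriculture ~ Education, data = swiss))

# Slope through the residuals (no intercept: both residual vectors have mean zero)
adj_slope <- coef(lm(res_y ~ res_x - 1))[["res_x"]]

# Coefficient for Agriculture from the model with both predictors
full_slope <- coef(lm(Fertility ~ Agriculture + Education, data = swiss))[["Agriculture"]]

print(c(residual_based = adj_slope, full_model = full_slope))

# Draw the adjusted relationship only when a graphics device is expected
if (interactive()) {
  plot(res_x, res_y,
       xlab = "Agriculture (Education regressed out)",
       ylab = "Fertility (Education regressed out)")
  abline(0, adj_slope)
}
```

The two printed values should agree, since the residual-on-residual slope equals the coefficient from the two-predictor model.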

Description

The (purported) “Simpson's Paradox” refers to the fact that, in multivariable (linear) regression, unadjusted and adjusted effects can have opposite signs. That is, the relationship between a predictor (x1) and the outcome (y) may change, or even reverse, when a second predictor (x2) is accounted for.
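Why the sign can flip follows from a standard least-squares identity: the unadjusted slope of y on x1 equals the adjusted slope of x1 plus the adjusted slope of x2 times the slope of x2 regressed on x1. The sketch below checks this numerically on the swiss data, taking Fertility as y, Agriculture as x1 and Education as x2; the variable names b_unadj, b1, b2, d and reconstructed are chosen only for illustration.

```r
# Sketch: decompose the unadjusted slope into adjusted slope + "spillover" via x2.
data(swiss)

# Unadjusted slope of Fertility on Agriculture
b_unadj <- coef(lm(Fertility ~ Agriculture, data = swiss))[["Agriculture"]]

# Adjusted slopes from the two-predictor model
fit_adj <- coef(lm(Fertility ~ Agriculture + Education, data = swiss))
b1 <- fit_adj[["Agriculture"]]
b2 <- fit_adj[["Education"]]

# Slope of the second predictor regressed on the first
d <- coef(lm(Education ~ Agriculture, data = swiss))[["Agriculture"]]

# Identity: b_unadj = b1 + b2 * d
reconstructed <- b1 + b2 * d
print(c(unadjusted = b_unadj, reconstructed = reconstructed))
```

Here b2 and d are both negative, so their product is positive and large enough to outweigh the small negative b1, which is why the unadjusted slope comes out positive.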

Example Graph

[Figure: Unadjusted Effect]

[Figure: Adjusted Effect]

Example Values

coefficients(lm(Fertility ~ Agriculture, data = swiss))
(Intercept) Agriculture 
 60.3043752   0.1942017 

Unadjusted coefficient for Agriculture is positive: 0.194.

coefficients(lm(Fertility ~ Agriculture + Education, data = swiss))
(Intercept) Agriculture   Education 
84.08005397 -0.06647502 -0.96276262 

Adjusted coefficient for Agriculture is negative: -0.066.
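The same comparison can be repeated for every other variable in the dataset, which is roughly what trying different combinations in the app amounts to. The following is a minimal sketch, not the app's code; the names others, adjusted and unadjusted are illustrative.

```r
# Sketch: Agriculture coefficient when adjusting for each other variable in turn.
data(swiss)

# Candidate adjustment variables: everything except outcome and main predictor
others <- setdiff(names(swiss), c("Fertility", "Agriculture"))

# Fit Fertility ~ Agriculture + <v> for each candidate and keep the Agriculture slope
adjusted <- sapply(others, function(v) {
  f <- reformulate(c("Agriculture", v), response = "Fertility")
  coef(lm(f, data = swiss))[["Agriculture"]]
})

# Unadjusted slope for reference
unadjusted <- coef(lm(Fertility ~ Agriculture, data = swiss))[["Agriculture"]]

print(round(c(unadjusted = unadjusted, adjusted), 3))
```

Any entry whose sign differs from the unadjusted value indicates a reversal for that adjustment variable.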

Background and Sources

Swiss dataset

  • Data frame with 47 observations on 6 variables (each of which is in percent)
  • Variables: Fertility, Agriculture, Examination, Education, Catholic, Infant.Mortality
  • Observations: a standardized fertility measure and five socio-economic indicators for each of 47 French-speaking provinces of Switzerland around 1888
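The dataset ships with base R, so the points above can be checked directly. A short sketch:

```r
# Sketch: inspect the swiss dataset that ships with base R.
data(swiss)

print(dim(swiss))        # number of observations and variables
print(names(swiss))      # variable names as used in model formulas
print(sapply(swiss, range))  # minimum and maximum of each variable
print(head(rownames(swiss))) # each row is one province
```

Note that the infant mortality column is named Infant.Mortality, with a dot, which is the spelling needed in formulas.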

For more details, see here or run ?swiss in R.

This app was inspired by the book Regression Models for Data Science in R by Brian Caffo, published 2015-08-05 on Leanpub.

The app can be found here. Have fun playing!