Description

Below is an attempt to provide graphical examples of random error, systematic bias, confounding and interaction using real data with some simulation. Although real data are used, these are thought experiments for the purpose of teaching concepts. Along those lines, some of the assumptions and hypotheses that I ask you to imagine are for learning purposes only.

Data Source

This data was obtained from the National Health and Nutrition Examination Survey (NHANES) 2007-2008 public use datafiles, more specifically the Demographics, ‘Drug Use’ and ‘Smoking - Cigarette Use’ components of the Questionnaire and the ‘Spirometry’ component of the Examination datafiles. The data was initially imported, merged and pared down to variables of interest in SAS 9.4. That SAS program can be downloaded from my GoogleDrive here.


R Packages Utilized

## Load required packages
# Note: packages are always changing, so you may need to run update.packages()
# to have the most recent versions for full functionality.
library(dplyr)
library(knitr)
library(ggplot2)
library(scatterplot3d)

The Dataset Preparation

Further data preparation was done in RStudio to produce clean, tidy datasets for these examples.

This is included only for completeness’ sake.

setwd("C:/Users/jkempke/Dropbox/MSCR 761/Obstacles Lecture/MJ")

mj <- read.csv("mj.csv", stringsAsFactors = F, na.strings = c(7,9,999))
colnames(mj)<-c("sex", "age", "race", "served","education", "mj_ever", "mj_days", "deep_breath","fvc", "fev1", "hundrocigs", "age_start", "age_stop")

mj$education <- as.factor(mj$education)
mj$sex <-as.factor(mj$sex)
mj$served<-as.factor(mj$served)
mj$race <- as.factor(mj$race)
mj$mj_ever <- as.factor(mj$mj_ever)
mj$hundrocigs <- as.factor(mj$hundrocigs)
mj$deep_breath<- as.factor(mj$deep_breath)

mj <- mj %>%
    mutate(age_smoke = cut(age_start, c(0,20,70)))%>%
    mutate(age_cat = cut(age, seq(0,80,20)))%>%
    mutate(years = age_stop-age_start)%>%
    mutate(duration = cut(years, seq(0,100,20)))%>%
    mutate(age_quit = cut(age_stop, seq(0,100,20)))%>%
    mutate(
        served= recode_factor(served, "1"="Yes", "2"="No"),
        sex = recode_factor(sex, "1"="Male", "2"="Female"),
        mj_ever = recode_factor(mj_ever, "1"="Yes", "2"="No"),
        deep_breath=recode_factor(deep_breath, "1"="Yes", "2"="No"),
        hundrocigs = recode_factor(hundrocigs, "1"="Yes", "2"="No"),
        age_smoke = recode_factor(age_smoke, "(0,20]"="</=20", "(20,70]"=">20"))%>%
    filter(is.na(fev1)==F)

mj$fev1_noisy <- jitter(mj$fev1, factor=10000)
mj$fev1_really_noisy <- jitter(mj$fev1, factor=30000) 

mja <- mj[mj$age_start!=0,] # exclude these: 0 was the response for those who did not identify as smokers
mja <- mja[mja$age_start<60,] # remove a few outliers
mja <- mja[is.na(mja$served)==F,] # this line and the next remove missing data (military service, then smoking duration)
mjd <- mja[is.na(mja$duration)==F,]
mjm <- mj[mj$age < 60,] # only ages 20-59 were asked the marijuana survey questions
mjm <- mjm[mjm$age > 19,]
mjm <- mjm[is.na(mjm$age)==F,]

Random Error Simulation

Age, Sex and FEV1 - The Real Data

Here we see FEV1 (forced expiratory volume in 1 second, a spirometric measure of lung function) by age of survey respondent. There is a clear trend of increasing FEV1s from childhood to about 20 years of age, with steadily decreasing values in participants from 20 to 80 years of age. We also notice that there is a lot of variation around the trend line.

Now let’s add participant’s sex as an additional layer into the graph.

We generally see a similar FEV1-age trend among males and females, with females on average having lower FEV1s at any given age past about 20 years old.
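For reference, here is a minimal sketch of the kinds of plots described above, using the mj dataset prepared earlier; it is not necessarily the exact code behind the original figures.

g <- ggplot(data = mj, aes(age, fev1)) +
    theme_bw() +
    geom_point(alpha = 0.2) +
    geom_smooth()
print(g)

# Add participant sex as an additional layer via the color aesthetic
g <- ggplot(data = mj, aes(age, fev1, color = sex)) +
    theme_bw() +
    geom_point(alpha = 0.2) +
    geom_smooth()
print(g)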

Simulating a little random noise

Using the R function jitter(), I can insert some random noise into the FEV1 variable. This is meant to simulate an imprecise measurement method in any study.
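As a sketch, the noisier measurement (the fev1_noisy variable created above) can be plotted the same way as the original data:

g <- ggplot(data = mj, aes(age, fev1_noisy, color = sex)) +
    theme_bw() +
    geom_point(alpha = 0.2) +
    geom_smooth()
print(g)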

The trend is still visible above, just with a lot more variation (scatter) around the average curve.

Below, we see that the male-female differences are present but less pronounced.

Simulating more random noise

Now we add more random noise to the FEV1 measurement…

Now the trend is much more attenuated. Below, you can also appreciate that the male-female difference is much less pronounced.


Discussion

This simulation is meant to show how random error can influence results. We started with the original data and added progressively more random error into the measurement of our dependent variable of interest, FEV1. This is meant to be an analogy to how any measurement tool (instrument, survey, case definition…) can introduce error into our observations of the world. This random error then influences our ability to discern relationships.

This next point is purely semantic, but I’ll make it anyway. Error can be called random to the extent that we cannot (under the current state of the art) account for it or reduce it using more precise tools. In this thought experiment, all scientists using the same instruments will get roughly the same amount of error, which will yield roughly the same answers, since no one has the ability to control for it. However, once a more precise tool becomes available, a study that still uses the old method of measurement can be said to have introduced a systematic bias relative to others’ work using the new, more precise tools. More specifically, in this case it would be an information bias introducing nondifferential mismeasurement (or misclassification, for a categorical variable) of the dependent variable, FEV1. This tends to bias results towards the NULL by introducing noise and making differences more difficult to discern, but more on systematic bias below.


Information Bias Thought Experiment

  • Information Bias:
    • Any mismeasurement or misclassification of exposures or outcomes (or independent/dependent variables depending on your preferred nomenclature). Can be further conceptually classified as:
      • Nondifferential - the probability of misclassification does NOT vary by group. This acts exactly like adding random noise, as we saw above, and tends to bias results towards the NULL (i.e. often ‘no difference’ or ‘no effect’).
      • Differential - the probability of misclassification varies by group. Because the extent and direction of this variation are typically unknown, the effect on results is unpredictable. There are statistical, analytic methods that may help adjust for this bias, but only to the extent that the bias is known and quantifiable.

Sex, Marijuana and Information Bias

We will use a thought experiment to discuss what is called information bias. We see from this survey data that among males, the majority (57%) have tried marijuana, whereas among females the majority (57%) have NOT tried marijuana. We may conclude from this that males are more likely than females to try marijuana in their lifetime.
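A cross-tabulation along these lines can be computed from the mjm dataset prepared above; this is only a sketch and may not reproduce the exact 57% figures (for example, it ignores survey weights):

mjm %>%
    filter(is.na(mj_ever) == F) %>%
    group_by(sex, mj_ever) %>%
    summarise(n = n()) %>%
    mutate(pct = round(100 * n / sum(n), 1))  # percent within each sex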

Discussion

Now, let us run some thought experiments regarding nondifferential versus differential misclassification. In this thought experiment, we can suppose that because of the way this question was asked (maybe its actual wording is really confusing) both males and females provide incorrect answers (both yes’s and no’s) a certain proportion of the time. This will introduce a misclassification of a respondent’s true status of whether or not they have actually tried marijuana.

This hypothetical misclassification would be considered nondifferential if both males and females have an equal probability (and directionality) of answering incorrectly. This will tend to distribute error symmetrically among both males and females, introduce ‘noise’ into the data and bias results towards the NULL.

On the other hand, this hypothetical misclassification would be considered differential if males and females had different probabilities (or directionality) of misclassification. The effects of differential misclassification on the results are much harder to predict. Let’s say the question has some gender-specific bias such that males answer very accurately but females tend to answer inaccurately and towards an answer of ‘yes’. Here the probability of misclassification differs by sex, and the directionality differs by sex (females inaccurately say ‘yes’ more often than they inaccurately say ‘no’). This would bias the results towards no difference between the sexes. However, one can imagine any hypothetical scenario where the probability and directionality by sex differ and would produce different results, either towards the NULL or away from the NULL in either direction.
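Here is a small simulation sketch of the two scenarios; the 20% error rates are purely hypothetical and chosen for illustration only.

set.seed(42)
sim <- mjm[is.na(mjm$mj_ever) == F, ]

# Nondifferential: both sexes answer incorrectly 20% of the time, in either direction
flip <- runif(nrow(sim)) < 0.20
sim$mj_nondiff <- ifelse(flip,
                         ifelse(sim$mj_ever == "Yes", "No", "Yes"),
                         as.character(sim$mj_ever))

# Differential: males answer accurately; females who truly answered "No"
# incorrectly say "Yes" 20% of the time
flip_f <- sim$sex == "Female" & sim$mj_ever == "No" & runif(nrow(sim)) < 0.20
sim$mj_diff <- ifelse(flip_f, "Yes", as.character(sim$mj_ever))

# Compare the sex-specific proportions under each scenario
prop.table(table(sim$sex, sim$mj_nondiff), margin = 1)
prop.table(table(sim$sex, sim$mj_diff), margin = 1)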


Selection Bias Simulation

  • Selection bias occurs in the selection of the study population, such that the study population is not representative of the intended target population to which the results are supposed to apply.
    • Often results from using convenience samples.
    • As a general rule, selection bias cannot be overcome in the analysis.
    • Note that the existence or non-existence of selection bias is dependent on the objectives of the study, specifically the relationship between the study population and target population.

Relationship of FEV1 by age

For now we will return to our initial example of looking at the relationships between a measure of lung function (FEV1) and age. The examples may seem simplistic and artificial since these relationships are perhaps now so well described as to be general knowledge. Nonetheless, they will serve a purpose for illustrating selection bias.

Let’s say it is 100 years ago, before we had any such knowledge of these relationships, and we are studying lung function and age using a convenience sample of adults. We may get the following results:
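A sketch of what such a convenience sample might look like, assuming it is restricted to roughly 20-50 year-olds (an assumption made here for illustration; see the discussion below):

adults <- mj[mj$age >= 20 & mj$age <= 50, ]
g <- ggplot(data = adults, aes(age, fev1)) +
    theme_bw() +
    geom_point(alpha = 0.2) +
    geom_smooth(method = 'lm')
print(g)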

Based on these results we may conclude that on average lung function is lower as age increases.

A competing scientist scorns our results and performs her own study, but she is using a convenience sample of children and young adults and gets the following results:
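And a sketch of her sample, assumed here to span roughly 5-30 year-olds:

young <- mj[mj$age >= 5 & mj$age <= 30, ]
g <- ggplot(data = young, aes(age, fev1)) +
    theme_bw() +
    geom_point(alpha = 0.2) +
    geom_smooth(method = 'lm')
print(g)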

Based on these results she may conclude that on average lung function is higher as age increases.

Discussion

If the intention is to describe the relationship between age and FEV1 in the entire population, then both studies exhibit selection bias, to a degree that depends on what the actual age range of the entire population truly is. If the intention of the first study is to describe the relationship of FEV1 and age among 20-50 year-olds, then that study may not actually have a selection bias (barring other anomalies in the ‘convenience sample’). Similarly, if the intention of the second study is to describe the relationship between age and FEV1 in 5-30 year-olds, then that study may not have a selection bias. This demonstrates how the designation of a selection bias is determined by the extent to which the study population is representative of the target population.

Furthermore, within each study group it may be difficult to near impossible to adjust in the analysis for the differing relationships between age and FEV1 among different age groups. However, if our study has no selection bias and our participants range from 5 to 80 years old, then we can indeed observe the different relationships across age groups and adjust for age group in the analysis to get a more correct answer (in this scenario, age group would now display characteristics of interaction).


Interaction

  • Interaction
    • The relationship between 2 variables is different (or varies) based on the level of a third variable.
      • Can be synergistic or antagonistic.
      • The relationship over both groups is a weighted average of the relationships within each group.
      • Note: this is a mathematical relationship that can vary depending on how differences are computed (i.e. additive vs multiplicative differences…). The term effect modification is generally reserved for the actual biological phenomenon of synergy and antagonism.

Age, FEV1 and Sex Differences

We are now going to examine the relationship between age and FEV1 and how this relationship differs based on the level of a third variable, sex. The age-FEV1 association that we are going to examine is actually the slope of the line.

Let’s look at participants in the first level of our variable age-group. These represent participants 5-10 years-old.

Here we can see that the slopes of the lines are the same in males and females. Now let’s look at adolescents and young adults.

Here we now see a steeper upslope among males and a less steep upslope among females in this age-group, with tight standard errors around the lines. We may say that the age-FEV1 relationship (slope) differs between males and females 10-20 years of age since the slopes are different. This phenomenon could be hypothesized to reflect an effect modification (either synergistic or antagonistic, depending on your point of view) of sex on the age-FEV1 relationship during this stage of the life cycle.
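As a numerical sketch of this (not part of the original figures), one could fit a linear model with an age-by-sex interaction term in roughly this age group; the interaction coefficient estimates how much the slope differs between the sexes.

# Roughly the 10-20 year-old group described above (these cut points are an assumption)
teens <- mj[mj$age >= 10 & mj$age <= 20, ]
fit <- lm(fev1 ~ age * sex, data = teens)
summary(fit)  # the age:sexFemale coefficient is the difference in slopes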

Finally, let’s examine our adult population…

Within this age-group we see that while males and females have differences in average FEV1s (hence the lines are shifted), the slopes of the lines actually look parallel. We may conclude that among this age-group sex does not display interaction with the age-FEV1 relationship since the slopes of the lines are essentially the same.

Discussion

You may have noticed throughout this example that I utilized a fourth lurking variable, age-group. And you may have noticed that the slopes of the lines also changed by age-group. So here we may actually describe age-group as demonstrating interaction with the age-FEV1 relationship across the 3 levels of age-group, while sex displayed interaction only in the 10-20 year-old age-group.

We used a series of examples to display interaction, how the relationship between two variables (here the slope of the age-FEV1 regression line) is changed by the level of a third variable (age-group, or sex within the 10-20 year-old age-group). When interaction exists, it is important to display results that are stratified by the levels of the interacting variable to give the most accurate results. In the above example, among 10-20 year-olds you may want to report the individual age-FEV1 slopes for males and females to demonstrate important sex differences. In a randomized clinical trial, you may also consider a stratified randomization to ensure balanced proportions of each level of a known interacting factor within experimental and control groups. This sets you up to analyze the effects of treatment at different levels of the interacting variable without breaking your hard-earned randomization.

I’d like to end by noting that, as mentioned above, statistical interaction refers to a mathematical phenomenon while effect modification refers to the biological phenomenon. This is an important distinction since interaction may be demonstrated or not demonstrated in the same data depending on whether additive or multiplicative models are used to describe the differences between two groups…
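A rough sketch of that scale dependence, again for illustration only: fit the same interaction on the raw (additive) scale and on the log (multiplicative) scale, then compare the interaction terms, which need not agree.

teens <- mj[mj$age >= 10 & mj$age <= 20, ]  # same hypothetical subset as above
fit_additive <- lm(fev1 ~ age * sex, data = teens)             # additive-scale differences
fit_multiplicative <- lm(log(fev1) ~ age * sex, data = teens)  # multiplicative-scale differences
summary(fit_additive)
summary(fit_multiplicative)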

Confounding

  • Confounding
    • The relationship between an exposure and an outcome is distorted by the presence of another variable.
    • Meets all 3 conditions:
      1. Associated with the outcome
      2. Associated with the exposure
      3. Not in the causal pathway between exposure and outcome

Age, Sex, Military Service and Lung Function

# FEV1 by age, colored by military service, with linear fits
g <- ggplot(data=mja, aes(age, fev1, color=served))+
    theme_bw()+
    geom_point()+
    geom_smooth(method = 'lm', se=F)
print(g)

# The same relationship, faceted by sex
g <- ggplot(data=mja, aes(age, fev1, color=served))+
    theme_bw()+
    geom_point()+
    geom_smooth(method = 'lm', se=F)+
    facet_grid(. ~ sex)
print(g)

# Counts of military service by sex
g <- ggplot(data=mja, aes(served, fill=sex))+
    geom_bar(position = 'dodge')+
    theme_bw()
print(g)

Simpson’s Paradox

# FEV1 by age and military service (crude relationship, as above)
g <- ggplot(data=mja, aes(age, fev1, color=served))+
    theme_bw()+
    geom_point()+
    geom_smooth(method = 'lm', se=F)
print(g)

# Crude comparison of FEV1 by military service
g <- ggplot(data=mja, aes(served, fev1, fill=served))+
    geom_boxplot()+
    theme_bw()
print(g)

# FEV1 by military service, stratified by age group
g <- ggplot(data=mja, aes(served, fev1, fill=served))+
    geom_boxplot()+
    theme_bw()+
    facet_grid(.~age_cat)
print(g)

# FEV1 by military service, stratified by sex and age group
g <- ggplot(data=mja, aes(served, fev1, fill=served))+
    geom_boxplot()+
    theme_bw()+
    facet_grid(sex~age_cat)
print(g)