The goal of this supervision is to conduct a two-sample Mendelian Randomisation (MR) analysis to assess the causal effect of one variable (the exposure) on another (the outcome). We will use the TwoSampleMR package in R to conduct the analysis.
To start, read through the TwoSampleMR website (https://mrcieu.github.io/TwoSampleMR/articles/introduction.html).
Generally, you need an authentication code to use the publicly
available GWAS data (to prevent system overloads). I’ve provided you
with my authentication code below to save you the hassle of
registering.
Browse the IEU OpenGWAS database (https://gwas.mrcieu.ac.uk/) to decide
what you want to use as your exposure and outcome variables. MR performs
best when there is a clear biological link between individuals’ genes and
the outcome (or trait). Consider using educational attainment as your
outcome (for which there are large sample sizes and well-studied
mechanisms) and some health-related variable as your exposure (e.g.
alcohol/smoking/caffeine consumption).
Choose outcomes and exposures with large sample sizes (at least 100,000); otherwise there may not be enough people in the GWAS to detect many significant associations between SNPs and traits.
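One way to shortlist candidates is to filter the table of available GWASs by sample size. The sketch below uses a small invented data frame as a stand-in for the real table (the one returned by available_outcomes() in the example code); the column names id, trait and sample_size are assumed to match that table, and the rows are placeholders, not real database entries.

```r
# Toy stand-in for the table returned by available_outcomes().
# The rows are invented for illustration; only the column names are
# assumed to match the real table.
ao_toy <- data.frame(
  id          = c("toy-1", "toy-2", "toy-3", "toy-4"),
  trait       = c("Trait A", "Trait B", "Trait C", "Trait D"),
  sample_size = c(339224, 45000, 184305, NA),
  stringsAsFactors = FALSE
)

min_n <- 100000  # threshold suggested in the text

# Keep GWASs with a known sample size of at least min_n
large <- ao_toy[!is.na(ao_toy$sample_size) & ao_toy$sample_size >= min_n, ]

# Sort so that the largest studies come first
large <- large[order(-large$sample_size), ]
print(large)
```

With the real table you could apply the same filter to ao, and narrow it further by keyword, e.g. grepl("smok", ao$trait, ignore.case = TRUE).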
Example code is shown below.
# Set CRAN mirror
options(repos = "https://cloud.r-project.org")
# Install and load the TwoSampleMR package
install.packages("remotes")
library(remotes)
install_github("MRCIEU/TwoSampleMR")
library(TwoSampleMR)
auth_code = "eyJhbGciOiJSUzI1NiIsImtpZCI6ImFwaS1qd3QiLCJ0eXAiOiJKV1QifQ.eyJpc3MiOiJhcGkub3Blbmd3YXMuaW8iLCJhdWQiOiJhcGkub3Blbmd3YXMuaW8iLCJzdWIiOiJmbTUyN0BjYW0uYWMudWsiLCJpYXQiOjE3MjQzMjYwMTcsImV4cCI6MTcyNTUzNTYxN30.ZurPesLhSkKJaqr8JgnJHiRymEZBC0BGx4Ub72-nDIhCmrNE15p5GDO-R3Lkky4EF9SKETWBOadjCSsKXicoTdFhWIyjWMmCD1wjhKKR2fCSAVQh88DP4YcSTRp6T2vegpyo4lFmFai22VmxVBK3E1A2i6hYIz3AxquQeWcwVw342_uCfW8lqYEN1PBCEE8QBN18XuPPh98hWnJg8QSKR20Ia855bkLnyUbgrNxhetIUAFTJwswv3Z4hpIpfP1dkxv_7H9_SYMh9kmp8F-fNzwCdXYz2wTv-DQwYHcgmBkryS3IwyLAEA0muMfzmmcGytqkBnlOeuVaa5o0XSmzwuw"
# List available GWASs
ao <- available_outcomes(opengwas_jwt = auth_code)
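# Get instruments (SNPs) for the exposure; replace "ieu-a-2" with your chosen exposure ID
exposure_dat <- extract_instruments("ieu-a-2", opengwas_jwt = auth_code)
# Get effects of the instruments on the outcome; replace "ieu-a-7" with your chosen outcome ID
outcome_dat <- extract_outcome_data(snps = exposure_dat$SNP, outcomes = "ieu-a-7", opengwas_jwt = auth_code)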
# Harmonise the exposure and outcome data
dat <- harmonise_data(exposure_dat, outcome_dat)
# Perform MR
res <- mr(dat)
View(res)
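To get a feel for what the MR step is estimating, it can help to compute a simple version by hand. The sketch below uses invented summary statistics for four SNPs and calculates the per-SNP Wald ratios and a fixed-effect inverse-variance weighted (IVW) average of them. It is a simplified illustration under those assumptions, not the package's own routine, which may treat the standard error differently.

```r
# Toy summary statistics for four SNPs (invented for illustration).
b_exp  <- c(0.10, 0.08, 0.12, 0.05)      # SNP effect on the exposure
b_out  <- c(0.040, 0.030, 0.050, 0.022)  # SNP effect on the outcome
se_out <- c(0.010, 0.012, 0.015, 0.008)  # standard error of b_out

# Wald ratio: one causal estimate per SNP
wald    <- b_out / b_exp
wald_se <- se_out / abs(b_exp)  # first-order approximation

# Fixed-effect inverse-variance weighted average of the Wald ratios
w      <- 1 / wald_se^2
ivw    <- sum(w * wald) / sum(w)
ivw_se <- sqrt(1 / sum(w))
ivw_p  <- 2 * pnorm(-abs(ivw / ivw_se))

# The same point estimate comes from a weighted regression through the origin
fit <- lm(b_out ~ 0 + b_exp, weights = 1 / se_out^2)

print(data.frame(snp = 1:4, wald = wald, wald_se = wald_se))
print(c(ivw = ivw, se = ivw_se, pval = ivw_p, lm_slope = unname(coef(fit))))
```

The IVW estimate is a weighted average, so it always lies between the smallest and largest Wald ratio; SNPs with more precise outcome associations and stronger exposure associations get more weight.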
Once you have run this code for your chosen exposure and outcome variables, examine your results by viewing the ‘res’ (results) data.frame. For each MR method, it reports the size and sign of the estimated effect (b), its standard error (se), and a p-value (pval). If pval is below 0.05, that method’s estimate is statistically significant at the conventional 5% level.
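When reading the results, it is often easier to look at confidence intervals alongside the p-values. The sketch below works on an invented stand-in for the results table; the column names method, nsnp, b, se and pval are assumed to match what mr() returns, and the numbers are made up. The p-values are recomputed here from a normal approximation only to keep the toy table self-consistent.

```r
# Toy stand-in for the data frame returned by mr(); numbers are invented.
res_toy <- data.frame(
  method = c("MR Egger", "Weighted median", "Inverse variance weighted"),
  nsnp   = c(30, 30, 30),
  b      = c(0.21, 0.35, 0.40),
  se     = c(0.20, 0.09, 0.07),
  stringsAsFactors = FALSE
)

# Two-sided p-value from a normal approximation (toy table only)
res_toy$pval <- 2 * pnorm(-abs(res_toy$b / res_toy$se))

# 95% confidence interval for each estimate
z <- qnorm(0.975)
res_toy$ci_low  <- res_toy$b - z * res_toy$se
res_toy$ci_high <- res_toy$b + z * res_toy$se

# Flag methods that are significant at the 5% level
res_toy$significant <- res_toy$pval < 0.05

print(res_toy)
```

In this toy table the three methods agree on the sign of the effect but not on significance, which is the kind of pattern worth commenting on in your report. The same columns can be added to your real res table in exactly the same way.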
Write a short report presenting and analysing your results. Reflect on what might be driving the causal association (if there is one) or why there may not be a causal association. Also discuss the limitations of this method. In particular, discuss why the results from the GWAS may not reflect a true causal association between the gene and the outcome [hint: search for ‘horizontal pleiotropy’].
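To see why pleiotropy matters, the simulation below may help. It is a toy sketch with invented parameters: the exposure has no causal effect on the outcome, but every SNP also affects the outcome directly. A regression through the origin (the IVW idea) then reports a spurious effect, while an MR-Egger-style regression, which adds an intercept, absorbs the direct effects. All SNP-exposure effects are simulated as positive, which the Egger approach assumes.

```r
set.seed(1)
n_snps <- 50

# Simulated toy data: NO causal effect of the exposure on the outcome,
# but each SNP has a direct effect on the outcome (horizontal pleiotropy).
b_exp  <- runif(n_snps, 0.05, 0.20)               # SNP-exposure effects (all positive)
alpha  <- rnorm(n_snps, mean = 0.02, sd = 0.005)  # direct SNP-outcome effects
se_out <- rep(0.005, n_snps)                      # standard errors of b_out
b_out  <- 0 * b_exp + alpha + rnorm(n_snps, sd = se_out)

# IVW-style fit: weighted regression forced through the origin
ivw_fit <- lm(b_out ~ 0 + b_exp, weights = 1 / se_out^2)

# MR-Egger-style fit: same regression, but with an intercept
egger_fit <- lm(b_out ~ b_exp, weights = 1 / se_out^2)

ivw_slope       <- unname(coef(ivw_fit)[1])
egger_intercept <- unname(coef(egger_fit)[1])
egger_slope     <- unname(coef(egger_fit)[2])

print(c(ivw_slope = ivw_slope,
        egger_intercept = egger_intercept,
        egger_slope = egger_slope))
```

An intercept clearly different from zero suggests directional pleiotropy. On real data, the TwoSampleMR package offers mr_pleiotropy_test(dat) for the Egger intercept and mr_heterogeneity(dat) for heterogeneity between SNPs; check the package documentation for how to read their output.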