A research group (Sniekers et al. 2017) yesterday published the largest GWAS of intelligence. I will calculate polygenic scores using the independent hits and compare them with polygenic scores computed from previous educational attainment GWAS and reported in Piffer (2017). You will need to load 5 external files (in this order):“Nature2017IntelligenceGWAS.csv”,“df_merged_Nature_top_positive.csv”,“Int_EA_Replicated.csv”, “Matched_18_freqs.csv”,“Replication_linked_Okbay.csv”, “FstDistances.csv” Download them from: osf.io/5yhf
Load packages
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(RMySQL)
## Loading required package: DBI
Call to 1000 Genomes database
sql = src_mysql("population_freqs", host = "128.199.203.208", user = "anon", port = 3306)
sql_1kg = tbl(sql, "1000genomes_phase3")
Load list of SNPs from file Nature2017IntelligenceGWAS.csv
Nature2017IntelligenceGWAS=read.csv(file.choose(),header=TRUE)
Nature2017_ind=Nature2017IntelligenceGWAS
snps = Nature2017_ind$rsID
Fetch from SQL server
(Nature_2017_SNPs= sql_1kg %>% filter(SNP %in% snps) %>% collect())
## # A tibble: 15 × 35
## CHR SNP A1 A2 ACB ASW BEB CDX CEU CHB
## <int> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2 rs10191758 G A 0.1979 0.2049 0.1047 0.0000 0.3687 0.0097
## 2 7 rs10236197 C T 0.5833 0.5000 0.3372 0.4839 0.3586 0.5728
## 3 9 rs11138902 A G 0.2865 0.4016 0.4302 0.3548 0.5859 0.4612
## 4 1 rs12744310 T C 0.1042 0.1066 0.1047 0.4516 0.2172 0.3010
## 5 16 rs12928404 C T 0.6042 0.6066 0.3023 0.5806 0.3737 0.4563
## 6 2 rs13010010 T C 0.0417 0.0492 0.1163 0.4839 0.4091 0.5291
## 7 17 rs16954078 A T 0.0260 0.0492 0.1105 0.0000 0.2020 0.0049
## 8 13 rs2251499 T C 0.1562 0.1803 0.1512 0.1129 0.2727 0.1117
## 9 6 rs2490272 T C 0.1354 0.2213 0.4651 0.5591 0.6465 0.8107
## 10 22 rs36093924 T C 0.5365 0.5492 0.5174 0.2527 0.4444 0.2379
## 11 7 rs4728302 T C 0.2708 0.4508 0.3837 0.6505 0.5505 0.4757
## 12 2 rs6746731 G T 0.2448 0.3361 0.5814 0.2419 0.5909 0.3835
## 13 3 rs6779302 T G 0.4479 0.4180 0.3721 0.0107 0.3636 0.0583
## 14 3 rs7646501 G A 0.7240 0.7213 0.3721 0.1129 0.2374 0.1796
## 15 6 rs9320913 A C 0.1615 0.2295 0.2907 0.4409 0.5000 0.4369
## # ... with 25 more variables: CHS <dbl>, CLM <dbl>, ESN <dbl>, FIN <dbl>,
## # GBR <dbl>, GIH <dbl>, GWD <dbl>, IBS <dbl>, ITU <dbl>, JPT <dbl>,
## # KHV <dbl>, LWK <dbl>, MSL <dbl>, MXL <dbl>, PEL <dbl>, PJL <dbl>,
## # PUR <dbl>, STU <dbl>, TSI <dbl>, YRI <dbl>, EAS <dbl>, EUR <dbl>,
## # AFR <dbl>, AMR <dbl>, SAS <dbl>
Merge file with downloaded frequencies to original input file with extra info (z scores, p value, etc)
df_merged_Nature_top=merge(Nature_2017_SNPs,Nature2017_ind,by.x="SNP",by.y="rsID")
write.csv(df_merged_Nature_top,"df_merged_Nature_top.csv")
Unfortunately 3 variants were missing (rs66495454,rs113315451,rs41352752). Two indels and 1 SNP.Had to add them manually by using 1000 Genomes Ensembl browswer. Manually reversed the allele when z score was negative to get only positive alleles. Load updated file “df_merged_Nature_top_positive”
df_merged_Nature_top_positive=read.csv(file.choose(),header=TRUE)
df_pol_top=df_merged_Nature_top_positive[,1:26]
Calculate polygenic scores
PS_top=apply(df_pol_top,2,mean)
PS_top
## ACB ASW BEB CDX CEU CHB CHS
## 0.4753722 0.4690556 0.4312000 0.4925333 0.4747000 0.5264056 0.5071167
## CLM ESN FIN GBR GIH GWD IBS
## 0.4622167 0.4685500 0.4621556 0.4654611 0.4706389 0.4717833 0.4535333
## ITU JPT KHV LWK MSL MXL PEL
## 0.4305722 0.5315111 0.5073278 0.4567889 0.4885778 0.4422722 0.4467167
## PJL PUR STU TSI YRI
## 0.4433056 0.4551167 0.4439056 0.4579222 0.4704444
Apply weights (z scores)
df_weighted_top=(df_merged_Nature_top_positive[,1:26])*(abs(df_merged_Nature_top_positive$z))
PS_top_w=apply(df_weighted_top,2,mean)
PS_top_w
## ACB ASW BEB CDX CEU CHB CHS CLM
## 2.791038 2.760445 2.563153 2.922041 2.836348 3.149946 3.019268 2.760920
## ESN FIN GBR GIH GWD IBS ITU JPT
## 2.750451 2.754825 2.775733 2.799647 2.767001 2.700084 2.549037 3.160909
## KHV LWK MSL MXL PEL PJL PUR STU
## 3.014528 2.686187 2.872146 2.630699 2.655750 2.635490 2.715050 2.627435
## TSI YRI
## 2.727961 2.760959
Sniekers et al. (2017) also identified 9 SNPs that reached significance both in their study and in the Okbay educational attainment GWAS. As this constitutes a quasi-replication, these SNPs are more likely to be real (or closer to) causal variants. Load file “Int_EA_Replicated.csv”
df_int_EA=read.csv(file.choose(),header=TRUE)
PS_int_EA=apply(df_int_EA[,2:27],2,mean)
PS_int_EA
## ACB ASW BEB CDX CEU CHB CHS
## 0.5197333 0.5236667 0.5445667 0.6320667 0.5813111 0.6272889 0.6381778
## CLM ESN FIN GBR GIH GWD IBS
## 0.5555667 0.4988889 0.5768667 0.5689667 0.5378556 0.4970889 0.5596778
## ITU JPT KHV LWK MSL MXL PEL
## 0.5522444 0.6255556 0.6274222 0.4909889 0.5267778 0.5441889 0.5607667
## PJL PUR STU TSI YRI
## 0.5428000 0.5416667 0.5293667 0.5617000 0.5011111
Compute correlation between unweighted, weighted PS, Piffer-estimated PGS, Population IQ (Piffer 2015). PGS were calculated using Okbay et al. (2016), Davies et al. (2016) and Rietveld et al. (2013) meta-analyses of educational attainment. For more details see Piffer (2017)
IQ=c(83,85,81,NA,99,105,105,83.5,71,101,100,NA,62,97,NA,105,99.4,74,64,88,85,84,83.5,79,99,71)
Piffer_2017_9= c(0.323,0.355,0.401,0.485,0.428,0.553,0.535,0.394,0.334,0.477,0.469,0.429,
0.328,0.438,0.412,0.530,0.517,0.333,0.341,0.385,0.353,0.419,0.394,0.404,0.439,
0.340)
Piffer_2017_Okbay_162SNPs=c(0.485,0.483,0.503,0.516,0.503,0.529,0.522, 0.498,0.479,0.516,0.506,
0.506,0.484,0.513,0.504,0.522,0.522,0.477,0.473, 0.495,0.474,0.505,
0.498,0.501,0.513,0.482)
df_PS=data.frame(PS_top,PS_top_w,PS_int_EA,IQ,Piffer_2017_9,Piffer_2017_Okbay_162SNPs)
IQ has NAs, hence use pairwise deletion
cor(df_PS,use="pairwise.complete.obs")
## PS_top PS_top_w PS_int_EA IQ
## PS_top 1.0000000 0.9936323 0.6136411 0.4088755
## PS_top_w 0.9936323 1.0000000 0.6814205 0.4966647
## PS_int_EA 0.6136411 0.6814205 1.0000000 0.8766974
## IQ 0.4088755 0.4966647 0.8766974 1.0000000
## Piffer_2017_9 0.5731021 0.6460421 0.9241116 0.8797065
## Piffer_2017_Okbay_162SNPs 0.4218000 0.4979507 0.8351598 0.8543011
## Piffer_2017_9 Piffer_2017_Okbay_162SNPs
## PS_top 0.5731021 0.4218000
## PS_top_w 0.6460421 0.4979507
## PS_int_EA 0.9241116 0.8351598
## IQ 0.8797065 0.8543011
## Piffer_2017_9 1.0000000 0.9500431
## Piffer_2017_Okbay_162SNPs 0.9500431 1.0000000
Correlation between the intelligence PS and IQ,PS_Piffer2017,PS_Piffer_2017_162SNPs are r=0.496, 0.646, 0.497. Correlations between the intelligence-EA PS and IQ, PS_Piffer2017_9, PS_Piffer_2016_162SNPs are r= 0.877, 0.924, 0.835 These are lower than the correlations previously observed (Piffer, 2017)
Run Monte Carlo simulation using random SNPs First to calculate Monte Carlo p value for correlation with IQ Load random SNPs (N=13,130). “Matched_18_freqs.csv”
sql_random_freqs <- read.csv(file.choose(),header=TRUE)
Keep only freqs.Transpose
t_random_freqs=t(sql_random_freqs[6:31])
Compute non-overlapping polygenic scores of same size (18)
byapply <- function(t_random_freqs, by, fun, ...)
{
# Create index list
if (length(by) == 1)
{
nc <- ncol(t_random_freqs)
split.index <- rep(1:ceiling(nc / by), each = by, length.out = nc)
} else # 'by' is a vector of groups
{
nc <- length(by)
split.index <- by
}
index.list <- split(seq(from = 1, to = nc), split.index)
# Pass index list to fun using sapply() and return object
sapply(index.list, function(i)
{
do.call(fun, list(t_random_freqs[, i], ...))
})
}
# Run function
y <- byapply(t_random_freqs, 18, rowMeans)
Compute corr between IQ and PGS for random sets of 18 SNPs (N=730)
cor_random18_IQ=cor(y,IQ,use="complete.obs")
Calculate Monte Carlo p value (applying formula with correction) for weighted PS
r_w=length(which(cor_random18_IQ>=0.496))
n_w=length(cor_random18_IQ)
MC_p_w=(r_w+1)/(n_w+1)
MC_p_w
## [1] 0.3816689
The MC p value is 0.382. This means that the same results (r x IQ > 0.496) can be achieved with random SNPs roughly 38.2% of the times.
Create correlation matrix of random and real (PS 9) polygenic scores to obtain a Monte Carlo p value for the correlation between the 9 SNPs EA Polygenic score and the new Nature 2017 polygenic score.
cor_random18_PS_9=cor(y,Piffer_2017_9,use="complete.obs")
Calculate p value
r_18_9=length(which(cor_random18_PS_9>=0.646))
n_18_9=length(cor_random18_PS_9)
MC_p_18_9=(r_18_9+1)/(n_18_9+1)
MC_p_18_9
## [1] 0.120383
Calculate PS by continent
df_weighted_top_continent=(df_merged_Nature_top_positive[,27:31])*(abs(df_merged_Nature_top_positive$z))
PS_top_cont=apply(df_weighted_top_continent,2,mean)
PS_top_cont
## EAS EUR AFR AMR SAS
## 2.913380 2.736141 2.781131 2.720869 2.624233
The MC p value is 0.120. This means that the same results (r x PS_9 > 0.646) can be achieved with random SNPs roughly 12% of the times. In general, the only robust finding of this study is an East Asian (EAS) advantage but the scores for the other populations are largely overlapping.
Run MC simulation for the 9 intelligence-EA pseudo-replicated SNPs.
byapply <- function(t_random_freqs, by, fun, ...)
{
# Create index list
if (length(by) == 1)
{
nc <- ncol(t_random_freqs)
split.index <- rep(1:ceiling(nc / by), each = by, length.out = nc)
} else # 'by' is a vector of groups
{
nc <- length(by)
split.index <- by
}
index.list <- split(seq(from = 1, to = nc), split.index)
# Pass index list to fun using sapply() and return object
sapply(index.list, function(i)
{
do.call(fun, list(t_random_freqs[, i], ...))
})
}
# Run function
y_9 <- byapply(t_random_freqs, 9, rowMeans)
Compute corr between random sets of 9 SNPs and IQ
cor_random9_IQ=cor(y_9,IQ,use="complete.obs")
Calculate p value
r_9_9=length(which(cor_random9_IQ>=0.876))
n_9_9=length(cor_random9_IQ)
MC_p_9_9=(r_9_9+1)/(n_9_9+1)
MC_p_9_9
## [1] 0.00890411
The MC p value for the 9 intelligence-EA quasi-replicated corr x IQ PGS is 0.009.
Factor analysis was used to extrapolate a signal of directional polygenic selection for the 18 intelligence SNPs(Piffer, 2015)
library(psych)
fa_int=fa(t(df_pol_top))
fa_int$scores
## MR1
## ACB -1.27631875
## ASW -0.96102733
## BEB -0.07533494
## CDX 1.35058189
## CEU 0.84374466
## CHB 1.10943690
## CHS 1.20766955
## CLM 0.35707546
## ESN -1.65963577
## FIN 0.77070237
## GBR 0.79683553
## GIH -0.04922992
## GWD -1.35832632
## IBS 0.63069832
## ITU -0.07433635
## JPT 0.87816364
## KHV 1.26670030
## LWK -1.59864602
## MSL -1.44376294
## MXL 0.21476285
## PEL -0.05987738
## PJL 0.06571488
## PUR 0.37462562
## STU -0.39063541
## TSI 0.76435902
## YRI -1.68393985
The results show clearer differences between populations, compared to the polygenic scores and look similar to the polygenic scores published by Piffer (2015, 2017)
cor(fa_int$scores,df_PS)
## PS_top PS_top_w PS_int_EA IQ Piffer_2017_9
## MR1 0.3082242 0.4038636 0.8990939 NA 0.8876893
## Piffer_2017_Okbay_162SNPs
## MR1 0.8799348
Compute correlation with IQ (remove NA pairwise)
cor(fa_int$scores,df_PS,use = "pairwise.complete.obs")
## PS_top PS_top_w PS_int_EA IQ Piffer_2017_9
## MR1 0.3082242 0.4038636 0.8990939 0.9034576 0.8876893
## Piffer_2017_Okbay_162SNPs
## MR1 0.8799348
Repeat for the intelligence-EA SNPs
fa_int_ea=fa(t(df_int_EA[,2:27]))
fa_int_ea$scores
## MR1
## ACB -1.0629236351
## ASW -0.9970987866
## BEB -0.6601555423
## CDX 1.2507053503
## CEU 0.7542623611
## CHB 1.3743195824
## CHS 1.6347042174
## CLM -0.1131873487
## ESN -1.2547291601
## FIN 0.5814735810
## GBR 0.7815198753
## GIH -0.0005317457
## GWD -1.1861165277
## IBS 0.4755995807
## ITU -0.2120756729
## JPT 1.3210159053
## KHV 1.9253948560
## LWK -1.2545719258
## MSL -1.1648724693
## MXL -0.2591111303
## PEL -0.7621497853
## PJL 0.0347293802
## PUR -0.2075127835
## STU -0.4320448257
## TSI 0.6767470955
## YRI -1.2433904462
cor(fa_int_ea$scores,df_PS)
## PS_top PS_top_w PS_int_EA IQ Piffer_2017_9
## MR1 0.5516593 0.626515 0.9401761 NA 0.9618424
## Piffer_2017_Okbay_162SNPs
## MR1 0.9236177
cor(fa_int_ea$scores,df_PS,use = "pairwise.complete.obs")
## PS_top PS_top_w PS_int_EA IQ Piffer_2017_9
## MR1 0.5516593 0.626515 0.9401761 0.9061601 0.9618424
## Piffer_2017_Okbay_162SNPs
## MR1 0.9236177
Indeed, as we can see above, the factor scores have very high correlations with population IQ (r=0.9) and the 162 SNPs polygenic score from Okbay et.al (2017) and the SNPs polygenic score made from the 9 pseudo-repllicated SNPs (Piffer, 2017). Thus, in this instance factor analysis outperforms simply averging allele frequencies.
Show factor loadings
fa_int$loadings
##
## Loadings:
## MR1
## [1,]
## [2,] -0.382
## [3,] 0.710
## [4,] -0.778
## [5,] -0.329
## [6,] 0.929
## [7,] -0.396
## [8,] 0.238
## [9,] 0.921
## [10,] 0.568
## [11,] -0.713
## [12,] -0.422
## [13,] 0.829
## [14,] -0.977
## [15,] 0.840
## [16,] 0.726
## [17,] -0.834
## [18,] 0.310
##
## MR1
## SS loadings 7.965
## Proportion Var 0.443
Calculate mean fa loadings
mean(fa_int$loadings)
## [1] 0.06868852
Repeat for the int-ea SNPs
fa_int_ea$loadings
##
## Loadings:
## MR1
## [1,] -0.358
## [2,] -0.883
## [3,] 0.994
## [4,] 0.156
## [5,] 0.305
## [6,] 0.875
## [7,] 0.789
## [8,] 0.475
## [9,] -0.510
##
## MR1
## SS loadings 3.888
## Proportion Var 0.432
mean(fa_int_ea$loadings)
## [1] 0.2047414
Finally, a Monte Carlo simulation with the random (N=730) factor scores. First carry out fa over groups of 18 SNPs…see first group output as test
ind=seq(1,(ncol(t_random_freqs)-18),by=18)
results_fa=lapply(ind, function(x) fa(t_random_freqs[, x + 0:17]))
results_fa[[1]]$scores
## MR1
## ACB 1.11569074
## ASW 0.53001953
## BEB -0.52523418
## CDX -1.55733116
## CEU 0.47450740
## CHB -1.95284548
## CHS -1.80867302
## CLM 0.28743412
## ESN 0.67975332
## FIN 0.15299121
## GBR 0.60614566
## GIH -0.17495926
## GWD 1.03746149
## IBS 0.92244653
## ITU 0.13683089
## JPT -1.84459552
## KHV -1.81687202
## LWK 0.79571481
## MSL 1.33625037
## MXL 0.07936017
## PEL 0.31384984
## PJL -0.24407216
## PUR 0.39328959
## STU -0.23122284
## TSI 0.35191022
## YRI 0.94214976
Save factor scores
results_fa_scores=lapply(results_fa, "[[" , "scores" )
Convert list to df and transpose
df_fa_scores=data.frame(lapply(data.frame(t(sapply(results_fa_scores, `[`))), unlist))
df_fa_scores=t(df_fa_scores)
Compute correlation between population IQ and random factor scores
cor_fa_iq=cor(df_fa_scores,IQ,use = "pairwise.complete.obs")
Compute Monte Carlo p value for corr x population IQ
r_iq=length(which(cor_fa_iq>=0.9))
n_iq=length(cor_fa_iq)
MC_p_iq=(r_iq+1)/(n_iq+1)
MC_p_iq
## [1] 0.004109589
Repeat for intelligence-EA factor
ind=seq(1,(ncol(t_random_freqs)-9),by=9)
results_fa_int_ea=lapply(ind, function(x) fa(t_random_freqs[, x + 0:8]))
results_fa_scores_int_ea=lapply(results_fa_int_ea, "[[" , "scores" )
df_fa_scores_int_ea=data.frame(lapply(data.frame(t(sapply(results_fa_scores_int_ea, `[`))), unlist))
df_fa_scores_int_ea=t(df_fa_scores_int_ea)
cor_fa_iq_int_ea=cor(df_fa_scores_int_ea,IQ,use = "pairwise.complete.obs")
Compute Monte Carlo p value for corr x population IQ
r_iq_int_ea=length(which(cor_fa_iq_int_ea>=0.9))
n_iq_int_ea=length(cor_fa_iq_int_ea)
MC_p_iq_int_ea=(r_iq_int_ea+1)/(n_iq_int_ea+1)
MC_p_iq_int_ea
## [1] 0.002056203
Compute correlation between PGS Okbay et al. (N=162) and random factor scores
cor_fa_Okbay=cor(df_fa_scores,Piffer_2017_Okbay_162SNPs,use = "pairwise.complete.obs")
Compute Monte Carlo p value
r_Okbay=length(which(cor_fa_Okbay>=0.879))
n_Okbay=length(cor_fa_Okbay)
MC_p_Okbay=(r_Okbay+1)/(n_Okbay+1)
MC_p_Okbay
## [1] 0.02191781
Do the same as above for the 9 EA SNPs PS (Piffer, 2017)
cor_fa_9=cor(df_fa_scores,Piffer_2017_9,use = "pairwise.complete.obs")
r_9=length(which(cor_fa_9>=0.887))
n_9=length(cor_fa_9)
MC_p_9=(r_9+1)/(n_9+1)
MC_p_9
## [1] 0.01643836
The Monte Carlo simulation showed that the correlation between the factor score obtained from the intelligence GWAS and population IQ, and the previous educational attainment GWAS polygenic scores, is higher than the vast majority of the correlations produced stochastically (i.e. correlations between factor scores of random SNPs and population IQ and ed. att. GWAS PS are almost always lower). Specifically, the empirical (MC) p values are very low (0.004,0.022, 0.016 for population IQ, Okbay 162 PS and Piffer 2017 PS, respectively).
Calculate 9 SNPs EA factor score (load “Replication_linked_Okbay.csv”“)
Replication_linked_Okbay <- read.csv(file.choose(),header=TRUE)
fa_9_EA=fa(Replication_linked_Okbay[,2:10])
Show factor loadings
fa_9_EA$loadings
##
## Loadings:
## MR1
## rs1008078 -0.801
## rs11588857 0.831
## rs12987662 0.956
## rs148734725 -0.492
## rs11712056 0.606
## rs62263923 0.863
## rs13294439 0.910
## rs12969294 0.459
## rs9320913 0.717
##
## MR1
## SS loadings 5.153
## Proportion Var 0.573
Compute mean fa loadings
mean(fa_9_EA$loadings)
## [1] 0.4499548
Correlation between 3 factors: 9 SNPs EA factor, Intell factor, Int-EA factor
df_fac=data.frame(fa_9_EA$scores,fa_int$scores,fa_int_ea$scores)
cor(df_fac)
## MR1 MR1.1 MR1.2
## MR1 1.0000000 0.9608041 0.9459244
## MR1.1 0.9608041 1.0000000 0.9347101
## MR1.2 0.9459244 0.9347101 1.0000000
Carry out Monte Carlo analysis for EA x int correlation
cor_fa_18=cor(df_fa_scores,fa_9_EA$scores,use = "pairwise.complete.obs")
r_18=length(which(cor_fa_18>=0.96))
n_18=length(cor_fa_18)
MC_p_18=(r_18+1)/(n_18+1)
MC_p_18
## [1] 0.02328767
The correlation between the EA and intelligence factors is 0.96 (MC p= 0.023)
Controlling for spatial autocorrelation. Load file “FstDistancesAlph_2016.csv” EA factor
FstDistances <- read.csv(file.choose(),header=TRUE)
library(caTools)
library(lm.beta)
library(car)
##
## Attaching package: 'car'
## The following object is masked from 'package:psych':
##
## logit
## The following object is masked from 'package:dplyr':
##
## recode
combinations_EA=combs(fa_9_EA$scores, 2)
combinations_int=combs(fa_int$scores,2)
#SAC control 9 EA factor
cnames_EA=provideDimnames(combinations_EA, sep = "", base = list(LETTERS))
cnamesdf_EA=as.data.frame(cnames_EA)
distances_EA9=cnamesdf_EA$A-cnamesdf_EA$B
fa_9_dist=abs(distances_EA9)
FstDistances$fa_9_dist=fa_9_dist
fit_dist_fa_9=lm(FstDistances$IQdist ~FstDistances$Fst_Chr1+FstDistances$fa_9_dist)
summary(fit_dist_fa_9)
##
## Call:
## lm(formula = FstDistances$IQdist ~ FstDistances$Fst_Chr1 + FstDistances$fa_9_dist)
##
## Residuals:
## Min 1Q Median 3Q Max
## -18.1961 -5.6657 0.0894 5.5857 17.9431
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.2076 0.8819 5.905 1.15e-08 ***
## FstDistances$Fst_Chr1 -26.1310 19.8261 -1.318 0.189
## FstDistances$fa_9_dist 10.3841 1.1716 8.864 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7.683 on 250 degrees of freedom
## (72 observations deleted due to missingness)
## Multiple R-squared: 0.5027, Adjusted R-squared: 0.4987
## F-statistic: 126.3 on 2 and 250 DF, p-value: < 2.2e-16
betafit_fit_dist_9=lm.beta(fit_dist_fa_9)
structure(betafit_fit_dist_9)
##
## Call:
## lm(formula = FstDistances$IQdist ~ FstDistances$Fst_Chr1 + FstDistances$fa_9_dist)
##
## Standardized Coefficients::
## (Intercept) FstDistances$Fst_Chr1 FstDistances$fa_9_dist
## 0.0000000 -0.1207470 0.8120172
vif(fit_dist_fa_9)
## FstDistances$Fst_Chr1 FstDistances$fa_9_dist
## 4.218947 4.218947
SAC control Intelligence factor
cnames_int=provideDimnames(combinations_int, sep = "", base = list(LETTERS))
cnamesdf_int=as.data.frame(cnames_int)
distances_int=cnamesdf_int$A-cnamesdf_int$B
fa_int_dist=abs(distances_int)
FstDistances$fa_int_dist=fa_int_dist
fit_dist_fa_int=lm(FstDistances$IQdist ~FstDistances$Fst_Chr1+FstDistances$fa_int_dist)
summary(fit_dist_fa_int)
##
## Call:
## lm(formula = FstDistances$IQdist ~ FstDistances$Fst_Chr1 + FstDistances$fa_int_dist)
##
## Residuals:
## Min 1Q Median 3Q Max
## -20.0620 -5.0967 0.1203 4.7051 18.0239
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.5172 0.8315 5.433 1.32e-07 ***
## FstDistances$Fst_Chr1 -23.4298 16.2952 -1.438 0.152
## FstDistances$fa_int_dist 10.7256 0.9639 11.128 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7.203 on 250 degrees of freedom
## (72 observations deleted due to missingness)
## Multiple R-squared: 0.5629, Adjusted R-squared: 0.5594
## F-statistic: 161 on 2 and 250 DF, p-value: < 2.2e-16
betafit_fit_dist_int=lm.beta(fit_dist_fa_int)
structure(betafit_fit_dist_int)
##
## Call:
## lm(formula = FstDistances$IQdist ~ FstDistances$Fst_Chr1 + FstDistances$fa_int_dist)
##
## Standardized Coefficients::
## (Intercept) FstDistances$Fst_Chr1 FstDistances$fa_int_dist
## 0.0000000 -0.1082649 0.8378747
vif(fit_dist_fa_int)
## FstDistances$Fst_Chr1 FstDistances$fa_int_dist
## 3.242631 3.242631
The EA and Intell factor scores predict population IQ also after controlling for SAC (Beta= 0.81 and 0.84, respectively)
References:
Piffer, D. Can We Detect Polygenic Selection on Cognitive Ability Using GWAS Hits? Employing Random SNPs as a Null Model.. Preprints 2017, 2017010127 (doi: 10.20944/preprints201701.0127.v2).
Sniekers et al. Genome-wide association meta-analysis of 78,308 individuals identifies new loci and genes influencing human intelligence, Nature Genetics 2017 (doi: doi:10.1038/ng.3869)