I will calculate group-level polygenic scores (PGS) for 52 populations and 7 super-populations from SNPs found to explain variation in Morningness-Eveningness (Phenotype 1180) in the Neale group's GWAS of UK Biobank data (https://docs.google.com/spreadsheets/d/1b3oGI2lUt57BcuHttWaZotQcI0-mBRPyZihz87Ms_No/edit#gid=1209628142).
A prediction based on observed phenotypic and genotypic differences in chronotype (Eastman et al., 2016; Malone et al., 2017) is that Africans score lower on Eveningness than Europeans. Data regarding other populations are lacking. In the UK Biobank sample, Blacks were found to have significantly shorter sleep and 62% greater odds of being morning type (Malone et al., 2017). In another study, African Americans scored higher on morningness on the Morningness-Eveningness Questionnaire (MEQ). The other sleep variables pointed in the same direction: African Americans also had slightly earlier baseline dim light melatonin onset (DLMO), wake time and bedtime, and phase angle of entrainment (the temporal relationship between the internal, master, circadian clock and external zeitgebers), although none of these differences were statistically significant (Eastman et al., 2016).
It was proposed that this ethnic difference evolved through selective pressures arising from seasonal change in photoperiod (duration of daylight) at higher latitudes. For humans living around the equator, where the photoperiod is constant across the year, a shorter (approx. 24 h) circadian period was more adaptive, and there was no pressure to alter it for humans who moved to West Africa from their original cradle in East Africa, as long as they stayed around the equator. According to this theory, a longer circadian period was adaptive for entrainment to the seasonally changing photoperiod farther from the equator (Pittendrigh, 1993). Latitudinal clines in circadian period have been found in insects and plants (Hut et al., 2013).
(Skip this part if you are not running the code but just reading.) ** Download files from: https://osf.io/ewxqj/ HGDP frequencies file: “1180_results.csv” (frequencies of GWAS hits in 52 populations, pre-computed using the HGDP-CEPH browser, http://spsmart.cesga.es/ceph.php?dataSet=ceph_stanford). GWAS summary file: “1180_sig_8.txt”. HGDP frequencies for 1180 by continent file: “1180_results_continent.csv”.
You will be asked to upload them (in the same order) as the code runs. Note that the GWAS files were not LD clumped because this would leave too few SNPs in the HGDP dataset. However, LD clumping should be performed on the output HGDP files. For simplicity, this step is omitted here but will be done in a follow-up version. **
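To give a rough idea of what thinning correlated hits involves, here is a minimal sketch of distance-based pruning. This is a crude stand-in for LD clumping, not the procedure planned for the follow-up: true clumping needs pairwise LD (r²) from a reference panel, whereas this sketch only drops a SNP if it lies within a fixed window of a more significant SNP on the same chromosome. The column names `chr`, `pos` and `pval`, the function `distance_prune` and the 250 kb window are all assumptions made for illustration.

```r
# Hypothetical helper: greedy distance-based pruning (NOT true LD clumping).
# Assumes columns chr (chromosome), pos (base-pair position), pval (GWAS p-value).
distance_prune <- function(df, window = 250000) {
  df <- df[order(df$pval), ]            # most significant SNPs first
  keep <- logical(nrow(df))
  for (i in seq_len(nrow(df))) {
    kept <- df[keep, , drop = FALSE]    # SNPs retained so far
    near <- kept$chr == df$chr[i] & abs(kept$pos - df$pos[i]) < window
    keep[i] <- !any(near)               # keep only if no stronger SNP is nearby
  }
  df[keep, , drop = FALSE]
}

# Toy data, invented for illustration only
toy_hits <- data.frame(
  SNP  = c("rs1", "rs2", "rs3", "rs4"),
  chr  = c(1, 1, 1, 2),
  pos  = c(1000, 50000, 900000, 1200),
  pval = c(1e-10, 1e-9, 1e-8, 1e-12),
  stringsAsFactors = FALSE
)
pruned <- distance_prune(toy_hits)
pruned$SNP
```

In the toy data, rs2 sits within 250 kb of the more significant rs1 and is dropped, while rs3 and rs4 are retained.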
HGDP_CEPH=read.csv(file.choose(), header=TRUE, sep = ";")#open HGDP-CEPH browser output with freqs from GWAS hits
gwas=read.csv(file.choose(), header=TRUE)#open GWAS summary file
Merge two datasets
HGDP_CEPH_merged=merge(HGDP_CEPH,gwas, by.x = "SNP", by.y = "SNP")
Split into 8 groups according to levels of vector “var”: “AC” ,“AG”, “CA”, “CT”, “GA”, “GT” ,“TC”, “TG”
Note that some SNPs are on the reverse strand, so AC in one dataset is TG in the other and vice versa, which causes a non-match. Select reverse-stranded SNPs and translate them to the same strand as the HGDP-CEPH reference panel. Then flip the frequency of non-matching alleles (i.e. when A2, rather than A1, is the HGDP reference allele).
Create objects based on match or non-match between A1 and the reference allele, and flip the frequency of non-matching alleles.
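The code for these steps is not shown, so the following is only a sketch of one way the strand translation and frequency flip could be done. It does not reproduce the split into eight groups used here. The column names `ref`, `alt` (HGDP alleles), `freq` (frequency of `ref`), `A1`, `A2` (GWAS alleles) and the function `harmonise_freq` are hypothetical. Because the eight allele pairs listed above contain no palindromic pairs (A/T or C/G), complementing the alleles is unambiguous.

```r
# Map each base to its complement on the opposite strand
complement <- c(A = "T", C = "G", G = "C", T = "A")

# Hypothetical helper: returns df with freq_A1, the frequency of GWAS allele A1
harmonise_freq <- function(df) {
  ref <- as.character(df$ref); alt <- as.character(df$alt)
  a1  <- as.character(df$A1);  a2  <- as.character(df$A2)

  # SNPs whose GWAS alleles already match the HGDP strand
  same_strand <- (a1 == ref & a2 == alt) | (a1 == alt & a2 == ref)

  # Reverse-stranded SNPs: translate GWAS alleles to the HGDP strand
  a1[!same_strand] <- unname(complement[a1[!same_strand]])
  a2[!same_strand] <- unname(complement[a2[!same_strand]])

  # Keep freq when A1 is the reference allele, flip it when A2 is;
  # anything still unmatched is set to NA rather than guessed
  df$freq_A1 <- ifelse(a1 == ref & a2 == alt, df$freq,
                ifelse(a1 == alt & a2 == ref, 1 - df$freq, NA_real_))
  df
}

# Toy data, invented for illustration only
toy_snps <- data.frame(
  SNP = c("rs1", "rs2", "rs3", "rs4"),
  ref = "A", alt = "C", freq = 0.2,
  A1  = c("A", "C", "T", "A"),
  A2  = c("C", "A", "G", "G"),
  stringsAsFactors = FALSE
)
harmonised <- harmonise_freq(toy_snps)
harmonised[, c("SNP", "A1", "A2", "freq_A1")]
```

In the toy data, rs1 matches directly, rs2 needs its frequency flipped, rs3 is reverse-stranded and matches after complementing, and rs4 cannot be reconciled and is left as NA.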
Calculate weighted frequency by SNP by group.
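The weighting code is not shown either. The sketch below rests on an assumption: that the weighted frequency (the `GVS` column used in the PGS computation) is the frequency of the effect allele multiplied by its GWAS effect size. The column names `freq_A1` and `beta` are hypothetical, and the numbers are invented.

```r
# Toy data: two populations, two SNPs (values invented for illustration)
toy_weighted <- data.frame(
  population = c("A", "A", "B", "B"),
  SNP        = c("rs1", "rs2", "rs1", "rs2"),
  freq_A1    = c(0.5, 0.2, 0.1, 0.9),      # effect-allele frequency
  beta       = c(0.01, -0.02, 0.01, -0.02) # GWAS effect size of A1
)

# ASSUMPTION: weighted frequency = allele frequency x effect size
toy_weighted$GVS <- toy_weighted$freq_A1 * toy_weighted$beta

# Group-level score: mean weighted frequency per population
pgs_toy <- with(toy_weighted, tapply(GVS, population, mean))
pgs_toy
```

Under this assumption, a population scores higher when alleles with positive effect sizes are more frequent in it and alleles with negative effect sizes are rarer.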
Re-merge groups
Compute PGS by population
PGS<-with(df_remerged, tapply(GVS, population, mean))
PGS
## Algeria (Mzab) - Mozabite Bougainville - NAN Melanesian
## 0.0034229484 0.0036113315
## Brazil - Karitiana Brazil - Surui
## 0.0037588762 0.0045604107
## C. African Republic - Biaka Pygmy Cambodia - Cambodian
## 0.0020927414 0.0038670493
## China - Dai China - Daur
## 0.0043440490 0.0047349093
## China - Han China - Hezhen
## 0.0045800655 0.0042085785
## China - Lahu China - Miaozu
## 0.0037696712 0.0036127437
## China - Mongola China - Naxi
## 0.0042896824 0.0041248200
## China - Oroqen China - She
## 0.0045457498 0.0042764034
## China - Tu China - Tujia
## 0.0034318042 0.0047880143
## China - Uygur China - Xibo
## 0.0043259484 0.0043332869
## China - Yizu Colombia - Piapoco and Curripaco
## 0.0039165131 0.0054732218
## D. R. of Congo - Mbuti Pygmy France - Basque
## 0.0020306879 0.0043933948
## France - French Israel (Carmel) - Druze
## 0.0047494689 0.0043174853
## Israel (Central) - Palestinian Israel (Negev) - Bedouin
## 0.0038576502 0.0038249283
## Italy - from Bergamo Italy - Sardinian
## 0.0042619472 0.0041830143
## Italy - Tuscan Japan - Japanese
## 0.0039727641 0.0043429154
## Kenya - Bantu Mexico - Maya
## 0.0015530253 0.0039263951
## Mexico - Pima Namibia - San
## 0.0051698081 0.0003120899
## New Guinea - Papuan Nigeria - Yoruba
## 0.0035564243 0.0019916999
## Orkney Islands - Orcadian Pakistan - Balochi
## 0.0049868264 0.0044334040
## Pakistan - Brahui Pakistan - Burusho
## 0.0039463953 0.0040371756
## Pakistan - Hazara Pakistan - Kalash
## 0.0040745807 0.0048168179
## Pakistan - Makrani Pakistan - Pathan
## 0.0041869974 0.0043488595
## Pakistan - Sindhi Population Set 1
## 0.0044600222 0.0039776083
## Russia - Russian Russia (Caucasus) - Adygei
## 0.0044537179 0.0041478392
## Senegal - Mandenka Siberia - Yakut
## 0.0021700960 0.0045476182
## South Africa - Bantu
## 0.0019372415
Sort PGS in descending order
sort(PGS,decreasing = TRUE)
## Colombia - Piapoco and Curripaco Mexico - Pima
## 0.0054732218 0.0051698081
## Orkney Islands - Orcadian Pakistan - Kalash
## 0.0049868264 0.0048168179
## China - Tujia France - French
## 0.0047880143 0.0047494689
## China - Daur China - Han
## 0.0047349093 0.0045800655
## Brazil - Surui Siberia - Yakut
## 0.0045604107 0.0045476182
## China - Oroqen Pakistan - Sindhi
## 0.0045457498 0.0044600222
## Russia - Russian Pakistan - Balochi
## 0.0044537179 0.0044334040
## France - Basque Pakistan - Pathan
## 0.0043933948 0.0043488595
## China - Dai Japan - Japanese
## 0.0043440490 0.0043429154
## China - Xibo China - Uygur
## 0.0043332869 0.0043259484
## Israel (Carmel) - Druze China - Mongola
## 0.0043174853 0.0042896824
## China - She Italy - from Bergamo
## 0.0042764034 0.0042619472
## China - Hezhen Pakistan - Makrani
## 0.0042085785 0.0041869974
## Italy - Sardinian Russia (Caucasus) - Adygei
## 0.0041830143 0.0041478392
## China - Naxi Pakistan - Hazara
## 0.0041248200 0.0040745807
## Pakistan - Burusho Population Set 1
## 0.0040371756 0.0039776083
## Italy - Tuscan Pakistan - Brahui
## 0.0039727641 0.0039463953
## Mexico - Maya China - Yizu
## 0.0039263951 0.0039165131
## Cambodia - Cambodian Israel (Central) - Palestinian
## 0.0038670493 0.0038576502
## Israel (Negev) - Bedouin China - Lahu
## 0.0038249283 0.0037696712
## Brazil - Karitiana China - Miaozu
## 0.0037588762 0.0036127437
## Bougainville - NAN Melanesian New Guinea - Papuan
## 0.0036113315 0.0035564243
## China - Tu Algeria (Mzab) - Mozabite
## 0.0034318042 0.0034229484
## Senegal - Mandenka C. African Republic - Biaka Pygmy
## 0.0021700960 0.0020927414
## D. R. of Congo - Mbuti Pygmy Nigeria - Yoruba
## 0.0020306879 0.0019916999
## South Africa - Bantu Kenya - Bantu
## 0.0019372415 0.0015530253
## Namibia - San
## 0.0003120899
Calculate Z-score
scale(PGS, center = TRUE, scale = TRUE)
## [,1]
## Algeria (Mzab) - Mozabite -0.49498967
## Bougainville - NAN Melanesian -0.30209947
## Brazil - Karitiana -0.15102468
## Brazil - Surui 0.66968672
## C. African Republic - Biaka Pygmy -1.85702222
## Cambodia - Cambodian -0.04026357
## China - Dai 0.44814849
## China - Daur 0.84836027
## China - Han 0.68981176
## China - Hezhen 0.30943678
## China - Lahu -0.13997147
## China - Miaozu -0.30065353
## China - Mongola 0.39248112
## China - Naxi 0.22367439
## China - Oroqen 0.65467508
## China - She 0.37888449
## China - Tu -0.48592198
## China - Tujia 0.90273577
## China - Uygur 0.42961486
## China - Xibo 0.43712888
## China - Yizu 0.01038373
## Colombia - Piapoco and Curripaco 1.60433709
## D. R. of Congo - Mbuti Pygmy -1.92056034
## France - Basque 0.49867493
## France - French 0.86326817
## Israel (Carmel) - Druze 0.42094921
## Israel (Central) - Palestinian -0.04988749
## Israel (Negev) - Bedouin -0.08339228
## Italy - from Bergamo 0.36408238
## Italy - Sardinian 0.28326102
## Italy - Tuscan 0.06798053
## Japan - Japanese 0.44698781
## Kenya - Bantu -2.40965120
## Mexico - Maya 0.02050213
## Mexico - Pima 1.29366409
## Namibia - San -3.68027633
## New Guinea - Papuan -0.35832038
## Nigeria - Yoruba -1.96048120
## Orkney Islands - Orcadian 1.10630447
## Pakistan - Balochi 0.53964138
## Pakistan - Brahui 0.04098087
## Pakistan - Burusho 0.13393307
## Pakistan - Hazara 0.17223312
## Pakistan - Kalash 0.93222854
## Pakistan - Makrani 0.28733941
## Pakistan - Pathan 0.45307408
## Pakistan - Sindhi 0.56689636
## Population Set 1 0.07294061
## Russia - Russian 0.56044121
## Russia (Caucasus) - Adygei 0.24724428
## Senegal - Mandenka -1.77781686
## Siberia - Yakut 0.65658812
## South Africa - Bantu -2.01624256
## attr(,"scaled:center")
## [1] 0.003906372
## attr(,"scaled:scale")
## [1] 0.0009766338
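As a quick check on what `scale()` reports, each Z-score is the population's PGS minus the mean across populations (`scaled:center`), divided by the standard deviation (`scaled:scale`). The sketch below redoes the arithmetic for Namibia - San using the rounded values printed above, then confirms the identity on a toy vector; the variable names are illustrative.

```r
# Values copied from the printed output above (rounded as displayed)
pgs_san <- 0.0003120899
centre  <- 0.003906372    # attr "scaled:center": mean of PGS
spread  <- 0.0009766338   # attr "scaled:scale": standard deviation of PGS

z_san <- (pgs_san - centre) / spread
round(z_san, 4)

# The same identity on a toy vector
x        <- c(1, 2, 4, 7)
z_manual <- (x - mean(x)) / sd(x)
z_scale  <- as.vector(scale(x, center = TRUE, scale = TRUE))
all.equal(z_manual, z_scale)
```

The hand-computed value agrees with the printed Z-score for Namibia - San up to the rounding of the displayed inputs.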
Carry out the same analysis at the level of sub-continental clusters. This should make the results more stable, because individual populations have very small sample sizes whereas sub-continental clusters rely on larger samples. Upload “1180_results_continent.csv”.
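The stability argument can be made concrete with the binomial sampling error of an allele frequency, sqrt(p(1 - p) / (2n)) for n diploid individuals. The sketch below compares two sample sizes; the function name `freq_se` and the values of p and n are illustrative and are not taken from the HGDP data.

```r
# Standard error of an allele frequency estimated from n diploid individuals
# (2n sampled chromosomes), under simple binomial sampling
freq_se <- function(p, n) sqrt(p * (1 - p) / (2 * n))

p <- 0.3                          # illustrative allele frequency
se_small <- freq_se(p, n = 10)    # a single small population sample
se_large <- freq_se(p, n = 100)   # a pooled cluster ten times larger

c(small = se_small, large = se_large, ratio = se_small / se_large)
```

A tenfold increase in sample size shrinks the sampling error of each frequency by a factor of sqrt(10), which carries through to the scores built from those frequencies.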
Compute PGS by continental group
PGS_continent<-with(df_remerged, tapply(GVS, population, mean))
PGS_continent
## AFRICA AMERICA CENTRAL-SOUTH ASIA
## 0.001919375 0.004407307 0.004285533
## EAST ASIA EUROPE MIDDLE EAST
## 0.004306213 0.004426125 0.003889246
## OCEANIA Population Set 1
## 0.003577771 0.003977608
Sort PGS in descending order
sort(PGS_continent,decreasing = TRUE)
## EUROPE AMERICA EAST ASIA
## 0.004426125 0.004407307 0.004306213
## CENTRAL-SOUTH ASIA Population Set 1 MIDDLE EAST
## 0.004285533 0.003977608 0.003889246
## OCEANIA AFRICA
## 0.003577771 0.001919375
Calculate Z-score
PGS_Z=scale(PGS,center = TRUE, scale= TRUE)
Bar chart
library(ggplot2)
df_PGS=data.frame(PGS_Z)
df_PGS$Population<- rownames(df_PGS)
ggplot(df_PGS, aes(PGS_Z, Population)) +
geom_segment(aes(x = 0, y = Population, xend = PGS_Z, yend = Population), color = "grey50") +
geom_point()+
labs(title = "Polygenic scores for 52 HGDP populations",
caption = "Neale lab, UK Biobank GWAS")
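With this many populations the chart is easier to read when the rows are ordered by score rather than alphabetically. A minimal sketch, using `reorder()` on a three-row toy data frame whose Z-scores are rounded from the table above; the object names `df_toy` and `p` are illustrative, and the plotting part only runs if ggplot2 is installed.

```r
# Toy subset of the Z-scores (rounded from the table above)
z <- c("Namibia - San" = -3.68, "China - Han" = 0.69, "France - French" = 0.86)
df_toy <- data.frame(Population = names(z), PGS_Z = unname(z),
                     stringsAsFactors = FALSE)

# Order the factor levels by Z-score so the plot rows follow the ranking
df_toy$Population <- reorder(df_toy$Population, df_toy$PGS_Z)
levels(df_toy$Population)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(df_toy, ggplot2::aes(PGS_Z, Population)) +
    ggplot2::geom_segment(ggplot2::aes(x = 0, y = Population,
                                       xend = PGS_Z, yend = Population),
                          color = "grey50") +
    ggplot2::geom_point()
}
```

The same `reorder()` call could be applied to `df_PGS$Population` before plotting the full set.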
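The PGS bears out the prediction that African ancestry is associated with morningness: the seven sub-Saharan African populations have the seven lowest scores (Z-scores from -1.78 to -3.68), and Africa ranks last among the continental groups (0.0019, versus 0.0044 for Europe).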
References:
Eastman, C. I., Tomaka, V. A. & Crowley, S. J. (2016). Circadian rhythms of European and African-Americans after a large delay of sleep as in jet lag and night work. Sci Rep, 6: 36716. doi: 10.1038/srep36716
Hut, R. A., Paolucci, S., Dor, R., Kyriacou, C. P. & Daan, S. (2013). Latitudinal clines: an evolutionary view on biological rhythms. Proc Biol Sci, 280: 20130433.
Malone, S. K., Patterson, F., Lu, Y., Lozano, A. & Hanlon, A. (2017). Differences in morning–evening type and sleep duration between Black and White adults: Results from a propensity-matched UK Biobank sample. Chronobiology International, 34. https://doi.org/10.1080/07420528.2017.1317639
Pittendrigh, C. S. (1993). Temporal organisation: reflections of a Darwinian clock-watcher. Annu Rev Physiol, 55: 17–54. doi: 10.1146/annurev.ph.55.030193.000313