Genomic prediction of wheat traits with transfer learning approach

A summary of variance components, SNP effects, transfer learning, and cross-validation of wheat traits

Author

Blessing Olabosoye

Published

March 21, 2024

1 Genomic Prediction

  • Fit Growth Habit as random variable
  • Fit Plant height as response variable
  • Output predicted breeding values by cluster
  • Output variance components by cluster

1.1 Phenotype by cluster

Code
y_pheno <- data.frame(new_pheno[, -1], row.names = 1)
y_pheno$GrowthHabit <- as.factor(y_pheno$GrowthHabit)

yc <- list()
for(i in 1:8){
  yc[[i]] <- y_pheno[clust_data$cluster == i,]
}

1.2 Trait one (Plant Height)

1.2.1 Fit GBLUP by cluster

Code
geno_pred <- list()
for(i in 1:8){
  geno_pred[[i]] <- model_gb(G_cl[[i]], yc[[i]], y_name = 'adj_PlantHeight',random =  'GrowthHabit')
}
.......................................................... 
summary gblup analysis of trait: adj_PlantHeight 

fixed effects equation:
y ~ 1

random effects equation:
~G + GrowthHabit + In

log-likelihood: -463.3194 converged in: 7 iterations 

estimated fixed effects:
[1] test.st p.value
<0 rows> (or 0-length row.names)

estimated variance components:
[1] prop.var
<0 rows> (or 0-length row.names)

breeding value predicted for 1128 first 5 animals shown:
                 [,1]
AYT-O-1092  0.2576672
AYT-O-1524  0.5631538
AYT-O-0789  0.5246622
AYT-O-1055  0.4553185
AYT-O-1352 -0.1040332
AYT-O-1390 -0.5963672
.......................................................... 
.......................................................... 
summary gblup analysis of trait: adj_PlantHeight 

fixed effects equation:
y ~ 1

random effects equation:
~G + GrowthHabit + In

log-likelihood: -399.841 converged in: 6 iterations 

estimated fixed effects:
[1] test.st p.value
<0 rows> (or 0-length row.names)

estimated variance components:
[1] prop.var
<0 rows> (or 0-length row.names)

breeding value predicted for 1156 first 5 animals shown:
                [,1]
AYT-O-0974 0.2840823
AYT-S-0299 0.2990243
AYT-O-0901 1.5456684
AYT-O-0990 0.5256079
AYT-O-0483 0.9507397
PYT-S-0019 0.1832534
.......................................................... 
.......................................................... 
summary gblup analysis of trait: adj_PlantHeight 

fixed effects equation:
y ~ 1

random effects equation:
~G + GrowthHabit + In

log-likelihood: -362.0128 converged in: 8 iterations 

estimated fixed effects:
[1] test.st p.value
<0 rows> (or 0-length row.names)

estimated variance components:
[1] prop.var
<0 rows> (or 0-length row.names)

breeding value predicted for 948 first 5 animals shown:
                 [,1]
PYT-S-1863  0.4602759
PYT-S-1361 -0.2430845
AYT-S-0704  0.3645520
PYT-S-1539 -0.1738447
PYT-S-2970  0.9658436
PYT-S-1940  0.6930008
.......................................................... 
.......................................................... 
summary gblup analysis of trait: adj_PlantHeight 

fixed effects equation:
y ~ 1

random effects equation:
~G + GrowthHabit + In

log-likelihood: -421.0563 converged in: 7 iterations 

estimated fixed effects:
[1] test.st p.value
<0 rows> (or 0-length row.names)

estimated variance components:
[1] prop.var
<0 rows> (or 0-length row.names)

breeding value predicted for 1297 first 5 animals shown:
                 [,1]
PYT-S-7551 -0.1757567
PYT-S-0979 -0.6248585
PYT-S-0746 -0.2777269
PYT-S-1731 -0.1534670
PYT-S-7351  0.1828540
PYT-S-0473 -0.8881389
.......................................................... 
.......................................................... 
summary gblup analysis of trait: adj_PlantHeight 

fixed effects equation:
y ~ 1

random effects equation:
~G + GrowthHabit + In

log-likelihood: -708.8606 converged in: 6 iterations 

estimated fixed effects:
[1] test.st p.value
<0 rows> (or 0-length row.names)

estimated variance components:
[1] prop.var
<0 rows> (or 0-length row.names)

breeding value predicted for 2436 first 5 animals shown:
                  [,1]
AYT-O-1848  0.01603199
AYT-S-0664  0.11175245
AYT-O-1485  0.15578496
AYT-O-1849  0.51388379
AYT-O-1840 -0.35160061
AYT-S-0657 -0.20304514
.......................................................... 
.......................................................... 
summary gblup analysis of trait: adj_PlantHeight 

fixed effects equation:
y ~ 1

random effects equation:
~G + GrowthHabit + In

log-likelihood: -421.9616 converged in: 9 iterations 

estimated fixed effects:
[1] test.st p.value
<0 rows> (or 0-length row.names)

estimated variance components:
[1] prop.var
<0 rows> (or 0-length row.names)

breeding value predicted for 1250 first 5 animals shown:
                 [,1]
AYT-S-0579  0.1267793
AYT-S-0566  0.1941359
PYT-S-0078 -0.6168866
PYT-S-7745  0.3935005
PYT-S-1769 -0.4043010
PYT-S-7635 -0.4552102
.......................................................... 
.......................................................... 
summary gblup analysis of trait: adj_PlantHeight 

fixed effects equation:
y ~ 1

random effects equation:
~G + GrowthHabit + In

log-likelihood: -352.0993 converged in: 7 iterations 

estimated fixed effects:
[1] test.st p.value
<0 rows> (or 0-length row.names)

estimated variance components:
[1] prop.var
<0 rows> (or 0-length row.names)

breeding value predicted for 1049 first 5 animals shown:
                 [,1]
AYT-S-0304 -0.3264263
AYT-S-0319 -0.1232910
PYT-S-0163 -0.7042811
PYT-S-0185 -0.2071245
PYT-S-0207  1.0176971
PYT-S-0267 -0.8432723
.......................................................... 
.......................................................... 
summary gblup analysis of trait: adj_PlantHeight 

fixed effects equation:
y ~ 1

random effects equation:
~G + GrowthHabit + In

log-likelihood: -540.4494 converged in: 7 iterations 

estimated fixed effects:
[1] test.st p.value
<0 rows> (or 0-length row.names)

estimated variance components:
[1] prop.var
<0 rows> (or 0-length row.names)

breeding value predicted for 1094 first 5 animals shown:
                 [,1]
AYT-O-1546  0.6257452
AYT-S-0310 -0.1234987
PYT-S-0254 -1.6915419
PYT-S-0241 -1.6224948
PYT-S-7507 -0.1102057
PYT-S-7529 -0.0712129
.......................................................... 

1.2.2 Variance components by cluster

Code
lapply(geno_pred, varcomp)
[[1]]
               Estimate    StdError    prop.var          se
G           0.462899071 0.052785250 0.483911880 0.043562261
GrowthHabit 0.004101812 0.005955175 0.004288009 0.006209932
In          0.489576317 0.031700635 0.511800111 0.044079030

[[2]]
             Estimate   StdError   prop.var         se
G           0.3349517 0.04179907 0.41088532 0.04315358
GrowthHabit 0.0106464 0.01167458 0.01305994 0.01416638
In          0.4695971 0.02893875 0.57605473 0.04645344

[[3]]
              Estimate   StdError   prop.var         se
G           0.29021711 0.04010044 0.34470147 0.04576793
GrowthHabit 0.04987172 0.04091344 0.05923447 0.04589984
In          0.50184872 0.03448460 0.59606407 0.05527365

[[4]]
               Estimate    StdError    prop.var         se
G           0.372890864 0.042263396 0.454104494 0.04039527
GrowthHabit 0.006546445 0.008417732 0.007972225 0.01018414
In          0.441719207 0.025132782 0.537923280 0.04175395

[[5]]
              Estimate   StdError   prop.var         se
G           0.37227101 0.03276775 0.43351115 0.03206641
GrowthHabit 0.04144492 0.03226281 0.04826279 0.03582335
In          0.44501865 0.01657650 0.51822607 0.03361127

[[6]]
              Estimate   StdError  prop.var         se
G           0.23134127 0.03201619 0.2885290 0.03908005
GrowthHabit 0.05684337 0.04567466 0.0708951 0.05308183
In          0.51361085 0.02794684 0.6405759 0.05418867

[[7]]
              Estimate   StdError   prop.var         se
G           0.34231399 0.04200142 0.41947436 0.04377851
GrowthHabit 0.02927867 0.02640084 0.03587832 0.03128972
In          0.44446196 0.02800921 0.54464732 0.04659690

[[8]]
              Estimate   StdError   prop.var         se
G           0.27609784 0.04082913 0.27313855 0.03907440
GrowthHabit 0.04220453 0.03584829 0.04175217 0.03409002
In          0.69253203 0.04224475 0.68510928 0.05152434

1.3 Trait two (Grain Yield)

1.3.1 Fit GBLUP by cluster

Code
geno_pred2 <- list()
for(i in 1:8){
  geno_pred2[[i]] <- model_gb(G_cl[[i]], yc[[i]], y_name = 'adj_GrainYield',random =  'GrowthHabit')
}
.......................................................... 
summary gblup analysis of trait: adj_GrainYield 

fixed effects equation:
y ~ 1

random effects equation:
~G + GrowthHabit + In

log-likelihood: -7485.903 converged in: 8 iterations 

estimated fixed effects:
[1] test.st p.value
<0 rows> (or 0-length row.names)

estimated variance components:
[1] prop.var
<0 rows> (or 0-length row.names)

breeding value predicted for 1109 first 5 animals shown:
                  [,1]
AYT-O-1092    1.224167
AYT-O-1524  -79.770514
AYT-O-0789 -292.396556
AYT-O-1055   34.191337
AYT-O-1352  238.867960
AYT-O-1390  141.495747
.......................................................... 
.......................................................... 
summary gblup analysis of trait: adj_GrainYield 

fixed effects equation:
y ~ 1

random effects equation:
~G + GrowthHabit + In

log-likelihood: -7768.735 converged in: 7 iterations 

estimated fixed effects:
[1] test.st p.value
<0 rows> (or 0-length row.names)

estimated variance components:
[1] prop.var
<0 rows> (or 0-length row.names)

breeding value predicted for 1146 first 5 animals shown:
                 [,1]
AYT-O-0974  214.57111
AYT-S-0299  -18.08344
AYT-O-0901 -380.37557
AYT-O-0990  171.44036
AYT-O-0483 -104.48898
PYT-S-0019   34.43261
.......................................................... 
.......................................................... 
summary gblup analysis of trait: adj_GrainYield 

fixed effects equation:
y ~ 1

random effects equation:
~G + GrowthHabit + In

log-likelihood: -6281.247 converged in: 5 iterations 

estimated fixed effects:
[1] test.st p.value
<0 rows> (or 0-length row.names)

estimated variance components:
[1] prop.var
<0 rows> (or 0-length row.names)

breeding value predicted for 936 first 5 animals shown:
                 [,1]
PYT-S-1863  -45.01403
PYT-S-1361 -171.53470
AYT-S-0704   51.37504
PYT-S-1539  -87.06022
PYT-S-2970 -413.95348
PYT-S-1940 -299.52556
.......................................................... 
.......................................................... 
summary gblup analysis of trait: adj_GrainYield 

fixed effects equation:
y ~ 1

random effects equation:
~G + GrowthHabit + In

log-likelihood: -8685.566 converged in: 8 iterations 

estimated fixed effects:
[1] test.st p.value
<0 rows> (or 0-length row.names)

estimated variance components:
[1] prop.var
<0 rows> (or 0-length row.names)

breeding value predicted for 1280 first 5 animals shown:
                 [,1]
PYT-S-7551  165.94291
PYT-S-0979   81.20703
PYT-S-0746 -243.98838
PYT-S-1731   72.88448
PYT-S-7351 -188.46519
PYT-S-0473  223.69265
.......................................................... 
.......................................................... 
summary gblup analysis of trait: adj_GrainYield 

fixed effects equation:
y ~ 1

random effects equation:
~G + GrowthHabit + In

log-likelihood: -16253.21 converged in: 9 iterations 

estimated fixed effects:
[1] test.st p.value
<0 rows> (or 0-length row.names)

estimated variance components:
[1] prop.var
<0 rows> (or 0-length row.names)

breeding value predicted for 2421 first 5 animals shown:
                [,1]
AYT-O-1848 -43.72039
AYT-S-0664  33.56904
AYT-O-1485 -83.77434
AYT-O-1849 178.64263
AYT-O-1840 302.42947
AYT-S-0657 392.60781
.......................................................... 
.......................................................... 
summary gblup analysis of trait: adj_GrainYield 

fixed effects equation:
y ~ 1

random effects equation:
~G + GrowthHabit + In

log-likelihood: -8389.823 converged in: 14 iterations 

estimated fixed effects:
[1] test.st p.value
<0 rows> (or 0-length row.names)

estimated variance components:
[1] prop.var
<0 rows> (or 0-length row.names)

breeding value predicted for 1232 first 5 animals shown:
                [,1]
AYT-S-0579  22.40657
AYT-S-0566 393.11457
PYT-S-0078 164.23872
PYT-S-7745 -78.14549
PYT-S-1769 -10.65700
PYT-S-7635 -84.76342
.......................................................... 
.......................................................... 
summary gblup analysis of trait: adj_GrainYield 

fixed effects equation:
y ~ 1

random effects equation:
~G + GrowthHabit + In

log-likelihood: -7050.843 converged in: 9 iterations 

estimated fixed effects:
[1] test.st p.value
<0 rows> (or 0-length row.names)

estimated variance components:
[1] prop.var
<0 rows> (or 0-length row.names)

breeding value predicted for 1038 first 5 animals shown:
                 [,1]
AYT-S-0304   61.70158
AYT-S-0319    6.24014
PYT-S-0163 -485.54957
PYT-S-0185 -474.77055
PYT-S-0207  -84.50120
PYT-S-0267  -83.83512
.......................................................... 
.......................................................... 
summary gblup analysis of trait: adj_GrainYield 

fixed effects equation:
y ~ 1

random effects equation:
~G + GrowthHabit + In

log-likelihood: -7394.409 converged in: 8 iterations 

estimated fixed effects:
[1] test.st p.value
<0 rows> (or 0-length row.names)

estimated variance components:
[1] prop.var
<0 rows> (or 0-length row.names)

breeding value predicted for 1075 first 5 animals shown:
                [,1]
AYT-O-1546 -257.0352
AYT-S-0310  260.1522
PYT-S-0254   89.9059
PYT-S-7507  118.9877
PYT-S-7529  100.7254
PYT-S-0010   53.1918
.......................................................... 

1.3.2 Variance components by cluster

Code
lapply(geno_pred2, varcomp)
[[1]]
              Estimate  StdError   prop.var         se
G            45083.488  9081.283 0.16570552 0.03219090
GrowthHabit   4539.328  4239.331 0.01668442 0.01536498
In          222447.100 11850.810 0.81761006 0.04456779

[[2]]
              Estimate  StdError   prop.var         se
G            42500.381  9261.008 0.14637938 0.03109122
GrowthHabit   6416.992  6438.303 0.02210134 0.02173203
In          241426.668 12576.609 0.83151928 0.04586580

[[3]]
             Estimate  StdError   prop.var         se
G            32112.77  7471.156 0.12516795 0.02959533
GrowthHabit  15766.12 13047.956 0.06145258 0.04789387
In          208678.59 12162.168 0.81337947 0.05807798

[[4]]
               Estimate  StdError     prop.var          se
G            41446.0234  8544.637 0.1428246361 0.028351472
GrowthHabit    261.0534  1074.879 0.0008996002 0.003702072
In          248481.1183 11832.251 0.8562757637 0.038485407

[[5]]
              Estimate StdError   prop.var         se
G            29340.027 4822.382 0.11249515 0.01803988
GrowthHabit   9707.868 8119.086 0.03722178 0.03000455
In          221763.576 7187.159 0.85028306 0.03510301

[[6]]
              Estimate    StdError     prop.var          se
G            38669.311  8038.41753  0.128216169 0.025982043
GrowthHabit   -669.873    30.08615 -0.002221104 0.000077903
In          263595.213 12659.86078  0.874004934 0.036059367

[[7]]
             Estimate  StdError  prop.var         se
G            36884.75  8412.925 0.1131428 0.02735210
GrowthHabit  39075.34 30221.087 0.1198624 0.08184950
In          250041.59 13263.732 0.7669948 0.07881136

[[8]]
              Estimate  StdError   prop.var         se
G            44428.085  9949.658 0.12899764 0.02883771
GrowthHabit   6227.035  6404.688 0.01808029 0.01830062
In          293754.927 16074.440 0.85292206 0.04428053

1.4 Estimation of SNP effects and variances

1.4.1 Trait1

Code
gwa_c <- list()
for(i in 1:8){ 
gwa_c[[i]] <- gwas(geno_pred[[i]], t(Z_cl[[i]]))
}
lapply(gwa_c, function(x) x[1:5, ])
[[1]]
                    ghat var_ghat
AXAFFSNP-00001 -4.626070 89.02621
AXAFFSNP-00004  6.732500 55.09585
AXAFFSNP-00006  3.982197 77.75904
AXAFFSNP-00010 -8.261871 74.88069
AXAFFSNP-00016 -1.583187 55.93648

[[2]]
                      ghat var_ghat
AXAFFSNP-00001   4.7506725 55.92118
AXAFFSNP-00004   0.4354213 43.01558
AXAFFSNP-00006 -14.2879546 51.77210
AXAFFSNP-00010 -16.0143654 43.94971
AXAFFSNP-00013   1.2483772 20.74670

[[3]]
                     ghat var_ghat
AXAFFSNP-00001  -1.174380 16.44676
AXAFFSNP-00004  -1.450365 11.20644
AXAFFSNP-00006 -12.302587 25.85211
AXAFFSNP-00009  -2.910375 14.51515
AXAFFSNP-00010  -7.477296 28.70357

[[4]]
                    ghat var_ghat
AXAFFSNP-00001  2.013998 31.49416
AXAFFSNP-00002 -6.671907 16.91085
AXAFFSNP-00004 -3.979269 26.48976
AXAFFSNP-00006 -3.431579 49.20268
AXAFFSNP-00009  6.006828 36.08906

[[5]]
                     ghat var_ghat
AXAFFSNP-00001  10.931529 75.05700
AXAFFSNP-00004  -6.172678 38.09890
AXAFFSNP-00006   7.985127 49.79542
AXAFFSNP-00010 -12.790644 80.25158
AXAFFSNP-00013   4.038807 36.42224

[[6]]
                    ghat  var_ghat
AXAFFSNP-00001  1.842982 12.278985
AXAFFSNP-00002  1.864016  9.080894
AXAFFSNP-00006 -3.197075 23.288048
AXAFFSNP-00009 -2.114315 27.502033
AXAFFSNP-00010 -8.636853 31.515490

[[7]]
                    ghat var_ghat
AXAFFSNP-00001 -5.613757 25.22079
AXAFFSNP-00006  8.922005 41.59449
AXAFFSNP-00009 -7.496845 28.00205
AXAFFSNP-00010 -3.810679 35.69161
AXAFFSNP-00013 -3.499518 29.62600

[[8]]
                      ghat  var_ghat
AXAFFSNP-00001   8.8075110 27.893187
AXAFFSNP-00002   0.3757961  8.131401
AXAFFSNP-00003   1.4062579 15.149970
AXAFFSNP-00004 -11.2984359 24.528547
AXAFFSNP-00006  -4.0332308 33.188816
Code
lapply(gwa_c, dim)
[[1]]
[1] 11868     2

[[2]]
[1] 12773     2

[[3]]
[1] 11166     2

[[4]]
[1] 11438     2

[[5]]
[1] 10544     2

[[6]]
[1] 11330     2

[[7]]
[1] 10452     2

[[8]]
[1] 13488     2
Code
lapply(gwa_c, function(x) range(x[,1]))
[[1]]
[1] -38.41372  31.64816

[[2]]
[1] -27.30391  28.51620

[[3]]
[1] -28.24796  31.13607

[[4]]
[1] -42.14360  56.42984

[[5]]
[1] -62.66411  52.67604

[[6]]
[1] -29.87758  23.41250

[[7]]
[1] -37.16951  44.54240

[[8]]
[1] -35.32168  36.73554

1.4.2 Trait2

Code
gwa_c2 <- list()
for(i in 1:8){ 
gwa_c2[[i]] <- gwas(geno_pred2[[i]], t(Z_cl[[i]]))
}
lapply(gwa_c2, function(x) x[1:5, ])
[[1]]
                     ghat var_ghat
AXAFFSNP-00001   731.8357  3667394
AXAFFSNP-00004 -1578.4232  2148778
AXAFFSNP-00006 -2961.0058  3459778
AXAFFSNP-00010  -740.9045  3403515
AXAFFSNP-00016   881.9000  2348996

[[2]]
                     ghat var_ghat
AXAFFSNP-00001  1163.0630  2955848
AXAFFSNP-00004  -569.4867  2184237
AXAFFSNP-00006  1640.8742  2964700
AXAFFSNP-00010 -1779.0850  2449377
AXAFFSNP-00013  2070.9446  1084591

[[3]]
                    ghat  var_ghat
AXAFFSNP-00001 -144.7706  759633.4
AXAFFSNP-00004  106.6463  480046.9
AXAFFSNP-00006  229.1155 1410049.5
AXAFFSNP-00009 1079.1468  734592.7
AXAFFSNP-00010  121.8277 1526120.3

[[4]]
                     ghat  var_ghat
AXAFFSNP-00001  1020.5808 1200824.8
AXAFFSNP-00002   339.9511  734131.9
AXAFFSNP-00004 -1896.9006  985181.0
AXAFFSNP-00006  1148.9042 2359185.8
AXAFFSNP-00009   747.1646 1698896.1

[[5]]
                     ghat  var_ghat
AXAFFSNP-00001  1636.0139 2035881.0
AXAFFSNP-00004  -943.6813  895700.1
AXAFFSNP-00006 -1313.9434 1591684.0
AXAFFSNP-00010   611.8697 2793725.7
AXAFFSNP-00013  -161.1688 1050951.5

[[6]]
                     ghat  var_ghat
AXAFFSNP-00001  2275.7462  925194.8
AXAFFSNP-00002   -77.8165  880330.3
AXAFFSNP-00006 -1460.3428 2159075.6
AXAFFSNP-00009  -725.5023 2688815.5
AXAFFSNP-00010  3727.9597 3198135.4

[[7]]
                     ghat var_ghat
AXAFFSNP-00001   959.5715   897488
AXAFFSNP-00006   795.3867  1871932
AXAFFSNP-00009 -1623.7773  1176863
AXAFFSNP-00010  1929.0904  1500007
AXAFFSNP-00013   307.6584  1333985

[[8]]
                     ghat  var_ghat
AXAFFSNP-00001   215.8252 2307974.4
AXAFFSNP-00002   938.6636  695557.5
AXAFFSNP-00003   234.2508 1327067.1
AXAFFSNP-00004 -1736.1005 2002688.3
AXAFFSNP-00006   841.0824 3048819.4
Code
lapply(gwa_c2, dim)
[[1]]
[1] 11868     2

[[2]]
[1] 12773     2

[[3]]
[1] 11166     2

[[4]]
[1] 11438     2

[[5]]
[1] 10544     2

[[6]]
[1] 11330     2

[[7]]
[1] 10452     2

[[8]]
[1] 13488     2
Code
lapply(gwa_c2, function(x) range(x[,1]))
[[1]]
[1] -6056.969  6005.034

[[2]]
[1] -6419.920  6647.591

[[3]]
[1] -4447.570  4514.493

[[4]]
[1] -7066.766  5879.442

[[5]]
[1] -7063.670  7217.976

[[6]]
[1] -5592.604  7319.548

[[7]]
[1] -5592.008  5352.008

[[8]]
[1] -5974.458  7883.720

1.4.3 Summary

Code
lapply(gwa_c, summary)
[[1]]
              <0.001 <0.01 <0.025 <0.05 <0.1
signif.values      0     1      1     1    3
attr(,"class")
[1] "summary.gwas"

[[2]]
              <0.001 <0.01 <0.025 <0.05 <0.1
signif.values      0     1      2     3    3
attr(,"class")
[1] "summary.gwas"

[[3]]
              <0.001 <0.01 <0.025 <0.05 <0.1
signif.values      3     6      9    13   15
attr(,"class")
[1] "summary.gwas"

[[4]]
              <0.001 <0.01 <0.025 <0.05 <0.1
signif.values      4     4      6     7    8
attr(,"class")
[1] "summary.gwas"

[[5]]
              <0.001 <0.01 <0.025 <0.05 <0.1
signif.values      8    13     15    16   20
attr(,"class")
[1] "summary.gwas"

[[6]]
              <0.001 <0.01 <0.025 <0.05 <0.1
signif.values      2     2      3     3    5
attr(,"class")
[1] "summary.gwas"

[[7]]
              <0.001 <0.01 <0.025 <0.05 <0.1
signif.values      4     4      4     4    5
attr(,"class")
[1] "summary.gwas"

[[8]]
              <0.001 <0.01 <0.025 <0.05 <0.1
signif.values      5    12     14    15   17
attr(,"class")
[1] "summary.gwas"
Code
lapply(gwa_c2, summary)
[[1]]
              <0.001 <0.01 <0.025 <0.05 <0.1
signif.values      0     0      0     0    0
attr(,"class")
[1] "summary.gwas"

[[2]]
              <0.001 <0.01 <0.025 <0.05 <0.1
signif.values      0     0      0     1    1
attr(,"class")
[1] "summary.gwas"

[[3]]
              <0.001 <0.01 <0.025 <0.05 <0.1
signif.values      0     0      0     0    0
attr(,"class")
[1] "summary.gwas"

[[4]]
              <0.001 <0.01 <0.025 <0.05 <0.1
signif.values      0     0      0     0    0
attr(,"class")
[1] "summary.gwas"

[[5]]
              <0.001 <0.01 <0.025 <0.05 <0.1
signif.values      0     1      1     1    2
attr(,"class")
[1] "summary.gwas"

[[6]]
              <0.001 <0.01 <0.025 <0.05 <0.1
signif.values      0     0      0     1    1
attr(,"class")
[1] "summary.gwas"

[[7]]
              <0.001 <0.01 <0.025 <0.05 <0.1
signif.values      0     0      0     1    1
attr(,"class")
[1] "summary.gwas"

[[8]]
              <0.001 <0.01 <0.025 <0.05 <0.1
signif.values      0     0      0     0    0
attr(,"class")
[1] "summary.gwas"

1.5 Mahanttan Plots for Trait 1

The Manhattan plot was made using qvalues with a threshold of 0.05. This is to ensure significant markers and regions are distinguishable and threshold is a little less stringent.

Code
map <- read_csv("FileS3.csv", col_select = c(1:3))
map <- map[order(map$Marker),]
map <- map|>
  column_to_rownames(var = "Marker")
colnames(map) <- c("chr", "pos")
map$pos[is.na(map$pos)] <- runif(sum(is.na(map$pos)), min = 1, max = 200)
map$chr <- map$chr%>% replace_na("U")

chr_plot <- list()
qq_plot <- list()
for(i in 1:8){
  mk <- map[rownames(map)%in%colnames(new_geno[[i]]),]
  mp <- arrange(mk, chr, pos)
  pv <- getpvalue(gwa_c[[i]], log.p = F)
  pv <- pv[rownames(mp)]
  qq_plot[[i]] <- qqgplot(pv) |> title(main = paste0("qqplot cluster ", i))
  qv <- qvalue(pv)
  qv1 <- qv$qvalues
  chr_plot[[i]] <- manhattan_plot(qv1, mp, threshold = 0.05) |> title(main = paste0("Manhattan plot cluster ", i))
}

1.6 SNP effects correlation between clusters

1.6.1 Correlation matirx: Trait 1

Considering unequal cluster length,

  • combine all the cluster using rbind as a single dataframe to capture markers in all clusters.

  • order markers by ID and remove duplicates

  • use join_all to combine clusters using by = "ID" to align ordered marker IDs and replacing missing values with NAs.

Code
library(plyr)

Clust1 <- data.frame(ID = rownames(gwa_c[[1]]), Clst = gwa_c[[1]][,1])
Clust2 <- data.frame(ID = rownames(gwa_c[[2]]), Clst = gwa_c[[2]][,1])
Clust3 <- data.frame(ID = rownames(gwa_c[[3]]), Clst = gwa_c[[3]][,1])
Clust4 <- data.frame(ID = rownames(gwa_c[[4]]), Clst = gwa_c[[4]][,1])
Clust5 <- data.frame(ID = rownames(gwa_c[[5]]), Clst = gwa_c[[5]][,1])
Clust6 <- data.frame(ID = rownames(gwa_c[[6]]), Clst = gwa_c[[6]][,1])
Clust7 <- data.frame(ID = rownames(gwa_c[[7]]), Clst = gwa_c[[7]][,1])
Clust8 <- data.frame(ID = rownames(gwa_c[[8]]), Clst = gwa_c[[8]][,1])

df_h <- rbind(Clust1, Clust2, Clust3, Clust4, Clust5, Clust6, Clust7, Clust8)
df_h <- df_h [order(df_h $ID),]
df_h <- df_h [!duplicated(df_h $ID),]
df_ha <- join_all(list(df_h ,Clust1, Clust2, Clust3,  Clust4, Clust5, Clust6, Clust7, Clust8), by = "ID")
df_h <- df_ha [, -2]
colnames(df_h) <- c("ID", "Clst1", "Clst2", "Clst3", "Clst4", "Clst5", "Clst6", "Clst7", "Clst8")
head(df_h )
              ID     Clst1       Clst2      Clst3     Clst4     Clst5     Clst6
1 AXAFFSNP-00001 -4.626070   4.7506725  -1.174380  2.013998 10.931529  1.842982
2 AXAFFSNP-00002        NA          NA         NA -6.671907        NA  1.864016
3 AXAFFSNP-00003        NA          NA         NA        NA        NA        NA
4 AXAFFSNP-00004  6.732500   0.4354213  -1.450365 -3.979269 -6.172678        NA
5 AXAFFSNP-00006  3.982197 -14.2879546 -12.302587 -3.431579  7.985127 -3.197075
6 AXAFFSNP-00009        NA          NA  -2.910375  6.006828        NA -2.114315
      Clst7       Clst8
1 -5.613757   8.8075110
2        NA   0.3757961
3        NA   1.4062579
4        NA -11.2984359
5  8.922005  -4.0332308
6 -7.496845   0.1613191
Code
df_h <- df_h[, -1]
head(df_h)
      Clst1       Clst2      Clst3     Clst4     Clst5     Clst6     Clst7
1 -4.626070   4.7506725  -1.174380  2.013998 10.931529  1.842982 -5.613757
2        NA          NA         NA -6.671907        NA  1.864016        NA
3        NA          NA         NA        NA        NA        NA        NA
4  6.732500   0.4354213  -1.450365 -3.979269 -6.172678        NA        NA
5  3.982197 -14.2879546 -12.302587 -3.431579  7.985127 -3.197075  8.922005
6        NA          NA  -2.910375  6.006828        NA -2.114315 -7.496845
        Clst8
1   8.8075110
2   0.3757961
3   1.4062579
4 -11.2984359
5  -4.0332308
6   0.1613191
Code
ggpairs(df_h)

1.6.2 Correlogram: Trait 1

Code
df_h |> 
  cor(use = "pairwise.complete.obs") |>
  ggcorrplot(type = "lower",ggtheme = theme_void, title = "Correlation of between Clusters SNP effects with Trait 1", hc.order = TRUE, lab = TRUE, outline.color = "white")

1.6.3 Correlation matrix: Trait 2

Considering unequal cluster length,

  • combine all the cluster using rbind as a single dataframe to capture markers in all clusters.

  • order markers by ID and remove duplicates

  • use join_all to combine clusters using by = "ID" to align ordered marker IDs and replacing missing values with NAs.

Code
gClust1 <- data.frame(ID = rownames(gwa_c2[[1]]), Clst = gwa_c2[[1]][,1])
gClust2 <- data.frame(ID = rownames(gwa_c2[[2]]), Clst = gwa_c2[[2]][,1])
gClust3 <- data.frame(ID = rownames(gwa_c2[[3]]), Clst = gwa_c2[[3]][,1])
gClust4 <- data.frame(ID = rownames(gwa_c2[[4]]), Clst = gwa_c2[[4]][,1])
gClust5 <- data.frame(ID = rownames(gwa_c2[[5]]), Clst = gwa_c2[[5]][,1])
gClust6 <- data.frame(ID = rownames(gwa_c2[[6]]), Clst = gwa_c2[[6]][,1])
gClust7 <- data.frame(ID = rownames(gwa_c2[[7]]), Clst = gwa_c2[[7]][,1])
gClust8 <- data.frame(ID = rownames(gwa_c2[[8]]), Clst = gwa_c2[[8]][,1])

df_g <- rbind(gClust1, gClust2, gClust3, gClust4, gClust5, gClust6, gClust7, gClust8)
df_g <- df_g [order(df_g $ID),]
df_g <- df_g [!duplicated(df_g $ID),]
df_ga <- join_all(list(df_g ,gClust1, gClust2, gClust3,  gClust4, gClust5, gClust6, gClust7, gClust8), by = "ID")
df_g <- df_ga [, -2]
colnames(df_g) <- c("ID", "Clst1", "Clst2", "Clst3", "Clst4", "Clst5", "Clst6", "Clst7", "Clst8")
head(df_g )
              ID      Clst1     Clst2     Clst3      Clst4      Clst5
1 AXAFFSNP-00001   731.8357 1163.0630 -144.7706  1020.5808  1636.0139
2 AXAFFSNP-00002         NA        NA        NA   339.9511         NA
3 AXAFFSNP-00003         NA        NA        NA         NA         NA
4 AXAFFSNP-00004 -1578.4232 -569.4867  106.6463 -1896.9006  -943.6813
5 AXAFFSNP-00006 -2961.0058 1640.8742  229.1155  1148.9042 -1313.9434
6 AXAFFSNP-00009         NA        NA 1079.1468   747.1646         NA
       Clst6      Clst7      Clst8
1  2275.7462   959.5715   215.8252
2   -77.8165         NA   938.6636
3         NA         NA   234.2508
4         NA         NA -1736.1005
5 -1460.3428   795.3867   841.0824
6  -725.5023 -1623.7773  1784.3746
Code
df_g <- df_g [, -1]
head(df_g)
       Clst1     Clst2     Clst3      Clst4      Clst5      Clst6      Clst7
1   731.8357 1163.0630 -144.7706  1020.5808  1636.0139  2275.7462   959.5715
2         NA        NA        NA   339.9511         NA   -77.8165         NA
3         NA        NA        NA         NA         NA         NA         NA
4 -1578.4232 -569.4867  106.6463 -1896.9006  -943.6813         NA         NA
5 -2961.0058 1640.8742  229.1155  1148.9042 -1313.9434 -1460.3428   795.3867
6         NA        NA 1079.1468   747.1646         NA  -725.5023 -1623.7773
       Clst8
1   215.8252
2   938.6636
3   234.2508
4 -1736.1005
5   841.0824
6  1784.3746
Code
ggpairs(df_g )

Code
detach("package:plyr")

1.6.4 Correlogram: Trait 2

Code
df_g |> 
  cor(use = "pairwise.complete.obs") |>
  ggcorrplot(type = "lower",ggtheme = theme_void, title = "Correlation of between Clusters SNP effects with Trait 2", hc.order = TRUE, lab = TRUE, outline.color = "white")

2 Transfer learning

2.1 Genomic Prediction: Trait 1

2.1.1 Cluster 3

  • Transfer maker effects from cluster 5 to cluster 3
Code
Clust3_5 <- get_polys(Z_cl[[3]], gwa_c[[5]], scale = FALSE)
yt3 <- new_pheno[new_pheno$cluster == 3, -1]
yt3 <- column_to_rownames(yt3, var = "ID")
yt3$GrowthHabit <- as.factor(yt3$GrowthHabit)
Phe_3_5 <- add_poly(yt3, Clust3_5)

gp_3_5 <- model_gb(G_cl[[3]], Phe_3_5, y_name = 'adj_PlantHeight',fixed = 'poly1', random =  'GrowthHabit')
.......................................................... 
summary gblup analysis of trait: adj_PlantHeight 

fixed effects equation:
y ~ poly1

random effects equation:
~G + GrowthHabit + In

log-likelihood: -335.2549 converged in: 9 iterations 

estimated fixed effects:
                Estimate     StdError  test.st      p.value
(Intercept) 0.2710529878 1.144153e-01 2.369027 1.783497e-02
poly1       0.0002327936 2.988035e-05 7.790860 6.661338e-15

estimated variance components:
              Estimate   StdError   prop.var
G           0.18252930 0.03059501 0.23228597
GrowthHabit 0.06132263 0.04854386 0.07803891
In          0.54194368 0.03431406 0.68967512

breeding value predicted for 948 first 5 animals shown:
                  [,1]
PYT-S-1863  0.26858790
PYT-S-1361 -0.01305992
AYT-S-0704  0.27124186
PYT-S-1539 -0.22393335
PYT-S-2970  0.71730224
PYT-S-1940  0.43528637
.......................................................... 
  • Correlation between trait and polygenic estimate
Code
#Correlation between trait and polygenic estimate
cor(Phe_3_5$adj_PlantHeight, Phe_3_5$poly1)
          [,1]
[1,] 0.3459982

2.1.2 Cluster 5

  • Transfer maker effects from cluster 3 to cluster 5
Code
Clust5_3 <- get_polys(Z_cl[[5]], gwa_c[[3]], scale = FALSE)
yt5 <- new_pheno[new_pheno$cluster == 5, -1]
yt5 <- column_to_rownames(yt5, var = "ID")
yt5$GrowthHabit <- as.factor(yt5$GrowthHabit)
Phe_5_3 <- add_poly(yt5, Clust5_3)

gp_5_3 <- model_gb(G_cl[[5]], Phe_5_3, y_name = 'adj_PlantHeight',fixed = 'poly1', random =  'GrowthHabit')
.......................................................... 
summary gblup analysis of trait: adj_PlantHeight 

fixed effects equation:
y ~ poly1

random effects equation:
~G + GrowthHabit + In

log-likelihood: -687.1715 converged in: 7 iterations 

estimated fixed effects:
                 Estimate     StdError   test.st      p.value
(Intercept) -0.2040115550 9.298301e-02 -2.194073 2.823013e-02
poly1        0.0002470708 3.747643e-05  6.592699 4.319012e-11

estimated variance components:
              Estimate   StdError   prop.var
G           0.32795570 0.03008097 0.39991583
GrowthHabit 0.04107201 0.03196881 0.05008404
In          0.45103412 0.01655445 0.55000014

breeding value predicted for 2436 first 5 animals shown:
                  [,1]
AYT-O-1848  0.17713124
AYT-S-0664 -0.02762182
AYT-O-1485 -0.04678334
AYT-O-1849  0.43811585
AYT-O-1840 -0.29066922
AYT-S-0657 -0.27396034
.......................................................... 
  • Correlation between trait and polygenic estimate
Code
#Correlation between trait and polygenic estimate
cor(Phe_5_3$adj_PlantHeight, Phe_5_3$poly1)
          [,1]
[1,] 0.2534927

2.1.3 Cluster 7

  • Transfer maker effects from cluster 5 to cluster 7
Code
Clust7_5 <- get_polys(Z_cl[[7]], gwa_c[[5]], scale = FALSE)
yt7 <- new_pheno[new_pheno$cluster == 7, -1]
yt7 <- column_to_rownames(yt7, var = "ID")
yt7$GrowthHabit <- as.factor(yt7$GrowthHabit)
Phe_7_5 <- add_poly(yt7, Clust7_5)

gp_7_5 <- model_gb(G_cl[[7]], Phe_7_5, y_name = 'adj_PlantHeight',fixed = 'poly1', random =  'GrowthHabit')
.......................................................... 
summary gblup analysis of trait: adj_PlantHeight 

fixed effects equation:
y ~ poly1

random effects equation:
~G + GrowthHabit + In

log-likelihood: -291.7576 converged in: 9 iterations 

estimated fixed effects:
                 Estimate     StdError   test.st     p.value
(Intercept) -0.2135834980 7.439333e-02 -2.871003 0.004091712
poly1        0.0003026481 2.545617e-05 11.888989 0.000000000

estimated variance components:
              Estimate   StdError   prop.var
G           0.17925425 0.02783426 0.26710746
GrowthHabit 0.02158098 0.02020807 0.03215791
In          0.47025888 0.02708577 0.70073463

breeding value predicted for 1049 first 5 animals shown:
                  [,1]
AYT-S-0304 -0.27492609
AYT-S-0319 -0.35638128
PYT-S-0163 -0.39128535
PYT-S-0185  0.25384041
PYT-S-0207  0.72408821
PYT-S-0267 -0.05446074
.......................................................... 
  • Correlation between trait and polygenic estimate
Code
#Correlation between trait and polygenic estimate
cor(Phe_7_5$adj_PlantHeight, Phe_7_5$poly1)
          [,1]
[1,] 0.4799039

2.1.4 Cluster 5a

  • Transfer maker effects from cluster 7 to cluster 5
Code
Clust5_7 <- get_polys(Z_cl[[5]], gwa_c[[7]], scale = FALSE)
Phe_5_7 <- add_poly(yt5, Clust5_7)

gp_5_7 <- model_gb(G_cl[[5]], Phe_5_7, y_name = 'adj_PlantHeight',fixed = 'poly1', random =  'GrowthHabit')
.......................................................... 
summary gblup analysis of trait: adj_PlantHeight 

fixed effects equation:
y ~ poly1

random effects equation:
~G + GrowthHabit + In

log-likelihood: -653.5732 converged in: 8 iterations 

estimated fixed effects:
                 Estimate     StdError   test.st    p.value
(Intercept) -0.2124721760 8.978493e-02 -2.366457 0.01795926
poly1        0.0003105102 2.770523e-05 11.207640 0.00000000

estimated variance components:
              Estimate   StdError   prop.var
G           0.24974189 0.02509016 0.33162187
GrowthHabit 0.03797651 0.02975872 0.05042743
In          0.46537395 0.01658088 0.61795070

breeding value predicted for 2436 first 5 animals shown:
                  [,1]
AYT-O-1848  0.05821880
AYT-S-0664 -0.31442254
AYT-O-1485  0.43259688
AYT-O-1849  0.24802319
AYT-O-1840 -0.12960232
AYT-S-0657  0.05090983
.......................................................... 
  • Correlation between trait and polygenic estimate
Code
#Correlation between trait and polygenic estimate
cor(Phe_5_7$adj_PlantHeight, Phe_5_7$poly1)
          [,1]
[1,] 0.4004873

2.2 SNP Effects: Trait1

2.2.1 Cluster 3

Code
gw_3_5 <- gwas(gp_3_5, t(Z_cl[[3]]))
head(gw_3_5)
                     ghat  var_ghat
AXAFFSNP-00001 -2.2896183  7.360566
AXAFFSNP-00004 -0.7589992  4.885620
AXAFFSNP-00006 -7.9417947 12.525017
AXAFFSNP-00009 -2.6628090  6.852242
AXAFFSNP-00010 -3.8413624 13.736534
AXAFFSNP-00013  4.7059463  9.678932

2.2.2 Cluster 5

Code
gw_5_3 <- gwas(gp_5_3, t(Z_cl[[5]]))
head(gw_5_3)
                     ghat var_ghat
AXAFFSNP-00001 10.0520004 61.80006
AXAFFSNP-00004 -4.3891218 30.97108
AXAFFSNP-00006 11.2457976 41.04854
AXAFFSNP-00010 -8.6367877 66.69606
AXAFFSNP-00013  1.7622503 30.00388
AXAFFSNP-00015 -0.8783663 40.26564

2.2.3 Cluster 7

Code
gw_7_5 <- gwas(gp_7_5, t(Z_cl[[7]]))
head(gw_7_5)
                    ghat  var_ghat
AXAFFSNP-00001 -3.645440  8.584628
AXAFFSNP-00006  4.029292 15.782771
AXAFFSNP-00009 -5.451509 10.238639
AXAFFSNP-00010  1.417893 12.955969
AXAFFSNP-00013 -5.201282 11.199245
AXAFFSNP-00016 -4.029265 15.920978

2.2.4 Cluster 5a

Code
gw_5_7 <- gwas(gp_5_7, t(Z_cl[[5]]))
head(gw_5_7)
                     ghat var_ghat
AXAFFSNP-00001  7.9309900 40.37737
AXAFFSNP-00004 -5.6990654 19.73947
AXAFFSNP-00006  4.7715349 27.65362
AXAFFSNP-00010 -8.3526777 45.25281
AXAFFSNP-00013  4.3417704 19.86775
AXAFFSNP-00015  0.1774272 26.16263

2.3 Plots: Trait 1

Code
df_TL <- df_ha[,-2]
colnames(df_TL) <- c("ID", "Clst1", "Clst2", "Clst3", "Clst4", "Clst5", "Clst6", "Clst7", "Clst8")
 kt <- na.omit(df_TL[,c(1,4,6)])
 dim(kt)
[1] 9726    3
Code
 head(kt)
               ID      Clst3      Clst5
1  AXAFFSNP-00001  -1.174380  10.931529
4  AXAFFSNP-00004  -1.450365  -6.172678
5  AXAFFSNP-00006 -12.302587   7.985127
7  AXAFFSNP-00010  -7.477296 -12.790644
8  AXAFFSNP-00013   7.828549   4.038807
10 AXAFFSNP-00016  -9.774522  -5.775532
Code
 id3 <- gwa_c[[3]][kt$ID,]
 id5 <- gwa_c[[5]][kt$ID,]
 kt3 <- gw_3_5[kt$ID, ]
 kt5 <- gw_5_3[kt$ID, ]
 Ph3 <- gp_3_5$coefm[2]*id5$ghat+kt3$ghat

Ph5 <- gp_5_3$coefm[2]*id3$ghat+kt5$ghat


cr1 <- cor(id3[,1], Ph3)
tibble(x=id3[,1], y=Ph3)%>% ggplot(., aes(x,y))+
  geom_point(color ="gold3") +
  geom_abline(color = "firebrick") +
  geom_label(x = -20, y = 14, label = paste0("r: ", round(cr1, 2)))+
  labs(title = "Individual SNP Effects with Transfer SNP Effects",
       subtitle = "Cluster 3",
       x = expression(hat(g)[Ind]),
       y = expression(hat(g)[Transfer]))+
  theme(plot.title = element_text(color = "firebrick3"),plot.subtitle = element_text(color = "goldenrod4"),axis.title.x = element_text(color = "firebrick3"),axis.title.y = element_text(color = "goldenrod4"))

Code
cr1_5 <- cor(id5[,1], Ph5)
tibble(x=id5[,1], y=Ph5)%>% ggplot(., aes(x,y))+
  geom_point(color ="gold3") +
  geom_abline(color = "firebrick") +
  geom_label(x = -55, y = 30, label = paste0("r: ", round(cr1_5, 2)))+
  labs(title = "Individual SNP Effects with Transfer SNP Effects",
       subtitle = "Cluster 5",
       x = expression(hat(g)[Ind]),
       y = expression(hat(g)[Transfer]))+
  theme(plot.title = element_text(color = "firebrick3"),plot.subtitle = element_text(color = "goldenrod4"),axis.title.x = element_text(color = "firebrick3"),axis.title.y = element_text(color = "goldenrod4"))

Code
ktb <- na.omit(df_TL[,c(1,6,8)])
dim(ktb)
[1] 9470    3
Code
head(ktb)
               ID      Clst5     Clst7
1  AXAFFSNP-00001  10.931529 -5.613757
5  AXAFFSNP-00006   7.985127  8.922005
7  AXAFFSNP-00010 -12.790644 -3.810679
8  AXAFFSNP-00013   4.038807 -3.499518
10 AXAFFSNP-00016  -5.775532 -8.368099
11 AXAFFSNP-00017 -10.308467 -3.259819
Code
 id7 <- gwa_c[[7]][ktb$ID,]
 id5a <- gwa_c[[5]][ktb$ID,]
 kt7 <- gw_7_5[ktb$ID, ]
 kt5a <- gw_5_7[ktb$ID, ]
Ph7 <- gp_7_5$coefm[2]*id5a$ghat+kt7$ghat

Ph5a <- gp_5_7$coefm[2]*id7$ghat+kt5a$ghat


cr2 <- cor(id7[,1], Ph7)
tibble(x=id7[,1], y=Ph7)%>% ggplot(., aes(x,y))+
  geom_point(color ="gold3") +
  geom_abline(color = "firebrick") +
  geom_label(x = -30, y = 16, label = paste0("r: ", round(cr2, 2)))+
  labs(title = "Individual SNP Effects with Transfer SNP Effects",
       subtitle = "Cluster 7",
       x = expression(hat(g)[Ind]),
       y = expression(hat(g)[Transfer]))+
  theme(plot.title = element_text(color = "firebrick3"),plot.subtitle = element_text(color = "goldenrod4"),axis.title.x = element_text(color = "firebrick3"),axis.title.y = element_text(color = "goldenrod4"))

Code
cr2_5 <- cor(id5a[,1], Ph5a)
tibble(x=id5a[,1], y=Ph5a)%>% ggplot(., aes(x,y))+
  geom_point(color ="gold3") +
  geom_abline(color = "firebrick") +
  geom_label(x = -53, y = 25, label = paste0("r: ", round(cr2_5, 2)))+
  labs(title = "Individual SNP Effects with Transfer SNP Effects",
       subtitle = "Cluster 5*",
       x = expression(hat(g)[Ind]),
       y = expression(hat(g)[Transfer]))+
  theme(plot.title = element_text(color = "firebrick3"),plot.subtitle = element_text(color = "goldenrod4"),axis.title.x = element_text(color = "firebrick3"),axis.title.y = element_text(color = "goldenrod4"))

2.4 Cross-validation: Trait 1

  • Individual Cluster(3, 5, and 7)

  • Transfer effects from Cluster 5 to 3 & 5 to 7 and vice-versa

Code
trt1_cv_3 <- cross_validation(geno_pred[[3]]$model$G, iterate=10, yc[[3]], y_name = 'adj_PlantHeight',rand_var =  'GrowthHabit')

trt1_cv_5 <- cross_validation(geno_pred[[5]]$model$G, iterate=10, yc[[5]], y_name = 'adj_PlantHeight',rand_var =  'GrowthHabit')

trt1_cv_7 <- cross_validation(geno_pred[[7]]$model$G, iterate=10, yc[[7]], y_name = 'adj_PlantHeight',rand_var =  'GrowthHabit')

trt1.Tf_cv_3 <- cross_validation(gp_3_5$model$G, iterate=10, Phe_3_5, y_name='adj_PlantHeight', fixed = 'poly1', rand_var =  'GrowthHabit', poly_name= "poly1")

trt1.Tf_cv_5 <- cross_validation(gp_5_3$model$G, iterate=10, Phe_5_3, y_name='adj_PlantHeight', fixed = 'poly1', rand_var =  'GrowthHabit', poly_name= "poly1")

trt1.Tf_cv_7 <- cross_validation(gp_7_5$model$G, iterate=10, Phe_7_5, y_name='adj_PlantHeight', fixed = 'poly1', rand_var =  'GrowthHabit', poly_name= "poly1")

trt1.Tf_cv_5a <- cross_validation(gp_5_7$model$G, iterate=10, Phe_5_7, y_name='adj_PlantHeight', fixed = 'poly1', rand_var =  'GrowthHabit', poly_name= "poly1")
Code
kable(cbind(c.v3, c.v5, c.v7, c.Tf.3 , c.Tf.5, c.Tf.7, c.Tf.5a), 
      caption = "Prediction accuracy under cross-validation: Trait 1", col.names = c("Cluster.3", "Cluster.5", "Cluster.7","Cluster.Trf.3", "Cluster.Trf.5", "Cluster.Trf.7", "Cluster.Trf.5*"), align = "c")|>
  kable_styling(latex_options = "scale down")|>
  add_footnote(c("values are mean ± standard deviation",  "Cluster.Trf = Cluster with transfer"))
Prediction accuracy under cross-validation: Trait 1
Cluster.3 Cluster.5 Cluster.7 Cluster.Trf.3 Cluster.Trf.5 Cluster.Trf.7 Cluster.Trf.5*
0.457±0.073 0.564±0.043 0.544±0.058 0.486±0.077 0.565±0.042 0.591±0.05 0.57±0.044
a values are mean ± standard deviation
b Cluster.Trf = Cluster with transfer

2.5 Genomic Prediction: Trait 2

2.5.1 Cluster 3

  • Transfer maker effects from cluster 5 to cluster 3
Code
trait3_5 <- get_polys(Z_cl[[3]], gwa_c2[[5]], scale = FALSE)
Phet_3_5 <- add_poly(yt3, trait3_5)

gpt_3_5 <- model_gb(G_cl[[3]], Phet_3_5, y_name = 'adj_GrainYield',fixed = 'poly1', random =  'GrowthHabit')
.......................................................... 
summary gblup analysis of trait: adj_GrainYield 

fixed effects equation:
y ~ poly1

random effects equation:
~G + GrowthHabit + In

log-likelihood: -6264.306 converged in: 6 iterations 

estimated fixed effects:
                 Estimate     StdError   test.st      p.value
(Intercept) -7.266466e+01 5.872987e+01 -1.237269 2.159871e-01
poly1        3.120948e-04 6.749610e-05  4.623893 3.766039e-06

estimated variance components:
             Estimate  StdError   prop.var
G            26891.19  6831.755 0.10700785
GrowthHabit  15381.11 12725.091 0.06120588
In          209028.82 11995.477 0.83178627

breeding value predicted for 936 first 5 animals shown:
                  [,1]
PYT-S-1863  -33.316189
PYT-S-1361 -170.576631
AYT-S-0704  115.619859
PYT-S-1539   -3.043164
PYT-S-2970 -268.342543
PYT-S-1940 -304.605569
.......................................................... 
  • Correlation between trait and polygenic estimate
Code
#Correlation between trait and polygenic estimate
cor(Phet_3_5$adj_GrainYield, Phe_3_5$poly1, use = "pairwise.complete.obs")
            [,1]
[1,] -0.06041408

2.5.2 Cluster 5

  • Transfer maker effects from cluster 3 to cluster 5
Code
trait5_3 <- get_polys(Z_cl[[5]], gwa_c2[[3]], scale = FALSE)
Phet_5_3 <- add_poly(yt5, trait5_3)

gpt_5_3 <- model_gb(G_cl[[5]], Phet_5_3, y_name = 'adj_GrainYield',fixed = 'poly1', random =  'GrowthHabit')
.......................................................... 
summary gblup analysis of trait: adj_GrainYield 

fixed effects equation:
y ~ poly1

random effects equation:
~G + GrowthHabit + In

log-likelihood: -16238.29 converged in: 9 iterations 

estimated fixed effects:
                Estimate     StdError  test.st      p.value
(Intercept) 1.609689e+02 4.811691e+01 3.345370 8.217281e-04
poly1       2.244507e-04 5.599238e-05 4.008593 6.108158e-05

estimated variance components:
             Estimate StdError   prop.var
G            27490.92 4652.817 0.10600376
GrowthHabit  10295.01 8508.905 0.03969711
In          221553.14 7159.107 0.85429913

breeding value predicted for 2421 first 5 animals shown:
                [,1]
AYT-O-1848 -83.22733
AYT-S-0664   2.86922
AYT-O-1485  32.52669
AYT-O-1849 136.31175
AYT-O-1840 255.80215
AYT-S-0657 416.97574
.......................................................... 
  • Correlation between trait and polygenic estimate
Code
#Correlation between trait and polygenic estimate
cor(Phet_5_3$adj_GrainYield, Phet_5_3$poly1, use = "pairwise.complete.obs")
          [,1]
[1,] 0.1257553

2.5.3 Cluster 4

  • Transfer maker effects from cluster 8 to cluster 4
Code
trait4_8 <- get_polys(Z_cl[[4]], gwa_c2[[8]], scale = FALSE)
yt4 <- new_pheno[new_pheno$cluster == 4, -1]
yt4 <- column_to_rownames(yt4, var = "ID")
yt4$GrowthHabit <- as.factor(yt4$GrowthHabit)
Phet_4_8 <- add_poly(yt4, trait4_8)

gpt_4_8 <- model_gb(G_cl[[4]], Phet_4_8, y_name = 'adj_GrainYield',fixed = 'poly1', random =  'GrowthHabit')
.......................................................... 
summary gblup analysis of trait: adj_GrainYield 

fixed effects equation:
y ~ poly1

random effects equation:
~G + GrowthHabit + In

log-likelihood: -8666.788 converged in: 8 iterations 

estimated fixed effects:
                Estimate     StdError  test.st      p.value
(Intercept) 3.250361e+01 1.786590e+01 1.819310 6.886418e-02
poly1       2.198157e-04 4.360691e-05 5.040846 4.634790e-07

estimated variance components:
               Estimate  StdError    prop.var
G            32042.0518  7461.682 0.113060727
GrowthHabit    313.5394  1121.047 0.001106327
In          251050.0843 11719.284 0.885832945

breeding value predicted for 1280 first 5 animals shown:
                 [,1]
PYT-S-7551  221.20590
PYT-S-0979  -10.73139
PYT-S-0746  -90.65857
PYT-S-1731   44.32125
PYT-S-7351 -187.28622
PYT-S-0473  222.00087
.......................................................... 
  • Correlation between trait and polygenic estimate
Code
#Correlation between trait and polygenic estimate
cor(Phet_4_8$adj_GrainYield, Phet_4_8$poly1, use = "pairwise.complete.obs")
          [,1]
[1,] 0.1720852

2.5.4 Cluster 8

  • Transfer maker effects from cluster 4 to cluster 8
Code
trait8_4 <- get_polys(Z_cl[[8]], gwa_c2[[4]], scale = FALSE)
yt8 <- new_pheno[new_pheno$cluster == 8, -1]
yt8 <- column_to_rownames(yt8, var = "ID")
yt8$GrowthHabit <- as.factor(yt8$GrowthHabit)
Phet_8_4 <- add_poly(yt8, trait8_4)

gpt_8_4 <- model_gb(G_cl[[8]], Phet_8_4, y_name = 'adj_GrainYield',fixed = 'poly1', random =  'GrowthHabit')
.......................................................... 
summary gblup analysis of trait: adj_GrainYield 

fixed effects equation:
y ~ poly1

random effects equation:
~G + GrowthHabit + In

log-likelihood: -7377.889 converged in: 8 iterations 

estimated fixed effects:
                 Estimate     StdError   test.st      p.value
(Intercept) -1.342921e+02 4.285484e+01 -3.133651 1.726461e-03
poly1        1.870609e-04 4.173433e-05  4.482182 7.388368e-06

estimated variance components:
             Estimate  StdError   prop.var
G            35140.59  8907.553 0.10353087
GrowthHabit   6676.55  6701.683 0.01967039
In          297604.22 15922.137 0.87679874

breeding value predicted for 1075 first 5 animals shown:
                 [,1]
AYT-O-1546 -117.84850
AYT-S-0310  254.43884
PYT-S-0254  122.07198
PYT-S-7507   43.38334
PYT-S-7529   25.56510
PYT-S-0010   95.87267
.......................................................... 
  • Correlation between trait and polygenic estimate
Code
#Correlation between trait and polygenic estimate
cor(Phet_8_4$adj_GrainYield, Phet_8_4$poly1, use = "pairwise.complete.obs")
          [,1]
[1,] 0.1761822

2.6 SNP Effects: Trait2

2.6.1 Cluster 3

Code
gwt_3_5 <- gwas(gpt_3_5, t(Z_cl[[3]]))
head(gwt_3_5)
                     ghat  var_ghat
AXAFFSNP-00001 -177.94258  557750.2
AXAFFSNP-00004  -61.39971  348300.0
AXAFFSNP-00006  600.46027 1045137.9
AXAFFSNP-00009  924.01244  544280.0
AXAFFSNP-00010  259.32580 1139095.1
AXAFFSNP-00013  159.94837  789041.1

2.6.2 Cluster 5

Code
gwt_5_3 <- gwas(gpt_5_3, t(Z_cl[[5]]))
head(gwt_5_3)
                        ghat  var_ghat
AXAFFSNP-00001  1431.0812839 1824655.3
AXAFFSNP-00004  -947.2685172  800398.8
AXAFFSNP-00006 -1280.4658222 1436185.7
AXAFFSNP-00010   533.6923856 2529151.9
AXAFFSNP-00013     0.7105884  943213.3
AXAFFSNP-00015  -549.6032758 1196787.5

2.6.3 Cluster 4

Code
gwt_4_8 <- gwas(gpt_4_8, t(Z_cl[[4]]))
head(gwt_4_8)
                      ghat  var_ghat
AXAFFSNP-00001   816.33810  758582.9
AXAFFSNP-00002   330.03032  471791.5
AXAFFSNP-00004 -1465.06355  621739.0
AXAFFSNP-00006   734.51203 1532009.2
AXAFFSNP-00009    99.13708 1096021.9
AXAFFSNP-00010 -1383.95949 1912008.9

2.6.4 Cluster 8

Code
gwt_8_4 <- gwas(gpt_8_4, t(Z_cl[[8]]))
head(gwt_8_4)
                     ghat  var_ghat
AXAFFSNP-00001  -319.5924 1514546.4
AXAFFSNP-00002   582.5479  460642.9
AXAFFSNP-00003  -107.3510  882279.3
AXAFFSNP-00004 -1406.7452 1317352.6
AXAFFSNP-00006   364.2936 2049731.9
AXAFFSNP-00009  1248.2520 1543223.6

2.7 Plots: Trait 2

N.B: Marker effects in trait 2 is large because adjustment wasn’t made for unit of measurement.This applies to all results for trait 2 in this report.

Code
df_TL2 <- df_ga[,-2]
colnames(df_TL2) <- c("ID", "Clst1", "Clst2", "Clst3", "Clst4", "Clst5", "Clst6", "Clst7", "Clst8")
 ht <- na.omit(df_TL2[,c(1,4,6)])
 dim(ht)
[1] 9726    3
Code
 head(ht)
               ID     Clst3      Clst5
1  AXAFFSNP-00001 -144.7706  1636.0139
4  AXAFFSNP-00004  106.6463  -943.6813
5  AXAFFSNP-00006  229.1155 -1313.9434
7  AXAFFSNP-00010  121.8277   611.8697
8  AXAFFSNP-00013  393.1254  -161.1688
10 AXAFFSNP-00016  765.7405  -942.1816
Code
 ht5 <- gwa_c2[[5]][ht$ID,]
 ht3 <- gwa_c2[[3]][ht$ID, ]
 dt3 <- gwt_3_5[ht$ID, ]
 dt5 <- gwt_5_3[ht$ID, ]
 Gy3 <- gpt_3_5$coefm[2]*ht5$ghat+dt3$ghat

Gy5 <- gpt_5_3$coefm[2]*ht3$ghat+dt5$ghat

cr4 <- cor(ht3[,1], Gy3)
tibble(x=ht3[,1], y=Gy3)%>% ggplot(., aes(x,y))+
  geom_point(color ="gold3") +
  geom_abline(color = "firebrick") +
  geom_label(x = -3750, y = 2100, label = paste0("r: ", round(cr4, 2)))+
  labs(title = "Individual SNP Effects with Transfer SNP Effects",
       subtitle = "Cluster 3",
       x = expression(hat(g)[Ind]),
       y = expression(hat(g)[Transfer]))+
  theme(plot.title = element_text(color = "firebrick3"),plot.subtitle = element_text(color = "goldenrod4"),axis.title.x = element_text(color = "firebrick3"),axis.title.y = element_text(color = "goldenrod4"))

Code
cr4_5 <- cor(ht5[,1], Gy5)
tibble(x=ht5[,1], y=Gy5)%>% ggplot(., aes(x,y))+
  geom_point(color ="gold3") +
  geom_abline(color = "firebrick") +
  geom_label(x = -5900, y = 4900, label = paste0("r: ", round(cr4_5, 2)))+
  labs(title = "Individual SNP Effects with Transfer SNP Effects",
       subtitle = "Cluster 5",
       x = expression(hat(g)[Ind]),
       y = expression(hat(g)[Transfer]))+
  theme(plot.title = element_text(color = "firebrick3"),plot.subtitle = element_text(color = "goldenrod4"),axis.title.x = element_text(color = "firebrick3"),axis.title.y = element_text(color = "goldenrod4"))

Code
 htb <- na.omit(df_TL2[,c(1,5,9)])
dim(htb)
[1] 11301     3
Code
head(htb)
              ID      Clst4      Clst8
1 AXAFFSNP-00001  1020.5808   215.8252
2 AXAFFSNP-00002   339.9511   938.6636
4 AXAFFSNP-00004 -1896.9006 -1736.1005
5 AXAFFSNP-00006  1148.9042   841.0824
6 AXAFFSNP-00009   747.1646  1784.3746
7 AXAFFSNP-00010 -1266.6596   804.4407
Code
 ht4 <- gwa_c2[[4]][htb$ID,]
 ht8 <- gwa_c2[[8]][htb$ID, ]
 dt4 <- gwt_4_8[htb$ID, ]
 dt8 <- gwt_8_4[htb$ID, ]
 Gy4 <- gpt_4_8$coefm[2]*ht8$ghat+dt4$ghat

Gy8 <- gpt_8_4$coefm[2]*ht4$ghat+dt8$ghat


cr5 <- cor(ht4[,1], Gy4)
tibble(x=ht4[,1], y=Gy4)%>% ggplot(., aes(x,y))+
  geom_point(color ="gold3") +
  geom_abline(color = "firebrick") +
  geom_label(x = -6000, y = 3000, label = paste0("r: ", round(cr5, 2)))+
  labs(title = "Individual SNP Effects with Transfer SNP Effects",
       subtitle = "Cluster 4",
       x = expression(hat(g)[Ind]),
       y = expression(hat(g)[Transfer]))+
  theme(plot.title = element_text(color = "firebrick3"),plot.subtitle = element_text(color = "goldenrod4"),axis.title.x = element_text(color = "firebrick3"),axis.title.y = element_text(color = "goldenrod4"))

Code
cr5_8 <- cor(ht8[,1], Gy8)
tibble(x=ht8[,1], y=Gy8)%>% ggplot(., aes(x,y))+
  geom_point(color ="gold3") +
  geom_abline(color = "firebrick") +
  geom_label(x = -4200, y = 4200, label = paste0("r: ", round(cr5_8, 2)))+
  labs(title = "Individual SNP Effects with Transfer SNP Effects",
       subtitle = "Cluster 8",
       x = expression(hat(g)[Ind]),
       y = expression(hat(g)[Transfer]))+
  theme(plot.title = element_text(color = "firebrick3"),plot.subtitle = element_text(color = "goldenrod4"),axis.title.x = element_text(color = "firebrick3"),axis.title.y = element_text(color = "goldenrod4"))

2.8 Cross-validation: Trait 2

  • Individual Cluster(3, 4, 5, and 8)

  • Transfer effects from Cluster 5 to 3 & 8 to 4 and vice-versa

Code
trt2_cv_3 <- cross_validation(G_cl[[3]], iterate=10, yc[[3]], y_name = 'adj_GrainYield',rand_var =  'GrowthHabit')

trt2_cv_5 <- cross_validation(G_cl[[5]], iterate=10, yc[[5]], y_name = 'adj_GrainYield',rand_var =  'GrowthHabit')

trt2_cv_4 <- cross_validation(G_cl[[4]], iterate=10, yc[[4]], y_name = 'adj_GrainYield',rand_var =  'GrowthHabit')

trt2_cv_8 <- cross_validation(G_cl[[8]], iterate=10, yc[[8]], y_name = 'adj_GrainYield',rand_var =  'GrowthHabit')

trt2.Tf_cv_3 <- cross_validation(G_cl[[3]], iterate=10, Phet_3_5, y_name='adj_GrainYield', fixed = 'poly1', rand_var =  'GrowthHabit', poly_name= "poly1")

trt2.Tf_cv_5 <- cross_validation(G_cl[[5]], iterate=10, Phet_5_3, y_name='adj_GrainYield', fixed = 'poly1', rand_var =  'GrowthHabit', poly_name= "poly1")

trt2.Tf_cv_4 <- cross_validation(G_cl[[4]], iterate=10, Phet_4_8, y_name='adj_GrainYield', fixed = 'poly1', rand_var =  'GrowthHabit', poly_name= "poly1")

trt2.Tf_cv_8 <- cross_validation(G_cl[[8]], iterate=10, Phet_8_4, y_name='adj_GrainYield', fixed = 'poly1', rand_var =  'GrowthHabit', poly_name= "poly1")
Code
kable(cbind(c2.v3, c2.v5, c2.v4, c2.v8, c2.Tf.3 , c2.Tf.5, c2.Tf.4, c2.Tf.8), 
      caption = "Prediction accuracy under cross-validation: Trait 2", col.names = c("Cluster.3", "Cluster.5", "Cluster.4","Cluster.4","Cluster.Trf.3", "Cluster.Trf.5", "Cluster.Trf.4", "Cluster.Trf.8"), align = "c")|>
  kable_styling(latex_options = "scale down")|>
  add_footnote(c("values are mean ± standard deviation",  "Cluster.Trf = Cluster with transfer"))
Prediction accuracy under cross-validation: Trait 2
Cluster.3 Cluster.5 Cluster.4 Cluster.4 Cluster.Trf.3 Cluster.Trf.5 Cluster.Trf.4 Cluster.Trf.8
0.373±0.049 0.334±0.062 0.333±0.072 0.23±0.077 0.385±0.056 0.335±0.059 0.37±0.068 0.236±0.065
a values are mean ± standard deviation
b Cluster.Trf = Cluster with transfer

2.9 Comparing Heritabilites

  • Comparing heritabilities between individual clusters (3, 4, 5, 7, and 8) with their transfers.
Code
T1_3 <- varcomp(geno_pred[[3]])$prop.var[1]
T1_5 <- varcomp(geno_pred[[5]])$prop.var[1]
T1_7 <- varcomp(geno_pred[[7]])$prop.var[1]
T2_3 <- varcomp(geno_pred2[[3]])$prop.var[1]
T2_5 <- varcomp(geno_pred2[[5]])$prop.var[1]
T2_4 <- varcomp(geno_pred2[[4]])$prop.var[1]
T2_8 <- varcomp(geno_pred2[[8]])$prop.var[1]

Tf1_3 <- varcomp(gp_3_5)$prop.var[1]
Tf1_5 <- varcomp(gp_5_3)$prop.var[1]
Tf1_7 <- varcomp(gp_7_5)$prop.var[1]
Tf1_5a <- varcomp(gp_5_7)$prop.var[1]
Tf2_3 <- varcomp(gpt_3_5)$prop.var[1]
Tf2_5 <- varcomp(gpt_5_3)$prop.var[1]
Tf2_4 <- varcomp(gpt_4_8)$prop.var[1]
Tf2_8 <- varcomp(gpt_8_4)$prop.var[1]

H2_df <- rbind(T1_3, T1_5, T1_7,T2_3, T2_5,T2_4, T2_8, Tf1_3, Tf1_5, Tf1_7, Tf1_5a,Tf2_3, Tf2_5, Tf2_4, Tf2_8)
rownames(H2_df) <- c("Trt1.Ind.cluster3", "Trt1.Ind.cluster5", "Trt1.Ind.cluster7", "Trt2.Ind.cluster3", "Trt2.Ind.cluster5","Trt2.Ind.cluster4","Trt2.Ind.cluster8","Trt1.Tranf.cluster3", "Trt1.Tranf.cluster5","Trt1.Tranf.cluster7","Trt1.Tranf.cluster5*","Trt2.Tranf.cluster3", "Trt2.Tranf.cluster5","Trt2.Tranf.cluster4","Trt2.Tranf.cluster8")
kable(H2_df,
      col.names =" Heritability") |>
  add_footnote(c("Trt1.Ind.cluster = Trait1 Individual cluster",  "Trt1.Tranf.cluster = Trait1 Individual cluster with transfer"))
Heritability
Trt1.Ind.cluster3 0.3447015
Trt1.Ind.cluster5 0.4335111
Trt1.Ind.cluster7 0.4194744
Trt2.Ind.cluster3 0.1251679
Trt2.Ind.cluster5 0.1124952
Trt2.Ind.cluster4 0.1428246
Trt2.Ind.cluster8 0.1289976
Trt1.Tranf.cluster3 0.2322860
Trt1.Tranf.cluster5 0.3999158
Trt1.Tranf.cluster7 0.2671075
Trt1.Tranf.cluster5* 0.3316219
Trt2.Tranf.cluster3 0.1070078
Trt2.Tranf.cluster5 0.1060038
Trt2.Tranf.cluster4 0.1130607
Trt2.Tranf.cluster8 0.1035309

Note: aTrt1.Ind.cluster = Trait1 Individual cluster bTrt1.Tranf.cluster = Trait1 Individual cluster with transfer

3 1 x 7 cluster transfer

3.1 Genomic Prediction: Trait 1

  • Using cluster 1 as the target population with clusters 2 - 8 as the source populations.
Code
Clust1_2 <- get_polys(Z_cl[[1]], gwa_c[[2]], scale = FALSE)
Clust1_3 <- get_polys(Z_cl[[1]], gwa_c[[3]], scale = FALSE)
Clust1_4 <- get_polys(Z_cl[[1]], gwa_c[[4]], scale = FALSE)
Clust1_5 <- get_polys(Z_cl[[1]], gwa_c[[5]], scale = FALSE)
Clust1_6 <- get_polys(Z_cl[[1]], gwa_c[[6]], scale = FALSE)
Clust1_7 <- get_polys(Z_cl[[1]], gwa_c[[7]], scale = FALSE)
Clust1_8 <- get_polys(Z_cl[[1]], gwa_c[[8]], scale = FALSE)
yt1 <- new_pheno[new_pheno$cluster == 1, -1]
yt1 <- column_to_rownames(yt1, var = "ID")
yt1$GrowthHabit <- as.factor(yt1$GrowthHabit)
Phe_1 <- add_poly(yt1, Clust1_2,Clust1_3,Clust1_4,Clust1_5,Clust1_6,Clust1_7,Clust1_8)

gp_1t <- model_gb(G_cl[[1]], Phe_1, y_name = 'adj_PlantHeight',fixed = c('poly1', 'poly2', 'poly3', 'poly4', 'poly5', 'poly6', 'poly7'), random =  'GrowthHabit')
.......................................................... 
summary gblup analysis of trait: adj_PlantHeight 

fixed effects equation:
y ~ poly1 + poly2 + poly3 + poly4 + poly5 + poly6 + poly7

random effects equation:
~G + GrowthHabit + In

log-likelihood: -419.9313 converged in: 7 iterations 

estimated fixed effects:
                Estimate     StdError   test.st      p.value
(Intercept) 2.477448e-01 4.049135e-02 6.1184629 9.448227e-10
poly1       7.323480e-05 3.363995e-05 2.1770184 2.947920e-02
poly2       1.055218e-04 4.934405e-05 2.1384906 3.247695e-02
poly3       2.693858e-05 3.833089e-05 0.7027903 4.821865e-01
poly4       8.519155e-05 3.867441e-05 2.2027883 2.760967e-02
poly5       1.349468e-05 5.322720e-05 0.2535297 7.998589e-01
poly6       5.627363e-05 4.612186e-05 1.2201076 2.224241e-01
poly7       1.078386e-04 4.139340e-05 2.6052122 9.181738e-03

estimated variance components:
               Estimate    StdError    prop.var
G           0.333617188 0.043452772 0.392260456
GrowthHabit 0.005788693 0.007103939 0.006806231
In          0.511093278 0.031274029 0.600933314

breeding value predicted for 1128 first 5 animals shown:
                  [,1]
AYT-O-1092  0.14298796
AYT-O-1524  0.27137862
AYT-O-0789  0.62983107
AYT-O-1055  0.71748399
AYT-O-1352  0.09626098
AYT-O-1390 -0.14798805
.......................................................... 
  • Correlation between trait and polygenic estimate
Code
#Correlation between trait and polygenic estimate (polygenic using clusters 2 - 8)
pr1 <- cor(Phe_1$adj_PlantHeight, Phe_1$poly1)
pr2 <- cor(Phe_1$adj_PlantHeight, Phe_1$poly2)
pr3 <- cor(Phe_1$adj_PlantHeight, Phe_1$poly3)
pr4 <- cor(Phe_1$adj_PlantHeight, Phe_1$poly4)
pr5 <- cor(Phe_1$adj_PlantHeight, Phe_1$poly5)
pr6 <- cor(Phe_1$adj_PlantHeight, Phe_1$poly6)
pr7 <- cor(Phe_1$adj_PlantHeight, Phe_1$poly7)

lat <- rbind(pr1, pr2, pr3, pr4, pr5, pr6, pr7)

rownames(lat) <- c("Poly1", "Poly2", "Poly3", "Poly4", "Poly5", "Poly6", "Poly7")
kable(lat, col.names = "r", caption = "Correlation between trait and polygenic estimate")
Correlation between trait and polygenic estimate
r
Poly1 0.2873773
Poly2 0.3209686
Poly3 0.1829230
Poly4 0.1611031
Poly5 0.1582615
Poly6 0.2545195
Poly7 0.3366465

3.2 SNP Effects: Trait 1

Code
gw_1 <- gwas(gp_1t, t(Z_cl[[1]]))
head(gw_1)
                     ghat var_ghat
AXAFFSNP-00001 -6.9530750 53.20129
AXAFFSNP-00004  8.3768802 32.09548
AXAFFSNP-00006  7.9544541 46.81365
AXAFFSNP-00010 -0.2862882 45.61901
AXAFFSNP-00016  3.8303651 33.35126
AXAFFSNP-00017 -2.5925824 16.02908

3.3 Plot : Trait 1

Code
up <- na.omit(df_ha)
dim(up)
[1] 7762   10
Code
head(up)
               ID      Clst      Clst          Clst        Clst        Clst
1  AXAFFSNP-00001 -4.626070 -4.626070   4.750672460  -1.1743799   2.0139982
5  AXAFFSNP-00006  3.982197  3.982197 -14.287954586 -12.3025866  -3.4315788
7  AXAFFSNP-00010 -8.261871 -8.261871 -16.014365415  -7.4772961 -21.1713539
10 AXAFFSNP-00016 -1.583187 -1.583187   0.009652469  -9.7745216  -5.9819037
11 AXAFFSNP-00017 -4.548304 -4.548304  -0.472720805   0.1339552  -0.5867993
12 AXAFFSNP-00018 -3.596939 -3.596939  -0.311583459  -3.4787609   0.9587564
         Clst      Clst      Clst      Clst
1   10.931529  1.842982 -5.613757  8.807511
5    7.985127 -3.197075  8.922005 -4.033231
7  -12.790644 -8.636853 -3.810679 -9.008555
10  -5.775532 -1.726070 -8.368099 -6.151319
11 -10.308467 -3.310330 -3.259819 -1.380820
12  17.963227 -1.915233  3.776699  4.087703
Code
k1 <- gwa_c[[1]][up$ID, ]
k2 <- gwa_c[[2]][up$ID, ]
k3 <- gwa_c[[3]][up$ID, ]
k4 <- gwa_c[[4]][up$ID, ]
k5 <- gwa_c[[5]][up$ID, ]
k6 <- gwa_c[[6]][up$ID, ]
k7 <- gwa_c[[7]][up$ID, ]
k8 <- gwa_c[[8]][up$ID, ]
id1 <- gw_1[up$ID, ]

Ph1 <- gp_1t$coefm[2]*k2$ghat+
  gp_1t$coefm[3]*k3$ghat+
  gp_1t$coefm[4]*k4$ghat+
  gp_1t$coefm[5]*k5$ghat+
  gp_1t$coefm[6]*k6$ghat+
  gp_1t$coefm[7]*k7$ghat+
  gp_1t$coefm[8]*k8$ghat+
  id1$ghat

cr3 <- cor(k1[,1], Ph1)
tibble(x=k1[,1], y=Ph1)%>% ggplot(., aes(x,y))+
  geom_point(color ="gold3") +
  geom_abline(color = "firebrick") +
  geom_label(x = -32, y = 18, label = paste0("r: ", round(cr3, 2)))+
  labs(title = "Individual SNP Effects with Transfer SNP Effects",
       subtitle = "Cluster 1",
       x = expression(hat(g)[Ind]),
       y = expression(hat(g)[Transfer]))+
  theme(plot.title = element_text(color = "firebrick3"),plot.subtitle = element_text(color = "goldenrod4"),axis.title.x = element_text(color = "firebrick3"),axis.title.y = element_text(color = "goldenrod4"))

3.4 Cross-validation: Trait 1

Code
trt1_cv_1 <- cross_validation(G_cl[[1]], iterate=10, yc[[1]], y_name = 'adj_PlantHeight',rand_var =  'GrowthHabit')

trt1.Tf_cv_1 <- cross_validation(G_cl[[1]], iterate=10, Phe_1, y_name='adj_PlantHeight', fixed = c('poly1', 'poly2', 'poly3', 'poly4', 'poly5', 'poly6', 'poly7'), rand_var =  'GrowthHabit', poly_name= c('poly1', 'poly2', 'poly3', 'poly4', 'poly5', 'poly6', 'poly7'))
Code
kable(cbind(c1.v1, c1.Tf.1), 
      caption = "Prediction accuracy under cross-validation: Trait 1", col.names = c("Cluster.1","Cluster.Trf.1"), align = "c")|>
  kable_styling(latex_options = "scale down")|>
  add_footnote("values are mean ± standard deviation.  Cluster.Trf = Cluster with transfer")
Prediction accuracy under cross-validation: Trait 1
Cluster.1 Cluster.Trf.1
0.623±0.055 0.645±0.052
a values are mean ± standard deviation. Cluster.Trf = Cluster with transfer

3.5 Genomic Prediction: Trait 2

  • Using cluster 4 as the target population with clusters 1 - 8 (except 4) as the source populations.
Code
gClust4_1 <- get_polys(Z_cl[[4]], gwa_c2[[1]], scale = FALSE)
gClust4_2 <- get_polys(Z_cl[[4]], gwa_c2[[2]], scale = FALSE)
gClust4_3 <- get_polys(Z_cl[[4]], gwa_c2[[3]], scale = FALSE)
gClust4_5 <- get_polys(Z_cl[[4]], gwa_c2[[5]], scale = FALSE)
gClust4_6 <- get_polys(Z_cl[[4]], gwa_c2[[6]], scale = FALSE)
gClust4_7 <- get_polys(Z_cl[[4]], gwa_c2[[7]], scale = FALSE)
gClust4_8 <- get_polys(Z_cl[[4]], gwa_c2[[8]], scale = FALSE)

Phet_4 <- add_poly(yt4, gClust4_1,gClust4_2,gClust4_3,gClust4_5,gClust4_6,gClust4_7,gClust4_8)

gpt_4t <- model_gb(G_cl[[4]], Phet_4, y_name = 'adj_GrainYield',fixed = c('poly1', 'poly2', 'poly3', 'poly4', 'poly5', 'poly6', 'poly7'), random =  'GrowthHabit')
.......................................................... 
summary gblup analysis of trait: adj_GrainYield 

fixed effects equation:
y ~ poly1 + poly2 + poly3 + poly4 + poly5 + poly6 + poly7

random effects equation:
~G + GrowthHabit + In

log-likelihood: -8614.384 converged in: 8 iterations 

estimated fixed effects:
                 Estimate     StdError    test.st     p.value
(Intercept)  3.265249e+01 1.829695e+01  1.7845863 0.074328453
poly1        7.469764e-05 5.820224e-05  1.2834152 0.199346636
poly2        1.411006e-04 4.944211e-05  2.8538552 0.004319221
poly3        4.277481e-05 6.646270e-05  0.6435912 0.519840547
poly4        1.829457e-05 6.670221e-05  0.2742724 0.783875310
poly5        1.287853e-04 5.200515e-05  2.4763958 0.013271636
poly6       -4.153976e-05 6.656298e-05 -0.6240670 0.532583614
poly7        1.588697e-04 4.495421e-05  3.5340330 0.000409270

estimated variance components:
               Estimate  StdError    prop.var
G            29375.5489  7302.955 0.105200088
GrowthHabit    360.0202  1166.795 0.001289309
In          249499.4536 11630.050 0.893510603

breeding value predicted for 1280 first 5 animals shown:
                 [,1]
PYT-S-7551  180.87555
PYT-S-0979  -62.51787
PYT-S-0746  -95.81856
PYT-S-1731  -85.67821
PYT-S-7351 -251.00216
PYT-S-0473  228.79739
.......................................................... 
  • Correlation between trait and polygenic estimate
Code
#Correlation between trait and polygenic estimate (polygenic using clusters 1 - 8, excluding 4)
pr1 <- cor(Phet_4$adj_GrainYield, Phet_4$poly1, use = "pairwise.complete.obs")
pr2 <- cor(Phet_4$adj_GrainYield, Phet_4$poly2, use = "pairwise.complete.obs")
pr3 <- cor(Phet_4$adj_GrainYield, Phet_4$poly3, use = "pairwise.complete.obs")
pr4 <- cor(Phet_4$adj_GrainYield, Phet_4$poly4, use = "pairwise.complete.obs")
pr5 <- cor(Phet_4$adj_GrainYield, Phet_4$poly5, use = "pairwise.complete.obs")
pr6 <- cor(Phet_4$adj_GrainYield, Phet_4$poly6, use = "pairwise.complete.obs")
pr7 <- cor(Phet_4$adj_GrainYield, Phet_4$poly7, use = "pairwise.complete.obs")

yd <- rbind(pr1, pr2, pr3, pr4, pr5, pr6, pr7)

rownames(yd) <- c("Poly1", "Poly2", "Poly3", "Poly4", "Poly5", "Poly6", "Poly7")
kable(yd, col.names = "r", caption = "Correlation between trait and polygenic estimate")
Correlation between trait and polygenic estimate
r
Poly1 0.1218245
Poly2 0.1821925
Poly3 0.0538204
Poly4 0.0507956
Poly5 -0.0021168
Poly6 -0.0216012
Poly7 0.1720852

3.6 SNP Effects: Trait 2

Code
gwt_4 <- gwas(gpt_4t, t(Z_cl[[4]]))
head(gwt_4)
                     ghat  var_ghat
AXAFFSNP-00001   367.3584  642038.8
AXAFFSNP-00002   209.1290  405322.4
AXAFFSNP-00004 -1602.2808  530769.2
AXAFFSNP-00006  1101.6353 1301350.7
AXAFFSNP-00009   308.3978  946568.9
AXAFFSNP-00010 -1509.7699 1568553.1

3.7 Plot: Trait 2

Code
rp <- na.omit(df_ga)
dim(rp)
[1] 7762   10
Code
head(rp)
               ID       Clst       Clst       Clst      Clst      Clst
1  AXAFFSNP-00001   731.8357   731.8357  1163.0630 -144.7706  1020.581
5  AXAFFSNP-00006 -2961.0058 -2961.0058  1640.8742  229.1155  1148.904
7  AXAFFSNP-00010  -740.9045  -740.9045 -1779.0850  121.8277 -1266.660
10 AXAFFSNP-00016   881.9000   881.9000   751.1217  765.7405 -1615.364
11 AXAFFSNP-00017  -714.3277  -714.3277  1886.6409 -912.2884 -1043.651
12 AXAFFSNP-00018  1405.7587  1405.7587  1500.6570 -578.3201 -4505.774
         Clst      Clst       Clst       Clst
1   1636.0139  2275.746  959.57151   215.8252
5  -1313.9434 -1460.343  795.38668   841.0824
7    611.8697  3727.960 1929.09044   804.4407
10  -942.1816 -1377.671  896.31980 -1068.3038
11  -574.6855 -3831.578  -57.20652 -1668.5754
12  -642.5638  1119.864  397.27942   539.8886
Code
h1 <- gwa_c2[[1]][rp$ID, ]
h2 <- gwa_c2[[2]][rp$ID, ]
h3 <- gwa_c2[[3]][rp$ID, ]
h4 <- gwa_c2[[4]][rp$ID, ]
h5 <- gwa_c2[[5]][rp$ID, ]
h6 <- gwa_c2[[6]][rp$ID, ]
h7 <- gwa_c2[[7]][rp$ID, ]
h8 <- gwa_c2[[8]][rp$ID, ]
id4 <- gwt_4[rp$ID, ]

Gy4a <- gp_1t$coefm[2]*h1$ghat+
  gp_1t$coefm[3]*h2$ghat+
  gp_1t$coefm[4]*h3$ghat+
  gp_1t$coefm[5]*h5$ghat+
  gp_1t$coefm[6]*h6$ghat+
  gp_1t$coefm[7]*h7$ghat+
  gp_1t$coefm[8]*h8$ghat+
  id4$ghat

cr6 <- cor(h4[,1], Gy4a)
tibble(x=h4[,1], y=Gy4a)%>% ggplot(., aes(x,y))+
  geom_point(color ="gold3") +
  geom_abline(color = "firebrick") +
  geom_label(x = -5800, y = 2700, label = paste0("r: ", round(cr6, 2)))+
  labs(title = "Individual SNP Effects with Transfer SNP Effects",
       subtitle = "Cluster 4*",
       x = expression(hat(g)[Ind]),
       y = expression(hat(g)[Transfer]))+
  theme(plot.title = element_text(color = "firebrick3"),plot.subtitle = element_text(color = "goldenrod4"),axis.title.x = element_text(color = "firebrick3"),axis.title.y = element_text(color = "goldenrod4"))

3.8 Cross-validation: Trait2

Code
trt2_cv_4 <- cross_validation(G_cl[[4]], iterate=10, yc[[4]], y_name = 'adj_GrainYield',rand_var =  'GrowthHabit')

trt2.Tf_cv_4 <- cross_validation(G_cl[[4]], iterate=10, Phet_4, y_name='adj_GrainYield', fixed = c('poly1', 'poly2', 'poly3', 'poly4', 'poly5', 'poly6', 'poly7'), rand_var =  'GrowthHabit', poly_name= c('poly1', 'poly2', 'poly3', 'poly4', 'poly5', 'poly6', 'poly7'))
Code
kable(cbind(c2.v4, c2.Tf.4), 
      caption = "Prediction accuracy under cross-validation: Trait 2", col.names = c("Cluster.4","Cluster.Trf.4"), align = "c")|>
  kable_styling(latex_options = "scale down")|>
  add_footnote("values are mean ± standard deviation.  Cluster.Trf = Cluster with transfer")
Prediction accuracy under cross-validation: Trait 2
Cluster.4 Cluster.Trf.4
0.333±0.072 0.377±0.071
a values are mean ± standard deviation. Cluster.Trf = Cluster with transfer

3.9 Comparing Heritabilites

  • Comparing heritabilities between individual clusters (1 and 4) with their different transfers.
Code
T1_1 <- varcomp(geno_pred[[1]])$prop.var[1]
T2_4 <- varcomp(geno_pred2[[4]])$prop.var[1]


Tf1_1 <- varcomp(gp_1t)$prop.var[1]
Tf2_4 <- varcomp(gpt_4t)$prop.var[1]

H2_df <- rbind(T1_1, T2_4, Tf1_1, Tf2_4)
rownames(H2_df) <- c("Trt1.Ind.cluster1", "Trt2.Ind.cluster4","Trt1.Tranf.cluster1","Trt2.Tranf.cluster4")
kable(H2_df,
      col.names =" Heritability") |>
  add_footnote(c("Trt1.Ind.cluster = Trait1 Individual Cluster", "Trt1.Tranf.cluster = Trait1 Cluster with transfer"))
Heritability
Trt1.Ind.cluster1 0.4839119
Trt2.Ind.cluster4 0.1428246
Trt1.Tranf.cluster1 0.3922605
Trt2.Tranf.cluster4 0.1052001

Note: aTrt1.Ind.cluster = Trait1 Individual Cluster bTrt1.Tranf.cluster = Trait1 Cluster with transfer

N.B: Homebrew “package” called Toolkit_function.R was used as a part of this report with some of its dependent functions. It’s yet to be a documented package.