## [1] 2504 1882246
Set up: Parsing the data and converting into a matrix (2504 people x 1882245 variants) took about 20 minutes.
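As a rough illustration of what happens to a people-by-variants matrix after this setup step, the sketch below runs a PCA on a small synthetic 0/1 matrix using only base R. Everything in it is an assumption made for illustration: the two groups, the allele frequencies, and the names `draw`, `geno`, and `pcs` are hypothetical and not taken from the original analysis. The real matrix is far too large for a full `svd()`, so the original presumably used a truncated method, as the `irlba` package loaded in this report suggests.

```r
# Synthetic stand-in for the genotype matrix: rows are people, columns are variants.
set.seed(1)
n_per_group <- 50
n_variants <- 200

# Two hypothetical groups that differ in allele frequency at the first 50 variants.
freq_a <- rep(0.2, n_variants)
freq_b <- freq_a
freq_b[1:50] <- 0.8

# Draw an n x length(freq) matrix of 0/1 values; column j uses probability freq[j].
draw <- function(n, freq) {
  matrix(rbinom(n * length(freq), size = 1, prob = rep(freq, each = n)), nrow = n)
}

geno <- rbind(draw(n_per_group, freq_a), draw(n_per_group, freq_b))
group <- rep(c("A", "B"), each = n_per_group)

# PCA via SVD of the column-centered matrix, keeping the first three components.
centered <- scale(geno, center = TRUE, scale = FALSE)
s <- svd(centered, nu = 3, nv = 0)
pcs <- s$u %*% diag(s$d[1:3])   # one row per person, one column per PC

# Mean PC1 score per group: the two groups fall on opposite sides of zero.
tapply(pcs[, 1], group, mean)
```

Plotting `pcs[, 1]` against `pcs[, 2]`, colored by `group`, gives the same kind of cluster picture that the paragraphs below interpret for the real populations.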
## Warning: package 'irlba' was built under R version 4.0.2
## Warning: package 'threejs' was built under R version 4.0.2
## Loading required package: igraph
## Warning: package 'igraph' was built under R version 4.0.2
##
## Attaching package: 'igraph'
## The following objects are masked from 'package:stats':
##
## decompose, spectrum
## The following object is masked from 'package:base':
##
## union
## Warning in ans[i] <- sprintf("%.2e", x): number of items to replace is not a
## multiple of replacement length
First, the African population forms the most distinct cluster along PC1, while the Admixed American population forms a less distinct cluster that partly overlaps with the rest. The South Asian, East Asian, and European populations are the most similar along PC1. Second, half of the South Asian population forms a cluster along PC2. Third, along PC3, the East Asian and European populations each form a cluster that separates from the rest, and these two populations are the most different from each other.
Comparing the PCA of chromosome 10 with that of chromosome 20, PC1 is not very different. PC2 and PC3 both form the same clusters, but in opposite directions. For PC2, for example, the South Asian population is still the one that stands out, but in the positive direction (instead of the negative direction obtained for chromosome 20). Similarly, for PC3 the East Asian population stands out in the positive direction, as opposed to the negative direction for chromosome 20, and vice versa for the European population. Because the sign of a principal component is arbitrary, these flips reflect the orientation chosen by the decomposition, not a difference in population structure.
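The direction flips between the two chromosomes can be reproduced on toy data. The sketch below, in base R, shows that negating a component leaves the decomposition unchanged, and then aligns one set of scores to another so that plots are directly comparable. The helper `align_signs` and the random matrix `x` are hypothetical names introduced here; the sketch assumes both sets of scores describe the same people in the same row order, as is the case when two chromosomes are analyzed for the same 2504 individuals.

```r
# Toy data: 200 samples, 5 features, with one dominant direction.
set.seed(42)
x <- matrix(rnorm(200 * 5), nrow = 200)
x[, 1] <- x[, 1] * 3
centered <- scale(x, center = TRUE, scale = FALSE)

s <- svd(centered)
scores <- s$u %*% diag(s$d)

# Negate PC2 in both the left and right singular vectors.
flip <- diag(c(1, -1, 1, 1, 1))
u_flipped <- s$u %*% flip
v_flipped <- s$v %*% flip

# Both versions reconstruct the same centered matrix, so both are valid PCAs.
recon_original <- s$u %*% diag(s$d) %*% t(s$v)
recon_flipped <- u_flipped %*% diag(s$d) %*% t(v_flipped)
isTRUE(all.equal(recon_original, recon_flipped))

# Hypothetical helper: flip each column of `scores` so that it points the same
# way as the matching column of `reference` (non-negative inner product).
align_signs <- function(scores, reference) {
  signs <- sign(colSums(scores * reference))
  signs[signs == 0] <- 1
  sweep(scores, 2, signs, `*`)
}

flipped_scores <- scores %*% flip
aligned <- align_signs(flipped_scores, scores)
```

Applied to the real analysis, the chromosome 20 scores would play the role of `reference` and the chromosome 10 scores would be passed as `scores`, so that the South Asian, East Asian, and European clusters appear on the same side in both sets of plots.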