Preparing data
gdsfmt package provides the genome data structure (GDS) file format for array-oriented bioinformatic data, which is a container for annotation and SNP genotypes.
#Clear cache if necessary using
#closeAllConnections()
#snpgdsClose(genofile) #in case you need to close or re-write a gds file
#closefn.gds(test1.gds)
#Load up our vcf file (SNP data for 24 red squirrels)
vcf.fn <- "/Users/tanyalama/vcftools/stringent_revised.vcf.recode.vcf"
#Reformat VCF --> SNP GDS (genomic data structure) Format
snpgdsVCF2GDS(vcf.fn, "test.gds", method="biallelic.only")
## VCF Format ==> SNP GDS Format
## Method: exacting biallelic SNPs
## Number of samples: 23
## Parsing "/Users/tanyalama/vcftools/stringent_revised.vcf.recode.vcf" ...
## import 73055 variants.
## + genotype { Bit2 23x73055, 410.2K } *
## Optimize the access efficiency ...
## Clean up the fragments of GDS file:
## open the file 'test.gds' (872.9K)
## # of fragments: 53
## save to 'test.gds.tmp'
## rename 'test.gds.tmp' (872.5K, reduced: 396B)
## # of fragments: 20
#Summary
snpgdsSummary("test.gds")
## The file name: /Users/tanyalama/Box Sync/Squirrel/test.gds
## The total number of samples: 23
## The total number of SNPs: 73055
## SNP genotypes are stored in SNP-major mode (Sample X SNP).
genofile <- snpgdsOpen("/Users/tanyalama/Box Sync/Squirrel/test.gds", readonly= FALSE)
#Add population information to your gds genofile
add.gdsn(genofile, "pop_code", c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)) #all identified to 1 population for starters
#get sample id information
sample.id <- read.gdsn(index.gdsn(genofile, "sample.id"))
Data Analysis
gdsfmt and SNPRelate accelerate two key computations in GWAS (genome wide association studies): principal component analysis (PCA) and relatedness analysis using identity-by-descent (IBD) measures.
LD-based SNP pruning
It is suggested to use a pruned set of SNPs which are in approximate linkage equilibrium with each other to avoid the strong influence of SNP clusters in principal component analysis and relatedness analysis.
set.seed(1) #should be 1000, reduced for rmarkdown
#Try different LD thresholds for sensitivity analysis. #Need to include autosome.only = FALSE clause to run.
snpset <- snpgdsLDpruning(genofile, autosome.only = FALSE,remove.monosnp = TRUE, ld.threshold=0.1, verbose = FALSE) #38,384/72,744 markers are selected in total with .1 ld.
#Create an object to hold our ~38k selected SNPs
snpset.id <- unlist(snpset)
Principal Component Analysis (PCA)
The functions in SNPRelate for PCA include calculating the genetic covariance matrix from genotypes, computing the correlation coefficients between sample loadings and genotypes for each SNP, calculating SNP eigenvectors (loadings), and estimating the sample loadings of a new dataset from specified SNP eigenvectors.
#Run PCA using the ldpruned set of 38k SNPs (snpset.id) instead of the full set.
#get sample id information
genofile
## File: /Users/tanyalama/Box Sync/Squirrel/test.gds (872.5K)
## + [ ] *
## |--+ sample.id { Str8 23 ZIP_ra(53.4%), 101B }
## |--+ snp.id { Int32 73055 ZIP_ra(34.6%), 98.7K }
## |--+ snp.rs.id { Str8 73055 ZIP_ra(0.14%), 111B }
## |--+ snp.position { Int32 73055 ZIP_ra(27.5%), 78.5K }
## |--+ snp.chromosome { Str8 73055 ZIP_ra(5.30%), 64.2K }
## |--+ snp.allele { Str8 73055 ZIP_ra(14.2%), 40.6K }
## |--+ genotype { Bit2 23x73055, 410.2K } *
## |--+ snp.annot [ ]
## | |--+ qual { Float32 73055 ZIP_ra(62.1%), 177.2K }
## | \--+ filter { Str8 73055 ZIP_ra(0.23%), 831B }
## \--+ pop_code { Float64 24, 192B }
sample.id <- read.gdsn(index.gdsn(genofile, "sample.id")) #run
pca <- snpgdsPCA(genofile, sample.id = NULL, snp.id=snpset.id, autosome.only = FALSE, num.thread=2, verbose= TRUE)
## Principal Component Analysis (PCA) on genotypes:
## Excluding 0 SNP (monomorphic: TRUE, MAF: NaN, missing rate: NaN)
## Working space: 23 samples, 38,794 SNPs
## using 2 (CPU) cores
## PCA: the sum of all selected genotypes (0,1,2) = 1524278
## CPU capabilities: Double-Precision SSE2
## Thu Jul 26 14:08:05 2018 (internal increment: 21192)
##
[..................................................] 0%, ETC: ---
[==================================================] 100%, completed in 0s
## Thu Jul 26 14:08:05 2018 Begin (eigenvalues and eigenvectors)
## Thu Jul 26 14:08:05 2018 Done.
#if autosome.only = TRUE, use autosomal (non-sex chromosome) SNPs only; if it is a numeric or character value, keep SNPs according to the specified chromosome
#tried with autosome.only = TRUE and error read "There is no SNP!"
#The following code shows how to calculate the percent of variation that is accounted for by the top principal components. It is clear to see the first two eigenvectors hold the largest percentage of variance among the population, although the total variance accounted for is still less the one-quarter of the total.
# variance proportion (%)
pc.percent <- pca$varprop*100
head(round(pc.percent, 2))
## [1] 5.84 5.62 5.42 5.29 5.24 5.08
#get population information
pop_code<- read.gdsn(index.gdsn(genofile, "pop_code"))
head(cbind(sample.id, pop_code))
## sample.id pop_code
## [1,] "UMA2_1" "1"
## [2,] "UMA35_1" "1"
## [3,] "RS32_1" "1"
## [4,] "UMA20_1" "1"
## [5,] "UMA39_1" "1"
## [6,] "UMA19_1" "1"
#make a data.frame
tab <- data.frame(sample.id = pca$sample.id,
pop = factor(pop_code)[match(pca$sample.id,sample.id)],
EV1 = pca$eigenvect[,1], # the first eigenvector
EV2 = pca$eigenvect[,2], # the second eigenvector
stringsAsFactors = FALSE)
tab #view the dataframe
## sample.id pop EV1 EV2
## 1 UMA2_1 1 -0.063676681 0.084596181
## 2 UMA35_1 1 -0.005127551 0.069118871
## 3 RS32_1 1 -0.030475294 0.037277902
## 4 UMA20_1 1 -0.058753040 -0.020379122
## 5 UMA39_1 1 -0.053058858 0.095025643
## 6 UMA19_1 1 -0.063469156 0.079980237
## 7 RS104_1 1 -0.164887290 0.127483231
## 8 UMA13_1 1 -0.043263337 -0.004482035
## 9 UMA5_1 1 -0.039427354 0.010210736
## 10 UMA3_1 1 -0.096313791 -0.675158601
## 11 UMA26_1 1 -0.040642768 -0.001002618
## 12 RS33_1 1 -0.067450454 0.103904687
## 13 UMA23_1 1 -0.083062793 0.026626533
## 14 UMA40_1 1 0.659397709 -0.005952129
## 15 UMA8_1 1 -0.047733814 0.106796905
## 16 UMA6_1 1 -0.005187611 0.057058837
## 17 UMA9_1 1 0.015748784 0.057635036
## 18 UMA38_1 1 0.660806103 -0.005901249
## 19 UMA17_1 1 -0.192091152 0.250090104
## 20 UMA41_1 1 -0.046733023 0.058452053
## 21 UMA37_1 1 -0.069636955 0.077839051
## 22 UMA21_1 1 -0.090893932 -0.623080340
## 23 UMA42_1 1 -0.074067744 0.093860089
plot(tab$EV2, tab$EV1, xlab="eigenvector 2", ylab="eigenvector 1") #three distinct groups
##Zoom In 


Relatedness Analysis: Estimating IBD Using PLINK method of moments (MoM)
For relatedness analysis, identity-by-descent (IBD) estimation in SNPRelate can be done by either the method of moments (MoM) (Purcell et al., 2007) or maximum likelihood estimation (MLE) (Milligan, 2003; Choi et al., 2009). For both of these methods it is preffered to use a LD pruned SNP set.
sample.id <- read.gdsn(index.gdsn(genofile, "sample.id"))
#Estimate IBD coefficients
ibd <- snpgdsIBDMoM(genofile, sample.id, snp.id=snpset.id, autosome.only = FALSE, remove.monosnp= TRUE, kinship = TRUE, maf=0.05, missing.rate=0.05, num.thread=2, verbose= TRUE)
## IBD analysis (PLINK method of moment) on genotypes:
## Excluding 23,659 SNPs (monomorphic: TRUE, MAF: 0.05, missing rate: 0.05)
## Working space: 23 samples, 15,135 SNPs
## using 2 (CPU) cores
## PLINK IBD: the sum of all selected genotypes (0,1,2) = 507583
## Thu Jul 26 14:08:06 2018 (internal increment: 65536)
##
[..................................................] 0%, ETC: ---
[==================================================] 100%, completed in 0s
## Thu Jul 26 14:08:06 2018 Done.
#maf (minor allele frequency) threshold is the frequency at which the second most common allele occurs in a given population.
#Make a dataframe
ibd.coeff <- snpgdsIBDSelection(ibd)
head(ibd.coeff)
## ID1 ID2 k0 k1 kinship
## 1 UMA2_1 UMA35_1 1 0 0
## 2 UMA2_1 RS32_1 1 0 0
## 3 UMA2_1 UMA20_1 1 0 0
## 4 UMA2_1 UMA39_1 1 0 0
## 5 UMA2_1 UMA19_1 1 0 0
## 6 UMA2_1 RS104_1 1 0 0
ibd.coeff
## ID1 ID2 k0 k1 kinship
## 1 UMA2_1 UMA35_1 1.000000000 0.00000000 0.0000000
## 2 UMA2_1 RS32_1 1.000000000 0.00000000 0.0000000
## 3 UMA2_1 UMA20_1 1.000000000 0.00000000 0.0000000
## 4 UMA2_1 UMA39_1 1.000000000 0.00000000 0.0000000
## 5 UMA2_1 UMA19_1 1.000000000 0.00000000 0.0000000
## 6 UMA2_1 RS104_1 1.000000000 0.00000000 0.0000000
## 7 UMA2_1 UMA13_1 1.000000000 0.00000000 0.0000000
## 8 UMA2_1 UMA5_1 1.000000000 0.00000000 0.0000000
## 9 UMA2_1 UMA3_1 1.000000000 0.00000000 0.0000000
## 10 UMA2_1 UMA26_1 1.000000000 0.00000000 0.0000000
## 11 UMA2_1 RS33_1 1.000000000 0.00000000 0.0000000
## 12 UMA2_1 UMA23_1 1.000000000 0.00000000 0.0000000
## 13 UMA2_1 UMA40_1 1.000000000 0.00000000 0.0000000
## 14 UMA2_1 UMA8_1 1.000000000 0.00000000 0.0000000
## 15 UMA2_1 UMA6_1 1.000000000 0.00000000 0.0000000
## 16 UMA2_1 UMA9_1 1.000000000 0.00000000 0.0000000
## 17 UMA2_1 UMA38_1 1.000000000 0.00000000 0.0000000
## 18 UMA2_1 UMA17_1 1.000000000 0.00000000 0.0000000
## 19 UMA2_1 UMA41_1 1.000000000 0.00000000 0.0000000
## 20 UMA2_1 UMA37_1 1.000000000 0.00000000 0.0000000
## 21 UMA2_1 UMA21_1 1.000000000 0.00000000 0.0000000
## 22 UMA2_1 UMA42_1 1.000000000 0.00000000 0.0000000
## 23 UMA35_1 RS32_1 1.000000000 0.00000000 0.0000000
## 24 UMA35_1 UMA20_1 1.000000000 0.00000000 0.0000000
## 25 UMA35_1 UMA39_1 1.000000000 0.00000000 0.0000000
## 26 UMA35_1 UMA19_1 1.000000000 0.00000000 0.0000000
## 27 UMA35_1 RS104_1 1.000000000 0.00000000 0.0000000
## 28 UMA35_1 UMA13_1 1.000000000 0.00000000 0.0000000
## 29 UMA35_1 UMA5_1 1.000000000 0.00000000 0.0000000
## 30 UMA35_1 UMA3_1 1.000000000 0.00000000 0.0000000
## 31 UMA35_1 UMA26_1 1.000000000 0.00000000 0.0000000
## 32 UMA35_1 RS33_1 1.000000000 0.00000000 0.0000000
## 33 UMA35_1 UMA23_1 1.000000000 0.00000000 0.0000000
## 34 UMA35_1 UMA40_1 1.000000000 0.00000000 0.0000000
## 35 UMA35_1 UMA8_1 1.000000000 0.00000000 0.0000000
## 36 UMA35_1 UMA6_1 1.000000000 0.00000000 0.0000000
## 37 UMA35_1 UMA9_1 1.000000000 0.00000000 0.0000000
## 38 UMA35_1 UMA38_1 1.000000000 0.00000000 0.0000000
## 39 UMA35_1 UMA17_1 1.000000000 0.00000000 0.0000000
## 40 UMA35_1 UMA41_1 1.000000000 0.00000000 0.0000000
## 41 UMA35_1 UMA37_1 1.000000000 0.00000000 0.0000000
## 42 UMA35_1 UMA21_1 1.000000000 0.00000000 0.0000000
## 43 UMA35_1 UMA42_1 1.000000000 0.00000000 0.0000000
## 44 RS32_1 UMA20_1 1.000000000 0.00000000 0.0000000
## 45 RS32_1 UMA39_1 1.000000000 0.00000000 0.0000000
## 46 RS32_1 UMA19_1 1.000000000 0.00000000 0.0000000
## 47 RS32_1 RS104_1 1.000000000 0.00000000 0.0000000
## 48 RS32_1 UMA13_1 1.000000000 0.00000000 0.0000000
## 49 RS32_1 UMA5_1 1.000000000 0.00000000 0.0000000
## 50 RS32_1 UMA3_1 1.000000000 0.00000000 0.0000000
## 51 RS32_1 UMA26_1 1.000000000 0.00000000 0.0000000
## 52 RS32_1 RS33_1 1.000000000 0.00000000 0.0000000
## 53 RS32_1 UMA23_1 1.000000000 0.00000000 0.0000000
## 54 RS32_1 UMA40_1 1.000000000 0.00000000 0.0000000
## 55 RS32_1 UMA8_1 1.000000000 0.00000000 0.0000000
## 56 RS32_1 UMA6_1 1.000000000 0.00000000 0.0000000
## 57 RS32_1 UMA9_1 1.000000000 0.00000000 0.0000000
## 58 RS32_1 UMA38_1 1.000000000 0.00000000 0.0000000
## 59 RS32_1 UMA17_1 1.000000000 0.00000000 0.0000000
## 60 RS32_1 UMA41_1 1.000000000 0.00000000 0.0000000
## 61 RS32_1 UMA37_1 1.000000000 0.00000000 0.0000000
## 62 RS32_1 UMA21_1 1.000000000 0.00000000 0.0000000
## 63 RS32_1 UMA42_1 1.000000000 0.00000000 0.0000000
## 64 UMA20_1 UMA39_1 1.000000000 0.00000000 0.0000000
## 65 UMA20_1 UMA19_1 1.000000000 0.00000000 0.0000000
## 66 UMA20_1 RS104_1 1.000000000 0.00000000 0.0000000
## 67 UMA20_1 UMA13_1 1.000000000 0.00000000 0.0000000
## 68 UMA20_1 UMA5_1 1.000000000 0.00000000 0.0000000
## 69 UMA20_1 UMA3_1 1.000000000 0.00000000 0.0000000
## 70 UMA20_1 UMA26_1 1.000000000 0.00000000 0.0000000
## 71 UMA20_1 RS33_1 1.000000000 0.00000000 0.0000000
## 72 UMA20_1 UMA23_1 1.000000000 0.00000000 0.0000000
## 73 UMA20_1 UMA40_1 1.000000000 0.00000000 0.0000000
## 74 UMA20_1 UMA8_1 1.000000000 0.00000000 0.0000000
## 75 UMA20_1 UMA6_1 1.000000000 0.00000000 0.0000000
## 76 UMA20_1 UMA9_1 1.000000000 0.00000000 0.0000000
## 77 UMA20_1 UMA38_1 1.000000000 0.00000000 0.0000000
## 78 UMA20_1 UMA17_1 1.000000000 0.00000000 0.0000000
## 79 UMA20_1 UMA41_1 1.000000000 0.00000000 0.0000000
## 80 UMA20_1 UMA37_1 1.000000000 0.00000000 0.0000000
## 81 UMA20_1 UMA21_1 1.000000000 0.00000000 0.0000000
## 82 UMA20_1 UMA42_1 1.000000000 0.00000000 0.0000000
## 83 UMA39_1 UMA19_1 1.000000000 0.00000000 0.0000000
## 84 UMA39_1 RS104_1 1.000000000 0.00000000 0.0000000
## 85 UMA39_1 UMA13_1 1.000000000 0.00000000 0.0000000
## 86 UMA39_1 UMA5_1 1.000000000 0.00000000 0.0000000
## 87 UMA39_1 UMA3_1 1.000000000 0.00000000 0.0000000
## 88 UMA39_1 UMA26_1 1.000000000 0.00000000 0.0000000
## 89 UMA39_1 RS33_1 1.000000000 0.00000000 0.0000000
## 90 UMA39_1 UMA23_1 1.000000000 0.00000000 0.0000000
## 91 UMA39_1 UMA40_1 1.000000000 0.00000000 0.0000000
## 92 UMA39_1 UMA8_1 1.000000000 0.00000000 0.0000000
## 93 UMA39_1 UMA6_1 1.000000000 0.00000000 0.0000000
## 94 UMA39_1 UMA9_1 1.000000000 0.00000000 0.0000000
## 95 UMA39_1 UMA38_1 1.000000000 0.00000000 0.0000000
## 96 UMA39_1 UMA17_1 1.000000000 0.00000000 0.0000000
## 97 UMA39_1 UMA41_1 1.000000000 0.00000000 0.0000000
## 98 UMA39_1 UMA37_1 1.000000000 0.00000000 0.0000000
## 99 UMA39_1 UMA21_1 1.000000000 0.00000000 0.0000000
## 100 UMA39_1 UMA42_1 0.316169023 0.45225758 0.2288511
## 101 UMA19_1 RS104_1 1.000000000 0.00000000 0.0000000
## 102 UMA19_1 UMA13_1 1.000000000 0.00000000 0.0000000
## 103 UMA19_1 UMA5_1 1.000000000 0.00000000 0.0000000
## 104 UMA19_1 UMA3_1 1.000000000 0.00000000 0.0000000
## 105 UMA19_1 UMA26_1 1.000000000 0.00000000 0.0000000
## 106 UMA19_1 RS33_1 1.000000000 0.00000000 0.0000000
## 107 UMA19_1 UMA23_1 1.000000000 0.00000000 0.0000000
## 108 UMA19_1 UMA40_1 1.000000000 0.00000000 0.0000000
## 109 UMA19_1 UMA8_1 1.000000000 0.00000000 0.0000000
## 110 UMA19_1 UMA6_1 1.000000000 0.00000000 0.0000000
## 111 UMA19_1 UMA9_1 1.000000000 0.00000000 0.0000000
## 112 UMA19_1 UMA38_1 1.000000000 0.00000000 0.0000000
## 113 UMA19_1 UMA17_1 1.000000000 0.00000000 0.0000000
## 114 UMA19_1 UMA41_1 1.000000000 0.00000000 0.0000000
## 115 UMA19_1 UMA37_1 1.000000000 0.00000000 0.0000000
## 116 UMA19_1 UMA21_1 1.000000000 0.00000000 0.0000000
## 117 UMA19_1 UMA42_1 1.000000000 0.00000000 0.0000000
## 118 RS104_1 UMA13_1 1.000000000 0.00000000 0.0000000
## 119 RS104_1 UMA5_1 1.000000000 0.00000000 0.0000000
## 120 RS104_1 UMA3_1 1.000000000 0.00000000 0.0000000
## 121 RS104_1 UMA26_1 1.000000000 0.00000000 0.0000000
## 122 RS104_1 RS33_1 1.000000000 0.00000000 0.0000000
## 123 RS104_1 UMA23_1 1.000000000 0.00000000 0.0000000
## 124 RS104_1 UMA40_1 1.000000000 0.00000000 0.0000000
## 125 RS104_1 UMA8_1 1.000000000 0.00000000 0.0000000
## 126 RS104_1 UMA6_1 1.000000000 0.00000000 0.0000000
## 127 RS104_1 UMA9_1 1.000000000 0.00000000 0.0000000
## 128 RS104_1 UMA38_1 1.000000000 0.00000000 0.0000000
## 129 RS104_1 UMA17_1 1.000000000 0.00000000 0.0000000
## 130 RS104_1 UMA41_1 1.000000000 0.00000000 0.0000000
## 131 RS104_1 UMA37_1 1.000000000 0.00000000 0.0000000
## 132 RS104_1 UMA21_1 1.000000000 0.00000000 0.0000000
## 133 RS104_1 UMA42_1 1.000000000 0.00000000 0.0000000
## 134 UMA13_1 UMA5_1 1.000000000 0.00000000 0.0000000
## 135 UMA13_1 UMA3_1 1.000000000 0.00000000 0.0000000
## 136 UMA13_1 UMA26_1 1.000000000 0.00000000 0.0000000
## 137 UMA13_1 RS33_1 1.000000000 0.00000000 0.0000000
## 138 UMA13_1 UMA23_1 1.000000000 0.00000000 0.0000000
## 139 UMA13_1 UMA40_1 1.000000000 0.00000000 0.0000000
## 140 UMA13_1 UMA8_1 1.000000000 0.00000000 0.0000000
## 141 UMA13_1 UMA6_1 1.000000000 0.00000000 0.0000000
## 142 UMA13_1 UMA9_1 1.000000000 0.00000000 0.0000000
## 143 UMA13_1 UMA38_1 1.000000000 0.00000000 0.0000000
## 144 UMA13_1 UMA17_1 1.000000000 0.00000000 0.0000000
## 145 UMA13_1 UMA41_1 1.000000000 0.00000000 0.0000000
## 146 UMA13_1 UMA37_1 1.000000000 0.00000000 0.0000000
## 147 UMA13_1 UMA21_1 1.000000000 0.00000000 0.0000000
## 148 UMA13_1 UMA42_1 1.000000000 0.00000000 0.0000000
## 149 UMA5_1 UMA3_1 1.000000000 0.00000000 0.0000000
## 150 UMA5_1 UMA26_1 1.000000000 0.00000000 0.0000000
## 151 UMA5_1 RS33_1 1.000000000 0.00000000 0.0000000
## 152 UMA5_1 UMA23_1 1.000000000 0.00000000 0.0000000
## 153 UMA5_1 UMA40_1 1.000000000 0.00000000 0.0000000
## 154 UMA5_1 UMA8_1 1.000000000 0.00000000 0.0000000
## 155 UMA5_1 UMA6_1 1.000000000 0.00000000 0.0000000
## 156 UMA5_1 UMA9_1 1.000000000 0.00000000 0.0000000
## 157 UMA5_1 UMA38_1 1.000000000 0.00000000 0.0000000
## 158 UMA5_1 UMA17_1 1.000000000 0.00000000 0.0000000
## 159 UMA5_1 UMA41_1 1.000000000 0.00000000 0.0000000
## 160 UMA5_1 UMA37_1 1.000000000 0.00000000 0.0000000
## 161 UMA5_1 UMA21_1 1.000000000 0.00000000 0.0000000
## 162 UMA5_1 UMA42_1 1.000000000 0.00000000 0.0000000
## 163 UMA3_1 UMA26_1 1.000000000 0.00000000 0.0000000
## 164 UMA3_1 RS33_1 1.000000000 0.00000000 0.0000000
## 165 UMA3_1 UMA23_1 1.000000000 0.00000000 0.0000000
## 166 UMA3_1 UMA40_1 1.000000000 0.00000000 0.0000000
## 167 UMA3_1 UMA8_1 1.000000000 0.00000000 0.0000000
## 168 UMA3_1 UMA6_1 1.000000000 0.00000000 0.0000000
## 169 UMA3_1 UMA9_1 1.000000000 0.00000000 0.0000000
## 170 UMA3_1 UMA38_1 1.000000000 0.00000000 0.0000000
## 171 UMA3_1 UMA17_1 1.000000000 0.00000000 0.0000000
## 172 UMA3_1 UMA41_1 1.000000000 0.00000000 0.0000000
## 173 UMA3_1 UMA37_1 1.000000000 0.00000000 0.0000000
## 174 UMA3_1 UMA21_1 0.333725723 0.43209317 0.2251138
## 175 UMA3_1 UMA42_1 1.000000000 0.00000000 0.0000000
## 176 UMA26_1 RS33_1 1.000000000 0.00000000 0.0000000
## 177 UMA26_1 UMA23_1 1.000000000 0.00000000 0.0000000
## 178 UMA26_1 UMA40_1 1.000000000 0.00000000 0.0000000
## 179 UMA26_1 UMA8_1 1.000000000 0.00000000 0.0000000
## 180 UMA26_1 UMA6_1 1.000000000 0.00000000 0.0000000
## 181 UMA26_1 UMA9_1 1.000000000 0.00000000 0.0000000
## 182 UMA26_1 UMA38_1 1.000000000 0.00000000 0.0000000
## 183 UMA26_1 UMA17_1 1.000000000 0.00000000 0.0000000
## 184 UMA26_1 UMA41_1 1.000000000 0.00000000 0.0000000
## 185 UMA26_1 UMA37_1 1.000000000 0.00000000 0.0000000
## 186 UMA26_1 UMA21_1 1.000000000 0.00000000 0.0000000
## 187 UMA26_1 UMA42_1 1.000000000 0.00000000 0.0000000
## 188 RS33_1 UMA23_1 1.000000000 0.00000000 0.0000000
## 189 RS33_1 UMA40_1 1.000000000 0.00000000 0.0000000
## 190 RS33_1 UMA8_1 1.000000000 0.00000000 0.0000000
## 191 RS33_1 UMA6_1 1.000000000 0.00000000 0.0000000
## 192 RS33_1 UMA9_1 1.000000000 0.00000000 0.0000000
## 193 RS33_1 UMA38_1 1.000000000 0.00000000 0.0000000
## 194 RS33_1 UMA17_1 1.000000000 0.00000000 0.0000000
## 195 RS33_1 UMA41_1 1.000000000 0.00000000 0.0000000
## 196 RS33_1 UMA37_1 1.000000000 0.00000000 0.0000000
## 197 RS33_1 UMA21_1 1.000000000 0.00000000 0.0000000
## 198 RS33_1 UMA42_1 1.000000000 0.00000000 0.0000000
## 199 UMA23_1 UMA40_1 1.000000000 0.00000000 0.0000000
## 200 UMA23_1 UMA8_1 1.000000000 0.00000000 0.0000000
## 201 UMA23_1 UMA6_1 1.000000000 0.00000000 0.0000000
## 202 UMA23_1 UMA9_1 1.000000000 0.00000000 0.0000000
## 203 UMA23_1 UMA38_1 1.000000000 0.00000000 0.0000000
## 204 UMA23_1 UMA17_1 1.000000000 0.00000000 0.0000000
## 205 UMA23_1 UMA41_1 1.000000000 0.00000000 0.0000000
## 206 UMA23_1 UMA37_1 1.000000000 0.00000000 0.0000000
## 207 UMA23_1 UMA21_1 1.000000000 0.00000000 0.0000000
## 208 UMA23_1 UMA42_1 1.000000000 0.00000000 0.0000000
## 209 UMA40_1 UMA8_1 1.000000000 0.00000000 0.0000000
## 210 UMA40_1 UMA6_1 1.000000000 0.00000000 0.0000000
## 211 UMA40_1 UMA9_1 1.000000000 0.00000000 0.0000000
## 212 UMA40_1 UMA38_1 0.001413618 0.01656152 0.4951528
## 213 UMA40_1 UMA17_1 1.000000000 0.00000000 0.0000000
## 214 UMA40_1 UMA41_1 1.000000000 0.00000000 0.0000000
## 215 UMA40_1 UMA37_1 1.000000000 0.00000000 0.0000000
## 216 UMA40_1 UMA21_1 1.000000000 0.00000000 0.0000000
## 217 UMA40_1 UMA42_1 1.000000000 0.00000000 0.0000000
## 218 UMA8_1 UMA6_1 1.000000000 0.00000000 0.0000000
## 219 UMA8_1 UMA9_1 1.000000000 0.00000000 0.0000000
## 220 UMA8_1 UMA38_1 1.000000000 0.00000000 0.0000000
## 221 UMA8_1 UMA17_1 1.000000000 0.00000000 0.0000000
## 222 UMA8_1 UMA41_1 1.000000000 0.00000000 0.0000000
## 223 UMA8_1 UMA37_1 1.000000000 0.00000000 0.0000000
## 224 UMA8_1 UMA21_1 1.000000000 0.00000000 0.0000000
## 225 UMA8_1 UMA42_1 1.000000000 0.00000000 0.0000000
## 226 UMA6_1 UMA9_1 1.000000000 0.00000000 0.0000000
## 227 UMA6_1 UMA38_1 1.000000000 0.00000000 0.0000000
## 228 UMA6_1 UMA17_1 1.000000000 0.00000000 0.0000000
## 229 UMA6_1 UMA41_1 1.000000000 0.00000000 0.0000000
## 230 UMA6_1 UMA37_1 1.000000000 0.00000000 0.0000000
## 231 UMA6_1 UMA21_1 1.000000000 0.00000000 0.0000000
## 232 UMA6_1 UMA42_1 1.000000000 0.00000000 0.0000000
## 233 UMA9_1 UMA38_1 1.000000000 0.00000000 0.0000000
## 234 UMA9_1 UMA17_1 1.000000000 0.00000000 0.0000000
## 235 UMA9_1 UMA41_1 1.000000000 0.00000000 0.0000000
## 236 UMA9_1 UMA37_1 1.000000000 0.00000000 0.0000000
## 237 UMA9_1 UMA21_1 1.000000000 0.00000000 0.0000000
## 238 UMA9_1 UMA42_1 1.000000000 0.00000000 0.0000000
## 239 UMA38_1 UMA17_1 1.000000000 0.00000000 0.0000000
## 240 UMA38_1 UMA41_1 1.000000000 0.00000000 0.0000000
## 241 UMA38_1 UMA37_1 1.000000000 0.00000000 0.0000000
## 242 UMA38_1 UMA21_1 1.000000000 0.00000000 0.0000000
## 243 UMA38_1 UMA42_1 1.000000000 0.00000000 0.0000000
## 244 UMA17_1 UMA41_1 1.000000000 0.00000000 0.0000000
## 245 UMA17_1 UMA37_1 1.000000000 0.00000000 0.0000000
## 246 UMA17_1 UMA21_1 1.000000000 0.00000000 0.0000000
## 247 UMA17_1 UMA42_1 1.000000000 0.00000000 0.0000000
## 248 UMA41_1 UMA37_1 1.000000000 0.00000000 0.0000000
## 249 UMA41_1 UMA21_1 1.000000000 0.00000000 0.0000000
## 250 UMA41_1 UMA42_1 1.000000000 0.00000000 0.0000000
## 251 UMA37_1 UMA21_1 1.000000000 0.00000000 0.0000000
## 252 UMA37_1 UMA42_1 1.000000000 0.00000000 0.0000000
## 253 UMA21_1 UMA42_1 1.000000000 0.00000000 0.0000000

Estimating IBD: Maximum Likelihood Estimation
Identical by descent (IBD) describes a matching segment of DNA shared by two or more individuals that has been inherited from a common ancestor without any intervening recombination. Alleles that are IBD have been interested by a recent common ancestor (and indicate recent gene flow).
# Estimate IBD coefficients
set.seed(1) #should be 1000
ibd <- snpgdsIBDMLE(genofile, sample.id = NULL, snp.id=snpset.id,autosome.only = FALSE, remove.monosnp= TRUE, maf=0.05, missing.rate=0.05, kinship = TRUE, num.thread=2, verbose = TRUE)
## Identity-By-Descent analysis (MLE) on genotypes:
## Excluding 23,659 SNPs (monomorphic: TRUE, MAF: 0.05, missing rate: 0.05)
## Working space: 23 samples, 15,135 SNPs
## using 2 (CPU) cores
## MLE IBD: the sum of all selected genotypes (0,1,2) = 507583
## MLE IBD: Thu Jul 26 14:08:06 2018 0%
## MLE IBD: Thu Jul 26 14:08:11 2018 100%
# Make a dataframe
ibd.coeff <- snpgdsIBDSelection(ibd)
