So our C. virginica data has a wonky sample, where we got ~1/10th the expected number of reads, which reduces our total loci with at least 10x coverage by 80%. I’m trying MACAU without that sample, and then with to see if there are any major differences caused by that sample.

list.files(".")
[1] "10XCoverageMethylationCounts5Samples.txt" "10XCoverageTotalCounts5Samples.txt"       "5Sample-predictor.csv"                   
[4] "5samplerelatedness.csv"                   "output"                                   "predictor.csv"                           
[7] "relatedness_matrix.csv"                   "TenXCovMethCounts.txt"                    "TenXCovTotCounts.txt"                    
system("macau -g 10XCoverageMethylationCounts5Samples.txt -t 10XCoverageTotalCounts5Samples.txt -p 5Sample-predictor.csv -k 5samplerelatedness.csv -o 5Sample_output -s 10000 -seed 123")
Reading Files ... 
## number of total individuals = 5
## number of analyzed individuals = 5
## number of covariates = 1
## number of total genes/sites = 60536
Performing Analysis                                                   0.00%
Performing Analysis                                                   1.65%
Performing Analysis =                                                 3.30%
Performing Analysis ==                                                4.96%
Performing Analysis ===                                               6.61%
Performing Analysis ====                                              8.26%
Performing Analysis ====                                              9.91%
Performing Analysis =====                                             11.56%
Performing Analysis ======                                            13.22%
Performing Analysis =======                                           14.87%
Performing Analysis ========                                          16.52%
Performing Analysis =========                                         18.17%
Performing Analysis =========                                         19.82%
Performing Analysis ==========                                        21.48%
Performing Analysis ===========                                       23.13%
Performing Analysis ============                                      24.78%
Performing Analysis =============                                     26.43%
Performing Analysis ==============                                    28.08%
Performing Analysis ==============                                    29.73%
Performing Analysis ===============                                   31.39%
Performing Analysis ================                                  33.04%
Performing Analysis =================                                 34.69%
Performing Analysis ==================                                36.34%
Performing Analysis ==================                                37.99%
Performing Analysis ===================                               39.65%
Performing Analysis ====================                              41.30%
Performing Analysis =====================                             42.95%
Performing Analysis ======================                            44.60%
Performing Analysis =======================                           46.25%
Performing Analysis =======================                           47.91%
Performing Analysis ========================                          49.56%
Performing Analysis =========================                         51.21%
Performing Analysis ==========================                        52.86%
Performing Analysis ===========================                       54.51%
Performing Analysis ============================                      56.17%
Performing Analysis ============================                      57.82%
Performing Analysis =============================                     59.47%
Performing Analysis ==============================                    61.12%
Performing Analysis ===============================                   62.77%
Performing Analysis ================================                  64.43%
Performing Analysis =================================                 66.08%
Performing Analysis =================================                 67.73%
Performing Analysis ==================================                69.38%
Performing Analysis ===================================               71.03%
Performing Analysis ====================================              72.69%
Performing Analysis =====================================             74.34%
Performing Analysis =====================================             75.99%
Performing Analysis ======================================            77.64%
Performing Analysis =======================================           79.29%
Performing Analysis ========================================          80.94%
Performing Analysis =========================================         82.60%
Performing Analysis ==========================================        84.25%
Performing Analysis ==========================================        85.90%
Performing Analysis ===========================================       87.55%
Performing Analysis ============================================      89.20%
Performing Analysis =============================================     90.86%
Performing Analysis ==============================================    92.51%
Performing Analysis ===============================================   94.16%
Performing Analysis ===============================================   95.81%
Performing Analysis ================================================  97.46%
Performing Analysis ================================================= 99.12%
Performing Analysis ==================================================100.00%

Add a new chunk by clicking the Insert Chunk button on the toolbar or by pressing Ctrl+Alt+I.

When you save the notebook, an HTML file containing the code and output will be saved alongside it (click the Preview button or press Ctrl+Shift+K to preview the HTML file).

It’s done! Lets read in our new data.


FiveSample_output_assoc <- read_delim("~/Documents/C_virginica_MACAU/output/5Sample_output.assoc.txt", 
    "\t", escape_double = FALSE, trim_ws = TRUE)

And see the range of the p-values for Beta term significance.

print(nrow(FiveSample_output_assoc))
[1] 16
print(max(FiveSample_output_assoc$pvalue))
[1] 0.9999
print(min(FiveSample_output_assoc$pvalue))
[1] 0.9215

Well… boo. We had 16

I ran the 6 Sample MACAU via ssh, so I’ll throw the results up here as as well, they were less informative than the 5 sample run.

SixSample_output_assoc <- read_delim("~/Documents/C_virginica_MACAU/output/output.assoc.txt", 
    "\t", escape_double = FALSE, trim_ws = TRUE)
Parsed with column specification:
cols(
  id = col_character(),
  n = col_integer(),
  acpt_rate = col_double(),
  beta = col_double(),
  se_beta = col_double(),
  pvalue = col_double(),
  h = col_double(),
  se_h = col_double(),
  sigma2 = col_double(),
  se_sigma2 = col_double(),
  alpha0 = col_double(),
  se_alpha0 = col_double()
)
print(nrow(SixSample_output_assoc))
[1] 4
print(max(SixSample_output_assoc$pvalue))
[1] 0.9983
print(min(SixSample_output_assoc$pvalue))
[1] 0.927
LS0tCnRpdGxlOiAiNSBTYW1wbGUgQy4gdmlyZ2luaWNhIE1BQ0FVIHJ1biIKb3V0cHV0OiBodG1sX25vdGVib29rCi0tLQoKU28gb3VyIEMuIHZpcmdpbmljYSBkYXRhIGhhcyBhIHdvbmt5IHNhbXBsZSwgd2hlcmUgd2UgZ290IH4xLzEwdGggdGhlIGV4cGVjdGVkIG51bWJlciBvZiByZWFkcywgd2hpY2ggcmVkdWNlcyBvdXIgdG90YWwgbG9jaSB3aXRoIGF0IGxlYXN0IDEweCBjb3ZlcmFnZSBieSA4MCUuIEknbSB0cnlpbmcgTUFDQVUgd2l0aG91dCB0aGF0IHNhbXBsZSwgYW5kIHRoZW4gd2l0aCB0byBzZWUgaWYgdGhlcmUgYXJlIGFueSBtYWpvciBkaWZmZXJlbmNlcyBjYXVzZWQgYnkgdGhhdCBzYW1wbGUuCgpgYGB7cn0KbGlicmFyeShyZWFkcikKCgpzZXR3ZCgifi9Eb2N1bWVudHMvQ192aXJnaW5pY2FfTUFDQVUiKQoKbGlzdC5maWxlcygiLiIpCgpgYGAKCmBgYHtyfQoKc3lzdGVtKCJtYWNhdSAtZyAxMFhDb3ZlcmFnZU1ldGh5bGF0aW9uQ291bnRzNVNhbXBsZXMudHh0IC10IDEwWENvdmVyYWdlVG90YWxDb3VudHM1U2FtcGxlcy50eHQgLXAgNVNhbXBsZS1wcmVkaWN0b3IuY3N2IC1rIDVzYW1wbGVyZWxhdGVkbmVzcy5jc3YgLW8gNVNhbXBsZV9vdXRwdXQgLXMgMTAwMDAgLXNlZWQgMTIzIikKCmBgYAoKCkFkZCBhIG5ldyBjaHVuayBieSBjbGlja2luZyB0aGUgKkluc2VydCBDaHVuayogYnV0dG9uIG9uIHRoZSB0b29sYmFyIG9yIGJ5IHByZXNzaW5nICpDdHJsK0FsdCtJKi4KCldoZW4geW91IHNhdmUgdGhlIG5vdGVib29rLCBhbiBIVE1MIGZpbGUgY29udGFpbmluZyB0aGUgY29kZSBhbmQgb3V0cHV0IHdpbGwgYmUgc2F2ZWQgYWxvbmdzaWRlIGl0IChjbGljayB0aGUgKlByZXZpZXcqIGJ1dHRvbiBvciBwcmVzcyAqQ3RybCtTaGlmdCtLKiB0byBwcmV2aWV3IHRoZSBIVE1MIGZpbGUpLgoKSXQncyBkb25lISBMZXRzIHJlYWQgaW4gb3VyIG5ldyBkYXRhLgoKYGBge3J9CgpGaXZlU2FtcGxlX291dHB1dF9hc3NvYyA8LSByZWFkX2RlbGltKCJ+L0RvY3VtZW50cy9DX3ZpcmdpbmljYV9NQUNBVS9vdXRwdXQvNVNhbXBsZV9vdXRwdXQuYXNzb2MudHh0IiwgCiAgICAiXHQiLCBlc2NhcGVfZG91YmxlID0gRkFMU0UsIHRyaW1fd3MgPSBUUlVFKQoKYGBgCgpBbmQgc2VlIHRoZSByYW5nZSBvZiB0aGUgcC12YWx1ZXMgZm9yIEJldGEgdGVybSBzaWduaWZpY2FuY2UuCmBgYHtyfQpwcmludChucm93KEZpdmVTYW1wbGVfb3V0cHV0X2Fzc29jKSkKCnByaW50KG1heChGaXZlU2FtcGxlX291dHB1dF9hc3NvYyRwdmFsdWUpKQpwcmludChtaW4oRml2ZVNhbXBsZV9vdXRwdXRfYXNzb2MkcHZhbHVlKSkKCgoKYGBgCgpXZWxsLi4uIGJvby4gV2UgaGFkIDE2IAoKSSByYW4gdGhlIDYgU2FtcGxlIE1BQ0FVIHZpYSBzc2gsIHNvIEknbGwgdGhyb3cgdGhlIHJlc3VsdHMgdXAgaGVyZSBhcyBhcyB3ZWxsLCB0aGV5IHdlcmUgbGVzcyBpbmZvcm1hdGl2ZSB0aGFuIHRoZSA1IHNhbXBsZSBydW4uCgpgYGB7cn0KClNpeFNhbXBsZV9vdXRwdXRfYXNzb2MgPC0gcmVhZF9kZWxpbSgifi9Eb2N1bWVudHMvQ192aXJnaW5pY2FfTUFDQVUvb3V0cHV0L291dHB1dC5hc3NvYy50eHQiLCAKICAgICJcdCIsIGVzY2FwZV9kb3VibGUgPSBGQUxTRSwgdHJpbV93cyA9IFRSVUUpCgpgYGAKCgoKCmBgYHtyfQoKcHJpbnQobnJvdyhTaXhTYW1wbGVfb3V0cHV0X2Fzc29jKSkKCnByaW50KG1heChTaXhTYW1wbGVfb3V0cHV0X2Fzc29jJHB2YWx1ZSkpCnByaW50KG1pbihTaXhTYW1wbGVfb3V0cHV0X2Fzc29jJHB2YWx1ZSkpCgpgYGA=