Helped Grace out with this, so wanted to document the process real quick.

First, we downloaded the SRA Toolkit from https://www.ncbi.nlm.nih.gov/books/NBK158900/

Then we mounted Eagle as a drive on Emu to hopefully ease the transfer process. We’re a little space constrained on both Emu and Eagle (when downloading multiple 55 GB files), so in the future this may have to be tackled differently.

The command to mount Eagle is:

mount -t cifs -o username=srlab //eagle.fish.washington.edu/web ~/Documents/eagle

Then we searched the SRA database to get possible data sets. We searched a few different ways, including “C gigas bisulfite”, “C gigas methylation”, “C gigas bis-seq”, “C gigas bisseq”, and “C gigas bis seq”, and settled on the following SRA accession numbers.

SRA.list <- c("SRR5085013", "SRR5085014", "SRR5085015", "SRR5085016", "SRR5085017", "SRR5085018", "SRR546473", "SRR546474", "SRR546472", "SRR546471")
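
Accession numbers are easy to mistype, so a quick sanity check before downloading may be worthwhile. The following is only a minimal sketch: `check_accessions` is a hypothetical helper (not part of SRA Toolkit), and the pattern only checks the general "SRR followed by digits" shape and flags duplicates. It cannot tell whether a run actually exists in SRA.

```r
# Hypothetical helper: flag accessions that do not look like SRR run IDs,
# and any that appear more than once in the list.
check_accessions <- function(accessions) {
  malformed  <- accessions[!grepl("^SRR[0-9]+$", accessions)]
  duplicated <- unique(accessions[duplicated(accessions)])
  list(malformed = malformed, duplicated = duplicated)
}

# Example input with one repeated entry and one entry missing an "R"
result <- check_accessions(c("SRR5085013", "SRR5085014", "SRR5085014", "SR546471"))
print(result)
```

Running the same function on `SRA.list` should return two empty vectors if every entry has the expected shape.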

Then we whipped up a little script to get to downloading them.

setwd("~/Documents/eagle/scaphapoda/Grace/SRA Data - C. gigas methylation")

The script below downloads each run with the fastq-dump tool from the SRA Toolkit, compresses the resulting FASTQ file into a .tar.gz archive, and then deletes the uncompressed file.


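for (i in seq_along(SRA.list)) {

  # Download the run as <accession>.fastq into the working directory
  system(paste0("~/Downloads/sratoolkit.2.8.2-1-ubuntu64/bin/fastq-dump ", SRA.list[i]))
  # Compress it: the archive name comes first, then the file to add
  system(paste0("tar -czf ", SRA.list[i], ".fastq.tar.gz ", SRA.list[i], ".fastq"))
  # Remove the uncompressed FASTQ file
  system(paste0("rm ", SRA.list[i], ".fastq"))

}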
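
The three steps performed for each accession (download, compress, delete) can also be sketched as a small function that only builds the command strings, so they can be inspected before anything is run. This is an illustration rather than the script we actually ran: `build_sra_commands` and its `fastq_dump` argument are hypothetical names, and the default path simply reuses the toolkit location from the loop. Note that `tar -czf` takes the archive name first and the file to archive second.

```r
# Hypothetical helper: build (but do not run) the shell commands for one accession.
build_sra_commands <- function(accession,
                               fastq_dump = "~/Downloads/sratoolkit.2.8.2-1-ubuntu64/bin/fastq-dump") {
  fastq   <- paste0(accession, ".fastq")   # file written by fastq-dump
  archive <- paste0(fastq, ".tar.gz")      # compressed archive to create
  c(
    download = paste(fastq_dump, accession),
    compress = paste("tar -czf", archive, fastq),
    cleanup  = paste("rm", fastq)
  )
}

# Inspect the commands for a single accession
cmds <- build_sra_commands("SRR5085013")
print(cmds)

# To actually run them, stopping at the first failure (assumes a Unix-like shell):
# for (cmd in cmds) if (system(cmd) != 0) stop("Command failed: ", cmd)
```

Stopping at the first non-zero exit status means a failed download or failed compression never reaches the `rm` step, which matters when disk space is tight and re-downloading a large file is slow.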

Will update soon with experimental info regarding each SRA.
