MethylExtract throws an error, saying the normal Bismark sam flags arguments, flagW=99,147 flagC=83,163, aren’t correct. So This should cruise through and look for what flags are present in the sam files.

I use awk to grab the 2nd column of the .sam file, which is the flag column, then I iterate through, looking for the different unique values in each row.

setwd("~/Documents/C-virginica-BSSeq/MethylExtract/1/")
list.files(path = "." ,pattern = "*.sam") -> samfiles
flaguniques <- (unique(system(paste0("awk '{ print $2}' ", samfiles[i]), intern = TRUE)))
for(i in 2:length(samfiles))   {
    
flaguniques <- rbind(flaguniques, unique(system(paste0("awk '{ print $2}' ", samfiles[i]), intern = TRUE)))
  
}
head(flaguniques)
            [,1] [,2]
flaguniques "16" "0" 
            "0"  "16"
            "16" "0" 
            "16" "0" 
            "16" "0" 
            "0"  "16"
unique(flaguniques)
            [,1] [,2]
flaguniques "16" "0" 
            "0"  "16"

Looks like there are just 0 and 16. If I read the Samtools manual correctly, 0 is the forward strand flag, and 16 is the reverse strand flag, so this is looking like single ended data after Bismark is done with it.

Next just for fun I go through and check how many individual flags there are per read, just to make sure there isn’t some strangeness. Looks like there are only two options, so that’s a good sign.

uniquelength <- length(unique(as.numeric(flaguniques[1,])))
for(i in 2:dim(flaguniques)[1])   {
  uniquelength <- rbind(uniquelength, length(unique(as.numeric(flaguniques[i,]))))
}
head(uniquelength)
             [,1]
uniquelength    2
                2
                2
                2
                2
                2
unique(uniquelength)
             [,1]
uniquelength    2
LS0tCnRpdGxlOiAiTG9va2luZyBhdCBTYW0gRmxhZ3MiCm91dHB1dDogaHRtbF9ub3RlYm9vawotLS0KCgpNZXRoeWxFeHRyYWN0IHRocm93cyBhbiBlcnJvciwgc2F5aW5nIHRoZSBub3JtYWwgQmlzbWFyayBzYW0gZmxhZ3MgYXJndW1lbnRzLCBmbGFnVz05OSwxNDcgZmxhZ0M9ODMsMTYzLCBhcmVuJ3QgY29ycmVjdC4gU28gVGhpcyBzaG91bGQgY3J1aXNlIHRocm91Z2ggYW5kIGxvb2sgZm9yIHdoYXQgZmxhZ3MgYXJlIHByZXNlbnQgaW4gdGhlIHNhbSBmaWxlcy4KCkkgdXNlIGF3ayB0byBncmFiIHRoZSAybmQgY29sdW1uIG9mIHRoZSAuc2FtIGZpbGUsIHdoaWNoIGlzIHRoZSBmbGFnIGNvbHVtbiwgdGhlbiBJIGl0ZXJhdGUgdGhyb3VnaCwgbG9va2luZyBmb3IgdGhlIGRpZmZlcmVudCB1bmlxdWUgdmFsdWVzIGluIGVhY2ggcm93LgoKYGBge3J9CgoKc2V0d2QoIn4vRG9jdW1lbnRzL0MtdmlyZ2luaWNhLUJTU2VxL01ldGh5bEV4dHJhY3QvMS8iKQoKbGlzdC5maWxlcyhwYXRoID0gIi4iICxwYXR0ZXJuID0gIiouc2FtIikgLT4gc2FtZmlsZXMKCmZsYWd1bmlxdWVzIDwtICh1bmlxdWUoc3lzdGVtKHBhc3RlMCgiYXdrICd7IHByaW50ICQyfScgIiwgc2FtZmlsZXNbaV0pLCBpbnRlcm4gPSBUUlVFKSkpCgoKCmZvcihpIGluIDI6bGVuZ3RoKHNhbWZpbGVzKSkgICB7CgogICAgCmZsYWd1bmlxdWVzIDwtIHJiaW5kKGZsYWd1bmlxdWVzLCB1bmlxdWUoc3lzdGVtKHBhc3RlMCgiYXdrICd7IHByaW50ICQyfScgIiwgc2FtZmlsZXNbaV0pLCBpbnRlcm4gPSBUUlVFKSkpCiAgCn0KCmhlYWQoZmxhZ3VuaXF1ZXMpCnVuaXF1ZShmbGFndW5pcXVlcykKYGBgCgoKTG9va3MgbGlrZSB0aGVyZSBhcmUganVzdCAwIGFuZCAxNi4gSWYgSSByZWFkIHRoZSBTYW10b29scyBtYW51YWwgY29ycmVjdGx5LCAwIGlzIHRoZSBmb3J3YXJkIHN0cmFuZCBmbGFnLCBhbmQgMTYgaXMgdGhlIHJldmVyc2Ugc3RyYW5kIGZsYWcsIHNvIHRoaXMgaXMgbG9va2luZyBsaWtlIHNpbmdsZSBlbmRlZCBkYXRhIGFmdGVyIEJpc21hcmsgaXMgZG9uZSB3aXRoIGl0LgoKCk5leHQganVzdCBmb3IgZnVuIEkgZ28gdGhyb3VnaCBhbmQgY2hlY2sgaG93IG1hbnkgaW5kaXZpZHVhbCBmbGFncyB0aGVyZSBhcmUgcGVyIHJlYWQsIGp1c3QgdG8gbWFrZSBzdXJlIHRoZXJlIGlzbid0IHNvbWUgc3RyYW5nZW5lc3MuIExvb2tzIGxpa2UgdGhlcmUgYXJlIG9ubHkgdHdvIG9wdGlvbnMsIHNvIHRoYXQncyBhIGdvb2Qgc2lnbi4KCmBgYHtyfQoKdW5pcXVlbGVuZ3RoIDwtIGxlbmd0aCh1bmlxdWUoYXMubnVtZXJpYyhmbGFndW5pcXVlc1sxLF0pKSkKCmZvcihpIGluIDI6ZGltKGZsYWd1bmlxdWVzKVsxXSkgICB7CgogIHVuaXF1ZWxlbmd0aCA8LSByYmluZCh1bmlxdWVsZW5ndGgsIGxlbmd0aCh1bmlxdWUoYXMubnVtZXJpYyhmbGFndW5pcXVlc1tpLF0pKSkpCgp9CgpoZWFkKHVuaXF1ZWxlbmd0aCkKdW5pcXVlKHVuaXF1ZWxlbmd0aCkKYGBgCgo=