MethylExtract throws an error, saying the normal Bismark sam flags arguments, flagW=99,147 flagC=83,163, aren’t correct. So This should cruise through and look for what flags are present in the sam files.
I use awk to grab the 2nd column of the .sam file, which is the flag column, then I iterate through, looking for the different unique values in each row.
setwd("~/Documents/C-virginica-BSSeq/MethylExtract/1/")
list.files(path = "." ,pattern = "*.sam") -> samfiles
flaguniques <- (unique(system(paste0("awk '{ print $2}' ", samfiles[i]), intern = TRUE)))
for(i in 2:length(samfiles)) {
flaguniques <- rbind(flaguniques, unique(system(paste0("awk '{ print $2}' ", samfiles[i]), intern = TRUE)))
}
head(flaguniques)
[,1] [,2]
flaguniques "16" "0"
"0" "16"
"16" "0"
"16" "0"
"16" "0"
"0" "16"
unique(flaguniques)
[,1] [,2]
flaguniques "16" "0"
"0" "16"
Looks like there are just 0 and 16. If I read the Samtools manual correctly, 0 is the forward strand flag, and 16 is the reverse strand flag, so this is looking like single ended data after Bismark is done with it.
Next just for fun I go through and check how many individual flags there are per read, just to make sure there isn’t some strangeness. Looks like there are only two options, so that’s a good sign.
uniquelength <- length(unique(as.numeric(flaguniques[1,])))
for(i in 2:dim(flaguniques)[1]) {
uniquelength <- rbind(uniquelength, length(unique(as.numeric(flaguniques[i,]))))
}
head(uniquelength)
[,1]
uniquelength 2
2
2
2
2
2
unique(uniquelength)
[,1]
uniquelength 2