library(ggplot2)
Referring to: http://sape.inf.usi.ch/quick-reference/ggplot2/geom_rect
We want relative range on the x axis and reads on the y axis. Let's try with just one category first, because there is A LOT of data.
geom_rect requires a continuous scale on the y axis, but read count is not continuous. Trying: 1) geom_errorbarh:
ggplot() +
scale_x_continuous(name = "relative position") +
geom_errorbarh(data = rel_range_graf, mapping = aes(xmin=rel_start, xmax=rel_end, y=log_reads, color=learning))
Fig 1. All uniquely mapping reads across all three experiments using log scale. It is rather overwhelming.
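Since rel_range_graf itself is not shown here, a tiny made-up stand-in helps to see what the plot needs: one row per range, with a start, an end and a y value. This is only a sketch under assumptions: every value, the learning labels "yes"/"no" and the use of log10 are invented for illustration. It also swaps in geom_segment, which is another way to draw a horizontal span at a given y.

```r
library(ggplot2)

# Toy stand-in for rel_range_graf; all values are invented for illustration.
toy <- data.frame(
  rel_start = c(0.05, 0.10, 0.30, 0.45, 0.60, 0.20),
  rel_end   = c(0.40, 0.55, 0.70, 0.90, 0.95, 0.35),
  uniq_map  = c(1, 1, 3, 12, 150, 2),
  learning  = c("yes", "no", "yes", "no", "yes", "no"),
  experi    = c(3962, 3962, 4024, 4024, 4049, 4049)
)

# Assumption: log_reads is a log of the read count; the base used in the
# real data is not stated, log10 is picked here only as an example.
toy$log_reads <- log10(toy$uniq_map)

# geom_segment draws each range as a line from rel_start to rel_end at height y.
p <- ggplot(toy) +
  geom_segment(aes(x = rel_start, xend = rel_end,
                   y = log_reads, yend = log_reads,
                   color = learning)) +
  scale_x_continuous(name = "relative position")
```

Printing p draws the plot. Unlike geom_errorbarh, geom_segment has no end caps, which may look a little cleaner when many ranges overlap.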
Plotting inclusion level of the same data. I think it is less clear than the log scale.
ggplot() +
scale_x_continuous(name = "relative position") +
geom_errorbarh(data = rel_range_graf, mapping = aes(xmin=rel_start, xmax=rel_end, y=inc_lvl, color=learning))
Fig 2. All uniquely mapping reads across all three experiments using inclusion level as scale. It is also rather overwhelming.
Giving each experiment its own graph to make it less busy looking.
graf_3962 <- rel_range_graf[rel_range_graf$experi == 3962,]
ggplot() +
scale_x_continuous(name = "relative position") +
geom_errorbarh(data = graf_3962, mapping = aes(xmin=rel_start, xmax=rel_end, y=log_reads, color=learning))
Fig 3. Experiment 3962, log scale, all reads
graf_4024 <- rel_range_graf[rel_range_graf$experi == 4024,]
ggplot() +
scale_x_continuous(name = "relative position") +
geom_errorbarh(data = graf_4024, mapping = aes(xmin=rel_start, xmax=rel_end, y=log_reads, color=learning))
Fig 4. Experiment 4024, log scale, all reads
graf_4049 <- rel_range_graf[rel_range_graf$experi == 4049,]
ggplot() +
scale_x_continuous(name = "relative position") +
geom_errorbarh(data = graf_4049, mapping = aes(xmin=rel_start, xmax=rel_end, y=log_reads, color=learning))
Fig 5. Experiment 4049, log scale, all reads
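The three per-experiment plots above could probably also be drawn in one call by faceting on experi. A minimal sketch, assuming experi holds the experiment number on every row; the toy data below is invented and only stands in for rel_range_graf.

```r
library(ggplot2)

# Toy stand-in for rel_range_graf; values are invented for illustration.
toy <- data.frame(
  rel_start = c(0.05, 0.10, 0.30, 0.45, 0.60, 0.20),
  rel_end   = c(0.40, 0.55, 0.70, 0.90, 0.95, 0.35),
  log_reads = c(0.0, 0.3, 0.5, 1.1, 2.2, 0.3),
  learning  = c("yes", "no", "yes", "no", "yes", "no"),
  experi    = c(3962, 3962, 4024, 4024, 4049, 4049)
)

# facet_wrap makes one panel per experiment; ncol = 1 stacks the panels
# so they share the same x axis and are easy to compare.
p <- ggplot(toy) +
  scale_x_continuous(name = "relative position") +
  geom_errorbarh(aes(xmin = rel_start, xmax = rel_end,
                     y = log_reads, color = learning)) +
  facet_wrap(~ experi, ncol = 1)
```

This keeps the same x and color scales across panels, which separate plots do not guarantee.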
All of these are still pretty unclear. Checking which range of read depths is the most dense.
hist(rel_range_graf$uniq_map)
Fig 6. Histogram of read depths. (Also tried log read depths and inclusion level, but the first few graphs I looked at were awful.)
The densest range seems to be the first 500.
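To put numbers on what the histogram suggests, the depths can be counted per bin and summarised with quantiles. A small sketch only: the depth vector and the bin edges below are made up, standing in for rel_range_graf$uniq_map.

```r
# Hypothetical read depths standing in for rel_range_graf$uniq_map.
uniq_map <- c(1, 1, 1, 2, 3, 5, 8, 40, 120, 700)

# Count values per depth bin; right = FALSE gives half-open bins like [1,2).
# The bin edges are arbitrary choices for this example.
bins <- cut(uniq_map, breaks = c(1, 2, 5, 50, 500, Inf), right = FALSE)
depth_counts <- table(bins)
print(depth_counts)

# Quantiles give a second view of where most of the values sit.
depth_quantiles <- quantile(uniq_map, probs = c(0.25, 0.5, 0.75, 0.9))
print(depth_quantiles)
```

Uneven bin edges like these make it easier to see the low end, where a plain hist() call lumps almost everything into the first bar.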
lowerend <- rel_range_graf[rel_range_graf$uniq_map <50, ]
ggplot() +
scale_x_continuous(name = "relative position") +
geom_errorbarh(data = lowerend, mapping = aes(xmin=rel_start, xmax=rel_end, y=uniq_map, color=learning))
Fig 7. Read depth of <50 for all experiments. Still too crowded.
Adding a random jitter to unique mapped reads:
rand <- rnorm(nrow(rel_range_graf), mean = 0.5, sd = 0.4)  # one random offset per row
rand[1:25]
## [1] 0.31588560 -0.01084972 0.06708863 0.38454568 0.39248394 0.81113734
## [7] -0.06691915 0.09114095 0.69412247 0.38149500 0.57286811 0.58457901
## [13] 0.35643392 0.33946208 0.68803350 0.66200929 0.12318522 0.20718815
## [19] -0.04204559 0.72366686 -0.02292886 1.08402959 0.58552021 0.74795928
## [25] 0.18701188
rel_range_graf$uniq_map_diff <- rel_range_graf$uniq_map + rand
lowerend2 <- rel_range_graf[rel_range_graf$uniq_map <50, ]
Plotting the “jittered” tracks:
ggplot() +
scale_x_continuous(name = "relative position") +
geom_errorbarh(data = lowerend2, mapping = aes(xmin=rel_start, xmax=rel_end, y=uniq_map_diff, color=learning))
Fig 8. Jittering has separated the tracks out quite a lot, but it is still far too busy.
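Because the offsets come from rnorm, the jittered plot changes on every run. A sketch of one way to make it repeatable, assuming a fixed seed is acceptable; the seed value and the toy depths are arbitrary and not from the original analysis.

```r
# Arbitrary seed, chosen only so the example gives the same offsets each run.
set.seed(1)

# Toy stand-in for the uniq_map column of rel_range_graf.
toy <- data.frame(uniq_map = c(1, 1, 1, 2, 2, 5))

# One offset per row, drawn from the same distribution as in the text
# (mean 0.5, sd 0.4); nrow() keeps the length in step with the data.
toy$uniq_map_diff <- toy$uniq_map + rnorm(nrow(toy), mean = 0.5, sd = 0.4)
```

Note that with mean = 0.5 the offsets shift the tracks upwards on average rather than scattering them evenly around the true depth.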
Trying much lower range of read depths:
lowerend1 <- rel_range_graf[rel_range_graf$uniq_map <5, ]
ggplot() +
scale_x_continuous(name = "relative position") +
geom_errorbarh(data = lowerend1, mapping = aes(xmin=rel_start, xmax=rel_end, y=uniq_map_diff, color=learning))
Fig 9. Jittered tracks for reads <5. Still not good.
Options: could use this method and split by experiment or learning
OR:
Change tactic completely!
Many SJs have a read depth of 1, and they overlap far too much.
So, plotting only those with uniquely mapped reads = 1 across all experiments, with an arbitrary ID on the y axis.
single_reads <- rel_range_graf[rel_range_graf$uniq_map ==1, ]
single_reads$ID <- seq_len(nrow(single_reads))  # arbitrary ID, one per row
ggplot() +
scale_x_continuous(name = "relative position") +
geom_errorbarh(data = single_reads, mapping = aes(xmin=rel_start, xmax=rel_end, y=ID, color=learning))
Fig 10. Beautiful!! Uniquely mapped reads = 1 for all experiments, not scaled on the y axis, just positioned by ID.
But that is only a small portion of the data.
Alternatively:
low_reads_3962 <- graf_3962[graf_3962$uniq_map<=5,]
low_reads_3962$ID <- seq_len(nrow(low_reads_3962))  # arbitrary ID, one per row
ggplot() +
scale_x_continuous(name = "relative position") +
geom_errorbarh(data = low_reads_3962, mapping = aes(xmin=rel_start, xmax=rel_end, y=ID, color=uniq_map))
Fig 11. All reads with a depth of 5 or lower for only one experiment.
Maybe this will look better if the data is sorted:
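testing <- low_reads_3962[,-10]  # drop column 10 before re-sorting
low_reads_3962_a <- testing[order(testing$uniq_map),]  # sort by read depth
low_reads_3962_a$ID <- seq_len(nrow(low_reads_3962_a))  # fresh IDs in sorted order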
ggplot() +
scale_x_continuous(name = "relative position") +
geom_errorbarh(data = low_reads_3962_a, mapping = aes(xmin=rel_start, xmax=rel_end, y=ID, color=uniq_map))
Fig 12. One experiment ordered by read depth. Essentially the same thing as above, but tidier.
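Within one read depth the tracks still come out in whatever order the rows had. A possible refinement, sketched on invented data, is to break ties by start position so each depth band looks like a staircase. The secondary sort key is a suggestion, not something from the original analysis.

```r
# Toy stand-in for low_reads_3962; values are invented for illustration.
toy <- data.frame(
  rel_start = c(0.30, 0.10, 0.50, 0.05),
  rel_end   = c(0.70, 0.40, 0.90, 0.60),
  uniq_map  = c(3, 1, 3, 2)
)

# Sort by read depth first, then by start position to break ties.
sorted <- toy[order(toy$uniq_map, toy$rel_start), ]

# seq_len(nrow()) adapts to however many rows survive the filtering,
# so no row count needs to be typed in by hand.
sorted$ID <- seq_len(nrow(sorted))
rownames(sorted) <- NULL
```

The sorted data frame can then go straight into the same geom_errorbarh call with y = ID.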
Adding a bit more info:
ggplot() +
scale_x_continuous(name = "relative position") +
geom_errorbarh(data = low_reads_3962_a, mapping = aes(xmin=rel_start, xmax=rel_end, y=ID, color=uniq_map, linetype=learning))
Fig 13. As above but with learning indicated by different line types. Doesn’t look as neat, but gives more information. Can work on the clarity of the line types.
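One way to work on the clarity of the line types might be to pick them by hand with scale_linetype_manual. A minimal sketch, assuming learning has two levels; the labels "yes"/"no", the data values and the chosen line types are all invented for illustration.

```r
library(ggplot2)

# Toy stand-in for low_reads_3962_a; values are invented for illustration.
toy <- data.frame(
  rel_start = c(0.10, 0.05, 0.30, 0.50),
  rel_end   = c(0.40, 0.60, 0.70, 0.90),
  uniq_map  = c(1, 2, 3, 3),
  learning  = c("yes", "no", "yes", "no"),
  ID        = 1:4
)

# scale_linetype_manual assigns the line types explicitly, in the order
# of the levels of learning; solid vs dashed is an arbitrary choice here.
p <- ggplot(toy) +
  scale_x_continuous(name = "relative position") +
  geom_errorbarh(aes(xmin = rel_start, xmax = rel_end, y = ID,
                     color = uniq_map, linetype = learning)) +
  scale_linetype_manual(values = c("solid", "dashed"))
```

With strongly contrasting line types the two learning groups should be easier to tell apart than with the defaults.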