This is the data analysis from 2015-11-16. MDM samples with or without RTX and RAL were sequenced at JHU, -/+ UDG, on a HiSeq in Rapid Mode.
| sample.id | sample.name | cell.type | aligned.reads | hiv | md.read | rmdup.read | discordant.reads | chimeric.reads |
|---|---|---|---|---|---|---|---|---|
| JS207 | MDM no Virus | MDM | 902439 | 6305 | 16713 | 33479 | 20 | 0 |
| JS208 | MDM day 7 | MDM | 136756172 | 22192538 | 1800863 | 5405854 | 87266 | 27585 |
| JS209 | MDM day 14 | MDM | 111212 | 55495 | 4145 | 5198 | 76 | 0 |
| JS210 | v:DPI1 | MDM | 99222555 | 11213519 | 2017310 | 4071983 | 64154 | 44017 |
| JS211 | v:DPI1:RAL | MDM | 14719112 | 2336993 | 1590108 | 2604808 | 9237 | 2114 |
| JS212 | v:DPI30 | MDM | 8831602 | 581421 | 449434 | 689063 | 4435 | 108 |
| JS213 | v:DPI30:RAL | MDM | 6876314 | 1028346 | 401716 | 573430 | 2026 | 144 |
| JS214 | MDM no Virus:UDG | MDM | 16691 | 189 | 4232 | 4999 | 0 | 0 |
| JS215 | MDM day 7:UDG | MDM | 41395460 | 4656825 | 6040141 | 9276311 | 15870 | 9116 |
| JS216 | MDM day 14:UDG | MDM | 72602 | 6415 | 51439 | 53487 | 18 | 0 |
| JS217 | v:DPI1:UDG | MDM | 48085234 | 1627078 | 16719928 | 24108526 | 8637 | 3640 |
| JS218 | v:DPI1:RAL:UDG | MDM | 41539677 | 424954 | 19974111 | 29527517 | 2094 | 193 |
| JS219 | v:DPI30:UDG | MDM | 8702840 | 90062 | 4386910 | 5685454 | 574 | 30 |
| JS220 | v:DPI30:RAL:UDG | MDM | 8284670 | 170694 | 4414203 | 5712800 | 343 | 27 |
Samples from the previous sample set that were treated with RTX +/- RAL were added to this sample set to get enough DNA for the lockdown protocol. They were mixed with the day 0 no virus, day 1 and day 14 samples. We see very few reads in day 0 and day 14 (JS207, JS209). For day 0 this makes sense, as there is no virus in this sample and nothing for the lockdown to capture. For day 14 it is unclear whether the low read count is due to low HIV infection or to a poor library. Prior to lockdown, the library DNA amounts were higher in day 0 and day 14 than in day 7. Equal amounts of all libraries were mixed prior to the lockdown protocol.
We see a high percentage of HIV reads in the day 14 sample. This could be due to the low number of aligned reads in this sample. I checked this number manually and ~50% of the aligned reads were HIV (55,495 of 111,212 in JS209). We also see about 15% in day 7 as well as in the RTX samples. This is much better than the 2% that we were seeing in the last sequencing run, indicating that the low percentage was likely due to mixing with HT29 samples, which have a much higher infection rate.
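The percentages quoted above can be re-derived from the summary table. This is a minimal sketch that assumes the percentage is simply `hiv` divided by `aligned.reads`; the function name `percent_hiv` is mine and only a subset of samples is included.

```python
# (aligned.reads, hiv) copied from the summary table for a subset of samples.
counts = {
    "JS207": (902439, 6305),          # MDM no Virus
    "JS208": (136756172, 22192538),   # MDM day 7
    "JS209": (111212, 55495),         # MDM day 14
    "JS210": (99222555, 11213519),    # v:DPI1
    "JS211": (14719112, 2336993),     # v:DPI1:RAL
}


def percent_hiv(aligned, hiv):
    """Return HIV reads as a percentage of aligned reads (assumed definition)."""
    return 100.0 * hiv / aligned


hiv_pct = {sid: percent_hiv(a, h) for sid, (a, h) in counts.items()}
for sid, pct in hiv_pct.items():
    print(f"{sid}: {pct:.1f}% of aligned reads are HIV")
```

With these counts, JS209 comes out just under 50% and JS208 at roughly 16%, in line with the manual check.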
This data shows the PCR duplication present in each sample. Since we use lockdown for HIV, which is only about 9500 bp, we expect and often see a high level of duplication, especially in day 7 since there are so many HIV reads (22M). In this data we see that between 5% and 80% of our reads are unique. Mark duplicates removes more reads in every sample than remove duplicates (rmdup) does. This is probably because the program takes into account differences across the whole read, not just the end base.
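To compare the two deduplication methods side by side, here is a rough sketch. It assumes that `md.read` and `rmdup.read` are the reads retained after each method and that `aligned.reads` is the denominator; the plotted figures may have used a different denominator, so the exact percentages may not match them.

```python
# (aligned.reads, md.read, rmdup.read) copied from the summary table.
reads = {
    "JS208": (136756172, 1800863, 5405854),    # MDM day 7
    "JS215": (41395460, 6040141, 9276311),     # MDM day 7:UDG
    "JS216": (72602, 51439, 53487),            # MDM day 14:UDG
    "JS217": (48085234, 16719928, 24108526),   # v:DPI1:UDG
}


def retained_pct(aligned, kept):
    """Percentage of aligned reads kept after deduplication (assumed definition)."""
    return 100.0 * kept / aligned


summary = {}
for sid, (aligned, md, rmdup) in reads.items():
    summary[sid] = (retained_pct(aligned, md), retained_pct(aligned, rmdup))
    md_pct, rmdup_pct = summary[sid]
    print(f"{sid}: mark-duplicates keeps {md_pct:.1f}%, rmdup keeps {rmdup_pct:.1f}%")
```

In every sample listed, the mark-duplicates percentage is lower than the rmdup percentage, which is the pattern described above.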
This coverage pattern looks similar to what we have seen before: the UDG-treated samples have decreased reads across the HIV genome, indicating that uracil is incorporated throughout. In day 0 and day 14 there aren’t enough HIV reads to clearly see any pattern.
This normalization looks horrible and is probably not what we want to use.
This normalization looks much better and makes more sense to me.
We can see chimeras and discordant reads in day 7 but only discordant reads in day 14 (the chimeric count is 0 for JS209 and JS216). The ability to see discordant reads is always higher, but why we see a lot of discordant reads in day 14 and no chimeras is harder to understand. The high discordant percentage is probably due to normalizing to a small number of reads. I wonder if there are some false positives in the day 14 discordant data.
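To illustrate why a percentage built on few reads is shaky, here is a small sketch. The denominator (`aligned.reads`) and the Poisson-style relative error `1/sqrt(count)` are my assumptions for illustration, not the normalization used for the plots.

```python
import math

# (aligned.reads, discordant.reads, chimeric.reads) from the summary table.
samples = {
    "JS208": (136756172, 87266, 27585),   # MDM day 7
    "JS209": (111212, 76, 0),             # MDM day 14
    "JS215": (41395460, 15870, 9116),     # MDM day 7:UDG
    "JS216": (72602, 18, 0),              # MDM day 14:UDG
}


def pct_with_noise(count, denominator):
    """Return (percentage of denominator, rough relative counting error).

    The relative error 1/sqrt(count) assumes Poisson-distributed counts
    (an illustrative assumption); it is NaN for a zero count.
    """
    pct = 100.0 * count / denominator
    rel_err = 1.0 / math.sqrt(count) if count > 0 else float("nan")
    return pct, rel_err


discordant = {sid: pct_with_noise(d, a) for sid, (a, d, _c) in samples.items()}
for sid, (pct, rel) in discordant.items():
    print(f"{sid}: discordant = {pct:.4f}% of aligned reads, "
          f"relative error ~{100 * rel:.1f}%")
```

Under these assumptions the day 14 values rest on tens of reads, so their relative uncertainty is more than ten times that of day 7, and a handful of false-positive pairs would move the percentage noticeably.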
| chr | pos | ref | mut | depth | sample.id |
|---|---|---|---|---|---|
| HIV | 158 | A | T | DP=298 | JS208 |
| HIV | 164 | GCAA | TTAT | DP=270 | JS208 |
| HIV | 184 | A | C | DP=315 | JS208 |
| HIV | 233 | G | T | DP=252 | JS208 |
| HIV | 343 | G | T | DP=203 | JS208 |
| HIV | 381 | C | T | DP=223 | JS208 |
| HIV | 606 | C | T | DP=6652 | JS208 |
| HIV | 693 | A | T | DP=10184 | JS208 |
| HIV | 887 | A | T | DP=3989 | JS208 |
| HIV | 1143 | A | T | DP=3519 | JS208 |
| HIV | 1348 | A | T | DP=6182 | JS208 |
| HIV | 1481 | C | T | DP=5163 | JS208 |
| HIV | 1552 | C | T | DP=4831 | JS208 |
| HIV | 1553 | C | T | DP=4857 | JS208 |
| HIV | 1663 | C | T | DP=4992 | JS208 |
| HIV | 1772 | C | T | DP=5311 | JS208 |
| HIV | 2256 | C | T | DP=3948 | JS208 |
| HIV | 2470 | G | T | DP=2036 | JS208 |
| HIV | 2562 | A | C | DP=11900 | JS208 |
| HIV | 2846 | G | T | DP=5947 | JS208 |
| HIV | 3021 | G | T | DP=3468 | JS208 |
| HIV | 3244 | A | C | DP=8215 | JS208 |
| HIV | 3375 | G | T | DP=2555 | JS208 |
| HIV | 3834 | T | C | DP=3794 | JS208 |
| HIV | 4464 | G | T | DP=2910 | JS208 |
| HIV | 4754 | A | T | DP=2384 | JS208 |
| HIV | 5252 | G | C | DP=9218 | JS208 |
| HIV | 5328 | A | T | DP=8488 | JS208 |
| HIV | 5410 | A | T | DP=12908 | JS208 |
| HIV | 5814 | C | T | DP=8882 | JS208 |
| HIV | 6084 | A | T | DP=3437 | JS208 |
| HIV | 7022 | G | T | DP=4115 | JS208 |
| HIV | 7209 | A | T | DP=8923 | JS208 |
| HIV | 7450 | C | T | DP=3014 | JS208 |
| HIV | 7696 | A | C | DP=3870 | JS208 |
| HIV | 7964 | A | T | DP=10489 | JS208 |
| HIV | 8109 | A | C | DP=7651 | JS208 |
| HIV | 8531 | G | C | DP=8854 | JS208 |
| HIV | 8647 | G | T | DP=6903 | JS208 |
| HIV | 8691 | G | T | DP=8851 | JS208 |
| HIV | 8837 | G | T | DP=3479 | JS208 |
| HIV | 8926 | G | T | DP=19395 | JS208 |
| HIV | 1757 | TCC | TC | DP=2 | JS214 |
| HIV | 144 | T | C | DP=104 | JS215 |
| HIV | 158 | A | T | DP=108 | JS215 |
| HIV | 164 | GCAA | TTAT | DP=100 | JS215 |
| HIV | 184 | A | C | DP=113 | JS215 |
| HIV | 343 | G | T | DP=44 | JS215 |
| HIV | 381 | C | T | DP=42 | JS215 |
| HIV | 944 | A | T | DP=1509 | JS215 |
| HIV | 1314 | A | C | DP=1065 | JS215 |
| HIV | 1504 | A | T | DP=1475 | JS215 |
| HIV | 1921 | A | T | DP=1084 | JS215 |
| HIV | 2827 | G | T | DP=783 | JS215 |
| HIV | 3202 | A | T | DP=1424 | JS215 |
| HIV | 3373 | A | T | DP=1793 | JS215 |
| HIV | 3397 | T | C | DP=3465 | JS215 |
| HIV | 3489 | G | C | DP=3310 | JS215 |
| HIV | 3574 | A | C | DP=2473 | JS215 |
| HIV | 3587 | T | A | DP=3045 | JS215 |
| HIV | 4018 | G | T | DP=1972 | JS215 |
| HIV | 4902 | G | T | DP=1530 | JS215 |
| HIV | 5633 | A | T | DP=2678 | JS215 |
| HIV | 5730 | G | T | DP=2602 | JS215 |
| HIV | 5887 | A | T | DP=3159 | JS215 |
| HIV | 6023 | A | T | DP=2020 | JS215 |
| HIV | 6347 | C | T | DP=1586 | JS215 |
| HIV | 6591 | G | T | DP=1741 | JS215 |
| HIV | 6695 | A | T | DP=1304 | JS215 |
| HIV | 7669 | A | T | DP=1875 | JS215 |
| HIV | 8206 | A | T | DP=1832 | JS215 |
| HIV | 8215 | G | T | DP=2256 | JS215 |
| HIV | 8408 | A | C | DP=1952 | JS215 |
| HIV | 8711 | G | T | DP=2130 | JS215 |
| chr | pos | ref | mut | depth | sample.id |
|---|---|---|---|---|---|
| HIV | 144 | T | C | DP=104 | JS215 |
| HIV | 158 | A | T | DP=298 | JS208 |
| HIV | 158 | A | T | DP=43 | JS211 |
| HIV | 158 | A | T | DP=108 | JS215 |
| HIV | 164 | GCAA | TTAT | DP=270 | JS208 |
| HIV | 164 | GCAA | TTAT | DP=36 | JS211 |
| HIV | 164 | GCAA | TTAT | DP=100 | JS215 |
| HIV | 183 | G | A | DP=86 | JS210 |
| HIV | 183 | G | A | DP=43 | JS211 |
| HIV | 184 | A | C | DP=315 | JS208 |
| HIV | 184 | A | C | DP=45 | JS211 |
| HIV | 184 | A | C | DP=18 | JS212 |
| HIV | 184 | A | C | DP=113 | JS215 |
| HIV | 196 | A | C | DP=73 | JS210 |
| HIV | 233 | G | T | DP=252 | JS208 |
| HIV | 286 | G | A | DP=111 | JS210 |
| HIV | 343 | G | T | DP=203 | JS208 |
| HIV | 343 | G | T | DP=164 | JS210 |
| HIV | 343 | G | T | DP=59 | JS211 |
| HIV | 343 | G | T | DP=12 | JS212 |
| HIV | 343 | G | T | DP=22 | JS213 |
| HIV | 343 | G | T | DP=44 | JS215 |
| HIV | 343 | G | T | DP=27 | JS217 |
| HIV | 343 | G | T | DP=17 | JS218 |
| HIV | 343 | G | T | DP=2 | JS220 |
| HIV | 375 | G | T | DP=2 | JS220 |
| HIV | 381 | C | T | DP=223 | JS208 |
| HIV | 381 | C | T | DP=210 | JS210 |
| HIV | 381 | C | T | DP=35 | JS213 |
| HIV | 381 | C | T | DP=42 | JS215 |
| HIV | 606 | C | T | DP=6652 | JS208 |
| HIV | 693 | A | T | DP=10184 | JS208 |
| HIV | 887 | A | T | DP=3989 | JS208 |
| HIV | 944 | A | T | DP=1509 | JS215 |
| HIV | 1143 | A | T | DP=3519 | JS208 |
| HIV | 1314 | A | C | DP=1065 | JS215 |
| HIV | 1348 | A | T | DP=6182 | JS208 |
| HIV | 1480 | C | T | DP=1111 | JS218 |
| HIV | 1481 | C | T | DP=5163 | JS208 |
| HIV | 1504 | A | T | DP=1475 | JS215 |
| HIV | 1552 | C | T | DP=4831 | JS208 |
| HIV | 1553 | C | T | DP=4857 | JS208 |
| HIV | 1663 | C | T | DP=4992 | JS208 |
| HIV | 1757 | TCC | TC | DP=2 | JS214 |
| HIV | 1772 | C | T | DP=5311 | JS208 |
| HIV | 1921 | A | T | DP=1084 | JS215 |
| HIV | 2243 | C | T | DP=3925 | JS210 |
| HIV | 2256 | C | T | DP=3948 | JS208 |
| HIV | 2470 | G | T | DP=2036 | JS208 |
| HIV | 2550 | C | T | DP=1376 | JS211 |
| HIV | 2562 | A | C | DP=11900 | JS208 |
| HIV | 2827 | G | T | DP=783 | JS215 |
| HIV | 2846 | G | T | DP=5947 | JS208 |
| HIV | 3021 | G | T | DP=3468 | JS208 |
| HIV | 3202 | A | T | DP=1424 | JS215 |
| HIV | 3244 | A | C | DP=8215 | JS208 |
| HIV | 3373 | A | T | DP=1793 | JS215 |
| HIV | 3375 | G | T | DP=2555 | JS208 |
| HIV | 3397 | T | C | DP=3465 | JS215 |
| HIV | 3489 | G | C | DP=3310 | JS215 |
| HIV | 3574 | A | C | DP=2473 | JS215 |
| HIV | 3587 | T | A | DP=3045 | JS215 |
| HIV | 3834 | T | C | DP=3794 | JS208 |
| HIV | 4018 | G | T | DP=1972 | JS215 |
| HIV | 4135 | A | G | DP=1334 | JS210 |
| HIV | 4190 | AGTA | AA | DP=1546 | JS210 |
| HIV | 4201 | T | A | DP=1901 | JS210 |
| HIV | 4237 | A | G | DP=504 | JS218 |
| HIV | 4343 | C | T | DP=1752 | JS210 |
| HIV | 4420 | A | C | DP=3080 | JS210 |
| HIV | 4464 | G | T | DP=2910 | JS208 |
| HIV | 4754 | A | T | DP=2384 | JS208 |
| HIV | 4768 | T | A | DP=731 | JS210 |
| HIV | 4859 | T | G | DP=2706 | JS210 |
| HIV | 4866 | T | A | DP=2795 | JS210 |
| HIV | 4902 | G | T | DP=1530 | JS215 |
| HIV | 4958 | G | T | DP=1510 | JS217 |
| HIV | 5252 | G | C | DP=9218 | JS208 |
| HIV | 5328 | A | T | DP=8488 | JS208 |
| HIV | 5410 | A | T | DP=12908 | JS208 |
| HIV | 5530 | C | G | DP=2544 | JS210 |
| HIV | 5633 | A | T | DP=2678 | JS215 |
| HIV | 5724 | G | T | DP=5657 | JS210 |
| HIV | 5730 | G | T | DP=2602 | JS215 |
| HIV | 5802 | G | A | DP=1242 | JS217 |
| HIV | 5814 | C | T | DP=8882 | JS208 |
| HIV | 5887 | A | T | DP=3159 | JS215 |
| HIV | 6023 | A | T | DP=2020 | JS215 |
| HIV | 6084 | A | T | DP=3437 | JS208 |
| HIV | 6233 | GAG | GG | DP=2476 | JS210 |
| HIV | 6347 | C | T | DP=1586 | JS215 |
| HIV | 6591 | G | T | DP=1741 | JS215 |
| HIV | 6695 | A | T | DP=1304 | JS215 |
| HIV | 6750 | T | A | DP=2027 | JS210 |
| HIV | 6755 | G | A | DP=2020 | JS210 |
| HIV | 7022 | G | T | DP=4115 | JS208 |
| HIV | 7209 | A | T | DP=8923 | JS208 |
| HIV | 7450 | C | T | DP=3014 | JS208 |
| HIV | 7669 | A | T | DP=1875 | JS215 |
| HIV | 7696 | A | C | DP=3870 | JS208 |
| HIV | 7964 | A | T | DP=10489 | JS208 |
| HIV | 8109 | A | C | DP=7651 | JS208 |
| HIV | 8206 | A | T | DP=1832 | JS215 |
| HIV | 8215 | G | T | DP=2256 | JS215 |
| HIV | 8408 | A | C | DP=1952 | JS215 |
| HIV | 8531 | G | C | DP=8854 | JS208 |
| HIV | 8647 | G | T | DP=6903 | JS208 |
| HIV | 8691 | G | T | DP=8851 | JS208 |
| HIV | 8711 | G | T | DP=2130 | JS215 |
| HIV | 8837 | G | T | DP=3479 | JS208 |
| HIV | 8926 | G | T | DP=19395 | JS208 |
| HIV | 9239 | G | T | DP=2335 | JS218 |
I see a lot of mutations in day 7, the sample with a lot of HIV in it. These samples have some level of hUNG activity, so mutations are not out of the question here. I used the rmdup data here, so PCR duplication shouldn’t be playing a major role. These mutations are at high depth and are mostly changes to T. Does this make sense with a post-replicative repair mechanism? The mutations seem relatively evenly distributed across the HIV genome, but I might want to look into the HIV genes. There is currently no way in this sample to tell if the same mutation is present on both strands.
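One way to put a number on "mostly changes to T" is to tally the substitution types. The sketch below uses only the first six JS208 rows from the table as input; `tally_substitutions` is a hypothetical helper, and multi-base calls such as GCAA to TTAT are skipped rather than decomposed.

```python
from collections import Counter

# First six JS208 calls from the mutation table, as (pos, ref, mut).
calls = [
    (158, "A", "T"),
    (164, "GCAA", "TTAT"),
    (184, "A", "C"),
    (233, "G", "T"),
    (343, "G", "T"),
    (381, "C", "T"),
]


def tally_substitutions(calls):
    """Count single-base substitutions as 'ref>mut'; skip multi-base calls."""
    tally = Counter()
    for _pos, ref, mut in calls:
        if len(ref) == 1 and len(mut) == 1:
            tally[f"{ref}>{mut}"] += 1
    return tally


tally = tally_substitutions(calls)
to_t = sum(n for change, n in tally.items() if change.endswith(">T"))
total = sum(tally.values())
print(dict(tally))
print(f"{to_t} of {total} single-base changes are to T")
```

Feeding in the full per-sample rows would also show whether the changes to T come mainly from A, C, or G, which may help with the repair-mechanism question.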
This data shows the enrichment of the 2 read types in each of 20 segmentation classes. The chimeric data is going to be more accurate here because we know the exact base where the integration is happening. In the discordant data we only know the area of the integration event, so the analysis looks for overlap with a large region (1000 bp). We are seeing a mild enrichment in elonW, gene bodies, and enh regions, so generally open chromatin, which makes some sense. We do see potential differences -/+ UDG, indicating that uracil may be inhibiting integration into Elon and Enh and enhancing integration into TSS (although no TSS in JS215 strikes me as strange).
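The difference in resolution between the two read types can be sketched as a single-base lookup versus a window lookup. Everything here is hypothetical: the segment coordinates, the class labels used as examples, and the choice to centre the 1000 bp window on the estimated position are assumptions, not the actual segmentation or pipeline.

```python
# Hypothetical segmentation intervals on one chromosome: (start, end, class),
# half-open coordinates. Real segmentations have 20 classes.
segments = [
    (0, 2000, "TSS"),
    (2000, 2600, "Enh"),
    (2600, 9000, "Elon"),
]


def classes_overlapping(start, end, segments):
    """Return classes of all segments overlapping the half-open interval [start, end)."""
    return [cls for s, e, cls in segments if s < end and start < e]


def chimeric_classes(site, segments):
    # A chimeric read pins the junction to a single base.
    return classes_overlapping(site, site + 1, segments)


def discordant_classes(center, segments, window=1000):
    # A discordant pair only localizes the event to a region; here a window
    # of `window` bp is centred on the estimated position (assumed layout).
    half = window // 2
    return classes_overlapping(max(0, center - half), center + half, segments)


site = 2300
print("chimeric:  ", chimeric_classes(site, segments))
print("discordant:", discordant_classes(site, segments))
```

In this toy layout the same position maps to one class as a chimeric site but to three classes as a discordant window, which is why the discordant enrichment is expected to be blurrier.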