This demo shows how position weight matrices (PWMs) are used to evaluate the impact of mutations on transcription factor binding sites (TFBS). The data and software used here are:

1. PWMs for the yeast TFs, downloaded from the [CIS-BP Database: Catalog of Inferred Sequence Binding Preferences](http://cisbp.ccbr.utoronto.ca/index.php).
2. The yeast genome sequence, downloaded from the [Saccharomyces Genome Database (SGD)](https://www.yeastgenome.org/).
3. The Bioconductor [Biostrings R package](https://bioconductor.org/packages/release/bioc/html/Biostrings.html).
## [1] "Number of loaded PWMs: 1440 - Length of loaded sequence: 230218bp."
## [1] "Length of loaded sequence: 30000bp."
##     BS_Start PWM_Length Score    PWM_name BS_End BS_Sequence
##  1:     2638         10  5.02 M00110_2.00   2647  GCGGAGGAAA
##  2:     7762         10  5.51 M00110_2.00   7771  CCGGAGTAAT
##  3:     9328         10  5.16 M00110_2.00   9337  TCGGAAAACT
##  4:    10062         10  5.01 M00110_2.00  10071  ACGGATAACT
##  5:    11130         10  5.00 M00110_2.00  11139  TTGGATAAAT
##  6:    13556         10  5.08 M00110_2.00  13565  TCGGGCAATC
##  7:    14786         10  5.02 M00110_2.00  14795  AAGGAAAAAT
##  8:    16608         10  5.28 M00110_2.00  16617  ACGGGTAAAC
##  9:    17925         10  5.21 M00110_2.00  17934  TCGGGAAAAG
## 10:    18888         10  5.26 M00110_2.00  18897  TCGGAAATAT
## 11:    20517         10  5.15 M00110_2.00  20526  GCGGTAAATT
## 12:    21031         10  5.26 M00110_2.00  21040  ACGGGAAAAA
## 13:    21078         10  5.44 M00110_2.00  21087  TCGGAGAAAG
## 14:    23775         10  5.53 M00110_2.00  23784  ACGGTGAAAT
## 15:    29404         10  5.04 M00110_2.00  29413  ACGGTAAAAA
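
The code that produced the hits above is not shown in the demo. As a minimal sketch of how such a scan can work, the base R example below slides a matrix along a sequence, sums the per-position weights of each window, and keeps the windows whose score reaches a cutoff. The toy matrix, the helper names `score_window` and `scan_sequence`, and the cutoff are illustrative assumptions, not the M00110_2.00 matrix or the demo's settings. For real data, Biostrings offers `matchPWM()` for the same purpose.

```r
# Toy 3-column weight matrix; the values are made up for illustration only.
pwm <- matrix(
  c( 1.0, -1.0, -1.0,   # A
    -1.0,  1.5, -1.0,   # C
    -1.0, -1.0,  2.0,   # G
    -1.0, -1.0, -1.0),  # T
  nrow = 4, byrow = TRUE,
  dimnames = list(c("A", "C", "G", "T"), NULL)
)

# Score one window: add up the weight of each base at its position.
score_window <- function(pwm, window) {
  bases <- strsplit(window, "")[[1]]
  sum(pwm[cbind(match(bases, rownames(pwm)), seq_along(bases))])
}

# Slide the matrix along the sequence and keep windows scoring >= min_score.
scan_sequence <- function(pwm, sequence, min_score) {
  width   <- ncol(pwm)
  stopifnot(nchar(sequence) >= width)
  starts  <- seq_len(nchar(sequence) - width + 1)
  windows <- substring(sequence, starts, starts + width - 1)
  scores  <- vapply(windows, score_window, numeric(1),
                    pwm = pwm, USE.NAMES = FALSE)
  keep <- scores >= min_score
  data.frame(BS_Start    = starts[keep],
             BS_End      = starts[keep] + width - 1,
             Score       = scores[keep],
             BS_Sequence = windows[keep],
             stringsAsFactors = FALSE)
}

# Hypothetical input sequence and cutoff.
hits <- scan_sequence(pwm, "TTACGTTACGA", min_score = 4)
print(hits)
```

The column names mirror those in the table above so the result can be compared directly; only the forward strand is scanned in this sketch.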
Sequence Logo for the PWM M00110_2.00.
## [1] "The wild-type sequences are mutated at 2 positions"
##     BS_Start PWM_Length Score    PWM_name BS_End BS_Sequence BS_Sequence_MT
##  1:     2638         10  5.02 M00110_2.00   2647  GCGGAGGAAA     GCTTAGGAAA
##  2:     7762         10  5.51 M00110_2.00   7771  CCGGAGTAAT     CCTTAGTAAT
##  3:     9328         10  5.16 M00110_2.00   9337  TCGGAAAACT     TCTTAAAACT
##  4:    10062         10  5.01 M00110_2.00  10071  ACGGATAACT     ACTTATAACT
##  5:    11130         10  5.00 M00110_2.00  11139  TTGGATAAAT     TTTTATAAAT
##  6:    13556         10  5.08 M00110_2.00  13565  TCGGGCAATC     TCTTGCAATC
##  7:    14786         10  5.02 M00110_2.00  14795  AAGGAAAAAT     AATTAAAAAT
##  8:    16608         10  5.28 M00110_2.00  16617  ACGGGTAAAC     ACTTGTAAAC
##  9:    17925         10  5.21 M00110_2.00  17934  TCGGGAAAAG     TCTTGAAAAG
## 10:    18888         10  5.26 M00110_2.00  18897  TCGGAAATAT     TCTTAAATAT
## 11:    20517         10  5.15 M00110_2.00  20526  GCGGTAAATT     GCTTTAAATT
## 12:    21031         10  5.26 M00110_2.00  21040  ACGGGAAAAA     ACTTGAAAAA
## 13:    21078         10  5.44 M00110_2.00  21087  TCGGAGAAAG     TCTTAGAAAG
## 14:    23775         10  5.53 M00110_2.00  23784  ACGGTGAAAT     ACTTTGAAAT
## 15:    29404         10  5.04 M00110_2.00  29413  ACGGTAAAAA     ACTTTAAAAA
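
In the table above, every mutant sequence differs from its wild type at positions 3 and 4 of the site, where the bases are replaced by T. The base R sketch below reproduces that substitution for three of the listed sites. The helper name `mutate_site` and its parameters are hypothetical; the demo's own mutation code is not shown.

```r
# Replace the bases at the given 1-based positions with a substitute base.
# mutate_site is a hypothetical helper, not a Biostrings function.
mutate_site <- function(site, positions, new_base = "T") {
  bases <- strsplit(site, "")[[1]]
  bases[positions] <- new_base
  paste(bases, collapse = "")
}

# Three wild-type sites taken from the table above.
wild_type <- c("GCGGAGGAAA", "CCGGAGTAAT", "TTGGATAAAT")

mutant <- vapply(wild_type, mutate_site, character(1),
                 positions = c(3, 4), USE.NAMES = FALSE)

print(data.frame(BS_Sequence = wild_type, BS_Sequence_MT = mutant))
```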
##     BS_Start PWM_Length Score    PWM_name BS_End BS_Sequence BS_Sequence_MT MT_Score Delta_Score
##  1:     2638         10  5.02 M00110_2.00   2647  GCGGAGGAAA     GCTTAGGAAA     3.44         40%
##  2:     7762         10  5.51 M00110_2.00   7771  CCGGAGTAAT     CCTTAGTAAT     3.92         36%
##  3:     9328         10  5.16 M00110_2.00   9337  TCGGAAAACT     TCTTAAAACT     3.57         39%
##  4:    10062         10  5.01 M00110_2.00  10071  ACGGATAACT     ACTTATAACT     3.43         40%
##  5:    11130         10  5.00 M00110_2.00  11139  TTGGATAAAT     TTTTATAAAT     3.41         40%
##  6:    13556         10  5.08 M00110_2.00  13565  TCGGGCAATC     TCTTGCAATC     3.49         39%
##  7:    14786         10  5.02 M00110_2.00  14795  AAGGAAAAAT     AATTAAAAAT     3.43         40%
##  8:    16608         10  5.28 M00110_2.00  16617  ACGGGTAAAC     ACTTGTAAAC     3.69         38%
##  9:    17925         10  5.21 M00110_2.00  17934  TCGGGAAAAG     TCTTGAAAAG     3.63         38%
## 10:    18888         10  5.26 M00110_2.00  18897  TCGGAAATAT     TCTTAAATAT     3.68         38%
## 11:    20517         10  5.15 M00110_2.00  20526  GCGGTAAATT     GCTTTAAATT     3.56         39%
## 12:    21031         10  5.26 M00110_2.00  21040  ACGGGAAAAA     ACTTGAAAAA     3.67         38%
## 13:    21078         10  5.44 M00110_2.00  21087  TCGGAGAAAG     TCTTAGAAAG     3.85         37%
## 14:    23775         10  5.53 M00110_2.00  23784  ACGGTGAAAT     ACTTTGAAAT     3.94         36%
## 15:    29404         10  5.04 M00110_2.00  29413  ACGGTAAAAA     ACTTTAAAAA     3.45         40%
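
In the results above, Score minus MT_Score is 1.58 or 1.59 for every site. That is what an additive PWM score predicts when the same positions undergo the same base change at every site: the drop depends only on the weights at the mutated columns, not on the rest of the sequence. The base R sketch below illustrates this with a toy matrix and a single mutated position. The matrix values, the helper name `score_site`, and the `Drop` column are assumptions for illustration. How the demo normalizes Delta_Score to a percentage is not shown in the source, so only the raw difference is computed here.

```r
# Toy 4-column weight matrix (made-up values, not M00110_2.00).
pwm <- matrix(
  c( 0.5, -0.5, -1.0,  0.8,   # A
    -0.2,  0.9, -1.0, -0.6,   # C
    -0.4, -0.7,  1.2, -0.3,   # G
     0.1, -0.3, -0.5,  0.2),  # T
  nrow = 4, byrow = TRUE,
  dimnames = list(c("A", "C", "G", "T"), NULL)
)

# Additive score: sum of the weight of each base at its position.
score_site <- function(pwm, site) {
  bases <- strsplit(site, "")[[1]]
  sum(pwm[cbind(match(bases, rownames(pwm)), seq_along(bases))])
}

# Hypothetical wild-type sites that all carry G at position 3.
wild_type <- c("ACGA", "TCGT", "CAGC")

# Mutate position 3 to T in every site.
mutant <- wild_type
substr(mutant, 3, 3) <- "T"

result <- data.frame(
  BS_Sequence    = wild_type,
  BS_Sequence_MT = mutant,
  Score    = vapply(wild_type, score_site, numeric(1), pwm = pwm, USE.NAMES = FALSE),
  MT_Score = vapply(mutant, score_site, numeric(1), pwm = pwm, USE.NAMES = FALSE),
  stringsAsFactors = FALSE
)

# Raw score drop; it equals pwm["G", 3] - pwm["T", 3] for every site.
result$Drop <- result$Score - result$MT_Score
print(result)
```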