Introduction

This demo shows how position weight matrices (PWMs) are used to evaluate the impact of mutations on transcription factor binding sites (TFBS). The data and software used here are:

1. PWMs for the yeast TFs, downloaded from the CIS-BP Database: Catalog of Inferred Sequence Binding Preferences [http://cisbp.ccbr.utoronto.ca/index.php].

2. The yeast genome sequence, downloaded from the Saccharomyces Genome Database (SGD) [https://www.yeastgenome.org/].

3. The Bioconductor Biostrings R package [https://bioconductor.org/packages/release/bioc/html/Biostrings.html].

Loading libraries and data

## [1] "Number of loaded PWMs: 1440 - Length of loaded sequence: 230218bp."
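
The loading code itself is not shown in this demo. As a rough illustration of the step, the following Python sketch (not the demo's R code) parses a FASTA record and reports its length in the style of the message above. The function name `read_fasta` and the toy record are hypothetical, and the sketch assumes every header line has a non-empty name.

```python
def read_fasta(lines):
    """Parse FASTA-formatted lines into a dict mapping record name -> sequence."""
    records = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # The record name is the first word after '>' (assumed non-empty).
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


# Hypothetical two-line record; a real run would pass an open file handle instead.
toy_fasta = """>chrToy hypothetical record
ACGTACGTAC
ggcatt
""".splitlines()

sequences = read_fasta(toy_fasta)
chrom = sequences["chrToy"]
print(f"Length of loaded sequence: {len(chrom)}bp.")
```

For a file on disk, the same function accepts the handle returned by `open(path)`, since it only iterates over lines.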

Identify TFBS in the yeast chromosome 1

## [1] "Length of loaded sequence: 30000bp."
##     BS_Start PWM_Length Score    PWM_name BS_End BS_Sequence
##  1:     2638         10  5.02 M00110_2.00   2647  GCGGAGGAAA
##  2:     7762         10  5.51 M00110_2.00   7771  CCGGAGTAAT
##  3:     9328         10  5.16 M00110_2.00   9337  TCGGAAAACT
##  4:    10062         10  5.01 M00110_2.00  10071  ACGGATAACT
##  5:    11130         10  5.00 M00110_2.00  11139  TTGGATAAAT
##  6:    13556         10  5.08 M00110_2.00  13565  TCGGGCAATC
##  7:    14786         10  5.02 M00110_2.00  14795  AAGGAAAAAT
##  8:    16608         10  5.28 M00110_2.00  16617  ACGGGTAAAC
##  9:    17925         10  5.21 M00110_2.00  17934  TCGGGAAAAG
## 10:    18888         10  5.26 M00110_2.00  18897  TCGGAAATAT
## 11:    20517         10  5.15 M00110_2.00  20526  GCGGTAAATT
## 12:    21031         10  5.26 M00110_2.00  21040  ACGGGAAAAA
## 13:    21078         10  5.44 M00110_2.00  21087  TCGGAGAAAG
## 14:    23775         10  5.53 M00110_2.00  23784  ACGGTGAAAT
## 15:    29404         10  5.04 M00110_2.00  29413  ACGGTAAAAA
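
The scan that produces a table like this can be sketched in a few lines of Python (again, not the demo's R code). The sketch assumes the usual additive PWM score, slides the matrix along the forward strand only, and keeps windows whose score reaches a threshold. The matrix `TOY_PWM`, its values, the threshold, and the function names are all hypothetical; they are not the entries of M00110_2.00.

```python
# Toy 4-position PWM with made-up scores (NOT M00110_2.00): one dict per motif position.
TOY_PWM = [
    {"A": -1.0, "C": -1.0, "G": 1.5, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 1.5, "T": -1.0},
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": 0.0},
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": 0.0},
]


def score_window(pwm, window):
    """Additive PWM score: sum the matrix entry for the base seen at each position."""
    return sum(column[base] for column, base in zip(pwm, window))


def scan_sequence(pwm, sequence, min_score):
    """Return every forward-strand window whose score is at least min_score."""
    width = len(pwm)
    hits = []
    for i in range(len(sequence) - width + 1):
        window = sequence[i:i + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases such as N
        score = score_window(pwm, window)
        if score >= min_score:
            hits.append({
                "BS_Start": i + 1,        # 1-based, as in the table
                "BS_End": i + width,      # BS_Start + PWM_Length - 1
                "Score": score,
                "BS_Sequence": window,
            })
    return hits


hits = scan_sequence(TOY_PWM, "TTGGAACCGGATTT", min_score=4.0)
for hit in hits:
    print(hit)
```

The column names mirror the table above, where `BS_End` equals `BS_Start + PWM_Length - 1` (for example, 2638 + 10 - 1 = 2647).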

Plot the Sequence logo of this PWM

Sequence Logo for the PWM M00110_2.00.
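
In a conventional sequence logo, the height of each stack is the information content of that position in bits. The Python sketch below computes that quantity for a single column of base probabilities. It assumes the standard formula without a small-sample correction, and the two example columns are hypothetical; whether the logo in this demo was drawn on a bits scale is not stated.

```python
import math


def information_content(column):
    """Information content (bits) of one motif position, given base probabilities."""
    # Shannon entropy of the column; zero-probability bases contribute nothing.
    entropy = -sum(p * math.log2(p) for p in column.values() if p > 0)
    # log2(4) = 2 bits is the maximum for a four-letter DNA alphabet.
    return 2.0 - entropy


uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}  # no preference
fixed = {"A": 0.0, "C": 0.0, "G": 1.0, "T": 0.0}        # always G

print(information_content(uniform))
print(information_content(fixed))
```

A position with no base preference carries zero bits, while a fully conserved position carries the maximum of two bits.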

Create mutants for each TFBS

## [1] "The wildtype sequences are mutated at 2 positions"
##     BS_Start PWM_Length Score    PWM_name BS_End BS_Sequence BS_Sequence_MT
##  1:     2638         10  5.02 M00110_2.00   2647  GCGGAGGAAA     GCTTAGGAAA
##  2:     7762         10  5.51 M00110_2.00   7771  CCGGAGTAAT     CCTTAGTAAT
##  3:     9328         10  5.16 M00110_2.00   9337  TCGGAAAACT     TCTTAAAACT
##  4:    10062         10  5.01 M00110_2.00  10071  ACGGATAACT     ACTTATAACT
##  5:    11130         10  5.00 M00110_2.00  11139  TTGGATAAAT     TTTTATAAAT
##  6:    13556         10  5.08 M00110_2.00  13565  TCGGGCAATC     TCTTGCAATC
##  7:    14786         10  5.02 M00110_2.00  14795  AAGGAAAAAT     AATTAAAAAT
##  8:    16608         10  5.28 M00110_2.00  16617  ACGGGTAAAC     ACTTGTAAAC
##  9:    17925         10  5.21 M00110_2.00  17934  TCGGGAAAAG     TCTTGAAAAG
## 10:    18888         10  5.26 M00110_2.00  18897  TCGGAAATAT     TCTTAAATAT
## 11:    20517         10  5.15 M00110_2.00  20526  GCGGTAAATT     GCTTTAAATT
## 12:    21031         10  5.26 M00110_2.00  21040  ACGGGAAAAA     ACTTGAAAAA
## 13:    21078         10  5.44 M00110_2.00  21087  TCGGAGAAAG     TCTTAGAAAG
## 14:    23775         10  5.53 M00110_2.00  23784  ACGGTGAAAT     ACTTTGAAAT
## 15:    29404         10  5.04 M00110_2.00  29413  ACGGTAAAAA     ACTTTAAAAA
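
Comparing the `BS_Sequence` and `BS_Sequence_MT` columns, every mutant differs from its wild-type site at positions 3 and 4, where the bases are replaced by T. A minimal Python sketch of that substitution (not the demo's R code; the function name `mutate_site` is hypothetical) is:

```python
def mutate_site(site, positions, new_base):
    """Return a copy of `site` with the given 1-based positions set to `new_base`."""
    bases = list(site)
    for pos in positions:
        bases[pos - 1] = new_base  # convert 1-based position to 0-based index
    return "".join(bases)


# Three wild-type sites taken from the table above.
wild_types = ["GCGGAGGAAA", "CCGGAGTAAT", "TTGGATAAAT"]
mutants = [mutate_site(s, positions=(3, 4), new_base="T") for s in wild_types]
print(mutants)
```

Applied to these three sites, the function reproduces the corresponding `BS_Sequence_MT` entries.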

Score the mutants

##     BS_Start PWM_Length Score    PWM_name BS_End BS_Sequence BS_Sequence_MT MT_Score Delta_Score
##  1:     2638         10  5.02 M00110_2.00   2647  GCGGAGGAAA     GCTTAGGAAA     3.44         40%
##  2:     7762         10  5.51 M00110_2.00   7771  CCGGAGTAAT     CCTTAGTAAT     3.92         36%
##  3:     9328         10  5.16 M00110_2.00   9337  TCGGAAAACT     TCTTAAAACT     3.57         39%
##  4:    10062         10  5.01 M00110_2.00  10071  ACGGATAACT     ACTTATAACT     3.43         40%
##  5:    11130         10  5.00 M00110_2.00  11139  TTGGATAAAT     TTTTATAAAT     3.41         40%
##  6:    13556         10  5.08 M00110_2.00  13565  TCGGGCAATC     TCTTGCAATC     3.49         39%
##  7:    14786         10  5.02 M00110_2.00  14795  AAGGAAAAAT     AATTAAAAAT     3.43         40%
##  8:    16608         10  5.28 M00110_2.00  16617  ACGGGTAAAC     ACTTGTAAAC     3.69         38%
##  9:    17925         10  5.21 M00110_2.00  17934  TCGGGAAAAG     TCTTGAAAAG     3.63         38%
## 10:    18888         10  5.26 M00110_2.00  18897  TCGGAAATAT     TCTTAAATAT     3.68         38%
## 11:    20517         10  5.15 M00110_2.00  20526  GCGGTAAATT     GCTTTAAATT     3.56         39%
## 12:    21031         10  5.26 M00110_2.00  21040  ACGGGAAAAA     ACTTGAAAAA     3.67         38%
## 13:    21078         10  5.44 M00110_2.00  21087  TCGGAGAAAG     TCTTAGAAAG     3.85         37%
## 14:    23775         10  5.53 M00110_2.00  23784  ACGGTGAAAT     ACTTTGAAAT     3.94         36%
## 15:    29404         10  5.04 M00110_2.00  29413  ACGGTAAAAA     ACTTTAAAAA     3.45         40%
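
Assuming the scores are additive over motif positions, as in standard PWM scoring, replacing the same two bases at the same two positions should lower every site's score by the same amount. All fifteen wild-type sites carry GG at positions 3 and 4, so the reported values can be checked against that expectation. The Python sketch below does this with the `Score` and `MT_Score` values from the table; it does not attempt to reproduce the `Delta_Score` column, whose formula is not given here.

```python
from decimal import Decimal

# (Score, MT_Score) pairs copied from the table above, in row order.
# Strings plus Decimal keep the two-decimal arithmetic exact.
pairs = [
    ("5.02", "3.44"), ("5.51", "3.92"), ("5.16", "3.57"),
    ("5.01", "3.43"), ("5.00", "3.41"), ("5.08", "3.49"),
    ("5.02", "3.43"), ("5.28", "3.69"), ("5.21", "3.63"),
    ("5.26", "3.68"), ("5.15", "3.56"), ("5.26", "3.67"),
    ("5.44", "3.85"), ("5.53", "3.94"), ("5.04", "3.45"),
]

# Absolute score drop (wild type minus mutant) for each binding site.
drops = [Decimal(wt) - Decimal(mt) for wt, mt in pairs]

print(sorted(set(drops)))
```

Every drop comes out as 1.58 or 1.59, a spread that rounding the scores to two decimals is enough to explain. This is consistent with the GG to TT substitution costing each site the same fixed amount.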

Plot the WT vs. the MT scores