-For each cassette exon, find the upstream and downstream exons
-BLAST the sequences with the cassette exon spliced in and spliced out against the five fish species (e-value threshold = 0.1, percent identity >= 30, query coverage >= 70)
-If there are BLAST results for both the spliced-in and spliced-out sequences, the splicing event is considered conserved
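
The conservation rule in the steps above can be sketched in base R. This is only a minimal sketch: it assumes the BLAST hits have been loaded into a data frame with hypothetical columns `event_id`, `form` ("in" or "out"), `evalue`, `pident`, and `qcov`. The column names and toy values are illustrative and not taken from the actual pipeline. It also assumes the e-value threshold is an upper bound and that query coverage is expressed as a percentage.

```r
# Toy BLAST hits; column names and values are hypothetical.
hits <- data.frame(
  event_id = c("ev1", "ev1", "ev2", "ev2", "ev3"),
  form     = c("in",  "out", "in",  "out", "in"),
  evalue   = c(1e-20, 1e-15, 1e-10, 0.5,   1e-30),
  pident   = c(85,    80,    45,    60,    25),
  qcov     = c(95,    90,    75,    80,    99),
  stringsAsFactors = FALSE
)

# Keep only hits that pass all three thresholds from the notes.
passing <- hits[hits$evalue <= 0.1 & hits$pident >= 30 & hits$qcov >= 70, ]

# An event counts as conserved only if both the spliced-in and the
# spliced-out form have at least one passing hit.
is_conserved <- function(event, passing_hits) {
  forms <- passing_hits$form[passing_hits$event_id == event]
  all(c("in", "out") %in% forms)
}

events    <- unique(hits$event_id)
conserved <- vapply(events, is_conserved, logical(1), passing_hits = passing)
```

In this toy input only `ev1` is conserved: `ev2` loses its spliced-out hit to the e-value cutoff, and `ev3` has no spliced-out hit at all.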
Total number of cassette exons: 5051
Number of genes with cassette exons: 3037
Total number of cassette splicing events: 19015
| Species | Number of Conserved Cassette Splicing Events | Number of Non-conserved Cassette Splicing Events | Number of Genes with Conserved Splicing Events |
|---|---|---|---|
| lamprey | 6208 | 7415 | 1086 |
| spotted gar | 11557 | 3898 | 1643 |
| zebrafish | 10136 | 5514 | 1375 |
| fugu | 9767 | 4895 | 1409 |
| coelacanth | 7892 | 6317 | 1448 |
| human | 15478 | 1743 | 2242 |
| C. elegans | 1983 | 14389 | 242 |
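```r
library(UpSetR)

# Plot set intersections across the 7 species, ordered by intersection size
upset(fromList(listInput), nsets = 7, order.by = "freq")
```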
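
The `upset()` call expects `listInput` to be a named list with one vector of element IDs per set. The following is a minimal sketch of how such a list might be assembled, assuming each species contributes the IDs of its conserved splicing events. The event IDs are made up and only three species are shown. The plot is drawn only if UpSetR is installed.

```r
# Hypothetical conserved-event IDs per species (toy values).
listInput <- list(
  human       = c("ev1", "ev2", "ev3", "ev4"),
  zebrafish   = c("ev1", "ev2", "ev4"),
  spotted_gar = c("ev1", "ev3", "ev4")
)

# Events conserved in every listed species.
shared_by_all <- Reduce(intersect, listInput)

# Set sizes, useful as a quick check against the per-species counts.
set_sizes <- lengths(listInput)

# Draw the UpSet plot only when the package is available.
if (requireNamespace("UpSetR", quietly = TRUE)) {
  UpSetR::upset(UpSetR::fromList(listInput),
                nsets = length(listInput), order.by = "freq")
}
```

Here `shared_by_all` holds the events found in all three toy sets, which corresponds to the all-species intersection bar in the plot.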
TODO: BLAST against other mammals
-take the first hit (the one with the highest bit score)
-length vs bit score
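
Picking the top hit per query could look like the following in base R. This is a sketch under assumptions: the hit table and its columns (`query`, `subject`, `length`, `bitscore`) are hypothetical, `length` is taken here to mean alignment length, and ties are resolved by input order. The ratio at the end is just one possible way to relate length and bit score.

```r
# Toy hit table; column names and values are hypothetical.
hits <- data.frame(
  query    = c("q1", "q1", "q2", "q2", "q2"),
  subject  = c("s1", "s2", "s3", "s4", "s5"),
  length   = c(120,  90,   200,  210,  60),
  bitscore = c(150,  98,   310,  305,  40),
  stringsAsFactors = FALSE
)

# Sort by query, then by descending bit score, and keep the first row per query.
ordered  <- hits[order(hits$query, -hits$bitscore), ]
best_hit <- ordered[!duplicated(ordered$query), ]

# Relate bit score to length for the retained hits (bits per aligned position).
best_hit$bits_per_position <- best_hit$bitscore / best_hit$length
```

A scatter plot of `best_hit$length` against `best_hit$bitscore` would be an equally reasonable way to inspect the relationship.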