WORKFLOW

blastx against the Ensembl protein database

-BLAST each sequence twice, once with the cassette exon spliced in and once with it spliced out, against the four fish species and human
-Filter: the alignment must start in the upstream exon and end in the downstream exon
-Any gap introduced in the alignment cannot be longer than the cassette exon length / 3 (the exon length in codons)
-If BLAST hits pass these filters for both the spliced-in and the spliced-out sequence, the splicing event is counted as conserved
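
The filter steps above could be sketched roughly as follows. This is an illustration only, not the code used to produce the results below. The `Hit` record, its field names, and the coordinate convention (0-based, half-open nucleotide coordinates on the query, with the upstream exon ending at `upstream_end`) are all assumptions, as is measuring the gap in amino acids.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One blastx alignment (hypothetical record; field names are assumptions)."""
    q_start: int     # first aligned query nucleotide, 0-based
    q_end: int       # one past the last aligned query nucleotide
    max_gap_aa: int  # longest gap in the alignment, in amino acids


def passes_filter(hit, upstream_end, downstream_start, cassette_len):
    """Alignment must span both flanking exons and have a bounded gap."""
    starts_upstream = hit.q_start < upstream_end
    ends_downstream = hit.q_end > downstream_start
    gap_ok = hit.max_gap_aa <= cassette_len / 3
    return starts_upstream and ends_downstream and gap_ok


def is_conserved(hits_in, hits_out, upstream_end, cassette_len):
    """Conserved if both the spliced-in and spliced-out queries keep a hit."""
    # Spliced in: the downstream exon starts after the cassette exon.
    in_ok = any(
        passes_filter(h, upstream_end, upstream_end + cassette_len, cassette_len)
        for h in hits_in
    )
    # Spliced out: the downstream exon starts directly at the junction.
    out_ok = any(
        passes_filter(h, upstream_end, upstream_end, cassette_len)
        for h in hits_out
    )
    return in_ok and out_ok


# Toy example: 120 nt upstream exon, 90 nt cassette exon.
hits_in = [Hit(q_start=30, q_end=330, max_gap_aa=2)]
hits_out = [Hit(q_start=30, q_end=240, max_gap_aa=0)]
example = is_conserved(hits_in, hits_out, upstream_end=120, cassette_len=90)
```

In this toy case both queries keep a hit, so `example` is `True`. A spliced-in hit that ends inside the cassette exon, or one with a gap longer than 30 amino acids, would make the event non-conserved.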

Number of total cassette splicing events: 6898
Number of genes involved: 4121

BLAST RESULTS:

Species      Conserved cassette splicing events (# of genes)   Non-conserved cassette splicing events (# of genes)   Genes with at least one conserved isoform
spotted gar  1531 (1228)                                       3577 (2551)                                           3295
zebrafish    1611 (1272)                                       3404 (2492)                                           3261
fugu         1451 (1181)                                       3218 (2370)                                           3098
coelacanth   1446 (1182)                                       3624 (2556)                                           3265
human        2775 (2029)                                       3574 (2622)                                           3894

Splicing Event Conservation:

library(UpSetR)
upset(fromList(listInput), nsets = 5, order.by = "freq")

21 cassette splicing events are conserved in all 4 fish species but not in human
19 genes have cassette splicing events that are conserved in all 4 fish species but not in human
783 cassette splicing events are conserved in all 4 fish species and in human
674 genes (16.4% of the 4121 genes) have cassette splicing events that are conserved in all 4 fish species and in human
2306 cassette splicing events from 1753 genes are conserved in at least one fish species
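
The categories above are set intersections and unions over the per-species lists of conserved events. A minimal sketch with made-up event IDs follows. The `conserved` dictionary and its contents are hypothetical, not the real data. The last line checks the 16.4% figure against the gene counts reported in this document.

```python
# Hypothetical per-species sets of conserved event IDs (toy data only).
conserved = {
    "spotted gar": {"e1", "e2", "e3"},
    "zebrafish": {"e1", "e2", "e4"},
    "fugu": {"e1", "e2"},
    "coelacanth": {"e1", "e2", "e5"},
    "human": {"e1", "e6"},
}

fish = [species for species in conserved if species != "human"]

# Events conserved in every fish species.
all_fish = set.intersection(*(conserved[s] for s in fish))

# Split those by whether they are also conserved in human.
all_fish_and_human = all_fish & conserved["human"]
all_fish_not_human = all_fish - conserved["human"]

# Events conserved in at least one fish species.
any_fish = set.union(*(conserved[s] for s in fish))

# Share of genes with an event conserved in all fish species and human,
# using the counts reported above (674 of 4121 genes).
pct = round(100 * 674 / 4121, 1)
```

With the toy data, `all_fish_and_human` contains only `e1` and `all_fish_not_human` contains only `e2`. The same operations on the real event IDs would give the counts listed above.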