WORKFLOW

blastx against the GenBank protein database

-BLAST the sequences with the cassette exon spliced in and spliced out against the fish species listed below and human
-Filter: the alignment must start in the upstream exon and end in the downstream exon
-Any gap introduced in the alignment cannot be larger than the cassette exon length / 3 (the exon length in codons)
-If there are BLAST hits for both the spliced-in and spliced-out sequences, then that splicing event is conserved
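
The filter above can be sketched roughly as follows. This is a minimal Python illustration, not the script actually used: the `Alignment` fields, the function name `passes_filter`, and the coordinate convention (1-based query positions in nucleotides, gap length in amino acids) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Alignment:
    """One blastx hit in query coordinates (hypothetical field names)."""
    query_start: int  # first aligned query position, 1-based, nucleotides
    query_end: int    # last aligned query position, 1-based, nucleotides
    max_gap: int      # longest gap in the alignment, in amino acids (assumed unit)


def passes_filter(aln, upstream_end, downstream_start, cassette_len):
    """Return True if the hit starts in the upstream exon, ends in the
    downstream exon, and has no gap longer than cassette_len / 3.

    upstream_end:     last query position of the upstream exon
    downstream_start: first query position of the downstream exon
    cassette_len:     cassette exon length in nucleotides
    """
    # Order the coordinates so a hit reported end-before-start is handled too.
    lo, hi = sorted((aln.query_start, aln.query_end))
    starts_upstream = lo <= upstream_end
    ends_downstream = hi >= downstream_start
    gap_ok = aln.max_gap <= cassette_len / 3
    return starts_upstream and ends_downstream and gap_ok


# Toy spliced-in query: upstream exon 1-120, cassette exon 121-210 (90 nt),
# downstream exon 211-330.
print(passes_filter(Alignment(40, 300, 5), 120, 211, 90))
print(passes_filter(Alignment(40, 300, 31), 120, 211, 90))
print(passes_filter(Alignment(130, 300, 0), 120, 211, 90))
```

For a spliced-out query the same check applies with `downstream_start = upstream_end + 1`, since the two flanking exons are adjacent.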

Total number of cassette exons: 4926
Number of genes with cassette exons: 2961

Total number of cassette splicing events: 18354

Non-conserved means there are BLAST hits for either the spliced-in or the spliced-out sequence, but not both (exclusive or)
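
The conserved / non-conserved definitions can be written as a small decision function. This is only a sketch: the function name `classify_event` and the label "no hit" for events where neither sequence has a passing hit are assumptions, since the notes do not name that third case.

```python
def classify_event(hit_spliced_in: bool, hit_spliced_out: bool) -> str:
    """Classify one cassette splicing event in one target species.

    Each argument says whether that version of the sequence had at least
    one BLAST hit that passed the filters.
    """
    if hit_spliced_in and hit_spliced_out:
        return "conserved"
    if hit_spliced_in != hit_spliced_out:  # exclusive or
        return "non-conserved"
    return "no hit"  # hypothetical label, not defined in the notes


for pair in [(True, True), (True, False), (False, True), (False, False)]:
    print(pair, classify_event(*pair))
```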

BLAST RESULTS:

Species | Number of conserved cassette splicing events (# of genes) | Number of non-conserved cassette splicing events (# of genes) | Number of genes with at least one conserved isoform
spotted gar | 3800 (919) | 8387 (1624) | 2128
zebrafish | 3856 (934) | 8298 (1573) | 2104
fugu | 3825 (879) | 8109 (1568) | 2076
coelacanth | 3551 (868) | 8664 (1721) | 2185
human | 4956 (1336) | 9281 (1728) | 2495
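
To compare species on a common scale, the event counts can be turned into percentages. The sketch below copies the counts from the table and the total of 18354 events from above. The function name `conserved_percentages` is hypothetical, and it assumes that all 18354 events were tested against every species.

```python
TOTAL_EVENTS = 18354  # total cassette splicing events, from the notes

# species: (conserved events, non-conserved events), copied from the table
EVENT_COUNTS = {
    "spotted gar": (3800, 8387),
    "zebrafish": (3856, 8298),
    "fugu": (3825, 8109),
    "coelacanth": (3551, 8664),
    "human": (4956, 9281),
}


def conserved_percentages(counts, total):
    """Return, per species, the percentage of conserved events relative to
    (a) all events and (b) events with a hit for at least one isoform."""
    result = {}
    for species, (conserved, non_conserved) in counts.items():
        with_hit = conserved + non_conserved
        result[species] = {
            "pct_of_all_events": round(100 * conserved / total, 1),
            "pct_of_events_with_hit": round(100 * conserved / with_hit, 1),
        }
    return result


for species, row in conserved_percentages(EVENT_COUNTS, TOTAL_EVENTS).items():
    print(species, row)
```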

Splicing Event Conservation:

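library(UpSetR)  # provides upset() and fromList()
upset(fromList(listInput), nsets = 5, order.by = "freq")  # listInput: named list with one vector of event IDs per species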

74 cassette splicing events are conserved in all 5 fish species (not human)
20 genes have cassette splicing events that are conserved in all 5 fish species (not human)
2626 cassette splicing events are conserved in all 5 fish species and human
591 (20.2%) genes have cassette splicing events that are conserved in all 5 fish species and human
4829 cassette splicing events from 1180 genes are conserved in at least one fish species
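
The three tallies above correspond to simple set operations on the per-species lists of conserved events. The following is a hedged Python sketch with made-up event IDs and a shortened species list. The function name `summarize_conservation`, the dictionary keys, and the reading of "(not human)" as "not conserved in human" are assumptions, and the toy data are not the real results.

```python
def summarize_conservation(conserved_by_species, fish, other="human"):
    """Split conserved events into the categories tallied in the notes.

    conserved_by_species: dict mapping species name -> set of event IDs
    fish:                 names of the fish species to intersect
    other:                the non-fish species (human here)
    """
    fish_sets = [conserved_by_species[s] for s in fish]
    in_all_fish = set.intersection(*fish_sets)
    in_any_fish = set.union(*fish_sets)
    in_other = conserved_by_species[other]
    return {
        "all_fish_and_human": in_all_fish & in_other,
        "all_fish_not_human": in_all_fish - in_other,
        "at_least_one_fish": in_any_fish,
    }


# Hypothetical event IDs, for illustration only.
toy = {
    "spotted gar": {"e1", "e2", "e3"},
    "zebrafish": {"e1", "e2"},
    "fugu": {"e1", "e2", "e4"},
    "human": {"e1", "e5"},
}
summary = summarize_conservation(toy, ["spotted gar", "zebrafish", "fugu"])
for name, events in summary.items():
    print(name, sorted(events))
```

Gene-level counts would follow by mapping each event ID to its gene and counting the distinct genes in each set.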