TODO: compile the Qualimap and TIN metrics into the new QC database

Summary

describe structure of document

link to graphs

thresholding

Metrics

Of the metrics we use for Crypto, only the following currently apply to yeast.

NOTE: The “upper fence” threshold listed below is the value we set in the last meeting, 0.16. The actual upper fence is 0.19, for reference. Some of the text and code still needs to be updated, but for now “upper fence” refers to that 0.16 threshold.
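For reference, an upper fence is commonly computed with Tukey's rule, Q3 + 1.5 * IQR. Whether the 0.19 value above was derived exactly this way is an assumption; the helper name `upper_fence` and the example values are hypothetical and only illustrate the calculation.

```r
# Tukey-style upper fence: Q3 + k * IQR (k = 1.5 by convention).
# Assumption: the report's "upper fence" follows this rule.
upper_fence <- function(x, k = 1.5) {
  q <- quantile(x, c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  q[2] + k * (q[2] - q[1])
}

# Hypothetical unaligned fractions for six samples
x <- c(0.02, 0.04, 0.05, 0.07, 0.09, 0.30)
fence <- upper_fence(x)

# Samples above the fence would be flagged as outliers
flagged <- which(x > fence)
```

A fixed threshold such as 0.16 can be substituted for `fence` when the cutoff is set by hand rather than from the data.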

metric                                   threshold
mRNA total                               1.0e+06
not aligned total percent (crypto)       7.0e-02
not aligned total percent (upper fence)  1.6e-01
rle_iqr                                  6.1e-01
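A minimal sketch of how thresholds like these could be turned into a per-sample status, similar in shape to the statusDecomp labels tallied later in this report. The data frame, the column names, and the direction of each comparison (mRNA total as a minimum, the other two as maxima) are assumptions, not the audit code actually used here. The 0.16 "upper fence" value is used for the unaligned fraction.

```r
# Hypothetical QC table; the columns are assumed to correspond to the
# metrics in the table above.
qc <- data.frame(
  sample = c("s1", "s2", "s3", "s4"),
  protein_coding_total = c(2.5e6, 8.0e5, 3.1e6, 1.2e6),
  unmapped_other_percent = c(0.05, 0.20, 0.21, 0.12),
  rle_iqr = c(0.40, 0.55, 0.30, 0.75)
)

# One logical column per metric: TRUE means the sample fails that metric.
# Assumed directions: too few mRNA reads fails; too high a value fails
# for the other two metrics.
fails <- cbind(
  protein_coding_total = qc$protein_coding_total < 1.0e6,
  unmapped_other_percent = qc$unmapped_other_percent > 1.6e-1,
  rle_iqr = qc$rle_iqr > 6.1e-1
)

# Collapse the failed metric names into a single status string per sample
qc$status <- apply(fails, 1, function(f) {
  if (any(f)) paste(colnames(fails)[f], collapse = ", ") else "passing_sample"
})

# Tally the samples by status
status_counts <- table(qc$status)
```

Keeping one logical column per metric makes it easy to see which combination of thresholds each sample violates, and to swap in a different cutoff without touching the rest of the code.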

TODO: create a metric to summarize the graphs

Tallies

Plots

mRNA

unmapped_other_percent

rle
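# Collect the per-sample RLE interquartile ranges from both inputs
rle_compare = tibble(
  norm_count_rle = norm_count_rle_rle_summary$INTERQUARTILE_RANGE,
  removed_effect_rle = removed_effect_rle_summary$interquartile_range
) %>%
  # Reshape to long format: one row per (input_type, iqr) pair
  pivot_longer(everything(), names_to = "input_type", values_to = "iqr")

# Compare the two IQR distributions; the red line marks iqr_upper_fence
ggplot(rle_compare) +
  geom_boxplot(aes(input_type, iqr)) +
  geom_hline(yintercept = iqr_upper_fence, color = "red")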
## Warning: Removed 88 rows containing non-finite values (stat_boxplot).
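For context, a per-sample RLE IQR of the kind compared above can be computed roughly as follows. This is a sketch on simulated counts, not the pipeline that produced the summaries in this report. The log2 transform with a pseudocount of 1 and all object names are assumptions.

```r
set.seed(1)

# Simulated count matrix: 10 genes (rows) by 4 samples (columns)
counts <- matrix(
  rpois(40, lambda = 50),
  nrow = 10,
  dimnames = list(paste0("gene", 1:10), paste0("s", 1:4))
)

# Log-transform with a pseudocount so zero counts stay finite
log_counts <- log2(counts + 1)

# Relative log expression: subtract each gene's median across samples
gene_medians <- apply(log_counts, 1, median)
rle <- sweep(log_counts, 1, gene_medians, "-")

# Summarize each sample by the interquartile range of its RLE values
rle_iqr <- apply(rle, 2, IQR, na.rm = TRUE)
```

A sample with an unusually wide RLE distribution has a large `rle_iqr`, which is what a threshold on this metric is meant to catch.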

quality tallies

statusDecomp                                  n
passing_sample                                307
protein_coding_total                          29
protein_coding_total, unmapped_other_percent  20
rle_iqr                                       11
unmapped_other_percent                        63

replicate tallies

unfiltered

QC unfiltered

Alternative Metrics

ggplot(audited_qc_df) +
  geom_boxplot(aes(x="", y=percent_duplication))

ggplot(audited_qc_df) +
  geom_boxplot(aes(x="", y=bias_5_3))
## Warning: Removed 1 rows containing non-finite values (stat_boxplot).

ggplot(audited_qc_df) +
  geom_boxplot(aes(x="", y=percent_rRNA))

ggplot(audited_qc_df) +
  geom_boxplot(aes(x="", y=dupradar_int))

ggplot(audited_qc_df) +
  geom_boxplot(aes(x="", y=percent_intergenic))

ggplot(audited_qc_df) +
  geom_boxplot(aes(x="", y=percent_exonic))

ggplot(audited_qc_df) +
  geom_point(aes(x=dupradar_int, 
                 y=percent_intergenic, color = auditFlag)) + 
  labs(
    title = "colored by audit status using fence threshold"
  )

ggplot(audited_qc_df) +
  geom_point(aes(x=percent_exonic, 
                 y=percent_rRNA, color = auditFlag)) + 
  labs(
    title = "colored by audit status using fence threshold"
  )

ggplot(audited_qc_df) +
  geom_point(aes(x=percent_exonic, 
                 y=percent_duplication, color = auditFlag)) + 
  labs(
    title = "colored by audit status using fence threshold"
  )