The Plan and Specific Aim: We are investigating the expression differences between the different negative controls of possible nhr-25 knockdown experiments in C. elegans nematodes. The controls are N2 wild type exposed to auxin in ethanol solution, N2 wild type exposed to only ethanol, the KRY85 transgenic mutant strain exposed to pure ethanol solution, and N2 wild type fed E. coli producing a control RNAi that does not match any C. elegans mRNA. To do this we shall quantify transcript abundance on the Galaxy servers and read the resulting counts of mapped reads for each gene in each sample into count tables. We shall then run PCA on each possible pair of control groups to determine whether they cluster as expected. If a control group has outlier samples, we shall verify that they did not affect the results by running the DESeq2 analysis of that pair of control groups with and without the outlier samples and checking whether the outcome changes significantly. We shall then run DESeq2 on all pairs of control groups to compare their expression levels for each gene, determine whether any control group deviates significantly from the rest, and infer from the control groups with largely similar expression what the expression of pure N2 wild type is likely to be. To see whether any difference is likely to be relevant to a study of nhr-25 using these groups, we shall run GO term analysis of the differentially expressed genes. This will indicate whether they belong to metabolic functions known to be affected by nhr-25, whether the control groups differ only in biological systems that are largely unaffected by and do not act on nhr-25, or whether the differentially expressed genes all belong to different metabolic pathways. In the last case the differences are likely mere noise, since a true difference in expression would be likely to alter an entire related pathway rather than a single gene.
First, outside of R, I downloaded onto my computer the count tables for each of our original four control groups from Galaxy: N2 with EtOH, N2 with EtOH and auxin, the KRY85 mutant with EtOH, and N2 given control RNAi. The data were provided in this format, so the FastQC, Trimmomatic, and HISAT steps in Galaxy were skipped, having already been completed for each set before this experiment began. Then, on my computer, for each of the six pairwise comparisons between groups, I created a tab-delimited file listing the three count tables from each of the two control groups being compared.
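The pairing and sample-table step could also be scripted instead of done by hand. The following is a minimal sketch, not the procedure actually used here: the count-file naming scheme (`<group>_<replicate>.txt`), the `sampleName` and `fileName` column names, and the output directory are all assumptions. It relies only on the fact that `DESeqDataSetFromHTSeqCount` expects sample names in the first column of the sample table and file names in the second.

```r
# Hypothetical group labels; real labels must match the count files on disk.
groups <- c("KRY85", "N2auxin", "N2EtOH", "RNAi")

# All unordered pairs of the four control groups: choose(4, 2) = 6 comparisons.
pairs <- combn(groups, 2, simplify = FALSE)

# Build a sample table for one pair, assuming three replicates per group and
# count files named "<group>_<replicate>.txt" (an assumed naming scheme).
make_sample_table <- function(pair, n_rep = 3) {
  data.frame(
    sampleName = paste(rep(pair, each = n_rep), seq_len(n_rep), sep = "_"),
    fileName = paste0(rep(pair, each = n_rep), "_", seq_len(n_rep), ".txt"),
    Condition = rep(pair, each = n_rep),
    Biological.Replicate = rep(seq_len(n_rep), times = 2),
    stringsAsFactors = FALSE
  )
}

# Write one tab-delimited file per comparison, e.g. "text_KRY85_N2auxin.txt".
out_dir <- tempdir()  # replace with the working directory in practice
for (pair in pairs) {
  out_file <- file.path(out_dir, paste0("text_", pair[1], "_", pair[2], ".txt"))
  write.table(make_sample_table(pair), out_file,
              sep = "\t", quote = FALSE, row.names = FALSE)
}
```

Generating the tables from one function keeps the column layout identical across all six files, which avoids small inconsistencies between hand-made tables.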
To get access to DESeq2 we must install it from Bioconductor using BiocManager. In this notebook this must be done each time the notebook is booted before DESeq2 can be used. The code below downloads, installs, and loads GenomeInfoDb and DESeq2.
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("GenomeInfoDb", version = "3.8")
## Bioconductor version 3.8 (BiocManager 1.30.4), R 3.5.1 (2018-07-02)
## Installing package(s) 'GenomeInfoDb'
##
## The downloaded binary packages are in
## /var/folders/tg/rszhgl1x3t7gv69lz9tp0n1x9z37y1/T//Rtmp2Ijzl2/downloaded_packages
## installation path not writeable, unable to update packages: class,
## cluster, codetools, foreign, lattice, MASS, Matrix, mgcv, nlme, survival
library("GenomeInfoDb")
## Warning: package 'GenomeInfoDb' was built under R version 3.5.2
## Loading required package: BiocGenerics
## Loading required package: parallel
##
## Attaching package: 'BiocGenerics'
## The following objects are masked from 'package:parallel':
##
## clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
## clusterExport, clusterMap, parApply, parCapply, parLapply,
## parLapplyLB, parRapply, parSapply, parSapplyLB
## The following objects are masked from 'package:stats':
##
## IQR, mad, sd, var, xtabs
## The following objects are masked from 'package:base':
##
## anyDuplicated, append, as.data.frame, basename, cbind,
## colMeans, colnames, colSums, dirname, do.call, duplicated,
## eval, evalq, Filter, Find, get, grep, grepl, intersect,
## is.unsorted, lapply, lengths, Map, mapply, match, mget, order,
## paste, pmax, pmax.int, pmin, pmin.int, Position, rank, rbind,
## Reduce, rowMeans, rownames, rowSums, sapply, setdiff, sort,
## table, tapply, union, unique, unsplit, which, which.max,
## which.min
## Loading required package: S4Vectors
## Loading required package: stats4
##
## Attaching package: 'S4Vectors'
## The following object is masked from 'package:base':
##
## expand.grid
## Loading required package: IRanges
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("DESeq2", version = "3.8")
## Bioconductor version 3.8 (BiocManager 1.30.4), R 3.5.1 (2018-07-02)
## Installing package(s) 'DESeq2'
##
## The downloaded binary packages are in
## /var/folders/tg/rszhgl1x3t7gv69lz9tp0n1x9z37y1/T//Rtmp2Ijzl2/downloaded_packages
## installation path not writeable, unable to update packages: class,
## cluster, codetools, foreign, lattice, MASS, Matrix, mgcv, nlme, survival
library("DESeq2")
## Warning: package 'DESeq2' was built under R version 3.5.2
## Loading required package: GenomicRanges
## Loading required package: SummarizedExperiment
## Loading required package: Biobase
## Welcome to Bioconductor
##
## Vignettes contain introductory material; view with
## 'browseVignettes()'. To cite Bioconductor, see
## 'citation("Biobase")', and for packages 'citation("pkgname")'.
## Loading required package: DelayedArray
## Loading required package: matrixStats
##
## Attaching package: 'matrixStats'
## The following objects are masked from 'package:Biobase':
##
## anyMissing, rowMedians
## Loading required package: BiocParallel
## Warning: package 'BiocParallel' was built under R version 3.5.2
##
## Attaching package: 'DelayedArray'
## The following objects are masked from 'package:matrixStats':
##
## colMaxs, colMins, colRanges, rowMaxs, rowMins, rowRanges
## The following objects are masked from 'package:base':
##
## aperm, apply
Next we shall read our tab-delimited files for the HTSeq counts into R so that they become objects in R with the same name as the file (just without the .txt). Then, where Variable1 is the condition defining the first group of samples in the file and Variable2 the second, we read these files into Variable1_Variable2_Counts, which organizes the raw RNA-seq data from each sample for easy access by the computer. We shall then normalize each data set to account for differences in the number of sequencing reads between samples.
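To illustrate what accounting for differences in sequencing depth means, here is a minimal base-R sketch of the median-of-ratios size factor that DESeq2 estimates for each sample. The counts are invented and the function name `size_factors` is hypothetical. The `vst` call used in this report goes further by also stabilizing the variance, which this sketch does not attempt.

```r
# Toy count matrix (invented values): rows are genes, columns are samples.
# Sample "s2" is "s1" sequenced exactly twice as deeply.
counts <- cbind(s1 = c(10, 200, 35, 80),
                s2 = c(20, 400, 70, 160),
                s3 = c(12, 190, 40, 75))
rownames(counts) <- paste0("gene", 1:4)

# Median-of-ratios: compare each sample to a per-gene geometric-mean reference.
size_factors <- function(m) {
  log_geo_mean <- rowMeans(log(m))
  keep <- is.finite(log_geo_mean)  # drop genes with a zero count in any sample
  apply(m, 2, function(col) {
    exp(median(log(col[keep]) - log_geo_mean[keep]))
  })
}

sf <- size_factors(counts)

# Divide each column by its size factor to put samples on a common scale.
normalized <- sweep(counts, 2, sf, "/")
```

After this scaling, `s1` and `s2` carry identical values, because they differed only in depth.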
Here we read in our tab-delimited file for the HTSeq runs of the KRY85 mutant with ethanol and N2 with auxin replicates so that it becomes an object in R. The file is then read into KRY85_N2auxin_Counts.
text_KRY85_N2auxin<-read.table("text_KRY85_N2auxin.txt", header = TRUE, sep="\t")
KRY85_N2auxin_Counts<-DESeqDataSetFromHTSeqCount(sampleTable = text_KRY85_N2auxin, directory = "Count Tables", design = ~ Condition)
Next we normalize the data to account for differences in the number of sequencing reads between samples of the KRY85 mutant with ethanol and N2 with auxin data sets, and name the resulting transformed data set Normal_KRY85_N2auxin_Counts.
Normal_KRY85_N2auxin_Counts <- vst(KRY85_N2auxin_Counts, blind = FALSE)
Here we read in our tab-delimited file for the HTSeq runs of the KRY85 mutant with ethanol and N2 with only ethanol replicates so that it becomes an object in R. The file is then read into KRY85_N2EtOH_Counts.
text_KRY85_N2EtOH<-read.table("text_KRY85_N2EtOH.txt", header = TRUE, sep="\t")
KRY85_N2EtOH_Counts<-DESeqDataSetFromHTSeqCount(sampleTable = text_KRY85_N2EtOH, directory = "Count Tables", design = ~ Condition)
Next we normalize the data to account for differences in the number of sequencing reads between samples of the KRY85 mutant with ethanol and N2 with only ethanol data sets, and name the resulting transformed data set Normal_KRY85_N2EtOH_Counts.
Normal_KRY85_N2EtOH_Counts <- vst(KRY85_N2EtOH_Counts, blind = FALSE)
Here we read in our tab-delimited file for the HTSeq runs of the KRY85 mutant with ethanol and N2 with control RNAi replicates so that it becomes an object in R. The file is then read into KRY85_RNAi_Counts.
text_KRY85_RNAi<-read.table("text_KRY85_RNAi.txt", header = TRUE, sep="\t")
KRY85_RNAi_Counts<-DESeqDataSetFromHTSeqCount(sampleTable = text_KRY85_RNAi, directory = "Count Tables", design = ~ Condition)
Next we normalize the data to account for differences in the number of sequencing reads between samples of the KRY85 mutant with ethanol and N2 with control RNAi data sets, and name the resulting transformed data set Normal_KRY85_RNAi_Counts.
Normal_KRY85_RNAi_Counts <- vst(KRY85_RNAi_Counts, blind = FALSE)
Here we read in our tab-delimited file for the HTSeq runs of the N2 with auxin and N2 with only ethanol replicates so that it becomes an object in R. The file is then read into N2auxin_N2EtOH_Counts.
text_N2auxin_N2EtOH<-read.table("text_N2auxin_N2EtOH.txt", header = TRUE, sep="\t")
N2auxin_N2EtOH_Counts<-DESeqDataSetFromHTSeqCount(sampleTable = text_N2auxin_N2EtOH, directory = "Count Tables", design = ~ Condition)
Next we normalize the data to account for differences in the number of sequencing reads between samples of the N2 with auxin and N2 with only ethanol data sets, and name the resulting transformed data set Normal_N2auxin_N2EtOH_Counts.
Normal_N2auxin_N2EtOH_Counts <- vst(N2auxin_N2EtOH_Counts, blind = FALSE)
Here we read in our tab-delimited file for the HTSeq runs of the N2 with auxin and N2 with control RNAi replicates so that it becomes an object in R. The file is then read into N2auxin_RNAi_Counts.
text_N2auxin_RNAi<-read.table("text_N2auxin_RNAi.txt", header = TRUE, sep="\t")
N2auxin_RNAi_Counts<-DESeqDataSetFromHTSeqCount(sampleTable = text_N2auxin_RNAi, directory = "Count Tables", design = ~ Condition)
Next we normalize the data to account for differences in the number of sequencing reads between samples of the N2 with auxin and N2 with control RNAi data sets, and name the resulting transformed data set Normal_N2auxin_RNAi_Counts.
Normal_N2auxin_RNAi_Counts <- vst(N2auxin_RNAi_Counts, blind = FALSE)
Here we read in our tab-delimited file for the HTSeq runs of the N2 with only ethanol and N2 with control RNAi replicates so that it becomes an object in R. The file is then read into N2EtOH_RNAi_Counts.
text_N2EtOH_RNAi<-read.table("text_N2EtOH_RNAi.txt", header = TRUE, sep="\t")
N2EtOH_RNAi_Counts<-DESeqDataSetFromHTSeqCount(sampleTable = text_N2EtOH_RNAi, directory = "Count Tables", design = ~ Condition)
Next we normalize the data to account for differences in the number of sequencing reads between samples of the N2 with only ethanol and N2 with control RNAi data sets, and name the resulting transformed data set Normal_N2EtOH_RNAi_Counts.
Normal_N2EtOH_RNAi_Counts <- vst(N2EtOH_RNAi_Counts, blind = FALSE)
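The six read-and-normalize steps above repeat the same three calls with different names, so they could be wrapped in one helper. This is a sketch under assumptions rather than the code that was run: `load_comparison` and `comparison_names` are hypothetical names, and the guard at the end simply skips the work when DESeq2 or the sample tables are not available.

```r
# Names of the six pairwise comparisons, matching the text_<name>.txt files.
comparison_names <- c("KRY85_N2auxin", "KRY85_N2EtOH", "KRY85_RNAi",
                      "N2auxin_N2EtOH", "N2auxin_RNAi", "N2EtOH_RNAi")

# Read one sample table, build the count object, and apply vst to it.
load_comparison <- function(name, directory = "Count Tables") {
  sample_table <- read.table(paste0("text_", name, ".txt"),
                             header = TRUE, sep = "\t")
  counts <- DESeq2::DESeqDataSetFromHTSeqCount(sampleTable = sample_table,
                                               directory = directory,
                                               design = ~ Condition)
  list(counts = counts,
       normalized = DESeq2::vst(counts, blind = FALSE))
}

# Run only when DESeq2 is installed and every sample table is present.
if (requireNamespace("DESeq2", quietly = TRUE) &&
    all(file.exists(paste0("text_", comparison_names, ".txt")))) {
  comparisons <- lapply(setNames(comparison_names, comparison_names),
                        load_comparison)
}
```

With this layout, `comparisons$KRY85_RNAi$normalized` would play the role of Normal_KRY85_RNAi_Counts.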
The code below looks at our normalized data sets in a PCA, allowing us to tell whether they cluster in the manner we expect, and thus to verify whether each control group’s samples are similar or whether they cluster oddly and may have large internal variation. In the latter case the samples would need to be investigated to determine whether the outlying samples are too divergent for good analysis to be performed, and so need to be excluded from or otherwise accounted for in the DESeq2 analysis, or whether the samples of all the control groups are simply so similar that negligible expression differences in a few samples caused them not to cluster with the rest. The intgroup argument determines how the points are colored so we can tell which samples come from the same group.
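As a rough illustration of what the PCA computes, including the percentage of variance reported on each axis, here is a base-R sketch using `prcomp` on invented data. It is not the plotPCA implementation itself, and the matrix `expr` and its group labels are made up for the example.

```r
set.seed(1)

# Invented, already-normalized expression values: 50 genes x 6 samples,
# with the last three samples shifted to mimic a second control group.
expr <- matrix(rnorm(300), nrow = 50,
               dimnames = list(paste0("gene", 1:50),
                               c("A_1", "A_2", "A_3", "B_1", "B_2", "B_3")))
expr[1:10, 4:6] <- expr[1:10, 4:6] + 4

# prcomp treats rows as observations, so transpose to put samples in rows.
pca <- prcomp(t(expr))

# Share of the total variance captured by each principal component, in percent.
percent_var <- 100 * pca$sdev^2 / sum(pca$sdev^2)

# Coordinates of each sample on the first two components, with group labels.
scores <- data.frame(pca$x[, 1:2],
                     Condition = rep(c("A", "B"), each = 3))
```

The first entries of `percent_var` correspond to the percentages printed on the PC1 and PC2 axes of a PCA plot.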
Here we plot the normalized counts for the KRY85 and N2 with auxin comparison. The first two N2 with auxin samples and the first two KRY85 with ethanol samples group close together, as expected. However, the third sample from each group is a major outlier. Because this is only a PCA plot, we do not know the cause, nor whether the difference comes from an abnormality that will affect the DESeq2 analysis, from a repeatable difference relevant to differential expression in the control groups that future studies would need to account for, or merely from random chance. We will therefore investigate this and, if necessary, rerun the DESeq2 analysis of the groups without their third members.
plotPCA(Normal_KRY85_N2auxin_Counts, intgroup=c("Condition", "Biological.Replicate"))
Here we plot the normalized counts for the KRY85 and N2 with ethanol comparison. The third KRY85 sample continues to show major variance from the other samples, while the other two KRY85 samples cluster together as expected. The N2 with ethanol samples appear fairly well aligned, all at nearly the same location on the PC1 axis, which accounts for most of the difference, but the third sample still varies from the rest along the PC2 axis. Therefore, the N2 with ethanol group may also require investigation.
plotPCA(Normal_KRY85_N2EtOH_Counts, intgroup=c("Condition", "Biological.Replicate"))
Here we plot the normalized counts for the KRY85 and N2 with control RNAi comparison. The samples of each control group cluster closely together along the PC1 axis, which accounts for a full 77% of the variance (as opposed to PC2’s mere 11%). We can therefore conclude that the samples within each group are highly similar to each other, and more similar to each other than to those of the other group. This is as expected, so we may proceed to DESeq2 without further investigation, as no significant outliers appear to exist in this data.
plotPCA(Normal_KRY85_RNAi_Counts, intgroup=c("Condition", "Biological.Replicate"))
Here we plot the normalized counts for the N2 with auxin and N2 with only ethanol comparison. The N2 with auxin samples all align along the PC1 axis, which accounts for 87% of the variance, so they cluster well as expected and this group is unlikely to require alteration for this comparison. The N2 with ethanol group, however, has one sample aligned along PC1 with the N2 with auxin samples rather than with its own group. Whether this is an outlier data point, or whether the two groups are simply so similar that small differences in expression could drastically alter a sample’s alignment, will require investigation.
plotPCA(Normal_N2auxin_N2EtOH_Counts, intgroup=c("Condition", "Replicate"))
Here we plot the normalized counts for the N2 with auxin and N2 with control RNAi comparison. Each control group’s samples cluster closely with each other along the PC1 axis. These are the expected results and there do not seem to be any outliers, so we may proceed to DESeq2 with this data.
plotPCA(Normal_N2auxin_RNAi_Counts, intgroup=c("Condition", "Biological.Replicate"))
Here we plot the normalized counts for the N2 with only ethanol and N2 with control RNAi comparison. This shows that samples from each control group cluster with others of that group along the PC1 axis, which accounts for 78% of the variance, and the N2 with only ethanol samples also cluster well along the PC2 axis. This suggests that neither group possesses outliers that would need to be accounted for in DESeq2 and that samples are largely similar within control groups, as expected. We may then proceed to DESeq2 without further alterations to the data.
plotPCA(Normal_N2EtOH_RNAi_Counts, intgroup=c("Condition", "Replicate"))
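Judging outliers by eye from the plots could be supplemented with a simple numeric check. The sketch below uses invented coordinates laid out like the data frame that `plotPCA(..., returnData = TRUE)` returns (columns PC1, PC2, group, and name); the values, sample names, and the helper `dist_to_group` are all hypothetical. It measures how far each sample sits from the other replicates of its own group.

```r
# Invented PCA coordinates in the layout returned by
# plotPCA(..., returnData = TRUE): columns PC1, PC2, group and name.
pca_data <- data.frame(
  PC1 = c(-10, -9, 6, 8, 9, -4),
  PC2 = c(1, -1, 7, 0, 1, -8),
  group = rep(c("KRY85", "auxin"), each = 3),
  name = c("KRY85_1", "KRY85_2", "KRY85_3", "auxin_1", "auxin_2", "auxin_3")
)

# For each sample, the Euclidean distance in the PC1/PC2 plane to the
# centroid of the other replicates in the same group.
dist_to_group <- function(d) {
  vapply(seq_len(nrow(d)), function(i) {
    same_group <- d$group == d$group[i] & seq_len(nrow(d)) != i
    others <- d[same_group, c("PC1", "PC2")]
    sqrt(sum((unlist(d[i, c("PC1", "PC2")]) - colMeans(others))^2))
  }, numeric(1))
}

pca_data$distance <- dist_to_group(pca_data)

# Samples ordered from most to least separated from their own group.
ranked <- pca_data[order(-pca_data$distance), c("name", "distance")]
```

With only three replicates per group this is a ranking aid, not a formal outlier test, so any flagged sample would still need the with-and-without comparison described in the plan.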
##### Running DESeq2 #####
Next we calculate the differential expression for all comparison data sets.
KRY85_N2auxin_Diff_Expression <- DESeq(KRY85_N2auxin_Counts)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
KRY85_N2EtOH_Diff_Expression <- DESeq(KRY85_N2EtOH_Counts)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
KRY85_RNAi_Diff_Expression <- DESeq(KRY85_RNAi_Counts)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
N2auxin_N2EtOH_Diff_Expression <- DESeq(N2auxin_N2EtOH_Counts)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
N2auxin_RNAi_Diff_Expression <- DESeq(N2auxin_RNAi_Counts)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
N2EtOH_RNAi_Diff_Expression <- DESeq(N2EtOH_RNAi_Counts)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
The following code retrieves the results of the differential expression calculations, with contrast taking as arguments the name of the column holding the conditions and the names of the two condition types to compare. This assigns a p-value to the difference in expression for each gene and provides a table containing these results.
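The summaries that follow report adjusted p-values rather than raw ones. As a small illustration of why that matters, the base-R sketch below applies the Benjamini-Hochberg adjustment, which is the default method in DESeq2's `results`, to a set of invented raw p-values.

```r
# Invented raw p-values for eight genes.
p_raw <- c(0.0001, 0.004, 0.019, 0.06, 0.2, 0.45, 0.6, 0.9)

# Benjamini-Hochberg adjustment controls the false discovery rate
# across all genes tested at once.
p_adj <- p.adjust(p_raw, method = "BH")

# Genes called significant at the 0.1 threshold used in this report.
significant <- p_adj < 0.1

# For contrast: genes that would pass the same threshold without adjustment.
passes_raw <- p_raw < 0.1
```

In this toy set one gene passes the raw threshold but not the adjusted one, which is the kind of likely false positive the adjustment is meant to remove.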
This code retrieves the differential expression results for the KRY85 with ethanol vs N2 with auxin comparison using all available samples. About 3.3% of genes with a nonzero read count (895 genes, significantly up- or downregulated) show an adjusted p-value less than 0.1, indicating a substantial difference in expression and suggesting that these two control methods may be drastically different. To verify this, the differential expression analysis will also need to be done with just the two closely clustering samples from each group, to confirm that these results are not purely due to an outlying sample in each group.
KRY85_N2auxin_results_table <- results(KRY85_N2auxin_Diff_Expression, contrast = c("Condition", "KRY85", "auxin"))
summary(KRY85_N2auxin_results_table)
##
## out of 26819 with nonzero total read count
## adjusted p-value < 0.1
## LFC > 0 (up) : 244, 0.91%
## LFC < 0 (down) : 651, 2.4%
## outliers [1] : 132, 0.49%
## low counts [2] : 11602, 43%
## (mean count < 3)
## [1] see 'cooksCutoff' argument of ?results
## [2] see 'independentFiltering' argument of ?results
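The totals quoted from these summaries can be recomputed from a results table directly. The sketch below uses an invented data frame with the `log2FoldChange` and `padj` columns that a DESeq2 results table contains; the gene names and values are made up. For the summary above, the same arithmetic gives 244 + 651 = 895 of 26819 genes, or about 3.3%.

```r
# Invented stand-in for a DESeq2 results table.
res <- data.frame(
  log2FoldChange = c(2.1, -1.4, 0.3, -3.0, 0.8, -0.2),
  padj = c(0.01, 0.04, 0.60, 0.002, NA, 0.09),
  row.names = paste0("gene", 1:6)
)

# padj is NA for genes removed by outlier or low-count filtering,
# so those genes must be excluded before comparing to the threshold.
is_sig <- !is.na(res$padj) & res$padj < 0.1
n_up <- sum(is_sig & res$log2FoldChange > 0)
n_down <- sum(is_sig & res$log2FoldChange < 0)

# Percentage of significant genes; all six toy genes are assumed to have
# a nonzero total read count, as in the denominator used by summary().
pct_sig <- 100 * (n_up + n_down) / nrow(res)
```

Handling the `NA` values explicitly matters because a bare `res$padj < 0.1` would propagate them into the counts.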
This code retrieves the differential expression results for the KRY85 with ethanol vs N2 with only ethanol comparison using all available samples. This also shows a large number of significantly differentially expressed genes (880 in total), which seems to indicate that the KRY85 with ethanol worms had vastly different expression from the N2 worms treated with only ethanol. This could indicate that these control groups are vastly different with regard to nhr-25, but we must first investigate the effect on this outcome of the KRY85:2 sample, which the PCA indicated may be an outlier, to demonstrate that our findings are not merely the result of an outlier. To do this we will rerun the DESeq2 analysis of these two control groups excluding the KRY85:2 sample.
KRY85_N2EtOH_results_table <- results(KRY85_N2EtOH_Diff_Expression, contrast = c("Condition", "KRY85", "N2_EtOH"))
summary(KRY85_N2EtOH_results_table)
##
## out of 26509 with nonzero total read count
## adjusted p-value < 0.1
## LFC > 0 (up) : 206, 0.78%
## LFC < 0 (down) : 674, 2.5%
## outliers [1] : 116, 0.44%
## low counts [2] : 10958, 41%
## (mean count < 3)
## [1] see 'cooksCutoff' argument of ?results
## [2] see 'independentFiltering' argument of ?results
This code retrieves the differential expression results for the KRY85 with ethanol vs N2 with control RNAi comparison using all available samples. This shows a massive number of significantly differentially expressed genes (10987 in total), which seems to indicate that the KRY85 with ethanol worms had vastly different expression from the N2 worms fed E. coli with the control RNAi. This could indicate that these control groups are vastly different with regard to nhr-25, but we must look through the list of differentially expressed genes to be sure. Regardless, with so many genes differentially expressed, these control groups are very unlikely to be equivalent negative controls for a study of nhr-25, with at least one of them failing to accurately represent the wild type as intended by its use as a negative control. Given KRY85’s significant differences from all other controls studied here (including N2 with auxin and N2 with only ethanol), it seems likely that, if the differences are shown to include gene sets affected by nhr-25, the KRY85 strain alters the function or expression of nhr-25 in a fashion that does not occur in N2 wild types.
KRY85_RNAi_results_table <- results(KRY85_RNAi_Diff_Expression, contrast = c("Condition", "KRY85", "RNAi"))
summary(KRY85_RNAi_results_table)
##
## out of 25079 with nonzero total read count
## adjusted p-value < 0.1
## LFC > 0 (up) : 5491, 22%
## LFC < 0 (down) : 5496, 22%
## outliers [1] : 106, 0.42%
## low counts [2] : 6093, 24%
## (mean count < 1)
## [1] see 'cooksCutoff' argument of ?results
## [2] see 'independentFiltering' argument of ?results
This code retrieves the differential expression results for the N2 with auxin vs N2 with only ethanol comparison using all available samples. This shows no significant expression differences between the two control groups when all samples are included, suggesting that they are equivalent controls and that any differences between N2 with auxin and the wild type are purely due to the presence of ethanol. This indicates that auxin itself has no detectable effect on the transcriptome of the worm, as expected. Still, as the second N2 with auxin sample clustered heavily with the N2 with only ethanol samples, we must ensure that the lack of difference between the two control groups is not purely due to this potential outlier by running this comparison again without the second N2 with auxin sample.
N2auxin_N2EtOH_results_table <- results(N2auxin_N2EtOH_Diff_Expression, contrast = c("Condition", "aux", "etoh"))
summary(N2auxin_N2EtOH_results_table)
##
## out of 27238 with nonzero total read count
## adjusted p-value < 0.1
## LFC > 0 (up) : 0, 0%
## LFC < 0 (down) : 0, 0%
## outliers [1] : 55, 0.2%
## low counts [2] : 0, 0%
## (mean count < 0)
## [1] see 'cooksCutoff' argument of ?results
## [2] see 'independentFiltering' argument of ?results
This code retrieves the differential expression results for the comparison of N2 with auxin vs N2 fed E. coli with control RNAi, using all available samples. There is a significant difference between the transcriptomes of the two groups, indicating that these controls are hardly equivalent. However, whether the differences between N2 with auxin and N2 with RNAi are due to off-target effects of the RNAi or to the fact that the RNAi samples came from drier culture plate environments than the other group’s samples cannot be determined. Regardless, this particular control RNAi data set is significantly differentially expressed relative to the N2 with auxin data set.
N2auxin_RNAi_results_table <- results(N2auxin_RNAi_Diff_Expression, contrast = c("Condition", "auxin", "RNAi"))
summary(N2auxin_RNAi_results_table)
##
## out of 26344 with nonzero total read count
## adjusted p-value < 0.1
## LFC > 0 (up) : 4954, 19%
## LFC < 0 (down) : 4015, 15%
## outliers [1] : 38, 0.14%
## low counts [2] : 7400, 28%
## (mean count < 1)
## [1] see 'cooksCutoff' argument of ?results
## [2] see 'independentFiltering' argument of ?results
This code retrieves the differential expression results for the comparison of N2 with ethanol vs N2 fed E. coli with control RNAi, using all available samples. This also shows significant differences across thousands of genes (as expected from the N2 with auxin vs N2 with control RNAi analysis). Once again, whether these differences are due to RNAi interference or to environmental differences between the control groups’ culture plates is unknown.
N2EtOH_RNAi_results_table <- results(N2EtOH_RNAi_Diff_Expression, contrast = c("Condition", "etoh", "rnai"))
summary(N2EtOH_RNAi_results_table)
##
## out of 26005 with nonzero total read count
## adjusted p-value < 0.1
## LFC > 0 (up) : 5382, 21%
## LFC < 0 (down) : 4530, 17%
## outliers [1] : 31, 0.12%
## low counts [2] : 6829, 26%
## (mean count < 1)
## [1] see 'cooksCutoff' argument of ?results
## [2] see 'independentFiltering' argument of ?results
Here we investigate the effects of the widely non-clustering samples in the KRY85 with ethanol and N2 with auxin comparison by excluding these samples from the analysis. First, we created a tab-delimited file for the KRY85 and N2 with auxin comparison, called text_KRY85_N2auxin_2, containing only the first two samples from the auxin group and omitting the second sample from the KRY85 group.
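The reduced sample table could be derived from the full one in R rather than edited by hand. The sketch below assumes hypothetical sample names, file names, and column names; only the output file name text_KRY85_N2auxin_2 and the choice of excluded replicates follow the description above.

```r
# Invented sample table in the same layout as text_KRY85_N2auxin.txt
# (sample names and file names here are assumptions).
sample_ids <- c("KRY85_1", "KRY85_2", "KRY85_3",
                "auxin_1", "auxin_2", "auxin_3")
samples <- data.frame(
  sampleName = sample_ids,
  fileName = paste0(sample_ids, ".txt"),
  Condition = rep(c("KRY85", "auxin"), each = 3),
  stringsAsFactors = FALSE
)

# Samples to exclude: the second KRY85 replicate and the third auxin replicate.
drop <- c("KRY85_2", "auxin_3")
samples_reduced <- samples[!samples$sampleName %in% drop, ]

# Save under a new name so the full table is kept for the original analysis.
out_file <- file.path(tempdir(), "text_KRY85_N2auxin_2.txt")
write.table(samples_reduced, out_file,
            sep = "\t", quote = FALSE, row.names = FALSE)
```

An alternative, if the full count object is already loaded, would be to subset its columns directly (for example `KRY85_N2auxin_Counts[, keep]` with a logical vector `keep`), since a DESeqDataSet can be indexed by sample.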
Here we read in our tab-delimited file for the HTSeq runs of the non-outlier replicates of the KRY85 mutant with ethanol and N2 with auxin groups so that it becomes an object in R. The file is then read into KRY85_N2auxin_2Counts.
text_KRY85_N2auxin_2<-read.table("text_KRY85_N2auxin_2.txt", header = TRUE, sep="\t")
KRY85_N2auxin_2Counts<-DESeqDataSetFromHTSeqCount(sampleTable = text_KRY85_N2auxin_2, directory = "Count Tables", design = ~ Condition)
Next we use KRY85_N2auxin_2Counts to run the differential expression analysis between the KRY85 and N2 with auxin groups without their outliers.
KRY85_N2auxin_2Diff_Expression <- DESeq(KRY85_N2auxin_2Counts)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
The following code retrieves the results of the differential expression calculations, with contrast taking as arguments the name of the column holding the conditions and the names of the two condition types to compare. This assigns a p-value to the difference in expression for each gene and provides a table containing the results for the KRY85 with ethanol and N2 with auxin comparison without the outliers.
It still shows significant differences between the two groups, with 3.2% of the genes with a nonzero read count (830 in total) showing significant differential expression at a significance level of 0.1. This verifies that our previous results were largely not the product of outlier samples in each group and that the KRY85 with ethanol control group has vastly different patterns of expression from the N2 with auxin control group.
KRY85_N2auxin_2results_table <- results(KRY85_N2auxin_2Diff_Expression, contrast = c("Condition", "KRY85", "auxin"))
summary(KRY85_N2auxin_2results_table)
##
## out of 26308 with nonzero total read count
## adjusted p-value < 0.1
## LFC > 0 (up) : 393, 1.5%
## LFC < 0 (down) : 437, 1.7%
## outliers [1] : 0, 0%
## low counts [2] : 11856, 45%
## (mean count < 6)
## [1] see 'cooksCutoff' argument of ?results
## [2] see 'independentFiltering' argument of ?results
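Similar totals with and without the excluded samples (895 and 830 genes) do not by themselves show that the same genes are being called significant. One way to check would be to compare the two gene sets directly. The sketch below uses invented gene identifiers; in practice each set would be the row names of a results table with an adjusted p-value below 0.1.

```r
# Invented significant-gene sets standing in for the two analyses.
sig_all <- c("geneA", "geneB", "geneC", "geneD", "geneE")
sig_reduced <- c("geneB", "geneC", "geneD", "geneF")

# Genes significant in both analyses, and those unique to each.
shared <- intersect(sig_all, sig_reduced)
only_all <- setdiff(sig_all, sig_reduced)
only_reduced <- setdiff(sig_reduced, sig_all)

# Jaccard index: shared genes divided by genes significant in either analysis.
jaccard <- length(shared) / length(union(sig_all, sig_reduced))
```

A Jaccard index near 1 would support the conclusion that the excluded samples did not drive the original result, while a low value would suggest the two analyses are finding different genes.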
Here we investigate the effects of the widely non-clustering sample in the KRY85 with ethanol and N2 with ethanol comparison by excluding this sample (KRY85:2) from the analysis. First, we created a tab-delimited file for the KRY85 and N2 with ethanol comparison, excluding the second sample from KRY85.
Here we read in our tab-delimited file for the HTSeq runs of the non-outlier replicates of the KRY85 mutant with ethanol and N2 with ethanol groups so that it becomes an object in R. The file is then read into KRY85_N2EtOH_2Counts.
text_KRY85_N2EtOH_2<-read.table("text_KRY85_N2EtOH_2.txt", header = TRUE, sep="\t")
KRY85_N2EtOH_2Counts<-DESeqDataSetFromHTSeqCount(sampleTable = text_KRY85_N2EtOH_2, directory = "Count Tables", design = ~ Condition)
Next we use KRY85_N2EtOH_2Counts to run the differential expression analysis between the KRY85 and N2 with ethanol groups without the outlier.
KRY85_N2EtOH_2Diff_Expression <- DESeq(KRY85_N2EtOH_2Counts)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
The following code retrieves the results of the differential expression calculations, with contrast taking as arguments the name of the column holding the conditions and the names of the two condition types to compare. This assigns a p-value to the difference in expression for each gene and provides a table containing the results for the KRY85 with ethanol and N2 with ethanol comparison without the outlier.
It still shows significant differences between the two groups, with 3.3% of the genes with a nonzero read count (864 in total) showing significant differential expression at a significance level of 0.1. This verifies that our previous results were largely not the product of an outlier sample and that the KRY85 with ethanol control group has vastly different patterns of expression from the N2 with ethanol control group.
KRY85_N2EtOH_2results_table <- results(KRY85_N2EtOH_2Diff_Expression, contrast = c("Condition", "KRY85", "N2_EtOH"))
summary(KRY85_N2EtOH_2results_table)
##
## out of 26169 with nonzero total read count
## adjusted p-value < 0.1
## LFC > 0 (up) : 182, 0.7%
## LFC < 0 (down) : 682, 2.6%
## outliers [1] : 3, 0.011%
## low counts [2] : 11296, 43%
## (mean count < 4)
## [1] see 'cooksCutoff' argument of ?results
## [2] see 'independentFiltering' argument of ?results
Here we investigate the effect on the differential expression analysis of the N2 with auxin sample (aux:2) that clustered closely with the N2 with ethanol samples in the PCA. The aim is to determine whether this oddly clustering sample is truly an outlier that led us to underestimate the expression differences between the groups, or whether the two control groups were so similar that a non-outlier sample from N2 with auxin clustered with the N2 with ethanol group. First, we created a tab-delimited file for the N2 with auxin and N2 with ethanol comparison, excluding the second N2 with auxin sample.
Here we read in our tab-delimited file for the HTSeq runs of the non-outlier replicates of the N2 with auxin and N2 with ethanol groups so that it becomes an object in R. The file is then read into N2auxin_N2EtOH_2Counts.
text_N2auxin_N2EtOH_2<-read.table("text_N2auxin_N2EtOH_2.txt", header = TRUE, sep="\t")
N2auxin_N2EtOH_2Counts<-DESeqDataSetFromHTSeqCount(sampleTable = text_N2auxin_N2EtOH_2, directory = "Count Tables", design = ~ Condition)
Next we use N2auxin_N2EtOH_2Counts to run the differential expression analysis between the N2 with auxin and N2 with ethanol groups with the potential outlier removed.
N2auxin_N2EtOH_2Diff_Expression <- DESeq(N2auxin_N2EtOH_2Counts)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
The following code retrieves the results of the differential expression calculations, with contrast taking as arguments the name of the column holding the conditions and the names of the two condition types to compare. This assigns a p-value to the difference in expression for each gene and provides a table containing the results for the N2 with auxin and N2 with ethanol comparison without the potential outlier.
This code retrieves the differential expression results for the N2 with auxin vs N2 with only ethanol comparison using all available samples except the second N2 with auxin sample. It once again shows no significantly differentially expressed genes, indicating that the N2 with auxin and N2 with ethanol groups are indeed nearly identical in their transcriptomes. The second N2 with auxin sample was therefore likely clustering with the N2 with ethanol samples only because of the great similarity between all samples, and the separation between the two groups in the PCA was due only to insignificantly small differences between them.
N2auxin_N2EtOH_2results_table <- results(N2auxin_N2EtOH_2Diff_Expression, contrast = c("Condition", "aux", "etoh"))
summary(N2auxin_N2EtOH_2results_table)
##
## out of 26723 with nonzero total read count
## adjusted p-value < 0.1
## LFC > 0 (up) : 0, 0%
## LFC < 0 (down) : 0, 0%
## outliers [1] : 2, 0.0075%
## low counts [2] : 0, 0%
## (mean count < 0)
## [1] see 'cooksCutoff' argument of ?results
## [2] see 'independentFiltering' argument of ?results
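The tally that summary() reports can also be checked by hand. The sketch below uses a small made-up data frame in place of the results table (the values are illustrative only, not taken from this analysis) and shows that genes with an NA adjusted p-value, which DESeq2 assigns to genes flagged as count outliers or removed by independent filtering, have to be excluded when counting.

```r
# Toy stand-in for a DESeq2 results table; values are invented.
toy_results <- data.frame(
  log2FoldChange = c(0.4, -0.2, 1.1, -0.9, 0.05),
  padj           = c(0.62, 0.91, NA, 0.35, 0.88),
  row.names      = c("geneA", "geneB", "geneC", "geneD", "geneE")
)

alpha <- 0.1  # same adjusted p-value threshold that summary() reported

# which() drops NA comparisons, so genes with NA padj are not counted.
sig_rows <- which(toy_results$padj < alpha)
n_up   <- sum(toy_results$log2FoldChange[sig_rows] > 0)
n_down <- sum(toy_results$log2FoldChange[sig_rows] < 0)

cat("up:", n_up, "down:", n_down,
    "NA padj:", sum(is.na(toy_results$padj)), "\n")
```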
Below we visually represent the data from each comparison, excluding the comparisons done to verify the accuracy of another comparison, as each of those only confirmed the original comparison. This is helpful in viewing how expression differs between each pair of control groups. We do so with volcano plots.
This code installs the package used to create volcano plots.
if (!requireNamespace('BiocManager', quietly = TRUE))
    install.packages('BiocManager')
BiocManager::install('EnhancedVolcano')
## Bioconductor version 3.8 (BiocManager 1.30.4), R 3.5.1 (2018-07-02)
## Installing package(s) 'EnhancedVolcano'
##
## The downloaded binary packages are in
## /var/folders/tg/rszhgl1x3t7gv69lz9tp0n1x9z37y1/T//Rtmp2Ijzl2/downloaded_packages
## installation path not writeable, unable to update packages: class,
## cluster, codetools, foreign, lattice, MASS, Matrix, mgcv, nlme, survival
library(EnhancedVolcano)
## Warning: package 'EnhancedVolcano' was built under R version 3.5.2
## Loading required package: ggplot2
## Loading required package: ggrepel
This code generates a volcano plot for the KRY85 and N2 with auxin comparison with an adjusted p-value cutoff of 0.10 for significance. It shows the large number of differentially expressed genes expected from the DESeq2 analysis, with especially strong downregulation of genes in KRY85 when compared to N2 with auxin.
EnhancedVolcano(KRY85_N2auxin_results_table,
    lab = rownames(KRY85_N2auxin_results_table),
    x = 'log2FoldChange',
    y = 'padj',
    xlim = c(-8, 8),
    title = 'KRY85 vs N2 with Auxin Expression',
    pCutoff = .1,
    FCcutoff = 1.5,
    transcriptPointSize = 1.5,
    transcriptLabSize = 3.0)
## Warning: Removed 26963 rows containing missing values (geom_point).
This code generates a volcano plot for the KRY85 and N2 with ethanol only comparison with an adjusted p-value cutoff of 0.10 for significance. It shows the large number of differentially expressed genes expected from the DESeq2 analysis, with especially strong downregulation of genes in KRY85 when compared to N2 with ethanol. ZK938.3 again shows extremely high significance compared to other genes. I therefore looked up its function on WormBase and found that it appears to code for an ATP-binding protein. This is interesting because nhr-25’s human homolog SF1 affects the liver and gonads by altering the glycolytic pathway, which produces ATP. This may suggest that the KRY85 control group is experiencing glycolytic pathway differences from the N2 with ethanol and N2 with auxin groups, possibly as a result of nhr-25 loss of function in the KRY85 with ethanol control group due to the translational fusion of AID onto the nhr-25 protein.
EnhancedVolcano(KRY85_N2EtOH_results_table,
    lab = rownames(KRY85_N2EtOH_results_table),
    x = 'log2FoldChange',
    y = 'padj',
    xlim = c(-8, 8),
    title = 'KRY85 vs N2 with EtOH only Expression',
    pCutoff = .1,
    FCcutoff = 1.5,
    transcriptPointSize = 1.5,
    transcriptLabSize = 3.0)
## Warning: Removed 26613 rows containing missing values (geom_point).
This code generates a volcano plot for the KRY85 with only ethanol and control RNAi comparison with an adjusted p-value cutoff of 0.10 for significance. It shows the same kind of massive differences in expression as were seen in the initial DESeq2 analysis of the sets. A large number of genes appear to be upregulated in the KRY85 set when compared to the RNAi group. The source of this is unknown but will likely be shown in the GOterm analysis.
EnhancedVolcano(KRY85_RNAi_results_table,
    lab = rownames(KRY85_RNAi_results_table),
    x = 'log2FoldChange',
    y = 'padj',
    xlim = c(-8, 8),
    title = 'KRY85 vs N2 with control RNAi Expression',
    pCutoff = .1,
    FCcutoff = 1.5,
    transcriptPointSize = 1.5,
    transcriptLabSize = 3.0)
## Warning: Removed 23323 rows containing missing values (geom_point).
## Warning: Removed 155 rows containing missing values (geom_text).
This code generates a volcano plot for the N2 with auxin and N2 with ethanol only comparison with an adjusted p-value cutoff of 0.10 for significance. Like the DESeq2 analysis on which this graph is based, it shows no significant differential expression between the two groups. Additionally, the log fold changes are minimal (always within 5 of 0) compared to the other comparisons, so the N2 with auxin and N2 with ethanol transcriptomes may be even more similar than the DESeq2 analysis alone suggested.
EnhancedVolcano(N2auxin_N2EtOH_results_table,
    lab = rownames(N2auxin_N2EtOH_results_table),
    x = 'log2FoldChange',
    y = 'padj',
    xlim = c(-8, 8),
    title = 'N2 with Auxin vs N2 with Ethanol only Expression',
    pCutoff = .1,
    FCcutoff = 1.5,
    transcriptPointSize = 1.5,
    transcriptLabSize = 3.0)
## Warning: Removed 14865 rows containing missing values (geom_point).
This code generates a volcano plot for the N2 with auxin and N2 with control RNAi comparison with an adjusted p-value cutoff of 0.10 for significance. As in the DESeq2 analysis, it shows that the groups have vast and varied differences in expression, with some genes heavily upregulated in the N2 with auxin group and others downregulated when compared to the N2 with control RNAi group.
EnhancedVolcano(N2auxin_RNAi_results_table,
    lab = rownames(N2auxin_RNAi_results_table),
    x = 'log2FoldChange',
    y = 'padj',
    xlim = c(-8, 8),
    title = 'N2 with Auxin vs N2 with control RNAi Expression',
    pCutoff = .1,
    FCcutoff = 1.5,
    transcriptPointSize = 1.5,
    transcriptLabSize = 3.0)
## Warning: Removed 23192 rows containing missing values (geom_point).
## Warning: Removed 50 rows containing missing values (geom_text).
This code generates a volcano plot for the N2 with ethanol only and N2 with control RNAi comparison with an adjusted p-value cutoff of 0.10 for significance. As expected given their similarity, this comparison appears nearly identical to the N2 with auxin vs N2 with RNAi comparison and conveys the same message of vast transcriptional differences between the two data sets. Only further analysis will reveal the types of genes involved in these differences and allow us to guess whether they are due to off-target effects of the control RNAi, other effects of the RNAi on the RNAi pathway, or environmental differences between the two sample groups.
EnhancedVolcano(N2EtOH_RNAi_results_table,
    lab = rownames(N2EtOH_RNAi_results_table),
    x = 'log2FoldChange',
    y = 'padj',
    xlim = c(-8, 8),
    title = 'N2 with Ethanol only vs N2 with control RNAi Expression',
    pCutoff = .1,
    FCcutoff = 1.5,
    transcriptPointSize = 1.5,
    transcriptLabSize = 3.0)
## Warning: Removed 22955 rows containing missing values (geom_point).
## Warning: Removed 52 rows containing missing values (geom_text).
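The "Removed ... rows containing missing values" warnings after each plot most likely come from genes that have an NA adjusted p-value or a log2 fold change outside the x-axis limits of -8 to 8. The sketch below counts such rows in a toy table. The values are invented for illustration, and the exact accounting inside the plotting code is an assumption rather than something verified against these data.

```r
# Toy results table; values are invented for illustration.
toy_results <- data.frame(
  log2FoldChange = c(2.3, -9.1, 0.7, NA, -1.8, 8.4),
  padj           = c(0.01, 0.03, NA, NA, 0.20, NA)
)

x_limits <- c(-8, 8)  # same x-axis limits passed to EnhancedVolcano above

# Rows with no usable value on either axis.
missing_value <- is.na(toy_results$log2FoldChange) | is.na(toy_results$padj)

# Rows with usable values that fall outside the x-axis limits.
outside_xlim <- !missing_value &
  (toy_results$log2FoldChange < x_limits[1] |
   toy_results$log2FoldChange > x_limits[2])

# Rows that a plot restricted to these limits could not draw.
n_undrawable <- sum(missing_value | outside_xlim)
cat("missing:", sum(missing_value),
    "outside x limits:", sum(outside_xlim),
    "total:", n_undrawable, "\n")
```

Running the same two checks on a real results table would show how many of the removed rows are simply genes without an adjusted p-value, as opposed to genes with extreme fold changes that the axis limits hide.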
####Excel Data - look at nhr-25###
Next we prepare the differential expression results of each comparison for export as tab-delimited files showing, for each gene, the relative expression between the two control groups in the comparison as well as that comparison’s p-value. The genes are listed from top to bottom in each table in order of increasing p-value.
KRY85_N2auxin_tab <- KRY85_N2auxin_results_table[order(KRY85_N2auxin_results_table$pvalue),]
KRY85_N2EtOH_tab <- KRY85_N2EtOH_results_table[order(KRY85_N2EtOH_results_table$pvalue),]
KRY85_RNAi_tab <- KRY85_RNAi_results_table[order(KRY85_RNAi_results_table$pvalue),]
N2auxin_N2EtOH_tab <- N2auxin_N2EtOH_results_table[order(N2auxin_N2EtOH_results_table$pvalue),]
N2auxin_RNAi_tab <- N2auxin_RNAi_results_table[order(N2auxin_RNAi_results_table$pvalue),]
N2EtOH_RNAi_tab <- N2EtOH_RNAi_results_table[order(N2EtOH_RNAi_results_table$pvalue),]
Below we trim each ordered table of differential expression results to exclude genes without significant differences in the comparison, keeping only those genes with an adjusted p-value below 0.10. We then write each table as ‘Group1_Group2_siggenes.txt’ (for example, the KRY85 vs N2 with auxin comparison has a table called KRY85_N2auxin_siggenes.txt).
KRY85_N2auxin_tab_trimed <- subset(KRY85_N2auxin_tab, padj < .1)
write.table(as.data.frame(KRY85_N2auxin_tab_trimed), file="KRY85_N2auxin_siggenes.txt", sep= "\t")
KRY85_N2EtOH_tab_trimed <- subset(KRY85_N2EtOH_tab, padj < .1)
write.table(as.data.frame(KRY85_N2EtOH_tab_trimed), file="KRY85_N2EtOH_siggenes.txt", sep= "\t")
KRY85_RNAi_tab_trimed <- subset(KRY85_RNAi_tab, padj < .1)
write.table(as.data.frame(KRY85_RNAi_tab_trimed), file="KRY85_RNAi_siggenes.txt", sep= "\t")
N2auxin_N2EtOH_tab_trimed <- subset(N2auxin_N2EtOH_tab, padj < .1)
write.table(as.data.frame(N2auxin_N2EtOH_tab_trimed), file="N2auxin_N2EtOH_siggenes.txt", sep= "\t")
N2auxin_RNAi_tab_trimed <- subset(N2auxin_RNAi_tab, padj < .1)
write.table(as.data.frame(N2auxin_RNAi_tab_trimed), file="N2auxin_RNAi_siggenes.txt", sep= "\t")
N2EtOH_RNAi_tab_trimed <- subset(N2EtOH_RNAi_tab, padj < .1)
write.table(as.data.frame(N2EtOH_RNAi_tab_trimed), file="N2EtOH_RNAi_siggenes.txt", sep= "\t")
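The order, trim, and write steps above repeat the same three operations for every comparison, so they could be wrapped in one function. The sketch below is a hypothetical helper (write_sig_genes is not part of DESeq2 or of the original analysis) demonstrated on an invented toy table. It makes explicit that genes with an NA adjusted p-value are dropped, which subset() also does silently.

```r
# Hypothetical helper: sort by p-value, keep padj below a cutoff, write to file.
write_sig_genes <- function(results_table, out_file, padj_cutoff = 0.1) {
  results_df <- as.data.frame(results_table)
  ordered_df <- results_df[order(results_df$pvalue), ]
  # Genes with NA padj are dropped here, matching what subset() does.
  keep <- !is.na(ordered_df$padj) & ordered_df$padj < padj_cutoff
  sig_df <- ordered_df[keep, , drop = FALSE]
  write.table(sig_df, file = out_file, sep = "\t")
  invisible(sig_df)
}

# Toy input with invented values, standing in for a results table.
toy_results <- data.frame(
  pvalue = c(0.20, 0.001, 0.03, NA),
  padj   = c(0.45, 0.010, 0.08, NA),
  row.names = c("geneA", "geneB", "geneC", "geneD")
)
out_path <- tempfile(fileext = ".txt")
sig <- write_sig_genes(toy_results, out_path)
print(rownames(sig))
```

With the real objects, a call such as write_sig_genes(KRY85_N2auxin_results_table, "KRY85_N2auxin_siggenes.txt") would be expected to reproduce the corresponding file written above.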
We then looked at each trimmed tab-delimited file in Excel to search for patterns and to see whether nhr-25 was significantly different between the samples and, if so, by how much.
The KRY85 vs N2 with auxin tab-delimited file shows 896 genes with an adjusted p-value below 0.1, indicating significant differences between the two data sets. Among the differentially expressed genes is nhr-25, with an adjusted p-value of 0.004 and a log2 fold change of 1.64. This means that nhr-25 is differentially expressed between two negative control groups for studies of the nhr-25 transcription factor, with more mRNA being produced from the gene in the KRY85 data set, and as such at least one of these controls is not a good negative control. The higher expression of nhr-25 in the KRY85 worms is possibly due to the protein being deficient and unable to trigger the negative feedback loops that would halt the body’s demand for the protein.
The KRY85 vs N2 with ethanol tab-delimited file shows 881 genes with an adjusted p-value below 0.1, indicating significant differences between the two data sets. Among the differentially expressed genes, possibly for the reason stated above, is nhr-25, which has a log2 fold change of 1.49 and an adjusted p-value of 0.008. This indicates that nhr-25 is heavily overexpressed in the KRY85 control group when compared to the N2 with only ethanol group. Either the N2 with auxin and N2 with ethanol groups are poor negative controls (which seems unlikely, as both separately arrived at nearly identical results) or the KRY85 strain is a poor negative control when investigating nhr-25.
The KRY85 vs N2 with control RNAi tab-delimited file shows 10988 genes with an adjusted p-value below 0.1 (which, given that the nematode has about 20,000 genes, is about half the genome). The difference between the two data sets is so large that it is unlikely to be caused solely by differences in nhr-25 function and/or ethanol exposure, and as such environmental differences appear to have a significant effect on the transcriptome of the RNAi group when compared to the other groups. Among the differentially expressed genes is nhr-25, which was upregulated in KRY85 with an adjusted p-value of 0.082.
The N2 with auxin vs N2 with ethanol tab-delimited file is empty, indicating that the two control groups do not differ in their transcriptomes and that if one is a good negative control the other should be as well.
The N2 with auxin vs N2 with RNAi tab-delimited file shows 8970 genes with an adjusted p-value below 0.1, indicating significant differences between the two data sets. Among the differentially expressed genes, possibly for the reason stated above, is nhr-25, which has a log2 fold change of -0.928 and an adjusted p-value of 0.03. This suggests that nhr-25 is less expressed in the N2 with auxin group than in the N2 with control RNAi group and that at least one of them makes a poor negative control group.
The N2 with only ethanol vs N2 with RNAi tab-delimited file shows 9913 genes with an adjusted p-value below 0.1, indicating significant differences between the two data sets (and some minor differences between the N2 with auxin and N2 with ethanol groups, as the N2 with auxin group had fewer differentially expressed genes compared to the N2 with control RNAi group). Among the differentially expressed genes, possibly for the reason stated above, is nhr-25, which has a log2 fold change of -0.80 and an adjusted p-value of 0.055. This suggests that nhr-25 is less expressed in the N2 with ethanol group than in the N2 with control RNAi group and that at least one of them makes a poor negative control group.
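To put the nhr-25 log2 values reported above on a more intuitive scale, they can be converted to plain fold changes by raising 2 to each value. The short calculation below uses only the four numbers quoted in the text; the vector names are labels chosen here for illustration.

```r
# log2 fold changes for nhr-25 as reported in the text above.
nhr25_log2fc <- c(
  KRY85_vs_N2auxin = 1.64,
  KRY85_vs_N2EtOH  = 1.49,
  N2auxin_vs_RNAi  = -0.928,
  N2EtOH_vs_RNAi   = -0.80
)

# A log2 fold change of x corresponds to a 2^x ratio of expression.
nhr25_fold_change <- round(2^nhr25_log2fc, 2)
print(nhr25_fold_change)
```

This gives ratios of about 3.12 and 2.81 for the two KRY85 comparisons and about 0.53 and 0.57 for the two comparisons against the control RNAi group, so nhr-25 transcript levels differ by roughly two- to three-fold in every comparison where the gene is significant.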
For the N2 with auxin and RNAi comparison, I looked for significantly differentially expressed genes in RNAi regulation pathways whose mutation (and thus function) corresponds to changes in phenotypes regulated by nhr-25 (mostly sterility), as listed in “RNAi mechanisms in Caenorhabditis elegans” by Grishok. Only one gene of the set, rha-1, was found to be significantly differentially regulated in this comparison. drh-1, a Dicer-related helicase gene in C. elegans, was not differentially regulated either. This seems to indicate that the control RNAi was not significantly triggering RNAi pathway expression beyond the normal level. While this cannot rule out off-target effects, it suggests that the RNAi mechanisms are not being overwhelmed by the control RNAi, since no increase in the expression of RNAi mechanism genes occurred to counteract it.
For the N2 with only ethanol and RNAi comparison, drh-1 still showed no significant change; however, several RNAi pathway genes with effects on sterility and embryonic death were differentially regulated. For example, alg-1 and alg-2, which cause embryonic death in double mutants, were significantly downregulated in the control RNAi group, as were mes-4 (a germline RNAi gene associated with sterile progeny) and rrf-3 (whose loss leads to defects in sperm development). It is unclear whether these are random effects of the environmental differences and their wide-scale changes, or off-target effects of the control RNAi that for some reason did not occur in the N2 with auxin comparison. The latter seems unlikely, as the N2 with auxin and N2 with ethanol groups are extremely similar in transcriptome and environment, and likely identical in genetics.
For the KRY85 and RNAi comparison, drh-1 again did not show any differential expression, while alg-1 and alg-2 were downregulated in the RNAi data set, as was mes-6 (also a germline RNAi gene associated with progeny sterility). However, pgl-1 and rha-1 (germline RNAi genes whose mutants have temperature-dependent sterility) were upregulated in the control RNAi data set. It is possible that these seemingly conflicting changes arise because defects in the KRY85 strain’s nhr-25 decrease pgl-1 and rha-1 expression in its transcriptome, while RNAi off-target effects in the control RNAi group decrease its mes-6, alg-1, and alg-2 expression. However, it is also likely that environmental differences between the two samples caused the observed changes, or that a combination of these effects led to the observed outcome. RNAi off-target effects therefore seem possible, but additional and more carefully controlled groups will be required to determine this. It does appear, however, that the control RNAi is not hijacking the RNAi mechanisms by overworking them or binding to them at the expense of naturally produced siRNAs.
Next I compared the RNAi comparisons to a list of genes altered by exposure to ethanol, found in “Ethanol-response genes and their regulation analyzed by a microarray and comparative genomic approach in the nematode Caenorhabditis elegans” by J Kwon et al. Using the VennDiagram tool provided on Moodle and largely created by Vijayaraj Nagarajan, I input the list of significantly differentially expressed genes from the DESeq2 analysis of each comparison involving the control RNAi group and compared it to the set of ethanol-responsive genes. The aim was to see whether such genes were differentially expressed between the groups exposed to ethanol (all but the control RNAi group) and the group that was not (the RNAi group). Of course, as the RNAi may have off-target effects and the RNAi group has significant environmental differences from the other groups, which seem to have affected many genes across the genome, the results should be taken with a grain of salt.
The KRY85 and N2 with control RNAi comparison had 48 of 211 ethanol response genes differentially expressed.
The N2 with auxin and N2 with control RNAi comparison had 41 of 211 ethanol response genes differentially expressed.
The N2 with ethanol only and N2 with control RNAi comparison had 46 of 211 ethanol response genes differentially expressed.
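The three overlap counts above are easier to compare as percentages of the 211 ethanol response genes. The sketch below computes those percentages from the reported counts and then illustrates, with invented gene IDs, the set intersection that a Venn diagram tool performs; the gene names in the second half are placeholders and not real results.

```r
# Counts reported above: ethanol response genes that were differentially
# expressed in each comparison against the control RNAi group.
overlap_counts <- c(KRY85_vs_RNAi = 48, N2auxin_vs_RNAi = 41, N2EtOH_vs_RNAi = 46)
n_ethanol_genes <- 211

overlap_percent <- round(100 * overlap_counts / n_ethanol_genes, 1)
print(overlap_percent)

# The overlap itself is a set intersection. Gene IDs here are invented.
ethanol_genes <- c("gene1", "gene2", "gene3", "gene4")
sig_genes     <- c("gene2", "gene4", "gene9")
shared_genes  <- intersect(ethanol_genes, sig_genes)
print(shared_genes)
```

The percentages come to 22.7, 19.4, and 21.8, all under one quarter of the ethanol response gene list.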
A large number of ethanol response genes, but ultimately fewer than a fourth of them in any one comparison, were differentially expressed between the RNAi group and the ethanol groups. I also searched each RNAi comparison for Type 1 ethanol response genes, which would have activated upon immediate contact with ethanol and stayed active throughout exposure, and found that the differentially expressed genes rarely included them. It therefore appears that exposure to ethanol did not appreciably alter the transcriptome of the nematodes.
##GOterm via Wormbase##
Finally, we ran GOterm analysis through WormBase for each comparison except N2 with auxin vs N2 with ethanol only, which lacked any genes to find the ontology for.
First we ran GOterm analysis on the KRY85 with ethanol and N2 with auxin comparison. It largely contains genes devoted to a variety of functions, with little standing out except for large differences in dephosphorylation-related genes, which would suggest high activity of some sort occurring in one control group over the other (though as phosphorylation is involved in a variety of biological processes, it is hard to determine exactly what that might be or whether it is related to nhr-25). Additionally, genes related to the response to topologically incorrect protein are differentially regulated here and in the comparison between KRY85 and N2 with ethanol, and in no other comparison. This suggests that the KRY85 strain’s differences in expression of nhr-25 and many other genes may be due to the nhr-25 protein failing to fold correctly with the AID tail attached.
Next we ran GOterm analysis on the KRY85 and N2 with ethanol comparison. It is, as expected, highly similar to the KRY85 and N2 with auxin comparison and also shows activation of genes responding to topologically incorrect protein. This further supports the idea that the nhr-25 in KRY85 is altering expression between the otherwise identical groups because the AID tail prevents the protein from folding into a correct and fully functional form.
Next we ran GOterm analysis on the KRY85 and N2 with RNAi comparison. That cytoplasm-related genes are highly differentially regulated supports the idea that a lot of the differences in the transcriptome between the N2 with RNAi group and the others are due to the RNAi-exposed control group being in a far drier environment than the others and thus, because of osmotic pressure, requiring a different set of regulators to keep the cytoplasm at the correct water content compared to groups left in liquid medium.
Next we ran GOterm analysis on the N2 with auxin and N2 with RNAi comparison. This reveals not only that the N2 with auxin and N2 with RNAi groups differ heavily in transcription of many genes, including nhr-25, but also that those differences directly affect genes related to nhr-25-regulated phenotypic processes such as reproduction, post-embryonic organ development, and post-embryonic development. This makes it clear that these two groups are not in any way equivalent controls for studies of nhr-25, as they differ vastly in the transcription of genes behind the very processes one would examine for phenotypic differences between nhr-25 mutant and wild-type worms.
Next we ran GOterm analysis on the N2 with ethanol and N2 with RNAi comparison. This one is very similar to the N2 with auxin and N2 with RNAi GOterm analysis and leads to similar conclusions (see above). Of note, however, is that genes for structural constituents of the cuticle differ between the two groups. One would expect this in groups with vastly different environments, as the N2 with ethanol worms would need a far more hydrophobic cuticle than the RNAi group would need to devote resources to building, lest they become hyperhydrated and waterlogged.
Finally, we ran a GOterm analysis on just the genes upregulated in KRY85 in the KRY85 and N2 with auxin comparison. It shows that a large share of the upregulated genes are involved in responding to topologically incorrect proteins, suggesting that the KRY85 strain’s nhr-25 has failed to fold correctly as a result of the AID tail added to the protein. This would explain many of the differences in expression (including that of nhr-25) seen in the KRY85 comparisons.
From the above, we may first conclude that the N2 with ethanol and N2 with auxin groups, while not totally identical, show no significant differences in their transcriptomes and as such function as identical controls, with the presence of auxin in an N2 wild-type worm causing no transcriptional alterations. Additionally, despite the RNAi-exposed N2 worms differing vastly from the other N2 groups in environment and in the expression of thousands of genes, the RNAi-fed N2 group (which was not exposed to ethanol) showed differential expression in very few ethanol response genes when compared to the ethanol-exposed N2 with auxin and N2 with just ethanol groups. This indicates that ethanol exposure likely had little to no effect on the transcriptome of the worms. As such, the N2 with auxin and N2 with ethanol only negative control groups appear to be equivalent to a control of the wild type exposed to no chemicals, and their transcriptomes may be considered within baseline for the wild type for any experiment using the same petri dish environmental conditions as these worm groups were exposed to.
Because the RNAi-exposed N2 transcriptomes did not differ from the other control groups in expression levels for many of the RNAi pathway genes affecting processes influenced by nhr-25, we may conclude that the control RNAi is unlikely to have overworked the RNAi pathway proteins or caused them to preferentially bind it, at least for any proteins that would affect the phenotypic responses to loss of nhr-25. While this makes the RNAi negative control group more viable for use in experiments using nhr-25-targeting RNAi, it does not validate it as a good negative control group. The differences in transcription seen in the RNAi group were vast, possibly more so than can be accounted for purely by environmental differences, and so there remains the possibility of RNAi off-target effects. However, the confounding variables of a wetter environment and a less fat-rich diet in the control RNAi group compared to the others mean that our analysis could not determine whether off-target effects were occurring or whether all differences in transcription were purely based in environmental differences.
The KRY85 control group appears to be a very poor one. It differs in expression of many genes, including nhr-25, from the wild-type control groups, including the auxin- and ethanol-exposed N2 groups that were in similar environments to it, and it shows activation of genes that respond to topologically incorrect proteins. This indicates that the nhr-25 protein in the KRY85 strain is likely misfolded due to the addition of the transgenic AID tail. As nhr-25 is overexpressed in the KRY85 group when compared to the fairly wild-type-equivalent N2 with auxin and N2 with just ethanol groups, it appears that this misfolding is not being corrected properly, so any negative feedback pathways requiring the presence of nhr-25 to turn off are being left on, leading to the observed nhr-25 overexpression and vast transcriptional differences from the other N2 groups across the genome. As such, the KRY85 with ethanol group cannot function as a negative control, for its nhr-25 is at least partially non-functional and the strain will likely show the mutant phenotype rather than the wild-type phenotype it was meant to show.