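This report analyses the output of the gradient-pattern differential expression analysis in each of the four Flaveria species: the two C3 species (coded fp and fr below) and the two C4 species (coded ft and fb).
We define a function to load the DE data for each species and extract, for each gene, the assigned expression pattern and its probability (the <species>.pattern and <species>.prob columns used below).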
Now we can compute joint probabilities for our patterns. That is, we can calculate the probability that two potential patterns are both true (see http://sites.nicholas.duke.edu/statsreview/probability/jmc/).
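As a small worked illustration of this calculation: if the two pattern calls are treated as independent (an assumption made here for illustration), the joint probability is the product of the individual probabilities. The values below are made up and are not taken from the data.

```r
# Hypothetical probabilities for the same pattern call in two species
p.a <- 0.97
p.b <- 0.97

# Joint probability, assuming the two calls are independent
joint <- prod(c(p.a, p.b))
joint

# Each call clears a 0.95 threshold on its own, but the joint call does not
c(a = p.a >= 0.95, b = p.b >= 0.95, joint = joint >= 0.95)
```

This is why a threshold applied to the joint probability is stricter than the same threshold applied to each species separately.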
We are particularly interested in patterns associated with C4, so first we calculate, separately for the C3 pair and the C4 pair, the shared pattern (if the two species agree) and the joint probability of the two species' patterns.
# find patterns shared by both C3 species and patterns shared by both C4 species
same <- function(x) {
  length(unique(x)) == 1
}
sig <- function(x) {
  prod(x) >= 0.95
}
c3 <- c("fp", "fr")
c3.prob <- paste(c3, "prob", sep = ".")
c3.pattern <- paste(c3, "pattern", sep = ".")
all$c3.pattern <- apply(all[, c3.pattern], 1, function(x) {
  if (same(x)) {
    return(x[1])
  } else {
    NA
  }
})
all$c3.prob <- apply(all[, c3.prob], 1, prod)
c4 <- c("ft", "fb")
c4.prob <- paste(c4, "prob", sep = ".")
c4.pattern <- paste(c4, "pattern", sep = ".")
all$c4.pattern <- apply(all[, c4.pattern], 1, function(x) {
  if (same(x)) {
    return(x[1])
  } else {
    NA
  }
})
all$c4.prob <- apply(all[, c4.prob], 1, prod)
# take a look
all[1:10, c("gene.id", "c3.pattern", "c3.prob", "c4.pattern", "c4.prob")]
##      gene.id             c3.pattern c3.prob             c4.pattern c4.prob
## 1  AT1G01030 no significant pattern  0.3285 no significant pattern  0.2979
## 2  AT1G01040 no significant pattern  0.4790 no significant pattern  0.6920
## 3  AT1G01050                   <NA>  0.5170 no significant pattern  0.5597
## 4  AT1G01060                   <NA>  0.8941            up_gradient  0.9921
## 5  AT1G01080            up_gradient  0.9995            up_gradient  1.0000
## 6  AT1G01090 no significant pattern  0.8013                   <NA>  0.6529
## 7  AT1G01120 no significant pattern  0.7390 no significant pattern  0.2780
## 8  AT1G01140                   <NA>  0.9497                   <NA>  0.5534
## 9  AT1G01150                   <NA>  0.8458          down_gradient  0.9971
## 10 AT1G01160 no significant pattern  0.6273 no significant pattern  0.2780
Note: <NA> means the two species didn't have the same pattern.
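The C3 and C4 blocks above repeat the same two steps, so they could be folded into one helper. The sketch below is illustrative only: `group_summary` and the toy data frame are hypothetical names and values, not part of the original analysis.

```r
# Toy data with made-up values, laid out like the per-species columns above
toy <- data.frame(
  gene.id    = c("g1", "g2", "g3"),
  fp.pattern = c("up_gradient", "up_gradient", "down_gradient"),
  fr.pattern = c("up_gradient", "down_gradient", "down_gradient"),
  fp.prob    = c(0.99, 0.90, 0.98),
  fr.prob    = c(0.98, 0.80, 0.97),
  stringsAsFactors = FALSE
)

# Hypothetical helper: shared pattern (NA if the species disagree) and
# joint probability (product) for a group of species codes
group_summary <- function(df, species) {
  pat  <- df[, paste(species, "pattern", sep = "."), drop = FALSE]
  prob <- df[, paste(species, "prob", sep = "."), drop = FALSE]
  shared <- apply(pat, 1, function(x) {
    if (length(unique(x)) == 1) x[[1]] else NA_character_
  })
  data.frame(
    pattern = unname(shared),
    prob    = unname(apply(prob, 1, prod)),
    stringsAsFactors = FALSE
  )
}

c3.toy <- group_summary(toy, c("fp", "fr"))
c3.toy
```

Calling the helper once with c("fp", "fr") and once with c("ft", "fb") would give the same columns as the duplicated code, under the assumption that the column naming scheme is as shown.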
From this we can calculate whether the C3 and C4 patterns are the same, and the joint probability across all four species (C3.prob * C4.prob):
c3c4.prob <- c("c3.prob", "c4.prob")
c3c4.pattern <- c("c3.pattern", "c4.pattern")
all$c3c4.same <- apply(all[, c3c4.pattern], 1, function(x) {
  if (all(is.na(x))) {
    return(NA)
  } else if (same(x)) {
    return(TRUE)
  } else {
    return(FALSE)
  }
})
all$c3c4.prob <- apply(all[, c3c4.prob], 1, prod)
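It is worth spelling out how the comparison treats missing consensus patterns, because it affects the filters later in the report. The sketch below re-implements the per-row logic as a standalone function (`c3c4_same` is a name introduced here for illustration) and runs it on the four possible cases.

```r
same <- function(x) length(unique(x)) == 1

# Same logic as the anonymous function above, applied to one row
c3c4_same <- function(x) {
  if (all(is.na(x))) NA else same(x)
}

r.match <- c3c4_same(c("up_gradient", "up_gradient"))    # both groups agree
r.diff  <- c3c4_same(c("up_gradient", "down_gradient"))  # groups disagree
r.oneNA <- c3c4_same(c(NA, "up_gradient"))               # one group has no consensus
r.allNA <- c3c4_same(c(NA, NA))                          # neither group has a consensus

c(match = r.match, diff = r.diff, oneNA = r.oneNA, allNA = r.allNA)
```

Note that a row where only one group lacks a consensus pattern is scored FALSE, not NA, so a filter on !c3c4.same alone would also pick up such rows unless NA patterns are excluded explicitly.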
To help us interpret the data, we also want to see some functional information for each gene, so we load the Arabidopsis full annotation we prepared for the earlier DE analysis and merge it with our results.
# annotate'em
annot <- read.csv("/data/genomes/ath/Athaliana_167_full_annotation.csv")
all <- merge(all, annot)
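By default, merge() joins on the columns the two data frames have in common and keeps only rows present in both, so any gene without an annotation entry would be dropped silently. Whether that happens with the real annotation file is not shown here; the sketch below uses made-up toy data to show how one might check.

```r
# Toy results and annotation (hypothetical values)
res <- data.frame(gene.id = c("g1", "g2", "g3"),
                  c4.prob = c(0.30, 0.69, 0.99),
                  stringsAsFactors = FALSE)
ann <- data.frame(gene.id = c("g1", "g2"),
                  Description = c("protein A", "protein B"),
                  stringsAsFactors = FALSE)

inner <- merge(res, ann, by = "gene.id")                # default: inner join
left  <- merge(res, ann, by = "gene.id", all.x = TRUE)  # keep unannotated genes

# Genes that the default merge would drop
lost <- setdiff(res$gene.id, inner$gene.id)
lost
```

With all.x = TRUE the unannotated gene is kept, with NA in the annotation columns.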
Now we can start asking the data interesting questions.
Note that we're only considering the pattern of changes of the genes, not their absolute expression. In a later analysis we'll add expression quantity to the calculations.
First, let's ask what genes have the same expression pattern in all four Flaveria species.
# next, those with the same pattern in all species
allsame <- subset(all, c3c4.same & c3c4.prob >= 0.95)
# how many genes have the same pattern in all four species?
nrow(allsame)
## [1] 1134
# how many genes are the same for each pattern?
allsame.tab <- table(allsame$fb.pattern)
allsame.tab
##
##    down_gradient equal expression      up_gradient 
##              632               15              487 
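To put these counts in proportion, they can be converted to percentages. This is a small sketch that re-enters the three counts printed above by hand; in the real session one would pass allsame.tab directly.

```r
# Counts copied from the table output above
counts <- c("down_gradient" = 632, "equal expression" = 15, "up_gradient" = 487)

# Percentage of the same-pattern genes falling in each pattern
pct <- round(100 * prop.table(counts), 1)
pct
```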
Not many genes have equal expression along the gradient in all four species - let's see which ones do.
# genes with equal expression along the gradient in all species
equal <- subset(allsame, c4.pattern == "equal expression")
equal[, c(1, 16:21)]
##         gene.id Description                                                 PFAM.ID SMART.ID TF.Family TF.Genome.EST TF.Category
## 57    AT1G02030 C2H2-like zinc finger protein                                       SM00355
## 504   AT1G09040
## 4816  AT2G30340 LOB domain-containing protein 13                            PF03195
## 5658  AT2G43820 UDP-glucosyltransferase 74F2                                PF00201
## 5885  AT2G47360
## 7152  AT3G19310 PLC-like phosphodiesterases superfamily protein
## 7230  AT3G20580 COBRA-like protein 10 precursor                             PF04833
## 8503  AT3G57500
## 9706  AT4G18660
## 9926  AT4G22600
## 10631 AT4G33600
## 11434 AT5G06920 FASCICLIN-like arabinogalactan protein 21 precursor                 SM00554
## 12728 AT5G36000
## 13539 AT5G51500 Plant invertase/pectin methylesterase inhibitor superfamily PF04043 SM00856
## 13923 AT5G57830 Protein of unknown function, DUF593                         PF04576
We have a list of genes of particular interest - let's see what their patterns are across the species.
# what's the representation of the key AGIs?
key <- read.csv("key_agi.txt", sep = "\t", header = FALSE)
names(key) <- c("ati", "name")
key$agi <- gsub(key$ati, pattern = "\\.[0-9]+", replacement = "")
key.all <- merge(key[, 2:3], all, by.x = "agi", by.y = "gene.id")
table(key.all[c("c4.pattern", "c3c4.same")])
##                         c3c4.same
## c4.pattern               FALSE TRUE
##   down_gradient              2    4
##   down_in_base               0    1
##   no significant pattern     7    4
##   up_gradient                3    5
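The gsub() call in the chunk above strips a numeric version suffix from the identifiers in the key file so that they match the gene.id column. A minimal sketch of that step follows; the suffixed identifiers are made-up examples in the assumed format, and the `$` anchor is an addition not present in the original call, used here so that only a trailing suffix can be removed.

```r
# Example identifiers: two with a version suffix, one without (assumed format)
ids <- c("AT1G05470.1", "AT1G08540.2", "AT1G19850")

# Remove a trailing ".<digits>" suffix, leaving unsuffixed IDs untouched
agi <- sub("\\.[0-9]+$", "", ids)
agi
```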
14 of the genes of interest have the same pattern in all species. Which ones?
na.omit(key.all[key.all$c3c4.same == T, c("agi", "name", "c4.pattern", "Description")])
## agi name c4.pattern Description
## 2 AT1G05470 CVP2 down_gradient DNAse I-like superfamily protein
## 3 AT1G08540 TF18 down_in_base RNApolymerase sigma subunit 2
## 6 AT1G19850 MP down_gradient Transcriptional factor B3 family protein / auxin-responsive factor AUX/IAA-related
## 8 AT1G25440 TF01 no significant pattern B-box type zinc finger protein with CCT domain
## 11 AT1G52150 ATHB15 down_gradient Homeobox-leucine zipper family protein / lipid-binding START domain-containing protein
## 13 AT1G64860 TF17 up_gradient sigma factor A
## 25 AT2G22430 HB6 no significant pattern homeobox protein 6
## 28 AT2G34710 PHB no significant pattern Homeobox-leucine zipper family protein / lipid-binding START domain-containing protein
## 29 AT2G35940 TF11 up_gradient BEL1-like homeodomain 1
## 31 AT2G41940 TF03 no significant pattern zinc finger protein 8
## 34 AT3G53920 TF16 up_gradient RNApolymerase sigma-subunit C
## 45 AT5G16780 DOT2 down_gradient SART-1 family
## 47 AT5G41410 TF12 up_gradient POX (plant homeobox) family protein
## 50 AT5G67030 TF06 up_gradient zeaxanthin epoxidase (ZEP) (ABA1)
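We can identify putative C4-related genes. Our strict definition might be genes where both C3 species share a pattern, both C4 species share a pattern, the C3 and C4 patterns differ, and the joint probability across all four species is at least 0.95: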
c4diff.a <- subset(all, !c3c4.same & c3c4.prob >= 0.95 & !is.na(c3.pattern) & !is.na(c4.pattern))
# how many?
nrow(c4diff.a)
## [1] 7
# what patterns?
c4diff.a.tab <- table(c4diff.a[, c("c4.pattern", "c3.pattern")])
c4diff.a.tab
##                   c3.pattern
## c4.pattern         down_gradient equal expression up_gradient up_in_base
##   down_gradient                0                0           1          3
##   equal expression             1                0           0          0
##   up_gradient                  0                1           0          0
##   up_in_base                   0                1           0          0
# and let's look at the actual genes
c4diff.a[, c("gene.id", "c3.pattern", "c4.pattern", "Description")]
## gene.id c3.pattern c4.pattern Description
## 2095 AT1G49400 down_gradient equal expression Nucleic acid-binding, OB-fold-like protein
## 2414 AT1G55730 up_gradient down_gradient cation exchanger 5
## 5699 AT2G44580 up_in_base down_gradient zinc ion binding
## 9072 AT4G04500 equal expression up_in_base cysteine-rich RLK (RECEPTOR-like protein kinase) 37
## 11765 AT5G12920 up_in_base down_gradient Transducin/WD40 repeat-like superfamily protein
## 13319 AT5G48100 equal expression up_gradient Laccase/Diphenol oxidase family protein
## 13586 AT5G52220 up_in_base down_gradient
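We can have a more relaxed definition by allowing the two C3 species to differ from each other: each C3 species' pattern only needs to differ from the shared C4 pattern, and only the C4 joint probability must be at least 0.95.
# note: the NA check belongs inside the condition; as a third argument it would be taken as subset()'s `select`
c4diff.b <- subset(all, fr.pattern != c4.pattern & fp.pattern != c4.pattern & c4.prob >= 0.95 & !is.na(c4.pattern))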
# how many?
nrow(c4diff.b)
## [1] 365
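One detail worth keeping in mind for filters like this: subset() treats an NA in the row condition as FALSE, so genes whose C4 species disagree (c4.pattern is NA) drop out of the comparison without an explicit check. The toy data below is made up to illustrate that behaviour.

```r
# Toy data with made-up values; g3 has no consensus C4 pattern
toy <- data.frame(
  gene.id    = c("g1", "g2", "g3", "g4"),
  fp.pattern = c("up_gradient", "no significant pattern", "up_gradient", "down_gradient"),
  fr.pattern = c("up_gradient", "up_gradient", "down_gradient", "down_gradient"),
  c4.pattern = c("down_gradient", "down_gradient", NA, "down_gradient"),
  c4.prob    = c(0.99, 0.97, 0.99, 0.99),
  stringsAsFactors = FALSE
)

# Relaxed filter: each C3 species differs from the C4 pattern, C4 call is confident
relaxed <- subset(toy, fr.pattern != c4.pattern &
                       fp.pattern != c4.pattern &
                       c4.prob >= 0.95)

# g3 is excluded because its condition evaluates to NA; g4 because fr matches C4
relaxed$gene.id
```

The two C3 species in g2 disagree with each other, yet the gene is kept, which is exactly what distinguishes the relaxed definition from the strict one.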
# what patterns?
c4diff.b.tab <- table(c4diff.b[, c("c4.pattern", "c3.pattern")])
c4diff.b.tab
##                   c3.pattern
## c4.pattern         down_gradient down_in_base down_in_tip equal expression no significant pattern up_gradient up_in_base
##   2_1_3                        0            0           0                0                      1           0          0
##   down_gradient                0            0           1                0                     62           1          6
##   down_in_base                 0            0           0                0                      0           1          0
##   equal expression             1            0           0                0                     51           1          0
##   up_gradient                  0            2           0                1                     96           0          0
##   up_in_base                   0            0           0                1                      5           0          0
##   up_in_tip                    0            0           0                0                      0           0          0
# and let's look at the actual genes
c4diff.b[, c("gene.id", "c3.pattern", "c4.pattern", "Description")]
## gene.id c3.pattern c4.pattern Description
## 15 AT1G01225 <NA> down_gradient NC domain-containing protein-related
## 21 AT1G01320 <NA> down_in_base Tetratricopeptide repeat (TPR)-like superfamily protein
## 48 AT1G01920 no significant pattern down_gradient SET domain-containing protein
## 70 AT1G02180 <NA> down_in_base ferredoxin-related
## 77 AT1G02400 no significant pattern equal expression gibberellin 2-oxidase 6
## 102 AT1G02910 no significant pattern up_gradient tetratricopeptide repeat (TPR)-containing protein
## 111 AT1G03050 no significant pattern up_gradient ENTH/ANTH/VHS superfamily protein
## 114 AT1G03070 no significant pattern up_gradient Bax inhibitor-1 family protein
## 123 AT1G03170 no significant pattern up_gradient Protein of unknown function (DUF3049)
## 178 AT1G04130 <NA> down_gradient Tetratricopeptide repeat (TPR)-like superfamily protein
## 187 AT1G04220 no significant pattern equal expression 3-ketoacyl-CoA synthase 2
## 234 AT1G04880 no significant pattern down_gradient HMG (high mobility group) box protein with ARID/BRIGHT DNA-binding domain
## 268 AT1G05320 no significant pattern up_gradient
## 357 AT1G06930 no significant pattern equal expression
## 379 AT1G07230 no significant pattern equal expression non-specific phospholipase C1
## 398 AT1G07530 no significant pattern up_gradient SCARECROW-like 14
## 473 AT1G08600 <NA> down_gradient P-loop containing nucleoside triphosphate hydrolases superfamily protein
## 521 AT1G09290 no significant pattern down_gradient
## 527 AT1G09380 no significant pattern equal expression nodulin MtN21 /EamA-like transporter family protein
## 588 AT1G10170 no significant pattern equal expression NF-X-like 1
## 590 AT1G10200 no significant pattern up_gradient GATA type zinc finger transcription factor family protein
## 596 AT1G10310 <NA> up_gradient NAD(P)-binding Rossmann-fold superfamily protein
## 701 AT1G11900 no significant pattern down_gradient Tetratricopeptide repeat (TPR)-like superfamily protein
## 738 AT1G12460 no significant pattern down_gradient Leucine-rich repeat protein kinase family protein
## 767 AT1G12930 no significant pattern down_gradient ARM repeat superfamily protein
## 793 AT1G13310 no significant pattern up_gradient Endosomal targeting BRO1-like domain-containing protein
## 796 AT1G13340 no significant pattern up_gradient Regulator of Vps4 activity in the MVB pathway protein
## 868 AT1G14700 <NA> up_gradient purple acid phosphatase 3
## 1037 AT1G17680 no significant pattern up_gradient tetratricopeptide repeat (TPR)-containing protein
## 1040 AT1G17720 <NA> up_gradient Protein phosphatase 2A, regulatory subunit PR55
## 1146 AT1G19610 <NA> equal expression Arabidopsis defensin-like protein
## 1148 AT1G19640 <NA> up_gradient jasmonic acid carboxyl methyltransferase
## 1211 AT1G20830 <NA> up_gradient multiple chloroplast division site 1
## 1253 AT1G21651 <NA> up_gradient zinc ion binding
## 1254 AT1G21680 no significant pattern up_gradient DPP6 N-terminal domain-like protein
## 1259 AT1G21722 no significant pattern equal expression
## 1325 AT1G22930 no significant pattern up_gradient T-complex protein 11
## 1326 AT1G22940 <NA> up_gradient thiamin biosynthesis protein, putative
## 1366 AT1G23740 <NA> up_gradient Oxidoreductase, zinc-binding dehydrogenase family protein
## 1368 AT1G23780 no significant pattern up_gradient F-box family protein
## 1378 AT1G24040 no significant pattern up_gradient Acyl-CoA N-acyltransferases (NAT) superfamily protein
## 1398 AT1G24440 <NA> up_gradient RING/U-box superfamily protein
## 1407 AT1G24590 no significant pattern equal expression DORNROSCHEN-like
## 1415 AT1G25260 no significant pattern down_gradient Ribosomal protein L10 family protein
## 1449 AT1G26190 <NA> down_gradient Phosphoribulokinase / Uridine kinase family
## 1555 AT1G28110 <NA> down_gradient serine carboxypeptidase-like 45
## 1637 AT1G29740 no significant pattern down_gradient Leucine-rich repeat transmembrane protein kinase
## 1679 AT1G30330 no significant pattern down_gradient auxin response factor 6
## 1719 AT1G31040 <NA> down_gradient PLATZ transcription factor family protein
## 1739 AT1G31410 no significant pattern up_gradient putrescine-binding periplasmic protein-related
## 1754 AT1G31790 no significant pattern equal expression Tetratricopeptide repeat (TPR)-like superfamily protein
## 1760 AT1G31840 <NA> up_in_base Tetratricopeptide repeat (TPR)-like superfamily protein
## 1828 AT1G33270 no significant pattern up_gradient Acyl transferase/acyl hydrolase/lysophospholipase superfamily protein
## 1872 AT1G34360 <NA> up_in_base translation initiation factor 3 (IF-3) family protein
## 1889 AT1G35210 <NA> up_in_tip
## 1935 AT1G42960 <NA> up_gradient
## 1960 AT1G44130 no significant pattern up_gradient Eukaryotic aspartyl protease family protein
## 1968 AT1G44770 no significant pattern up_gradient
## 2095 AT1G49400 down_gradient equal expression Nucleic acid-binding, OB-fold-like protein
## 2159 AT1G50700 no significant pattern up_gradient calcium-dependent protein kinase 33
## 2199 AT1G51630 <NA> down_gradient O-fucosyltransferase family protein
## 2296 AT1G53520 <NA> up_gradient Chalcone-flavanone isomerase family protein
## 2320 AT1G53885 no significant pattern up_gradient Protein of unknown function (DUF581)
## 2362 AT1G54710 no significant pattern up_gradient homolog of yeast autophagy 18 (ATG18) H
## 2406 AT1G55535 <NA> up_gradient
## 2414 AT1G55730 up_gradient down_gradient cation exchanger 5
## 2476 AT1G57540 <NA> down_gradient
## 2525 AT1G59740 no significant pattern down_gradient Major facilitator superfamily protein
## 2554 AT1G60420 no significant pattern up_gradient DC1 domain-containing protein
## 2683 AT1G63220 no significant pattern down_gradient Calcium-dependent lipid-binding (CaLB domain) family protein
## 2690 AT1G63410 no significant pattern up_gradient Protein of unknown function (DUF567)
## 2700 AT1G63660 <NA> down_gradient GMP synthase (glutamine-hydrolyzing), putative / glutamine amidotransferase, putative
## 2702 AT1G63690 <NA> down_gradient SIGNAL PEPTIDE PEPTIDASE-LIKE 2
## 2730 AT1G64330 no significant pattern down_gradient myosin heavy chain-related
## 2761 AT1G64750 no significant pattern equal expression deletion of SUV3 suppressor 1(I)
## 2795 AT1G65420 <NA> up_gradient Protein of unknown function (DUF565)
## 2806 AT1G65650 no significant pattern down_gradient Peptidase C12, ubiquitin carboxyl-terminal hydrolase 1
## 2870 AT1G67040 no significant pattern down_gradient
## 2917 AT1G67730 no significant pattern down_gradient beta-ketoacyl reductase 1
## 2955 AT1G68220 <NA> down_gradient Protein of unknown function (DUF1218)
## 3006 AT1G68990 no significant pattern down_gradient male gametophyte defective 3
## 3193 AT1G72330 no significant pattern up_gradient alanine aminotransferase 2
## 3210 AT1G72640 no significant pattern up_gradient NAD(P)-binding Rossmann-fold superfamily protein
## 3239 AT1G73177 <NA> down_gradient bonsai
## 3314 AT1G74470 <NA> up_gradient Pyridine nucleotide-disulphide oxidoreductase family protein
## 3361 AT1G75180 no significant pattern up_gradient Erythronate-4-phosphate dehydrogenase family protein
## 3377 AT1G75388 <NA> up_gradient conserved peptide upstream open reading frame 5
## 3401 AT1G75750 no significant pattern up_gradient GAST1 protein homolog 1
## 3417 AT1G76060 no significant pattern equal expression LYR family of Fe/S cluster biogenesis protein
## 3430 AT1G76250 no significant pattern up_gradient
## 3442 AT1G76430 no significant pattern equal expression phosphate transporter 1;9
## 3451 AT1G76560 <NA> up_gradient CP12 domain-containing protein 3
## 3473 AT1G76940 no significant pattern up_gradient RNA-binding (RRM/RBD/RNP motifs) family protein
## 3485 AT1G77130 <NA> down_gradient plant glycogenin-like starch initiation protein 2
## 3551 AT1G78290 <NA> up_gradient Protein kinase superfamily protein
## 3580 AT1G78780 no significant pattern up_in_base pathogenesis-related family protein
## 3626 AT1G79420 <NA> down_gradient Protein of unknown function (DUF620)
## 3645 AT1G79640 no significant pattern up_gradient Protein kinase superfamily protein
## 3648 AT1G79670 no significant pattern equal expression Wall-associated kinase family protein
## 3709 AT1G80550 no significant pattern equal expression Pentatricopeptide repeat (PPR) superfamily protein
## 3780 AT2G01650 <NA> up_gradient plant UBX domain-containing protein 2
## 3783 AT2G01680 no significant pattern up_gradient Ankyrin repeat family protein
## 3840 AT2G02870 <NA> up_gradient Galactose oxidase/kelch repeat superfamily protein
## 3900 AT2G04560 <NA> down_gradient transferases, transferring glycosyl groups
## 3923 AT2G05210 no significant pattern down_gradient Nucleic acid-binding, OB-fold-like protein
## 4020 AT2G14960 no significant pattern down_gradient Auxin-responsive GH3 family protein
## 4028 AT2G15290 no significant pattern up_gradient translocon at inner membrane of chloroplasts 21
## 4044 AT2G15860 up_in_base down_gradient
## 4076 AT2G16750 no significant pattern up_gradient Protein kinase protein with adenine nucleotide alpha hydrolases-like domain
## 4146 AT2G18030 no significant pattern down_gradient Peptide methionine sulfoxide reductase family protein
## 4158 AT2G18230 no significant pattern up_gradient pyrophosphorylase 2
## 4160 AT2G18250 <NA> up_gradient 4-phosphopantetheine adenylyltransferase
## 4171 AT2G18500 <NA> up_in_base ovate family protein 7
## 4237 AT2G19900 <NA> up_gradient NADP-malic enzyme 1
## 4280 AT2G20585 no significant pattern down_gradient nuclear fusion defective 6
## 4411 AT2G22950 no significant pattern equal expression Cation transporter/ E1-E2 ATPase family protein
## 4463 AT2G24130 <NA> down_gradient Leucine-rich receptor-like protein kinase family protein
## 4546 AT2G25820 <NA> down_gradient Integrase-type DNA-binding superfamily protein
## 4575 AT2G26290 no significant pattern equal expression root-specific kinase 1
## 4585 AT2G26490 no significant pattern up_gradient Transducin/WD40 repeat-like superfamily protein
## 4605 AT2G26700 no significant pattern down_gradient AGC (cAMP-dependent, cGMP-dependent and protein kinase C) kinase family protein
## 4621 AT2G26930 no significant pattern up_gradient 4-(cytidine 5'-phospho)-2-C-methyl-D-erithritol kinase
## 4696 AT2G28200 no significant pattern up_gradient C2H2-type zinc finger family protein
## 4742 AT2G28940 no significant pattern down_gradient Protein kinase superfamily protein
## 4829 AT2G30520 no significant pattern up_gradient Phototropic-responsive NPH3 family protein
## 4851 AT2G30890 no significant pattern down_gradient Cytochrome b561/ferric reductase transmembrane protein family
## 4855 AT2G30933 no significant pattern equal expression Carbohydrate-binding X8 domain superfamily protein
## 4870 AT2G31190 <NA> up_gradient Protein of unknown function, DUF647
## 4899 AT2G31670 no significant pattern up_gradient Stress responsive alpha-beta barrel domain protein
## 4921 AT2G32010 <NA> down_gradient CVP2 like 1
## 4922 AT2G32040 <NA> up_gradient Major facilitator superfamily protein
## 4977 AT2G33100 <NA> down_gradient cellulose synthase-like D1
## 5022 AT2G33793 up_in_base down_gradient
## 5152 AT2G36110 <NA> down_gradient Polynucleotidyl transferase, ribonuclease H-like superfamily protein
## 5201 AT2G36800 no significant pattern up_gradient don-glucosyltransferase 1
## 5210 AT2G36890 no significant pattern down_gradient Duplicated homeodomain-like superfamily protein
## 5330 AT2G38640 no significant pattern up_gradient Protein of unknown function (DUF567)
## 5479 AT2G40780 no significant pattern down_gradient Nucleic acid-binding, OB-fold-like protein
## 5484 AT2G40830 no significant pattern up_gradient RING-H2 finger C1A
## 5620 AT2G43090 no significant pattern up_gradient Aconitase/3-isopropylmalate dehydratase protein
## 5636 AT2G43330 up_gradient down_in_base inositol transporter 1
## 5699 AT2G44580 up_in_base down_gradient zinc ion binding
## 5727 AT2G44980 <NA> down_gradient SNF2 domain-containing protein / helicase domain-containing protein
## 5758 AT2G45420 no significant pattern equal expression LOB domain-containing protein 18
## 5791 AT2G45800 no significant pattern up_gradient GATA type zinc finger transcription factor family protein
## 5801 AT2G46020 no significant pattern down_gradient transcription regulatory protein SNF2, putative
## 5842 AT2G46570 no significant pattern down_gradient laccase 6
## 5900 AT2G47600 no significant pattern up_gradient magnesium/proton exchanger
## 5925 AT2G47970 <NA> equal expression Nuclear pore localisation protein NPL4
## 6020 AT3G02200 no significant pattern down_gradient Proteasome component (PCI) domain protein
## 6022 AT3G02220 <NA> down_gradient
## 6072 AT3G02875 <NA> down_gradient Peptidase M20/M25/M40 family protein
## 6080 AT3G03000 no significant pattern equal expression EF hand calcium-binding protein family
## 6109 AT3G03500 <NA> up_gradient TatD related DNase
## 6212 AT3G05150 no significant pattern up_gradient Major facilitator superfamily protein
## 6227 AT3G05345 no significant pattern up_gradient Chaperone DnaJ-domain superfamily protein
## 6254 AT3G05690 no significant pattern up_gradient nuclear factor Y, subunit A2
## 6275 AT3G06040 no significant pattern down_gradient Ribosomal protein L12/ ATP-dependent Clp protease adaptor protein ClpS family protein
## 6341 AT3G07010 <NA> down_gradient Pectin lyase-like superfamily protein
## 6453 AT3G08840 no significant pattern up_gradient D-alanine--D-alanine ligase family
## 6459 AT3G08930 <NA> down_gradient LMBR1-like membrane protein
## 6599 AT3G10980 <NA> up_gradient PLAC8 family protein
## 6776 AT3G13730 no significant pattern equal expression cytochrome P450, family 90, subfamily D, polypeptide 1
## 6817 AT3G14240 <NA> down_gradient Subtilase family protein
## 6839 AT3G14600 no significant pattern down_gradient Ribosomal protein L18ae/LX family protein
## 6840 AT3G14610 no significant pattern up_gradient cytochrome P450, family 72, subfamily A, polypeptide 7
## 6940 AT3G15990 <NA> down_gradient sulfate transporter 3;4
## 6968 AT3G16330 <NA> down_gradient
## 7022 AT3G17350 <NA> equal expression
## 7059 AT3G17940 no significant pattern down_gradient Galactose mutarotase-like superfamily protein
## 7105 AT3G18550 <NA> equal expression TCP family transcription factor
## 7107 AT3G18570 no significant pattern equal expression Oleosin family protein
## 7159 AT3G19440 no significant pattern down_gradient Pseudouridine synthase family protein
## 7162 AT3G19490 <NA> up_gradient sodium:hydrogen antiporter 1
## 7221 AT3G20480 <NA> down_gradient tetraacyldisaccharide 4'-kinase family protein
## 7243 AT3G20790 no significant pattern down_gradient NAD(P)-binding Rossmann-fold superfamily protein
## 7273 AT3G21250 no significant pattern up_gradient multidrug resistance-associated protein 6
## 7289 AT3G21480 no significant pattern down_gradient BRCT domain-containing DNA repair protein
## 7315 AT3G22160 no significant pattern equal expression VQ motif-containing protein
## 7330 AT3G22400 <NA> up_gradient PLAT/LH2 domain-containing lipoxygenase family protein
## 7342 AT3G22560 <NA> equal expression Acyl-CoA N-acyltransferases (NAT) superfamily protein
## 7374 AT3G23150 <NA> down_gradient Signal transduction histidine kinase, hybrid-type, ethylene sensor
## 7400 AT3G23570 up_gradient equal expression alpha/beta-Hydrolases superfamily protein
## 7409 AT3G23690 no significant pattern up_gradient basic helix-loop-helix (bHLH) DNA-binding superfamily protein
## 7416 AT3G23770 no significant pattern equal expression O-Glycosyl hydrolases family 17 protein
## 7521 AT3G25590 <NA> down_gradient
## 7666 AT3G28210 no significant pattern up_gradient zinc finger (AN1-like) family protein
## 7674 AT3G28455 <NA> down_gradient CLAVATA3/ESR-RELATED 25
## 7736 AT3G30725 no significant pattern equal expression glutamine dumper 6
## 7808 AT3G45040 no significant pattern up_gradient phosphatidate cytidylyltransferase family protein
## 7830 AT3G45780 no significant pattern up_gradient phototropin 1
## 7865 AT3G46610 no significant pattern up_gradient Pentatricopeptide repeat (PPR-like) superfamily protein
## 7904 AT3G47630 no significant pattern down_gradient
## 8042 AT3G49880 <NA> up_gradient glycosyl hydrolase family protein 43
## 8065 AT3G50560 no significant pattern up_gradient NAD(P)-binding Rossmann-fold superfamily protein
## 8168 AT3G52190 <NA> down_gradient phosphate transporter traffic facilitator1
## 8170 AT3G52210 no significant pattern down_gradient S-adenosyl-L-methionine-dependent methyltransferases superfamily protein
## 8181 AT3G52350 no significant pattern equal expression D111/G-patch domain-containing protein
## 8225 AT3G53120 <NA> up_gradient Modifier of rudimentary (Mod(r)) protein
## 8252 AT3G53580 down_in_base up_gradient diaminopimelate epimerase family protein
## 8346 AT3G54960 no significant pattern down_gradient PDI-like 1-3
## 8350 AT3G55000 no significant pattern down_gradient tonneau family protein
## 8598 AT3G59490 no significant pattern equal expression
## 8600 AT3G59520 <NA> equal expression RHOMBOID-like protein 13
## 8641 AT3G60280 no significant pattern up_gradient uclacyanin 3
## 8660 AT3G60550 <NA> down_gradient cyclin p3;2
## 8674 AT3G60780 no significant pattern equal expression Protein of unknown function (DUF1442)
## 8680 AT3G60850 no significant pattern equal expression
## 8684 AT3G60910 no significant pattern up_gradient S-adenosyl-L-methionine-dependent methyltransferases superfamily protein
## 8690 AT3G61080 no significant pattern up_gradient Protein kinase superfamily protein
## 8748 AT3G62040 no significant pattern up_gradient Haloacid dehalogenase-like hydrolase (HAD) superfamily protein
## 8762 AT3G62270 no significant pattern up_gradient HCO3- transporter family
## 8795 AT3G62950 no significant pattern up_gradient Thioredoxin superfamily protein
## 8841 AT3G66658 up_in_base down_gradient aldehyde dehydrogenase 22A1
## 8861 AT4G00335 no significant pattern up_in_base RING-H2 finger B1A
## 8864 AT4G00370 no significant pattern up_gradient Major facilitator superfamily protein
## 8912 AT4G01240 no significant pattern equal expression S-adenosyl-L-methionine-dependent methyltransferases superfamily protein
## 8922 AT4G01440 no significant pattern down_gradient nodulin MtN21 /EamA-like transporter family protein
## 8923 AT4G01470 no significant pattern equal expression tonoplast intrinsic protein 1;3
## 8942 AT4G01880 no significant pattern down_gradient methyltransferases
## 8961 AT4G02170 no significant pattern equal expression
## 9007 AT4G02780 <NA> down_gradient Terpenoid cyclases/Protein prenyltransferases superfamily protein
## 9051 AT4G03520 no significant pattern up_gradient Thioredoxin superfamily protein
## 9072 AT4G04500 equal expression up_in_base cysteine-rich RLK (RECEPTOR-like protein kinase) 37
## 9138 AT4G08280 <NA> up_gradient Thioredoxin superfamily protein
## 9148 AT4G08550 <NA> down_gradient electron carriers;protein disulfide oxidoreductases
## 9189 AT4G09760 no significant pattern up_gradient Protein kinase superfamily protein
## 9236 AT4G10750 no significant pattern down_gradient Phosphoenolpyruvate carboxylase family protein
## 9237 AT4G10760 <NA> down_gradient mRNAadenosine methylase
## 9255 AT4G11060 <NA> down_gradient mitochondrially targeted single-stranded DNA binding protein
## 9270 AT4G11400 <NA> down_gradient ARID/BRIGHT DNA-binding domain;ELM2 domain protein
## 9326 AT4G12620 no significant pattern down_gradient origin of replication complex 1B
## 9365 AT4G13345 no significant pattern 2_1_3 Serinc-domain containing serine and sphingolipid biosynthesis protein
## 9419 AT4G14180 <NA> up_in_base putative recombination initiation defect 1
## 9431 AT4G14350 <NA> up_gradient AGC (cAMP-dependent, cGMP-dependent and protein kinase C) kinase family protein
## 9481 AT4G15093 no significant pattern up_gradient catalytic LigB subunit of aromatic ring-opening dioxygenase family
## 9555 AT4G16370 <NA> up_gradient oligopeptide transporter
## 9610 AT4G17180 no significant pattern up_in_base O-Glycosyl hydrolases family 17 protein
## 9685 AT4G18350 <NA> equal expression nine-cis-epoxycarotenoid dioxygenase 2
## 9730 AT4G19020 no significant pattern down_gradient chromomethylase 2
## 9745 AT4G19191 no significant pattern down_gradient Tetratricopeptide repeat (TPR)-like superfamily protein
## 9785 AT4G20050 no significant pattern equal expression Pectin lyase-like superfamily protein
## 9787 AT4G20070 <NA> up_gradient allantoate amidohydrolase
## 9944 AT4G22910 <NA> up_gradient FIZZY-related 2
## 9973 AT4G23550 no significant pattern equal expression WRKY family transcription factor
## 10046 AT4G24560 <NA> up_gradient ubiquitin-specific protease 16
## 10137 AT4G25980 <NA> down_gradient Peroxidase superfamily protein
## 10217 AT4G27130 <NA> up_gradient Translation initiation factor SUI1 family protein
## 10221 AT4G27250 no significant pattern equal expression NAD(P)-binding Rossmann-fold superfamily protein
## 10274 AT4G28070 no significant pattern up_gradient AFG1-like ATPase family protein
## 10320 AT4G28820 no significant pattern up_gradient HIT-type Zinc finger family protein
## 10339 AT4G29100 down_in_base up_gradient basic helix-loop-helix (bHLH) DNA-binding superfamily protein
## 10341 AT4G29120 no significant pattern up_gradient 6-phosphogluconate dehydrogenase family protein
## 10399 AT4G30160 <NA> up_gradient villin 4
## 10425 AT4G30550 <NA> up_gradient Class I glutamine amidotransferase-like superfamily protein
## 10444 AT4G30845 no significant pattern up_gradient
## 10457 AT4G30993 <NA> up_gradient Calcineurin-like metallo-phosphoesterase superfamily protein
## 10459 AT4G31000 no significant pattern equal expression Calmodulin-binding protein
## 10465 AT4G31115 no significant pattern up_gradient Protein of unknown function (DUF1997)
## 10528 AT4G32160 <NA> up_gradient Phox (PX) domain-containing protein
## 10546 AT4G32400 <NA> up_gradient Mitochondrial substrate carrier family protein
## 10581 AT4G32870 no significant pattern up_gradient Polyketide cyclase/dehydrase and lipid transport superfamily protein
## 10596 AT4G33060 <NA> down_gradient Cyclophilin-like peptidyl-prolyl cis-trans isomerase family protein
## 10626 AT4G33510 no significant pattern up_gradient 3-deoxy-d-arabino-heptulosonate 7-phosphate synthase
## 10630 AT4G33580 no significant pattern up_gradient beta carbonic anhydrase 5
## 10633 AT4G33630 <NA> up_gradient Protein of unknown function (DUF3506)
## 10642 AT4G33780 <NA> up_gradient
## 10645 AT4G33820 <NA> up_gradient Glycosyl hydrolase superfamily protein
## 10678 AT4G34260 <NA> down_gradient 1,2-alpha-L-fucosidases
## 10791 AT4G35905 no significant pattern equal expression
## 10851 AT4G36910 <NA> up_gradient Cystathionine beta-synthase (CBS) family protein
## 10855 AT4G36945 no significant pattern equal expression PLC-like phosphodiesterases superfamily protein
## 10867 AT4G37070 <NA> down_gradient Acyl transferase/acyl hydrolase/lysophospholipase superfamily protein
## 11023 AT4G39490 no significant pattern equal expression cytochrome P450, family 96, subfamily A, polypeptide 10
## 11080 AT5G01250 no significant pattern equal expression alpha 1,4-glycosyltransferase family protein
## 11098 AT5G01510 no significant pattern up_gradient Protein of unknown function, DUF647
## 11140 AT5G02230 no significant pattern up_gradient Haloacid dehalogenase-like hydrolase (HAD) superfamily protein
## 11177 AT5G02880 no significant pattern up_gradient ubiquitin-protein ligase 4
## 11200 AT5G03380 no significant pattern up_gradient Heavy metal transport/detoxification superfamily protein
## 11314 AT5G05180 no significant pattern down_gradient
## 11318 AT5G05220 no significant pattern up_gradient
## 11329 AT5G05360 <NA> up_gradient
## 11330 AT5G05365 no significant pattern up_gradient Heavy metal transport/detoxification superfamily protein
## 11380 AT5G06100 no significant pattern equal expression myb domain protein 33
## 11474 AT5G07840 <NA> up_gradient Ankyrin repeat family protein
## 11483 AT5G07990 <NA> down_gradient Cytochrome P450 superfamily protein
## 11490 AT5G08100 no significant pattern equal expression N-terminal nucleophile aminohydrolases (Ntn hydrolases) superfamily protein
## 11530 AT5G08570 <NA> down_gradient Pyruvate kinase family protein
## 11605 AT5G10170 <NA> up_gradient myo-inositol-1-phosphate synthase 3
## 11636 AT5G10720 <NA> up_in_base histidine kinase 5
## 11762 AT5G12870 <NA> down_in_base myb domain protein 46
## 11765 AT5G12920 up_in_base down_gradient Transducin/WD40 repeat-like superfamily protein
## 11859 AT5G14150 no significant pattern down_gradient Protein of unknown function, DUF642
## 11861 AT5G14180 no significant pattern down_gradient Myzus persicae-induced lipase 1
## 11869 AT5G14270 no significant pattern down_gradient bromodomain and extraterminal domain protein 9
## 11890 AT5G14580 <NA> down_gradient polyribonucleotide nucleotidyltransferase, putative
## 12004 AT5G16290 no significant pattern down_gradient VALINE-TOLERANT 1
## 12069 AT5G17390 no significant pattern down_gradient Adenine nucleotide alpha hydrolases-like superfamily protein
## 12129 AT5G18460 no significant pattern down_gradient Protein of Unknown Function (DUF239)
## 12144 AT5G18640 <NA> up_gradient alpha/beta-Hydrolases superfamily protein
## 12155 AT5G18860 <NA> up_gradient inosine-uridine preferring nucleoside hydrolase family protein
## 12158 AT5G18910 <NA> down_gradient Protein kinase superfamily protein
## 12166 AT5G19010 no significant pattern up_gradient mitogen-activated protein kinase 16
## 12205 AT5G19570 <NA> down_gradient
## 12222 AT5G19790 <NA> equal expression related to AP2 11
## 12225 AT5G19840 no significant pattern equal expression 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily protein
## 12232 AT5G19930 no significant pattern up_gradient Protein of unknown function DUF92, transmembrane
## 12276 AT5G20540 no significant pattern down_gradient BREVIS RADIX-like 4
## 12327 AT5G22000 no significant pattern up_gradient RING-H2 group F2A
## 12446 AT5G23950 no significant pattern up_gradient Calcium-dependent lipid-binding (CaLB domain) family protein
## 12484 AT5G24600 no significant pattern equal expression Protein of unknown function, DUF599
## 12510 AT5G25120 no significant pattern equal expression cytochrome p450, family 71, subfamily B, polypeptide 11
## 12533 AT5G25530 no significant pattern equal expression DNAJ heat shock family protein
## 12547 AT5G25820 <NA> equal expression Exostosin family protein
## 12594 AT5G26980 no significant pattern up_in_base syntaxin of plants 41
## 12759 AT5G37340 <NA> down_gradient ZPR1 zinc-finger domain protein
## 12765 AT5G37480 no significant pattern up_gradient
## 12766 AT5G37490 no significant pattern equal expression ARM repeat superfamily protein
## 12921 AT5G40890 no significant pattern up_gradient chloride channel A
## 12977 AT5G41990 <NA> down_gradient with no lysine (K) kinase 8
## 13103 AT5G44010 no significant pattern down_gradient
## 13118 AT5G44230 <NA> down_gradient Pentatricopeptide repeat (PPR) superfamily protein
## 13234 AT5G46790 no significant pattern up_gradient PYR1-like 1
## 13245 AT5G47020 no significant pattern up_gradient
## 13304 AT5G47860 <NA> up_gradient Protein of unknown function (DUF1350)
## 13319 AT5G48100 equal expression up_gradient Laccase/Diphenol oxidase family protein
## 13358 AT5G48660 no significant pattern down_gradient B-cell receptor-associated protein 31-like
## 13365 AT5G48800 no significant pattern up_in_base Phototropic-responsive NPH3 family protein
## 13368 AT5G48830 <NA> up_gradient
## 13377 AT5G48940 <NA> down_gradient Leucine-rich repeat transmembrane protein kinase family protein
## 13406 AT5G49520 <NA> equal expression WRKY DNA-binding protein 48
## 13427 AT5G49800 <NA> down_gradient Polyketide cyclase/dehydrase and lipid transport superfamily protein
## 13433 AT5G49890 no significant pattern up_gradient chloride channel C
## 13537 AT5G51460 <NA> up_gradient Haloacid dehalogenase-like hydrolase (HAD) superfamily protein
## 13553 AT5G51720 no significant pattern up_gradient 2 iron, 2 sulfur cluster binding
## 13581 AT5G52170 no significant pattern equal expression homeodomain GLABROUS 7
## 13586 AT5G52220 up_in_base down_gradient
## 13674 AT5G53660 no significant pattern down_gradient growth-regulating factor 7
## 13761 AT5G55180 <NA> up_in_base O-Glycosyl hydrolases family 17 protein
## 13765 AT5G55250 no significant pattern equal expression IAA carboxylmethyltransferase 1
## 13812 AT5G56090 no significant pattern down_gradient cytochrome c oxidase 15
## 13836 AT5G56530 <NA> up_gradient Protein of Unknown Function (DUF239)
## 13922 AT5G57815 no significant pattern equal expression Cytochrome c oxidase, subunit Vib family protein
## 13929 AT5G57900 no significant pattern up_gradient SKP1 interacting partner 1
## 13964 AT5G58320 no significant pattern up_gradient Kinase interacting (KIP1-like) family protein
## 14003 AT5G58960 no significant pattern down_gradient Plant protein of unknown function (DUF641)
## 14083 AT5G60490 <NA> equal expression FASCICLIN-like arabinogalactan-protein 12
## 14084 AT5G60520 no significant pattern equal expression Late embryogenesis abundant (LEA) protein-related
## 14130 AT5G61230 <NA> up_gradient Ankyrin repeat family protein
## 14188 AT5G62170 no significant pattern down_gradient
## 14253 AT5G63120 no significant pattern down_gradient P-loop containing nucleoside triphosphate hydrolases superfamily protein
## 14288 AT5G63640 no significant pattern up_gradient ENTH/VHS/GAT family protein
## 14298 AT5G63810 down_in_tip down_gradient beta-galactosidase 10
## 14305 AT5G63905 no significant pattern up_gradient
## 14367 AT5G64750 <NA> up_in_base Integrase-type DNA-binding superfamily protein
## 14390 AT5G65140 no significant pattern up_gradient Haloacid dehalogenase-like hydrolase (HAD) superfamily protein
## 14391 AT5G65160 <NA> down_gradient tetratricopeptide repeat (TPR)-containing protein
## 14395 AT5G65210 <NA> up_gradient bZIP transcription factor family protein
## 14406 AT5G65420 <NA> up_in_base CYCLIN D4;1
## 14465 AT5G66130 <NA> equal expression RADIATION SENSITIVE 17
## 14487 AT5G66460 <NA> down_gradient Glycosyl hydrolase superfamily protein
## 14493 AT5G66560 no significant pattern down_gradient Phototropic-responsive NPH3 family protein
## 14499 AT5G66680 no significant pattern down_gradient dolichyl-diphosphooligosaccharide-protein glycosyltransferase 48kDa subunit family protein
## 14587 ATCG00750 no significant pattern equal expression ribosomal protein S11
## 14606 ATMG00080 no significant pattern down_gradient ribosomal protein L16
That's a lot of genes, and our confidence that a pattern is correct is not the same as our confidence that the gene is involved in C4.
To assign each gene a confidence that it is involved in C4, we need an explicit model of what it means to be C4. We will develop this in the next stage of the analysis.
This will make us more confident, and more precise about how confident we are, by including information about the absolute level of expression as well as the developmental data and cell-type specificity in other species.