My scripts:

  • rendered html versions: Rpubs/thomas-weissensteiner
  • rmd files with executable code chunks: www.github.com/thomas-weissensteiner/portfolio/tree/main/

About myself: www.linkedin.com/in/ThomasWs-Mopfair

1. Background and Motivation

T cells are able to recognise and kill other cells that express “foreign” antigens. However, T cells normally do not respond strongly to “self” antigens, including those that may be overexpressed by certain tumors. One way to enhance the tumor-killing ability of T cells is to arm them with an engineered high-affinity T cell receptor (TCR).
Alpha fetoprotein (AFP) is an attractive target for therapeutic TCRs because the protein is highly expressed in hepatocellular carcinomas (HCC), but found only at trace levels in normal adult tissues. Fetal AFP synthesis can nevertheless resume in liver disease and regeneration.

The featured paper describes the development of a candidate TCR, named AFPc332, with optimal specificity and sensitivity for AFP. The following two figures convey a key message: T cells recognised target cells in vitro which expressed AFP levels similar to HCC tumors, but which were outside the range of AFP expression in normal tissues and diseased liver.

2. Correlation between AFP mRNA expression in target cells and responses by AFPc332 T cells in vitro



Figure 4 in (1): T cell effector responses, plotted against AFP levels expressed by different target cell lines
A) interferon gamma secretion: IFNg spot forming units (SFU), B) target cell killing (AUC).

An advantage of being an in-house medical writer was a detailed knowledge of the company’s unpublished research. I suggested adding data from the work of another group that had generated a panel of cell lines expressing a complementary range of AFP levels (white diamonds). This resulted in a better estimate of the AFP threshold at which AFPc332 T cells start to respond in vitro.
The threshold level at which AFPc332 T cells recognised target cells in vitro was ~1 million copies of AFP messenger RNA (mRNA) / 100 ng total RNA.

3. Expression of AFP mRNA in normal tissues and cancer



Figure 1 in (1): Expression data were obtained from the public databases TCGA (A) and GTEx (B), and in-house PCR analysis (C).

The original figure was a bar chart in which each individual tissue was represented by a tumor and a normal sample as categories. To simplify the visualisation, I summarised tissues by organ system and changed the format to a violin plot. Individual tissues that explained the high expression in the organ system groups “reproductive and breast” and “digestive and excretory” are shown separately in the lower panels. Organ systems and their constituent tissues have matching colours. AFP expression in HCC (red) is shown for comparison, confirming that ~30% of HCC samples expressed levels greater than the highest levels in non-tumor tissues.
Horizontal violin plots were chosen to facilitate comparison with the in vitro response (Figure 4 in ref. (1), above). Liver and non-tumor diseased liver expressed < 10^4 AFP transcripts / 100 ng RNA, which was two orders of magnitude below the response threshold level for AFPc332 T cells.



4. Code

The code for generating the second figure was my first R script, written in 2018. I might do some things differently now, e.g. using dplyr::case_when() to map cancer categories to organ system names.
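As a minimal sketch of that alternative, the chain of grep() assignments used for the TCGA disease categories could be collapsed into a single dplyr::case_when() call. The helper name `to_organ_system` and the toy data frame are illustrative assumptions; the patterns and labels are copied from the script. Note one difference: in the script a later assignment overwrites an earlier one, whereas case_when() returns the first match.

```r
library(dplyr)

# Hypothetical helper: map TCGA disease categories to organ system labels.
# Patterns and labels are copied from the script; the first matching rule wins.
to_organ_system <- function(disease_category) {
  case_when(
    grepl("nervous system cancer", disease_category) ~ "Nervous and Sensory",
    grepl("musculoskeletal system cancer|skin cancer", disease_category) ~ "Muscoskeletal and Skin",
    grepl("immune system cancer|hematologic cancer", disease_category) ~ "Hematological and Immunological",
    grepl("reproductive organ cancer", disease_category) ~ "Reproductive and Breast",
    grepl("head and neck cancer|respiratory system cancer", disease_category) ~ "Respiratory",
    grepl("gastrointestinal system cancer|urinary system cancer", disease_category) ~ "Digestive and Excretory",
    grepl("endocrine gland cancer", disease_category) ~ "Endocrine and Adipose",
    TRUE ~ "NA"   # same placeholder string the script uses for unmapped rows
  )
}

# Toy data, not taken from the TCGA export
toy <- data.frame(DiseaseCategory = c("skin cancer", "urinary system cancer", "unclassified"))
toy <- mutate(toy, OrganSystem = to_organ_system(DiseaseCategory))
toy$OrganSystem
```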

setwd("M:/Manuscripts in Progress/AFP - Ros Docta/Target Validation Data")
require(ggpubr)
library(scales)   # provides comma(), used for some of the axis labels below

## --------- Log scale with minor breaks --------- ##

ymin = 10^0
ymax = 10^5
ticks = 10

minor_breaks <- 
  ymin*seq (10/ticks, 10-10/ticks, by= 10/ticks)

for (i in seq(log10(ymin)+1,log10(ymax)-1)) { 
    minor_breaks <- c(minor_breaks, 
            10^i*seq (10/ticks, 10-10/ticks, by= 10/ticks))
    }

tick.sizes <- 
  c(rep(1, 1+log10(ymax/ymin)), rep(0.5, length(minor_breaks)) )
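
The minor breaks built by the loop above could also be computed without iteration. A minimal sketch, assuming the same `ymin`, `ymax` and `ticks` conventions (powers of ten for the limits); the function name `log_minor_breaks` is hypothetical:

```r
# Hypothetical helper: minor breaks for every decade between ymin and ymax.
log_minor_breaks <- function(ymin, ymax, ticks = 10) {
  # Multipliers within one decade: 1, 2, ..., 9 when ticks = 10
  steps <- seq(10 / ticks, 10 - 10 / ticks, by = 10 / ticks)
  # Start of each decade: ymin, 10 * ymin, ..., ymax / 10
  decades <- 10^(round(log10(ymin)):(round(log10(ymax)) - 1))
  # outer() gives a steps-by-decades matrix; as.vector() reads it decade by decade
  as.vector(outer(steps, decades))
}

mb <- log_minor_breaks(1, 10^5)
length(mb)
```

With `ymin = 1`, `ymax = 10^5` and `ticks = 10` this yields nine values per decade over five decades, in the same order as the loop.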

## --------- Plot non-malignant adjacent tissue data from TCGA --------- ##

AFP_TCGA_Normal <- 
  read.csv("AFP_OncoLand_TCGA_B37_3Apr2018.csv")
AFP_TCGA_Normal <- 
  AFP_TCGA_Normal[grep("Solid Tissue Normal", 
    AFP_TCGA_Normal$Sample.Type), ]

# Generate ordered factor levels for organ system classes

AFP_TCGA_Normal$OrganSystem <- 
  rep("NA", nrow(AFP_TCGA_Normal))
AFP_TCGA_Normal$OrganSystem[grep(   "nervous system cancer", 
  AFP_TCGA_Normal$DiseaseCategory)] <- "Nervous and Sensory"
AFP_TCGA_Normal$OrganSystem[grep(   "musculoskeletal system cancer", 
  AFP_TCGA_Normal$DiseaseCategory)] <- "Muscoskeletal and Skin"
AFP_TCGA_Normal$OrganSystem[grep(   "skin cancer", 
  AFP_TCGA_Normal$DiseaseCategory)] <- "Muscoskeletal and Skin"
AFP_TCGA_Normal$OrganSystem[grep(   "immune system cancer", 
  AFP_TCGA_Normal$DiseaseCategory)] <- "Hematological and Immunological"
AFP_TCGA_Normal$OrganSystem[grep(   "hematologic cancer", 
  AFP_TCGA_Normal$DiseaseCategory)] <- "Hematological and Immunological"
AFP_TCGA_Normal$OrganSystem[grep(   "reproductive organ cancer", 
  AFP_TCGA_Normal$DiseaseCategory)] <- "Reproductive and Breast"
AFP_TCGA_Normal$OrganSystem[grep(   "head and neck cancer", 
  AFP_TCGA_Normal$DiseaseCategory)] <- "Respiratory"
AFP_TCGA_Normal$OrganSystem[grep(   "respiratory system cancer",
  AFP_TCGA_Normal$DiseaseCategory)]   <- "Respiratory"
AFP_TCGA_Normal$OrganSystem[grep(   "gastrointestinal system cancer",
  AFP_TCGA_Normal$DiseaseCategory)]   <- "Digestive and Excretory"
AFP_TCGA_Normal$OrganSystem[grep(   "urinary system cancer", 
  AFP_TCGA_Normal$DiseaseCategory)] <- "Digestive and Excretory"
AFP_TCGA_Normal$OrganSystem[grep(   "endocrine gland cancer", 
  AFP_TCGA_Normal$DiseaseCategory)] <- "Endocrine and Adipose"

AFP_TCGA_HCC <- 
  read.csv("AFP_OncoLand_TCGA_B37_3Apr2018.csv")
AFP_TCGA_HCC <- 
  AFP_TCGA_HCC [grep("Liver hepatocellular carcinoma", 
    AFP_TCGA_HCC$Disease), ]


AFP_TCGA_Normal$OrganSystem <- 
  factor(AFP_TCGA_Normal$OrganSystem, 
        levels= c(
          "Digestive and Excretory", "Reproductive and Breast", "Nervous and Sensory",
          "Respiratory", "Hematological and Immunological", 
          "Muscoskeletal and Skin", "Endocrine and Adipose", "Cardiovascular and Lymphatic"
            ),
        ordered=TRUE
        )
AFP_TCGA_Normal$RnaSeq_Transcript[AFP_TCGA_Normal$RnaSeq_Transcript < 1] <- 1


# Generate dummy values for categories with few or no members which are not to be shown in the plot

AFP_TCGA_Normal$RnaSeq_Transcript[grep("Hematological and Immunological", 
  AFP_TCGA_Normal$OrganSystem)] <- 1
AFP_TCGA_Normal$RnaSeq_Transcript[grep("Muscoskeletal and Skin" , 
  AFP_TCGA_Normal$OrganSystem)] <- 1
# [+1] is simply index 1: the first row is overwritten to serve as the dummy entry for this category
AFP_TCGA_Normal$OrganSystem[1] <- "Cardiovascular and Lymphatic"
AFP_TCGA_Normal$RnaSeq_Transcript[1] <- 1


# Generate violin plot, using Adaptimmune colour palette
# Cheat: doubled "Endocrine and Adipose" category colour to avoid display as "Cardio" colour

pViol.AFP_TCGA_Normal <- 
    ggviolin(
      AFP_TCGA_Normal, 
          x= "OrganSystem", order= levels(AFP_TCGA_Normal$OrganSystem),      
          y= "RnaSeq_Transcript", size= 0.6, 
          scale= "width", trim= TRUE, fill= "OrganSystem", 
          palette = c( "#9F8BB9", "#90D8DC", "#999EC9", "#9FEADD",
                       "#91B2D3", "#8CC6D9", "#8CC6D9", "#DAFFE2")
                    ) +
    scale_y_log10(
     breaks=c(10^(log10(ymin):log10(ymax)), 
     minor_breaks),
     labels = c( 
       paste("<", ymin),sprintf("%.f", 
         10^((log10(ymin)+1):log10(ymax))
         ),
            rep( '', length(minor_breaks)) )
      )  +
    geom_jitter(shape=16, size=1.5, width=0.045, alpha=0.5) +
    theme ( 
      axis.ticks.x = element_line(size=tick.sizes),
        axis.ticks.length = unit(0.3,"cm"),
        aspect.ratio = 0.5, 
        plot.title= element_blank(), 
        legend.position = "none", 
        text=element_text(size=16), 
        axis.title= element_blank(),
        axis.text.x= element_blank(),
        plot.margin = margin(1, 1, 1, 0, "cm")
        ) +
    geom_hline(
      yintercept=30, color="#cc0000") +
    coord_flip(
      xlim = NULL, ylim = c(1,50000), expand= T)

pViol.AFP_TCGA_Normal <- 
  annotate_figure(
    pViol.AFP_TCGA_Normal,
        left = text_grob(
          "Organ Systems", color= "#3b5998", 
          face= "bold", size= 16, rot= 90, vjust= 0.5
          )
        )



## --------- Plot TCGA high expression tissues and HCC --------- ##

# Blank space before names makes sure max. length of category labels is the same in both panels when combined
AFP_TCGA_Normal$High <- 
  rep("Other Tissues", nrow(AFP_TCGA_Normal))
AFP_TCGA_Normal$High[grep(     "Breast", 
  AFP_TCGA_Normal$Disease)] <- "Breast"
AFP_TCGA_Normal$High[grep(     "Kidney", 
  AFP_TCGA_Normal$Disease)] <- "Kidney"
AFP_TCGA_Normal$High[grep(     "Cholangiocarcinoma", 
  AFP_TCGA_Normal$Disease)] <- "Biliary"
AFP_TCGA_Normal$High[grep(                                   "Liver", 
  AFP_TCGA_Normal$Disease)] <- "                              Liver"
# AFP_TCGA_Normal <- AFP_TCGA_Normal[-grep("HCC", AFP_TCGA_Normal$OrganSystem), ]

AFP_TCGA_HCC <- 
  read.csv("AFP_OncoLand_TCGA_B37_3Apr2018.csv")
AFP_TCGA_HCC <- AFP_TCGA_HCC[-grep("Solid Tissue Normal", 
  AFP_TCGA_HCC$Sample.Type), ]
AFP_TCGA_HCC <- AFP_TCGA_HCC[grep("Liver", 
  AFP_TCGA_HCC$Disease), ]
AFP_TCGA_HCC$OrganSystem <- rep("Digestive and Excretory", 
  nrow(AFP_TCGA_HCC))
AFP_TCGA_HCC$High <- 
  rep("HCC", nrow(AFP_TCGA_HCC))

# Set value lower boundary to be just below estimated detection limit
AFP_TCGA_HCC$RnaSeq_Transcript[AFP_TCGA_HCC$RnaSeq_Transcript < 1] <- 1

AFP_TCGA_HCC <- 
  rbind (AFP_TCGA_HCC, AFP_TCGA_Normal)

AFP_TCGA_HCC$High <- 
  factor(AFP_TCGA_HCC$High, 
        levels = c("                              Liver", 
                 "Biliary", "Kidney", "Breast", "HCC", 
                 "Other Tissues"
                 ), 
        ordered=TRUE
        )


# Generate violin plot, using Adaptimmune colour palette

pViol.AFP_TCGA_HCC <- 
    ggviolin(
      AFP_TCGA_HCC [-grep(levels(AFP_TCGA_HCC$High)[6], 
      AFP_TCGA_HCC$High), ], 
      x="High", order=levels(AFP_TCGA_HCC$High)[-6], 
        y="RnaSeq_Transcript", size=0.7, 
        scale = "width", trim=TRUE, fill="High", 
      palette= c(
        "#9F8BB9", "#9F8BB9", "#9F8BB9", "#90D8DC", "#ff3333")
            ) +
    scale_y_log10(
      breaks=c(10^(log10(ymin):log10(ymax) ), 
      minor_breaks
      ),
    labels = c(
     c(
       paste("<", ymin), 
       comma (10^((log10(ymin)+1):log10(ymax) )
      ),
    rep( '', length(minor_breaks)) )))  +
    geom_jitter(
      shape=16, size=1.5, width=0.045, alpha=0.5) +
    labs(y = "FPKM", vjust=1) +
    theme(  
     axis.ticks.x = element_line(size=tick.sizes),
     axis.ticks.length = unit(0.3,"cm"),
     aspect.ratio = 0.5, plot.title= element_blank(), 
     legend.position = "none", axis.text=element_text(size=16), 
     axis.title.x = element_text(size=16), 
     axis.title.y = element_blank(),
     plot.margin = margin(0.5, 0, 0.5, 1, "cm")
     ) +
    geom_hline(
     yintercept=30, color="#cc0000") +
    coord_flip(
     xlim = NULL, ylim = c(1,50000), expand= T)

pViol.AFP_TCGA_HCC <- 
  annotate_figure(
    pViol.AFP_TCGA_HCC,
        left = text_grob("Tissues", color = "#cc0000", 
        face = "bold", size = 16, rot = 90, vjust= 0.5 )
        )

pViol.AFP_TCGA_HCC <- 
  annotate_figure(
    pViol.AFP_TCGA_HCC,
    left = text_grob("High Expression", color = "#cc0000", 
    face = "bold", size = 16, rot = 90, vjust= 0.5 )
    )
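
Several steps in this script drop rows with `df[-grep(pattern, column), ]`. This works as long as the pattern matches at least one row; if nothing matches, grep() returns an empty index and the subset comes back with zero rows. A minimal sketch with a toy data frame (not the TCGA data) showing the behaviour and a grepl() alternative:

```r
# Toy data without any "Solid Tissue Normal" samples
toy <- data.frame(
  Sample.Type = c("Primary Tumor", "Primary Tumor", "Metastatic"),
  value       = c(10, 20, 30)
)

# No row matches, so grep() returns integer(0) and the negative index selects nothing
dropped_with_grep <- toy[-grep("Solid Tissue Normal", toy$Sample.Type), ]
nrow(dropped_with_grep)    # 0

# Logical negation keeps every non-matching row, whether or not a match exists
dropped_with_grepl <- toy[!grepl("Solid Tissue Normal", toy$Sample.Type), ]
nrow(dropped_with_grepl)   # 3
```

When the pattern does match, both forms return the same rows, so switching to `!grepl()` would not change the figures.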

## --------- Plot normal tissue data from GTEx --------- ##

AFP_GTEx2 <- read.csv("AFP_OncoLand_GTEx_B37_3Apr2018.csv")

# Generate ordered factor levels for organ system classes

AFP_GTEx2$OrganSystem <- rep("NA", nrow(AFP_GTEx2))
AFP_GTEx2$OrganSystem[grep("Adipose Tissue", AFP_GTEx2$Tissue)] <- "Endocrine and Adipose"
AFP_GTEx2$OrganSystem[grep("Adrenal Gland", AFP_GTEx2$Tissue)] <- "Endocrine and Adipose"
AFP_GTEx2$OrganSystem[grep("Pituitary", AFP_GTEx2$Tissue)] <- "Endocrine and Adipose"
AFP_GTEx2$OrganSystem[grep("Thyroid", AFP_GTEx2$Tissue)] <- "Endocrine and Adipose"
AFP_GTEx2$OrganSystem[grep("Bladder", AFP_GTEx2$Tissue)] <- "Digestive and Excretory"
AFP_GTEx2$OrganSystem[grep("Colon", AFP_GTEx2$Tissue)] <- "Digestive and Excretory"
AFP_GTEx2$OrganSystem[grep("Esophagus", AFP_GTEx2$Tissue)] <- "Digestive and Excretory"
AFP_GTEx2$OrganSystem[grep("Kidney", AFP_GTEx2$Tissue)] <- "Digestive and Excretory"
AFP_GTEx2$OrganSystem[grep("Liver", AFP_GTEx2$Tissue)] <- "Digestive and Excretory"
AFP_GTEx2$OrganSystem[grep("Pancreas", AFP_GTEx2$Tissue)] <- "Digestive and Excretory"
AFP_GTEx2$OrganSystem[grep("Salivary Gland", AFP_GTEx2$Tissue)] <- "Digestive and Excretory"
AFP_GTEx2$OrganSystem[grep("Small Intestine", AFP_GTEx2$Tissue)] <- "Digestive and Excretory"
AFP_GTEx2$OrganSystem[grep("Stomach", AFP_GTEx2$Tissue)] <- "Digestive and Excretory"
AFP_GTEx2$OrganSystem[grep("Blood", AFP_GTEx2$Tissue)] <- "Hematological and Immunological"
AFP_GTEx2$OrganSystem[grep("Blood Vessel", AFP_GTEx2$Tissue)] <- "Cardiovascular and Lymphatic"
AFP_GTEx2$OrganSystem[grep("Heart", AFP_GTEx2$Tissue)] <- "Cardiovascular and Lymphatic"
AFP_GTEx2$OrganSystem[grep("Spleen", AFP_GTEx2$Tissue)] <- "Cardiovascular and Lymphatic"
AFP_GTEx2$OrganSystem[grep("Brain", AFP_GTEx2$Tissue)] <- "Nervous and Sensory"
AFP_GTEx2$OrganSystem[grep("Nerve", AFP_GTEx2$Tissue)] <- "Nervous and Sensory"
AFP_GTEx2$OrganSystem[grep("Breast", AFP_GTEx2$Tissue)] <- "Reproductive and Breast"
AFP_GTEx2$OrganSystem[grep("Cervix Uteri", AFP_GTEx2$Tissue)] <-"Reproductive and Breast"
AFP_GTEx2$OrganSystem[grep("Fallopian Tube", AFP_GTEx2$Tissue)] <- "Reproductive and Breast"
AFP_GTEx2$OrganSystem[grep("Ovary", AFP_GTEx2$Tissue)] <- "Reproductive and Breast"
AFP_GTEx2$OrganSystem[grep("Prostate", AFP_GTEx2$Tissue)] <- "Reproductive and Breast"
AFP_GTEx2$OrganSystem[grep("Testis", AFP_GTEx2$Tissue)] <- "Reproductive and Breast"
AFP_GTEx2$OrganSystem[grep("Uterus", AFP_GTEx2$Tissue)] <- "Reproductive and Breast"
AFP_GTEx2$OrganSystem[grep("Vagina", AFP_GTEx2$Tissue)] <- "Reproductive and Breast"
AFP_GTEx2$OrganSystem[grep("Lung", AFP_GTEx2$Tissue)] <- "Respiratory"
AFP_GTEx2$OrganSystem[grep("Muscle", AFP_GTEx2$Tissue)] <- "Muscoskeletal and Skin"
AFP_GTEx2$OrganSystem[grep("Skin", AFP_GTEx2$Tissue)] <- "Muscoskeletal and Skin"

AFP_GTEx2$OrganSystem <- factor(AFP_GTEx2$OrganSystem, 
                levels= c("Digestive and Excretory", "Reproductive and Breast", "Nervous and Sensory", 
                    "Respiratory", "Hematological and Immunological",   
                    "Muscoskeletal and Skin", "Endocrine and Adipose", "Cardiovascular and Lymphatic"),
                ordered=TRUE)
AFP_GTEx2$RnaSeq_Transcript[AFP_GTEx2$RnaSeq_Transcript < 1] <- 1


# Generate violin plot, using Adaptimmune colour palette

pViol.AFP_GTEx2_Organs <- 
    ggviolin (AFP_GTEx2, 
        x= "OrganSystem", order= levels(AFP_GTEx2$OrganSystem), 
        y= "RnaSeq_Transcript", size= 0.6, 
        scale= "width", trim= TRUE, fill= "OrganSystem", 
        palette=c("#9F8BB9", "#90D8DC", "#999EC9", "#9FEADD","#91B2D3", "#B8FADD", "#8CC6D9", "#DAFFE2")) +
    scale_y_log10(breaks=c(10^(log10(ymin):log10(ymax)), minor_breaks),
        labels=c(paste("<", ymin),sprintf("%.f", 10^((log10(ymin)+1):log10(ymax))),
                rep( '', length(minor_breaks)) ))  +
    geom_jitter(shape=16, size=1.5, width=0.045, alpha=0.5) +
    theme ( axis.ticks.x = element_line(size=tick.sizes),
        axis.ticks.length = unit(0.3,"cm"),
        aspect.ratio = 0.5, plot.title= element_blank(), 
        legend.position = "none", text=element_text(size=16), 
        axis.title= element_blank(),
        axis.text.x= element_blank(),
        plot.margin = margin(1, 1, 1, 0, "cm")) +
    geom_hline(yintercept=30, color="#cc0000") +
    coord_flip(xlim = NULL, ylim = c(1,100), expand= T)




## --------- Plot GTEx high expression tissues --------- ##

# Blank space before names makes sure max. length of category labels is the same in both panels when combined
AFP_GTEx2$High <- rep("Other Tissues", nrow(AFP_GTEx2))
AFP_GTEx2$High[grep("Liver", AFP_GTEx2$Tissue.Detail.Type)] <- "Liver"
AFP_GTEx2$High[grep("Breast - Mammary", AFP_GTEx2$Tissue.Detail.Type)] <-  "              Breast-Mammary"
AFP_GTEx2$High[grep("Kidney - Cortex", AFP_GTEx2$Tissue.Detail.Type)] <- "Kidney-Cortex"
AFP_GTEx2$High[grep("Uterus", AFP_GTEx2$Tissue.Detail.Type)] <- "Uterus"

# Generate violin plot, using Adaptimmune colour palette

AFP_GTEx2$High <- factor(AFP_GTEx2$High, 
     levels=c(  "              Breast-Mammary", 
               "Uterus", "Kidney-Cortex", "Liver", "Other Tissues"), ordered=TRUE)

pViol.AFP_GTEx2_High <- 
ggviolin (AFP_GTEx2 [-grep("Other Tissues", AFP_GTEx2$High), ], 
          x="High", order=levels(AFP_GTEx2$High)[4:1], 
          y="RnaSeq_Transcript", size=0.6, 
      scale = "width", trim=TRUE, fill="High", 
          palette= c("#9F8BB9", "#9F8BB9", "#90D8DC", "#90D8DC")) +
    scale_y_log10(breaks=c(10^(log10(ymin):log10(ymax)), minor_breaks),
        labels=c(paste("<", ymin),sprintf("%.f", 10^((log10(ymin)+1):log10(ymax))),
                rep( '', length(minor_breaks)) ))  +
    geom_jitter(shape=16, size=1.5, width=0.045, alpha=0.5) +
    labs(y = "FPKM", vjust=1) +
    theme ( axis.ticks.x = element_line(size=tick.sizes),
        axis.ticks.length = unit(0.3,"cm"),
        aspect.ratio = 0.5, plot.title= element_blank(), 
        legend.position = "none", axis.text=element_text(size=16), 
        axis.title.x = element_text(size=16), 
        axis.title.y = element_blank(),
        plot.margin = margin(0.5, 0, 0.5, 1, "cm")) +
    geom_hline(yintercept=30, color="#cc0000") +
    coord_flip(xlim = NULL, ylim = c(1,100), expand= T)
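
The one-line-per-tissue grep() assignments used for the GTEx data above could be replaced by a lookup table. A minimal sketch, assuming the `Tissue` column holds exactly the names used as patterns in the script (the script itself uses partial matching); only a few tissues are shown, and the toy data frame is made up:

```r
# Named vector: tissue name -> organ system (a subset of the mapping in the script)
organ_lookup <- c(
  "Adipose Tissue" = "Endocrine and Adipose",
  "Liver"          = "Digestive and Excretory",
  "Blood"          = "Hematological and Immunological",
  "Blood Vessel"   = "Cardiovascular and Lymphatic",
  "Lung"           = "Respiratory"
)

# Toy data; "Bone Marrow" is deliberately absent from the lookup table
toy <- data.frame(Tissue = c("Liver", "Blood Vessel", "Blood", "Bone Marrow"))

# Exact-name lookup; unmapped tissues fall back to the "NA" string used in the script
toy$OrganSystem <- unname(organ_lookup[as.character(toy$Tissue)])
toy$OrganSystem[is.na(toy$OrganSystem)] <- "NA"
toy$OrganSystem
```

Because the lookup matches whole names, the order of entries no longer matters, unlike the "Blood" and "Blood Vessel" patterns in the grep() version.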

## --------- Plot in-house qPCR data --------- ##

AFP_qPCR <- read.csv("AFP_qPCR.csv")

AFP_qPCR$OrganSystem <- factor(AFP_qPCR$OrganSystem, 
                levels= c("Digestive and Excretory", "Reproductive and Breast",
                    "Nervous and Sensory", "Respiratory",
                    "Hematological and Immunological", "Muscoskeletal and Skin",
                    "Endocrine and Adipose", "Cardiovascular and Lymphatic",
                    "HCC", "Adjacent to tumors"),
                ordered=TRUE)


# Convert zeros to "background" to allow log transformation

AFP_qPCR$Mean.transcript.number..100ng.RNA[AFP_qPCR$Mean.transcript.number..100ng.RNA < 100] <- 100


# Generate violin plot for organ system expression levels, using Adaptimmune colour palette

pViol.AFP_qPCR <- 
    ggviolin (AFP_qPCR[-grep("HCC", AFP_qPCR$OrganSystem), ], 
        x="OrganSystem", order=levels(AFP_qPCR$OrganSystem), 
        y="Mean.transcript.number..100ng.RNA", size=0.7, 
        scale = "width", fill="OrganSystem", trim=TRUE, 
        palette=c("#9F8BB9", "#90D8DC", "#999EC9", "#9FEADD","#91B2D3", "#B8FADD", "#8CC6D9", "#DAFFE2")) +
    scale_y_log10(breaks=c(10^(log10(ymin):log10(ymax)), minor_breaks),
        labels=c(paste("<", ymin),sprintf("%.f", 10^((log10(ymin)+1):log10(ymax))),
                rep( '', length(minor_breaks)) ))  +
    geom_jitter(shape=16, size=3, width=0.05, alpha=0.5) +
    theme ( axis.ticks.x = element_line(size=tick.sizes),
        axis.ticks.length = unit(0.3,"cm"),
        aspect.ratio = 0.5, plot.title= element_blank(), 
        legend.position = "none", text=element_text(size=16), 
        axis.title= element_blank(),
        axis.text.x= element_blank(),
        plot.margin = margin(1, 1, 1, 0, "cm")) +
    geom_hline( yintercept=12000, color="#cc0000") +
    coord_flip(xlim = NULL, ylim = c(130,3*10^6), expand = T)

pViol.AFP_qPCR <- annotate_figure(pViol.AFP_qPCR,
            left = text_grob("Organ Systems", color= "#3b5998", face= "bold", 
            size= 16, rot= 90, vjust= 0.5 ))


# Violin plot for high expression tissues and HCC

AFP_qPCR_High <- read.csv("AFP_qPCR.csv") 

AFP_qPCR$Tissue <- rep("Other Tissues", nrow(AFP_qPCR))

AFP_qPCR$Tissue[grep("Liver", AFP_qPCR$Sample.Name)] <- "                 Normal Liver"
AFP_qPCR$Tissue[grep("Mammary gland", AFP_qPCR$Sample.Name)] <- "Mammary Gland"
AFP_qPCR$Tissue[grep("Adrenal gland", AFP_qPCR$Sample.Name)] <- "Adrenal Gland"
AFP_qPCR$Tissue[grep("Cirrhosis of liver", AFP_qPCR$Sample.Name)] <- "Cirrhotic Liver"
AFP_qPCR$Tissue[grep("Fatty changes to liver", AFP_qPCR$Sample.Name)] <- "Fatty Liver"
AFP_qPCR$Tissue[grep("Hepatitis", AFP_qPCR$Sample.Name)] <- "Hepatitis"
AFP_qPCR$Tissue[grep("Hepatitis, chronic", AFP_qPCR$Sample.Name)] <- "Chronic Hepatitis"
AFP_qPCR$Tissue[grep("HCC", AFP_qPCR$OrganSystem)] <- "HCC"

AFP_qPCR$Tissue <- factor(AFP_qPCR$Tissue, levels=c("Other Tissues",  
                        "Cirrhotic Liver", "Fatty Liver", "Hepatitis", "Chronic Hepatitis",
                        "                 Normal Liver",
                        "Mammary Gland", "Adrenal Gland", 
                        "HCC"), 
                    ordered=TRUE)


pViol.AFP_qPCR_High <- 
    ggviolin (AFP_qPCR[-grep("Other Tissue", AFP_qPCR$Tissue), ], 
        x="Tissue", order=levels(AFP_qPCR$Tissue), 
        y="Mean.transcript.number..100ng.RNA", size=0.7, 
        scale = "width", fill="Tissue", trim=TRUE,
        palette=c( "#9F8BB9", "#9F8BB9", "#9F8BB9", "#ff3333")) +            
  # palette ignores categories w/o violin area
    scale_y_log10(breaks=c(10^(log10(ymin):log10(ymax)), minor_breaks),
            labels=c(c(paste("<", ymin), comma (10^((log10(ymin)+1):log10(ymax))),
                rep( '', length(minor_breaks)) )))  +
    geom_jitter(shape=16, size=3, width=0.05, alpha=0.5) +
    labs( y = "Transcripts / 100ng total RNA", vjust=1) +
    theme ( axis.ticks.x = element_line(size=tick.sizes),
        axis.ticks.length = unit(0.3,"cm"),
        aspect.ratio = 0.5, plot.title= element_blank(), 
        legend.position = "none", axis.text=element_text(size=16), 
        axis.title.x = element_text(size=16), 
        axis.title.y = element_blank(),
        plot.margin = margin(0.5, 0, 0.5, 1, "cm")) +
    geom_hline( yintercept=12000, color="#cc0000") +
    coord_flip(xlim = NULL, ylim = c(130,3*10^6), expand = T)

pViol.AFP_qPCR_High <- annotate_figure(pViol.AFP_qPCR_High,
            left = text_grob("Tissues", color = "#cc0000", 
            face = "bold", size = 16, rot = 90, vjust= 0.5 ))
pViol.AFP_qPCR_High <- annotate_figure(pViol.AFP_qPCR_High,
            left = text_grob("High Expression", color = "#cc0000", 
            face = "bold", size = 16, rot = 90, vjust= 0.5 ))


# Combine plots // needs adjusting of subplot size and Low/Medium/High text
# Because of coord_flip, widths become heights and vice versa

pViol.AFP_qPCR_LowHigh <- ggarrange(pViol.AFP_qPCR, pViol.AFP_qPCR_High, 
                ncol = 1, nrow = 2, labels = NULL,
                align = c("v"), widths = c(1,1), heights = c(1,1.01),
                legend = NULL, common.legend = FALSE)
ggexport(pViol.AFP_qPCR_LowHigh, filename="AFP_qPCR_LowHigh.png", width=6300, height=5000, res=450)



5. Reference

  1. Tuning T-Cell Receptor Affinity to Optimize Clinical Risk-Benefit When Targeting Alpha-Fetoprotein-Positive Liver Cancer. Docta RY, Ferronha T, Sanderson JP, Weissensteiner T, Pope GR, Bennett AD, Pumphrey NJ, Ferjentsik Z, Quinn LL, Wiedermann GE, Anderson VE, Saini M, Maroto M, Norry E, Gerry AB. Hepatology. 2019 May;69(5):2061-2075.