PLEASE READ YOUR DATA

## [1] "/cloud/project"
## [1] "Experiment 1"               "Experiment1.txt"           
## [3] "Experiment2.txt"            "project.Rproj"             
## [5] "rsconnect"                  "Template_formative.html"   
## [7] "Template_formative.nb.html" "Template_formative.pdf"    
## [9] "Template_formative.Rmd"

LIST THE NAMES OF YOUR VARIABLES

## [1] "control"       "Pembrolizumab"

LIST THE NUMBERS OF ROWS AND COLUMNS IN YOUR DATA

## [1] "1" "2" "3" "4" "5" "6" "7" "8" "9"
## [1] "control"       "Pembrolizumab"
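
The output above lists the row names and the column names rather than the counts themselves. A minimal sketch of how the dimensions could be reported directly, assuming a data frame shaped like `Data1` (9 rows, 2 columns). `demo` is a hypothetical stand-in filled with simulated values, not the experimental measurements:

```r
# Simulated placeholder with the same shape as Data1 (NOT the real data).
set.seed(1)
demo <- data.frame(
  control       = rnorm(9, mean = 1.0, sd = 0.1),
  Pembrolizumab = rnorm(9, mean = 0.4, sd = 0.15)
)

n_rows <- nrow(demo)  # number of observations per group
n_cols <- ncol(demo)  # number of variables (groups)

# dim() returns both counts at once, rows first and columns second
print(dim(demo))
cat("Rows:", n_rows, " Columns:", n_cols, "\n")
```

With the real data, the same calls would be made on `Data1` instead of `demo`.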

PLOT YOUR DATA

###Fig 1: Testosterone expression in mouse tumour epithelial cells. Intracellular testosterone levels were quantified by flow cytometry and are shown in arbitrary units (a.u.) for the control and Pembrolizumab-treated groups (n = 9 per group). Data are presented as a box-and-whisker plot: the line within each box marks the median, the box spans the interquartile range (25th–75th percentile), and the whiskers extend to the minimum and maximum values. Mean ± SEM: control, 1.01 ± 0.042 a.u.; Pembrolizumab, 0.392 ± 0.053 a.u. An unpaired two-tailed t-test indicated significantly lower testosterone expression in the Pembrolizumab group (p < 0.0001; significance threshold α = 0.05).

PROVIDE AN APPROPRIATE SUMMARY STATISTICS FOR YOUR DATA

##     control      Pembrolizumab   
##  Min.   :0.790   Min.   :0.2300  
##  1st Qu.:0.990   1st Qu.:0.2700  
##  Median :1.050   Median :0.3400  
##  Mean   :1.006   Mean   :0.3922  
##  3rd Qu.:1.080   3rd Qu.:0.5000  
##  Max.   :1.150   Max.   :0.6700
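
`summary()` reports the quartiles and the mean but not the standard deviation or the SEM quoted in the figure legend. A small sketch of how those could be tabulated per group, assuming SEM is defined as SD divided by the square root of n. The helper `sem()` and the data frame `demo` are hypothetical names, and `demo` holds simulated placeholder values rather than the experimental data:

```r
# Standard error of the mean: SD / sqrt(n), ignoring missing values.
sem <- function(x) sd(x, na.rm = TRUE) / sqrt(sum(!is.na(x)))

# Simulated placeholder with the same layout as Data1 (NOT the real data).
set.seed(42)
demo <- data.frame(
  control       = rnorm(9, mean = 1.0, sd = 0.1),
  Pembrolizumab = rnorm(9, mean = 0.4, sd = 0.15)
)

# One row per group: sample size, mean, SD and SEM.
group_stats <- data.frame(
  n    = sapply(demo, function(x) sum(!is.na(x))),
  mean = sapply(demo, mean),
  sd   = sapply(demo, sd),
  sem  = sapply(demo, sem)
)
print(round(group_stats, 3))
```

Replacing `demo` with `Data1` would give the values for the actual experiment.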

READ AND UNDERSTAND THE BIOLOGY, THE QUESTION, THE DESIGN AND HOW THE DATA HAVE BEEN GENERATED

STATE THE BIOLOGICAL HYPOTHESIS

Research Hypothesis: Exposure to the anti-PD1 drug, Pembrolizumab, results in a reduction of testosterone expression in mouse tumour epithelial cells when compared to the untreated control.

STATE THE STATISTICAL HYPOTHESES ACCORDING TO THE TEST EMPLOYED

H0: The mean testosterone expression of the control group equals that of the Pembrolizumab group (μ_control = μ_Pembrolizumab).

H1: The mean testosterone expression of the two groups differs (μ_control ≠ μ_Pembrolizumab).

CARRY OUT THE HYPOTHESIS TEST WITHIN THIS CHUNK AND CONDUCT ALL THE NECESSARY ROUTINES WHICH WOULD HELP YOU INTERPRET THE DATA

The Shapiro-Wilk test and Q-Q plot give no evidence of a significant departure from normality in either group.

## [1] "Control"
## 
##  Shapiro-Wilk normality test
## 
## data:  Data1$control
## W = 0.84333, p-value = 0.06289
## [1] "Pembrolizumab"
## 
##  Shapiro-Wilk normality test
## 
## data:  Data1$Pembrolizumab
## W = 0.88925, p-value = 0.196

## [1] "p value not significant. Data follows normal distribution"
## [1] "Unpaired two tailed t-test to test significance of difference"
## 
##  One Sample t-test
## 
## data:  Data1
## t = 8.6126, df = 17, p-value = 1.318e-07
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  0.5276829 0.8700949
## sample estimates:
## mean of x 
## 0.6988889
## [1] "p value is significant. Enough evidence to reject the null hypothesis"
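
The result printed above is headed "One Sample t-test" with df = 17, which is what `t.test()` reports when all 18 values are pooled and their overall mean (0.699, the average of the two group means) is tested against zero. An unpaired comparison of two groups of nine would instead be labelled a two-sample test. A hedged sketch of the two-sample call, assuming the column names `control` and `Pembrolizumab` shown earlier. `demo` is a hypothetical data frame of simulated placeholder values, so the numbers it prints are not the study's results:

```r
# Simulated placeholder with the same layout as Data1 (NOT the real data).
set.seed(123)
demo <- data.frame(
  control       = rnorm(9, mean = 1.0, sd = 0.1),
  Pembrolizumab = rnorm(9, mean = 0.4, sd = 0.15)
)

# Unpaired, two-tailed comparison of the two group means.
# var.equal = FALSE (R's default) gives Welch's test; set var.equal = TRUE
# for the classical Student's t-test, which has df = n1 + n2 - 2 = 16.
res <- t.test(demo$control, demo$Pembrolizumab,
              alternative = "two.sided",
              paired = FALSE,
              var.equal = FALSE)

print(res$method)    # should name a two-sample test, not a one-sample test
print(res$estimate)  # two estimates: one mean per group
print(res$p.value)
```

Passing the two columns as separate arguments is what makes the test a between-group comparison. With the real data, `Data1$control` and `Data1$Pembrolizumab` would replace the `demo` columns.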

PLEASE INDICATE IN APPROXIMATELY 250 WORDS, PLUS MINUS 20%, THE ELEMENTS WHICH GUIDE THE DECISION BEHIND THE SELECTION OF GRAPH AND YOUR STATISTICAL TEST

###AS PART OF THE 250 WORDS PLEASE ALSO STATE YOUR INTERPRETATION OF THE DATA

A box-and-whisker plot was chosen to show the spread and distribution of the data; it revealed no outliers. Because each group is small (n = 9), normality was assessed with the Shapiro-Wilk test, which is well suited to small samples. Neither group deviated significantly from a normal distribution (control: p = 0.0629; Pembrolizumab: p = 0.196), although the control value lies close to the 0.05 threshold. The box plot depicts the median and interquartile range, while the standard error of the mean (SEM) indicates how precisely each group mean is estimated. The non-overlapping boxes suggest a difference between the control and Pembrolizumab groups, but formal statistical testing is required to confirm it. Because the two groups are independent and the data are continuous and approximately normally distributed, an unpaired two-tailed t-test was used to compare the group means. The test yielded a highly significant result (p < 0.0001), so the null hypothesis of no difference between the group means was rejected. This supports the research hypothesis: mean testosterone expression in tumour epithelial cells from the testicular cancer mouse model was lower after Pembrolizumab treatment (0.392 a.u.) than in untreated controls (1.01 a.u.). Together, the plot and the statistical test provide evidence for a biologically meaningful decrease in testosterone levels following treatment. Studies with larger sample sizes are needed to confirm this conclusion. Overall, this approach balances the limitations of a small sample against the parametric assumptions of the t-test.
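
As a rough cross-check of the conclusion, a Welch two-sample t statistic can be reconstructed from the summary figures already reported: the group means from the summary table and the SEMs from the figure legend, with n = 9 per group. This is only a sketch built on rounded inputs, so it approximates rather than replaces a test run on the raw data. The variable names are illustrative:

```r
# Reported summary values (rounded, taken from the summary table and legend).
mean_control <- 1.006
sem_control  <- 0.042
mean_pembro  <- 0.3922
sem_pembro   <- 0.053
n            <- 9       # observations per group

# Standard error of the difference between two independent means.
se_diff <- sqrt(sem_control^2 + sem_pembro^2)

# Welch t statistic and Welch-Satterthwaite degrees of freedom.
t_stat   <- (mean_control - mean_pembro) / se_diff
df_welch <- se_diff^4 / (sem_control^4 / (n - 1) + sem_pembro^4 / (n - 1))

# Two-tailed p-value from the t distribution.
p_value <- 2 * pt(-abs(t_stat), df = df_welch)

cat("t =", round(t_stat, 2),
    " df =", round(df_welch, 1),
    " p =", signif(p_value, 3), "\n")
```

The reconstructed statistic is roughly 9 on about 15 degrees of freedom, which is consistent with the reported p < 0.0001 for a between-group comparison.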