lab3.df <- read.csv(file = "Yi_et_al_PLOS_betatropin_data_subset_lab_3_CSV.csv")
head(lab3.df)
## animal.ID trt slide total.beta.cell Ki67.cells percent.Ki67 X X.1 X.2
## 1 CM7721 MBP 3 1887 28 1.48 NA NA NA
## 2 CM7722 MBP 3 897 6 0.67 NA NA NA
## 3 CM7725 MBP 3 1997 11 0.55 NA NA NA
## 4 CM7726 MBP 3 4513 17 0.38 NA NA NA
## 5 CM7729 MBP 3 930 3 0.32 NA NA NA
## 6 CM7731 MBP 3 3535 12 0.34 NA NA NA
## X.3
## 1 NA
## 2 NA
## 3 NA
## 4 NA
## 5 NA
## 6 NA
Not that in my version of the file there are 3 empty columns to the left. Normally I’d probably check the Excel file to make sure there are now problems, delete these by hand in Excel and re-load the file.
I can also remove them in R like this using brackets and column indexing. “-c(7:10)” means “remove columns 7 through 10”. “c(7:10)” specifies those columns, and the negative signs says remove them (this is called “negative indexing”)
lab3.df <- lab3.df[, -c(7:10)]
See the links below for more on negative indexing:
http://stackoverflow.com/questions/7336679/in-r-what-does-a-negative-index-do?noredirect=1&lq=1
The summaryBy command
library(doBy)
The std.error command
library(plotrix)
Yi.summary.stats <- summaryBy(percent.Ki67 ~ trt,
data = lab3.df,
FUN = c(mean, std.error))
Look at those summary stats
Yi.summary.stats
## trt percent.Ki67.mean percent.Ki67.std.error
## 1 MBP 0.793 0.1566174
## 2 MBP+hBT 1.652 0.1061320
barplot2() is in gplots
library(gplots)
##
## Attaching package: 'gplots'
## The following object is masked from 'package:plotrix':
##
## plotCI
## The following object is masked from 'package:stats':
##
## lowess
If I forget how barplot2() works I can get the help file with ?barplot2
barplot2(height = Yi.summary.stats$percent.Ki67.mean,
names.arg = Yi.summary.stats$trt,
plot.ci = TRUE,
ci.l = Yi.summary.stats$percent.Ki67.mean -
Yi.summary.stats$percent.Ki67.std.error,
ci.u = Yi.summary.stats$percent.Ki67.mean +
Yi.summary.stats$percent.Ki67.std.error)
with(Yi.summary.stats,
barplot2(height = percent.Ki67.mean,
names.arg = trt,
plot.ci = TRUE,
ci.l = percent.Ki67.mean - percent.Ki67.std.error,
ci.u = percent.Ki67.mean + percent.Ki67.std.error))
plotCI(x = Yi.summary.stats$percent.Ki67.mean,
ui = Yi.summary.stats$percent.Ki67.mean+
Yi.summary.stats$percent.Ki67.std.error,
li = Yi.summary.stats$percent.Ki67.mean-
Yi.summary.stats$percent.Ki67.std.error)
Still ugly, but less code!
with(Yi.summary.stats,
plotCI(x = percent.Ki67.mean,
ui = percent.Ki67.mean+
percent.Ki67.std.error,
li = percent.Ki67.mean-
percent.Ki67.std.error))
We actually don’t have to use the summaryBy() command to get our means and stadard errors. gplots has a function plotmeans() that can do all of this in one step, using formula notation. It even adds some other stuff. I believe the default is for 95% CI, NOT Standard erros.
plotmeans(percent.Ki67 ~ trt, data = lab3.df)
plotmeans(percent.Ki67 ~ trt,
data = lab3.df,
n.labe = F,
connect = F)
library(beeswarm)
## Warning: package 'beeswarm' was built under R version 3.3.2
beeswarm(percent.Ki67 ~ trt,
data = lab3.df)
t.test(percent.Ki67 ~ trt,
data = lab3.df)
##
## Welch Two Sample t-test
##
## data: percent.Ki67 by trt
## t = -4.5404, df = 15.826, p-value = 0.0003433
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -1.2604237 -0.4575763
## sample estimates:
## mean in group MBP mean in group MBP+hBT
## 0.793 1.652
Cox, Yi, Melton, Kushner et al. 2016. Resolving Discrepant Findings on ANGPTL8 in ??-Cell Proliferation: A Collaborative Approach to Resolving the Betatrophin Controversy. PLoSONE.
Yi P, Park JS, Melton DA (2013) Betatrophin:a hormone that controls pancreatic beta cell proliferation. Cell 153:747-758.