Before you do anything else, make sure you’ve got the latest versions of R and RStudio installed!
If you want to change the laguage in RStudio to English, run the following code:
Sys.setenv(LANG = "en")
Everything in R is an object with a name. Everything you do in R is a function. Let’s create a new object called my.data some experimental data. The data is stored as a text file at the web link https://dl.dropboxusercontent.com/u/7618380/priming.txt
priming <- read.table(file = "https://dl.dropboxusercontent.com/u/7618380/priming.txt",
sep = "\t", # File is tab-delimited
header = T # There is a header row
)
You can look at the data using the View() function applied to the priming object
# Show the priming dataset in a 'spreadsheet'
View(priming)
You can get summary information about the data using the summary() function
# Calculate summary statistics of each column
summary(priming)
id sex age prime time
Min. : 1.00 f:53 Min. :14.00 elderly:50 Min. : 7.200
1st Qu.: 25.75 m:47 1st Qu.:21.00 neutral:50 1st Qu.: 9.375
Median : 50.50 Median :22.00 Median :10.300
Mean : 50.50 Mean :22.21 Mean :10.337
3rd Qu.: 75.25 3rd Qu.:23.00 3rd Qu.:11.225
Max. :100.00 Max. :27.00 Max. :14.200
You can calculate simple statistics using functions like mean(), sd(), and table()
# Calculate some basic summary statistics of specific columns
mean(priming$age) # What was the mean age of the participants?
[1] 22.21
median(priming$time) # What was the median time?
[1] 10.3
table(priming$sex) # How many people were there of each sex?
f m
53 47
Plotting is super easy in R!
# Simple histogram of age
hist(x = priming$age)
You can customize the look of your plot with additional plotting arguments
# Age histogram with additional arguments
hist(x = priming$age,
border = "white",
col = "salmon",
main = "Age Distribution",
xlab = "Age",
yaxt = "n",
ylab = ""
)
Here’s a boxplot:
# Boxplot of times by priming condition
boxplot(formula = time ~ prime,
data = priming,
col = c("skyblue1", "tomato"),
ylab = "Walking Times",
xlab = "Priming Condition",
main = "Walking times by priming condition (Study 1)"
)
Now a scatterplot!
# Scatterplot of age and time
plot(x = priming$age,
y = priming$time,
pch = 16, # point type
col = gray(.5, .5), # point color
cex = 2, # point sizes
bty = "n", # remove outer box
xlab = "Age",
ylab = "Time",
main = "Age and Walking times\nStudy 1"
)
# Add a regression line!
abline(lm(time ~ age, data = priming),
col = "red",
lty = 2,
lwd = 2
)
Let’s do a t-test
# Was there a signficant effect of prime on time?
t.test(formula = time ~ prime,
data = priming)
Welch Two Sample t-test
data: time by prime
t = 4.5241, df = 97.236, p-value = 1.716e-05
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
0.6028459 1.5451541
sample estimates:
mean in group elderly mean in group neutral
10.874 9.800
Regression? No problem!
# Which variables affected walking times?
summary(lm(time ~ prime + sex + id + age,
data = priming
))
Call:
lm(formula = time ~ prime + sex + id + age, data = priming)
Residuals:
Min 1Q Median 3Q Max
-2.2206 -0.8448 0.1198 0.8926 3.1407
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 12.1310136 1.2640597 9.597 1.2e-15 ***
primeneutral -1.0052648 0.4713175 -2.133 0.0355 *
sexm -0.3636308 0.2707522 -1.343 0.1825
id 0.0003651 0.0082321 0.044 0.9647
age -0.0512793 0.0579237 -0.885 0.3782
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.177 on 95 degrees of freedom
Multiple R-squared: 0.2111, Adjusted R-squared: 0.1779
F-statistic: 6.355 on 4 and 95 DF, p-value: 0.0001417
One of the best things about R is that anyone can program new functions and share them using packages. Let’s install two packages:
# Install some external packages
install.packages("BayesFactor") # BayesFactor package: For Bayesian analyses
install.packages("devtools")
devtools::install_github("ndphillips/yarrr") # yarrr package: For R piratery
Let’s do a Bayesian t-test using the BayesFactor package
library(BayesFactor)
# Bayesian t-test comparing time between priming conditions
ttestBF(formula = time ~ prime,
data = priming)
Bayes factor analysis
--------------
[1] Alt., r=0.707 : 1068.152 ±0%
Against denominator:
Null, mu1-mu2 = 0
---
Bayes factor type: BFindepSample, JZS
Now let’s plot the posterior distribution of group differences:
bf.test <- BayesFactor::ttestBF(formula = time ~ prime,
data = priming)
samples <- BayesFactor::posterior(bf.test, iterations = 10000)
hist(x = samples[,2],
xlim = c(-2, 2),
xlab = "Difference in walking times between groups",
yaxt = "n",
ylab = "",
main = "Posterior distribution of group differences"
)
Now let’s load the yarrr package and make a pirateplot!
# Load the yarrr package
library(yarrr)
Here’s a ‘pirateplot’ from the yarrr package
# Pirateplot of time by priming condition
yarrr::pirateplot(formula = time ~ prime,
data = priming,
theme.o = 3,
xlab = "Priming Condition",
main = "Walking times by Priming Condition\nRectangles are 95% HDIs"
)
Here’s another pirateplot of a dataset on chickwen weights:
yarrr::pirateplot(formula = weight ~ Time,
data = ChickWeight,
theme.o = 3,
xlab = "Week",
main = "Weights of chickens by week\nRectangles are 95% HDIs"
)
Now wasn’t that easy? To learn more about R, check out YaRrr! The pirate’s guide to R: www.thepiratesguidetor.com