Your first WPA!
Here is your first WPA! Open a new R script and save it as wpa_0_LastFirst (where Last and First are your last and first names). At the top of your script, put the following information. Start each line with a hash sign (#) so R knows that it’s a comment, not real R code:
# Assignment: WPA 0
# Name: Nathaniel Phillips
# Date: 22 September 2016
Copy and paste each of the following code chunks into your assignment.
1. Install and load the yarrr package
First we’ll install and load the yarrr package. The yarrr package contains many datasets and functions (like pirateplot) that we’ll use in this course.
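If you rerun your script often, you may not want to reinstall packages every time. As a minimal sketch, the helper below reports which packages are not yet installed, so you could install only those. The name missing_packages is made up for this illustration and is not part of yarrr or devtools.

```r
# Hypothetical helper, not part of the assignment: returns the names of
# the packages in `pkgs` that are not installed on this machine.
missing_packages <- function(pkgs) {
  # requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Check the two packages this task needs
to_install <- missing_packages(c("devtools", "yarrr"))
print(to_install)
```

Anything listed in to_install still has to be installed as shown in the task: devtools with install.packages() and yarrr with install_github().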
## TASK 1
# Install devtools, then install the yarrr package from GitHub and load it
install.packages('devtools')
library('devtools')
install_github('ndphillips/yarrr', build_vignettes = TRUE)
library(yarrr)
2. Look at the basics of the pirates data
## TASK 2
# Get help for pirates data
?pirates
# Show the first few rows of the dataset
head(pirates)
# Show the entire dataset in a new window
View(pirates)
3. Basic descriptive statistics
## TASK 3
# What is the mean age?
mean(pirates$age)
# How tall is the tallest pirate?
max(pirates$height)
# How many pirates are there of each sex?
table(pirates$sex)
4. Aggregating statistics by groups
What is the mean age of pirates for each sex?
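To see what aggregate() returns before running it on the full dataset, here is a small sketch on a made-up data frame. The toy values are invented for illustration and are not taken from pirates. The formula age ~ sex reads as "summarise age separately for each level of sex".

```r
# Made-up stand-in for the pirates data (values invented for illustration)
toy <- data.frame(
  sex = c("female", "female", "male", "male", "other", "other"),
  age = c(30, 34, 25, 29, 40, 42)
)

# One row per group, holding the mean age within that group
by_sex <- aggregate(age ~ sex, data = toy, FUN = mean)
print(by_sex)

# The same group means as a named vector
print(tapply(toy$age, toy$sex, mean))
```

The result of aggregate() is itself a data frame, so you can index it like any other (for example by_sex$age).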
## TASK 4
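# The formula is passed unnamed: aggregate()'s first argument was renamed
# from formula to x in R 4.2.0, so an unnamed formula is the portable form
aggregate(age ~ sex,
data = pirates,
FUN = mean)
What is the median weight of pirates for each combination of sex and headband?
aggregate(weight ~ sex + headband,
data = pirates,
FUN = median)
5. Histograms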
Here is a simple histogram:
## TASK 5
hist(x = pirates$age)
Here is a customized, more colorful histogram:
hist(x = pirates$age,
main = "Distribution of pirate ages",
col = "skyblue",
border = "white",
xlab = "Age")
Here are two overlapped histograms:
# Start with the female data
hist(x = pirates$age[pirates$sex == "female"],
main = "Distribution of pirate ages by sex",
col = transparent("red", .2),
border = "white",
xlab = "Age",
breaks = seq(0, 50, 2),
probability = TRUE,
ylab = "",
yaxt = "n")
# Add male data
hist(x = pirates$age[pirates$sex == "male"],
add = TRUE,
probability = TRUE,
border = "white",
breaks = seq(0, 50, 2),
col = transparent("skyblue", .5))
# Add the legend
legend(x = 40,
y = .05,
col = c("red", "skyblue"),
legend = c("Female", "Male"),
pch = 16,
bty = "n")
6. Create a scatterplot
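The regression line in this task is fully described by two numbers, the intercept and the slope, which abline() reads from the fitted lm() model. As a rough sketch on made-up data (the heights and weights below are invented, not taken from pirates), this is how you could inspect those numbers and use them for a prediction:

```r
# Made-up heights (cm) and weights (kg), invented for illustration
toy <- data.frame(
  height = c(150, 160, 170, 180, 190),
  weight = c(55, 62, 68, 77, 83)
)

# Fit weight as a linear function of height
toy_model <- lm(weight ~ height, data = toy)

# The two numbers that define the regression line: intercept and slope
print(coef(toy_model))

# Predicted weight for a height of 175 cm: intercept + slope * 175
toy_prediction <- predict(toy_model, newdata = data.frame(height = 175))
print(toy_prediction)
```

The slope is the expected change in weight for each additional centimetre of height.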
## TASK 6
# Create main plot
plot(x = pirates$height,
y = pirates$weight,
main = 'My first scatterplot of pirate data!',
xlab = 'Height (in cm)',
ylab = 'Weight (in kg)',
pch = 16, # Filled circles
col = gray(.0, .1)) # Transparent gray
# Add gridlines
grid()
# Create a linear regression model
model <- lm(formula = weight ~ height, data = pirates)
# Add regression to plot
abline(model,
col = 'blue')
7. Create a pirateplot
## TASK 7
pirateplot(formula = sword.time ~ sword.type,
data = pirates,
main = "Pirateplot of sword times by sword type")
8. Two sample hypothesis tests
t-test
Do pirates with eyepatches have longer beards than those without eyepatches?
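The formula interface used in this task (outcome ~ grouping variable) is shorthand for comparing the outcome between the two groups. A minimal sketch on made-up data, with beard lengths invented for illustration, shows that it gives the same result as passing the two groups as separate vectors:

```r
# Made-up beard lengths, invented for illustration; eyepatch is 0 = no, 1 = yes
toy <- data.frame(
  eyepatch     = c(0, 0, 0, 0, 1, 1, 1, 1),
  beard.length = c(8, 12, 10, 14, 15, 11, 18, 16)
)

# Formula interface: outcome ~ grouping variable
by_formula <- t.test(beard.length ~ eyepatch, data = toy)

# Equivalent call with the two groups passed as separate vectors
by_vectors <- t.test(x = toy$beard.length[toy$eyepatch == 0],
                     y = toy$beard.length[toy$eyepatch == 1])

print(by_formula$estimate)  # mean beard length in each group
print(by_formula$p.value)   # identical to by_vectors$p.value
```

By default t.test() runs Welch's test, which does not assume equal variances in the two groups.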
## TASK 8
t.test(formula = beard.length ~ eyepatch,
data = pirates,
alternative = 'two.sided')
Correlation test
Is there a correlation between a pirate’s age and the number of parrots (s)he has?
cor.test(formula = ~ age + parrots,
data = pirates)
9. ANOVA
Now let’s run an ANOVA to test whether pirates with different favorite pirates differ in age.
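The F value that summary() reports for an ANOVA is the ratio of between-group to within-group variability. As an illustrative sketch on made-up data (three invented groups A, B and C with three ages each, not taken from pirates), the same number can be computed by hand:

```r
# Made-up ages for three groups, invented for illustration
toy <- data.frame(
  favorite.pirate = rep(c("A", "B", "C"), each = 3),
  age = c(21, 22, 23, 24, 25, 26, 27, 28, 29)
)

# F value as reported by aov()
toy_aov <- aov(age ~ favorite.pirate, data = toy)
aov_table <- summary(toy_aov)[[1]]
f_value <- aov_table[["F value"]][1]

# The same F value by hand
group_means <- tapply(toy$age, toy$favorite.pirate, mean)
ss_between <- sum(3 * (group_means - mean(toy$age))^2)           # 3 pirates per group
ss_within  <- sum((toy$age - group_means[toy$favorite.pirate])^2)
f_by_hand  <- (ss_between / 2) / (ss_within / 6)                 # df: 3 - 1 and 9 - 3

print(c(aov = f_value, by_hand = f_by_hand))
```

A large F value means the group means differ by more than the spread within groups would suggest; TukeyHSD() then shows which pairs of groups differ.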
## TASK 9
# Run the ANOVA
age.pix.aov <- aov(formula = age ~ favorite.pirate,
data = pirates)
# Print summary results
summary(age.pix.aov)
# Calculate post-hoc tests
TukeyHSD(age.pix.aov)
10. Regression
Here is a regression analysis testing whether age, weight, and number of tattoos predict how many treasure chests a pirate has found.
## TASK 10
# Run the regression
chests.lm <- lm(formula = tchests ~ age + weight + tattoos,
data = pirates)
# Print summary results
summary(chests.lm)
That’s it! Now it’s time to submit your assignment! Save and email your wpa_0_LastFirst.R file to me at nathaniel.phillips@unibas.ch. Then go to http://www.rpubs.com/YaRrr/syllabusf16 to find the link for the WPA submission form.