This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. R code chunks look like this:
# Your code goes here.
Once you have completed all of the exercises below, use the Knit button to generate a Word file with all of your code and output. Then submit the Word file on Canvas.
What is the sex ratio at birth of red deer? You have data from six
deer mothers on the sex of all their offspring, as follows:
Deer 1: 3 female, 2 male
Deer 2: 2 female, 5 male
Deer 3: 1 female, 3 male
Deer 4: 5 female, 4 male
Deer 5: 4 female, 1 male
Deer 6: 3 female, 2 male
Use the prompts in the code chunk below to write a script to analyze
these data. For each comment, write the corresponding code immediately
below, as in the first example.
# Store the data in two vectors, one for males and one for females.
male <- c(2, 5, 3, 4, 1, 2)
female <- c(3, 2, 1, 5, 4, 3)
# Calculate the sex ratio for Deer 1. That is, what proportion of her offspring are female?
female[1] / (male[1] + female[1])
## [1] 0.6
# Calculate the sex ratio for all of the deer mothers. Do this in one single R command; do not separately calculate the ratio for each mother.
female / (male + female)
## [1] 0.6000000 0.2857143 0.2500000 0.5555556 0.8000000 0.6000000
# Calculate the average sex ratio across the six mothers.
ratio.data <- female / (male + female)
mean(ratio.data)
## [1] 0.5152116
Do red deer appear to have an equal sex ratio at birth?
Yes, approximately. The average sex ratio across the six mothers is
0.5152116, which is close to 0.5, the value for a perfectly equal sex ratio.
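As a cross-check on the average above, the sex ratio can also be computed by pooling all offspring before dividing, so that each offspring rather than each mother counts equally. The following is a minimal sketch using the same counts; the names `ratio_mean` and `ratio_pooled` are illustrative and not part of the assignment.

```r
# Offspring counts for the six mothers (from the exercise).
male <- c(2, 5, 3, 4, 1, 2)
female <- c(3, 2, 1, 5, 4, 3)

# Mean of the per-mother ratios: every mother counts equally.
ratio_mean <- mean(female / (male + female))

# Pooled ratio: every offspring counts equally.
ratio_pooled <- sum(female) / sum(male + female)

print(c(mean_of_ratios = ratio_mean, pooled = ratio_pooled))
```

Both values are close to 0.5 (18 of the 35 offspring are female), so the two approaches lead to the same conclusion here.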
Body mass index (BMI) is a measure of body weight designed to account
for differences in height. It is equal to weight divided by height
squared, with weight measured in kilograms and height in meters. You
have data on the height and weight of ten people, as follows:
167 cm, 64 kg
175 cm, 72 kg
180 cm, 73 kg
155 cm, 65 kg
173 cm, 75 kg
185 cm, 74 kg
181 cm, 82 kg
163 cm, 69 kg
179 cm, 79 kg
170 cm, 72 kg
Put these data in an Excel spreadsheet and save it as a CSV file (comma
separated values). Put the data in two columns, one for height and one
for weight. Put variable labels in the first row of each column (i.e.,
‘height’ and ‘weight’).
Use the prompts in the code chunk below to write a script to analyze
these data.
# Show the current working directory.
getwd()
## [1] "/Users/kelseywitko/Downloads"
# Change the working directory to where the CSV file is located. You can do this either with the setwd() function or by using RStudio's Session menu.
# Clear the workspace of any previously defined variables.
rm(list=ls())
# Read the data into a data frame.
# Show the size of the data frame.
# Show the names of the data frame's variables.
# Calculate the average weight of the ten people.
# Calculate the BMI of each person and store it in the data frame. Do this with one single command; do not separately calculate the BMI for each person. Don't forget that BMI expects height in meters, not centimeters.
# Make a scatterplot of BMI vs. weight. Be sure to label the plot axes.
Does BMI appear to depend on weight?
No, there does not appear to be a trend of BMI with weight.
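The prompts in this exercise can be sketched as follows. This is only one possible approach: it types the ten measurements directly into a data frame instead of reading the CSV file, and the name `people` is an assumption made for illustration.

```r
# Heights (cm) and weights (kg) of the ten people, typed in directly.
people <- data.frame(
  height = c(167, 175, 180, 155, 173, 185, 181, 163, 179, 170),
  weight = c(64, 72, 73, 65, 75, 74, 82, 69, 79, 72)
)

dim(people)            # size of the data frame
names(people)          # variable names
mean(people$weight)    # average weight in kg

# BMI = weight (kg) / height (m)^2, computed for all rows at once.
people$bmi <- people$weight / (people$height / 100)^2

# Scatterplot of BMI against weight with labeled axes.
plot(bmi ~ weight, data = people,
     xlab = "Weight (kg)", ylab = "BMI (kg/m^2)",
     main = "BMI vs. weight")
```

Once the CSV file exists, the `data.frame()` call can be swapped for `read.csv()` on that file; the remaining lines should work unchanged if the columns are labeled `height` and `weight`.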
How fast does the concentration of a toxin in the bloodstream decrease? A typical pattern is that the concentration decreases by a fixed proportion each unit time (e.g., it goes down by half every 2 hours). File toxin.csv contains data on the concentration (in parts per million) of a toxin in the bloodstream of a rat, measured every hour for eight hours. Use the prompts in the code chunk below to write a script to analyze these data.
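# Read the data into a data frame.
toxin.data <- read.csv("toxin.csv")
# Plot toxin concentration over time.
# NOTE: the column names 'time' and 'concentration' are assumed; check names(toxin.data).
plot(concentration ~ time, data = toxin.data, xlab = "Time (hours)", ylab = "Concentration (ppm)")
# Plot the logarithm of concentration over time.
plot(log(concentration) ~ time, data = toxin.data, xlab = "Time (hours)", ylab = "log(concentration)")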
How do the two plots compare?
The first plot shows exponential (first-order) decay: the concentration
falls quickly at first and then levels off. When the logarithm of the
concentration is taken, the trend becomes linear.
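To see why a fixed proportional decrease gives a straight line on the log scale, here is a small sketch with made-up numbers. The starting value of 80 ppm and the two-hour half-life are assumptions chosen for illustration; they are not the contents of toxin.csv.

```r
# Hypothetical data, NOT the contents of toxin.csv:
# a concentration that starts at 80 ppm and halves every 2 hours.
time <- 0:8
concentration <- 80 * 0.5^(time / 2)

# On the raw scale the curve flattens out as time passes.
plot(time, concentration, type = "b",
     xlab = "Time (hours)", ylab = "Concentration (ppm)")

# On the log scale the same data fall on a straight line.
plot(time, log(concentration), type = "b",
     xlab = "Time (hours)", ylab = "log(concentration)")

# The hourly change in log(concentration) is constant: log(0.5) / 2.
diff(log(concentration))
```

A constant step in log(concentration) is exactly what a straight line means, so a linear log plot is consistent with decay by a fixed proportion per hour.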
How do the weights of male and female monitor lizards compare? File lizards.csv contains the weights of ten male and ten female lizards (in kilograms). Use the prompts in the code chunk below to write a script to analyze these data.
# Read the data into a data frame.
lizards.data <- read.csv("lizards.csv")
# Calculate the average weight of males and the average weight of females.
# Make a bar plot showing the average weight of each sex.
Does one sex seem bigger?
Yes, females seem bigger.
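The two remaining prompts (group averages and a bar plot) could be approached along these lines. This is a sketch only: the column names `sex` and `weight` are assumptions about lizards.csv, and the small data frame below holds made-up values that say nothing about the real file.

```r
# Hypothetical stand-in for lizards.csv: the column names and the
# weights below are made up for illustration only.
lizards.example <- data.frame(
  sex = c("female", "female", "female", "male", "male", "male"),
  weight = c(5.0, 6.0, 7.0, 4.0, 5.0, 6.0)
)

# Average weight within each sex.
mean.weight <- tapply(lizards.example$weight, lizards.example$sex, mean)
mean.weight

# Bar plot of the two group means.
barplot(mean.weight, xlab = "Sex", ylab = "Mean weight (kg)",
        main = "Mean weight by sex")
```

With the real data, the same `tapply()` and `barplot()` calls would be applied to `lizards.data`, after checking its actual column names with `names(lizards.data)`.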
Can lion age be told by the amount of black pigmentation on the nose? File lions.csv contains the age (in years) and proportion of black pigmentation on the nose for 20 male lions. Use the prompts in the code chunk below to write a script to analyze these data.
# Read the data into a data frame.
# Make a scatter plot of the relation between age and proportion of black pigmentation on the nose.
Based on this plot, what can you say about the usefulness of nose pigmentation for estimating the age of male lions?
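One way to produce the requested scatter plot is sketched below as a small helper function. The function name `plot_age_vs_nose` and the default column names `age` and `proportion.black` are assumptions; the actual names in lions.csv should be checked with `names()` first.

```r
# Sketch: scatter plot of age against nose pigmentation.
# The default column names are assumptions about lions.csv.
plot_age_vs_nose <- function(lions, age_col = "age",
                             black_col = "proportion.black") {
  stopifnot(all(c(age_col, black_col) %in% names(lions)))
  plot(lions[[black_col]], lions[[age_col]],
       xlab = "Proportion of black pigmentation on nose",
       ylab = "Age (years)",
       main = "Lion age vs. nose pigmentation")
  # Return the correlation invisibly as a rough summary of the trend.
  invisible(cor(lions[[black_col]], lions[[age_col]]))
}

# Intended use once the file is available:
# lions.data <- read.csv("lions.csv")
# plot_age_vs_nose(lions.data)
```

Age is placed on the vertical axis because the question is whether age can be predicted from pigmentation; a tight, increasing cloud of points would support using the nose as an age estimate.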
Please watch the video lectures about the normal distribution before doing these exercises! Also, please first watch the tutorial video about working with normal distributions in R.
In many animals, island populations diverge in size from mainland populations of the same species. You are interested in finding out if this is the case for monitor lizards, which are widespread on the Australian continent and surrounding islands. Based on earlier studies, you believe that the mean and variance for the lengths of adult mainland lizards are as follows: Mean = 100 cm, Variance = 400. You also make the reasonable assumption that length is a normal random variable. Before collecting any data on the Kangaroo Island lizards, you will use this prior knowledge to say something about their expected probability distribution. Use the prompts in the code chunk below to write a script that plots the PDF and the CDF.
# Plot the probability density function of lizard lengths. The plot must include informative axis titles and an overall title.
# Enter parameter values.
mu <- 100
variance <- 400
stdev <- sqrt(variance)
# Set the range of X values.
x <- seq(mu - 3*stdev, mu + 3*stdev, by= 0.01)
# Calculate probability densities.
pdf <- dnorm(x, mu, stdev)
# Plot the PDF.
plot(pdf ~ x, type = "l", xlab = "Length (cm)", ylab = "Probability density", main = "PDF of lizard length")
# Plot the cumulative distribution function of lizard lengths, again including informative axis titles and an overall title.
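# Calculate probabilities.
cdf <- pnorm(x, mu, stdev)
# Plot the CDF.
plot(cdf ~ x, type = "l", xlab = "Length (cm)", ylab = "Cumulative probability", main = "CDF of lizard length")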
Write another script to answer the following questions about what you expect to find when you measure Kangaroo island lizards, assuming that they follow the same probability distribution as mainland lizards. Inspect the plots you made in 1 to check if your answers make sense.
# What is the probability density for a length of 75 cm?
dnorm(75, mu, stdev)
## [1] 0.009132454
# What is the probability that a lizard will be...
# Less than or equal to 75 cm?
pnorm (75,mu,stdev)
## [1] 0.1056498
# Greater than 120 cm?
1 - pnorm(120, mu, stdev)
## [1] 0.1586553
# Between 95 and 115cm?
pnorm(115,mu,stdev) - pnorm(95,mu,stdev)
## [1] 0.372079
# At least 40 cm different from the mean, in either direction?
2 * pnorm(mu - 40, mu, stdev)
## [1] 0.04550026
# Further than 0.7 standard deviations below the mean?
pnorm(-0.7)
## [1] 0.2419637
# Closer than 1.3 standard deviations to the mean?
pnorm(1.3) - pnorm(-1.3)
## [1] 0.806399
# Further than 1.5 standard deviations from the mean?
2 * (1 - pnorm(1.5))
## [1] 0.1336144
# What are the quartiles of the distribution? That is, what are the 0.25, 0.5, and 0.75 quantiles?
qnorm(c(0.25, 0.5, 0.75), mu, stdev)
## [1]  86.5102 100.0000 113.4898
# 2/3 of observations are expected to lie below what value?
qnorm(2/3,mu,stdev)
## [1] 108.6145
# 80% of observations are expected to lie above what value?
qnorm(0.2,mu,stdev)
## [1] 83.16758
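Several of the questions above are phrased in standard deviations rather than centimeters. The sketch below illustrates, under the same assumed distribution (mean 100 cm, variance 400), that converting a length to a z-score gives the same probability as passing the mean and standard deviation to `pnorm()`, and that "from the mean" questions cover both tails. The variable names are illustrative only.

```r
# Parameters of the assumed length distribution.
mu <- 100
stdev <- sqrt(400)

# Converting a length to a z-score gives the same probability
# as passing mu and stdev to pnorm() directly.
z <- (75 - mu) / stdev
p.direct <- pnorm(75, mu, stdev)
p.zscore <- pnorm(z)

# Two-tailed probability by symmetry: further than k SDs from the mean.
k <- 1.5
p.two.tailed <- 2 * pnorm(-k)

print(c(direct = p.direct, zscore = p.zscore, two.tailed = p.two.tailed))
```

Because the normal distribution is symmetric, the lower tail can simply be doubled for any question that asks about distance from the mean in either direction.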
Britain’s domestic intelligence service MI5 places an upper limit on the height of its spies, on the assumption that people who are too tall do not blend in well with the crowd. To be a spy, men must be no taller than 180 cm (~5 feet 11 inches) and women no taller than 173 cm (~5 feet 8 inches). Write a script to answer the questions in the code chunk below.
# If the mean height of British men is 177 cm, with a standard deviation of 7.1 cm, what proportion of British men are excluded from being spies by this height restriction? Assume that height follows a normal distribution.
MImu <- 177
MIstdev <- 7.1
1-pnorm(180,MImu,MIstdev)
## [1] 0.3363172
# The mean height of British women is 163.3 cm, with a standard deviation of 6.4 cm. Assuming a normal distribution of female height, what fraction of women meet MI5’s height standard?
MIWmu <- 163.3
MIWstdev <- 6.4
pnorm(173,MIWmu,MIWstdev)
## [1] 0.9351929
# Imagine that MI5 wants to change its maximum height for female spies. Its goal is to exclude the same proportion of women as men. What should the new maximum height for women be?
qnorm(pnorm(180, MImu, MIstdev), MIWmu, MIWstdev)
## [1] 166.0042
# Sean Connery, the original James Bond, was 183 cm tall. By how many standard deviations did he exceed the height limit for spies?
(183 - 180)/MIstdev
## [1] 0.4225352
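As a consistency check on the MI5 answers, the sketch below recomputes the new limit for women without retyping any intermediate result, and confirms that both limits sit the same number of standard deviations above their respective means. The variable names are assumptions made for this illustration.

```r
# Height parameters from the exercise (cm).
men.mu <- 177;     men.sd <- 7.1
women.mu <- 163.3; women.sd <- 6.4

# Proportion of men excluded by the 180 cm limit.
men.excluded <- 1 - pnorm(180, men.mu, men.sd)

# Height limit that excludes the same proportion of women.
women.limit <- qnorm(1 - men.excluded, women.mu, women.sd)

# Check: both limits sit the same number of SDs above their mean.
z.men <- (180 - men.mu) / men.sd
z.women <- (women.limit - women.mu) / women.sd
print(c(women.limit = women.limit, z.men = z.men, z.women = z.women))
```

Excluding equal proportions under two normal distributions amounts to matching z-scores, which is why the new limit for women equals the women's mean plus the men's z-score times the women's standard deviation.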