# Set up environment
rm(list = ls())
library(ggplot2)
library(tibble)
Using the random number seed 1234, generate 1000 draws
from a normal distribution with mean 100 and standard deviation 20. Plot
a histogram of these draws and add the following details:
use lines() to plot the probability
density function of the normal distribution the data were generated
from. Use ?hist for more information about plotting options
for the histogram.
set.seed(1234)
DATA <- rnorm(1000, mean = 100, sd = 20) # 1000 draws, as the question asks
hist(DATA,
     breaks = 20,   # suggested number of bins
     freq = FALSE,  # display probability density instead of frequency
     main = paste0("Sample mean = ", round(mean(DATA), 2), ", Sample variance = ", round(var(DATA), 2))
)
x_grid <- seq(min(DATA), max(DATA), length.out = length(DATA)) # evenly spaced grid over the range of the draws
lines(x_grid, dnorm(x_grid, mean = 100, sd = 20), col = "red", lwd = 2) # density of the generating N(100, 20^2) distribution
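The overlay works because, with freq = FALSE, hist() rescales the bars so that their total area is 1, which puts them on the same scale as a density curve. The following is a minimal sketch of that check, not part of the assigned solution; the names `draws`, `h`, `bar_area`, and `grid` are illustrative assumptions.

```r
# Sketch: verify that a density-scaled histogram has total bar area 1,
# then overlay the density of the generating N(100, 20^2) distribution.
set.seed(1234)
draws <- rnorm(1000, mean = 100, sd = 20)  # illustrative name, same settings as the question

# hist() returns (invisibly) an object holding the bin breaks and bar heights
h <- hist(draws, breaks = 20, freq = FALSE,
          main = "Draws with N(100, 20^2) density", xlab = "draws")

# Total area = sum of (bar height * bin width)
bar_area <- sum(h$density * diff(h$breaks))
print(bar_area)

# Evenly spaced grid on which to evaluate the theoretical density
grid <- seq(min(draws), max(draws), length.out = 200)
lines(grid, dnorm(grid, mean = 100, sd = 20), col = "red", lwd = 2)
```

If freq = TRUE were used instead, the bars would show counts and the density curve would appear nearly flat along the bottom of the plot.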
Download nba_wingspan.csv from Canvas. It contains data
on the height and wingspan (in inches) of NBA players, taken from here.
Import this data, generate a scatterplot of wingspan versus height, and add the OLS line of best fit to it. Add a title and label the axes.
Report the estimates from the model for \(\beta_0\) and \(\beta_1\).
#Import data
setwd("~/Desktop/R/STSCI 5030")
NBA <- read.csv("Data/nba_wingspan.csv", header = TRUE)
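# Scatterplot with wingspan on the x-axis and height on the y-axis
plot(NBA$wingspan, NBA$height, main = "Scatterplot of wingspan vs height", xlab = "Wingspan (inches)", ylab = "Height (inches)")
# OLS fit of height on wingspan, drawn as the line of best fit
OLS <- lm(NBA$height ~ NBA$wingspan)
abline(OLS, col = "blue", lwd = 2)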
\(\beta_0\) =
OLS$coefficients[1]
## (Intercept)
## 20.54038
\(\beta_1\) =
OLS$coefficients[2]
## NBA$wingspan
## 0.6989063
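For a simple regression with one predictor, the estimates reported by lm() can be reproduced by hand: \(\hat{\beta}_1 = \mathrm{cov}(x, y) / \mathrm{var}(x)\) and \(\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}\). The sketch below checks this on simulated data. The data frame `toy` and its values are hypothetical and are not taken from nba_wingspan.csv.

```r
# Sketch: closed-form OLS estimates versus lm() on hypothetical toy data.
set.seed(1)
toy <- data.frame(wingspan = rnorm(50, mean = 82, sd = 4))        # made-up values, in inches
toy$height <- 20 + 0.7 * toy$wingspan + rnorm(50, mean = 0, sd = 1.5)

# Fit with the formula interface and a data argument
fit <- lm(height ~ wingspan, data = toy)

# Closed-form estimates for simple linear regression
b1 <- cov(toy$wingspan, toy$height) / var(toy$wingspan)  # slope
b0 <- mean(toy$height) - b1 * mean(toy$wingspan)         # intercept

# The two sets of estimates should agree up to floating-point tolerance
print(coef(fit))
print(c(b0, b1))
```

Using `data = toy` with bare column names in the formula also gives the coefficients shorter labels (`wingspan` rather than `toy$wingspan`) and makes predict() with new data straightforward.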
Now create a subset of the dataset containing only players who play
the Center position (pos equal to C), and generate the plot
again as in part a). Add a title and label the axes.
Report the estimates from the model for \(\beta_0\) and \(\beta_1\).
Center <- NBA[NBA$pos == "C", ]
plot(Center$wingspan, Center$height, main = "Scatterplot of wingspan vs height for Centers", xlab = "Wingspan (inches)", ylab = "Height (inches)")
OLS_c <- lm(Center$height ~ Center$wingspan)
abline(OLS_c, col = "blue", lwd = 2)
\(\beta_0\) =
OLS_c$coefficients[1]
## (Intercept)
## 68.9505
\(\beta_1\) =
OLS_c$coefficients[2]
## Center$wingspan
## 0.1534844
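One detail worth knowing about the subsetting step: logical indexing with `==` returns a row of NAs for every row where the tested column is missing, while subset() treats NA as FALSE and drops such rows. Whether pos has missing values in nba_wingspan.csv is not known here, so the sketch below uses a hypothetical data frame `toy` with made-up values.

```r
# Sketch: two ways to keep only centers, on hypothetical toy data.
toy <- data.frame(
  pos      = c("C", "PG", NA, "C", "SF"),  # one missing position on purpose
  height   = c(84, 75, 80, 83, 79),
  wingspan = c(88, 78, 84, 89, 83)
)

# Logical indexing: the NA comparison yields an all-NA row in the result
by_index <- toy[toy$pos == "C", ]

# subset(): rows where the condition is NA are dropped
by_subset <- subset(toy, pos == "C")

print(nrow(by_index))
print(nrow(by_subset))
```

When the column has no missing values, both approaches return the same rows.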
Submit both your .Rmd file and the resulting
.html file in Canvas.