mammals = read.csv("https://raw.githubusercontent.com/jfcross4/stats/refs/heads/master/mammals2.csv")
rents = read.csv("https://raw.githubusercontent.com/jfcross4/stats/refs/heads/master/rent_data.csv")
quiz = read.csv("https://raw.githubusercontent.com/jfcross4/stats/refs/heads/master/study_data.csv")
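The “mammals” data frame has the total sleep hours (“total_sleep”) of various mammals, along with other variables including “danger”, “life_span”, and “gestation”.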
Create the following model:
m = lm(total_sleep ~ danger, data=mammals)
summary(m)
Write the equation you created to predict sleep hours.
Use your equation to predict the total sleep hours of a mammal with a danger level of 5.
Interpret the coefficient of danger in your model.
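As a way to check a hand calculation, here is a minimal sketch on a small made-up data frame. The name toy and its columns x and y are hypothetical and are not part of the mammals data. It pulls the intercept and slope out of a fitted model with coef(), plugs in a value by hand, and compares the result with predict().

```r
# Made-up data for illustration only (not the mammals data)
toy = data.frame(x = c(1, 2, 3, 4, 5),
                 y = c(14, 12, 11, 8, 7))

m_toy = lm(y ~ x, data = toy)

# coef() returns a named vector: "(Intercept)" and one entry per predictor
b = coef(m_toy)

# Prediction "by hand": intercept + slope * value of x
by_hand = unname(b["(Intercept)"] + b["x"] * 2.5)

# The same prediction using predict(); newdata must use the predictor's column name
by_predict = unname(predict(m_toy, newdata = data.frame(x = 2.5)))

print(c(by_hand = by_hand, by_predict = by_predict))
```

The same pattern should carry over to the mammals model by using m in place of m_toy and a newdata data frame with a danger column.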
Now, create a model to predict total_sleep from life_span and gestation:
m = lm(total_sleep ~ life_span + gestation, data=mammals)
summary(m)
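With two predictors, each coefficient describes the predicted change for a one-unit increase in that predictor while the other predictor stays the same. The sketch below illustrates this on made-up data. The names toy2, a, b, and y are hypothetical and do not come from the mammals data.

```r
# Made-up data for illustration only (not the mammals data)
toy2 = data.frame(a = c(1, 2, 3, 4, 5, 6),
                  b = c(2, 1, 4, 3, 6, 5),
                  y = c(5, 7, 6, 9, 8, 12))

m_toy2 = lm(y ~ a + b, data = toy2)

# Two hypothetical cases that differ by one unit in a and have the same b
cases = data.frame(a = c(3, 4), b = c(2, 2))
p = predict(m_toy2, newdata = cases)

# The gap between the two predictions equals the coefficient of a
gap = unname(p[2] - p[1])
coef_a = unname(coef(m_toy2)["a"])
print(c(gap = gap, coef_a = coef_a))
```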
View(rents)
This data frame has the prices of rental apartments (“rent”), their square footage (“sqft”), and a column (“near_subway”) that takes the value yes or no depending on whether the apartment is near a subway.
m = lm(rent ~ near_subway, data=rents)
summary(m)
m = lm(rent ~ near_subway + sqft, data=rents)
summary(m)
How did adding “sqft” to the model affect the coefficient of “near_subway”? How would you explain this difference?
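One possible reason a coefficient can change when a predictor is added is that the two predictors are related to each other. The sketch below uses made-up numbers (not the rents data) in which the apartments near the subway happen to be smaller. The data frame toy_rents and its values are hypothetical, and the real data may or may not behave this way.

```r
# Made-up data for illustration only (not the rents data).
# Rent is built as 1000 + 2 * sqft, plus 200 if near the subway,
# and the near-subway apartments are deliberately the smaller ones.
toy_rents = data.frame(
  near_subway = c("yes", "yes", "yes", "no", "no", "no"),
  sqft        = c(500, 600, 700, 800, 900, 1000)
)
toy_rents$rent = 1000 + 2 * toy_rents$sqft +
  200 * (toy_rents$near_subway == "yes")

# Model 1: subway only. The coefficient is the difference in average rent.
m1 = lm(rent ~ near_subway, data = toy_rents)

# Model 2: subway and size. The coefficient now compares apartments of equal size.
m2 = lm(rent ~ near_subway + sqft, data = toy_rents)

subway_alone     = unname(coef(m1)["near_subwayyes"])
subway_with_sqft = unname(coef(m2)["near_subwayyes"])
print(c(subway_alone = subway_alone, subway_with_sqft = subway_with_sqft))
```

In this made-up example the coefficient is negative in the first model and positive in the second, because the first model mixes the effect of location with the effect of size.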
The “quiz” data frame has data on quiz scores along with the number of hours students studied in the week before as well as the number of hours they slept the night before:
View(quiz)
We can create a model to predict scores from sleep hours as follows:
m = lm(score ~ sleep_hours, data=quiz)
summary(m)
Bonus: What is the relationship between sleeping and quiz performance? How does adding study hours to your model affect your results? Please create additional models using this data set and explain what you find.
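When comparing several models, it can help to put their coefficients and R-squared values side by side. The sketch below shows one way to do this on made-up data. The names toy_quiz, x1, x2, and y are hypothetical; the column names in the quiz data frame can be checked with names(quiz).

```r
# Made-up data for illustration only (not the quiz data)
toy_quiz = data.frame(x1 = c(8, 6, 7, 5, 7, 6),
                      x2 = c(1, 2, 3, 4, 5, 6),
                      y  = c(60, 62, 70, 71, 80, 83))

m_one  = lm(y ~ x1, data = toy_quiz)
m_both = lm(y ~ x1 + x2, data = toy_quiz)

# Coefficient of x1 before and after adding x2
x1_alone   = unname(coef(m_one)["x1"])
x1_with_x2 = unname(coef(m_both)["x1"])

# R-squared is stored in the object returned by summary()
r2_one  = summary(m_one)$r.squared
r2_both = summary(m_both)$r.squared

print(c(x1_alone = x1_alone, x1_with_x2 = x1_with_x2))
print(c(r2_one = r2_one, r2_both = r2_both))
```

R-squared never decreases when a predictor is added to a least-squares model, so a change in the coefficients is usually the more informative comparison.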