SQRodriguez HWK6

library(tidyverse)

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(ggplot2)
library(readxl)
##Question One: Load your chosen data set into R Markdown##
dog_and_cat_data_ <- read_excel("dog and cat data .xlsx")

##Question Two: Select the dependent variable you are interested in, along with independent variables which you believe are causing the dependent variable.##
##euth represents the dependent variable. adop, fost and spay represent the independent variables##
euth <- dog_and_cat_data_$`Animals Euthanized`
adop <- dog_and_cat_data_$Adoptions
fost <- dog_and_cat_data_$Fosters
spay <- dog_and_cat_data_$`Spay/Neuter`

##Question Three: Create a linear model using the lm() command, save it to some object##
# Create the linear model
model <- lm(euth ~ adop + fost + spay)

##Question Four: Call a summary on your new model##
summary(model)

## 
## Call:
## lm(formula = euth ~ adop + fost + spay)
## 
## Residuals:
##        1        2        3        4        5        6        7        8 
##    3.659  137.598  -67.761 -559.565  -42.270  107.732  182.939  237.668 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)   
## (Intercept) -1.049e+03  3.194e+02  -3.285  0.03036 * 
## adop         4.764e-01  1.624e-01   2.933  0.04269 * 
## fost         2.386e+00  4.756e-01   5.016  0.00741 **
## spay        -3.691e-02  1.966e-02  -1.877  0.13372   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 331.7 on 4 degrees of freedom
## Multiple R-squared:  0.952,  Adjusted R-squared:  0.9159 
## F-statistic: 26.42 on 3 and 4 DF,  p-value: 0.004257

##Question Five: interpret the model's r-squared and p-values. How much of the dependent variable does the overall model explain? What are the significant variables? What are the insignificant variables?##
##Answer: The multiple R-squared is 0.952, so the model explains about 95.2% of the variance in the dependent variable, the number of animals euthanized (adjusted R-squared: 0.9159). This indicates a close fit to the data, although with only 8 observations and 4 residual degrees of freedom it should be read with some caution. The overall F-test p-value of 0.004257 is below 0.05, so the model as a whole is statistically significant. The significant variables are Adoptions (p = 0.04269) and Fosters (p = 0.00741), because their p-values are below 0.05. Spay/Neuter (p = 0.13372) is insignificant because its p-value is above 0.05.##
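
The figures quoted in the answer above can be cross-checked by hand. The sketch below is only an illustration: it does not refit the model, it re-enters the rounded values printed by summary(model) and recomputes the adjusted R-squared, the overall F-test p-value, and the coefficient p-values from them. The object names (r2, f_stat, coef_p, and so on) are chosen here and are not part of the original homework.

```r
# Values copied (already rounded) from the summary(model) output in this report
r2 <- 0.952      # Multiple R-squared
n  <- 8          # observations (eight residuals are listed)
k  <- 3          # predictors: adop, fost, spay

# Adjusted R-squared penalizes R-squared for the number of predictors
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Overall F-test p-value: upper tail of the F distribution
f_stat    <- 26.42
overall_p <- pf(f_stat, df1 = k, df2 = n - k - 1, lower.tail = FALSE)

# Coefficient p-values: two-sided t-test on estimate / std. error
est <- c(adop = 0.4764, fost = 2.386, spay = -0.03691)
se  <- c(adop = 0.1624, fost = 0.4756, spay = 0.01966)
coef_p <- setNames(2 * pt(-abs(est / se), df = n - k - 1), names(est))

print(round(adj_r2, 4))
print(signif(overall_p, 3))
print(names(coef_p)[coef_p < 0.05])   # predictors significant at the 0.05 level
```

Because the inputs are rounded, the recomputed numbers should agree with the printed summary only up to rounding (for example, 0.916 against the reported 0.9159).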

##Question Six: Choose some significant independent variables. Interpret its Estimates (or Beta Coefficients). How do the independent variables individually affect the dependent variable? ##

##Answer: I chose adoptions and fosters as the significant variables. Adoptions has an estimate of 0.4764 and fosters has an estimate of 2.386. As we learned in class, each one-unit increase in adoptions is associated with an expected increase of about 0.4764 in the dependent variable, the number of animals euthanized, holding fosters and spay/neuter constant. This means that as adoptions increase, so does the number of animals euthanized. The same is true for fosters: each additional foster is associated with an expected increase of about 2.386 animals euthanized, holding the other variables constant. Because the estimate for fosters is much larger than the estimate for adoptions, one additional foster is associated with a larger change in euthanasia than one additional adoption.##
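
To make the coefficient interpretation concrete, here is a minimal sketch that again uses only the estimates and standard errors printed by summary(model). The scenario of 100 additional animals is hypothetical and chosen purely for illustration, and the 95% confidence intervals are added to show how uncertain the estimates are with 4 residual degrees of freedom.

```r
# Estimates and standard errors copied from the summary(model) output
est <- c(adop = 0.4764, fost = 2.386)
se  <- c(adop = 0.1624, fost = 0.4756)
df_resid <- 4   # residual degrees of freedom reported by the summary

# Hypothetical scenario: 100 more adoptions, or 100 more fosters,
# with the other predictors held constant
extra_animals   <- 100
expected_change <- est * extra_animals

# 95% confidence intervals: estimate +/- t critical value * std. error
t_crit <- qt(0.975, df = df_resid)
ci <- cbind(lower = est - t_crit * se, upper = est + t_crit * se)

print(expected_change)
print(round(ci, 3))
```

If the fitted model object is available, confint(model) should give the same intervals directly, without retyping the rounded values.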

##Question Seven: Does the model you create meet or violate the assumption of linearity? Show your work with "plot(x,which=1)"##

##Answer: Based on the Residuals vs Fitted plot of the model below, the assumption of linearity appears to be violated: the trend line through the residuals is not a straight, flat line and is instead somewhat curved.##

plot(model, which = 1)
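
For readers unfamiliar with what plot(model, which = 1) draws, the sketch below rebuilds a Residuals vs Fitted plot by hand. It deliberately uses made-up toy data (toy, toy_model), NOT the shelter data from this report, with a curved relationship built in so the pattern is easy to see. It is an illustration of the diagnostic under those assumptions, not a reproduction of the plot above.

```r
# Hypothetical toy data: NOT the dog and cat data used in this report
set.seed(42)
toy <- data.frame(x = 1:20)
toy$y <- 3 + 0.5 * toy$x^2 + rnorm(20, sd = 2)   # deliberately curved

# Fit a straight line to the curved data
toy_model <- lm(y ~ x, data = toy)
res <- resid(toy_model)
fit <- fitted(toy_model)

# Residuals vs Fitted: under linearity the points scatter evenly around 0
plot(fit, res, xlab = "Fitted values", ylab = "Residuals",
     main = "Residuals vs Fitted (toy data)")
abline(h = 0, lty = 2)                 # reference line at zero
lines(lowess(fit, res), col = "red")   # smoothed trend through the residuals
```

In this toy case the residuals are positive at both ends and negative in the middle, so the red trend line bends into a U shape, which is the kind of curvature that suggests a linearity violation.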

Sarah Rodriguez

2024-10-25