knitr::include_graphics("C:\\Users\\sergioor\\Desktop\\Old PC\\CUNY\\DATA606\\Chapter 7\\724.png")
a) The relationship between number of calories and amount of carbs may be linear, but it is not very strong. In fact, the small cloud of points in the lower left corner (low calorie, low carb) appears to be driving the linear relationship. If a relationship exists, it is positive.
b) Number of calories is the explanatory variable. Amount of carbs is the response variable.
c) We fit a regression line to see how well a straight line describes the relationship between calories and carbs. Comparing each observed value with the value on the line (the residual) gives an idea of whether there is a true relationship between the independent and dependent variables. If the residuals are too large, the explanatory variable cannot explain the dependent variable effectively, and the model is not a good approximation for predicting the outcome we want.
d) Linearity: As described in part (a), there may be a weak linear relationship. Nearly normal residuals: The histogram of the residuals is not completely symmetrical and may not be nearly normal. Constant variability: Based on the residual plot, I believe the variability is not constant, as there are significantly more points on the right (with larger residuals) than on the left of the plot. Independent observations: Observations are independent, since food items and their nutritional information do not depend on each other. I believe the conditions are not met, because the variability is not constant and the distribution of the residuals is not close enough to normal.
knitr::include_graphics("C:\\Users\\sergioor\\Desktop\\Old PC\\CUNY\\DATA606\\Chapter 7\\726.png")
responseMean = 171.14
responseSD = 9.41
Rvar = .67
explanatoryMean= 107.2
explanatorySD= 10.37
slope = (responseSD/explanatorySD)*Rvar
intercept = responseMean - (slope)*explanatoryMean
regressionFunction = function(x,slope,intercept){
y =(x*slope)+ intercept
return(y)
}
tinyGraph = 1:150
tinyGraph = sapply(tinyGraph,regressionFunction,slope,intercept)
plot(tinyGraph, type = 'l',xlab = 'Shoulder girth',ylab='height')
The equation will be:
\(\hat{y} = 105.9651 + 0.6079749 \times x\), that is, predicted height = 105.9650878 + 0.6079749 * shoulder girth
(b) Interpret the slope and intercept in this context
The intercept is the predicted height, about 105.97 cm, for a shoulder girth of 0 cm. Nobody has a shoulder girth of 0, so the intercept only sets the height of the line and has no practical meaning. The slope means that for every additional centimeter of shoulder girth, the average height increases by 0.6079749 centimeters.
(c) Calculate R2 of the regression line for predicting height from shoulder girth, and interpret it in the context of the application.
\(R^2 = 0.67^2 = 0.4489\)
About 44.89% of the variability in height is explained by the model.
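The arithmetic can be checked in R. This is a minimal sketch that reuses the correlation of 0.67 from part (a); the name `rSquared` is only illustrative.

```r
# Correlation between shoulder girth and height, as given in the exercise
Rvar <- 0.67

# In simple linear regression, R-squared is the square of the correlation
rSquared <- Rvar^2
rSquared
```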
(d) A randomly selected student from your class has a shoulder girth of 100 cm. Predict the height of this student using the model.
Plugging 100 into the function (regressionFunction) I made for part (a) gives a predicted height of 166.7625805 cm.
(e) The student from part (d) is 160 cm tall. Calculate the residual and explain what this means.
Residual = observed - predicted, which in this case is 160 - 166.7625805 = -6.7625805. This means the actual height is about 6.76 cm less than the line predicted, so the model overestimates this student's height.
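The prediction and residual can be reproduced with a short sketch. It rebuilds the slope and intercept from the summary statistics in part (a) so that it runs on its own; `predictedHeight` and `residualValue` are illustrative names, not part of the original code.

```r
# Summary statistics from part (a)
responseMean    <- 171.14   # mean height (cm)
responseSD      <- 9.41
explanatoryMean <- 107.2    # mean shoulder girth (cm)
explanatorySD   <- 10.37
Rvar            <- 0.67     # correlation

# Least squares slope and intercept from the summary statistics
slope     <- (responseSD / explanatorySD) * Rvar
intercept <- responseMean - slope * explanatoryMean

# Predicted height for a shoulder girth of 100 cm
predictedHeight <- intercept + slope * 100

# Residual = observed - predicted, for an observed height of 160 cm
residualValue <- 160 - predictedHeight

predictedHeight
residualValue
```

A negative residual means the observed point lies below the regression line.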
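(f) A one year old has a shoulder girth of 56 cm. Would it be appropriate to use this linear model to predict the height of this child?
The original data only include shoulder girths from about 80 to 140 cm, so 56 cm falls outside the observed range and the prediction would be an extrapolation. Within that range the variability seems to be constant and the relationship is strong and linear, but nothing guarantees that it holds for a one year old. While it would be inappropriate, I'd do it in the absence of any other model.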
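A quick way to flag this kind of extrapolation is to compare the new value with the range of the data. The sketch below assumes the range of 80 to 140 cm stated in the answer above; `observedRange` and `isExtrapolation` are illustrative names.

```r
# Range of shoulder girths (cm) covered by the data, as stated in the answer
# above (an approximation, not computed from the raw data)
observedRange <- c(80, 140)

# Shoulder girth of the one year old
newGirth <- 56

# TRUE when the new value lies outside the observed range
isExtrapolation <- newGirth < observedRange[1] || newGirth > observedRange[2]
isExtrapolation
```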
knitr::include_graphics("C:\\Users\\sergioor\\Desktop\\Old PC\\CUNY\\DATA606\\Chapter 7\\730.png")
\(\widehat{\text{heart weight}} = -0.357 + 4.034 \times \text{body weight}\)
(b) Interpret the intercept
The intercept means that for a body weight of 0 kg, the expected heart weight is -0.357 grams. This is obviously not a meaningful value; the intercept only serves to adjust the height of the line.
(c) Interpret the slope
The slope means that for each additional kilogram of body weight, the average heart weight of a cat increases by 4.034 grams.
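The slope interpretation can be illustrated by comparing two predictions that are 1 kg apart. This is a sketch only: `predictHeartWeight` is a hypothetical helper built from the fitted equation, and the body weights of 2 kg and 3 kg are arbitrary example values.

```r
# Fitted line from the exercise: heart weight (g) as a function of body weight (kg)
predictHeartWeight <- function(bodyWeightKg) {
  -0.357 + 4.034 * bodyWeightKg
}

# Difference in predicted heart weight for two cats whose body weights differ by 1 kg
heartWeightDifference <- predictHeartWeight(3) - predictHeartWeight(2)
heartWeightDifference
```

The difference equals the slope regardless of which pair of weights 1 kg apart is chosen.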
(d) Interpret R2
64.66% of the variability in heart weight of cats can be explained by body weight.
(e) Calculate the correlation coefficient
We take the square root of R2 (0.6466), which is 0.8041144. The correlation is positive because the slope is positive.
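The same calculation in R, as a small sketch. The square root only gives the magnitude of the correlation, so the sign is taken from the slope of the fitted line; the variable names are illustrative.

```r
# R-squared and slope from the cat heart weight model
rSquared      <- 0.6466
slopeEstimate <- 4.034

# The correlation has the same sign as the slope
correlation <- sign(slopeEstimate) * sqrt(rSquared)
correlation
```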
knitr::include_graphics("C:\\Users\\sergioor\\Desktop\\Old PC\\CUNY\\DATA606\\Chapter 7\\740.png")
The regression line passes through the point of means, so 3.9983 = 4.010 + b1 * (-0.0883), which gives b1 = (3.9983 - 4.010) / (-0.0883) = 0.1325028
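Solving for the slope can be checked with a short sketch. It relies on the fact that a least squares line passes through the point of means; the variable names are illustrative, and reading 3.9983 as the mean evaluation score and -0.0883 as the mean beauty score is my interpretation of the numbers in the equation above.

```r
# Values taken from the equation above
meanEvaluation <- 3.9983    # mean of the response (teaching evaluation)
interceptValue <- 4.010     # intercept from the summary table
meanBeauty     <- -0.0883   # mean of the explanatory variable (beauty score)

# Rearranging meanEvaluation = interceptValue + b1 * meanBeauty for b1
b1 <- (meanEvaluation - interceptValue) / meanBeauty
b1
```

Both the numerator and the denominator are negative, so the resulting slope is positive.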
Since the slope is positive, the relationship is positive. If we set up a hypothesis test with \(H_0: \beta_1 = 0\) and \(H_A: \beta_1 > 0\), then based on the summary table the p-value is nearly 0. That p-value is for a two-sided test, so it will be even closer to 0 for a one-sided test. We reject the null hypothesis. There is convincing evidence that the relationship between teaching evaluation and beauty is positive.
Linearity: Based on the scatterplot, there may be a weak linear relationship. There is no evident pattern in the residual plot. Nearly normal residuals: The histogram of the residuals exhibits a left skew. Additionally, the points seem to move away from the normal probability line on each end. However, the bulk of the data is very close to the line, so I would conclude that the distribution of residuals is nearly normal. Constant variability: Based on the residual plot, there appears to be constant variability in the data. Independent observations: Observations are not a time series, and can be assumed to be independent (unless there is evidence that students copied each other’s evaluations). I believe all conditions are satisfied for this linear model.