## Welcome to CUNY DATA606 Statistics and Probability for Data Analytics
## This package is designed to support this course. The text book used
## is OpenIntro Statistics, 3rd Edition. You can read this by typing
## vignette('os3') or visit www.OpenIntro.org.
##
## The getLabs() function will return a list of the labs available.
##
## The demo(package='DATA606') will list the demos that are available.
Exercise: 7.24 Nutrition at Starbucks, Part 1.
The relationship between calories and carbs is linear, moderate, and positive, with a lot of scatter around the trend.
Explanatory variable: calories (x-axis). Response variable: carbs (y-axis).
A regression line lets us make predictions. For this data set, it can be used to predict the amount of carbs in an item from its number of calories.
A regression line is not a good choice for these data. The histogram of the residuals is skewed to the left rather than symmetric, so the condition of nearly normal residuals is not met.
Exercise: Body Measurements, Part 3.
a.
\(s_x = 10.37\)
\(b_1 = \frac{s_y}{s_x} R = 0.6079749\)
\(y - \bar{y} = b_1 (x - \bar{x})\), which gives \(\hat{y} = 105.9651 + 0.6079749 \times x\)
Regression line: \(\widehat{height} = 105.9651 + 0.6079749 \times girth\)
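The slope and intercept can be checked numerically. This is a minimal sketch, not the original calculation: only \(s_x = 10.37\) appears in this write-up, so the means, \(s_y\), and \(R\) below are assumed from the textbook exercise statement. It uses the least squares slope \(b_1 = R \cdot s_y / s_x\) and the fact that the line passes through \((\bar{x}, \bar{y})\).

```python
# s_x comes from the write-up; every other summary statistic here is
# ASSUMED from the textbook exercise statement and is not shown above.
x_bar, s_x = 107.20, 10.37   # shoulder girth (cm); x_bar is assumed
y_bar, s_y = 171.14, 9.41    # height (cm); both values are assumed
r = 0.67                     # correlation; assumed

b1 = r * s_y / s_x           # least squares slope
b0 = y_bar - b1 * x_bar      # intercept: the line passes through (x_bar, y_bar)

print(f"slope     b1 = {b1:.7f}")
print(f"intercept b0 = {b0:.4f}")
```

Under these assumed inputs, the printed values should agree with the slope and intercept reported in part a.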
b.
The slope indicates that for each additional centimeter of shoulder girth, the predicted height increases by about 0.608 cm. The intercept is also positive: at a shoulder girth of 0 cm, the model predicts a height of approximately 105.9651 cm.
About 44.89% of the variation in height is explained by shoulder girth.
For a shoulder girth of x = 100 cm, the predicted height is \(\hat{y} = 105.9651 + 0.6079749 \times 100 = 166.7626\) cm.
The residual is defined as observed minus predicted: \(e_i = y_i - \hat{y}_i = 160 - 166.7626 = -6.7626\) cm. A negative residual means the model overestimates this student's height.
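The prediction and residual can be reproduced with a short script. This is a sketch that takes the rounded coefficients from part a as given; the helper name `predict_height` is hypothetical and used only for illustration.

```python
def predict_height(girth_cm, b0=105.9651, b1=0.6079749):
    """Predicted height (cm) from the fitted line reported in part a."""
    return b0 + b1 * girth_cm

observed = 160.0                    # observed height (cm)
predicted = predict_height(100.0)   # shoulder girth of 100 cm
residual = observed - predicted     # observed minus predicted

print(f"predicted = {predicted:.4f} cm")
print(f"residual  = {residual:.4f} cm")
```

The residual comes out negative, which matches the overestimation described above.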
The observed shoulder girths range from about 85 cm to 135 cm. A shoulder girth of 56 cm lies outside this range, so using the linear model to predict height there would be extrapolation and is not appropriate.
Exercise: 7.30 Cats, Part 1
\(\widehat{heart\ weight} = -0.357 + 4.034 \times body\ weight\)
Correlation coefficient: \(R = \sqrt{0.6466} = 0.8041144\), taken as positive because the slope is positive.
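Taking the square root of \(R^2\) gives only the magnitude of the correlation; its sign has to come from the slope. A minimal sketch of that step, using the two numbers quoted above:

```python
import math

r_squared = 0.6466   # R^2 used in the write-up
slope = 4.034        # slope of the fitted line

# sqrt gives the magnitude only; copysign attaches the sign of the slope.
r = math.copysign(math.sqrt(r_squared), slope)
print(f"R = {r:.7f}")
```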
Exercise: 7.40 Rate my Professor.
a.
\(\bar{x} = -0.0883\) and \(\bar{y} = 3.9983\)
\(b_1 = \frac{\bar{y} - b_0}{\bar{x}} = \frac{3.9983 - 4.01}{-0.0883} = 0.1325028\)
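The slope here follows from the fact that a least squares line passes through the point of means. A minimal sketch of the arithmetic, using only the values quoted above (the variable names are illustrative):

```python
x_bar = -0.0883   # mean of the explanatory variable (from the write-up)
y_bar = 3.9983    # mean of the response variable (from the write-up)
b0 = 4.01         # intercept used in the write-up

# The line passes through (x_bar, y_bar):
#   y_bar = b0 + b1 * x_bar  =>  b1 = (y_bar - b0) / x_bar
b1 = (y_bar - b0) / x_bar
print(f"b1 = {b1:.7f}")
```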
b.
The relationship is positive because the slope is positive. The hypotheses are \(H_0 : \beta_1 = 0\) and \(H_A : \beta_1 > 0\). The p-value is nearly zero, so we reject \(H_0\).