Make sure to include the unit of the values whenever appropriate.
Hint: The variables are available in the gapminder data set from the gapminder package. Note that the data set and package both have the same name, gapminder.
library(tidyverse)
options(scipen=999)
data(gapminder, package="gapminder")
# Model 1: regress life expectancy (years) on GDP per capita (US dollars)
gdp_lm <- lm(lifeExp ~ gdpPercap,
             data = gapminder)
# View summary of model 1
summary(gdp_lm)
##
## Call:
## lm(formula = lifeExp ~ gdpPercap, data = gapminder)
##
## Residuals:
## Min 1Q Median 3Q Max
## -82.754 -7.758 2.176 8.225 18.426
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 53.95556088 0.31499494 171.29 <0.0000000000000002 ***
## gdpPercap 0.00076488 0.00002579 29.66 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 10.49 on 1702 degrees of freedom
## Multiple R-squared: 0.3407, Adjusted R-squared: 0.3403
## F-statistic: 879.6 on 1 and 1702 DF, p-value: < 0.00000000000000022
Hint: Your answer must include a discussion on the p-value.
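Yes, the coefficient is statistically significant. The p-value for gdpPercap is less than 0.0000000000000002, which is far below the conventional 0.05 significance level, so we reject the null hypothesis that the coefficient is zero. (The value 0.00076488 is the coefficient estimate, not the p-value.)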
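As a rough check, the p-value for gdpPercap can be recovered from the t value and residual degrees of freedom printed in the summary above. This is only a sketch: the numbers are copied by hand from the output, and the 0.05 threshold is the conventional significance level, assumed here rather than stated in the assignment.

```r
# Values copied from the summary() output for model 1
t_value  <- 29.66   # t value of gdpPercap
df_resid <- 1702    # residual degrees of freedom

# Two-sided p-value from the t distribution
p_value <- 2 * pt(abs(t_value), df = df_resid, lower.tail = FALSE)

alpha <- 0.05       # assumed conventional significance level
p_value < alpha     # TRUE means the coefficient is statistically significant
```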
Hint: Discuss both its sign and magnitude.
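The coefficient of gdpPercap is 0.00076488. Its sign is positive, so predicted life expectancy rises as GDP per capita rises. In magnitude, each additional US dollar of GDP per capita is associated with an increase of about 0.00076 years in life expectancy, or roughly 0.76 years for every additional $1,000.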
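Because the per-dollar slope is so small, it can be easier to read when rescaled to a larger change in GDP per capita. The short sketch below does that arithmetic with the estimate copied from the model 1 summary; the $1,000 step is just an illustrative choice.

```r
# Slope of gdpPercap copied from the model 1 summary (years per US dollar)
slope_per_dollar <- 0.00076488

# Rescale to years of life expectancy per additional $1,000 of GDP per capita
slope_per_1000 <- slope_per_dollar * 1000
slope_per_1000
```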
Hint: Provide a technical interpretation.
The intercept is 53.96 (about 54 years). Technically, it is the predicted average life expectancy for a country whose GDP per capita is $0.
Hint: This is a model with two explanatory variables. Insert another code chunk below.
library(tidyverse)
options(scipen=999)
data(gapminder, package="gapminder")
# Model 2: add year as a second explanatory variable
gdp_year_lm <- lm(lifeExp ~ gdpPercap + year,
                  data = gapminder)
# View summary of model 2
summary(gdp_year_lm)
##
## Call:
## lm(formula = lifeExp ~ gdpPercap + year, data = gapminder)
##
## Residuals:
## Min 1Q Median 3Q Max
## -67.262 -6.954 1.219 7.759 19.553
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -418.42425945 27.61713769 -15.15 <0.0000000000000002 ***
## gdpPercap 0.00066973 0.00002447 27.37 <0.0000000000000002 ***
## year 0.23898275 0.01397107 17.11 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.694 on 1701 degrees of freedom
## Multiple R-squared: 0.4375, Adjusted R-squared: 0.4368
## F-statistic: 661.4 on 2 and 1701 DF, p-value: < 0.00000000000000022
Hint: Discuss in terms of both residual standard error and reported adjusted R squared.
The model in question 5 is better. Its residual standard error is lower (9.694 years versus 10.49 years in the first model), and its adjusted R squared is higher (0.4368 versus 0.3403). The residual standard error is an absolute measure of the typical distance between the data points and the regression line, so a smaller value means the second model's predictions miss by less on average. The higher adjusted R squared means the second model explains more of the variation in life expectancy, even after accounting for the extra explanatory variable.
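The adjusted R squared values can be reproduced from the multiple R squared values in the two summaries, which shows how the adjustment penalizes the extra variable. This is a sketch using the standard adjustment formula; `adj_r2` is a helper defined here for illustration, and n = 1704 is inferred from the residual degrees of freedom (1702 plus the two coefficients of model 1).

```r
# Adjusted R squared: 1 - (1 - R^2) * (n - 1) / (n - p - 1),
# where p is the number of explanatory variables
adj_r2 <- function(r2, n, p) 1 - (1 - r2) * (n - 1) / (n - p - 1)

n <- 1704  # inferred: 1702 residual df + 2 coefficients in model 1

m1 <- adj_r2(0.3407, n, p = 1)  # model 1: gdpPercap only
m2 <- adj_r2(0.4375, n, p = 2)  # model 2: gdpPercap + year

round(c(model1 = m1, model2 = m2), 4)
```

Because the inputs are already rounded to four decimals, the results should land close to the reported 0.3403 and 0.4368 rather than match them to every digit.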
Hint: Discuss both its sign and magnitude.
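The coefficient of year is 0.23898275. Its sign is positive, and in magnitude each additional calendar year is associated with an increase of about 0.24 years in predicted life expectancy, holding gdpPercap constant (about 2.4 years per decade).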
Hint: We had this discussion in class while watching the video at DataCamp, Correlation and Regression in R. The video is titled as “Interpretation of Regression” in Chapter 4: Interpreting Regression Models.
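The coefficient of year is positive, so holding gdpPercap constant, later years are associated with higher life expectancy (in this model, year explains lifeExp, not gdpPercap). For a country with a gdpPercap of $40,000 in 1997, the predicted life expectancy is -418.42425945 + 0.00066973 × 40,000 + 0.23898275 × 1997, which is about 85.6 years.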
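The prediction for a country with a gdpPercap of 40,000 in 1997 can be checked by plugging those values into the fitted equation. The sketch below uses the coefficients copied from the model 2 summary instead of refitting the model, so it runs without the gapminder package; the variable names are chosen here for illustration.

```r
# Coefficients copied from the model 2 summary
coefs <- c(intercept = -418.42425945,
           gdpPercap = 0.00066973,
           year      = 0.23898275)

new_gdp  <- 40000  # GDP per capita in US dollars
new_year <- 1997

# Fitted equation: lifeExp = intercept + b1 * gdpPercap + b2 * year
pred <- coefs[["intercept"]] +
  coefs[["gdpPercap"]] * new_gdp +
  coefs[["year"]] * new_year

round(pred, 1)  # predicted life expectancy in years
```

With a fitted model object available, `predict()` with a `newdata` data frame holding the same two values should give the same figure.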
Hint: Use message, echo and results in the chunk options. Refer to the RMarkdown Reference Guide.