Make sure to include the unit of the values whenever appropriate.
Hint: The variables are available in the gapminder data set from the gapminder package. Note that the data set and package both have the same name, gapminder.
library(tidyverse)
## ── Attaching packages ────────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.0 ✓ purrr 0.3.3
## ✓ tibble 3.0.0 ✓ dplyr 0.8.5
## ✓ tidyr 1.0.2 ✓ stringr 1.4.0
## ✓ readr 1.3.1 ✓ forcats 0.5.0
## ── Conflicts ───────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
options(scipen=999)
data(gapminder, package="gapminder")
houses_lm <- lm(lifeExp ~ gdpPercap,
                data = gapminder)
# View summary of model 1
summary(houses_lm)
##
## Call:
## lm(formula = lifeExp ~ gdpPercap, data = gapminder)
##
## Residuals:
## Min 1Q Median 3Q Max
## -82.754 -7.758 2.176 8.225 18.426
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 53.95556088 0.31499494 171.29 <0.0000000000000002 ***
## gdpPercap 0.00076488 0.00002579 29.66 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 10.49 on 1702 degrees of freedom
## Multiple R-squared: 0.3407, Adjusted R-squared: 0.3403
## F-statistic: 879.6 on 1 and 1702 DF, p-value: < 0.00000000000000022
Hint: Your answer must include a discussion on the p-value.
Yes, the coefficient of gdpPercap is statistically significant at the 5% level because its p-value (reported as less than 0.0000000000000002) is far below 0.05.
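The comparison against 5% can also be done in code rather than by reading the printed table. This is a minimal sketch that refits the first model so it runs on its own; the names `model_1`, `coef_table` and `p_value` are illustrative and not part of the original assignment.

```r
# Refit model 1 so this sketch is self-contained
data(gapminder, package = "gapminder")
model_1 <- lm(lifeExp ~ gdpPercap, data = gapminder)

# Coefficient table: one row per term, with the p-values in the last column
coef_table <- coef(summary(model_1))
p_value <- coef_table["gdpPercap", "Pr(>|t|)"]

# TRUE when the coefficient is significant at the 5% level
p_value < 0.05
```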
Hint: Discuss both its sign and magnitude.
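The estimated coefficient of gdpPercap is 0.00076488. Its sign is positive, meaning that when gdpPercap increases by $1, predicted life expectancy increases by about 0.00076 years. The magnitude is easier to read per $1,000: an increase of $1,000 in gdpPercap is associated with an increase of about 0.76 years in predicted life expectancy.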
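Because a $1 change is tiny on the scale of GDP per capita, it can help to rescale the slope. The sketch below refits the first model and multiplies the slope by 1,000; `model_1` and `slope` are illustrative names.

```r
# Refit model 1 so this sketch is self-contained
data(gapminder, package = "gapminder")
model_1 <- lm(lifeExp ~ gdpPercap, data = gapminder)

# Slope: predicted change in life expectancy (years) per $1 of GDP per capita
slope <- coef(model_1)[["gdpPercap"]]

# Predicted change in life expectancy (years) per $1,000 of GDP per capita
slope * 1000
```

With the slope reported in the summary above (0.00076488), this works out to roughly 0.76 years per $1,000.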
Hint: Provide a technical interpretation.
The intercept is the predicted life expectancy when gdpPercap equals $0: a country with a GDP per capita of $0 would have a predicted life expectancy at birth of about 53.96 years.
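One way to check this reading is to ask the model for a prediction at gdpPercap = 0 and compare it with the intercept. The sketch also prints the smallest gdpPercap in the data, since a prediction at $0 is an extrapolation if no country is that poor. The names `model_1`, `at_zero` and `pred_zero` are illustrative.

```r
# Refit model 1 so this sketch is self-contained
data(gapminder, package = "gapminder")
model_1 <- lm(lifeExp ~ gdpPercap, data = gapminder)

# Prediction for a hypothetical country with a GDP per capita of $0
at_zero <- data.frame(gdpPercap = 0)
pred_zero <- unname(predict(model_1, newdata = at_zero))

# The prediction at zero is exactly the intercept
pred_zero
coef(model_1)[["(Intercept)"]]

# Smallest observed GDP per capita, to see how far $0 lies outside the data
min(gapminder$gdpPercap)
```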
Hint: This is a model with two explanatory variables. Insert another code chunk below.
data(gapminder, package="gapminder")
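# Note: the comma after `year` passes gdpPercap as lm()'s `subset` argument,
# so the fit below uses year as the only explanatory variable.
# A model with two explanatory variables would use lifeExp ~ year + gdpPercap.
houses_lm <- lm(lifeExp ~ year, gdpPercap,
                data = gapminder)
# View summary of model 2
summary(houses_lm)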
summary(houses_lm)
##
## Call:
## lm(formula = lifeExp ~ year, data = gapminder, subset = gdpPercap)
##
## Residuals:
## Min 1Q Median 3Q Max
## -29.221 -9.436 1.517 11.201 21.581
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -573.69800 56.15343 -10.22 <0.0000000000000002 ***
## year 0.31998 0.02837 11.28 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 11.86 on 580 degrees of freedom
## (1122 observations deleted due to missingness)
## Multiple R-squared: 0.1799, Adjusted R-squared: 0.1784
## F-statistic: 127.2 on 1 and 580 DF, p-value: < 0.00000000000000022
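The Call line in the output above shows `subset = gdpPercap`, and only 582 of the 1,704 rows were used (580 residual degrees of freedom plus two coefficients). A model with two explanatory variables is normally written with `+` in the formula. The sketch below shows that form; `model_2` is an illustrative name, and its summary is not reproduced here because it was not part of the original run.

```r
# Load the data so this sketch is self-contained
data(gapminder, package = "gapminder")

# `+` adds gdpPercap to the formula as a second explanatory variable
model_2 <- lm(lifeExp ~ year + gdpPercap, data = gapminder)

# View summary of the two-variable model
summary(model_2)

# Number of rows used in the fit (no subset argument, so all rows)
nobs(model_2)
```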
Hint: Discuss in terms of both residual standard error and reported adjusted R squared.
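In the first model, the residual standard error is 10.49, while in the second model it is 11.86. The residual standard error is measured in years of life expectancy, so the first model's predictions typically miss the actual life expectancy by about 10.49 years, while the second model's miss by about 11.86 years. The adjusted R-squared is 0.3403 for the first model and 0.1784 for the second, so the first model explains about 34% of the variation in life expectancy and the second about 18%. Both measures point the same way: the first model fits better, and its data points lie closer to the regression line than the second model's.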
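Both fit measures can be pulled out of a fitted model directly, which makes the comparison less error-prone than copying numbers from two printouts. This is a sketch under one assumption: the second model is written as the two-variable form `lifeExp ~ year + gdpPercap`, not the call that was run above. The helper `fit_stats()` and the names `model_1` and `model_2` are hypothetical.

```r
# Load the data so this sketch is self-contained
data(gapminder, package = "gapminder")

model_1 <- lm(lifeExp ~ gdpPercap, data = gapminder)
# Assumed two-variable form of the second model
model_2 <- lm(lifeExp ~ year + gdpPercap, data = gapminder)

# Hypothetical helper: residual standard error (years) and adjusted R-squared
fit_stats <- function(model) {
  c(rse = sigma(model), adj_r2 = summary(model)$adj.r.squared)
}

# One row per model; a lower rse and a higher adj_r2 indicate a better fit
rbind(model_1 = fit_stats(model_1), model_2 = fit_stats(model_2))
```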
Hint: Discuss both its sign and magnitude.
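The coefficient of year is 0.31998. Its sign is positive, meaning that life expectancy rises over time: for each additional calendar year after 1952, the predicted life expectancy of a newborn increases by about 0.32 years, or roughly 3.2 years per decade.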
Hint: We had this discussion in class while watching the video at DataCamp, Correlation and Regression in R. The video is titled “Interpretation of Regression” and is in Chapter 4: Interpreting Regression Models.
Based on the second model above, the predicted life expectancy for a country with a gdpPercap of $40,000 in the year 1997 is roughly 76.49 years.
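A prediction like this can be checked with `predict()`. Note that the coefficients printed for the second model above include only year; plugging in 1997 gives -573.698 + 0.31998 × 1997, about 65.3 years, and gdpPercap does not enter that calculation. The sketch below therefore assumes the two-variable form `lifeExp ~ year + gdpPercap`; `model_2`, `new_country` and `predicted` are illustrative names.

```r
# Load the data so this sketch is self-contained
data(gapminder, package = "gapminder")

# Assumed two-variable form of the second model
model_2 <- lm(lifeExp ~ year + gdpPercap, data = gapminder)

# Hypothetical country: GDP per capita of $40,000 in the year 1997
new_country <- data.frame(year = 1997, gdpPercap = 40000)

# Predicted life expectancy at birth, in years
predicted <- unname(predict(model_2, newdata = new_country))
predicted
```

The same number comes from multiplying each coefficient by its value and adding the intercept, which is a useful check when doing the calculation by hand.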
Hint: Use message, echo and results in the chunk options. Refer to the RMarkdown Reference Guide.
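As a rough illustration of these three options, the sketch below shows the per-chunk form as a comment and the document-wide form as a call to `knitr::opts_chunk$set()`. The particular values are only examples, and it assumes the knitr package is installed.

```r
# Per-chunk form, written in the chunk header of the .Rmd file, for example:
#   {r, message = FALSE, echo = FALSE, results = "hide"}
#
# message = FALSE hides messages such as the tidyverse attach notice
# echo    = FALSE hides the source code of the chunk
# results = "hide" hides the printed output of the chunk

# Document-wide defaults, usually set once in a setup chunk
knitr::opts_chunk$set(message = FALSE, echo = TRUE, results = "markup")
```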