An Introduction to Statistical Learning (2nd ed.)

Labs

Chapter 03 Linear regression

The Boston data set contain various statistics for 506 neighborhoods in Boston. We will build a simple linear regression model that related the median value of owner-occupied homes (medv) as the response with a variable indicating the percentage of the population that belongs to a lower status (lstat) as the predictor.

The Boston data set is quite outdated and contains some really unfortunate variables.

boston <- tibble(Boston)
glimpse(boston)

Rows: 506
Columns: 14
$ crim    <dbl> 0.00632, 0.02731, 0.02729, 0.03237, 0.06905, 0.02985, 0.08829,~
$ zn      <dbl> 18.0, 0.0, 0.0, 0.0, 0.0, 0.0, 12.5, 12.5, 12.5, 12.5, 12.5, 1~
$ indus   <dbl> 2.31, 7.07, 7.07, 2.18, 2.18, 2.18, 7.87, 7.87, 7.87, 7.87, 7.~
$ chas    <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,~
$ nox     <dbl> 0.538, 0.469, 0.469, 0.458, 0.458, 0.458, 0.524, 0.524, 0.524,~
$ rm      <dbl> 6.575, 6.421, 7.185, 6.998, 7.147, 6.430, 6.012, 6.172, 5.631,~
$ age     <dbl> 65.2, 78.9, 61.1, 45.8, 54.2, 58.7, 66.6, 96.1, 100.0, 85.9, 9~
$ dis     <dbl> 4.0900, 4.9671, 4.9671, 6.0622, 6.0622, 6.0622, 5.5605, 5.9505~
$ rad     <int> 1, 2, 2, 3, 3, 3, 5, 5, 5, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 4, 4,~
$ tax     <dbl> 296, 242, 242, 222, 222, 222, 311, 311, 311, 311, 311, 311, 31~
$ ptratio <dbl> 15.3, 17.8, 17.8, 18.7, 18.7, 18.7, 15.2, 15.2, 15.2, 15.2, 15~
$ black   <dbl> 396.90, 396.90, 392.83, 394.63, 396.90, 394.12, 395.60, 396.90~
$ lstat   <dbl> 4.98, 9.14, 4.03, 2.94, 5.33, 5.21, 12.43, 19.15, 29.93, 17.10~
$ medv    <dbl> 24.0, 21.6, 34.7, 33.4, 36.2, 28.7, 22.9, 27.1, 16.5, 18.9, 15~

colSums(is.na(boston))

   crim      zn   indus    chas     nox      rm     age     dis     rad     tax 
      0       0       0       0       0       0       0       0       0       0 
ptratio   black   lstat    medv 
      0       0       0       0

corr <- cor(boston)
corrplot(corr, method = "square", type = "upper")

ggpairs(boston)

ggplot(boston, aes(medv)) + 
  geom_density(fill = 'lightgrey') + 
  xlab("Median value") + 
  ggtitle("Median value of Boston homes in 506 neighborhoods ")

We start by creating a parsnip specification for a linear regression model.

lm_model <- 
  linear_reg() %>% 
  set_engine("lm")

lm_model

Linear Regression Model Specification (regression)

Computational engine: lm

#While it is unnecessary to set the mode for a linear regression since it can only be regression, we continue to do it in these labs to be explicit.

#The specification doesn’t perform any calculations by itself. It is just a specification of what we want to do.

Once we have the specification we can fit it by supplying a formula expression and the data we want to fit the model on. The formula is written on the form y ~ x where y is the name of the response and x is the name of the predictors. The names used in the formula should match the names of the variables in the data set passed to data.

lm_fit <- 
  fit(lm_model,
      medv ~ lstat,
      data = boston)

lm_fit

parsnip model object

Fit time:  0ms 

Call:
stats::lm(formula = medv ~ lstat, data = data)

Coefficients:
(Intercept)        lstat  
      34.55        -0.95

The result of this fit is a parsnip model object. This object contains the underlying fit as well as some parsnip-specific information. If we want to look at the underlying fit object we can access it with lm_fit$fit or with.

lm_fit %>% 
  pluck("fit") %>% 
  summary()


Call:
stats::lm(formula = medv ~ lstat, data = data)

Residuals:
    Min      1Q  Median      3Q     Max 
-15.168  -3.990  -1.318   2.034  24.500 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 34.55384    0.56263   61.41   <2e-16 ***
lstat       -0.95005    0.03873  -24.53   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 6.216 on 504 degrees of freedom
Multiple R-squared:  0.5441,    Adjusted R-squared:  0.5432 
F-statistic: 601.6 on 1 and 504 DF,  p-value: < 2.2e-16

We can use packages from the broom package to extract key information out of the model objects in tidy formats.

the tidy() function returns the parameter estimates of a lm object.

tidy(lm_fit)

# A tibble: 2 x 5
  term        estimate std.error statistic   p.value
  <chr>          <dbl>     <dbl>     <dbl>     <dbl>
1 (Intercept)   34.6      0.563       61.4 3.74e-236
2 lstat         -0.950    0.0387     -24.5 5.08e- 88

and glance() can be used to extract the model statistics.

glance1 <- glance(lm_fit)
glance1 <- glance1 %>% 
  select(-c(logLik, deviance, nobs)) %>% 
  mutate(model = "medv ~ lstat")
glance1

# A tibble: 1 x 10
  r.squared adj.r.squared sigma statistic  p.value    df   AIC   BIC df.residual
      <dbl>         <dbl> <dbl>     <dbl>    <dbl> <dbl> <dbl> <dbl>       <int>
1     0.544         0.543  6.22      602. 5.08e-88     1 3289. 3302.         504
# ... with 1 more variable: model <chr>

We need to explicitly supply the data set that the predictions should be performed on via the new_data argument.

predict(lm_fit, boston)

# A tibble: 506 x 1
   .pred
   <dbl>
 1 29.8 
 2 25.9 
 3 30.7 
 4 31.8 
 5 29.5 
 6 29.6 
 7 22.7 
 8 16.4 
 9  6.12
10 18.3 
# ... with 496 more rows

Notice how the predictions are returned as a tibble. This will always be the case for parsnip models, no matter what engine is used. This is very useful since consistency allows us to combine data sets easily.

We can also return other types of predicts by specifying the type argument. Setting type = “conf_int” return a 95% confidence interval.

predict(lm_fit, boston, type = "conf_int")

# A tibble: 506 x 2
   .pred_lower .pred_upper
         <dbl>       <dbl>
 1       29.0        30.6 
 2       25.3        26.5 
 3       29.9        31.6 
 4       30.8        32.7 
 5       28.7        30.3 
 6       28.8        30.4 
 7       22.2        23.3 
 8       15.6        17.1 
 9        4.70        7.54
10       17.7        18.9 
# ... with 496 more rows

predict(lm_fit, boston, type = "pred_int")

# A tibble: 506 x 2
   .pred_lower .pred_upper
         <dbl>       <dbl>
 1       17.6         42.1
 2       13.6         38.1
 3       18.5         43.0
 4       19.5         44.0
 5       17.3         41.7
 6       17.4         41.8
 7       10.5         35.0
 8        4.13        28.6
 9       -6.18        18.4
10        6.08        30.5
# ... with 496 more rows

You can get the same results using the augment() function to same you a little bit of typing.

augment(lm_fit, boston) %>% 
  select(medv, .pred, .resid)

# A tibble: 506 x 3
    medv .pred .resid
   <dbl> <dbl>  <dbl>
 1  24   29.8  -5.82 
 2  21.6 25.9  -4.27 
 3  34.7 30.7   3.97 
 4  33.4 31.8   1.64 
 5  36.2 29.5   6.71 
 6  28.7 29.6  -0.904
 7  22.9 22.7   0.155
 8  27.1 16.4  10.7  
 9  16.5  6.12 10.4  
10  18.9 18.3   0.592
# ... with 496 more rows

Multiple Linear Regression

lm_fit2 <- 
  fit(lm_model,
      medv ~ lstat + age,
      data = boston)

lm_fit2

parsnip model object

Fit time:  0ms 

Call:
stats::lm(formula = medv ~ lstat + age, data = data)

Coefficients:
(Intercept)        lstat          age  
   33.22276     -1.03207      0.03454

lm_fit2 %>% 
  pluck("fit") %>% 
  summary()


Call:
stats::lm(formula = medv ~ lstat + age, data = data)

Residuals:
    Min      1Q  Median      3Q     Max 
-15.981  -3.978  -1.283   1.968  23.158 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 33.22276    0.73085  45.458  < 2e-16 ***
lstat       -1.03207    0.04819 -21.416  < 2e-16 ***
age          0.03454    0.01223   2.826  0.00491 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 6.173 on 503 degrees of freedom
Multiple R-squared:  0.5513,    Adjusted R-squared:  0.5495 
F-statistic:   309 on 2 and 503 DF,  p-value: < 2.2e-16

tidy(lm_fit2)

# A tibble: 3 x 5
  term        estimate std.error statistic   p.value
  <chr>          <dbl>     <dbl>     <dbl>     <dbl>
1 (Intercept)  33.2       0.731      45.5  2.94e-180
2 lstat        -1.03      0.0482    -21.4  8.42e- 73
3 age           0.0345    0.0122      2.83 4.91e-  3

glance2 <- glance(lm_fit2)
glance2 <- glance2 %>% 
  select(-c(logLik, deviance, nobs)) %>% 
  mutate(model = "medv ~ lstat + age")
glance2

# A tibble: 1 x 10
  r.squared adj.r.squared sigma statistic  p.value    df   AIC   BIC df.residual
      <dbl>         <dbl> <dbl>     <dbl>    <dbl> <dbl> <dbl> <dbl>       <int>
1     0.551         0.549  6.17      309. 2.98e-88     2 3283. 3300.         503
# ... with 1 more variable: model <chr>

augment(lm_fit2, boston) %>% 
  select(medv, .pred, .resid)

# A tibble: 506 x 3
    medv .pred .resid
   <dbl> <dbl>  <dbl>
 1  24   30.3  -6.34 
 2  21.6 26.5  -4.92 
 3  34.7 31.2   3.53 
 4  33.4 31.8   1.63 
 5  36.2 29.6   6.61 
 6  28.7 29.9  -1.17 
 7  22.9 22.7   0.205
 8  27.1 16.8  10.3  
 9  16.5  5.79 10.7  
10  18.9 18.5   0.358
# ... with 496 more rows

Interaction terms

Adding interaction terms are quite easy to do using formula expressions. However, the syntax used to describe them isn’t accepted by all engines so we will go over how to include interaction terms using recipes as well.

There are two ways on including an interaction term; x:y and x * y

x:y will include the interaction between x and y,
x * y will include the interaction between x and y, x, and y, e.i. it is short for x:y + x + y.

with that out of the way let expand lm_fit2 by adding an interaction term

lm_fit3 <- 
  fit(lm_model,
      medv ~ lstat * age,
      data = boston)
lm_fit3

parsnip model object

Fit time:  0ms 

Call:
stats::lm(formula = medv ~ lstat * age, data = data)

Coefficients:
(Intercept)        lstat          age    lstat:age  
 36.0885359   -1.3921168   -0.0007209    0.0041560

lm_fit3 %>% 
  pluck("fit") %>% 
  summary()


Call:
stats::lm(formula = medv ~ lstat * age, data = data)

Residuals:
    Min      1Q  Median      3Q     Max 
-15.806  -4.045  -1.333   2.085  27.552 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept) 36.0885359  1.4698355  24.553  < 2e-16 ***
lstat       -1.3921168  0.1674555  -8.313 8.78e-16 ***
age         -0.0007209  0.0198792  -0.036   0.9711    
lstat:age    0.0041560  0.0018518   2.244   0.0252 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 6.149 on 502 degrees of freedom
Multiple R-squared:  0.5557,    Adjusted R-squared:  0.5531 
F-statistic: 209.3 on 3 and 502 DF,  p-value: < 2.2e-16

tidy(lm_fit3)

# A tibble: 4 x 5
  term         estimate std.error statistic  p.value
  <chr>           <dbl>     <dbl>     <dbl>    <dbl>
1 (Intercept) 36.1        1.47      24.6    4.91e-88
2 lstat       -1.39       0.167     -8.31   8.78e-16
3 age         -0.000721   0.0199    -0.0363 9.71e- 1
4 lstat:age    0.00416    0.00185    2.24   2.52e- 2

glance3 <- glance(lm_fit3)
glance3 <- glance3 %>% 
  select(-c(logLik, deviance, nobs)) %>% 
  mutate(model = "medv ~ lstat*age")
glance3

# A tibble: 1 x 10
  r.squared adj.r.squared sigma statistic  p.value    df   AIC   BIC df.residual
      <dbl>         <dbl> <dbl>     <dbl>    <dbl> <dbl> <dbl> <dbl>       <int>
1     0.556         0.553  6.15      209. 4.86e-88     3 3280. 3301.         502
# ... with 1 more variable: model <chr>

augment(lm_fit3, boston) %>% 
  select(medv, .pred, .resid)

# A tibble: 506 x 3
    medv .pred .resid
   <dbl> <dbl>  <dbl>
 1  24   30.5  -6.46 
 2  21.6 26.3  -4.70 
 3  34.7 31.5   3.24 
 4  33.4 32.5   0.878
 5  36.2 29.8   6.37 
 6  28.7 30.1  -1.36 
 7  22.9 22.2   0.723
 8  27.1 17.0  10.1  
 9  16.5  6.79  9.71 
10  18.9 18.3   0.574
# ... with 496 more rows

Sometimes we want to perform transformations, and we want those transformations to be applied, as part of the model fit as a pre-processing step. We will use the recipes package for this task.

We use the step_interact() to specify the interaction term. Next, we create a workflow object to combine the linear regression model specification lm_spec with the pre-processing specification rec_spec_interact which can then be fitted much like a parsnip model specification.

lm_rec_interact <- 
  recipe(medv ~ lstat + age, data = boston) %>% 
  step_interact(~ lstat : age)

lm_wf_interact <- 
  workflow() %>% 
  add_model(lm_model) %>% 
  add_recipe(lm_rec_interact)

lm_wf_interact

== Workflow ====================================================================
Preprocessor: Recipe
Model: linear_reg()

-- Preprocessor ----------------------------------------------------------------
1 Recipe Step

* step_interact()

-- Model -----------------------------------------------------------------------
Linear Regression Model Specification (regression)

Computational engine: lm

fit(lm_wf_interact, boston)

== Workflow [trained] ==========================================================
Preprocessor: Recipe
Model: linear_reg()

-- Preprocessor ----------------------------------------------------------------
1 Recipe Step

* step_interact()

-- Model -----------------------------------------------------------------------

Call:
stats::lm(formula = ..y ~ ., data = data)

Coefficients:
(Intercept)        lstat          age  lstat_x_age  
 36.0885359   -1.3921168   -0.0007209    0.0041560

Non-linear transformations of the predictors

Much like we could use recipes to create interaction terms between values are we able to apply transformations to individual variables as well. If you are familiar with the dplyr package then you know how to mutate() which works in much the same way using step_mutate().

You would want to keep as much of the pre-processing inside recipes such that the transformation will be applied consistently to new data.

lm_rec_pow2 <- 
  recipe(medv ~ lstat, data = boston) %>% 
  step_mutate(lstat2 = lstat ^ 2)

lm_wf_pow2 <- 
  workflow() %>% 
  add_model(lm_model) %>% 
  add_recipe(lm_rec_pow2)

lm_fit_pow2 <- fit(lm_wf_pow2, boston)

tidy(lm_fit_pow2)

# A tibble: 3 x 5
  term        estimate std.error statistic   p.value
  <chr>          <dbl>     <dbl>     <dbl>     <dbl>
1 (Intercept)  42.9      0.872        49.1 3.50e-194
2 lstat        -2.33     0.124       -18.8 2.55e- 60
3 lstat2        0.0435   0.00375      11.6 7.63e- 28

glance4 <- glance(lm_fit_pow2)
glance4 <- glance4 %>% 
  select(-c(logLik, deviance, nobs)) %>% 
  mutate(model = "medv ~ lstat ^2")
glance4

# A tibble: 1 x 10
  r.squared adj.r.squared sigma statistic   p.value    df   AIC   BIC
      <dbl>         <dbl> <dbl>     <dbl>     <dbl> <dbl> <dbl> <dbl>
1     0.641         0.639  5.52      449. 1.56e-112     2 3171. 3187.
# ... with 2 more variables: df.residual <int>, model <chr>

lm log model

lm_rec_log <- 
  recipe(
    medv ~ lstat, data = boston) %>% 
  step_log(lstat)
lm_rec_log

Data Recipe

Inputs:

      role #variables
   outcome          1
 predictor          1

Operations:

Log transformation on lstat

lm_wf_log <- 
  workflow() %>% 
  add_model(lm_model) %>% 
  add_recipe(lm_rec_log)

lm_fit_log <- fit(lm_wf_log, boston)

tidy(lm_fit_log)

# A tibble: 2 x 5
  term        estimate std.error statistic   p.value
  <chr>          <dbl>     <dbl>     <dbl>     <dbl>
1 (Intercept)     52.1     0.965      54.0 1.02e-211
2 lstat          -12.5     0.395     -31.6 9.28e-122

glance5 <- glance(lm_fit_log)
glance5 <- glance5 %>% 
  select(-c(logLik, deviance, nobs)) %>% 
  mutate(model = "medv ~ log(lstat)")
glance5

# A tibble: 1 x 10
  r.squared adj.r.squared sigma statistic   p.value    df   AIC   BIC
      <dbl>         <dbl> <dbl>     <dbl>     <dbl> <dbl> <dbl> <dbl>
1     0.665         0.664  5.33     1000. 9.28e-122     1 3133. 3146.
# ... with 2 more variables: df.residual <int>, model <chr>

Comparing models

comparison <- 
  bind_rows(glance1, glance2, glance3, glance4, glance5) %>% 
  select(model, everything())

comparison

# A tibble: 5 x 10
  model      r.squared adj.r.squared sigma statistic   p.value    df   AIC   BIC
  <chr>          <dbl>         <dbl> <dbl>     <dbl>     <dbl> <dbl> <dbl> <dbl>
1 medv ~ ls~     0.544         0.543  6.22      602. 5.08e- 88     1 3289. 3302.
2 medv ~ ls~     0.551         0.549  6.17      309. 2.98e- 88     2 3283. 3300.
3 medv ~ ls~     0.556         0.553  6.15      209. 4.86e- 88     3 3280. 3301.
4 medv ~ ls~     0.641         0.639  5.52      449. 1.56e-112     2 3171. 3187.
5 medv ~ lo~     0.665         0.664  5.33     1000. 9.28e-122     1 3133. 3146.
# ... with 1 more variable: df.residual <int>

Qualitative predictors

We will now turn our attention to the Carseats data set. We will attempt to predict Sales of child car seats in 400 locations based on a number of predictors. One of these variables is ShelveLoc which is a qualitative predictor that indicates the quality of the shelving location. ShelveLoc takes on three possible values.

Bad
Medium
Good

If you pass such a variable to lm() it will read it and generate dummy variables automatically using the following convention.

carseats = tibble(Carseats)
glimpse(carseats)

Rows: 400
Columns: 11
$ Sales       <dbl> 9.50, 11.22, 10.06, 7.40, 4.15, 10.81, 6.63, 11.85, 6.54, ~
$ CompPrice   <dbl> 138, 111, 113, 117, 141, 124, 115, 136, 132, 132, 121, 117~
$ Income      <dbl> 73, 48, 35, 100, 64, 113, 105, 81, 110, 113, 78, 94, 35, 2~
$ Advertising <dbl> 11, 16, 10, 4, 3, 13, 0, 15, 0, 0, 9, 4, 2, 11, 11, 5, 0, ~
$ Population  <dbl> 276, 260, 269, 466, 340, 501, 45, 425, 108, 131, 150, 503,~
$ Price       <dbl> 120, 83, 80, 97, 128, 72, 108, 120, 124, 124, 100, 94, 136~
$ ShelveLoc   <fct> Bad, Good, Medium, Medium, Bad, Bad, Medium, Good, Medium,~
$ Age         <dbl> 42, 65, 59, 55, 38, 78, 71, 67, 76, 76, 26, 50, 62, 53, 52~
$ Education   <dbl> 17, 10, 12, 14, 13, 16, 15, 10, 10, 17, 10, 13, 18, 18, 18~
$ Urban       <fct> Yes, Yes, Yes, Yes, Yes, No, Yes, Yes, No, No, No, Yes, Ye~
$ US          <fct> Yes, Yes, Yes, Yes, No, Yes, No, Yes, No, Yes, Yes, Yes, N~

levels(carseats$ShelveLoc)

[1] "Bad"    "Good"   "Medium"

carseats %>% 
  pull(ShelveLoc) %>% 
  contrasts()

       Good Medium
Bad       0      0
Good      1      0
Medium    0      1

The step_dummy() will perform the same transformation of turning 1 qualitative with C levels into C-1 indicator variables. While this might seem unnecessary right now, some of the engines, later on, do not handle qualitative variables and this step would be necessary. We are also using the all_nominal_predictors() selector to select all character and factor predictor variables. This allows us to select by type rather than having to type out the names.

lm_rec_qual <-
  recipe(Sales ~ ., data = carseats) %>% 
  step_dummy(all_nominal_predictors()) %>% 
  step_interact(~ Income: Advertising + Price: Age)

lm_wf_qual <- 
  workflow() %>% 
  add_model(lm_model) %>% 
  add_recipe(lm_rec_qual)

lm_wf_qual

== Workflow ====================================================================
Preprocessor: Recipe
Model: linear_reg()

-- Preprocessor ----------------------------------------------------------------
2 Recipe Steps

* step_dummy()
* step_interact()

-- Model -----------------------------------------------------------------------
Linear Regression Model Specification (regression)

Computational engine: lm

fit(lm_wf_qual, carseats)

== Workflow [trained] ==========================================================
Preprocessor: Recipe
Model: linear_reg()

-- Preprocessor ----------------------------------------------------------------
2 Recipe Steps

* step_dummy()
* step_interact()

-- Model -----------------------------------------------------------------------

Call:
stats::lm(formula = ..y ~ ., data = data)

Coefficients:
         (Intercept)             CompPrice                Income  
           6.5755654             0.0929371             0.0108940  
         Advertising            Population                 Price  
           0.0702462             0.0001592            -0.1008064  
                 Age             Education        ShelveLoc_Good  
          -0.0579466            -0.0208525             4.8486762  
    ShelveLoc_Medium             Urban_Yes                US_Yes  
           1.9532620             0.1401597            -0.1575571  
Income_x_Advertising           Price_x_Age  
           0.0007510             0.0001068

fit(lm_wf_qual, carseats) %>% tidy()

# A tibble: 14 x 5
   term                  estimate std.error statistic   p.value
   <chr>                    <dbl>     <dbl>     <dbl>     <dbl>
 1 (Intercept)           6.58      1.01         6.52  2.22e- 10
 2 CompPrice             0.0929    0.00412     22.6   1.64e- 72
 3 Income                0.0109    0.00260      4.18  3.57e-  5
 4 Advertising           0.0702    0.0226       3.11  2.03e-  3
 5 Population            0.000159  0.000368     0.433 6.65e-  1
 6 Price                -0.101     0.00744    -13.5   1.74e- 34
 7 Age                  -0.0579    0.0160      -3.63  3.18e-  4
 8 Education            -0.0209    0.0196      -1.06  2.88e-  1
 9 ShelveLoc_Good        4.85      0.153       31.7   1.38e-109
10 ShelveLoc_Medium      1.95      0.126       15.5   1.34e- 42
11 Urban_Yes             0.140     0.112        1.25  2.13e-  1
12 US_Yes               -0.158     0.149       -1.06  2.91e-  1
13 Income_x_Advertising  0.000751  0.000278     2.70  7.29e-  3
14 Price_x_Age           0.000107  0.000133     0.801 4.24e-  1

fit(lm_wf_qual, carseats) %>% glance()

# A tibble: 1 x 12
  r.squared adj.r.squared sigma statistic   p.value    df logLik   AIC   BIC
      <dbl>         <dbl> <dbl>     <dbl>     <dbl> <dbl>  <dbl> <dbl> <dbl>
1     0.876         0.872  1.01      210. 6.14e-166    13  -565. 1159. 1219.
# ... with 3 more variables: deviance <dbl>, df.residual <int>, nobs <int>

An Introduction to Statistcial Learning https://www.statlearning.com/

– END

sessionInfo()

R version 4.1.0 (2021-05-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043)

Matrix products: default

locale:
[1] LC_COLLATE=Spanish_Mexico.1252  LC_CTYPE=Spanish_Mexico.1252   
[3] LC_MONETARY=Spanish_Mexico.1252 LC_NUMERIC=C                   
[5] LC_TIME=Spanish_Mexico.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] corrplot_0.90      GGally_2.1.2       yardstick_0.0.8    workflowsets_0.0.2
 [5] workflows_0.2.2    tune_0.1.5         tidyr_1.1.3        tibble_3.1.2      
 [9] rsample_0.1.0      recipes_0.1.16     purrr_0.3.4        parsnip_0.1.6     
[13] modeldata_0.1.1    infer_0.5.4        ggplot2_3.3.5      dplyr_1.0.7       
[17] dials_0.0.9        scales_1.1.1       broom_0.7.8        tidymodels_0.1.3  
[21] MASS_7.3-54        ISLR_1.2          

loaded via a namespace (and not attached):
 [1] sass_0.4.0         jsonlite_1.7.2     splines_4.1.0      foreach_1.5.1     
 [5] prodlim_2019.11.13 bslib_0.2.5.1      assertthat_0.2.1   highr_0.9         
 [9] GPfit_1.0-8        yaml_2.2.1         globals_0.14.0     ipred_0.9-11      
[13] pillar_1.6.1       backports_1.2.1    lattice_0.20-44    glue_1.4.2        
[17] pROC_1.17.0.1      digest_0.6.27      RColorBrewer_1.1-2 hardhat_0.1.6     
[21] colorspace_2.0-2   htmltools_0.5.1.1  Matrix_1.3-3       plyr_1.8.6        
[25] timeDate_3043.102  pkgconfig_2.0.3    lhs_1.1.1          DiceDesign_1.9    
[29] listenv_0.8.0      gower_0.2.2        lava_1.6.9         farver_2.1.0      
[33] generics_0.1.0     ellipsis_0.3.2     withr_2.4.2        furrr_0.2.3       
[37] nnet_7.3-16        cli_3.0.0          survival_3.2-11    magrittr_2.0.1    
[41] crayon_1.4.1       evaluate_0.14      future_1.21.0      fansi_0.5.0       
[45] parallelly_1.26.1  class_7.3-19       prettyunits_1.1.1  tools_4.1.0       
[49] lifecycle_1.0.0    stringr_1.4.0      munsell_0.5.0      compiler_4.1.0    
[53] jquerylib_0.1.4    rlang_0.4.11       grid_4.1.0         iterators_1.0.13  
[57] rstudioapi_0.13    labeling_0.4.2     rmarkdown_2.9      gtable_0.3.0      
[61] codetools_0.2-18   reshape_0.8.8      DBI_1.1.1          R6_2.5.0          
[65] lubridate_1.7.10   knitr_1.33         utf8_1.2.1         stringi_1.6.2     
[69] parallel_4.1.0     Rcpp_1.0.7         vctrs_0.3.8        rpart_4.1-15      
[73] tidyselect_1.1.1   xfun_0.24

03. Linear regression

GS

10/8/2021