This project analyzes the drivers of Walmart’s weekly store-level sales using linear regression and predictive modeling in R.
The analysis progresses from simple models to a robust log-linear specification, balancing interpretability with predictive performance.
Key findings:
This report is designed to be read end-to-end without running any code.
## Rows: 6435 Columns: 9
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (7): Store, Temperature, Fuel_Price, CPI, Unemployment, Size, Weekly_Sales
## lgl (1): IsHoliday
## date (1): Date
## rename_with: renamed 9 variables (store, date, isholiday, temperature, fuel_price, …)
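The `rename_with` message above indicates that all nine column names were converted to lower case. A minimal base-R sketch of that step on a one-row stand-in data frame follows. The `raw` and `clean` names and the values are illustrative assumptions; the report itself appears to use `dplyr::rename_with()`.

```r
# One-row stand-in with a few of the original mixed-case column names.
raw <- data.frame(
  Store        = 1,
  IsHoliday    = FALSE,
  Fuel_Price   = 2.81,
  Weekly_Sales = 1105515
)

# Lower-case every column name; the data values are left untouched.
clean <- setNames(raw, tolower(names(raw)))
names(clean)
```

Lower-casing once at import keeps later formulas such as `weekly_sales ~ cpi` consistent with the printed column names.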
## # A tibble: 6 × 9
## store date isholiday temperature fuel_price cpi unemployment size
## <dbl> <date> <lgl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 2010-04-16 FALSE 66.3 2.81 210. 7.81 151315
## 2 1 2012-04-06 FALSE 70.4 3.89 221. 7.14 151315
## 3 1 2010-08-06 FALSE 87.2 2.63 212. 7.79 151315
## 4 1 2010-02-05 FALSE 42.3 2.57 211. 8.11 151315
## 5 1 2012-08-17 FALSE 84.8 3.57 222. 6.91 151315
## 6 1 2011-02-04 FALSE 42.3 2.99 213. 7.74 151315
## # ℹ 1 more variable: weekly_sales <dbl>
## spc_tbl_ [6,435 × 9] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
## $ store : num [1:6435] 1 1 1 1 1 1 1 1 1 1 ...
## $ date : Date[1:6435], format: "2010-04-16" "2012-04-06" ...
## $ isholiday : logi [1:6435] FALSE FALSE FALSE FALSE FALSE FALSE ...
## $ temperature : num [1:6435] 66.3 70.4 87.2 42.3 84.8 ...
## $ fuel_price : num [1:6435] 2.81 3.89 2.63 2.57 3.57 ...
## $ cpi : num [1:6435] 210 221 212 211 222 ...
## $ unemployment: num [1:6435] 7.81 7.14 7.79 8.11 6.91 ...
## $ size : num [1:6435] 151315 151315 151315 151315 151315 ...
## $ weekly_sales: num [1:6435] 1105515 1505325 837329 1112467 1085133 ...
## - attr(*, "spec")=
## .. cols(
## .. Store = col_double(),
## .. Date = col_date(format = ""),
## .. IsHoliday = col_logical(),
## .. Temperature = col_double(),
## .. Fuel_Price = col_double(),
## .. CPI = col_double(),
## .. Unemployment = col_double(),
## .. Size = col_double(),
## .. Weekly_Sales = col_double()
## .. )
## - attr(*, "problems")=<externalptr>
##
## Call:
## stats::lm(formula = weekly_sales ~ cpi, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -662386 -318443 -73868 258442 2095880
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 827280.5 21778.4 37.986 < 2e-16 ***
## cpi -732.7 123.7 -5.923 3.33e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 390600 on 6433 degrees of freedom
## Multiple R-squared: 0.005423, Adjusted R-squared: 0.005269
## F-statistic: 35.08 on 1 and 6433 DF, p-value: 3.332e-09
[Figures not rendered: four plots with a fitted smooth line, each drawn on a filtered subset of 143 rows (6,292 rows removed), and one animated plot on a filtered subset of 715 rows (5,720 rows removed). The animation could not be produced because no renderer (gifski, av, or magick) was installed.]
## group_by: one grouping variable (store)
## filter (grouped): removed 315 rows (88%), 45 rows remaining (removed 0 groups, 45 groups remaining)
## # A tibble: 45 × 6
## # Groups: store [45]
## store term estimate std.error statistic p.value
## <dbl> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 1 cpi -15806. 15501. -1.02 0.310
## 2 2 cpi 30013. 28889. 1.04 0.301
## 3 3 cpi 9663. 7616. 1.27 0.207
## 4 4 cpi -20934. 61271. -0.342 0.733
## 5 5 cpi 6166. 5817. 1.06 0.291
## 6 6 cpi 10838. 19566. 0.554 0.581
## 7 7 cpi -1555. 17927. -0.0867 0.931
## 8 8 cpi -7967. 10842. -0.735 0.464
## 9 9 cpi 10544. 8460. 1.25 0.215
## 10 10 cpi -79325. 58475. -1.36 0.177
## # ℹ 35 more rows
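The table above reports one `cpi` slope per store. A minimal base-R sketch of how a table of this shape could be produced is given below. The synthetic `toy` data, the `true_slope` values, and the `fit_cpi_slope` helper are all illustrative assumptions, and the sketch fits a one-predictor model per store, which may be simpler than the per-store models behind the table.

```r
# Synthetic stand-in for the data: 3 stores, 20 weeks each.
set.seed(42)
toy <- data.frame(
  store = rep(1:3, each = 20),
  cpi   = rep(seq(210, 229, length.out = 20), times = 3)
)

# Give each store its own true slope so the per-store fits differ.
true_slope <- c(-100, 0, 250)
toy$weekly_sales <- 1e6 + true_slope[toy$store] * toy$cpi +
  rnorm(nrow(toy), sd = 500)

# Hypothetical helper: fit weekly_sales ~ cpi for one store
# and return the cpi row of the coefficient table.
fit_cpi_slope <- function(df) {
  est <- summary(lm(weekly_sales ~ cpi, data = df))$coefficients["cpi", ]
  data.frame(
    store     = df$store[1],
    term      = "cpi",
    estimate  = est[["Estimate"]],
    std.error = est[["Std. Error"]],
    statistic = est[["t value"]],
    p.value   = est[["Pr(>|t|)"]]
  )
}

# One model per store, stacked into a single table.
per_store <- do.call(rbind, lapply(split(toy, toy$store), fit_cpi_slope))
print(per_store)
```

Fitting store by store lets the sign and size of the slope vary across stores, which the pooled model cannot show.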
[Figures not rendered: two plots with a fitted smooth line, drawn on filtered subsets of 1,935 rows (4,500 rows removed) and 43 rows (6,392 rows removed).]
##
## Call:
## stats::lm(formula = weekly_sales ~ cpi + size, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -563750 -167145 -29612 112172 1912650
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.828e+05 1.497e+04 12.216 <2e-16 ***
## cpi -6.570e+02 7.692e+01 -8.542 <2e-16 ***
## size 4.847e+00 4.796e-02 101.048 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 242800 on 6432 degrees of freedom
## Multiple R-squared: 0.6156, Adjusted R-squared: 0.6155
## F-statistic: 5151 on 2 and 6432 DF, p-value: < 2.2e-16
## Analysis of Variance Table
##
## Model 1: weekly_sales ~ cpi
## Model 2: weekly_sales ~ cpi + size
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 6433 9.8128e+14
## 2 6432 3.7924e+14 1 6.0204e+14 10211 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
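As a check on the ANOVA table above, the F statistic can be recovered by hand from the printed residual sums of squares and degrees of freedom. The inputs are the rounded values shown, so the result agrees only to rounding:

$$
F = \frac{(\mathrm{RSS}_1 - \mathrm{RSS}_2)/(\mathrm{df}_1 - \mathrm{df}_2)}{\mathrm{RSS}_2/\mathrm{df}_2}
  = \frac{(9.8128 - 3.7924)\times 10^{14} / 1}{3.7924\times 10^{14} / 6432}
  \approx \frac{6.0204\times 10^{14}}{5.896\times 10^{10}}
  \approx 1.021\times 10^{4}
$$

This matches the reported value of 10211: adding `size` removes roughly 61% of the residual sum of squares left by the `cpi`-only model.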
## # A tibble: 2 × 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 827280. 21778. 38.0 4.65e-285
## 2 cpi -733. 124. -5.92 3.33e- 9
## # A tibble: 3 × 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 182832. 14967. 12.2 6.08e-34
## 2 cpi -657. 76.9 -8.54 1.63e-17
## 3 size 4.85 0.0480 101. 0
##
## Call:
## stats::lm(formula = weekly_sales ~ . - store - date, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -557148 -165608 -24125 112851 1918479
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.133e+05 3.546e+04 8.834 < 2e-16 ***
## isholidayTRUE 6.012e+04 1.196e+04 5.026 5.14e-07 ***
## temperature 1.002e+03 1.739e+02 5.761 8.72e-09 ***
## fuel_price -1.333e+04 6.822e+03 -1.954 0.0507 .
## cpi -9.461e+02 8.445e+01 -11.203 < 2e-16 ***
## unemployment -1.252e+04 1.725e+03 -7.258 4.40e-13 ***
## size 4.840e+00 4.802e-02 100.786 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 241200 on 6428 degrees of freedom
## Multiple R-squared: 0.621, Adjusted R-squared: 0.6206
## F-statistic: 1755 on 6 and 6428 DF, p-value: < 2.2e-16
## Analysis of Variance Table
##
## Model 1: weekly_sales ~ cpi + size
## Model 2: weekly_sales ~ (store + date + isholiday + temperature + fuel_price +
## cpi + unemployment + size) - store - date
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 6432 3.7924e+14
## 2 6428 3.7394e+14 4 5.3028e+12 22.789 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Call:
## stats::lm(formula = weekly_sales ~ . - store - date + isholiday *
## temperature, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -557499 -165415 -24493 112914 1918376
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.148e+05 3.565e+04 8.830 < 2e-16 ***
## isholidayTRUE 4.745e+04 3.265e+04 1.453 0.1462
## temperature 9.809e+02 1.808e+02 5.424 6.04e-08 ***
## fuel_price -1.342e+04 6.826e+03 -1.966 0.0493 *
## cpi -9.460e+02 8.446e+01 -11.200 < 2e-16 ***
## unemployment -1.251e+04 1.725e+03 -7.254 4.53e-13 ***
## size 4.840e+00 4.802e-02 100.779 < 2e-16 ***
## isholidayTRUE:temperature 2.473e+02 5.932e+02 0.417 0.6768
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 241200 on 6427 degrees of freedom
## Multiple R-squared: 0.621, Adjusted R-squared: 0.6206
## F-statistic: 1504 on 7 and 6427 DF, p-value: < 2.2e-16
## Analysis of Variance Table
##
## Model 1: weekly_sales ~ (store + date + isholiday + temperature + fuel_price +
## cpi + unemployment + size) - store - date
## Model 2: weekly_sales ~ (store + date + isholiday + temperature + fuel_price +
## cpi + unemployment + size) - store - date + isholiday * temperature
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 6428 3.7394e+14
## 2 6427 3.7393e+14 1 1.0112e+10 0.1738 0.6768
##
## Call:
## stats::lm(formula = weekly_sales ~ . - store - date + I(temperature^2),
## data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -561455 -165260 -24674 112058 1911166
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.610e+05 4.111e+04 6.350 2.30e-10 ***
## isholidayTRUE 6.230e+04 1.199e+04 5.197 2.09e-07 ***
## temperature 3.294e+03 9.301e+02 3.542 0.0004 ***
## fuel_price -1.471e+04 6.841e+03 -2.151 0.0315 *
## cpi -9.547e+02 8.449e+01 -11.300 < 2e-16 ***
## unemployment -1.253e+04 1.724e+03 -7.268 4.09e-13 ***
## size 4.831e+00 4.811e-02 100.420 < 2e-16 ***
## I(temperature^2) -1.982e+01 7.901e+00 -2.509 0.0121 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 241100 on 6427 degrees of freedom
## Multiple R-squared: 0.6214, Adjusted R-squared: 0.621
## F-statistic: 1507 on 7 and 6427 DF, p-value: < 2.2e-16
## Analysis of Variance Table
##
## Model 1: weekly_sales ~ (store + date + isholiday + temperature + fuel_price +
## cpi + unemployment + size) - store - date
## Model 2: weekly_sales ~ (store + date + isholiday + temperature + fuel_price +
## cpi + unemployment + size) - store - date + I(temperature^2)
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 6428 3.7394e+14
## 2 6427 3.7357e+14 1 3.6586e+11 6.2943 0.01214 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
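The positive linear and negative quadratic temperature coefficients in the summary above imply an inverted-U relationship. As a sketch, holding the other predictors fixed and using the rounded full-sample estimates, the temperature at which predicted sales peak is:

$$
\frac{\partial\,\widehat{\text{sales}}}{\partial T} = \hat\beta_{T} + 2\,\hat\beta_{T^2}\,T = 0
\quad\Longrightarrow\quad
T^{\ast} = -\frac{\hat\beta_{T}}{2\,\hat\beta_{T^2}} = \frac{3294}{2 \times 19.82} \approx 83.1
$$

The result is in the units of the `temperature` column, whose values suggest degrees Fahrenheit. The quadratic term is only modestly significant here (p ≈ 0.012), so the turning point should be read as indicative rather than precise.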
##
## Call:
## stats::lm(formula = weekly_sales ~ . - date - store + I(temperature^2),
## data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -557260 -165114 -25112 115048 1913671
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.546e+05 4.725e+04 5.389 7.42e-08 ***
## isholidayTRUE 6.038e+04 1.397e+04 4.323 1.57e-05 ***
## temperature 3.056e+03 1.068e+03 2.861 0.00424 **
## fuel_price -1.939e+04 7.819e+03 -2.480 0.01316 *
## cpi -9.217e+02 9.640e+01 -9.561 < 2e-16 ***
## unemployment -1.058e+04 1.992e+03 -5.312 1.14e-07 ***
## size 4.826e+00 5.496e-02 87.809 < 2e-16 ***
## I(temperature^2) -1.628e+01 9.058e+00 -1.797 0.07237 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 239000 on 4818 degrees of freedom
## Multiple R-squared: 0.6248, Adjusted R-squared: 0.6242
## F-statistic: 1146 on 7 and 4818 DF, p-value: < 2.2e-16
## # A tibble: 8 × 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 254647. 47252. 5.39 7.42e- 8
## 2 isholidayTRUE 60380. 13967. 4.32 1.57e- 5
## 3 temperature 3056. 1068. 2.86 4.24e- 3
## 4 fuel_price -19393. 7819. -2.48 1.32e- 2
## 5 cpi -922. 96.4 -9.56 1.81e-21
## 6 unemployment -10580. 1992. -5.31 1.14e- 7
## 7 size 4.83 0.0550 87.8 0
## 8 I(temperature^2) -16.3 9.06 -1.80 7.24e- 2
## rename: renamed one variable (Predicted_Sales)
## # A tibble: 1,609 × 10
## Predicted_Sales store date isholiday temperature fuel_price cpi
## <dbl> <dbl> <date> <lgl> <dbl> <dbl> <dbl>
## 1 754860. 1 2010-02-05 FALSE 42.3 2.57 211.
## 2 215324. 3 2010-02-05 FALSE 45.7 2.57 214.
## 3 1006246. 6 2010-02-05 FALSE 40.4 2.57 213.
## 4 774226. 8 2010-02-05 FALSE 34.1 2.57 214.
## 5 585828. 12 2010-02-05 FALSE 49.5 2.96 126.
## 6 978757. 14 2010-02-05 FALSE 27.3 2.78 182.
## 7 528790. 17 2010-02-05 FALSE 23.1 2.67 126.
## 8 693424. 21 2010-02-05 FALSE 39.0 2.57 211.
## 9 1032041. 24 2010-02-05 FALSE 22.4 2.95 132.
## 10 220072. 36 2010-02-05 FALSE 46.0 2.54 210.
## # ℹ 1,599 more rows
## # ℹ 3 more variables: unemployment <dbl>, size <dbl>, weekly_sales <dbl>
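The training fit above uses 4,826 rows (4,818 residual degrees of freedom plus 8 coefficients) and the prediction table has 1,609 rows, which is consistent with a 75/25 split of the 6,435 observations. A minimal base-R sketch of such a split follows. The seed, the `split_rows` helper, and the use of simple random sampling are assumptions; since the report loads tidymodels, it may instead have used `rsample::initial_split()`.

```r
n    <- 6435   # rows in the full data set
prop <- 0.75   # assumed training proportion

# Hypothetical helper: return row indices for the training and test sets.
split_rows <- function(n, prop, seed = 123) {
  set.seed(seed)
  train_idx <- sample.int(n, size = floor(prop * n))
  list(
    train = sort(train_idx),
    test  = setdiff(seq_len(n), train_idx)
  )
}

idx <- split_rows(n, prop)
length(idx$train)  # 4826
length(idx$test)   # 1609
```

Holding out the test rows before fitting is what makes the later RMSE and MAE figures out-of-sample estimates.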
## # A tibble: 2 × 3
## .metric .estimator .estimate
## <chr> <chr> <dbl>
## 1 rmse standard 247631.
## 2 mae standard 181493.
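The two metrics above summarize prediction error on the held-out rows in the units of `weekly_sales`. A minimal sketch of how each is computed, using hypothetical `rmse` and `mae` helpers and toy values that are not from the report:

```r
# Hypothetical helpers mirroring the two reported metrics.
rmse <- function(truth, estimate) sqrt(mean((truth - estimate)^2))
mae  <- function(truth, estimate) mean(abs(truth - estimate))

# Toy values chosen so that one prediction misses by much more than the others.
truth    <- c(100, 200, 300, 400)
estimate <- c(110, 190, 300, 440)

rmse_value <- rmse(truth, estimate)  # sqrt(450), about 21.2
mae_value  <- mae(truth, estimate)   # 15

print(c(rmse = rmse_value, mae = mae_value))
```

RMSE squares the errors before averaging, so it is never smaller than MAE and grows faster when a few predictions miss badly. The gap between the reported RMSE (about 247,631) and MAE (about 181,493) is consistent with some large individual misses.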
##
## Call:
## stats::lm(formula = weekly_sales ~ . - store - date, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -553813 -165778 -24194 114613 1919644
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.981e+05 4.060e+04 7.344 2.42e-13 ***
## isholidayTRUE 5.846e+04 1.393e+04 4.197 2.76e-05 ***
## temperature 1.170e+03 1.998e+02 5.857 5.02e-09 ***
## fuel_price -1.842e+04 7.802e+03 -2.360 0.0183 *
## cpi -9.137e+02 9.632e+01 -9.486 < 2e-16 ***
## unemployment -1.056e+04 1.992e+03 -5.302 1.20e-07 ***
## size 4.832e+00 5.486e-02 88.094 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 239000 on 4819 degrees of freedom
## Multiple R-squared: 0.6245, Adjusted R-squared: 0.624
## F-statistic: 1336 on 6 and 4819 DF, p-value: < 2.2e-16
## Analysis of Variance Table
##
## Model 1: weekly_sales ~ (store + date + isholiday + temperature + fuel_price +
## cpi + unemployment + size) - date - store + I(temperature^2)
## Model 2: weekly_sales ~ (store + date + isholiday + temperature + fuel_price +
## cpi + unemployment + size) - store - date
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 4818 2.7514e+14
## 2 4819 2.7533e+14 -1 -1.8445e+11 3.2299 0.07237 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## rename: renamed one variable (Predicted_Sales)
## # A tibble: 2 × 3
## .metric .estimator .estimate
## <chr> <chr> <dbl>
## 1 rmse standard 247845.
## 2 mae standard 181757.
## mutate: new variable 'log_sales' (double) with 6,435 unique values and 0% NA
## # A tibble: 6,435 × 10
## store date isholiday temperature fuel_price cpi unemployment size
## <dbl> <date> <lgl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 2010-04-16 FALSE 66.3 2.81 210. 7.81 151315
## 2 1 2012-04-06 FALSE 70.4 3.89 221. 7.14 151315
## 3 1 2010-08-06 FALSE 87.2 2.63 212. 7.79 151315
## 4 1 2010-02-05 FALSE 42.3 2.57 211. 8.11 151315
## 5 1 2012-08-17 FALSE 84.8 3.57 222. 6.91 151315
## 6 1 2011-02-04 FALSE 42.3 2.99 213. 7.74 151315
## 7 1 2012-08-03 FALSE 86.1 3.42 222. 6.91 151315
## 8 1 2012-04-20 FALSE 66.8 3.88 222. 7.14 151315
## 9 1 2012-07-06 FALSE 81.6 3.23 222. 6.91 151315
## 10 1 2010-09-03 FALSE 81.2 2.58 212. 7.79 151315
## # ℹ 6,425 more rows
## # ℹ 2 more variables: weekly_sales <dbl>, log_sales <dbl>
##
## Call:
## stats::lm(formula = log_sales ~ . - store - date - weekly_sales,
## data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.27631 -0.22829 -0.01924 0.22924 1.48007
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.247e+01 5.575e-02 223.594 < 2e-16 ***
## isholidayTRUE 6.378e-02 1.913e-02 3.334 0.000863 ***
## temperature 4.764e-04 2.744e-04 1.736 0.082580 .
## fuel_price -7.008e-03 1.072e-02 -0.654 0.513151
## cpi -1.185e-03 1.323e-04 -8.958 < 2e-16 ***
## unemployment -4.849e-03 2.736e-03 -1.772 0.076401 .
## size 8.095e-06 7.534e-08 107.444 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3283 on 4819 degrees of freedom
## Multiple R-squared: 0.7109, Adjusted R-squared: 0.7106
## F-statistic: 1975 on 6 and 4819 DF, p-value: < 2.2e-16
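Because the response in the model above is `log_sales`, each coefficient describes an approximate proportional change in weekly sales rather than a change in dollars. Assuming a natural logarithm (consistent with an intercept near 12.5), a change of $\Delta x$ in a predictor with coefficient $\hat\beta$ corresponds to:

$$
\%\Delta\,\text{sales} = 100\,\bigl(e^{\hat\beta\,\Delta x} - 1\bigr)
$$

Applied to two of the printed estimates:

$$
\text{holiday week: } 100\,\bigl(e^{0.06378} - 1\bigr) \approx 6.6\%
\qquad
\text{size, } \Delta x = 10{,}000\text{: } 100\,\bigl(e^{8.095\times 10^{-6}\times 10^{4}} - 1\bigr) = 100\,\bigl(e^{0.08095} - 1\bigr) \approx 8.4\%
$$

So, holding the other predictors fixed, a holiday week is associated with roughly 6.6% higher sales, and each additional 10,000 units of `size` with roughly 8.4% higher sales.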