유럽에 3,000개의 매장을 갖고 있는 드럭 스토어(한국의 올리브영같은 매장) 데이터를 분석하는 미션으로 데이터 분석 툴(SQL, R, Python)을 활용해서 데이터가 필요한-마케팅, 영업, 투자 전략, 현장-부서에서 사용하기에 알맞은 형태의 데이터로 가공해서 제공한다. — # Rossman Store Sales
Rossmann operates over 3,000 drug stores in 7 European countries. Currently, Rossmann store managers are tasked with predicting their daily sales for up to six weeks in advance. Store sales are influenced by many factors, including promotions, competition, school and state holidays, seasonality, and locality. With thousands of individual managers predicting sales based on their unique circumstances, the accuracy of results can be quite varied.
In their first Kaggle competition, Rossmann is challenging you to predict 6 weeks of daily sales for 1,115 stores located across Germany. Reliable sales forecasts enable store managers to create effective staff schedules that increase productivity and motivation. By helping Rossmann create a robust prediction model, you will help store managers stay focused on what’s most important to them: their customers and their teams!
You are provided with historical sales data for 1,115 Rossmann stores. The task is to forecast the “Sales” column for the test set. Note that some stores in the dataset were temporarily closed for refurbishment.
Most of the fields are self-explanatory. The following are descriptions for those that aren’t.
library(data.table)
library(forecast)
library(lubridate) # 날짜 라이브러리
library(zoo) # 날짜 라이브러리
library(ggplot2) # 시각화 라이브러리
library(DescTools) # 통계 요약 라이브러리
library(sqldf) # sql 라이브러리
library(dplyr) # 전처리 라이브러리
library(knitr) # 출판 라이브러리
library(VIM) # 결측치 라이브러리
test <- fread("../../data/sql/rossman/test.csv")
train <- fread("../../data/sql/rossman/train.csv")
store <- fread("../../data/sql/rossman/store.csv")
str(train)
## Classes 'data.table' and 'data.frame': 1017209 obs. of 9 variables:
## $ Store : int 1 2 3 4 5 6 7 8 9 10 ...
## $ DayOfWeek : int 5 5 5 5 5 5 5 5 5 5 ...
## $ Date : chr "2015-07-31" "2015-07-31" "2015-07-31" "2015-07-31" ...
## $ Sales : int 5263 6064 8314 13995 4822 5651 15344 8492 8565 7185 ...
## $ Customers : int 555 625 821 1498 559 589 1414 833 687 681 ...
## $ Open : int 1 1 1 1 1 1 1 1 1 1 ...
## $ Promo : int 1 1 1 1 1 1 1 1 1 1 ...
## $ StateHoliday : chr "0" "0" "0" "0" ...
## $ SchoolHoliday: int 1 1 1 1 1 1 1 1 1 1 ...
## - attr(*, ".internal.selfref")=<externalptr>
glimpse(train)
## Observations: 1,017,209
## Variables: 9
## $ Store <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 1...
## $ DayOfWeek <int> 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, ...
## $ Date <chr> "2015-07-31", "2015-07-31", "2015-07-31", "2015-...
## $ Sales <int> 5263, 6064, 8314, 13995, 4822, 5651, 15344, 8492...
## $ Customers <int> 555, 625, 821, 1498, 559, 589, 1414, 833, 687, 6...
## $ Open <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
## $ Promo <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
## $ StateHoliday <chr> "0", "0", "0", "0", "0", "0", "0", "0", "0", "0"...
## $ SchoolHoliday <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, ...
summary(train)
## Store DayOfWeek Date Sales
## Min. : 1.0 Min. :1.000 Length:1017209 Min. : 0
## 1st Qu.: 280.0 1st Qu.:2.000 Class :character 1st Qu.: 3727
## Median : 558.0 Median :4.000 Mode :character Median : 5744
## Mean : 558.4 Mean :3.998 Mean : 5774
## 3rd Qu.: 838.0 3rd Qu.:6.000 3rd Qu.: 7856
## Max. :1115.0 Max. :7.000 Max. :41551
## Customers Open Promo StateHoliday
## Min. : 0.0 Min. :0.0000 Min. :0.0000 Length:1017209
## 1st Qu.: 405.0 1st Qu.:1.0000 1st Qu.:0.0000 Class :character
## Median : 609.0 Median :1.0000 Median :0.0000 Mode :character
## Mean : 633.1 Mean :0.8301 Mean :0.3815
## 3rd Qu.: 837.0 3rd Qu.:1.0000 3rd Qu.:1.0000
## Max. :7388.0 Max. :1.0000 Max. :1.0000
## SchoolHoliday
## Min. :0.0000
## 1st Qu.:0.0000
## Median :0.0000
## Mean :0.1786
## 3rd Qu.:0.0000
## Max. :1.0000
head(train)
## Store DayOfWeek Date Sales Customers Open Promo StateHoliday
## 1: 1 5 2015-07-31 5263 555 1 1 0
## 2: 2 5 2015-07-31 6064 625 1 1 0
## 3: 3 5 2015-07-31 8314 821 1 1 0
## 4: 4 5 2015-07-31 13995 1498 1 1 0
## 5: 5 5 2015-07-31 4822 559 1 1 0
## 6: 6 5 2015-07-31 5651 589 1 1 0
## SchoolHoliday
## 1: 1
## 2: 1
## 3: 1
## 4: 1
## 5: 1
## 6: 1
train$Date <- as.Date(train$Date)
#train[, Date := as.Date(Date)]
#train$Date <- as.Date(train$Date)
train <- train[order(Date)]
total_sales <- sqldf("
select
count(distinct store) as tot_store
, sum(customers) as tot_cust
, round(avg(customers), 2) as avg_cust
, sum(sales) as tot_sales
, round(avg(sales), 2) as avg_sales
, min(date) as first_purchase_date
, max(date) as last_purchase_date
from train;")
summary(total_sales)
## tot_store tot_cust avg_cust tot_sales
## Min. :1115 Min. :6.44e+08 Min. :633.1 Min. :5873180623
## 1st Qu.:1115 1st Qu.:6.44e+08 1st Qu.:633.1 1st Qu.:5873180623
## Median :1115 Median :6.44e+08 Median :633.1 Median :5873180623
## Mean :1115 Mean :6.44e+08 Mean :633.1 Mean :5873180623
## 3rd Qu.:1115 3rd Qu.:6.44e+08 3rd Qu.:633.1 3rd Qu.:5873180623
## Max. :1115 Max. :6.44e+08 Max. :633.1 Max. :5873180623
## avg_sales first_purchase_date last_purchase_date
## Min. :5774 Min. :15706 Min. :16647
## 1st Qu.:5774 1st Qu.:15706 1st Qu.:16647
## Median :5774 Median :15706 Median :16647
## Mean :5774 Mean :15706 Mean :16647
## 3rd Qu.:5774 3rd Qu.:15706 3rd Qu.:16647
## Max. :5774 Max. :15706 Max. :16647
| tot_store | tot_cust | avg_cust | tot_sales | avg_sales | first_purchase_date | last_purchase_date |
|---|---|---|---|---|---|---|
| 1115 | 644041755 | 633.15 | 5873180623 | 5773.82 | 15706 | 16647 |
annual_sales <- sqldf("
select
substr(date, 1, 4) as year
, count(distinct store) as tot_store
, sum(customers) as tot_cust
, round(avg(customers), 2) as avg_cust
, count(*) as cnt_trade
, sum(sales) as tot_sales
, round(avg(sales), 2) as avg_sales
, min(date) as first_purchase_date
, max(date) as last_purchase_date
from train
group by 1;
")
summary(annual_sales)
## year tot_store tot_cust avg_cust
## Length:95 Min. : 935 Min. :2396743 Min. :432.9
## Class :character 1st Qu.:1115 1st Qu.:6420442 1st Qu.:599.6
## Mode :character Median :1115 Median :6832654 Median :630.6
## Mean :1083 Mean :6779387 Mean :633.6
## 3rd Qu.:1115 3rd Qu.:7277780 3rd Qu.:669.2
## Max. :1115 Max. :9525313 Max. :854.3
## cnt_trade tot_sales avg_sales first_purchase_date
## Min. : 4459 Min. :20033838 Min. :3875 Min. :15706
## 1st Qu.:11150 1st Qu.:57395018 1st Qu.:5300 1st Qu.:15935
## Median :11150 Median :61340641 Median :5730 Median :16170
## Mean :10707 Mean :61822954 Mean :5776 Mean :16170
## 3rd Qu.:11150 3rd Qu.:66413716 3rd Qu.:6177 3rd Qu.:16405
## Max. :11150 Max. :95991386 Max. :8609 Max. :16640
## last_purchase_date
## Min. :15709
## 1st Qu.:15944
## Median :16179
## Mean :16179
## 3rd Qu.:16414
## Max. :16647
annual_sales$tot_sales <- as.numeric(annual_sales$tot_sales)
| year | tot_store | tot_cust | avg_cust | cnt_trade | tot_sales | avg_sales | first_purchase_date | last_purchase_date |
|---|---|---|---|---|---|---|---|---|
| 1570 | 1115 | 2396743 | 537.51 | 4459 | 20033838 | 4492.90 | 15706 | 15709 |
| 1571 | 1115 | 6674810 | 598.64 | 11150 | 61257507 | 5493.95 | 15710 | 15719 |
| 1572 | 1115 | 6857427 | 615.02 | 11150 | 60758739 | 5449.21 | 15720 | 15729 |
| 1573 | 1115 | 6050198 | 542.62 | 11150 | 52238436 | 4685.06 | 15730 | 15739 |
| 1574 | 1115 | 7350066 | 659.20 | 11150 | 66257993 | 5942.42 | 15740 | 15749 |
| 1575 | 1115 | 7302965 | 654.97 | 11150 | 66556834 | 5969.22 | 15750 | 15759 |
| 1576 | 1115 | 6672469 | 598.43 | 11150 | 58798163 | 5273.38 | 15760 | 15769 |
| 1577 | 1115 | 7028338 | 630.34 | 11150 | 60634339 | 5438.06 | 15770 | 15779 |
| 1578 | 1115 | 7031567 | 630.63 | 11150 | 66142071 | 5932.02 | 15780 | 15789 |
| 1579 | 1115 | 6733738 | 603.92 | 11150 | 60674610 | 5441.67 | 15790 | 15799 |
| 1580 | 1115 | 6773018 | 607.45 | 11150 | 61058795 | 5476.13 | 15800 | 15809 |
| 1581 | 1115 | 7097850 | 636.58 | 11150 | 59974267 | 5378.86 | 15810 | 15819 |
| 1582 | 1115 | 7316861 | 656.22 | 11150 | 67433091 | 6047.81 | 15820 | 15829 |
| 1583 | 1115 | 6349850 | 569.49 | 11150 | 55553797 | 4982.40 | 15830 | 15839 |
| 1584 | 1115 | 6586412 | 590.71 | 11150 | 56752031 | 5089.87 | 15840 | 15849 |
| 1585 | 1115 | 6815031 | 611.21 | 11150 | 64855718 | 5816.66 | 15850 | 15859 |
| 1586 | 1115 | 7298046 | 654.53 | 11150 | 62918504 | 5642.92 | 15860 | 15869 |
| 1587 | 1115 | 6722795 | 602.94 | 11150 | 60966372 | 5467.84 | 15870 | 15879 |
| 1588 | 1115 | 7502901 | 672.91 | 11150 | 67631129 | 6065.57 | 15880 | 15889 |
| 1589 | 1115 | 7123260 | 638.86 | 11150 | 61232883 | 5491.74 | 15890 | 15899 |
| 1590 | 1115 | 6933002 | 621.79 | 11150 | 63864886 | 5727.79 | 15900 | 15909 |
| 1591 | 1115 | 7734513 | 693.68 | 11150 | 72510687 | 6503.20 | 15910 | 15919 |
| 1592 | 1115 | 6519142 | 584.68 | 11150 | 56806267 | 5094.73 | 15920 | 15929 |
| 1593 | 1115 | 7227530 | 648.21 | 11150 | 62520179 | 5607.19 | 15930 | 15939 |
| 1594 | 1115 | 6803763 | 610.20 | 11150 | 62098160 | 5569.34 | 15940 | 15949 |
| 1595 | 1115 | 7400148 | 663.69 | 11150 | 65419656 | 5867.23 | 15950 | 15959 |
| 1596 | 1115 | 6878898 | 616.94 | 11150 | 57525187 | 5159.21 | 15960 | 15969 |
| 1597 | 1115 | 6832654 | 612.79 | 11150 | 61535330 | 5518.86 | 15970 | 15979 |
| 1598 | 1115 | 7179652 | 643.91 | 11150 | 66062947 | 5924.93 | 15980 | 15989 |
| 1599 | 1115 | 6210688 | 557.01 | 11150 | 52892260 | 4743.70 | 15990 | 15999 |
| 1600 | 1115 | 7176881 | 643.67 | 11150 | 62029923 | 5563.22 | 16000 | 16009 |
| 1601 | 1115 | 6696881 | 600.62 | 11150 | 62776823 | 5630.21 | 16010 | 16019 |
| 1602 | 1115 | 7299302 | 654.65 | 11150 | 65035485 | 5832.78 | 16020 | 16029 |
| 1603 | 1115 | 7547685 | 676.92 | 11150 | 68358616 | 6130.82 | 16030 | 16039 |
| 1604 | 1115 | 7724983 | 692.82 | 11150 | 75527096 | 6773.73 | 16040 | 16049 |
| 1605 | 1115 | 9525313 | 854.29 | 11150 | 95991386 | 8609.09 | 16050 | 16059 |
| 1606 | 1115 | 6097653 | 546.87 | 11150 | 55657215 | 4991.68 | 16060 | 16069 |
| 1607 | 1115 | 6566606 | 588.93 | 11150 | 60978485 | 5468.92 | 16070 | 16079 |
| 1608 | 1115 | 6052144 | 542.79 | 11150 | 51052492 | 4578.70 | 16080 | 16089 |
| 1609 | 1115 | 7185804 | 644.47 | 11150 | 66007753 | 5919.98 | 16090 | 16099 |
| 1610 | 1115 | 7813908 | 700.80 | 11150 | 73953517 | 6632.60 | 16100 | 16109 |
| 1611 | 1115 | 6646894 | 596.13 | 11150 | 57879440 | 5190.98 | 16110 | 16119 |
| 1612 | 1115 | 7131513 | 639.60 | 11150 | 61340641 | 5501.40 | 16120 | 16129 |
| 1613 | 1115 | 7016945 | 629.32 | 11150 | 64982872 | 5828.06 | 16130 | 16139 |
| 1614 | 1115 | 7282227 | 653.11 | 11150 | 64338184 | 5770.24 | 16140 | 16149 |
| 1615 | 1115 | 6062408 | 543.71 | 11150 | 51635865 | 4631.02 | 16150 | 16159 |
| 1616 | 1115 | 7847055 | 703.77 | 11150 | 72348899 | 6488.69 | 16160 | 16169 |
| 1617 | 1115 | 7615121 | 682.97 | 11150 | 71448480 | 6407.94 | 16170 | 16179 |
| 1618 | 1115 | 5910800 | 530.12 | 11150 | 51969491 | 4660.94 | 16180 | 16189 |
| 1619 | 1115 | 7785269 | 698.23 | 11150 | 74869562 | 6714.76 | 16190 | 16199 |
| 1620 | 1115 | 6387943 | 572.91 | 11150 | 54883915 | 4922.32 | 16200 | 16209 |
| 1621 | 1115 | 6624713 | 594.14 | 11150 | 59053322 | 5296.26 | 16210 | 16219 |
| 1622 | 1115 | 7514523 | 673.95 | 11150 | 71737401 | 6433.85 | 16220 | 16229 |
| 1623 | 1115 | 7021595 | 629.74 | 11150 | 64259697 | 5763.20 | 16230 | 16239 |
| 1624 | 1115 | 6617774 | 593.52 | 11150 | 57211336 | 5131.06 | 16240 | 16249 |
| 1625 | 1115 | 6429133 | 662.11 | 9710 | 60035717 | 6182.88 | 16250 | 16259 |
| 1626 | 935 | 6503701 | 695.58 | 9350 | 59033219 | 6313.71 | 16260 | 16269 |
| 1627 | 935 | 5253821 | 561.91 | 9350 | 45312901 | 4846.30 | 16270 | 16279 |
| 1628 | 935 | 6944748 | 742.75 | 9350 | 66270597 | 7087.76 | 16280 | 16289 |
| 1629 | 935 | 5275826 | 564.26 | 9350 | 44310537 | 4739.09 | 16290 | 16299 |
| 1630 | 935 | 6558951 | 701.49 | 9350 | 58408023 | 6246.85 | 16300 | 16309 |
| 1631 | 935 | 6724028 | 719.15 | 9350 | 61317731 | 6558.05 | 16310 | 16319 |
| 1632 | 935 | 5656856 | 605.01 | 9350 | 48425837 | 5179.23 | 16320 | 16329 |
| 1633 | 935 | 6005169 | 642.26 | 9350 | 49970801 | 5344.47 | 16330 | 16339 |
| 1634 | 935 | 5787380 | 618.97 | 9350 | 54484729 | 5827.24 | 16340 | 16349 |
| 1635 | 935 | 6279592 | 671.61 | 9350 | 54446950 | 5823.20 | 16350 | 16359 |
| 1636 | 935 | 5628443 | 601.97 | 9350 | 49587405 | 5303.47 | 16360 | 16369 |
| 1637 | 935 | 6411750 | 685.75 | 9350 | 57699633 | 6171.08 | 16370 | 16379 |
| 1638 | 935 | 6685251 | 715.00 | 9350 | 61545837 | 6582.44 | 16380 | 16389 |
| 1639 | 935 | 5703682 | 610.02 | 9350 | 50311781 | 5380.94 | 16390 | 16399 |
| 1640 | 935 | 7742634 | 828.09 | 9350 | 77840828 | 8325.22 | 16400 | 16409 |
| 1641 | 935 | 6143050 | 657.01 | 9350 | 57264848 | 6124.58 | 16410 | 16419 |
| 1642 | 935 | 7408640 | 792.37 | 9350 | 74134673 | 7928.84 | 16420 | 16429 |
| 1643 | 1115 | 4358953 | 432.87 | 10070 | 39025458 | 3875.42 | 16430 | 16439 |
| 1644 | 1115 | 7638629 | 685.08 | 11150 | 75031207 | 6729.26 | 16440 | 16449 |
| 1645 | 1115 | 6750899 | 605.46 | 11150 | 59036599 | 5294.76 | 16450 | 16459 |
| 1646 | 1115 | 7110375 | 637.70 | 11150 | 70158235 | 6292.22 | 16460 | 16469 |
| 1647 | 1115 | 7193755 | 645.18 | 11150 | 63931869 | 5733.80 | 16470 | 16479 |
| 1648 | 1115 | 6510723 | 583.92 | 11150 | 61199423 | 5488.74 | 16480 | 16489 |
| 1649 | 1115 | 7559842 | 678.01 | 11150 | 70804801 | 6350.21 | 16490 | 16499 |
| 1650 | 1115 | 6221630 | 557.99 | 11150 | 54707795 | 4906.53 | 16500 | 16509 |
| 1651 | 1115 | 7524718 | 674.86 | 11150 | 69751401 | 6255.73 | 16510 | 16519 |
| 1652 | 1115 | 7503111 | 672.92 | 11150 | 76201952 | 6834.26 | 16520 | 16529 |
| 1653 | 1115 | 6058278 | 543.34 | 11150 | 55338162 | 4963.06 | 16530 | 16539 |
| 1654 | 1115 | 7123996 | 638.92 | 11150 | 63175642 | 5665.98 | 16540 | 16549 |
| 1655 | 1115 | 6761216 | 606.39 | 11150 | 67567902 | 6059.90 | 16550 | 16559 |
| 1656 | 1115 | 6942273 | 622.63 | 11150 | 63893494 | 5730.36 | 16560 | 16569 |
| 1657 | 1115 | 6995066 | 627.36 | 11150 | 66850674 | 5995.58 | 16570 | 16579 |
| 1658 | 1115 | 7193896 | 645.19 | 11150 | 69526000 | 6235.52 | 16580 | 16589 |
| 1659 | 1115 | 6745150 | 604.95 | 11150 | 60427424 | 5419.50 | 16590 | 16599 |
| 1660 | 1115 | 6851474 | 614.48 | 11150 | 66229888 | 5939.90 | 16600 | 16609 |
| 1661 | 1115 | 7678390 | 688.64 | 11150 | 75271815 | 6750.84 | 16610 | 16619 |
| 1662 | 1115 | 6400197 | 574.01 | 11150 | 58556073 | 5251.67 | 16620 | 16629 |
| 1663 | 1115 | 7273333 | 652.32 | 11150 | 67660411 | 6068.20 | 16630 | 16639 |
| 1664 | 1115 | 5946921 | 666.70 | 8920 | 60139779 | 6742.13 | 16640 | 16647 |
| <img sr | c="rossman-e | da-in-sql-r | _files/figu | re-html/pres | sure-1.png" | width=“672” | /> |
train$Quarter <- substr(quarters(as.Date(train$Date)), 2, 2)
table(train$Quarter)
##
## 1 2 3 4
## 301049 304395 223165 188600
quartely_sales <- sqldf("
select
substr(date, 1, 4) as year
, quarter
, count(distinct store) as tot_store
, sum(customers) as tot_cust
, round(avg(customers), 2) as avg_cust
, sum(sales) as tot_sales
, round(avg(sales), 2) as avg_sales
from train
group by 1, 2;
")
summary(quartely_sales)
## year Quarter tot_store tot_cust
## Length:105 Length:105 Min. : 935 Min. : 531392
## Class :character Class :character 1st Qu.:1115 1st Qu.:6028617
## Mode :character Mode :character Median :1115 Median :6696881
## Mean :1081 Mean :6133731
## 3rd Qu.:1115 3rd Qu.:7176881
## Max. :1115 Max. :9525313
## avg_cust tot_sales avg_sales
## Min. : 350.8 Min. : 4534864 Min. : 3148
## 1st Qu.: 598.4 1st Qu.:51969491 1st Qu.: 5295
## Median : 631.9 Median :60285966 Median : 5734
## Mean : 636.1 Mean :55935054 Mean : 5827
## 3rd Qu.: 669.9 3rd Qu.:65419656 3rd Qu.: 6213
## Max. :1009.1 Max. :95991386 Max. :10819
| year | Quarter | tot_store | tot_cust | avg_cust | tot_sales | avg_sales |
|---|---|---|---|---|---|---|
| 1570 | 1 | 1115 | 2396743 | 537.51 | 20033838 | 4492.90 |
| 1571 | 1 | 1115 | 6674810 | 598.64 | 61257507 | 5493.95 |
| 1572 | 1 | 1115 | 6857427 | 615.02 | 60758739 | 5449.21 |
| 1573 | 1 | 1115 | 6050198 | 542.62 | 52238436 | 4685.06 |
| 1574 | 1 | 1115 | 7350066 | 659.20 | 66257993 | 5942.42 |
| 1575 | 1 | 1115 | 7302965 | 654.97 | 66556834 | 5969.22 |
| 1576 | 1 | 1115 | 6672469 | 598.43 | 58798163 | 5273.38 |
| 1577 | 1 | 1115 | 7028338 | 630.34 | 60634339 | 5438.06 |
| 1578 | 1 | 1115 | 7031567 | 630.63 | 66142071 | 5932.02 |
| 1579 | 1 | 1115 | 4229770 | 632.25 | 40168931 | 6004.32 |
| 1579 | 2 | 1115 | 2503968 | 561.43 | 20505679 | 4597.69 |
| 1580 | 2 | 1115 | 6773018 | 607.45 | 61058795 | 5476.13 |
| 1581 | 2 | 1115 | 7097850 | 636.58 | 59974267 | 5378.86 |
| 1582 | 2 | 1115 | 7316861 | 656.22 | 67433091 | 6047.81 |
| 1583 | 2 | 1115 | 6349850 | 569.49 | 55553797 | 4982.40 |
| 1584 | 2 | 1115 | 6586412 | 590.71 | 56752031 | 5089.87 |
| 1585 | 2 | 1115 | 6815031 | 611.21 | 64855718 | 5816.66 |
| 1586 | 2 | 1115 | 7298046 | 654.53 | 62918504 | 5642.92 |
| 1587 | 2 | 1115 | 6722795 | 602.94 | 60966372 | 5467.84 |
| 1588 | 2 | 1115 | 4615466 | 591.35 | 39526592 | 5064.27 |
| 1588 | 3 | 1115 | 2887435 | 863.21 | 28104537 | 8401.95 |
| 1589 | 3 | 1115 | 7123260 | 638.86 | 61232883 | 5491.74 |
| 1590 | 3 | 1115 | 6933002 | 621.79 | 63864886 | 5727.79 |
| 1591 | 3 | 1115 | 7734513 | 693.68 | 72510687 | 6503.20 |
| 1592 | 3 | 1115 | 6519142 | 584.68 | 56806267 | 5094.73 |
| 1593 | 3 | 1115 | 7227530 | 648.21 | 62520179 | 5607.19 |
| 1594 | 3 | 1115 | 6803763 | 610.20 | 62098160 | 5569.34 |
| 1595 | 3 | 1115 | 7400148 | 663.69 | 65419656 | 5867.23 |
| 1596 | 3 | 1115 | 6878898 | 616.94 | 57525187 | 5159.21 |
| 1597 | 3 | 1115 | 6028617 | 600.76 | 54858130 | 5466.68 |
| 1597 | 4 | 1115 | 804037 | 721.11 | 6677200 | 5988.52 |
| 1598 | 4 | 1115 | 7179652 | 643.91 | 66062947 | 5924.93 |
| 1599 | 4 | 1115 | 6210688 | 557.01 | 52892260 | 4743.70 |
| 1600 | 4 | 1115 | 7176881 | 643.67 | 62029923 | 5563.22 |
| 1601 | 4 | 1115 | 6696881 | 600.62 | 62776823 | 5630.21 |
| 1602 | 4 | 1115 | 7299302 | 654.65 | 65035485 | 5832.78 |
| 1603 | 4 | 1115 | 7547685 | 676.92 | 68358616 | 6130.82 |
| 1604 | 4 | 1115 | 7724983 | 692.82 | 75527096 | 6773.73 |
| 1605 | 4 | 1115 | 9525313 | 854.29 | 95991386 | 8609.09 |
| 1606 | 4 | 1115 | 6097653 | 546.87 | 55657215 | 4991.68 |
| 1607 | 1 | 1115 | 6035214 | 601.42 | 56443621 | 5624.68 |
| 1607 | 4 | 1115 | 531392 | 476.58 | 4534864 | 4067.14 |
| 1608 | 1 | 1115 | 6052144 | 542.79 | 51052492 | 4578.70 |
| 1609 | 1 | 1115 | 7185804 | 644.47 | 66007753 | 5919.98 |
| 1610 | 1 | 1115 | 7813908 | 700.80 | 73953517 | 6632.60 |
| 1611 | 1 | 1115 | 6646894 | 596.13 | 57879440 | 5190.98 |
| 1612 | 1 | 1115 | 7131513 | 639.60 | 61340641 | 5501.40 |
| 1613 | 1 | 1115 | 7016945 | 629.32 | 64982872 | 5828.06 |
| 1614 | 1 | 1115 | 7282227 | 653.11 | 64338184 | 5770.24 |
| 1615 | 1 | 1115 | 6062408 | 543.71 | 51635865 | 4631.02 |
| 1616 | 1 | 1115 | 1125153 | 1009.11 | 12062933 | 10818.77 |
| 1616 | 2 | 1115 | 6721902 | 669.85 | 60285966 | 6007.57 |
| 1617 | 2 | 1115 | 7615121 | 682.97 | 71448480 | 6407.94 |
| 1618 | 2 | 1115 | 5910800 | 530.12 | 51969491 | 4660.94 |
| 1619 | 2 | 1115 | 7785269 | 698.23 | 74869562 | 6714.76 |
| 1620 | 2 | 1115 | 6387943 | 572.91 | 54883915 | 4922.32 |
| 1621 | 2 | 1115 | 6624713 | 594.14 | 59053322 | 5296.26 |
| 1622 | 2 | 1115 | 7514523 | 673.95 | 71737401 | 6433.85 |
| 1623 | 2 | 1115 | 7021595 | 629.74 | 64259697 | 5763.20 |
| 1624 | 2 | 1115 | 6617774 | 593.52 | 57211336 | 5131.06 |
| 1625 | 2 | 1115 | 1231675 | 552.32 | 13566115 | 6083.46 |
| 1625 | 3 | 935 | 5197458 | 694.85 | 46469602 | 6212.51 |
| 1626 | 3 | 935 | 6503701 | 695.58 | 59033219 | 6313.71 |
| 1627 | 3 | 935 | 5253821 | 561.91 | 45312901 | 4846.30 |
| 1628 | 3 | 935 | 6944748 | 742.75 | 66270597 | 7087.76 |
| 1629 | 3 | 935 | 5275826 | 564.26 | 44310537 | 4739.09 |
| 1630 | 3 | 935 | 6558951 | 701.49 | 58408023 | 6246.85 |
| 1631 | 3 | 935 | 6724028 | 719.15 | 61317731 | 6558.05 |
| 1632 | 3 | 935 | 5656856 | 605.01 | 48425837 | 5179.23 |
| 1633 | 3 | 935 | 6005169 | 642.26 | 49970801 | 5344.47 |
| 1634 | 3 | 935 | 2363140 | 631.86 | 22638102 | 6052.97 |
| 1634 | 4 | 935 | 3424240 | 610.38 | 31846627 | 5676.76 |
| 1635 | 4 | 935 | 6279592 | 671.61 | 54446950 | 5823.20 |
| 1636 | 4 | 935 | 5628443 | 601.97 | 49587405 | 5303.47 |
| 1637 | 4 | 935 | 6411750 | 685.75 | 57699633 | 6171.08 |
| 1638 | 4 | 935 | 6685251 | 715.00 | 61545837 | 6582.44 |
| 1639 | 4 | 935 | 5703682 | 610.02 | 50311781 | 5380.94 |
| 1640 | 4 | 935 | 7742634 | 828.09 | 77840828 | 8325.22 |
| 1641 | 4 | 935 | 6143050 | 657.01 | 57264848 | 6124.58 |
| 1642 | 4 | 935 | 7408640 | 792.37 | 74134673 | 7928.84 |
| 1643 | 1 | 1115 | 1564487 | 350.78 | 14039097 | 3147.78 |
| 1643 | 4 | 935 | 2794466 | 498.12 | 24986361 | 4453.90 |
| 1644 | 1 | 1115 | 7638629 | 685.08 | 75031207 | 6729.26 |
| 1645 | 1 | 1115 | 6750899 | 605.46 | 59036599 | 5294.76 |
| 1646 | 1 | 1115 | 7110375 | 637.70 | 70158235 | 6292.22 |
| 1647 | 1 | 1115 | 7193755 | 645.18 | 63931869 | 5733.80 |
| 1648 | 1 | 1115 | 6510723 | 583.92 | 61199423 | 5488.74 |
| 1649 | 1 | 1115 | 7559842 | 678.01 | 70804801 | 6350.21 |
| 1650 | 1 | 1115 | 6221630 | 557.99 | 54707795 | 4906.53 |
| 1651 | 1 | 1115 | 7524718 | 674.86 | 69751401 | 6255.73 |
| 1652 | 1 | 1115 | 4326445 | 646.70 | 44090444 | 6590.50 |
| 1652 | 2 | 1115 | 3176666 | 712.26 | 32111508 | 7199.89 |
| 1653 | 2 | 1115 | 6058278 | 543.34 | 55338162 | 4963.06 |
| 1654 | 2 | 1115 | 7123996 | 638.92 | 63175642 | 5665.98 |
| 1655 | 2 | 1115 | 6761216 | 606.39 | 67567902 | 6059.90 |
| 1656 | 2 | 1115 | 6942273 | 622.63 | 63893494 | 5730.36 |
| 1657 | 2 | 1115 | 6995066 | 627.36 | 66850674 | 5995.58 |
| 1658 | 2 | 1115 | 7193896 | 645.19 | 69526000 | 6235.52 |
| 1659 | 2 | 1115 | 6745150 | 604.95 | 60427424 | 5419.50 |
| 1660 | 2 | 1115 | 6851474 | 614.48 | 66229888 | 5939.90 |
| 1661 | 2 | 1115 | 5044953 | 646.37 | 49305462 | 6317.16 |
| 1661 | 3 | 1115 | 2633437 | 787.28 | 25966353 | 7762.74 |
| 1662 | 3 | 1115 | 6400197 | 574.01 | 58556073 | 5251.67 |
| 1663 | 3 | 1115 | 7273333 | 652.32 | 67660411 | 6068.20 |
| 1664 | 3 | 1115 | 5946921 | 666.70 | 60139779 | 6742.13 |
sales_by_store_type <- sqldf("
select
storetype
, count(distinct a.store) as tot_store
, sum(customers) as tot_cust
, round(avg(customers), 2) as avg_cust
, sum(sales) as tot_sales
, round(avg(sales), 2) as avg_sales
, substr(min(date), 1, 10) as first_purchase_date
, substr(max(date), 1, 10) as last_purchase_date
from train a
left join store b
on a.store = b.store
group by 1;
")
summary(sales_by_store_type)
## StoreType tot_store tot_cust avg_cust
## Length:4 Min. : 17.0 Min. : 31465621 Min. : 501.4
## Class :character 1st Qu.:115.2 1st Qu.: 76963684 1st Qu.: 619.6
## Mode :character Median :248.0 Median :124517350 Median : 666.1
## Mean :278.8 Mean :161010439 Mean : 955.4
## 3rd Qu.:411.5 3rd Qu.:208564105 3rd Qu.:1001.9
## Max. :602.0 Max. :363541434 Max. :1987.7
## tot_sales avg_sales first_purchase_date
## Min. : 159231395 Min. : 5642 Length:4
## 1st Qu.: 783221426 1st Qu.: 5703 Class :character
## Median : 783221426 Median : 5731 Mode :character
## Mean :1468295155 Mean : 6791
## 3rd Qu.:1765392943 3rd Qu.: 6818
## Max. :3165334859 Max. :10059
## last_purchase_date
## Length:4
## Class :character
## Mode :character
##
##
##
| StoreType | tot_store | tot_cust | avg_cust | tot_sales | avg_sales | first_purchase_date | last_purchase_date |
|---|---|---|---|---|---|---|---|
| a | 602 | 363541434 | 659.03 | 3165334859 | 5738.18 | 15706.0 | 16647.0 |
| b | 17 | 31465621 | 1987.72 | 159231395 | 10058.84 | 15706.0 | 16647.0 |
| c | 148 | 92129705 | 673.27 | 783221426 | 5723.63 | 15706.0 | 16647.0 |
| d | 348 | 156904995 | 501.43 | 1765392943 | 5641.82 | 15706.0 | 16647.0 |
base <- sqldf("select * from train a left join store b on a.store = b.store")
head(base)
## Store DayOfWeek Date Sales Customers Open Promo StateHoliday
## 1 1 2 2013-01-01 0 0 0 0 a
## 2 2 2 2013-01-01 0 0 0 0 a
## 3 3 2 2013-01-01 0 0 0 0 a
## 4 4 2 2013-01-01 0 0 0 0 a
## 5 5 2 2013-01-01 0 0 0 0 a
## 6 6 2 2013-01-01 0 0 0 0 a
## SchoolHoliday Quarter Store..11 StoreType Assortment CompetitionDistance
## 1 1 1 1 c a 1270
## 2 1 1 2 a a 570
## 3 1 1 3 a a 14130
## 4 1 1 4 c c 620
## 5 1 1 5 a a 29910
## 6 1 1 6 a a 310
## CompetitionOpenSinceMonth CompetitionOpenSinceYear Promo2
## 1 9 2008 0
## 2 11 2007 1
## 3 12 2006 1
## 4 9 2009 0
## 5 4 2015 0
## 6 12 2013 0
## Promo2SinceWeek Promo2SinceYear PromoInterval
## 1 NA NA
## 2 13 2010 Jan,Apr,Jul,Oct
## 3 14 2011 Jan,Apr,Jul,Oct
## 4 NA NA
## 5 NA NA
## 6 NA NA
sales_by_store_type_year <- sqldf("
select
storetype
, substr(date, 1, 4) as year
, count(distinct a.store) as tot_store
, sum(customers) as tot_cust
, round(avg(customers), 2) as avg_cust
, sum(sales) as tot_sales
, round(avg(sales), 2) as avg_sales
, substr(min(date), 1, 10) as first_purchase_date
, substr(max(date), 1, 10) as last_purchase_date
from train a
left join store b
on a.store = b.store
group by 1, 2;
")
summary(sales_by_store_type_year)
## StoreType year tot_store tot_cust
## Length:380 Length:380 Min. : 16.0 Min. : 95689
## Class :character Class :character 1st Qu.:104.8 1st Qu.: 382608
## Mode :character Mode :character Median :207.5 Median :1188746
## Mean :270.7 Mean :1694847
## 3rd Qu.:390.5 3rd Qu.:2057489
## Max. :602.0 Max. :5421709
## avg_cust tot_sales avg_sales first_purchase_date
## Min. : 346.2 Min. : 452444 Min. : 3666 Length:380
## 1st Qu.: 564.0 1st Qu.: 2531426 1st Qu.: 5362 Class :character
## Median : 666.5 Median :11202269 Median : 5935 Mode :character
## Mean : 954.9 Mean :15455738 Mean : 6789
## 3rd Qu.:1061.7 3rd Qu.:23607748 3rd Qu.: 8265
## Max. :2333.6 Max. :51645134 Max. :13309
## last_purchase_date
## Length:380
## Class :character
## Mode :character
##
##
##
| StoreType | year | tot_store | tot_cust | avg_cust | tot_sales | avg_sales | first_purchase_date | last_purchase_date |
|---|---|---|---|---|---|---|---|---|
| a | 1570 | 602 | 1381681 | 574.03 | 11061535 | 4595.57 | 15706.0 | 15709.0 |
| a | 1571 | 602 | 3801168 | 631.42 | 33372558 | 5543.61 | 15710.0 | 15719.0 |
| a | 1572 | 602 | 3960231 | 657.85 | 33559157 | 5574.61 | 15720.0 | 15729.0 |
| a | 1573 | 602 | 3409873 | 566.42 | 28179401 | 4680.96 | 15730.0 | 15739.0 |
| a | 1574 | 602 | 4230332 | 702.71 | 36618070 | 6082.74 | 15740.0 | 15749.0 |
| a | 1575 | 602 | 4129331 | 685.94 | 35926320 | 5967.83 | 15750.0 | 15759.0 |
| a | 1576 | 602 | 3858562 | 640.96 | 32599743 | 5415.24 | 15760.0 | 15769.0 |
| a | 1577 | 602 | 4041632 | 671.37 | 33163239 | 5508.84 | 15770.0 | 15779.0 |
| a | 1578 | 602 | 3948851 | 655.96 | 35611874 | 5915.59 | 15780.0 | 15789.0 |
| a | 1579 | 602 | 3781699 | 628.19 | 32688011 | 5429.90 | 15790.0 | 15799.0 |
| a | 1580 | 602 | 3806589 | 632.32 | 32827650 | 5453.10 | 15800.0 | 15809.0 |
| a | 1581 | 602 | 4139782 | 687.67 | 33341812 | 5538.51 | 15810.0 | 15819.0 |
| a | 1582 | 602 | 4121731 | 684.67 | 36307578 | 6031.16 | 15820.0 | 15829.0 |
| a | 1583 | 602 | 3592786 | 596.81 | 30104569 | 5000.76 | 15830.0 | 15839.0 |
| a | 1584 | 602 | 3722802 | 618.41 | 30644657 | 5090.47 | 15840.0 | 15849.0 |
| a | 1585 | 602 | 3798311 | 630.95 | 34569096 | 5742.37 | 15850.0 | 15859.0 |
| a | 1586 | 602 | 4241696 | 704.60 | 34851646 | 5789.31 | 15860.0 | 15869.0 |
| a | 1587 | 602 | 3768813 | 626.05 | 32598276 | 5415.00 | 15870.0 | 15879.0 |
| a | 1588 | 602 | 4286730 | 712.08 | 36758114 | 6106.00 | 15880.0 | 15889.0 |
| a | 1589 | 602 | 4044443 | 671.83 | 33036478 | 5487.79 | 15890.0 | 15899.0 |
| a | 1590 | 602 | 3960302 | 657.86 | 34869393 | 5792.26 | 15900.0 | 15909.0 |
| a | 1591 | 602 | 4399036 | 730.74 | 39227659 | 6516.22 | 15910.0 | 15919.0 |
| a | 1592 | 602 | 3675399 | 610.53 | 30565692 | 5077.36 | 15920.0 | 15929.0 |
| a | 1593 | 602 | 4135894 | 687.03 | 33985317 | 5645.40 | 15930.0 | 15939.0 |
| a | 1594 | 602 | 3811301 | 633.11 | 33057368 | 5491.26 | 15940.0 | 15949.0 |
| a | 1595 | 602 | 4238893 | 704.14 | 35570832 | 5908.78 | 15950.0 | 15959.0 |
| a | 1596 | 602 | 3866472 | 642.27 | 30733566 | 5105.24 | 15960.0 | 15969.0 |
| a | 1597 | 602 | 3897768 | 647.47 | 33491365 | 5563.35 | 15970.0 | 15979.0 |
| a | 1598 | 602 | 4047020 | 672.26 | 35499462 | 5896.92 | 15980.0 | 15989.0 |
| a | 1599 | 602 | 3484668 | 578.85 | 28350973 | 4709.46 | 15990.0 | 15999.0 |
| a | 1600 | 602 | 4114440 | 683.46 | 33848825 | 5622.73 | 16000.0 | 16009.0 |
| a | 1601 | 602 | 3700907 | 614.77 | 33204850 | 5515.76 | 16010.0 | 16019.0 |
| a | 1602 | 602 | 4178903 | 694.17 | 35585370 | 5911.19 | 16020.0 | 16029.0 |
| a | 1603 | 602 | 4247882 | 705.63 | 36543615 | 6070.37 | 16030.0 | 16039.0 |
| a | 1604 | 602 | 4387935 | 728.89 | 41005733 | 6811.58 | 16040.0 | 16049.0 |
| a | 1605 | 602 | 5421709 | 900.62 | 51645134 | 8578.93 | 16050.0 | 16059.0 |
| a | 1606 | 602 | 3276255 | 544.23 | 28609040 | 4752.33 | 16060.0 | 16069.0 |
| a | 1607 | 602 | 3705394 | 615.51 | 32885012 | 5462.63 | 16070.0 | 16079.0 |
| a | 1608 | 602 | 3362299 | 558.52 | 27000278 | 4485.10 | 16080.0 | 16089.0 |
| a | 1609 | 602 | 4078818 | 677.54 | 35603342 | 5914.18 | 16090.0 | 16099.0 |
| a | 1610 | 602 | 4377430 | 727.15 | 39453138 | 6553.68 | 16100.0 | 16109.0 |
| a | 1611 | 602 | 3766229 | 625.62 | 31314036 | 5201.67 | 16110.0 | 16119.0 |
| a | 1612 | 602 | 4043967 | 671.76 | 33091993 | 5497.01 | 16120.0 | 16129.0 |
| a | 1613 | 602 | 3931903 | 653.14 | 34758584 | 5773.85 | 16130.0 | 16139.0 |
| a | 1614 | 602 | 4165343 | 691.92 | 34922933 | 5801.15 | 16140.0 | 16149.0 |
| a | 1615 | 602 | 3369547 | 559.73 | 27294077 | 4533.90 | 16150.0 | 16159.0 |
| a | 1616 | 602 | 4493914 | 746.50 | 39449044 | 6553.00 | 16160.0 | 16169.0 |
| a | 1617 | 602 | 4218665 | 700.77 | 37698932 | 6262.28 | 16170.0 | 16179.0 |
| a | 1618 | 602 | 3323533 | 552.08 | 27976421 | 4647.25 | 16180.0 | 16189.0 |
| a | 1619 | 602 | 4356117 | 723.61 | 39925213 | 6632.10 | 16190.0 | 16199.0 |
| a | 1620 | 602 | 3556459 | 590.77 | 29198742 | 4850.29 | 16200.0 | 16209.0 |
| a | 1621 | 602 | 3717388 | 617.51 | 31559985 | 5242.52 | 16210.0 | 16219.0 |
| a | 1622 | 602 | 4205684 | 698.62 | 38266113 | 6356.50 | 16220.0 | 16229.0 |
| a | 1623 | 602 | 3952518 | 656.56 | 34425556 | 5718.53 | 16230.0 | 16239.0 |
| a | 1624 | 602 | 3703239 | 615.16 | 30445315 | 5057.36 | 16240.0 | 16249.0 |
| a | 1625 | 602 | 3684965 | 689.04 | 33130813 | 6194.99 | 16250.0 | 16259.0 |
| a | 1626 | 518 | 3738232 | 721.67 | 32660630 | 6305.14 | 16260.0 | 16269.0 |
| a | 1627 | 518 | 2955941 | 570.64 | 24575234 | 4744.25 | 16270.0 | 16279.0 |
| a | 1628 | 518 | 4011050 | 774.33 | 36682098 | 7081.49 | 16280.0 | 16289.0 |
| a | 1629 | 518 | 2977777 | 574.86 | 24049856 | 4642.83 | 16290.0 | 16299.0 |
| a | 1630 | 518 | 3803164 | 734.20 | 32636743 | 6300.53 | 16300.0 | 16309.0 |
| a | 1631 | 518 | 3858260 | 744.84 | 33803223 | 6525.72 | 16310.0 | 16319.0 |
| a | 1632 | 518 | 3237196 | 624.94 | 26712729 | 5156.90 | 16320.0 | 16329.0 |
| a | 1633 | 518 | 3451616 | 666.34 | 27622993 | 5332.62 | 16330.0 | 16339.0 |
| a | 1634 | 518 | 3263718 | 630.06 | 29707671 | 5735.07 | 16340.0 | 16349.0 |
| a | 1635 | 518 | 3622097 | 699.25 | 30162885 | 5822.95 | 16350.0 | 16359.0 |
| a | 1636 | 518 | 3156001 | 609.27 | 26825544 | 5178.68 | 16360.0 | 16369.0 |
| a | 1637 | 518 | 3685417 | 711.47 | 31905223 | 6159.31 | 16370.0 | 16379.0 |
| a | 1638 | 518 | 3817077 | 736.89 | 33672524 | 6500.49 | 16380.0 | 16389.0 |
| a | 1639 | 518 | 3264692 | 630.25 | 27850019 | 5376.45 | 16390.0 | 16399.0 |
| a | 1640 | 518 | 4430513 | 855.31 | 42777239 | 8258.15 | 16400.0 | 16409.0 |
| a | 1641 | 518 | 3465356 | 668.99 | 31084612 | 6000.89 | 16410.0 | 16419.0 |
| a | 1642 | 518 | 4144230 | 800.04 | 39778185 | 7679.19 | 16420.0 | 16429.0 |
| a | 1643 | 602 | 2326288 | 421.73 | 20218892 | 3665.50 | 16430.0 | 16439.0 |
| a | 1644 | 602 | 4281848 | 711.27 | 40011342 | 6646.40 | 16440.0 | 16449.0 |
| a | 1645 | 602 | 3741684 | 621.54 | 31155384 | 5175.31 | 16450.0 | 16459.0 |
| a | 1646 | 602 | 3967209 | 659.00 | 37403276 | 6213.17 | 16460.0 | 16469.0 |
| a | 1647 | 602 | 4033112 | 669.95 | 34160318 | 5674.47 | 16470.0 | 16479.0 |
| a | 1648 | 602 | 3564756 | 592.15 | 32029931 | 5320.59 | 16480.0 | 16489.0 |
| a | 1649 | 602 | 4269942 | 709.29 | 38099127 | 6328.76 | 16490.0 | 16499.0 |
| a | 1650 | 602 | 3439258 | 571.31 | 28736542 | 4773.51 | 16500.0 | 16509.0 |
| a | 1651 | 602 | 4273290 | 709.85 | 37646065 | 6253.50 | 16510.0 | 16519.0 |
| a | 1652 | 602 | 4062614 | 674.85 | 39418472 | 6547.92 | 16520.0 | 16529.0 |
| a | 1653 | 602 | 3398092 | 564.47 | 29785338 | 4947.73 | 16530.0 | 16539.0 |
| a | 1654 | 602 | 4045966 | 672.09 | 34048808 | 5655.95 | 16540.0 | 16549.0 |
| a | 1655 | 602 | 3706823 | 615.75 | 35380379 | 5877.14 | 16550.0 | 16559.0 |
| a | 1656 | 602 | 3907752 | 649.13 | 34202095 | 5681.41 | 16560.0 | 16569.0 |
| a | 1657 | 602 | 3876297 | 643.90 | 35263471 | 5857.72 | 16570.0 | 16579.0 |
| a | 1658 | 602 | 4023189 | 668.30 | 37031211 | 6151.36 | 16580.0 | 16589.0 |
| a | 1659 | 602 | 3766710 | 625.70 | 32102854 | 5332.70 | 16590.0 | 16599.0 |
| a | 1660 | 602 | 3813574 | 633.48 | 35116092 | 5833.24 | 16600.0 | 16609.0 |
| a | 1661 | 602 | 4336170 | 720.29 | 40511265 | 6729.45 | 16610.0 | 16619.0 |
| a | 1662 | 602 | 3527516 | 585.97 | 30790107 | 5114.64 | 16620.0 | 16629.0 |
| a | 1663 | 602 | 4089356 | 679.30 | 36196416 | 6012.69 | 16630.0 | 16639.0 |
| a | 1664 | 602 | 3282412 | 681.56 | 31506561 | 6542.06 | 16640.0 | 16647.0 |
| b | 1570 | 17 | 95689 | 1407.19 | 452444 | 6653.59 | 15706.0 | 15709.0 |
| b | 1571 | 17 | 263628 | 1550.75 | 1341506 | 7891.21 | 15710.0 | 15719.0 |
| b | 1572 | 17 | 288121 | 1694.83 | 1412593 | 8309.37 | 15720.0 | 15729.0 |
| b | 1573 | 17 | 279135 | 1641.97 | 1347179 | 7924.58 | 15730.0 | 15739.0 |
| b | 1574 | 17 | 294783 | 1734.02 | 1472923 | 8664.25 | 15740.0 | 15749.0 |
| b | 1575 | 17 | 296388 | 1743.46 | 1468459 | 8637.99 | 15750.0 | 15759.0 |
| b | 1576 | 17 | 299284 | 1760.49 | 1478822 | 8698.95 | 15760.0 | 15769.0 |
| b | 1577 | 17 | 286244 | 1683.79 | 1395383 | 8208.14 | 15770.0 | 15779.0 |
| b | 1578 | 17 | 301171 | 1771.59 | 1540938 | 9064.34 | 15780.0 | 15789.0 |
| b | 1579 | 17 | 319739 | 1880.82 | 1687423 | 9926.02 | 15790.0 | 15799.0 |
| b | 1580 | 17 | 309201 | 1818.83 | 1513123 | 8900.72 | 15800.0 | 15809.0 |
| b | 1581 | 17 | 301033 | 1770.78 | 1402424 | 8249.55 | 15810.0 | 15819.0 |
| b | 1582 | 17 | 334942 | 1970.25 | 1684582 | 9909.31 | 15820.0 | 15829.0 |
| b | 1583 | 17 | 331940 | 1952.59 | 1630043 | 9588.49 | 15830.0 | 15839.0 |
| b | 1584 | 17 | 332452 | 1955.60 | 1607734 | 9457.26 | 15840.0 | 15849.0 |
| b | 1585 | 17 | 326845 | 1922.62 | 1692096 | 9953.51 | 15850.0 | 15859.0 |
| b | 1586 | 17 | 330021 | 1941.30 | 1550025 | 9117.79 | 15860.0 | 15869.0 |
| b | 1587 | 17 | 341042 | 2006.13 | 1615111 | 9500.65 | 15870.0 | 15879.0 |
| b | 1588 | 17 | 327926 | 1928.98 | 1606825 | 9451.91 | 15880.0 | 15889.0 |
| b | 1589 | 17 | 329279 | 1936.94 | 1512013 | 8894.19 | 15890.0 | 15899.0 |
| b | 1590 | 17 | 339893 | 1999.37 | 1612376 | 9484.56 | 15900.0 | 15909.0 |
| b | 1591 | 17 | 336043 | 1976.72 | 1603569 | 9432.76 | 15910.0 | 15919.0 |
| b | 1592 | 17 | 316927 | 1864.28 | 1514549 | 8909.11 | 15920.0 | 15929.0 |
| b | 1593 | 17 | 315409 | 1855.35 | 1506347 | 8860.86 | 15930.0 | 15939.0 |
| b | 1594 | 17 | 322913 | 1899.49 | 1609226 | 9466.04 | 15940.0 | 15949.0 |
| b | 1595 | 17 | 323744 | 1904.38 | 1592932 | 9370.19 | 15950.0 | 15959.0 |
| b | 1596 | 17 | 312378 | 1837.52 | 1482225 | 8718.97 | 15960.0 | 15969.0 |
| b | 1597 | 17 | 324773 | 1910.43 | 1631922 | 9599.54 | 15970.0 | 15979.0 |
| b | 1598 | 17 | 342066 | 2012.15 | 1773038 | 10429.64 | 15980.0 | 15989.0 |
| b | 1599 | 17 | 322238 | 1895.52 | 1564748 | 9204.40 | 15990.0 | 15999.0 |
| b | 1600 | 17 | 333769 | 1963.35 | 1614167 | 9495.10 | 16000.0 | 16009.0 |
| b | 1601 | 17 | 342141 | 2012.59 | 1776775 | 10451.62 | 16010.0 | 16019.0 |
| b | 1602 | 17 | 334274 | 1966.32 | 1665331 | 9796.06 | 16020.0 | 16029.0 |
| b | 1603 | 17 | 339241 | 1995.54 | 1711531 | 10067.83 | 16030.0 | 16039.0 |
| b | 1604 | 17 | 370624 | 2180.14 | 2025107 | 11912.39 | 16040.0 | 16049.0 |
| b | 1605 | 17 | 373889 | 2199.35 | 2048429 | 12049.58 | 16050.0 | 16059.0 |
| b | 1606 | 17 | 299368 | 1760.99 | 1693822 | 9963.66 | 16060.0 | 16069.0 |
| b | 1607 | 17 | 289590 | 1703.47 | 1556287 | 9154.63 | 16070.0 | 16079.0 |
| b | 1608 | 17 | 310913 | 1828.90 | 1528287 | 8989.92 | 16080.0 | 16089.0 |
| b | 1609 | 17 | 327293 | 1925.25 | 1660635 | 9768.44 | 16090.0 | 16099.0 |
| b | 1610 | 17 | 344002 | 2023.54 | 1790440 | 10532.00 | 16100.0 | 16109.0 |
| b | 1611 | 17 | 338860 | 1993.29 | 1683921 | 9905.42 | 16110.0 | 16119.0 |
| b | 1612 | 17 | 332683 | 1956.96 | 1607754 | 9457.38 | 16120.0 | 16129.0 |
| b | 1613 | 17 | 346052 | 2035.60 | 1784170 | 10495.12 | 16130.0 | 16139.0 |
| b | 1614 | 17 | 331995 | 1952.91 | 1619996 | 9529.39 | 16140.0 | 16149.0 |
| b | 1615 | 17 | 326279 | 1919.29 | 1558135 | 9165.50 | 16150.0 | 16159.0 |
| b | 1616 | 17 | 356075 | 2094.56 | 1787256 | 10513.27 | 16160.0 | 16169.0 |
| b | 1617 | 17 | 360436 | 2120.21 | 1862310 | 10954.76 | 16170.0 | 16179.0 |
| b | 1618 | 17 | 338301 | 1990.01 | 1655261 | 9736.83 | 16180.0 | 16189.0 |
| b | 1619 | 17 | 373081 | 2194.59 | 1955990 | 11505.82 | 16190.0 | 16199.0 |
| b | 1620 | 17 | 349863 | 2058.02 | 1698165 | 9989.21 | 16200.0 | 16209.0 |
| b | 1621 | 17 | 366944 | 2158.49 | 1797745 | 10574.97 | 16210.0 | 16219.0 |
| b | 1622 | 17 | 378428 | 2226.05 | 1929616 | 11350.68 | 16220.0 | 16229.0 |
| b | 1623 | 17 | 355288 | 2089.93 | 1762566 | 10368.04 | 16230.0 | 16239.0 |
| b | 1624 | 17 | 342976 | 2017.51 | 1642237 | 9660.22 | 16240.0 | 16249.0 |
| b | 1625 | 17 | 361400 | 2230.86 | 1836094 | 11333.91 | 16250.0 | 16259.0 |
| b | 1626 | 16 | 343642 | 2147.76 | 1674910 | 10468.19 | 16260.0 | 16269.0 |
| b | 1627 | 16 | 331194 | 2069.96 | 1535405 | 9596.28 | 16270.0 | 16279.0 |
| b | 1628 | 16 | 337117 | 2106.98 | 1728042 | 10800.26 | 16280.0 | 16289.0 |
| b | 1629 | 16 | 314637 | 1966.48 | 1519499 | 9496.87 | 16290.0 | 16299.0 |
| b | 1630 | 16 | 312581 | 1953.63 | 1577857 | 9861.61 | 16300.0 | 16309.0 |
| b | 1631 | 16 | 326615 | 2041.34 | 1684765 | 10529.78 | 16310.0 | 16319.0 |
| b | 1632 | 16 | 324682 | 2029.26 | 1603491 | 10021.82 | 16320.0 | 16329.0 |
| b | 1633 | 16 | 324840 | 2030.25 | 1561672 | 9760.45 | 16330.0 | 16339.0 |
| b | 1634 | 16 | 346241 | 2164.01 | 1835617 | 11472.61 | 16340.0 | 16349.0 |
| b | 1635 | 16 | 323765 | 2023.53 | 1623288 | 10145.55 | 16350.0 | 16359.0 |
| b | 1636 | 16 | 321085 | 2006.78 | 1614259 | 10089.12 | 16360.0 | 16369.0 |
| b | 1637 | 16 | 345495 | 2159.34 | 1785990 | 11162.44 | 16370.0 | 16379.0 |
| b | 1638 | 16 | 312800 | 1955.00 | 1645113 | 10281.96 | 16380.0 | 16389.0 |
| b | 1639 | 16 | 333970 | 2087.31 | 1720390 | 10752.44 | 16390.0 | 16399.0 |
| b | 1640 | 16 | 373380 | 2333.62 | 2129488 | 13309.30 | 16400.0 | 16409.0 |
| b | 1641 | 16 | 347661 | 2172.88 | 1835406 | 11471.29 | 16410.0 | 16419.0 |
| b | 1642 | 16 | 335248 | 2095.30 | 1941576 | 12134.85 | 16420.0 | 16429.0 |
| b | 1643 | 17 | 281328 | 1715.41 | 1599963 | 9755.87 | 16430.0 | 16439.0 |
| b | 1644 | 17 | 332201 | 1954.12 | 1775637 | 10444.92 | 16440.0 | 16449.0 |
| b | 1645 | 17 | 323231 | 1901.36 | 1598432 | 9402.54 | 16450.0 | 16459.0 |
| b | 1646 | 17 | 353161 | 2077.42 | 1920799 | 11298.82 | 16460.0 | 16469.0 |
| b | 1647 | 17 | 346216 | 2036.56 | 1739266 | 10230.98 | 16470.0 | 16479.0 |
| b | 1648 | 17 | 339689 | 1998.17 | 1782685 | 10486.38 | 16480.0 | 16489.0 |
| b | 1649 | 17 | 348344 | 2049.08 | 1828037 | 10753.16 | 16490.0 | 16499.0 |
| b | 1650 | 17 | 338798 | 1992.93 | 1702805 | 10016.50 | 16500.0 | 16509.0 |
| b | 1651 | 17 | 347514 | 2044.20 | 1753033 | 10311.96 | 16510.0 | 16519.0 |
| b | 1652 | 17 | 357067 | 2100.39 | 1997706 | 11751.21 | 16520.0 | 16529.0 |
| b | 1653 | 17 | 350996 | 2064.68 | 1808989 | 10641.11 | 16530.0 | 16539.0 |
| b | 1654 | 17 | 345186 | 2030.51 | 1692611 | 9956.54 | 16540.0 | 16549.0 |
| b | 1655 | 17 | 373500 | 2197.06 | 2047805 | 12045.91 | 16550.0 | 16559.0 |
| b | 1656 | 17 | 356708 | 2098.28 | 1852855 | 10899.15 | 16560.0 | 16569.0 |
| b | 1657 | 17 | 349804 | 2057.67 | 1832335 | 10778.44 | 16570.0 | 16579.0 |
| b | 1658 | 17 | 374942 | 2205.54 | 2006269 | 11801.58 | 16580.0 | 16589.0 |
| b | 1659 | 17 | 365930 | 2152.53 | 1779947 | 10470.28 | 16590.0 | 16599.0 |
| b | 1660 | 17 | 363034 | 2135.49 | 1903003 | 11194.14 | 16600.0 | 16609.0 |
| b | 1661 | 17 | 384001 | 2258.83 | 1939562 | 11409.19 | 16610.0 | 16619.0 |
| b | 1662 | 17 | 364083 | 2141.66 | 1792156 | 10542.09 | 16620.0 | 16629.0 |
| b | 1663 | 17 | 365557 | 2150.34 | 1810468 | 10649.81 | 16630.0 | 16639.0 |
| b | 1664 | 17 | 280366 | 2061.51 | 1495659 | 10997.49 | 16640.0 | 16647.0 |
| c | 1570 | 148 | 352083 | 594.73 | 2665406 | 4502.37 | 15706.0 | 15709.0 |
| c | 1571 | 148 | 960910 | 649.26 | 8074615 | 5455.82 | 15710.0 | 15719.0 |
| c | 1572 | 148 | 973353 | 657.67 | 7928935 | 5357.39 | 15720.0 | 15729.0 |
| c | 1573 | 148 | 882210 | 596.09 | 7007874 | 4735.05 | 15730.0 | 15739.0 |
| c | 1574 | 148 | 1050163 | 709.57 | 8654653 | 5847.74 | 15740.0 | 15749.0 |
| c | 1575 | 148 | 1069063 | 722.34 | 8975827 | 6064.75 | 15750.0 | 15759.0 |
| c | 1576 | 148 | 939634 | 634.89 | 7662938 | 5177.66 | 15760.0 | 15769.0 |
| c | 1577 | 148 | 999485 | 675.33 | 7941404 | 5365.81 | 15770.0 | 15779.0 |
| c | 1578 | 148 | 1041551 | 703.75 | 8972195 | 6062.29 | 15780.0 | 15789.0 |
| c | 1579 | 148 | 995793 | 672.83 | 8253974 | 5577.01 | 15790.0 | 15799.0 |
| c | 1580 | 148 | 978896 | 661.42 | 8132295 | 5494.79 | 15800.0 | 15809.0 |
| c | 1581 | 148 | 978278 | 661.00 | 7758436 | 5242.19 | 15810.0 | 15819.0 |
| c | 1582 | 148 | 1049454 | 709.09 | 9042509 | 6109.80 | 15820.0 | 15829.0 |
| c | 1583 | 148 | 884863 | 597.88 | 7227502 | 4883.45 | 15830.0 | 15839.0 |
| c | 1584 | 148 | 925372 | 625.25 | 7404165 | 5002.81 | 15840.0 | 15849.0 |
| c | 1585 | 148 | 1006729 | 680.22 | 8911765 | 6021.46 | 15850.0 | 15859.0 |
| c | 1586 | 148 | 992906 | 670.88 | 8027994 | 5424.32 | 15860.0 | 15869.0 |
| c | 1587 | 148 | 943956 | 637.81 | 8071549 | 5453.75 | 15870.0 | 15879.0 |
| c | 1588 | 148 | 1061105 | 716.96 | 8974249 | 6063.68 | 15880.0 | 15889.0 |
| c | 1589 | 148 | 982686 | 663.98 | 7937653 | 5363.28 | 15890.0 | 15899.0 |
| c | 1590 | 148 | 941245 | 635.98 | 8111579 | 5480.80 | 15900.0 | 15909.0 |
| c | 1591 | 148 | 1087793 | 735.00 | 9507036 | 6423.67 | 15910.0 | 15919.0 |
| c | 1592 | 148 | 918304 | 620.48 | 7439361 | 5026.60 | 15920.0 | 15929.0 |
| c | 1593 | 148 | 1012297 | 683.98 | 8167706 | 5518.72 | 15930.0 | 15939.0 |
| c | 1594 | 148 | 958412 | 647.58 | 8205785 | 5544.45 | 15940.0 | 15949.0 |
| c | 1595 | 148 | 1023015 | 691.23 | 8459669 | 5715.99 | 15950.0 | 15959.0 |
| c | 1596 | 148 | 984022 | 664.88 | 7716372 | 5213.76 | 15960.0 | 15969.0 |
| c | 1597 | 148 | 948754 | 641.05 | 7999015 | 5404.74 | 15970.0 | 15979.0 |
| c | 1598 | 148 | 1019709 | 688.99 | 8702411 | 5880.01 | 15980.0 | 15989.0 |
| c | 1599 | 148 | 884635 | 597.73 | 6996844 | 4727.60 | 15990.0 | 15999.0 |
| c | 1600 | 148 | 988406 | 667.84 | 7952830 | 5373.53 | 16000.0 | 16009.0 |
| c | 1601 | 148 | 986589 | 666.61 | 8552979 | 5779.04 | 16010.0 | 16019.0 |
| c | 1602 | 148 | 1020387 | 689.45 | 8408217 | 5681.23 | 16020.0 | 16029.0 |
| c | 1603 | 148 | 1108520 | 749.00 | 9298760 | 6282.95 | 16030.0 | 16039.0 |
| c | 1604 | 148 | 1117649 | 755.17 | 10124286 | 6840.73 | 16040.0 | 16049.0 |
| c | 1605 | 148 | 1400775 | 946.47 | 13116710 | 8862.64 | 16050.0 | 16059.0 |
| c | 1606 | 148 | 928356 | 627.27 | 7795266 | 5267.07 | 16060.0 | 16069.0 |
| c | 1607 | 148 | 943734 | 637.66 | 8031585 | 5426.75 | 16070.0 | 16079.0 |
| c | 1608 | 148 | 858487 | 580.06 | 6689421 | 4519.88 | 16080.0 | 16089.0 |
| c | 1609 | 148 | 995011 | 672.30 | 8468256 | 5721.79 | 16090.0 | 16099.0 |
| c | 1610 | 148 | 1124587 | 759.86 | 9829570 | 6641.60 | 16100.0 | 16109.0 |
| c | 1611 | 148 | 923464 | 623.96 | 7475494 | 5051.01 | 16110.0 | 16119.0 |
| c | 1612 | 148 | 995397 | 672.57 | 7960347 | 5378.61 | 16120.0 | 16129.0 |
| c | 1613 | 148 | 985517 | 665.89 | 8470969 | 5723.63 | 16130.0 | 16139.0 |
| c | 1614 | 148 | 996419 | 673.26 | 8194544 | 5536.85 | 16140.0 | 16149.0 |
| c | 1615 | 148 | 854168 | 577.14 | 6746678 | 4558.57 | 16150.0 | 16159.0 |
| c | 1616 | 148 | 1086382 | 734.04 | 9324503 | 6300.34 | 16160.0 | 16169.0 |
| c | 1617 | 148 | 1092762 | 738.35 | 9525055 | 6435.85 | 16170.0 | 16179.0 |
| c | 1618 | 148 | 805063 | 543.96 | 6610063 | 4466.26 | 16180.0 | 16189.0 |
| c | 1619 | 148 | 1093921 | 739.14 | 9823296 | 6637.36 | 16190.0 | 16199.0 |
| c | 1620 | 148 | 896749 | 605.91 | 7185648 | 4855.17 | 16200.0 | 16209.0 |
| c | 1621 | 148 | 903539 | 610.50 | 7566160 | 5112.27 | 16210.0 | 16219.0 |
| c | 1622 | 148 | 1045726 | 706.57 | 9361989 | 6325.67 | 16220.0 | 16229.0 |
| c | 1623 | 148 | 958788 | 647.83 | 8192847 | 5535.71 | 16230.0 | 16239.0 |
| c | 1624 | 148 | 932524 | 630.08 | 7588703 | 5127.50 | 16240.0 | 16249.0 |
| c | 1625 | 148 | 920119 | 672.60 | 8110632 | 5928.82 | 16250.0 | 16259.0 |
| c | 1626 | 134 | 958154 | 715.04 | 8251185 | 6157.60 | 16260.0 | 16269.0 |
| c | 1627 | 134 | 776857 | 579.74 | 6363424 | 4748.82 | 16270.0 | 16279.0 |
| c | 1628 | 134 | 1040447 | 776.45 | 9389870 | 7007.37 | 16280.0 | 16289.0 |
| c | 1629 | 134 | 795750 | 593.84 | 6307018 | 4706.73 | 16290.0 | 16299.0 |
| c | 1630 | 134 | 979426 | 730.91 | 8207681 | 6125.14 | 16300.0 | 16309.0 |
| c | 1631 | 134 | 1008328 | 752.48 | 8661892 | 6464.10 | 16310.0 | 16319.0 |
| c | 1632 | 134 | 822743 | 613.99 | 6670206 | 4977.77 | 16320.0 | 16329.0 |
| c | 1633 | 134 | 880235 | 656.89 | 6949432 | 5186.14 | 16330.0 | 16339.0 |
| c | 1634 | 134 | 854457 | 637.65 | 7632229 | 5695.69 | 16340.0 | 16349.0 |
| c | 1635 | 134 | 938879 | 700.66 | 7737758 | 5774.45 | 16350.0 | 16359.0 |
| c | 1636 | 134 | 862795 | 643.88 | 7151563 | 5336.99 | 16360.0 | 16369.0 |
| c | 1637 | 134 | 957228 | 714.35 | 8118020 | 6058.22 | 16370.0 | 16379.0 |
| c | 1638 | 134 | 1020127 | 761.29 | 8805137 | 6571.00 | 16380.0 | 16389.0 |
| c | 1639 | 134 | 851687 | 635.59 | 7089290 | 5290.51 | 16390.0 | 16399.0 |
| c | 1640 | 134 | 1203715 | 898.29 | 11360533 | 8478.01 | 16400.0 | 16409.0 |
| c | 1641 | 134 | 972237 | 725.55 | 8526046 | 6362.72 | 16410.0 | 16419.0 |
| c | 1642 | 134 | 1191320 | 889.04 | 11103543 | 8286.23 | 16420.0 | 16429.0 |
| c | 1643 | 148 | 714729 | 511.98 | 5905608 | 4230.38 | 16430.0 | 16439.0 |
| c | 1644 | 148 | 1089669 | 736.26 | 9783850 | 6610.71 | 16440.0 | 16449.0 |
| c | 1645 | 148 | 970395 | 655.67 | 7829933 | 5290.50 | 16450.0 | 16459.0 |
| c | 1646 | 148 | 1014520 | 685.49 | 9248319 | 6248.86 | 16460.0 | 16469.0 |
| c | 1647 | 148 | 1020365 | 689.44 | 8396847 | 5673.55 | 16470.0 | 16479.0 |
| c | 1648 | 148 | 937961 | 633.76 | 8130553 | 5493.62 | 16480.0 | 16489.0 |
| c | 1649 | 148 | 1059089 | 715.60 | 9194165 | 6212.27 | 16490.0 | 16499.0 |
| c | 1650 | 148 | 874512 | 590.89 | 7129597 | 4817.30 | 16500.0 | 16509.0 |
| c | 1651 | 148 | 1038358 | 701.59 | 8983591 | 6069.99 | 16510.0 | 16519.0 |
| c | 1652 | 148 | 1120748 | 757.26 | 10466536 | 7071.98 | 16520.0 | 16529.0 |
| c | 1653 | 148 | 833261 | 563.01 | 7080963 | 4784.43 | 16530.0 | 16539.0 |
| c | 1654 | 148 | 973858 | 658.01 | 8110929 | 5480.36 | 16540.0 | 16549.0 |
| c | 1655 | 148 | 945995 | 639.19 | 8847624 | 5978.12 | 16550.0 | 16559.0 |
| c | 1656 | 148 | 946362 | 639.43 | 8153203 | 5508.92 | 16560.0 | 16569.0 |
| c | 1657 | 148 | 980123 | 662.25 | 8736657 | 5903.15 | 16570.0 | 16579.0 |
| c | 1658 | 148 | 999354 | 675.24 | 9013448 | 6090.17 | 16580.0 | 16589.0 |
| c | 1659 | 148 | 942154 | 636.59 | 7910369 | 5344.84 | 16590.0 | 16599.0 |
| c | 1660 | 148 | 944520 | 638.19 | 8532388 | 5765.13 | 16600.0 | 16609.0 |
| c | 1661 | 148 | 1048094 | 708.17 | 9630741 | 6507.26 | 16610.0 | 16619.0 |
| c | 1662 | 148 | 896768 | 605.92 | 7677641 | 5187.60 | 16620.0 | 16629.0 |
| c | 1663 | 148 | 1007333 | 680.63 | 8794387 | 5942.15 | 16630.0 | 16639.0 |
| c | 1664 | 148 | 847437 | 715.74 | 8002726 | 6759.06 | 16640.0 | 16647.0 |
| d | 1570 | 348 | 567290 | 407.54 | 5854453 | 4205.79 | 15706.0 | 15709.0 |
| d | 1571 | 348 | 1649104 | 473.88 | 18468828 | 5307.13 | 15710.0 | 15719.0 |
| d | 1572 | 348 | 1635722 | 470.04 | 17858054 | 5131.62 | 15720.0 | 15729.0 |
| d | 1573 | 348 | 1478980 | 424.99 | 15703982 | 4512.64 | 15730.0 | 15739.0 |
| d | 1574 | 348 | 1774788 | 510.00 | 19512347 | 5607.00 | 15740.0 | 15749.0 |
| d | 1575 | 348 | 1808183 | 519.59 | 20186228 | 5800.64 | 15750.0 | 15759.0 |
| d | 1576 | 348 | 1574989 | 452.58 | 17056660 | 4901.34 | 15760.0 | 15769.0 |
| d | 1577 | 348 | 1700977 | 488.79 | 18134313 | 5211.01 | 15770.0 | 15779.0 |
| d | 1578 | 348 | 1739994 | 500.00 | 20017064 | 5752.03 | 15780.0 | 15789.0 |
| d | 1579 | 348 | 1636507 | 470.26 | 18045202 | 5185.40 | 15790.0 | 15799.0 |
| d | 1580 | 348 | 1678332 | 482.28 | 18585727 | 5340.73 | 15800.0 | 15809.0 |
| d | 1581 | 348 | 1678757 | 482.40 | 17471595 | 5020.57 | 15810.0 | 15819.0 |
| d | 1582 | 348 | 1810734 | 520.33 | 20398422 | 5861.62 | 15820.0 | 15829.0 |
| d | 1583 | 348 | 1540261 | 442.60 | 16591683 | 4767.73 | 15830.0 | 15839.0 |
| d | 1584 | 348 | 1605786 | 461.43 | 17095475 | 4912.49 | 15840.0 | 15849.0 |
| d | 1585 | 348 | 1683146 | 483.66 | 19682761 | 5655.97 | 15850.0 | 15859.0 |
| d | 1586 | 348 | 1733423 | 498.11 | 18488839 | 5312.88 | 15860.0 | 15869.0 |
| d | 1587 | 348 | 1668984 | 479.59 | 18681436 | 5368.23 | 15870.0 | 15879.0 |
| d | 1588 | 348 | 1827140 | 525.04 | 20291941 | 5831.02 | 15880.0 | 15889.0 |
| d | 1589 | 348 | 1766852 | 507.72 | 18746739 | 5386.99 | 15890.0 | 15899.0 |
| d | 1590 | 348 | 1691562 | 486.08 | 19271538 | 5537.80 | 15900.0 | 15909.0 |
| d | 1591 | 348 | 1911641 | 549.32 | 22172423 | 6371.39 | 15910.0 | 15919.0 |
| d | 1592 | 348 | 1608512 | 462.22 | 17286665 | 4967.43 | 15920.0 | 15929.0 |
| d | 1593 | 348 | 1763930 | 506.88 | 18860809 | 5419.77 | 15930.0 | 15939.0 |
| d | 1594 | 348 | 1711137 | 491.71 | 19225781 | 5524.65 | 15940.0 | 15949.0 |
| d | 1595 | 348 | 1814496 | 521.41 | 19796223 | 5688.57 | 15950.0 | 15959.0 |
| d | 1596 | 348 | 1716026 | 493.11 | 17593024 | 5055.47 | 15960.0 | 15969.0 |
| d | 1597 | 348 | 1661359 | 477.40 | 18413028 | 5291.10 | 15970.0 | 15979.0 |
| d | 1598 | 348 | 1770857 | 508.87 | 20088036 | 5772.42 | 15980.0 | 15989.0 |
| d | 1599 | 348 | 1519147 | 436.54 | 15979695 | 4591.87 | 15990.0 | 15999.0 |
| d | 1600 | 348 | 1740266 | 500.08 | 18614101 | 5348.88 | 16000.0 | 16009.0 |
| d | 1601 | 348 | 1667244 | 479.09 | 19242219 | 5529.37 | 16010.0 | 16019.0 |
| d | 1602 | 348 | 1765738 | 507.40 | 19376567 | 5567.98 | 16020.0 | 16029.0 |
| d | 1603 | 348 | 1852042 | 532.20 | 20804710 | 5978.36 | 16030.0 | 16039.0 |
| d | 1604 | 348 | 1848775 | 531.26 | 22371970 | 6428.73 | 16040.0 | 16049.0 |
| d | 1605 | 348 | 2328940 | 669.24 | 29181113 | 8385.38 | 16050.0 | 16059.0 |
| d | 1606 | 348 | 1593674 | 457.95 | 17559087 | 5045.71 | 16060.0 | 16069.0 |
| d | 1607 | 348 | 1627888 | 467.78 | 18505601 | 5317.70 | 16070.0 | 16079.0 |
| d | 1608 | 348 | 1520445 | 436.91 | 15834506 | 4550.15 | 16080.0 | 16089.0 |
| d | 1609 | 348 | 1784682 | 512.84 | 20275520 | 5826.30 | 16090.0 | 16099.0 |
| d | 1610 | 348 | 1967889 | 565.49 | 22880369 | 6574.82 | 16100.0 | 16109.0 |
| d | 1611 | 348 | 1618341 | 465.04 | 17405989 | 5001.72 | 16110.0 | 16119.0 |
| d | 1612 | 348 | 1759466 | 505.59 | 18680547 | 5367.97 | 16120.0 | 16129.0 |
| d | 1613 | 348 | 1753473 | 503.87 | 19969149 | 5738.26 | 16130.0 | 16139.0 |
| d | 1614 | 348 | 1788470 | 513.93 | 19600711 | 5632.39 | 16140.0 | 16149.0 |
| d | 1615 | 348 | 1512414 | 434.60 | 16036975 | 4608.33 | 16150.0 | 16159.0 |
| d | 1616 | 348 | 1910684 | 549.05 | 21788096 | 6260.95 | 16160.0 | 16169.0 |
| d | 1617 | 348 | 1943258 | 558.41 | 22362183 | 6425.91 | 16170.0 | 16179.0 |
| d | 1618 | 348 | 1443903 | 414.91 | 15727746 | 4519.47 | 16180.0 | 16189.0 |
| d | 1619 | 348 | 1962150 | 563.84 | 23165063 | 6656.63 | 16190.0 | 16199.0 |
| d | 1620 | 348 | 1584872 | 455.42 | 16801360 | 4827.98 | 16200.0 | 16209.0 |
| d | 1621 | 348 | 1636842 | 470.36 | 18129432 | 5209.61 | 16210.0 | 16219.0 |
| d | 1622 | 348 | 1884685 | 541.58 | 22179683 | 6373.47 | 16220.0 | 16229.0 |
| d | 1623 | 348 | 1755001 | 504.31 | 19878728 | 5712.28 | 16230.0 | 16239.0 |
| d | 1624 | 348 | 1639035 | 470.99 | 17535081 | 5038.82 | 16240.0 | 16249.0 |
| d | 1625 | 348 | 1462649 | 516.47 | 16958178 | 5988.06 | 16250.0 | 16259.0 |
| d | 1626 | 267 | 1463673 | 548.19 | 16446494 | 6159.74 | 16260.0 | 16269.0 |
| d | 1627 | 267 | 1189829 | 445.63 | 12838838 | 4808.55 | 16270.0 | 16279.0 |
| d | 1628 | 267 | 1556134 | 582.82 | 18470587 | 6917.82 | 16280.0 | 16289.0 |
| d | 1629 | 267 | 1187662 | 444.82 | 12434164 | 4656.99 | 16290.0 | 16299.0 |
| d | 1630 | 267 | 1463780 | 548.23 | 15985742 | 5987.17 | 16300.0 | 16309.0 |
| d | 1631 | 267 | 1530825 | 573.34 | 17167851 | 6429.91 | 16310.0 | 16319.0 |
| d | 1632 | 267 | 1272235 | 476.49 | 13439411 | 5033.49 | 16320.0 | 16329.0 |
| d | 1633 | 267 | 1348478 | 505.05 | 13836704 | 5182.29 | 16330.0 | 16339.0 |
| d | 1634 | 267 | 1322964 | 495.49 | 15309212 | 5733.79 | 16340.0 | 16349.0 |
| d | 1635 | 267 | 1394851 | 522.42 | 14923019 | 5589.15 | 16350.0 | 16359.0 |
| d | 1636 | 267 | 1288562 | 482.61 | 13996039 | 5241.96 | 16360.0 | 16369.0 |
| d | 1637 | 267 | 1423610 | 533.19 | 15890400 | 5951.46 | 16370.0 | 16379.0 |
| d | 1638 | 267 | 1535247 | 575.00 | 17423063 | 6525.49 | 16380.0 | 16389.0 |
| d | 1639 | 267 | 1253333 | 469.41 | 13652082 | 5113.14 | 16390.0 | 16399.0 |
| d | 1640 | 267 | 1735026 | 649.82 | 21573568 | 8079.99 | 16400.0 | 16409.0 |
| d | 1641 | 267 | 1357796 | 508.54 | 15818784 | 5924.64 | 16410.0 | 16419.0 |
| d | 1642 | 267 | 1737842 | 650.88 | 21311369 | 7981.79 | 16420.0 | 16429.0 |
| d | 1643 | 348 | 1036608 | 346.23 | 11300995 | 3774.55 | 16430.0 | 16439.0 |
| d | 1644 | 348 | 1934911 | 556.01 | 23460378 | 6741.49 | 16440.0 | 16449.0 |
| d | 1645 | 348 | 1715589 | 492.99 | 18452850 | 5302.54 | 16450.0 | 16459.0 |
| d | 1646 | 348 | 1775485 | 510.20 | 21585841 | 6202.83 | 16460.0 | 16469.0 |
| d | 1647 | 348 | 1794062 | 515.54 | 19635438 | 5642.37 | 16470.0 | 16479.0 |
| d | 1648 | 348 | 1668317 | 479.40 | 19256254 | 5533.41 | 16480.0 | 16489.0 |
| d | 1649 | 348 | 1882467 | 540.94 | 21683472 | 6230.88 | 16490.0 | 16499.0 |
| d | 1650 | 348 | 1569062 | 450.88 | 17138851 | 4924.96 | 16500.0 | 16509.0 |
| d | 1651 | 348 | 1865556 | 536.08 | 21368712 | 6140.43 | 16510.0 | 16519.0 |
| d | 1652 | 348 | 1962682 | 563.99 | 24319238 | 6988.29 | 16520.0 | 16529.0 |
| d | 1653 | 348 | 1475929 | 424.12 | 16662872 | 4788.18 | 16530.0 | 16539.0 |
| d | 1654 | 348 | 1758986 | 505.46 | 19323294 | 5552.67 | 16540.0 | 16549.0 |
| d | 1655 | 348 | 1734898 | 498.53 | 21292094 | 6118.42 | 16550.0 | 16559.0 |
| d | 1656 | 348 | 1731451 | 497.54 | 19685341 | 5656.71 | 16560.0 | 16569.0 |
| d | 1657 | 348 | 1788842 | 514.04 | 21018211 | 6039.72 | 16570.0 | 16579.0 |
| d | 1658 | 348 | 1796411 | 516.21 | 21475072 | 6171.00 | 16580.0 | 16589.0 |
| d | 1659 | 348 | 1670356 | 479.99 | 18634254 | 5354.67 | 16590.0 | 16599.0 |
| d | 1660 | 348 | 1730346 | 497.23 | 20678405 | 5942.07 | 16600.0 | 16609.0 |
| d | 1661 | 348 | 1910125 | 548.89 | 23190247 | 6663.86 | 16610.0 | 16619.0 |
| d | 1662 | 348 | 1611830 | 463.17 | 18296169 | 5257.52 | 16620.0 | 16629.0 |
| d | 1663 | 348 | 1811087 | 520.43 | 20859140 | 5994.01 | 16630.0 | 16639.0 |
| d | 1664 | 348 | 1536706 | 551.98 | 19134833 | 6873.14 | 16640.0 | 16647.0 |
base['CompetitionDistanceGroup'] <- trunc(base$CompetitionDistance / 5000) * 5000
# base['CompetitionDistanceGroup'] <- round(base$CompetitionDistance / 5000, 1) * 5000
# base['CompetitionDistanceGroup'] <- ceiling(base$CompetitionDistance / 5000) * 5000
#base %>% select(CompetitionDistance, CompetitionDistanceGroup) %>% head(10)
sales_by_distance <- sqldf("
select
CompetitionDistanceGroup as competition_distance_group
, count(distinct store) as tot_store
, sum(sales) as tot_sales
, round(avg(sales), 2) as avg_sales
, sum(customers) as tot_cust
, round(avg(customers), 2) as avg_cust
, substr(min(date), 1, 10) as first_purchase_date
, substr(max(date), 1, 10) as last_purchase_date
from base
group by 1;
")
summary(sales_by_distance)
## competition_distance_group tot_store tot_sales
## Min. : 0 Min. : 1.00 Min. : 5718883
## 1st Qu.:13750 1st Qu.: 3.00 1st Qu.: 13614827
## Median :27500 Median : 7.00 Median : 36389933
## Mean :29583 Mean : 85.77 Mean : 451783124
## 3rd Qu.:41250 3rd Qu.: 65.00 3rd Qu.: 324222806
## Max. :75000 Max. :762.00 Max. :4059521480
## NA's :1
## avg_sales tot_cust avg_cust first_purchase_date
## Min. :4455 Min. : 702496 Min. :457.5 Length:13
## 1st Qu.:5158 1st Qu.: 1379518 1st Qu.:504.9 Class :character
## Median :5575 Median : 3826739 Median :542.4 Mode :character
## Mean :5686 Mean : 49541673 Mean :597.8
## 3rd Qu.:6071 3rd Qu.: 29891814 3rd Qu.:678.2
## Max. :7321 Max. :471232154 Max. :861.2
##
## last_purchase_date
## Length:13
## Class :character
## Mode :character
##
##
##
##
| competition_distance_group | tot_store | tot_sales | avg_sales | tot_cust | avg_cust | first_purchase_date | last_purchase_date |
|---|---|---|---|---|---|---|---|
| NA | 3 | 11983829 | 4535.89 | 1208630 | 457.47 | 15706.0 | 16647.0 |
| 0 | 762 | 4059521480 | 5842.69 | 471232154 | 678.22 | 15706.0 | 16647.0 |
| 5000 | 163 | 828389324 | 5575.45 | 77488707 | 521.54 | 15706.0 | 16647.0 |
| 10000 | 65 | 324222806 | 5476.18 | 29891814 | 504.88 | 15706.0 | 16647.0 |
| 15000 | 65 | 335727649 | 5670.50 | 32114869 | 542.43 | 15706.0 | 16647.0 |
| 20000 | 25 | 152528996 | 6685.76 | 15082188 | 661.09 | 15706.0 | 16647.0 |
| 25000 | 13 | 62219790 | 5158.33 | 6481443 | 537.34 | 15706.0 | 16647.0 |
| 30000 | 7 | 36389933 | 5518.64 | 3826739 | 580.34 | 15706.0 | 16647.0 |
| 35000 | 4 | 16785669 | 4454.80 | 1797685 | 477.09 | 15706.0 | 16647.0 |
| 40000 | 3 | 19180616 | 6787.20 | 2024284 | 716.31 | 15706.0 | 16647.0 |
| 45000 | 3 | 13614827 | 4817.70 | 1379518 | 488.15 | 15706.0 | 16647.0 |
| 55000 | 1 | 5718883 | 6071.00 | 702496 | 745.75 | 15706.0 | 16647.0 |
| 75000 | 1 | 6896821 | 7321.47 | 811228 | 861.18 | 15706.0 | 16647.0 |
x <- sqldf("
select
CompetitionDistance
, Sales
from base
")
lm <- lm(x$Sales~x$CompetitionDistance)
summary(lm)
##
## Call:
## lm(formula = x$Sales ~ x$CompetitionDistance)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5829 -2051 -27 2084 35738
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.829e+03 4.674e+00 1247.03 <2e-16 ***
## x$CompetitionDistance -9.598e-03 4.955e-04 -19.37 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3850 on 1014565 degrees of freedom
## (2642 observations deleted due to missingness)
## Multiple R-squared: 0.0003698, Adjusted R-squared: 0.0003688
## F-statistic: 375.3 on 1 and 1014565 DF, p-value: < 2.2e-16
ggplot(x, aes(x = x$CompetitionDistance, y = x$Sales)) +
geom_point(alpha = 0.02) +
geom_smooth(method = 'lm', col = "red")
sales_by_store_type_distance <- sqldf("
select
storetype
, CompetitionDistanceGroup as competition_distance_group
, count(distinct store) as tot_store
, sum(sales) as tot_sales
, round(avg(sales), 2) as avg_sales
, sum(customers) as tot_cust
, round(avg(customers), 2) as avg_cust
, substr(min(date), 1, 10) as first_purchase_date
, substr(max(date), 1, 10) as last_purchase_date
from base
group by 1, 2;
")
summary(sales_by_store_type_distance)
## StoreType competition_distance_group tot_store
## Length:32 Min. : 0 Min. : 1.00
## Class :character 1st Qu.:10000 1st Qu.: 1.00
## Mode :character Median :20000 Median : 3.50
## Mean :23000 Mean : 34.84
## 3rd Qu.:33750 3rd Qu.: 26.50
## Max. :75000 Max. :450.00
## NA's :2
## tot_sales avg_sales tot_cust avg_cust
## Min. : 3385282 Min. : 3594 Min. : 337120 Min. : 357.9
## 1st Qu.: 6599117 1st Qu.: 5272 1st Qu.: 632957 1st Qu.: 514.1
## Median : 19117000 Median : 5652 Median : 2117185 Median : 555.8
## Mean : 183536894 Mean : 5789 Mean : 20126305 Mean : 618.1
## 3rd Qu.: 159231395 3rd Qu.: 6563 3rd Qu.: 14296809 3rd Qu.: 682.9
## Max. :2383432599 Max. :10059 Max. :282162255 Max. :1987.7
##
## first_purchase_date last_purchase_date
## Length:32 Length:32
## Class :character Class :character
## Mode :character Mode :character
##
##
##
##
| StoreType | competition_distance_group | tot_store | tot_sales | avg_sales | tot_cust | avg_cust | first_purchase_date | last_purchase_date |
|---|---|---|---|---|---|---|---|---|
| a | NA | 1 | 3385282 | 3593.72 | 337120 | 357.88 | 15706.0 | 16647.0 |
| a | 0 | 450 | 2383432599 | 5788.47 | 282162255 | 685.27 | 15706.0 | 16647.0 |
| a | 5000 | 56 | 282703073 | 5532.78 | 29054203 | 568.62 | 15706.0 | 16647.0 |
| a | 10000 | 25 | 123032160 | 5392.84 | 12439092 | 545.24 | 15706.0 | 16647.0 |
| a | 15000 | 33 | 171557353 | 5618.57 | 17564543 | 575.25 | 15706.0 | 16647.0 |
| a | 20000 | 13 | 86263837 | 7151.70 | 9140240 | 757.77 | 15706.0 | 16647.0 |
| a | 25000 | 11 | 49206113 | 4834.56 | 5439905 | 534.48 | 15706.0 | 16647.0 |
| a | 30000 | 4 | 19117000 | 5073.51 | 2210086 | 586.54 | 15706.0 | 16647.0 |
| a | 35000 | 3 | 11286129 | 3993.68 | 1195430 | 423.01 | 15706.0 | 16647.0 |
| a | 40000 | 3 | 19180616 | 6787.20 | 2024284 | 716.31 | 15706.0 | 16647.0 |
| a | 45000 | 1 | 3554993 | 3773.88 | 460552 | 488.91 | 15706.0 | 16647.0 |
| a | 55000 | 1 | 5718883 | 6071.00 | 702496 | 745.75 | 15706.0 | 16647.0 |
| a | 75000 | 1 | 6896821 | 7321.47 | 811228 | 861.18 | 15706.0 | 16647.0 |
| b | 0 | 17 | 159231395 | 10058.84 | 31465621 | 1987.72 | 15706.0 | 16647.0 |
| c | 0 | 123 | 653916153 | 5753.37 | 78575720 | 691.33 | 15706.0 | 16647.0 |
| c | 5000 | 14 | 69411008 | 5337.67 | 7748849 | 595.88 | 15706.0 | 16647.0 |
| c | 10000 | 5 | 25437380 | 5400.72 | 2459591 | 522.21 | 15706.0 | 16647.0 |
| c | 15000 | 1 | 4239004 | 4500.00 | 485624 | 515.52 | 15706.0 | 16647.0 |
| c | 20000 | 2 | 11128629 | 6546.25 | 1161711 | 683.36 | 15706.0 | 16647.0 |
| c | 25000 | 1 | 6228961 | 6612.49 | 507968 | 539.24 | 15706.0 | 16647.0 |
| c | 30000 | 1 | 6599117 | 7005.43 | 643191 | 682.79 | 15706.0 | 16647.0 |
| c | 45000 | 1 | 6261174 | 6646.68 | 547051 | 580.73 | 15706.0 | 16647.0 |
| d | NA | 2 | 8598547 | 5057.97 | 871510 | 512.65 | 15706.0 | 16647.0 |
| d | 0 | 172 | 862941333 | 5619.57 | 79028558 | 514.64 | 15706.0 | 16647.0 |
| d | 5000 | 93 | 476275243 | 5637.86 | 40685655 | 481.61 | 15706.0 | 16647.0 |
| d | 10000 | 35 | 175753266 | 5547.42 | 14993131 | 473.24 | 15706.0 | 16647.0 |
| d | 15000 | 31 | 159931292 | 5767.45 | 14064702 | 507.20 | 15706.0 | 16647.0 |
| d | 20000 | 10 | 55136530 | 6091.09 | 4780237 | 528.09 | 15706.0 | 16647.0 |
| d | 25000 | 1 | 6784716 | 7202.46 | 533570 | 566.42 | 15706.0 | 16647.0 |
| d | 30000 | 2 | 10673816 | 5665.51 | 973462 | 516.70 | 15706.0 | 16647.0 |
| d | 35000 | 1 | 5499540 | 5838.15 | 602255 | 639.34 | 15706.0 | 16647.0 |
| d | 45000 | 1 | 3798660 | 4032.55 | 371915 | 394.81 | 15706.0 | 16647.0 |
competition_open_since_year_sales <- sqldf("
select
CompetitionOpenSinceYear as competition_open_since_year
, count(distinct store) as tot_store
, sum(sales) as tot_sales
, round(avg(sales), 2) as avg_sales
, sum(customers) as tot_cust
, round(avg(customers), 2) as avg_cust
, min(date) as first_purchase_date
, max(date) as last_purchase_date
from base
group by 1;
")
summary(competition_open_since_year_sales)
## competition_open_since_year tot_store tot_sales
## Min. :1900 Min. : 1.00 Min. :5.222e+06
## 1st Qu.:1998 1st Qu.: 7.25 1st Qu.:3.838e+07
## Median :2004 Median : 32.50 Median :1.648e+08
## Mean :1998 Mean : 46.46 Mean :2.447e+08
## 3rd Qu.:2010 3rd Qu.: 54.25 3rd Qu.:2.898e+08
## Max. :2015 Max. :354.00 Max. :1.862e+09
## NA's :1
## avg_sales tot_cust avg_cust first_purchase_date
## Min. :5225 Min. : 507202 Min. :550.4 Min. :15706
## 1st Qu.:5623 1st Qu.: 5136704 1st Qu.:601.2 1st Qu.:15706
## Median :5773 Median : 18711883 Median :629.3 Median :15706
## Mean :5863 Mean : 26835073 Mean :640.8 Mean :15706
## 3rd Qu.:6013 3rd Qu.: 31904140 3rd Qu.:669.1 3rd Qu.:15706
## Max. :6889 Max. :204627267 Max. :818.7 Max. :15706
##
## last_purchase_date
## Min. :16647
## 1st Qu.:16647
## Median :16647
## Mean :16647
## 3rd Qu.:16647
## Max. :16647
##
| competition_open_since_year | tot_store | tot_sales | avg_sales | tot_cust | avg_cust | first_purchase_date | last_purchase_date |
|---|---|---|---|---|---|---|---|
| NA | 354 | 1862040952 | 5758.63 | 204627267 | 632.84 | 15706 | 16647 |
| 1900 | 1 | 5222211 | 6889.46 | 507202 | 669.13 | 15706 | 16647 |
| 1961 | 1 | 6086195 | 6460.93 | 547187 | 580.88 | 15706 | 16647 |
| 1990 | 5 | 27862226 | 5915.55 | 3209038 | 681.32 | 15706 | 16647 |
| 1994 | 2 | 11280525 | 5987.54 | 1282646 | 680.81 | 15706 | 16647 |
| 1995 | 2 | 10472100 | 6160.06 | 935723 | 550.43 | 15706 | 16647 |
| 1998 | 1 | 5737921 | 6091.21 | 770860 | 818.32 | 15706 | 16647 |
| 1999 | 8 | 41892450 | 5698.10 | 6019094 | 818.70 | 15706 | 16647 |
| 2000 | 10 | 60175681 | 6515.34 | 5779259 | 625.73 | 15706 | 16647 |
| 2001 | 16 | 78278080 | 5323.59 | 8358755 | 568.47 | 15706 | 16647 |
| 2002 | 27 | 136190565 | 5473.46 | 16122104 | 647.94 | 15706 | 16647 |
| 2003 | 19 | 97218299 | 5788.87 | 9835892 | 585.68 | 15706 | 16647 |
| 2004 | 22 | 103469923 | 5224.70 | 11071731 | 559.07 | 15706 | 16647 |
| 2005 | 62 | 320132499 | 5659.65 | 34012056 | 601.30 | 15706 | 16647 |
| 2006 | 47 | 247721540 | 5787.62 | 28638014 | 669.08 | 15706 | 16647 |
| 2007 | 48 | 238392636 | 5449.72 | 26285748 | 600.90 | 15706 | 16647 |
| 2008 | 54 | 271344522 | 5597.50 | 30100113 | 620.93 | 15706 | 16647 |
| 2009 | 54 | 293110384 | 5933.89 | 32247157 | 652.83 | 15706 | 16647 |
| 2010 | 55 | 288653207 | 5631.38 | 31120758 | 607.14 | 15706 | 16647 |
| 2011 | 54 | 279859156 | 5665.62 | 31789801 | 643.57 | 15706 | 16647 |
| 2012 | 82 | 424676043 | 5715.77 | 46030586 | 619.53 | 15706 | 16647 |
| 2013 | 83 | 489038371 | 6483.68 | 52757749 | 699.46 | 15706 | 16647 |
| 2014 | 70 | 380853414 | 5975.86 | 40691353 | 638.48 | 15706 | 16647 |
| 2015 | 38 | 193471723 | 5518.30 | 21301662 | 607.58 | 15706 | 16647 |
competition_open_since_year_distance_sales <- sqldf("
select
CompetitionOpenSinceYear as competition_open_since_year
, CompetitionDistanceGroup as competition_distance_group
, count(distinct store) as tot_store
, sum(sales) as tot_sales
, round(avg(sales), 2) as avg_sales
, sum(customers) as tot_cust
, round(avg(customers), 2) as avg_cust
, substr(min(date), 1, 10) as first_purchase_date
, substr(max(date), 1, 10) as last_purchase_date
from base
group by 1, 2;
")
summary(competition_open_since_year_distance_sales)
## competition_open_since_year competition_distance_group tot_store
## Min. :1900 Min. : 0 Min. : 1.00
## 1st Qu.:2003 1st Qu.: 5000 1st Qu.: 1.00
## Median :2008 Median :10000 Median : 3.00
## Mean :2006 Mean :14227 Mean : 10.05
## 3rd Qu.:2012 3rd Qu.:20000 3rd Qu.: 6.00
## Max. :2015 Max. :75000 Max. :233.00
## NA's :9 NA's :1
## tot_sales avg_sales tot_cust avg_cust
## Min. :2.319e+06 Min. :2461 Min. : 246036 Min. : 274.8
## 1st Qu.:5.979e+06 1st Qu.:5126 1st Qu.: 568147 1st Qu.: 494.4
## Median :1.323e+07 Median :5705 Median : 1299613 Median : 566.7
## Mean :5.291e+07 Mean :5648 Mean : 5802178 Mean : 571.4
## 3rd Qu.:3.345e+07 3rd Qu.:6232 3rd Qu.: 3620793 3rd Qu.: 656.1
## Max. :1.236e+09 Max. :9303 Max. :145176448 Max. :1013.9
##
## first_purchase_date last_purchase_date
## Length:111 Length:111
## Class :character Class :character
## Mode :character Mode :character
##
##
##
##
| competition_open_since_year | competition_distance_group | tot_store | tot_sales | avg_sales | tot_cust | avg_cust | first_purchase_date | last_purchase_date |
|---|---|---|---|---|---|---|---|---|
| NA | NA | 3 | 11983829 | 4535.89 | 1208630 | 457.47 | 15706.0 | 16647.0 |
| NA | 0 | 233 | 1236273942 | 5817.92 | 145176448 | 683.20 | 15706.0 | 16647.0 |
| NA | 5000 | 56 | 283087701 | 5540.31 | 26156555 | 511.91 | 15706.0 | 16647.0 |
| NA | 10000 | 23 | 114937050 | 5396.61 | 10713115 | 503.01 | 15706.0 | 16647.0 |
| NA | 15000 | 24 | 129843392 | 5986.88 | 12774831 | 589.03 | 15706.0 | 16647.0 |
| NA | 20000 | 9 | 57741496 | 6810.74 | 5192859 | 612.51 | 15706.0 | 16647.0 |
| NA | 25000 | 4 | 16140805 | 4283.65 | 2009532 | 533.32 | 15706.0 | 16647.0 |
| NA | 35000 | 1 | 5135916 | 5452.14 | 584069 | 620.03 | 15706.0 | 16647.0 |
| NA | 75000 | 1 | 6896821 | 7321.47 | 811228 | 861.18 | 15706.0 | 16647.0 |
| 1900 | 0 | 1 | 5222211 | 6889.46 | 507202 | 669.13 | 15706.0 | 16647.0 |
| 1961 | 5000 | 1 | 6086195 | 6460.93 | 547187 | 580.88 | 15706.0 | 16647.0 |
| 1990 | 0 | 5 | 27862226 | 5915.55 | 3209038 | 681.32 | 15706.0 | 16647.0 |
| 1994 | 5000 | 2 | 11280525 | 5987.54 | 1282646 | 680.81 | 15706.0 | 16647.0 |
| 1995 | 0 | 2 | 10472100 | 6160.06 | 935723 | 550.43 | 15706.0 | 16647.0 |
| 1998 | 0 | 1 | 5737921 | 6091.21 | 770860 | 818.32 | 15706.0 | 16647.0 |
| 1999 | 0 | 6 | 30804788 | 5450.25 | 5031896 | 890.29 | 15706.0 | 16647.0 |
| 1999 | 20000 | 1 | 4477826 | 5907.42 | 434973 | 573.84 | 15706.0 | 16647.0 |
| 1999 | 25000 | 1 | 6609836 | 7016.81 | 552225 | 586.23 | 15706.0 | 16647.0 |
| 2000 | 0 | 6 | 33835243 | 6187.86 | 3596984 | 657.82 | 15706.0 | 16647.0 |
| 2000 | 5000 | 2 | 13517765 | 7175.03 | 1063559 | 564.52 | 15706.0 | 16647.0 |
| 2000 | 10000 | 1 | 7317626 | 7768.18 | 624924 | 663.40 | 15706.0 | 16647.0 |
| 2000 | 15000 | 1 | 5505047 | 5844.00 | 493792 | 524.20 | 15706.0 | 16647.0 |
| 2001 | 0 | 9 | 46456969 | 5479.71 | 5340571 | 629.93 | 15706.0 | 16647.0 |
| 2001 | 5000 | 5 | 24180618 | 5342.60 | 2275985 | 502.87 | 15706.0 | 16647.0 |
| 2001 | 10000 | 1 | 3713936 | 4899.65 | 313767 | 413.94 | 15706.0 | 16647.0 |
| 2001 | 15000 | 1 | 3926557 | 4168.32 | 428432 | 454.81 | 15706.0 | 16647.0 |
| 2002 | 0 | 16 | 84230349 | 5588.53 | 11316111 | 750.80 | 15706.0 | 16647.0 |
| 2002 | 5000 | 3 | 10123612 | 4451.90 | 881688 | 387.73 | 15706.0 | 16647.0 |
| 2002 | 10000 | 1 | 2873358 | 3050.27 | 301609 | 320.18 | 15706.0 | 16647.0 |
| 2002 | 15000 | 4 | 19241000 | 5106.42 | 1696492 | 450.24 | 15706.0 | 16647.0 |
| 2002 | 20000 | 1 | 4875320 | 5175.50 | 587900 | 624.10 | 15706.0 | 16647.0 |
| 2002 | 25000 | 1 | 6228961 | 6612.49 | 507968 | 539.24 | 15706.0 | 16647.0 |
| 2002 | 40000 | 1 | 8617965 | 9148.58 | 830336 | 881.46 | 15706.0 | 16647.0 |
| 2003 | 0 | 10 | 54414615 | 6266.08 | 5968487 | 687.30 | 15706.0 | 16647.0 |
| 2003 | 5000 | 5 | 18117349 | 4002.95 | 1819303 | 401.97 | 15706.0 | 16647.0 |
| 2003 | 10000 | 3 | 19814992 | 7500.00 | 1556415 | 589.10 | 15706.0 | 16647.0 |
| 2003 | 15000 | 1 | 4871343 | 5171.28 | 491687 | 521.96 | 15706.0 | 16647.0 |
| 2004 | 0 | 12 | 55223519 | 5225.54 | 6445182 | 609.88 | 15706.0 | 16647.0 |
| 2004 | 5000 | 5 | 24237320 | 5355.13 | 2180389 | 481.75 | 15706.0 | 16647.0 |
| 2004 | 10000 | 1 | 5018882 | 5327.90 | 474412 | 503.62 | 15706.0 | 16647.0 |
| 2004 | 15000 | 3 | 16671567 | 5899.35 | 1650253 | 583.95 | 15706.0 | 16647.0 |
| 2004 | 25000 | 1 | 2318635 | 2461.40 | 321495 | 341.29 | 15706.0 | 16647.0 |
| 2005 | 0 | 42 | 221739076 | 5793.16 | 24850902 | 649.26 | 15706.0 | 16647.0 |
| 2005 | 5000 | 9 | 46698555 | 5630.40 | 4560262 | 549.83 | 15706.0 | 16647.0 |
| 2005 | 10000 | 6 | 26822750 | 4905.40 | 2231186 | 408.04 | 15706.0 | 16647.0 |
| 2005 | 15000 | 3 | 11018057 | 4170.35 | 1057130 | 400.12 | 15706.0 | 16647.0 |
| 2005 | 20000 | 1 | 7069345 | 7504.61 | 779006 | 826.97 | 15706.0 | 16647.0 |
| 2005 | 25000 | 1 | 6784716 | 7202.46 | 533570 | 566.42 | 15706.0 | 16647.0 |
| 2006 | 0 | 36 | 188474732 | 5777.18 | 22687743 | 695.43 | 15706.0 | 16647.0 |
| 2006 | 5000 | 6 | 34863708 | 6168.38 | 3644602 | 644.83 | 15706.0 | 16647.0 |
| 2006 | 10000 | 2 | 10632856 | 5643.77 | 951371 | 504.97 | 15706.0 | 16647.0 |
| 2006 | 15000 | 3 | 13750244 | 5204.48 | 1354298 | 512.60 | 15706.0 | 16647.0 |
| 2007 | 0 | 36 | 182124772 | 5459.38 | 21314895 | 638.94 | 15706.0 | 16647.0 |
| 2007 | 5000 | 7 | 32038862 | 5145.98 | 2992574 | 480.66 | 15706.0 | 16647.0 |
| 2007 | 10000 | 1 | 3330211 | 4393.42 | 246036 | 324.59 | 15706.0 | 16647.0 |
| 2007 | 15000 | 3 | 16717700 | 6327.67 | 1303671 | 493.44 | 15706.0 | 16647.0 |
| 2007 | 20000 | 1 | 4181091 | 5515.95 | 428572 | 565.40 | 15706.0 | 16647.0 |
| 2008 | 0 | 41 | 200292655 | 5472.78 | 23226626 | 634.64 | 15706.0 | 16647.0 |
| 2008 | 5000 | 3 | 17823296 | 6746.14 | 1543775 | 584.32 | 15706.0 | 16647.0 |
| 2008 | 10000 | 1 | 5369397 | 7083.64 | 435346 | 574.34 | 15706.0 | 16647.0 |
| 2008 | 15000 | 3 | 14570826 | 5155.99 | 1435980 | 508.13 | 15706.0 | 16647.0 |
| 2008 | 20000 | 2 | 17527345 | 9303.26 | 1910272 | 1013.94 | 15706.0 | 16647.0 |
| 2008 | 30000 | 1 | 2914087 | 3093.51 | 269319 | 285.90 | 15706.0 | 16647.0 |
| 2008 | 35000 | 1 | 2816323 | 2989.73 | 258896 | 274.84 | 15706.0 | 16647.0 |
| 2008 | 40000 | 1 | 3769419 | 4001.51 | 472848 | 501.96 | 15706.0 | 16647.0 |
| 2008 | 45000 | 1 | 6261174 | 6646.68 | 547051 | 580.73 | 15706.0 | 16647.0 |
| 2009 | 0 | 44 | 242344696 | 5979.69 | 27573111 | 680.35 | 15706.0 | 16647.0 |
| 2009 | 5000 | 5 | 26555264 | 5867.27 | 2396637 | 529.53 | 15706.0 | 16647.0 |
| 2009 | 10000 | 3 | 13230162 | 5007.63 | 1308575 | 495.30 | 15706.0 | 16647.0 |
| 2009 | 15000 | 2 | 10980262 | 6458.98 | 968834 | 569.90 | 15706.0 | 16647.0 |
| 2010 | 0 | 33 | 177440595 | 5776.44 | 19949399 | 649.44 | 15706.0 | 16647.0 |
| 2010 | 5000 | 9 | 46398548 | 5472.82 | 4382401 | 516.91 | 15706.0 | 16647.0 |
| 2010 | 10000 | 3 | 11701416 | 4140.63 | 1146125 | 405.56 | 15706.0 | 16647.0 |
| 2010 | 15000 | 3 | 14859632 | 5258.19 | 1450715 | 513.35 | 15706.0 | 16647.0 |
| 2010 | 20000 | 2 | 11267014 | 5980.37 | 1285623 | 682.39 | 15706.0 | 16647.0 |
| 2010 | 25000 | 1 | 3509384 | 4629.79 | 307197 | 405.27 | 15706.0 | 16647.0 |
| 2010 | 30000 | 4 | 23476618 | 6230.52 | 2599298 | 689.83 | 15706.0 | 16647.0 |
| 2011 | 0 | 38 | 194779996 | 5644.49 | 23079739 | 668.82 | 15706.0 | 16647.0 |
| 2011 | 5000 | 10 | 54816894 | 5819.20 | 5549047 | 589.07 | 15706.0 | 16647.0 |
| 2011 | 10000 | 3 | 14409042 | 5098.74 | 1505351 | 532.68 | 15706.0 | 16647.0 |
| 2011 | 15000 | 2 | 12068834 | 6405.96 | 1299613 | 689.82 | 15706.0 | 16647.0 |
| 2011 | 20000 | 1 | 3784390 | 4992.60 | 356051 | 469.72 | 15706.0 | 16647.0 |
| 2012 | 0 | 62 | 325908310 | 5799.60 | 36775305 | 654.42 | 15706.0 | 16647.0 |
| 2012 | 5000 | 12 | 60460643 | 5623.20 | 5336631 | 496.34 | 15706.0 | 16647.0 |
| 2012 | 10000 | 2 | 10482577 | 5564.00 | 1017463 | 540.05 | 15706.0 | 16647.0 |
| 2012 | 15000 | 3 | 12053961 | 4562.44 | 1228064 | 464.82 | 15706.0 | 16647.0 |
| 2012 | 20000 | 2 | 10271012 | 5451.70 | 1070868 | 568.40 | 15706.0 | 16647.0 |
| 2012 | 35000 | 1 | 5499540 | 5838.15 | 602255 | 639.34 | 15706.0 | 16647.0 |
| 2013 | 0 | 50 | 308321589 | 6812.23 | 35034323 | 774.07 | 15706.0 | 16647.0 |
| 2013 | 5000 | 11 | 58191800 | 5822.67 | 5663153 | 566.66 | 15706.0 | 16647.0 |
| 2013 | 10000 | 6 | 33066227 | 6257.80 | 3010005 | 569.65 | 15706.0 | 16647.0 |
| 2013 | 15000 | 5 | 28810475 | 6365.55 | 2604942 | 575.55 | 15706.0 | 16647.0 |
| 2013 | 20000 | 4 | 26666825 | 7077.18 | 2584335 | 685.86 | 15706.0 | 16647.0 |
| 2013 | 25000 | 1 | 6654188 | 7063.89 | 808163 | 857.92 | 15706.0 | 16647.0 |
| 2013 | 30000 | 1 | 4127609 | 4381.75 | 444300 | 471.66 | 15706.0 | 16647.0 |
| 2013 | 35000 | 1 | 3333890 | 3539.16 | 352465 | 374.17 | 15706.0 | 16647.0 |
| 2013 | 40000 | 1 | 6793232 | 7211.50 | 721100 | 765.50 | 15706.0 | 16647.0 |
| 2013 | 45000 | 2 | 7353653 | 3903.21 | 832467 | 441.86 | 15706.0 | 16647.0 |
| 2013 | 55000 | 1 | 5718883 | 6071.00 | 702496 | 745.75 | 15706.0 | 16647.0 |
| 2014 | 0 | 52 | 283015311 | 5979.87 | 31906002 | 674.15 | 15706.0 | 16647.0 |
| 2014 | 5000 | 10 | 51637327 | 5704.52 | 4446098 | 491.17 | 15706.0 | 16647.0 |
| 2014 | 10000 | 4 | 25854624 | 7213.90 | 2381142 | 664.38 | 15706.0 | 16647.0 |
| 2014 | 15000 | 1 | 4144086 | 4399.24 | 421584 | 447.54 | 15706.0 | 16647.0 |
| 2014 | 25000 | 2 | 10330447 | 5483.25 | 1022705 | 542.84 | 15706.0 | 16647.0 |
| 2014 | 30000 | 1 | 5871619 | 6233.14 | 513822 | 545.46 | 15706.0 | 16647.0 |
| 2015 | 0 | 27 | 144545865 | 5809.25 | 16535607 | 664.56 | 15706.0 | 16647.0 |
| 2015 | 5000 | 2 | 8273342 | 4391.37 | 766215 | 406.70 | 15706.0 | 16647.0 |
| 2015 | 10000 | 4 | 15647700 | 4152.79 | 1674972 | 444.53 | 15706.0 | 16647.0 |
| 2015 | 15000 | 3 | 16694666 | 5907.53 | 1454551 | 514.70 | 15706.0 | 16647.0 |
| 2015 | 20000 | 1 | 4667332 | 6157.43 | 451729 | 595.95 | 15706.0 | 16647.0 |
| 2015 | 25000 | 1 | 3642818 | 3867.11 | 418588 | 444.36 | 15706.0 | 16647.0 |
table(base$SchoolHoliday)
##
## 0 1
## 835488 181721
Desc(base$SchoolHoliday)
## -------------------------------------------------------------------------
## base$SchoolHoliday (integer - dichotomous)
##
## length n NAs unique
## 1'017'209 1'017'209 0 2
## 100.0% 0.0%
##
## freq perc lci.95 uci.95'
## 0 835'488 82.1% 82.1% 82.2%
## 1 181'721 17.9% 17.8% 17.9%
##
## ' 95%-CI Wilson
school_holiday <- sqldf("
select
store
, schoolholiday
, sum(sales) as tot_sales
from base
group by 1, 2;
")
school_holiday_diff_desc <- sqldf("
select
store
, schoolholiday
, sum(case when schoolholiday = 0 then tot_sales end) as not_holiday_sales
, sum(case when schoolholiday = 1 then tot_sales end) as holiday_sales
, sum(case when schoolholiday = 0 then tot_sales end) - sum(case when schoolholiday = 1 then tot_sales end) as diff
from school_holiday
group by 1
order by diff desc ;") %>% head(10)
summary(school_holiday_diff_desc)
## Store SchoolHoliday not_holiday_sales holiday_sales
## Min. : 251.0 Min. :1 Min. :10175787 Min. :2365944
## 1st Qu.: 347.0 1st Qu.:1 1st Qu.:11211306 1st Qu.:2772411
## Median : 537.5 Median :1 Median :11766496 Median :2958632
## Mean : 575.8 Mean :1 Mean :12265578 Mean :3037902
## 3rd Qu.: 774.2 3rd Qu.:1 3rd Qu.:13172706 3rd Qu.:3372016
## Max. :1114.0 Max. :1 Max. :16131631 Max. :3653765
## diff
## Min. : 7809843
## 1st Qu.: 8297905
## Median : 9050979
## Mean : 9227676
## 3rd Qu.: 9599275
## Max. :12746420
| Store | SchoolHoliday | not_holiday_sales | holiday_sales | diff |
|---|---|---|---|---|
| 262 | 1 | 16131631 | 3385211 | 12746420 |
| 817 | 1 | 13441062 | 3616805 | 9824257 |
| 562 | 1 | 13273557 | 3653765 | 9619792 |
| 1114 | 1 | 12870155 | 3332430 | 9537725 |
| 733 | 1 | 11586482 | 2480676 | 9105806 |
| 251 | 1 | 11946511 | 2950359 | 8996152 |
| 513 | 1 | 11285500 | 2966906 | 8318594 |
| 788 | 1 | 11186575 | 2895566 | 8291009 |
| 383 | 1 | 10758520 | 2731359 | 8027161 |
| 335 | 1 | 10175787 | 2365944 | 7809843 |
school_holiday_diff_asc <- sqldf("
select
store
, schoolholiday
, sum(case when schoolholiday = 0 then tot_sales end) as not_holiday_sales
, sum(case when schoolholiday = 1 then tot_sales end) as holiday_sales
, sum(case when schoolholiday = 0 then tot_sales end) - sum(case when schoolholiday = 1 then tot_sales end) as diff
from school_holiday
group by 1
order by diff;") %>% head(10)
summary(school_holiday_diff_asc)
## Store SchoolHoliday not_holiday_sales holiday_sales
## Min. :198.0 Min. :1 Min. :1651351 Min. :462971
## 1st Qu.:222.2 1st Qu.:1 1st Qu.:1781068 1st Qu.:481274
## Median :366.0 Median :1 Median :1825997 Median :494913
## Mean :473.9 Mean :1 Mean :1841610 Mean :511127
## 3rd Qu.:731.2 3rd Qu.:1 3rd Qu.:1924701 3rd Qu.:526140
## Max. :972.0 Max. :1 Max. :2002451 Max. :588780
## diff
## Min. :1188380
## 1st Qu.:1236802
## Median :1341650
## Mean :1330483
## 3rd Qu.:1419803
## Max. :1451714
| Store | SchoolHoliday | not_holiday_sales | holiday_sales | diff |
|---|---|---|---|---|
| 307 | 1 | 1651351 | 462971 | 1188380 |
| 543 | 1 | 1700778 | 478509 | 1222269 |
| 972 | 1 | 1813847 | 588780 | 1225067 |
| 198 | 1 | 1770141 | 498132 | 1272009 |
| 841 | 1 | 1826941 | 491694 | 1335247 |
| 208 | 1 | 1825053 | 476999 | 1348054 |
| 219 | 1 | 1927666 | 513496 | 1414170 |
| 232 | 1 | 2002451 | 580770 | 1421681 |
| 794 | 1 | 1915807 | 489567 | 1426240 |
| 425 | 1 | 1982069 | 530355 | 1451714 |
Desc(base$StateHoliday)
## -------------------------------------------------------------------------
## base$StateHoliday (character)
##
## length n NAs unique levels dupes
## 1'017'209 1'017'209 0 4 4 y
## 100.0% 0.0%
##
## level freq perc cumfreq cumperc
## 1 0 986'159 96.9% 986'159 96.9%
## 2 a 20'260 2.0% 1'006'419 98.9%
## 3 b 6'690 0.7% 1'013'109 99.6%
## 4 c 4'100 0.4% 1'017'209 100.0%
state_holiday <- sqldf("
select
store
, StateHoliday
, sum(sales) as tot_sales
from base
group by 1, 2
")
state_holiday_desc <- sqldf("
select
Store
, not_holiday_sales
, public_holiday_sales
, easter_holiday_sales
, christmas_holiday_sales
, case when not_holiday_sales = public_diff then 0 else public_diff end +
case when not_holiday_sales = easter_diff then 0 else easter_diff end +
case when not_holiday_sales = christmas_diff then 0 else christmas_diff end as diff
, case
when public_diff != 0 then 'a'
when easter_diff != 0 then 'b'
when christmas_diff != 0 then 'c'
else 'error' end as category
from (
select
Store
, sum(case when StateHoliday = 0 then tot_sales end) as not_holiday_sales
, sum(case when StateHoliday = 'a' then tot_sales end) as public_holiday_sales
, sum(case when StateHoliday = 'b' then tot_sales end) as easter_holiday_sales
, sum(case when StateHoliday = 'c' then tot_sales end) as christmas_holiday_sales
, sum(case when StateHoliday = 0 then tot_sales end) - sum(case when StateHoliday = 'a' then tot_sales end) as public_diff
, sum(case when StateHoliday = 0 then tot_sales end) - sum(case when StateHoliday = 'b' then tot_sales end) as easter_diff
, sum(case when StateHoliday = 0 then tot_sales end) - sum(case when StateHoliday = 'c' then tot_sales end) as christmas_diff
from
state_holiday
group by 1
)
order by diff desc; ") %>% head(10)
summary(state_holiday_desc)
## Store not_holiday_sales public_holiday_sales
## Min. : 259.0 Min. : 7039181 Min. :102286
## 1st Qu.: 357.0 1st Qu.: 9851432 1st Qu.:178832
## Median : 528.0 Median :10400706 Median :239213
## Mean : 561.6 Mean :11748420 Mean :235231
## 3rd Qu.: 720.2 3rd Qu.:13271430 3rd Qu.:270714
## Max. :1097.0 Max. :18751269 Max. :444916
## easter_holiday_sales christmas_holiday_sales diff
## Min. : 41700 Min. : 19060 Min. :20951087
## 1st Qu.: 56390 1st Qu.: 27449 1st Qu.:29205960
## Median : 90579 Median : 44518 Median :30898199
## Mean : 92815 Mean : 48105 Mean :34869110
## 3rd Qu.:102157 3rd Qu.: 57583 3rd Qu.:39399918
## Max. :208687 Max. :111970 Max. :55488234
## category
## Length:10
## Class :character
## Mode :character
##
##
##
| Store | not_holiday_sales | public_holiday_sales | easter_holiday_sales | christmas_holiday_sales | diff | category |
|---|---|---|---|---|---|---|
| 262 | 18751269 | 444916 | 208687 | 111970 | 55488234 | a |
| 562 | 16486022 | 257742 | 122245 | 61313 | 49016766 | a |
| 733 | 13593424 | 315136 | 102448 | 56150 | 40306538 | a |
| 335 | 12305447 | 169698 | 47526 | 19060 | 36680057 | a |
| 682 | 10382719 | 102831 | 51227 | 20602 | 30973497 | a |
| 259 | 10418693 | 275038 | 100079 | 58061 | 30822901 | a |
| 769 | 9857593 | 222954 | 71878 | 45575 | 29232372 | a |
| 423 | 9849378 | 206235 | 101283 | 43460 | 29197156 | a |
| 1097 | 8800475 | 255472 | 81079 | 42387 | 26022487 | a |
| 494 | 7039181 | 102286 | 41700 | 22470 | 20951087 | a |
state_holiday_asc <- sqldf("
select
Store
, not_holiday_sales
, public_holiday_sales
, easter_holiday_sales
, christmas_holiday_sales
, case when not_holiday_sales = public_diff then 0 else public_diff end +
case when not_holiday_sales = easter_diff then 0 else easter_diff end +
case when not_holiday_sales = christmas_diff then 0 else christmas_diff end as diff
, case
when public_diff != 0 then 'a'
when easter_diff != 0 then 'b'
when christmas_diff != 0 then 'c'
else 'error' end as category
from (
select
Store
, sum(case when StateHoliday = 0 then tot_sales end) as not_holiday_sales
, sum(case when StateHoliday = 'a' then tot_sales end) as public_holiday_sales
, sum(case when StateHoliday = 'b' then tot_sales end) as easter_holiday_sales
, sum(case when StateHoliday = 'c' then tot_sales end) as christmas_holiday_sales
, sum(case when StateHoliday = 0 then tot_sales end) - sum(case when StateHoliday = 'a' then tot_sales end) as public_diff
, sum(case when StateHoliday = 0 then tot_sales end) - sum(case when StateHoliday = 'b' then tot_sales end) as easter_diff
, sum(case when StateHoliday = 0 then tot_sales end) - sum(case when StateHoliday = 'c' then tot_sales end) as christmas_diff
from
state_holiday
group by 1
)
order by diff; ") %>% head(10)
summary(state_holiday_asc)
## Store not_holiday_sales public_holiday_sales
## Min. : 1.00 Min. :3126578 Min. :0
## 1st Qu.: 5.00 1st Qu.:4304095 1st Qu.:0
## Median : 9.50 Median :4738852 Median :0
## Mean : 8.50 Mean :5016646 Mean :0
## 3rd Qu.:11.75 3rd Qu.:5814749 3rd Qu.:0
## Max. :14.00 Max. :7556507 Max. :0
## easter_holiday_sales christmas_holiday_sales diff
## Min. :0 Min. :0 Min. :0
## 1st Qu.:0 1st Qu.:0 1st Qu.:0
## Median :0 Median :0 Median :0
## Mean :0 Mean :0 Mean :0
## 3rd Qu.:0 3rd Qu.:0 3rd Qu.:0
## Max. :0 Max. :0 Max. :0
## category
## Length:10
## Class :character
## Mode :character
##
##
##
| Store | not_holiday_sales | public_holiday_sales | easter_holiday_sales | christmas_holiday_sales | diff | category |
|---|---|---|---|---|---|---|
| 1 | 3716854 | 0 | 0 | 0 | 0 | a |
| 3 | 5408261 | 0 | 0 | 0 | 0 | a |
| 4 | 7556507 | 0 | 0 | 0 | 0 | a |
| 8 | 4342857 | 0 | 0 | 0 | 0 | a |
| 9 | 5112061 | 0 | 0 | 0 | 0 | a |
| 10 | 4365642 | 0 | 0 | 0 | 0 | a |
| 11 | 6296286 | 0 | 0 | 0 | 0 | a |
| 12 | 5950245 | 0 | 0 | 0 | 0 | a |
| 13 | 3126578 | 0 | 0 | 0 | 0 | a |
| 14 | 4291174 | 0 | 0 | 0 | 0 | a |
hist(train$Sales)
sales_summary <- sqldf("
select
round(avg(Sales), 2) as open_sales
, sum(case when StateHoliday = 0 and SchoolHoliday = 0 then Sales else 0 end) / sum(case when StateHoliday = 0 and SchoolHoliday = 0 then 1 else 0 end) as open_not_holiday_sales
, sum(case when StateHoliday != 0 and SchoolHoliday != 0 then Sales else 0 end) / sum(case when StateHoliday != 0 and SchoolHoliday != 0 then 1 else 0 end) as open_holiday_sales
, sum(case when StateHoliday = 0 and SchoolHoliday = 1 then sales else 0 end) / sum(case when StateHoliday = 0 and SchoolHoliday = 1 then 1 else 0 end) as open_school_holiday_sales
, sum(case when StateHoliday != 0 and SchoolHoliday = 0 then sales else 0 end) / sum(case when StateHoliday != 0 and SchoolHoliday = 0 then 1 else 0 end) as open_state_holiday_sales
from train
where Open != 0
")
sales <- sqldf("
select
Store
, round(avg(Sales), 2) as open_sales
, sum(case when StateHoliday = 0 and SchoolHoliday = 0 then Sales else 0 end) / sum(case when StateHoliday = 0 and SchoolHoliday = 0 then 1 else 0 end) as open_not_holiday_sales
, sum(case when StateHoliday != 0 and SchoolHoliday != 0 then Sales else 0 end) / sum(case when StateHoliday != 0 and SchoolHoliday != 0 then 1 else 0 end) as open_holiday_sales
, sum(case when StateHoliday = 0 and SchoolHoliday = 1 then sales else 0 end) / sum(case when StateHoliday = 0 and SchoolHoliday = 1 then 1 else 0 end) as open_school_holiday_sales
, sum(case when StateHoliday != 0 and SchoolHoliday = 0 then sales else 0 end) / sum(case when StateHoliday != 0 and SchoolHoliday = 0 then 1 else 0 end) as open_state_holiday_sales
from train
where Open != 0
group by 1
")
summary(sales)
## Store open_sales open_not_holiday_sales
## Min. : 1.0 Min. : 2704 Min. : 2625
## 1st Qu.: 279.5 1st Qu.: 5322 1st Qu.: 5256
## Median : 558.0 Median : 6590 Median : 6531
## Mean : 558.0 Mean : 6934 Mean : 6879
## 3rd Qu.: 836.5 3rd Qu.: 7964 3rd Qu.: 7886
## Max. :1115.0 Max. :21757 Max. :21505
##
## open_holiday_sales open_school_holiday_sales open_state_holiday_sales
## Min. : 1121 Min. : 2990 Min. : 1278
## 1st Qu.: 4515 1st Qu.: 5517 1st Qu.: 4295
## Median : 6154 Median : 6837 Median : 7029
## Mean : 7049 Mean : 7162 Mean : 7146
## 3rd Qu.: 8934 3rd Qu.: 8211 3rd Qu.: 8592
## Max. :28285 Max. :22747 Max. :34956
## NA's :1054 NA's :984
boxplot(sales)
hist(sales$open_sales, 100,
main = "Mean sales per store when store was not closed")
hist(sales$open_holiday_sales, 100,
main = "Mean sales per store when store was not closed and holiday")
hist(train$Customers)
customers_summary <- sqldf("
select
round(avg(customers), 2) as customers
, sum(case when StateHoliday = 0 and SchoolHoliday = 0 then customers else 0 end) / sum(case when StateHoliday = 0 and SchoolHoliday = 0 then 1 else 0 end) as open_not_holiday_customers
, sum(case when StateHoliday != 0 and SchoolHoliday != 0 then customers else 0 end) / sum(case when StateHoliday != 0 and SchoolHoliday != 0 then 1 else 0 end) as open_holiday_customers
, sum(case when StateHoliday = 0 and SchoolHoliday = 1 then customers else 0 end) / sum(case when StateHoliday = 0 and SchoolHoliday = 1 then 1 else 0 end) as open_school_holiday_customers
, sum(case when StateHoliday != 0 and SchoolHoliday = 0 then customers else 0 end) / sum(case when StateHoliday != 0 and SchoolHoliday = 0 then 1 else 0 end) as open_state_holiday_customers
from
train
where Open != 0
")
customers <- sqldf("
select
Store
, round(avg(customers), 2) as customers
, sum(case when StateHoliday = 0 and SchoolHoliday = 0 then customers else 0 end) / sum(case when StateHoliday = 0 and SchoolHoliday = 0 then 1 else 0 end) as open_not_holiday_customers
, sum(case when StateHoliday != 0 and SchoolHoliday != 0 then customers else 0 end) / sum(case when StateHoliday != 0 and SchoolHoliday != 0 then 1 else 0 end) as open_holiday_customers
, sum(case when StateHoliday = 0 and SchoolHoliday = 1 then customers else 0 end) / sum(case when StateHoliday = 0 and SchoolHoliday = 1 then 1 else 0 end) as open_school_holiday_customers
, sum(case when StateHoliday != 0 and SchoolHoliday = 0 then customers else 0 end) / sum(case when StateHoliday != 0 and SchoolHoliday = 0 then 1 else 0 end) as open_state_holiday_customers
from
train
where Open != 0
group by 1
")
summary(customers)
## Store customers open_not_holiday_customers
## Min. : 1.0 Min. : 240.2 Min. : 235.0
## 1st Qu.: 279.5 1st Qu.: 541.5 1st Qu.: 537.5
## Median : 558.0 Median : 678.7 Median : 673.0
## Mean : 558.0 Mean : 754.5 Mean : 749.8
## 3rd Qu.: 836.5 3rd Qu.: 866.2 3rd Qu.: 859.0
## Max. :1115.0 Max. :3403.5 Max. :3408.0
##
## open_holiday_customers open_school_holiday_customers
## Min. : 135 Min. : 258.0
## 1st Qu.: 481 1st Qu.: 551.5
## Median : 689 Median : 692.0
## Mean : 953 Mean : 771.9
## 3rd Qu.:1225 3rd Qu.: 886.5
## Max. :4073 Max. :3406.0
## NA's :1054
## open_state_holiday_customers
## Min. : 163.0
## 1st Qu.: 538.5
## Median : 734.0
## Mean : 906.0
## 3rd Qu.: 982.0
## Max. :5154.0
## NA's :984
boxplot(customers)
hist(customers$customers, 100,
main = "Mean customers per store when store was not closed")
ggplot(train[train$Sales != 0 & train$Customers != 0],
aes(x = log(Customers), y = log(Sales))) +
geom_point(alpha = 0.2) + geom_smooth()
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
ggplot(train[train$Sales != 0 & train$Customers != 0],
aes(x = factor(Promo), y = Sales)) +
geom_jitter(alpha = 0.1) +
geom_boxplot(color = "yellow", outlier.colour = NA, fill = NA)
ggplot(train[train$Sales != 0 & train$Customers != 0],
aes(x = factor(Promo), y = Customers)) +
geom_jitter(alpha = 0.1) +
geom_boxplot(color = "yellow", outlier.colour = NA, fill = NA)
promo_type_sales <- sqldf("
with base as (
select
a.*
, abs(promo_per_sales - not_promo_per_sales) as diff
from (
select
sum(case when promo == 0 then sales else 0 end) as not_promo_sales
, sum(case when promo == 0 then customers else 0 end) as not_promo_customers
, round(100. * (sum(case when promo == 0 then sales else 0 end)) / sum(case when promo == 0 then customers else 0 end)) / 100 as not_promo_per_sales
, sum(case when promo != 0 then sales else 0 end) as promo_sales
, sum(case when promo != 0 then customers else 0 end) as promo_customers
, round(100. * (sum(case when promo != 0 then sales else 0 end)) / sum(case when promo != 0 then customers else 0 end)) / 100 as promo_per_sales
from train
where Sales != 0
) as a
)
select
*
from base
")
promo_type_sales
## not_promo_sales not_promo_customers not_promo_per_sales promo_sales
## 1 2771974337 325777807 8.51 3101206286
## promo_customers promo_per_sales diff
## 1 318263940 9.74 1.23
예상대로 판매량은 고객 수와 밀접한 관련이 있음. customer의 박스 플롯이 sales의 박스 플롯보다 조금 더 겹쳐보이는 것 같은데, 이것은 광고가 주로 더 많은 고객을 끌어 들이는 것이 아니라 고객이 더 많은 것을 소비하도록 한다는 것을 의미함. 고객 당 평균 지출 금액은 약 1.23유로 더 높음.
table(ifelse(train$Open == 1, "Opened", "Closed"),
ifelse(train$Sales > 0, "Sales > 0", "Sales = 0"))
##
## Sales = 0 Sales > 0
## Closed 172817 0
## Opened 54 844338
매장이 열지 않았을 경우 매출이 없지만, 문을 열어도 매출이 없는 매장이 있다. 이러한 관측치는 데이터의 오류일 수 있다.
ggplot(train[Sales != 0],
aes(x = factor(DayOfWeek), y = Sales)) +
geom_jitter(alpha = 0.1) +
geom_boxplot(color = "yellow", outlier.colour = NA, fill = NA)
# There is a connection between store type and type of assortment
table(data.frame(Assortment = store$Assortment, StoreType = store$StoreType))
## StoreType
## Assortment a b c d
## a 381 7 77 128
## b 0 9 0 0
## c 221 1 71 220
task_1 <- sqldf("
select
assortment
, sum(case when storetype = 'a' then 1 else 0 end) as a
, sum(case when storetype = 'b' then 1 else 0 end) as b
, sum(case when storetype = 'c' then 1 else 0 end) as c
, sum(case when storetype = 'd' then 1 else 0 end) as d
from store
group by 1
")
task_2 <- sqldf("
with base as (
select
a.*
from (
select
assortment
, sum(case when storetype = 'a' then 1 else 0 end) as a
, sum(case when storetype = 'b' then 1 else 0 end) as b
, sum(case when storetype = 'c' then 1 else 0 end) as c
, sum(case when storetype = 'd' then 1 else 0 end) as d
, count(*) as total
from store
group by 1 ) as a
)
select *
from base
union all
select
'total'
, sum(a)
, sum(b)
, sum(c)
, sum(d)
, sum(total)
from base;
")
task_3 <- sqldf("
with base as (
select
aa.*
, a + b + c + d as total
from (
select
assortment
, round( 100. * a / total, 2) as a
, round( 100. * b / total, 2) as b
, round( 100. * c / total, 2) as c
, round( 100. * d / total, 2) as d
from (
select
assortment
, sum(case when storetype = 'a' then 1 else 0 end) as a
, sum(case when storetype = 'b' then 1 else 0 end) as b
, sum(case when storetype = 'c' then 1 else 0 end) as c
, sum(case when storetype = 'd' then 1 else 0 end) as d
, count(*) as total
from store
group by 1 )
) as aa
)
select *
from base
union all
select
'total'
, round(avg(a), 2)
, round(avg(b), 2)
, round(avg(c), 2)
, round(avg(d), 2)
, round(avg(total), 2)
from base;
")
task_4 <- sqldf("
select
assortment
, storetype
, count(*) as as_total
from store
group by 1, 2
")
task_5 <- sqldf("
select
a.assortment
, a.storetype
, round(100. * as_total / a_total, 2) as prob
from (
select
assortment
, storetype
, count(*) as as_total
from store
group by 1, 2
) as a, (
select
assortment
, count(*) as a_total
from store
group by 1
) as b
where a.assortment = b.assortment
")
| assortment | a | b | c | d | total |
|---|---|---|---|---|---|
| a | 381 | 7 | 77 | 128 | 593 |
| b | 0 | 9 | 0 | 0 | 9 |
| c | 221 | 1 | 71 | 220 | 513 |
| total | 602 | 17 | 148 | 348 | 1115 |
| assortment | a | b | c | d | total |
|---|---|---|---|---|---|
| a | 64.25 | 1.18 | 12.98 | 21.59 | 100.00 |
| b | 0.00 | 100.00 | 0.00 | 0.00 | 100.00 |
| c | 43.08 | 0.19 | 13.84 | 42.88 | 99.99 |
| total | 35.78 | 33.79 | 8.94 | 21.49 | 100.00 |
ggplot(data = task_4, aes(x = Assortment, y = as_total, fill = StoreType)) +
geom_bar(stat = "identity", alpha = 1)
ggplot(data = task_5, aes(x = assortment, y = prob, fill = storetype)) +
geom_bar(stat = "identity", alpha = 1)
# Merge store and sales
#sales_store <- merge(sales, store, by = "Store")
#temp <- sales
#temp$year <- format(temp$Date, "%Y")
#temp$month <- format(temp$Date, "%m")
#temp[, StoreMean := mean(Sales), by = Store]
#temp <- temp[, .(MonthlySalesMean = mean(Sales / (StoreMean)) * 100),
# by = .(year, month)]
#temp <- as.data.frame(temp)
#SalesTS <- ts(temp$MonthlySalesMean, start=2013, frequency=12)
#col = rainbow(3)
#seasonplot(SalesTS, col=col, year.labels.left = TRUE, pch=19, las=1)