This part of the report is directed to product managers throughout the client’s company. The goal is to give them the information they need to act on the specific questions they posed. Plan your communication accordingly.
For this report, make sure to use all of the data that was provided to you for the month. If you notice any issues with the data (Part 3), report them to the engineering team to resolve.
During the first week of the month, what were the 10 most viewed products? Show the results in a table with the product’s identifier, category, and number of views.
## product_id category count
## 1 gjcsELW14TF4dxot shirt 214
## 2 wvMbusrgZjBr04Ho pants 213
## 3 CIZkDqF9GheJ1mhT coat 211
## 4 Nx6fZQAusZtV1JVN shirt 209
## 5 UDawXFXMc4Sn5RB9 shirt 203
## 6 T2IDb82DS8ecGoJc shirt 198
## 7 I9vhjbvj5rD4A2Wk pants 197
## 8 XTmI0uGhAVgmdypt shirt 196
## 9 QxYq5yX0hNbPy5QH shirt 195
## 10 yyEeOsvUazwVtH6B shirt 195
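The code behind this table is not shown in the report, so the following is only a minimal base R sketch of one way to produce it. The tables `views` and `products`, the `time` column, the toy values, and the reading of "first week" as days 1 to 7 of the month are all assumptions made for illustration.

```r
# Hypothetical toy data; table names, column names, and values are assumptions.
views <- data.frame(
  product_id = c("p1", "p2", "p1", "p3", "p1", "p2"),
  time = as.POSIXct(c("2020-01-01 09:00:00", "2020-01-02 10:30:00",
                      "2020-01-03 11:00:00", "2020-01-05 12:00:00",
                      "2020-01-09 08:00:00", "2020-01-07 23:59:59"), tz = "UTC"),
  stringsAsFactors = FALSE
)
products <- data.frame(
  product_id = c("p1", "p2", "p3"),
  category = c("shirt", "pants", "coat"),
  stringsAsFactors = FALSE
)

# Keep views from the first seven days of the month (assumed definition of "first week").
month_start <- as.POSIXct("2020-01-01 00:00:00", tz = "UTC")
week_end <- month_start + 7 * 24 * 60 * 60
first_week <- views[views$time >= month_start & views$time < week_end, ]

# Count views per product, attach the category, and keep the 10 largest counts.
counts <- as.data.frame(table(product_id = first_week$product_id),
                        responseName = "count", stringsAsFactors = FALSE)
top_views <- merge(counts, products, by = "product_id")
top_views <- top_views[order(-top_views$count), c("product_id", "category", "count")]
top_views <- head(top_views, 10)
top_views
```

Note that `head(top_views, 10)` cuts ties at the tenth row arbitrarily; if tied products should all be shown, filtering on the tenth-largest count would be an alternative.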
During the whole month, what were the 10 most viewed products for each category? Show the results in separate tables by category, including only the product’s identifier and the number of views.
## # A tibble: 10 x 2
## product_id count
## <chr> <int>
## 1 T2IDb82DS8ecGoJc 748
## 2 GgBnL6Gx07tK8VrF 714
## 3 oaTfgLKNy3efmBL6 685
## 4 U8WJMwXa3Sb2vRR0 681
## 5 fprdTtl7TGcmFFUh 678
## 6 v6rCnGI4bllVLA3p 678
## 7 dhS5UCHe5VF7lYBL 677
## 8 9HmaGQzI6b1ayJcL 674
## 9 4DQewZYVwU9H4Acv 670
## 10 GFdFcxNxmh9isgbY 666
## # A tibble: 10 x 2
## product_id count
## <chr> <int>
## 1 WNIGeNa97YnOpAdw 661
## 2 Zh8mTIQ1dBtUKlwU 650
## 3 hUuAPXDMWPjFK8XE 645
## 4 Cu5V1RJBS2v1QO2c 637
## 5 m8sSxzMEBzgSZHR5 637
## 6 V9poxltYd3UFTOmP 636
## 7 4pPqOvA5nnWA85WG 633
## 8 mQLZIH7cioPHdpy4 633
## 9 KjCVeq5KL2ihjuPr 630
## 10 nOJryj3XEq4ZL9OW 629
## # A tibble: 10 x 2
## product_id count
## <chr> <int>
## 1 EqKAFYosFdW1Pifo 695
## 2 sKNpoakq96XF7dv6 681
## 3 gevXa0ZpHKbiD2qK 661
## 4 EIYARupNVVHG93XY 653
## 5 Udjqm2TqXATY1AXZ 652
## 6 A8miBlolQ84S98qH 643
## 7 lGeofqOvFo0u3ZcJ 642
## 8 xTqbcfoBuKC02GGt 642
## 9 RBjQIkr1qBnpOL6z 641
## 10 rfjMEqCigHn0dBpz 641
## # A tibble: 10 x 2
## product_id count
## <chr> <int>
## 1 LQv262onJ6CMQV3V 701
## 2 uXClWK2bruOsmV1u 700
## 3 57XAlBm7ISzPKLMo 693
## 4 2TNPbNJ2D2LQRLUX 675
## 5 8CuIuVvu9tWZotSE 663
## 6 9RylmoAfWeianhHM 662
## 7 GnKoiyttZFZAzmQn 653
## 8 1u8pkz2FiAKtkZJF 648
## 9 CuGQAYoSaQWiyAkO 645
## 10 C9QithXAFKZ50gD6 644
## # A tibble: 10 x 2
## product_id count
## <chr> <int>
## 1 hTQCSZpdaNEZMvhY 666
## 2 bJkcF4WYOfvws3qd 644
## 3 BOMyBJR1eqUShjDM 643
## 4 uBcSuyl4Qnl5sx8f 627
## 5 tBbGJFMpvB7Jw2yH 624
## 6 pY41R0877u9G95z4 623
## 7 W7AWmeOCRO8O7zqG 620
## 8 WcJzZvenCnI439HI 616
## 9 nX4rqev3NEaPwvqL 614
## 10 a3NEL240yJrFzqK8 608
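As a rough illustration of how per-category tables like these could be built, here is a base R sketch. The `views` and `products` tables and their columns are hypothetical stand-ins, not the report's actual data.

```r
# Hypothetical toy data; names and values are assumptions.
views <- data.frame(
  product_id = c("p1", "p1", "p2", "p3", "p3", "p3", "p4"),
  stringsAsFactors = FALSE
)
products <- data.frame(
  product_id = c("p1", "p2", "p3", "p4"),
  category = c("shirt", "shirt", "coat", "coat"),
  stringsAsFactors = FALSE
)

# Count views per product over the whole month and attach the category.
counts <- as.data.frame(table(product_id = views$product_id),
                        responseName = "count", stringsAsFactors = FALSE)
counts <- merge(counts, products, by = "product_id")

# One table per category, each sorted by views and cut to at most 10 rows.
top_by_category <- lapply(split(counts, counts$category), function(d) {
  d <- d[order(-d$count), c("product_id", "count")]
  head(d, 10)
})
top_by_category
```

Because the list is named by category, printing it puts the category name above each table, which makes it clear which table belongs to which category.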
What was the total revenue for each category of product during the month? Show the results in a single table sorted in decreasing order.
## # A tibble: 5 x 2
## category revenue
## <chr> <dbl>
## 1 shirt 6956670.
## 2 coat 6028708.
## 3 shoes 3515731.
## 4 pants 823091.
## 5 hat 155830.
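A category revenue table of this shape could be computed along the following lines. This is a sketch only: the `transactions` and `products` tables, the `price` and `quantity` columns, and the definition of revenue as price times quantity are assumptions, since the report does not show how revenue was derived.

```r
# Hypothetical toy data; names, values, and the revenue definition are assumptions.
transactions <- data.frame(
  product_id = c("p1", "p2", "p1", "p3"),
  price = c(20, 50, 20, 120),
  quantity = c(1, 2, 3, 1),
  stringsAsFactors = FALSE
)
products <- data.frame(
  product_id = c("p1", "p2", "p3"),
  category = c("shirt", "pants", "coat"),
  stringsAsFactors = FALSE
)

# Revenue per transaction line, assumed here to be price times quantity.
sales <- merge(transactions, products, by = "product_id")
sales$revenue <- sales$price * sales$quantity

# Sum by category and sort in decreasing order of revenue.
revenue_by_category <- aggregate(revenue ~ category, data = sales, FUN = sum)
revenue_by_category <- revenue_by_category[order(-revenue_by_category$revenue), ]
revenue_by_category
```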
Among customers with at least one transaction, show the average, median, and standard deviation of the customers’ monthly spending on the site.
## Average monthly spending: $ 494.81
##
## Median monthly spending: $ 257.56
##
## Standard deviation monthly spending: $ 799.08
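The three statistics are taken over per-customer monthly totals, not over individual transactions. A minimal base R sketch of that two-step calculation, using a hypothetical `transactions` table with an assumed `revenue` column:

```r
# Hypothetical toy data; names and values are assumptions.
transactions <- data.frame(
  customer_id = c("a", "a", "b", "c"),
  revenue = c(100, 50, 30, 300),
  stringsAsFactors = FALSE
)

# Step 1: total spending per customer. Customers without a transaction
# never appear in this table, so they are excluded automatically.
spend <- aggregate(revenue ~ customer_id, data = transactions, FUN = sum)

# Step 2: summary statistics across customers.
summary_stats <- c(
  mean = mean(spend$revenue),
  median = median(spend$revenue),
  sd = sd(spend$revenue)
)
round(summary_stats, 2)
```

In the reported figures the mean ($494.81) is almost twice the median ($257.56), which suggests a right-skewed distribution in which a small number of high spenders pull the average up.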
What is the percentage distribution of spending by gender? Show the amount of revenue and the percentage.
## # A tibble: 2 x 3
## gender spend percentage
## <chr> <dbl> <dbl>
## 1 F 8785185. 0.503
## 2 M 8694846. 0.497
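The `percentage` column above is expressed as a proportion, so 0.503 corresponds to 50.3% of revenue. The sketch below shows one way to compute the split and report it on a 0 to 100 scale; the `customer_spend` table and its values are hypothetical.

```r
# Hypothetical toy data; names and values are assumptions.
customer_spend <- data.frame(
  customer_id = c("a", "b", "c", "d"),
  gender = c("F", "M", "F", "M"),
  spend = c(120, 50, 180, 150),
  stringsAsFactors = FALSE
)

# Total revenue by gender, then each group's share of the overall total.
by_gender <- aggregate(spend ~ gender, data = customer_spend, FUN = sum)
by_gender$percentage <- round(100 * by_gender$spend / sum(by_gender$spend), 1)
by_gender
```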
Using linear regression, what is the effect of an extra ten thousand dollars of income on monthly spending for a customer while adjusting for age, gender, and region?
## $ 10.58
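The reported value is the fitted income coefficient scaled to a $10,000 change, holding age, gender, and region fixed. The sketch below illustrates that calculation on simulated data; the `customers` table, its columns, the region labels, and the simulated slope are assumptions and do not reproduce the report's data.

```r
# Simulated data for illustration only; all names and values are assumptions.
set.seed(1)
n <- 200
customers <- data.frame(
  income = round(runif(n, 20000, 150000)),
  age = sample(18:80, n, replace = TRUE),
  gender = sample(c("F", "M"), n, replace = TRUE),
  region = sample(c("Northeast", "South", "West"), n, replace = TRUE),
  stringsAsFactors = FALSE
)
# Spending is simulated with a known income slope of 0.001 dollars per dollar.
customers$spend <- 100 + 0.001 * customers$income + 2 * customers$age +
  rnorm(n, sd = 20)

# Regress spending on income while adjusting for age, gender, and region.
fit <- lm(spend ~ income + age + gender + region, data = customers)

# The income coefficient is per dollar, so multiply by 10,000.
effect_per_10k <- unname(10000 * coef(fit)["income"])
effect_per_10k
```

Read this way, the reported $10.58 means that two otherwise similar customers whose incomes differ by $10,000 are expected to differ by about $10.58 in monthly spending.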
Among customers who viewed at least 1 product, how many had at least one purchase during the month? Show the total number and the percentage of users with a view.
## Criteria Number Percentage
## 1: Made at least 1 purchase 35327 52.5
## 2: Total customers who viewed at least 1 product 67345 100.0
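The percentage is the number of purchasing viewers divided by the number of viewers (35327 / 67345, about 52.5%). A base R sketch of the same calculation on hypothetical `views` and `transactions` tables:

```r
# Hypothetical toy data; names and values are assumptions.
views <- data.frame(customer_id = c("a", "a", "b", "c", "d"),
                    stringsAsFactors = FALSE)
transactions <- data.frame(customer_id = c("a", "c", "e"),
                           stringsAsFactors = FALSE)

# Customers with at least one view, and the subset who also purchased.
viewers <- unique(views$customer_id)
buyers <- intersect(viewers, unique(transactions$customer_id))

conversion <- data.frame(
  Criteria = c("Made at least 1 purchase",
               "Total customers who viewed at least 1 product"),
  Number = c(length(buyers), length(viewers)),
  Percentage = round(100 * c(length(buyers), length(viewers)) / length(viewers), 1),
  stringsAsFactors = FALSE
)
conversion
```

Using `intersect` restricts the numerator to customers who also appear in `views`, so a purchaser with no recorded view (customer "e" in the toy data) is not counted.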
Now let’s look at the viewing habits in different age groups, including 18-34, 35-49, 50-64, and 65+. Within each group, what were the mean, median, and standard deviation for the number of unique products viewed per customer?
## For age group 18-34, the mean number of unique products viewed is 89.2, the median is 41, and the standard deviation is 127.
## For age group 35-49, the mean number of unique products viewed is 94.5, the median is 42, and the standard deviation is 140.5.
## For age group 50-64, the mean number of unique products viewed is 87.3, the median is 41, and the standard deviation is 122.2.
## For age group 65+, the mean number of unique products viewed is 68.3, the median is 40, and the standard deviation is 82.9.
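One way to form these age groups is `cut()` with left-closed intervals, so that 34 falls in 18-34 and 35 falls in 35-49. The sketch below is an illustration on hypothetical `customers` and `views` tables, not the code used for the report.

```r
# Hypothetical toy data; names and values are assumptions.
customers <- data.frame(
  customer_id = c("a", "b", "c", "d", "e", "f", "g", "h"),
  age = c(22, 34, 35, 49, 50, 64, 65, 80),
  stringsAsFactors = FALSE
)
views <- data.frame(
  customer_id = rep(c("a", "b", "c", "d", "e", "f", "g", "h"),
                    times = c(3, 1, 2, 2, 1, 3, 1, 1)),
  product_id = c("p1", "p1", "p2", "p1", "p1", "p2", "p2", "p3",
                 "p1", "p1", "p2", "p3", "p2", "p3"),
  stringsAsFactors = FALSE
)

# Number of distinct products viewed by each customer.
n_products <- tapply(views$product_id, views$customer_id,
                     function(x) length(unique(x)))
unique_views <- data.frame(customer_id = names(n_products),
                           n_products = as.vector(n_products),
                           stringsAsFactors = FALSE)
unique_views <- merge(unique_views, customers, by = "customer_id")

# Left-closed bins: [18,35), [35,50), [50,65), [65,Inf).
unique_views$age_group <- cut(unique_views$age,
                              breaks = c(18, 35, 50, 65, Inf), right = FALSE,
                              labels = c("18-34", "35-49", "50-64", "65+"))

# Mean, median, and standard deviation within each age group.
age_stats <- do.call(rbind, lapply(
  split(unique_views$n_products, unique_views$age_group),
  function(x) data.frame(mean = mean(x), median = median(x), sd = sd(x))
))
age_stats
```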
What is the correlation between a user’s total page views and total spending? For customers without a transaction, include their spending as zero.
## [1] 0.8156881
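Including non-purchasers with zero spending requires a left join from views to spending, followed by replacing the missing values with 0. A minimal base R sketch, with hypothetical `page_views` and `spend` tables:

```r
# Hypothetical toy data; names and values are assumptions.
page_views <- data.frame(customer_id = c("a", "b", "c", "d"),
                         views = c(10, 20, 30, 40),
                         stringsAsFactors = FALSE)
spend <- data.frame(customer_id = c("b", "d"),
                    spend = c(100, 300),
                    stringsAsFactors = FALSE)

# all.x = TRUE keeps customers who have views but no transactions.
combined <- merge(page_views, spend, by = "customer_id", all.x = TRUE)

# Customers without a transaction get a spending of zero.
combined$spend[is.na(combined$spend)] <- 0

# Pearson correlation between total page views and total spending.
r <- cor(combined$views, combined$spend)
r
```

Skipping the zero fill would either drop non-purchasers or return `NA`, depending on how missing values are handled, so the result would no longer answer the question as posed.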
Which customer purchased the largest number of coats? In the event of a tie, include all of the users who reached this value. Show their identifiers and total volume.
## # A tibble: 1 x 2
## customer_id coats
## <chr> <int>
## 1 cnwsiHuMZvd1 27
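To return every customer who reaches the maximum, the result can be filtered on the maximum value instead of taking the first row. The sketch below assumes hypothetical `transactions` and `products` tables and a `quantity` column that records units purchased.

```r
# Hypothetical toy data; names and values are assumptions.
transactions <- data.frame(
  customer_id = c("a", "a", "b", "c"),
  product_id = c("p1", "p1", "p1", "p2"),
  quantity = c(2, 1, 3, 9),
  stringsAsFactors = FALSE
)
products <- data.frame(
  product_id = c("p1", "p2"),
  category = c("coat", "shirt"),
  stringsAsFactors = FALSE
)

# Keep coat purchases only, then total the units per customer.
coat_sales <- merge(transactions, products, by = "product_id")
coat_sales <- coat_sales[coat_sales$category == "coat", ]
coats <- aggregate(quantity ~ customer_id, data = coat_sales, FUN = sum)
names(coats)[2] <- "coats"

# Filtering on the maximum keeps all tied customers.
top_buyers <- coats[coats$coats == max(coats$coats), ]
top_buyers
```

In the toy data, customers "a" and "b" both reach 3 coats, so both rows are returned.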