Recommendation Engines - Data Analysis Assignment 3

Question 1

library(arules)
library(tidyverse)
library(recommenderlab)
transactions <- read.transactions("retail_transactions_1.csv", sep = ",")
summary(transactions)
transactions as itemMatrix in sparse format with
 10000 rows (elements/itemsets/transactions) and
 5497 columns (items) and a density of 0.00277837 

most frequent items:
WHITE HANGING HEART T-LIGHT HOLDER           REGENCY CAKESTAND 3 TIER 
                               838                                775 
           JUMBO BAG RED RETROSPOT                      PARTY BUNTING 
                               671                                551 
     ASSORTED COLOUR BIRD ORNAMENT                            (Other) 
                               543                             149349 

element (itemset/transaction) length distribution:
sizes
   1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16 
1622  684  518  406  399  352  304  313  291  265  268  234  236  221  249  259 
  17   18   19   20   21   22   23   24   25   26   27   28   29   30   31   32 
 220  201  223  185  176  138  145  115  110  114   96  101  109  100   74   73 
  33   34   35   36   37   38   39   40   41   42   43   44   45   46   47   48 
  68   66   62   40   59   42   50   43   43   55   36   28   31   28   29   23 
  49   50   51   52   53   54   55   56   57   58   59   60   61   62   63   64 
  29   21   34   13   15   25   17   20   13   16   20   14   11   11   13   14 
  65   66   67   68   69   70   71   72   73   74   75   76   77   78   79   80 
   9   15   12    8    2   10    4    7    6    9    1    7    6    3    3    4 
  81   82   83   84   85   86   87   88   89   90   91   92   93   94   95   96 
   7    6    3    2    4    3    5    5    1    2    1    4    2    1    1    2 
  97   98   99  101  103  105  107  108  109  110  111  113  114  116  117  118 
   1    3    1    2    2    1    2    2    3    1    1    2    1    1    3    2 
 120  121  122  123  125  126  135  143  147  149  154  157  158  168  171  177 
   1    1    1    1    2    3    1    1    1    1    4    1    1    1    1    1 
 202  204  249  320  428 
   1    1    1    1    1 

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   1.00    3.00   10.00   15.27   21.00  428.00 

includes extended item information - examples:
                      labels
1                   1 HANGER
2     10 COLOUR SPACEBOY PEN
3 12 COLOURED PARTY BALLOONS

10000 transactions

5497 possible items

Density = 0.00277837 -> about 0.28% of cells are non-zero, so the matrix is very sparse

10000 * 5497 = 54,970,000 total cells

54,970,000 x 0.00277837 ≈ 152,727 cells contain non-zero values

Max Items = 428 items

Mean Items = 15.27 Items
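
The cell-count arithmetic can be double-checked in a few lines of base R. This is only a sketch: it hard-codes the figures printed by summary(transactions), and the variable names are illustrative rather than part of the original analysis.

```r
# Figures copied from the summary(transactions) output
n_transactions <- 10000
n_items        <- 5497
density        <- 0.00277837

total_cells   <- n_transactions * n_items   # 54,970,000 cells in the matrix
nonzero_cells <- total_cells * density      # number of item occurrences

# Cross-check: the "most frequent items" counts, including "(Other)",
# should add up to the same number of non-zero cells.
item_counts <- c(838, 775, 671, 551, 543, 149349)

round(nonzero_cells)                 # 152727
sum(item_counts)                     # 152727
sum(item_counts) / n_transactions    # mean items per transaction, about 15.27
```

The two routes agree, and dividing by the number of transactions recovers the reported mean basket size.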

itemFrequencyPlot(transactions, topN = 20, horiz = TRUE)

retail_rules <- apriori(transactions, parameter = list(support = 0.01, confidence = 0.5, minlen = 2))
Apriori

Parameter specification:
 confidence minval smax arem  aval originalSupport maxtime support minlen
        0.5    0.1    1 none FALSE            TRUE       5    0.01      2
 maxlen target  ext
     10  rules TRUE

Algorithmic control:
 filter tree heap memopt load sort verbose
    0.1 TRUE TRUE  FALSE TRUE    2    TRUE

Absolute minimum support count: 100 

set item appearances ...[0 item(s)] done [0.00s].
set transactions ...[5497 item(s), 10000 transaction(s)] done [0.14s].
sorting and recoding items ... [401 item(s)] done [0.00s].
creating transaction tree ... done [0.01s].
checking subsets of size 1 2 3 4 done [0.01s].
writing ... [90 rule(s)] done [0.00s].
creating S4 object  ... done [0.00s].

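The Apriori algorithm discovers 90 rules.

Support is the proportion of transactions that contain an itemset. With support = 0.01, the algorithm only considers itemsets that appear in at least 1% of transactions, which is 100 of the 10,000 transactions here (the "Absolute minimum support count: 100" line in the output).

Confidence is the proportion of transactions containing the left-hand side of a rule that also contain its right-hand side. With confidence = 0.5, a rule is only created if at least 50% of the transactions containing the left-hand-side items also contain the right-hand-side item.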
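
To make these measures concrete, the sketch below computes support, confidence and lift by hand in base R for a single rule, {bread} => {milk}. The five baskets are made up for illustration and are not taken from retail_transactions_1.csv; the helper function support() is likewise hypothetical and is not the arules implementation.

```r
# Five made-up baskets (hypothetical data, not from the retail file)
baskets <- list(
  c("bread", "milk"),
  c("bread", "butter"),
  c("bread", "milk", "butter"),
  c("milk"),
  c("bread", "milk")
)

# Support: share of baskets that contain every item in `items`
support <- function(items, baskets) {
  mean(vapply(baskets, function(b) all(items %in% b), logical(1)))
}

supp_lhs  <- support("bread", baskets)              # 4 of 5 baskets
supp_rhs  <- support("milk", baskets)               # 4 of 5 baskets
supp_rule <- support(c("bread", "milk"), baskets)   # 3 of 5 baskets

# Confidence: of the baskets with bread, what share also has milk?
confidence <- supp_rule / supp_lhs    # 0.75

# Lift: confidence relative to how common milk is overall
lift <- confidence / supp_rhs         # 0.9375
```

A lift below 1, as here, means the two items appear together slightly less often than independence would predict; the retail rules all have lift well above 1.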

summary(retail_rules)

58 rules with 2 items; 32 rules with 3 items

Min Lift = 6.452

Max Lift = 66.667

inspect(sort(retail_rules, by = "lift"))
     lhs                                      rhs                                   support confidence coverage      lift count
[1]  {SHED}                                => {KEY FOB}                              0.0100  1.0000000   0.0100 66.666667   100
[2]  {BACK DOOR}                           => {KEY FOB}                              0.0101  1.0000000   0.0101 66.666667   101
[3]  {KEY FOB}                             => {SHED}                                 0.0100  0.6666667   0.0150 66.666667   100
[4]  {KEY FOB}                             => {BACK DOOR}                            0.0101  0.6733333   0.0150 66.666667   101
[5]  {WOODEN STAR CHRISTMAS SCANDINAVIAN}  => {WOODEN HEART CHRISTMAS SCANDINAVIAN}  0.0114  0.7755102   0.0147 47.577313   114
[6]  {WOODEN HEART CHRISTMAS SCANDINAVIAN} => {WOODEN STAR CHRISTMAS SCANDINAVIAN}   0.0114  0.6993865   0.0163 47.577313   114
[7]  {PINK HAPPY BIRTHDAY BUNTING}         => {BLUE HAPPY BIRTHDAY BUNTING}          0.0101  0.6778523   0.0149 46.748438   101
[8]  {BLUE HAPPY BIRTHDAY BUNTING}         => {PINK HAPPY BIRTHDAY BUNTING}          0.0101  0.6965517   0.0145 46.748438   101
[9]  {GREEN REGENCY TEACUP AND SAUCER,                                                                                         
      REGENCY CAKESTAND 3 TIER}            => {PINK REGENCY TEACUP AND SAUCER}       0.0108  0.7105263   0.0152 30.494692   108
[10] {GREEN REGENCY TEACUP AND SAUCER,                                                                                         
      ROSES REGENCY TEACUP AND SAUCER}     => {PINK REGENCY TEACUP AND SAUCER}       0.0153  0.6860987   0.0223 29.446294   153
[11] {PINK REGENCY TEACUP AND SAUCER,                                                                                          
      ROSES REGENCY TEACUP AND SAUCER}     => {GREEN REGENCY TEACUP AND SAUCER}      0.0153  0.8793103   0.0174 29.408373   153
[12] {PINK REGENCY TEACUP AND SAUCER,                                                                                          
      REGENCY CAKESTAND 3 TIER}            => {GREEN REGENCY TEACUP AND SAUCER}      0.0108  0.8709677   0.0124 29.129356   108
[13] {HAND WARMER SCOTTY DOG DESIGN}       => {HAND WARMER OWL DESIGN}               0.0104  0.5621622   0.0185 27.556969   104
[14] {HAND WARMER OWL DESIGN}              => {HAND WARMER SCOTTY DOG DESIGN}        0.0104  0.5098039   0.0204 27.556969   104
[15] {PINK REGENCY TEACUP AND SAUCER}      => {GREEN REGENCY TEACUP AND SAUCER}      0.0188  0.8068670   0.0233 26.985517   188
[16] {GREEN REGENCY TEACUP AND SAUCER}     => {PINK REGENCY TEACUP AND SAUCER}       0.0188  0.6287625   0.0299 26.985517   188
[17] {REGENCY CAKESTAND 3 TIER,                                                                                                
      ROSES REGENCY TEACUP AND SAUCER}     => {PINK REGENCY TEACUP AND SAUCER}       0.0102  0.6107784   0.0167 26.213667   102
[18] {GARDENERS KNEELING PAD CUP OF TEA}   => {GARDENERS KNEELING PAD KEEP CALM}     0.0186  0.7440000   0.0250 25.220339   186
[19] {GARDENERS KNEELING PAD KEEP CALM}    => {GARDENERS KNEELING PAD CUP OF TEA}    0.0186  0.6305085   0.0295 25.220339   186
[20] {PINK REGENCY TEACUP AND SAUCER,                                                                                          
      REGENCY CAKESTAND 3 TIER}            => {ROSES REGENCY TEACUP AND SAUCER}      0.0102  0.8225806   0.0124 24.628163   102
[21] {SPACEBOY LUNCH BOX}                  => {DOLLY GIRL LUNCH BOX}                 0.0161  0.5812274   0.0277 24.524364   161
[22] {DOLLY GIRL LUNCH BOX}                => {SPACEBOY LUNCH BOX}                   0.0161  0.6793249   0.0237 24.524364   161
[23] {REGENCY CAKESTAND 3 TIER,                                                                                                
      ROSES REGENCY TEACUP AND SAUCER}     => {GREEN REGENCY TEACUP AND SAUCER}      0.0122  0.7305389   0.0167 24.432740   122
[24] {GREEN REGENCY TEACUP AND SAUCER,                                                                                         
      PINK REGENCY TEACUP AND SAUCER}      => {ROSES REGENCY TEACUP AND SAUCER}      0.0153  0.8138298   0.0188 24.366161   153
[25] {GREEN REGENCY TEACUP AND SAUCER,                                                                                         
      REGENCY CAKESTAND 3 TIER}            => {ROSES REGENCY TEACUP AND SAUCER}      0.0122  0.8026316   0.0152 24.030886   122
[26] {PINK REGENCY TEACUP AND SAUCER}      => {ROSES REGENCY TEACUP AND SAUCER}      0.0174  0.7467811   0.0233 22.358716   174
[27] {ROSES REGENCY TEACUP AND SAUCER}     => {PINK REGENCY TEACUP AND SAUCER}       0.0174  0.5209581   0.0334 22.358716   174
[28] {ROSES REGENCY TEACUP AND SAUCER}     => {GREEN REGENCY TEACUP AND SAUCER}      0.0223  0.6676647   0.0334 22.329922   223
[29] {GREEN REGENCY TEACUP AND SAUCER}     => {ROSES REGENCY TEACUP AND SAUCER}      0.0223  0.7458194   0.0299 22.329922   223
[30] {PLASTERS IN TIN CIRCUS PARADE}       => {PLASTERS IN TIN WOODLAND ANIMALS}     0.0110  0.5263158   0.0209 22.114109   110
[31] {CHARLOTTE BAG PINK POLKADOT}         => {RED RETROSPOT CHARLOTTE BAG}          0.0120  0.6315789   0.0190 22.083180   120
[32] {ROUND SNACK BOXES SET OF 4 FRUITS}   => {ROUND SNACK BOXES SET OF4 WOODLAND}   0.0106  0.5520833   0.0192 21.995352   106
[33] {HOT WATER BOTTLE I AM SO POORLY}     => {CHOCOLATE HOT WATER BOTTLE}           0.0123  0.6275510   0.0196 21.865889   123
[34] {RED KITCHEN SCALES}                  => {IVORY KITCHEN SCALES}                 0.0119  0.6010101   0.0198 21.854913   119
[35] {JUMBO BAG PEARS}                     => {JUMBO BAG APPLES}                     0.0122  0.6455026   0.0189 21.233640   122
[36] {STRAWBERRY CHARLOTTE BAG}            => {RED RETROSPOT CHARLOTTE BAG}          0.0115  0.5502392   0.0209 19.239134   115
[37] {BAKING SET SPACEBOY DESIGN}          => {BAKING SET 9 PIECE RETROSPOT}         0.0110  0.6508876   0.0169 19.087612   110
[38] {ALARM CLOCK BAKELIKE GREEN}          => {ALARM CLOCK BAKELIKE RED}             0.0195  0.6414474   0.0304 18.866099   195
[39] {ALARM CLOCK BAKELIKE RED}            => {ALARM CLOCK BAKELIKE GREEN}           0.0195  0.5735294   0.0340 18.866099   195
[40] {ALARM CLOCK BAKELIKE IVORY}          => {ALARM CLOCK BAKELIKE RED}             0.0115  0.6388889   0.0180 18.790850   115
[41] {ALARM CLOCK BAKELIKE ORANGE}         => {ALARM CLOCK BAKELIKE RED}             0.0101  0.6352201   0.0159 18.682945   101
[42] {ALARM CLOCK BAKELIKE IVORY}          => {ALARM CLOCK BAKELIKE GREEN}           0.0102  0.5666667   0.0180 18.640351   102
[43] {ALARM CLOCK BAKELIKE PINK}           => {ALARM CLOCK BAKELIKE RED}             0.0148  0.6271186   0.0236 18.444666   148
[44] {LOVE BUILDING BLOCK WORD}            => {HOME BUILDING BLOCK WORD}             0.0107  0.5270936   0.0203 18.429846   107
[45] {HOT WATER BOTTLE TEA AND SYMPATHY}   => {CHOCOLATE HOT WATER BOTTLE}           0.0103  0.5228426   0.0197 18.217514   103
[46] {LUNCH BAG PINK POLKADOT,                                                                                                 
      LUNCH BAG SPACEBOY DESIGN}           => {LUNCH BAG CARS BLUE}                  0.0105  0.7046980   0.0149 18.115629   105
[47] {LUNCH BAG CARS BLUE,                                                                                                     
      LUNCH BAG SPACEBOY DESIGN}           => {LUNCH BAG PINK POLKADOT}              0.0105  0.6730769   0.0156 17.666061   105
[48] {LUNCH BAG RED RETROSPOT,                                                                                                 
      LUNCH BAG SPACEBOY DESIGN}           => {LUNCH BAG WOODLAND}                   0.0105  0.6069364   0.0173 17.643500   105
[49] {ALARM CLOCK BAKELIKE PINK}           => {ALARM CLOCK BAKELIKE GREEN}           0.0123  0.5211864   0.0236 17.144291   123
[50] {LUNCH BAG CARS BLUE,                                                                                                     
      LUNCH BAG RED RETROSPOT}             => {LUNCH BAG PINK POLKADOT}              0.0118  0.6519337   0.0181 17.111121   118
[51] {LUNCH BAG  BLACK SKULL,                                                                                                  
      LUNCH BAG CARS BLUE}                 => {LUNCH BAG PINK POLKADOT}              0.0106  0.6385542   0.0166 16.759953   106
[52] {LUNCH BAG RED RETROSPOT,                                                                                                 
      LUNCH BAG WOODLAND}                  => {LUNCH BAG SPACEBOY DESIGN}            0.0105  0.6140351   0.0171 16.158818   105
[53] {LUNCH BAG RED RETROSPOT,                                                                                                 
      LUNCH BAG WOODLAND}                  => {LUNCH BAG PINK POLKADOT}              0.0103  0.6023392   0.0171 15.809427   103
[54] {WOODEN FRAME ANTIQUE WHITE}          => {WOODEN PICTURE FRAME WHITE FINISH}    0.0202  0.5821326   0.0347 15.523535   202
[55] {WOODEN PICTURE FRAME WHITE FINISH}   => {WOODEN FRAME ANTIQUE WHITE}           0.0202  0.5386667   0.0375 15.523535   202
[56] {LUNCH BAG  BLACK SKULL,                                                                                                  
      LUNCH BAG PINK POLKADOT}             => {LUNCH BAG CARS BLUE}                  0.0106  0.5888889   0.0180 15.138532   106
[57] {LUNCH BAG CARS BLUE,                                                                                                     
      LUNCH BAG PINK POLKADOT}             => {LUNCH BAG SPACEBOY DESIGN}            0.0105  0.5737705   0.0183 15.099223   105
[58] {LUNCH BAG PINK POLKADOT,                                                                                                 
      LUNCH BAG RED RETROSPOT}             => {LUNCH BAG CARS BLUE}                  0.0118  0.5841584   0.0202 15.016926   118
[59] {LUNCH BAG  BLACK SKULL,                                                                                                  
      LUNCH BAG RED RETROSPOT}             => {LUNCH BAG PINK POLKADOT}              0.0118  0.5673077   0.0208 14.889966   118
[60] {LUNCH BAG PINK POLKADOT,                                                                                                 
      LUNCH BAG RED RETROSPOT}             => {LUNCH BAG WOODLAND}                   0.0103  0.5099010   0.0202 14.822703   103
[61] {LUNCH BAG DOLLY GIRL DESIGN}         => {LUNCH BAG SPACEBOY DESIGN}            0.0126  0.5478261   0.0230 14.416476   126
[62] {LUNCH BAG PINK POLKADOT,                                                                                                 
      LUNCH BAG WOODLAND}                  => {LUNCH BAG RED RETROSPOT}              0.0103  0.7463768   0.0138 14.381056   103
[63] {LUNCH BAG VINTAGE LEAF DESIGN}       => {LUNCH BAG APPLE DESIGN}               0.0114  0.5112108   0.0223 14.279630   114
[64] {LUNCH BAG PINK POLKADOT,                                                                                                 
      LUNCH BAG RED RETROSPOT}             => {LUNCH BAG  BLACK SKULL}               0.0118  0.5841584   0.0202 13.680525   118
[65] {LUNCH BAG CARS BLUE,                                                                                                     
      LUNCH BAG PINK POLKADOT}             => {LUNCH BAG  BLACK SKULL}               0.0106  0.5792350   0.0183 13.565222   106
[66] {PAINTED METAL PEARS ASSORTED}        => {ASSORTED COLOUR BIRD ORNAMENT}        0.0111  0.7302632   0.0152 13.448677   111
[67] {LUNCH BAG CARS BLUE,                                                                                                     
      LUNCH BAG RED RETROSPOT}             => {LUNCH BAG  BLACK SKULL}               0.0103  0.5690608   0.0181 13.326950   103
[68] {LUNCH BAG  BLACK SKULL,                                                                                                  
      LUNCH BAG PINK POLKADOT}             => {LUNCH BAG RED RETROSPOT}              0.0118  0.6555556   0.0180 12.631128   118
[69] {LUNCH BAG CARS BLUE,                                                                                                     
      LUNCH BAG PINK POLKADOT}             => {LUNCH BAG RED RETROSPOT}              0.0118  0.6448087   0.0183 12.424061   118
[70] {60 TEATIME FAIRY CAKE CASES}         => {PACK OF 72 RETROSPOT CAKE CASES}      0.0145  0.5350554   0.0271 12.414277   145
[71] {LUNCH BAG SPACEBOY DESIGN,                                                                                               
      LUNCH BAG WOODLAND}                  => {LUNCH BAG RED RETROSPOT}              0.0105  0.6250000   0.0168 12.042389   105
[72] {LUNCH BAG  BLACK SKULL,                                                                                                  
      LUNCH BAG CARS BLUE}                 => {LUNCH BAG RED RETROSPOT}              0.0103  0.6204819   0.0166 11.955336   103
[73] {PACK OF 72 SKULL CAKE CASES}         => {PACK OF 72 RETROSPOT CAKE CASES}      0.0108  0.5046729   0.0214 11.709348   108
[74] {LUNCH BAG PINK POLKADOT}             => {LUNCH BAG RED RETROSPOT}              0.0202  0.5301837   0.0381 10.215486   202
[75] {JUMBO BAG STRAWBERRY}                => {JUMBO BAG RED RETROSPOT}              0.0176  0.6641509   0.0265  9.897928   176
[76] {LUNCH BAG DOLLY GIRL DESIGN}         => {LUNCH BAG RED RETROSPOT}              0.0117  0.5086957   0.0230  9.801458   117
[77] {JUMBO BAG PINK POLKADOT}             => {JUMBO BAG RED RETROSPOT}              0.0222  0.5951743   0.0373  8.869959   222
[78] {JUMBO BAG SCANDINAVIAN BLUE PAISLEY} => {JUMBO BAG RED RETROSPOT}              0.0104  0.5683060   0.0183  8.469538   104
[79] {JUMBO  BAG BAROQUE BLACK WHITE}      => {JUMBO BAG RED RETROSPOT}              0.0149  0.5539033   0.0269  8.254893   149
[80] {JUMBO STORAGE BAG SUKI}              => {JUMBO BAG RED RETROSPOT}              0.0161  0.5457627   0.0295  8.133572   161
[81] {RED HANGING HEART T-LIGHT HOLDER}    => {WHITE HANGING HEART T-LIGHT HOLDER}   0.0175  0.6481481   0.0270  7.734465   175
[82] {JUMBO BAG SPACEBOY DESIGN}           => {JUMBO BAG RED RETROSPOT}              0.0102  0.5125628   0.0199  7.638790   102
[83] {PINK REGENCY TEACUP AND SAUCER,                                                                                          
      ROSES REGENCY TEACUP AND SAUCER}     => {REGENCY CAKESTAND 3 TIER}             0.0102  0.5862069   0.0174  7.563960   102
[84] {JUMBO BAG PINK VINTAGE PAISLEY}      => {JUMBO BAG RED RETROSPOT}              0.0125  0.5040323   0.0248  7.511658   125
[85] {JUMBO SHOPPER VINTAGE RED PAISLEY}   => {JUMBO BAG RED RETROSPOT}              0.0157  0.5000000   0.0314  7.451565   157
[86] {GREEN REGENCY TEACUP AND SAUCER,                                                                                         
      PINK REGENCY TEACUP AND SAUCER}      => {REGENCY CAKESTAND 3 TIER}             0.0108  0.5744681   0.0188  7.412491   108
[87] {GREEN REGENCY TEACUP AND SAUCER,                                                                                         
      ROSES REGENCY TEACUP AND SAUCER}     => {REGENCY CAKESTAND 3 TIER}             0.0122  0.5470852   0.0223  7.059164   122
[88] {PINK REGENCY TEACUP AND SAUCER}      => {REGENCY CAKESTAND 3 TIER}             0.0124  0.5321888   0.0233  6.866953   124
[89] {GREEN REGENCY TEACUP AND SAUCER}     => {REGENCY CAKESTAND 3 TIER}             0.0152  0.5083612   0.0299  6.559499   152
[90] {ROSES REGENCY TEACUP AND SAUCER}     => {REGENCY CAKESTAND 3 TIER}             0.0167  0.5000000   0.0334  6.451613   167

{Shed} -> {Key Fob} means that a transaction containing a shed is expected to contain a key fob as well.

Support = 0.01 -> 1% of all transactions (100 of 10,000) contain both a shed and a key fob.

Confidence = 1.000 -> every transaction that contains a shed also contains a key fob.

Lift = 66.67 -> a transaction containing a shed is about 66.67 times more likely to contain a key fob than a transaction chosen at random.
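
The confidence and lift for this rule can be recovered from the support and coverage columns of the inspect() output. The sketch below uses only those printed values; it relies on the fact that a rule's coverage is the support of its left-hand side, so the coverage of {KEY FOB} => {SHED} gives the support of KEY FOB. Variable names are illustrative.

```r
# Values read from the inspect() output
supp_rule    <- 0.0100  # support of {SHED, KEY FOB}
supp_shed    <- 0.0100  # coverage of {SHED} => {KEY FOB}, i.e. support of {SHED}
supp_key_fob <- 0.0150  # coverage of {KEY FOB} => {SHED}, i.e. support of {KEY FOB}

confidence <- supp_rule / supp_shed       # 1
lift       <- confidence / supp_key_fob   # about 66.67

# The reverse rule {KEY FOB} => {SHED} has lower confidence but the same lift,
# because lift is symmetric in the two sides of a rule.
confidence_reverse <- supp_rule / supp_key_fob        # about 0.667
lift_reverse       <- confidence_reverse / supp_shed  # about 66.67
```

Key fobs appear in only 1.5% of transactions, which is why a confidence of 1 translates into such a large lift.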

Trivial Rules:

Back Door -> Key Fob

If a customer buys a back door, they will need to lock it so they will most likely buy a key fob for the door along with the door.

Green Regency teacup and saucer, Pink Regency teacup and saucer -> Regency Cakestand 3 tier

If a customer is buying two sets of teacups and saucers, they may be having a party, at which they will need a cake, so they will most likely buy a cakestand for it.

Actionable Rules:

60 Teatime fairy cake cases -> Pack of 72 Retrospot cake cases

Pack of 72 skull cake cases -> Pack of 72 Retrospot cake cases

Both of these rules involve cake cases and suggest that a customer who buys one design of cake case is likely to buy another. The retailer could act on this by selling cake cases as part of a deal that includes a box of cake mix or cake decorations.

teacup_saucer_rules <- subset(retail_rules, items %in% "PINK REGENCY TEACUP AND SAUCER")
inspect(teacup_saucer_rules)
     lhs                                   rhs                               support confidence coverage      lift count
[1]  {PINK REGENCY TEACUP AND SAUCER}   => {GREEN REGENCY TEACUP AND SAUCER}  0.0188  0.8068670   0.0233 26.985517   188
[2]  {GREEN REGENCY TEACUP AND SAUCER}  => {PINK REGENCY TEACUP AND SAUCER}   0.0188  0.6287625   0.0299 26.985517   188
[3]  {PINK REGENCY TEACUP AND SAUCER}   => {ROSES REGENCY TEACUP AND SAUCER}  0.0174  0.7467811   0.0233 22.358716   174
[4]  {ROSES REGENCY TEACUP AND SAUCER}  => {PINK REGENCY TEACUP AND SAUCER}   0.0174  0.5209581   0.0334 22.358716   174
[5]  {PINK REGENCY TEACUP AND SAUCER}   => {REGENCY CAKESTAND 3 TIER}         0.0124  0.5321888   0.0233  6.866953   124
[6]  {GREEN REGENCY TEACUP AND SAUCER,                                                                                  
      PINK REGENCY TEACUP AND SAUCER}   => {ROSES REGENCY TEACUP AND SAUCER}  0.0153  0.8138298   0.0188 24.366161   153
[7]  {PINK REGENCY TEACUP AND SAUCER,                                                                                   
      ROSES REGENCY TEACUP AND SAUCER}  => {GREEN REGENCY TEACUP AND SAUCER}  0.0153  0.8793103   0.0174 29.408373   153
[8]  {GREEN REGENCY TEACUP AND SAUCER,                                                                                  
      ROSES REGENCY TEACUP AND SAUCER}  => {PINK REGENCY TEACUP AND SAUCER}   0.0153  0.6860987   0.0223 29.446294   153
[9]  {GREEN REGENCY TEACUP AND SAUCER,                                                                                  
      PINK REGENCY TEACUP AND SAUCER}   => {REGENCY CAKESTAND 3 TIER}         0.0108  0.5744681   0.0188  7.412491   108
[10] {PINK REGENCY TEACUP AND SAUCER,                                                                                   
      REGENCY CAKESTAND 3 TIER}         => {GREEN REGENCY TEACUP AND SAUCER}  0.0108  0.8709677   0.0124 29.129356   108
[11] {GREEN REGENCY TEACUP AND SAUCER,                                                                                  
      REGENCY CAKESTAND 3 TIER}         => {PINK REGENCY TEACUP AND SAUCER}   0.0108  0.7105263   0.0152 30.494692   108
[12] {PINK REGENCY TEACUP AND SAUCER,                                                                                   
      ROSES REGENCY TEACUP AND SAUCER}  => {REGENCY CAKESTAND 3 TIER}         0.0102  0.5862069   0.0174  7.563960   102
[13] {PINK REGENCY TEACUP AND SAUCER,                                                                                   
      REGENCY CAKESTAND 3 TIER}         => {ROSES REGENCY TEACUP AND SAUCER}  0.0102  0.8225806   0.0124 24.628163   102
[14] {REGENCY CAKESTAND 3 TIER,                                                                                         
      ROSES REGENCY TEACUP AND SAUCER}  => {PINK REGENCY TEACUP AND SAUCER}   0.0102  0.6107784   0.0167 26.213667   102

If a customer buys a Pink Regency Teacup and Saucer, then they are also most likely to buy:

  • Green Regency Teacup and Saucer

  • Roses Regency Teacup and Saucer

  • Regency Cake Stand 3 Tier
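
The order of this list follows the confidence of the three rules that have only the Pink Regency Teacup and Saucer on the left-hand side. As a small sketch, the values can be copied from the inspect() output into a data frame and ranked; the data frame name is illustrative.

```r
# Single-item rules with PINK REGENCY TEACUP AND SAUCER on the left-hand side,
# copied from the inspect() output
pink_rules <- data.frame(
  rhs = c("GREEN REGENCY TEACUP AND SAUCER",
          "ROSES REGENCY TEACUP AND SAUCER",
          "REGENCY CAKESTAND 3 TIER"),
  confidence = c(0.8068670, 0.7467811, 0.5321888),
  lift       = c(26.985517, 22.358716, 6.866953)
)

# Order candidate recommendations from most to least likely
ranked <- pink_rules[order(pink_rules$confidence, decreasing = TRUE), ]
ranked
```

The cake stand ranks last on both confidence and lift, largely because it is a popular item in its own right.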

Question 2

library(recommenderlab)
steam_ratings <- read_csv("steam_ratings.csv")
steam_ratings <- as(steam_ratings, "matrix")
steam_ratings <- as(steam_ratings, "realRatingMatrix")
View(as(steam_ratings, "matrix"))

vector_ratings <- as.vector(steam_ratings@data)
table(vector_ratings)
Rating  Count
     1   4773
     2  12500
     3  19762
     4  10655
     5   4724

colMeans(steam_ratings) %>% 
  tibble::enframe(name = "games", value = "vector_rating") %>% 
  ggplot() +
  geom_histogram(mapping = aes(x = vector_rating), bins = 30)

user_review_counts <- rowCounts(steam_ratings)

user_review_counts %>%
  tibble::enframe(name = "user", value = "user_review_counts") %>% 
  ggplot() +
  geom_histogram(mapping = aes(x = user_review_counts), bins = 30)

set.seed(101)

game_eval <- evaluationScheme(data = steam_ratings,
                              method = "split",
                              train = 0.8,
                              given = 6,
                              goodRating = 3)

train_games <- getData(game_eval, "train")
known_games <- getData(game_eval, "known")
unknown_games <- getData(game_eval, "unknown")

UBCF Models

# recommenderlab's parameter name is `normalize` (American spelling)
ubcf_1 <- Recommender(data = train_games,
                      method = "UBCF",
                      parameter = list(normalize = "center",
                                       method = "Cosine"))

ubcf_2 <- Recommender(data = train_games,
                      method = "UBCF",
                      parameter = list(normalize = "center",
                                       method = "Euclidean"))

ubcf_3 <- Recommender(data = train_games,
                      method = "UBCF",
                      parameter = list(normalize = "center",
                                       method = "pearson"))

ubcf_4 <- Recommender(data = train_games,
                      method = "UBCF",
                      parameter = list(normalize = "Z-score",
                                       method = "Cosine"))

ubcf_5 <- Recommender(data = train_games,
                      method = "UBCF",
                      parameter = list(normalize = "Z-score",
                                       method = "Euclidean"))

ubcf_6 <- Recommender(data = train_games,
                      method = "UBCF",
                      parameter = list(normalize = "Z-score",
                                       method = "pearson"))

ubcf_7 <- Recommender(data = train_games,
                      method = "UBCF",
                      parameter = list(normalize = NULL,
                                       method = "Cosine"))

ubcf_8 <- Recommender(data = train_games,
                      method = "UBCF",
                      parameter = list(normalize = NULL,
                                       method = "Euclidean"))

ubcf_9 <- Recommender(data = train_games,
                      method = "UBCF",
                      parameter = list(normalize = NULL,
                                       method = "pearson"))

# Predict ratings for the held-out users from their "known" ratings
ubcf_predict_1 <- predict(object = ubcf_1,
                          newdata = known_games,
                          type = "ratings")

as(ubcf_predict_1, "matrix")

ubcf_eval_1 <- calcPredictionAccuracy(x = ubcf_predict_1,
                                      data = unknown_games)
ubcf_eval_1

MAE = 0.9183398

ubcf_predict_2 <- predict(object = ubcf_2,
                          newdata = known_games,
                          type = "ratings")

as(ubcf_predict_2, "matrix")

ubcf_eval_2 <- calcPredictionAccuracy(x = ubcf_predict_2,
                                      data = unknown_games)
ubcf_eval_2

MAE = 0.9163087

ubcf_predict_3 <- predict(object = ubcf_3,
                          newdata = known_games,
                          type = "ratings")

as(ubcf_predict_3, "matrix")

ubcf_eval_3 <- calcPredictionAccuracy(x = ubcf_predict_3,
                                      data = unknown_games)
ubcf_eval_3

MAE = 0.8702777

ubcf_predict_4 <- predict(object = ubcf_4,
                          newdata = known_games,
                          type = "ratings")

as(ubcf_predict_4, "matrix")

ubcf_eval_4 <- calcPredictionAccuracy(x = ubcf_predict_4,
                                      data = unknown_games)
ubcf_eval_4

MAE = 0.9183398

ubcf_predict_5 <- predict(object = ubcf_5,
                          newdata = known_games,
                          type = "ratings")

as(ubcf_predict_5, "matrix")

ubcf_eval_5 <- calcPredictionAccuracy(x = ubcf_predict_5,
                                      data = unknown_games)
ubcf_eval_5

MAE = 0.9163087

ubcf_predict_6 <- predict(object = ubcf_6,
                          newdata = known_games,
                          type = "ratings")

as(ubcf_predict_6, "matrix")

ubcf_eval_6 <- calcPredictionAccuracy(x = ubcf_predict_6,
                                      data = unknown_games)
ubcf_eval_6

MAE = 0.8702777

ubcf_predict_7 <- predict(object = ubcf_7,
                          newdata = known_games,
                          type = "ratings")

as(ubcf_predict_7, "matrix")

ubcf_eval_7 <- calcPredictionAccuracy(x = ubcf_predict_7,
                                      data = unknown_games)
ubcf_eval_7

MAE = 0.9183398

ubcf_predict_8 <- predict(object = ubcf_8,
                          newdata = known_games,
                          type = "ratings")

as(ubcf_predict_8, "matrix")

ubcf_eval_8 <- calcPredictionAccuracy(x = ubcf_predict_8,
                                      data = unknown_games)
ubcf_eval_8

MAE = 0.9163087

ubcf_predict_9 <- predict(object = ubcf_9,
                          newdata = known_games,
                          type = "ratings")

as(ubcf_predict_9, "matrix")

ubcf_eval_9 <- calcPredictionAccuracy(x = ubcf_predict_9,
                                      data = unknown_games)
ubcf_eval_9

MAE = 0.8702777

IBCF Models

ibcf_1 <- Recommender(data = train_games,
                      method = "IBCF",
                      parameter = list(normalize = "center", method = "Cosine"))

ibcf_2 <- Recommender(data = train_games,
                      method = "IBCF", 
                      parameter = list(normalize = "center", method = "Euclidean"))

ibcf_3 <- Recommender(data = train_games,
                      method = "IBCF", 
                      parameter = list(normalize = "center", method = "pearson"))

ibcf_4 <- Recommender(data = train_games,
                      method = "IBCF", 
                      parameter = list(normalize = "Z-score", method = "Cosine"))

ibcf_5 <- Recommender(data = train_games,
                      method = "IBCF", 
                      parameter = list(normalize = "Z-score", method = "Euclidean"))

ibcf_6 <- Recommender(data = train_games,
                      method = "IBCF", 
                      parameter = list(normalize = "Z-score", method = "pearson"))

ibcf_7 <- Recommender(data = train_games,
                      method = "IBCF", 
                      parameter = list(normalize = NULL, method = "Cosine"))

ibcf_8 <- Recommender(data = train_games,
                      method = "IBCF", 
                      parameter = list(normalize = NULL, method = "Euclidean"))

ibcf_9 <- Recommender(data = train_games,
                      method = "IBCF",
                      parameter = list(normalize = NULL, method = "pearson"))

ibcf_predict_1 <- predict(object = ibcf_1,
                          newdata = known_games,
                          type = "ratings")

as(ibcf_predict_1, "matrix")

ibcf_eval_1 <- calcPredictionAccuracy(x = ibcf_predict_1,
                                      data = unknown_games)
ibcf_eval_1

MAE = 1.165198

ibcf_predict_2 <- predict(object = ibcf_2,
                          newdata = known_games, 
                          type = "ratings")

as(ibcf_predict_2, "matrix")

ibcf_eval_2 <- calcPredictionAccuracy(x = ibcf_predict_2,
                                      data = unknown_games)
ibcf_eval_2

MAE = 1.142542

ibcf_predict_3 <- predict(object = ibcf_3,
                          newdata = known_games,
                          type = "ratings")

as(ibcf_predict_3, "matrix")

ibcf_eval_3 <- calcPredictionAccuracy(x = ibcf_predict_3,
                                      data = unknown_games)
ibcf_eval_3

MAE = 1.158908

ibcf_predict_4 <- predict(object = ibcf_4,
                          newdata = known_games, 
                          type = "ratings")

as(ibcf_predict_4, "matrix")

ibcf_eval_4 <- calcPredictionAccuracy(x = ibcf_predict_4,
                                      data = unknown_games)
ibcf_eval_4

MAE = 1.163775

ibcf_predict_5 <- predict(object = ibcf_5,
                          newdata = known_games, 
                          type = "ratings")

as(ibcf_predict_5, "matrix")

ibcf_eval_5 <- calcPredictionAccuracy(x = ibcf_predict_5,
                                      data = unknown_games)
ibcf_eval_5

MAE = 1.141132

ibcf_predict_6 <- predict(object = ibcf_6,
                          newdata = known_games, 
                          type = "ratings")

as(ibcf_predict_6, "matrix")

ibcf_eval_6 <- calcPredictionAccuracy(x = ibcf_predict_6,
                                      data = unknown_games)
ibcf_eval_6

MAE = 1.158796

ibcf_predict_7 <- predict(object = ibcf_7,
                          newdata = known_games, 
                          type = "ratings")

as(ibcf_predict_7, "matrix")

ibcf_eval_7 <- calcPredictionAccuracy(x = ibcf_predict_7,
                                      data = unknown_games)
ibcf_eval_7

MAE = 1.239649

ibcf_predict_8 <- predict(object = ibcf_8,
                          newdata = known_games, 
                          type = "ratings")

as(ibcf_predict_8, "matrix")

ibcf_eval_8 <- calcPredictionAccuracy(x = ibcf_predict_8,
                                      data = unknown_games)
ibcf_eval_8

MAE = 1.140654

ibcf_predict_9 <- predict(object = ibcf_9,
                          newdata = known_games, 
                          type = "ratings")

as(ibcf_predict_9, "matrix")

ibcf_eval_9 <- calcPredictionAccuracy(x = ibcf_predict_9,
                                      data = unknown_games)
ibcf_eval_9

MAE = 1.152312

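UBCF models 3, 6 and 9 (all using Pearson similarity) have the lowest MAE, 0.8702777. Every UBCF model also beats every IBCF model; the best IBCF result is model 8 with MAE = 1.140654.

For UBCF, each similarity measure gives exactly the same MAE under all three normalisation settings, which suggests the normalisation setting had no effect in these runs. recommenderlab names this parameter `normalize`, so the UBCF runs recorded here, which passed `normalise`, most likely fell back to the default normalisation. Model 3 is used for the recommendations.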
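
The comparison can be reproduced from the reported figures alone. The sketch below collects the eighteen MAE values from the evaluation output into a data frame and picks out the best models; the data frame and column names are illustrative, and the `normalisation` column records the setting requested in each model definition.

```r
# MAE values copied from the calcPredictionAccuracy() results
mae_results <- data.frame(
  algorithm     = rep(c("UBCF", "IBCF"), each = 9),
  model         = rep(1:9, times = 2),
  normalisation = rep(rep(c("center", "Z-score", "none"), each = 3), times = 2),
  similarity    = rep(c("Cosine", "Euclidean", "pearson"), times = 6),
  mae = c(0.9183398, 0.9163087, 0.8702777,   # UBCF 1-3
          0.9183398, 0.9163087, 0.8702777,   # UBCF 4-6
          0.9183398, 0.9163087, 0.8702777,   # UBCF 7-9
          1.165198, 1.142542, 1.158908,      # IBCF 1-3
          1.163775, 1.141132, 1.158796,      # IBCF 4-6
          1.239649, 1.140654, 1.152312)      # IBCF 7-9
)

# Lowest MAE within each algorithm
best_per_algorithm <- aggregate(mae ~ algorithm, data = mae_results, FUN = min)

# Model(s) with the lowest MAE overall
best_models <- mae_results[mae_results$mae == min(mae_results$mae), ]

best_per_algorithm
best_models
```

Laying the results out this way also makes the repeated UBCF values easy to spot, since each similarity measure shows a single MAE regardless of the normalisation requested.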

ubcf_rec <- predict(object = ubcf_3,
                    newdata = known_games,
                    type = "topNList",
                    n = 3)

as(ubcf_rec, "list")

  • Customer 0: EVE Online, Savant - Ascent, Burnout Paradise The Ultimate Box

  • Customer 1: Sang-Froid - Tales of Werewolves, The Journey Down Chapter One, Syberia

  • Customer 2: FINAL FANTASY VII, Back to the Future Ep 1 - It’s About Time, Indie Game The Movie

  • Customer 3: Serious Sam HD The First Encounter, Farming Simulator 15, Airline Tycoon 2

  • Customer 4: Sid Meier’s Civilization IV Colonization, Serious Sam HD The First Encounter, The Maw