Assignment 2 - Adv Data Analysis 2
Part A
Question 1
library(arules)
Question 2
retail <- read.transactions("retail_transactions_1.csv", sep = ",")
summary(retail)
transactions as itemMatrix in sparse format with
 10000 rows (elements/itemsets/transactions) and
 5497 columns (items) and a density of 0.00277837
most frequent items:
WHITE HANGING HEART T-LIGHT HOLDER           REGENCY CAKESTAND 3 TIER
                               838                                775
           JUMBO BAG RED RETROSPOT                      PARTY BUNTING
                               671                                551
     ASSORTED COLOUR BIRD ORNAMENT                            (Other)
                               543                             149349
element (itemset/transaction) length distribution:
sizes
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
1622 684 518 406 399 352 304 313 291 265 268 234 236 221 249 259
17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
220 201 223 185 176 138 145 115 110 114 96 101 109 100 74 73
33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48
68 66 62 40 59 42 50 43 43 55 36 28 31 28 29 23
49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64
29 21 34 13 15 25 17 20 13 16 20 14 11 11 13 14
65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80
9 15 12 8 2 10 4 7 6 9 1 7 6 3 3 4
81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96
7 6 3 2 4 3 5 5 1 2 1 4 2 1 1 2
97 98 99 101 103 105 107 108 109 110 111 113 114 116 117 118
1 3 1 2 2 1 2 2 3 1 1 2 1 1 3 2
120 121 122 123 125 126 135 143 147 149 154 157 158 168 171 177
1 1 1 1 2 3 1 1 1 1 4 1 1 1 1 1
202 204 249 320 428
1 1 1 1 1
Min. 1st Qu. Median Mean 3rd Qu. Max.
1.00 3.00 10.00 15.27 21.00 428.00
includes extended item information - examples:
labels
1 1 HANGER
2 10 COLOUR SPACEBOY PEN
3 12 COLOURED PARTY BALLOONS
a) There are 10,000 transactions in the dataset.
b) There are 5,497 distinct items available to purchase.
c) The sparse matrix contains 54,970,000 cells (10,000 x 5,497), of which 152,727 are non-zero (density 0.00277837).
d) The largest number of items purchased in a single transaction is 428.
e) The mean number of items purchased in a single transaction is 15.27.
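The figures in (c) and (e) can be checked by hand from the summary output. The following is a minimal sketch in base R that uses only the row count, column count and density printed by summary(retail); the variable names are illustrative and not part of the original analysis.

```r
# Values copied from the summary(retail) output
n_transactions <- 10000       # rows of the sparse matrix
n_items        <- 5497        # columns of the sparse matrix
density        <- 0.00277837  # share of cells that are non-zero

# Total number of cells in the transaction-by-item matrix
cells <- n_transactions * n_items

# Number of non-zero cells (individual item purchases), rounded to a whole count
nonzero <- round(cells * density)

# Mean basket size implied by the non-zero count
mean_items <- nonzero / n_transactions

print(c(cells = cells, nonzero = nonzero))
print(round(mean_items, 2))
```

The implied mean basket size agrees with the mean of 15.27 reported in the length distribution.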
Question 3
itemFrequencyPlot(retail, topN = 20, horiz = TRUE)
Question 4
retail_rules <- apriori(retail, parameter = list(support = 0.01,
                                                 confidence = 0.5,
                                                 minlen = 2))
Apriori
Parameter specification:
confidence minval smax arem aval originalSupport maxtime support minlen
0.5 0.1 1 none FALSE TRUE 5 0.01 2
maxlen target ext
10 rules TRUE
Algorithmic control:
filter tree heap memopt load sort verbose
0.1 TRUE TRUE FALSE TRUE 2 TRUE
Absolute minimum support count: 100
set item appearances ...[0 item(s)] done [0.00s].
set transactions ...[5497 item(s), 10000 transaction(s)] done [0.02s].
sorting and recoding items ... [401 item(s)] done [0.00s].
creating transaction tree ... done [0.00s].
checking subsets of size 1 2 3 4 done [0.00s].
writing ... [90 rule(s)] done [0.00s].
creating S4 object ... done [0.00s].
retail_rules
set of 90 rules
a) There are 90 rules.
b) A support threshold of 0.01 means that the items in a rule (left-hand and right-hand side combined) must appear together in at least 1% of the 10,000 transactions, i.e. in at least 100 transactions.
c) A confidence threshold of 0.5 means that for a rule X -> Y to be included in the results, Y must appear in at least 50% of the transactions that contain X.
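As a worked check of (b) and (c), both thresholds can be applied by hand to one rule. The sketch below uses the counts reported in the Question 6 output for {ROSES REGENCY TEACUP AND SAUCER} => {REGENCY CAKESTAND 3 TIER} (count 167, coverage 0.0334 of 10,000 transactions, i.e. 334 transactions containing the left-hand side). It is base R only, and the variable names are illustrative.

```r
n_transactions <- 10000
min_support    <- 0.01
min_confidence <- 0.5

# Absolute minimum support count, as printed in the apriori log
min_count <- min_support * n_transactions

# Counts for {ROSES REGENCY TEACUP AND SAUCER} => {REGENCY CAKESTAND 3 TIER}
count_both <- 167  # transactions containing both sides of the rule
count_lhs  <- 334  # transactions containing the left-hand side (coverage 0.0334)

# Support: share of all transactions that contain every item in the rule
support <- count_both / n_transactions

# Confidence: share of left-hand-side transactions that also contain the right-hand side
confidence <- count_both / count_lhs

# The rule is kept only if it meets both thresholds
passes <- support >= min_support && confidence >= min_confidence
print(c(support = support, confidence = confidence))
print(passes)
```

This rule sits exactly on the confidence threshold, which is why it appears last when the rules are sorted by lift.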
Question 5
summary(retail_rules)
set of 90 rules
rule length distribution (lhs + rhs):
sizes
2 3
58 32
Min. 1st Qu. Median Mean 3rd Qu. Max.
2.000 2.000 2.000 2.356 3.000 3.000
summary of quality measures:
support confidence coverage lift
Min. :0.01000 Min. :0.5000 Min. :0.01000 Min. : 6.452
1st Qu.:0.01050 1st Qu.:0.5507 1st Qu.:0.01695 1st Qu.:12.805
Median :0.01175 Median :0.6124 Median :0.01965 Median :18.167
Mean :0.01309 Mean :0.6294 Mean :0.02131 Mean :20.694
3rd Qu.:0.01530 3rd Qu.:0.6733 3rd Qu.:0.02368 3rd Qu.:24.416
Max. :0.02230 Max. :1.0000 Max. :0.03810 Max. :66.667
count
Min. :100.0
1st Qu.:105.0
Median :117.5
Mean :130.9
3rd Qu.:153.0
Max. :223.0
mining info:
data ntransactions support confidence
retail 10000 0.01 0.5
call
apriori(data = retail, parameter = list(support = 0.01, confidence = 0.5, minlen = 2))
a) 58 of the rules have 2 items, while 32 of the rules have 3 items
b) The minimum lift value for a rule is 6.452, while the maximum lift value for a rule is 66.667
Question 6
inspect(sort(retail_rules, by = "lift"))
lhs rhs support confidence coverage lift count
[1] {SHED} => {KEY FOB} 0.0100 1.0000000 0.0100 66.666667 100
[2] {BACK DOOR} => {KEY FOB} 0.0101 1.0000000 0.0101 66.666667 101
[3] {KEY FOB} => {SHED} 0.0100 0.6666667 0.0150 66.666667 100
[4] {KEY FOB} => {BACK DOOR} 0.0101 0.6733333 0.0150 66.666667 101
[5] {WOODEN STAR CHRISTMAS SCANDINAVIAN} => {WOODEN HEART CHRISTMAS SCANDINAVIAN} 0.0114 0.7755102 0.0147 47.577313 114
[6] {WOODEN HEART CHRISTMAS SCANDINAVIAN} => {WOODEN STAR CHRISTMAS SCANDINAVIAN} 0.0114 0.6993865 0.0163 47.577313 114
[7] {PINK HAPPY BIRTHDAY BUNTING} => {BLUE HAPPY BIRTHDAY BUNTING} 0.0101 0.6778523 0.0149 46.748438 101
[8] {BLUE HAPPY BIRTHDAY BUNTING} => {PINK HAPPY BIRTHDAY BUNTING} 0.0101 0.6965517 0.0145 46.748438 101
[9] {GREEN REGENCY TEACUP AND SAUCER,
REGENCY CAKESTAND 3 TIER} => {PINK REGENCY TEACUP AND SAUCER} 0.0108 0.7105263 0.0152 30.494692 108
[10] {GREEN REGENCY TEACUP AND SAUCER,
ROSES REGENCY TEACUP AND SAUCER} => {PINK REGENCY TEACUP AND SAUCER} 0.0153 0.6860987 0.0223 29.446294 153
[11] {PINK REGENCY TEACUP AND SAUCER,
ROSES REGENCY TEACUP AND SAUCER} => {GREEN REGENCY TEACUP AND SAUCER} 0.0153 0.8793103 0.0174 29.408373 153
[12] {PINK REGENCY TEACUP AND SAUCER,
REGENCY CAKESTAND 3 TIER} => {GREEN REGENCY TEACUP AND SAUCER} 0.0108 0.8709677 0.0124 29.129356 108
[13] {HAND WARMER SCOTTY DOG DESIGN} => {HAND WARMER OWL DESIGN} 0.0104 0.5621622 0.0185 27.556969 104
[14] {HAND WARMER OWL DESIGN} => {HAND WARMER SCOTTY DOG DESIGN} 0.0104 0.5098039 0.0204 27.556969 104
[15] {PINK REGENCY TEACUP AND SAUCER} => {GREEN REGENCY TEACUP AND SAUCER} 0.0188 0.8068670 0.0233 26.985517 188
[16] {GREEN REGENCY TEACUP AND SAUCER} => {PINK REGENCY TEACUP AND SAUCER} 0.0188 0.6287625 0.0299 26.985517 188
[17] {REGENCY CAKESTAND 3 TIER,
ROSES REGENCY TEACUP AND SAUCER} => {PINK REGENCY TEACUP AND SAUCER} 0.0102 0.6107784 0.0167 26.213667 102
[18] {GARDENERS KNEELING PAD CUP OF TEA} => {GARDENERS KNEELING PAD KEEP CALM} 0.0186 0.7440000 0.0250 25.220339 186
[19] {GARDENERS KNEELING PAD KEEP CALM} => {GARDENERS KNEELING PAD CUP OF TEA} 0.0186 0.6305085 0.0295 25.220339 186
[20] {PINK REGENCY TEACUP AND SAUCER,
REGENCY CAKESTAND 3 TIER} => {ROSES REGENCY TEACUP AND SAUCER} 0.0102 0.8225806 0.0124 24.628163 102
[21] {SPACEBOY LUNCH BOX} => {DOLLY GIRL LUNCH BOX} 0.0161 0.5812274 0.0277 24.524364 161
[22] {DOLLY GIRL LUNCH BOX} => {SPACEBOY LUNCH BOX} 0.0161 0.6793249 0.0237 24.524364 161
[23] {REGENCY CAKESTAND 3 TIER,
ROSES REGENCY TEACUP AND SAUCER} => {GREEN REGENCY TEACUP AND SAUCER} 0.0122 0.7305389 0.0167 24.432740 122
[24] {GREEN REGENCY TEACUP AND SAUCER,
PINK REGENCY TEACUP AND SAUCER} => {ROSES REGENCY TEACUP AND SAUCER} 0.0153 0.8138298 0.0188 24.366161 153
[25] {GREEN REGENCY TEACUP AND SAUCER,
REGENCY CAKESTAND 3 TIER} => {ROSES REGENCY TEACUP AND SAUCER} 0.0122 0.8026316 0.0152 24.030886 122
[26] {PINK REGENCY TEACUP AND SAUCER} => {ROSES REGENCY TEACUP AND SAUCER} 0.0174 0.7467811 0.0233 22.358716 174
[27] {ROSES REGENCY TEACUP AND SAUCER} => {PINK REGENCY TEACUP AND SAUCER} 0.0174 0.5209581 0.0334 22.358716 174
[28] {ROSES REGENCY TEACUP AND SAUCER} => {GREEN REGENCY TEACUP AND SAUCER} 0.0223 0.6676647 0.0334 22.329922 223
[29] {GREEN REGENCY TEACUP AND SAUCER} => {ROSES REGENCY TEACUP AND SAUCER} 0.0223 0.7458194 0.0299 22.329922 223
[30] {PLASTERS IN TIN CIRCUS PARADE} => {PLASTERS IN TIN WOODLAND ANIMALS} 0.0110 0.5263158 0.0209 22.114109 110
[31] {CHARLOTTE BAG PINK POLKADOT} => {RED RETROSPOT CHARLOTTE BAG} 0.0120 0.6315789 0.0190 22.083180 120
[32] {ROUND SNACK BOXES SET OF 4 FRUITS} => {ROUND SNACK BOXES SET OF4 WOODLAND} 0.0106 0.5520833 0.0192 21.995352 106
[33] {HOT WATER BOTTLE I AM SO POORLY} => {CHOCOLATE HOT WATER BOTTLE} 0.0123 0.6275510 0.0196 21.865889 123
[34] {RED KITCHEN SCALES} => {IVORY KITCHEN SCALES} 0.0119 0.6010101 0.0198 21.854913 119
[35] {JUMBO BAG PEARS} => {JUMBO BAG APPLES} 0.0122 0.6455026 0.0189 21.233640 122
[36] {STRAWBERRY CHARLOTTE BAG} => {RED RETROSPOT CHARLOTTE BAG} 0.0115 0.5502392 0.0209 19.239134 115
[37] {BAKING SET SPACEBOY DESIGN} => {BAKING SET 9 PIECE RETROSPOT} 0.0110 0.6508876 0.0169 19.087612 110
[38] {ALARM CLOCK BAKELIKE GREEN} => {ALARM CLOCK BAKELIKE RED} 0.0195 0.6414474 0.0304 18.866099 195
[39] {ALARM CLOCK BAKELIKE RED} => {ALARM CLOCK BAKELIKE GREEN} 0.0195 0.5735294 0.0340 18.866099 195
[40] {ALARM CLOCK BAKELIKE IVORY} => {ALARM CLOCK BAKELIKE RED} 0.0115 0.6388889 0.0180 18.790850 115
[41] {ALARM CLOCK BAKELIKE ORANGE} => {ALARM CLOCK BAKELIKE RED} 0.0101 0.6352201 0.0159 18.682945 101
[42] {ALARM CLOCK BAKELIKE IVORY} => {ALARM CLOCK BAKELIKE GREEN} 0.0102 0.5666667 0.0180 18.640351 102
[43] {ALARM CLOCK BAKELIKE PINK} => {ALARM CLOCK BAKELIKE RED} 0.0148 0.6271186 0.0236 18.444666 148
[44] {LOVE BUILDING BLOCK WORD} => {HOME BUILDING BLOCK WORD} 0.0107 0.5270936 0.0203 18.429846 107
[45] {HOT WATER BOTTLE TEA AND SYMPATHY} => {CHOCOLATE HOT WATER BOTTLE} 0.0103 0.5228426 0.0197 18.217514 103
[46] {LUNCH BAG PINK POLKADOT,
LUNCH BAG SPACEBOY DESIGN} => {LUNCH BAG CARS BLUE} 0.0105 0.7046980 0.0149 18.115629 105
[47] {LUNCH BAG CARS BLUE,
LUNCH BAG SPACEBOY DESIGN} => {LUNCH BAG PINK POLKADOT} 0.0105 0.6730769 0.0156 17.666061 105
[48] {LUNCH BAG RED RETROSPOT,
LUNCH BAG SPACEBOY DESIGN} => {LUNCH BAG WOODLAND} 0.0105 0.6069364 0.0173 17.643500 105
[49] {ALARM CLOCK BAKELIKE PINK} => {ALARM CLOCK BAKELIKE GREEN} 0.0123 0.5211864 0.0236 17.144291 123
[50] {LUNCH BAG CARS BLUE,
LUNCH BAG RED RETROSPOT} => {LUNCH BAG PINK POLKADOT} 0.0118 0.6519337 0.0181 17.111121 118
[51] {LUNCH BAG BLACK SKULL,
LUNCH BAG CARS BLUE} => {LUNCH BAG PINK POLKADOT} 0.0106 0.6385542 0.0166 16.759953 106
[52] {LUNCH BAG RED RETROSPOT,
LUNCH BAG WOODLAND} => {LUNCH BAG SPACEBOY DESIGN} 0.0105 0.6140351 0.0171 16.158818 105
[53] {LUNCH BAG RED RETROSPOT,
LUNCH BAG WOODLAND} => {LUNCH BAG PINK POLKADOT} 0.0103 0.6023392 0.0171 15.809427 103
[54] {WOODEN FRAME ANTIQUE WHITE} => {WOODEN PICTURE FRAME WHITE FINISH} 0.0202 0.5821326 0.0347 15.523535 202
[55] {WOODEN PICTURE FRAME WHITE FINISH} => {WOODEN FRAME ANTIQUE WHITE} 0.0202 0.5386667 0.0375 15.523535 202
[56] {LUNCH BAG BLACK SKULL,
LUNCH BAG PINK POLKADOT} => {LUNCH BAG CARS BLUE} 0.0106 0.5888889 0.0180 15.138532 106
[57] {LUNCH BAG CARS BLUE,
LUNCH BAG PINK POLKADOT} => {LUNCH BAG SPACEBOY DESIGN} 0.0105 0.5737705 0.0183 15.099223 105
[58] {LUNCH BAG PINK POLKADOT,
LUNCH BAG RED RETROSPOT} => {LUNCH BAG CARS BLUE} 0.0118 0.5841584 0.0202 15.016926 118
[59] {LUNCH BAG BLACK SKULL,
LUNCH BAG RED RETROSPOT} => {LUNCH BAG PINK POLKADOT} 0.0118 0.5673077 0.0208 14.889966 118
[60] {LUNCH BAG PINK POLKADOT,
LUNCH BAG RED RETROSPOT} => {LUNCH BAG WOODLAND} 0.0103 0.5099010 0.0202 14.822703 103
[61] {LUNCH BAG DOLLY GIRL DESIGN} => {LUNCH BAG SPACEBOY DESIGN} 0.0126 0.5478261 0.0230 14.416476 126
[62] {LUNCH BAG PINK POLKADOT,
LUNCH BAG WOODLAND} => {LUNCH BAG RED RETROSPOT} 0.0103 0.7463768 0.0138 14.381056 103
[63] {LUNCH BAG VINTAGE LEAF DESIGN} => {LUNCH BAG APPLE DESIGN} 0.0114 0.5112108 0.0223 14.279630 114
[64] {LUNCH BAG PINK POLKADOT,
LUNCH BAG RED RETROSPOT} => {LUNCH BAG BLACK SKULL} 0.0118 0.5841584 0.0202 13.680525 118
[65] {LUNCH BAG CARS BLUE,
LUNCH BAG PINK POLKADOT} => {LUNCH BAG BLACK SKULL} 0.0106 0.5792350 0.0183 13.565222 106
[66] {PAINTED METAL PEARS ASSORTED} => {ASSORTED COLOUR BIRD ORNAMENT} 0.0111 0.7302632 0.0152 13.448677 111
[67] {LUNCH BAG CARS BLUE,
LUNCH BAG RED RETROSPOT} => {LUNCH BAG BLACK SKULL} 0.0103 0.5690608 0.0181 13.326950 103
[68] {LUNCH BAG BLACK SKULL,
LUNCH BAG PINK POLKADOT} => {LUNCH BAG RED RETROSPOT} 0.0118 0.6555556 0.0180 12.631128 118
[69] {LUNCH BAG CARS BLUE,
LUNCH BAG PINK POLKADOT} => {LUNCH BAG RED RETROSPOT} 0.0118 0.6448087 0.0183 12.424061 118
[70] {60 TEATIME FAIRY CAKE CASES} => {PACK OF 72 RETROSPOT CAKE CASES} 0.0145 0.5350554 0.0271 12.414277 145
[71] {LUNCH BAG SPACEBOY DESIGN,
LUNCH BAG WOODLAND} => {LUNCH BAG RED RETROSPOT} 0.0105 0.6250000 0.0168 12.042389 105
[72] {LUNCH BAG BLACK SKULL,
LUNCH BAG CARS BLUE} => {LUNCH BAG RED RETROSPOT} 0.0103 0.6204819 0.0166 11.955336 103
[73] {PACK OF 72 SKULL CAKE CASES} => {PACK OF 72 RETROSPOT CAKE CASES} 0.0108 0.5046729 0.0214 11.709348 108
[74] {LUNCH BAG PINK POLKADOT} => {LUNCH BAG RED RETROSPOT} 0.0202 0.5301837 0.0381 10.215486 202
[75] {JUMBO BAG STRAWBERRY} => {JUMBO BAG RED RETROSPOT} 0.0176 0.6641509 0.0265 9.897928 176
[76] {LUNCH BAG DOLLY GIRL DESIGN} => {LUNCH BAG RED RETROSPOT} 0.0117 0.5086957 0.0230 9.801458 117
[77] {JUMBO BAG PINK POLKADOT} => {JUMBO BAG RED RETROSPOT} 0.0222 0.5951743 0.0373 8.869959 222
[78] {JUMBO BAG SCANDINAVIAN BLUE PAISLEY} => {JUMBO BAG RED RETROSPOT} 0.0104 0.5683060 0.0183 8.469538 104
[79] {JUMBO BAG BAROQUE BLACK WHITE} => {JUMBO BAG RED RETROSPOT} 0.0149 0.5539033 0.0269 8.254893 149
[80] {JUMBO STORAGE BAG SUKI} => {JUMBO BAG RED RETROSPOT} 0.0161 0.5457627 0.0295 8.133572 161
[81] {RED HANGING HEART T-LIGHT HOLDER} => {WHITE HANGING HEART T-LIGHT HOLDER} 0.0175 0.6481481 0.0270 7.734465 175
[82] {JUMBO BAG SPACEBOY DESIGN} => {JUMBO BAG RED RETROSPOT} 0.0102 0.5125628 0.0199 7.638790 102
[83] {PINK REGENCY TEACUP AND SAUCER,
ROSES REGENCY TEACUP AND SAUCER} => {REGENCY CAKESTAND 3 TIER} 0.0102 0.5862069 0.0174 7.563960 102
[84] {JUMBO BAG PINK VINTAGE PAISLEY} => {JUMBO BAG RED RETROSPOT} 0.0125 0.5040323 0.0248 7.511658 125
[85] {JUMBO SHOPPER VINTAGE RED PAISLEY} => {JUMBO BAG RED RETROSPOT} 0.0157 0.5000000 0.0314 7.451565 157
[86] {GREEN REGENCY TEACUP AND SAUCER,
PINK REGENCY TEACUP AND SAUCER} => {REGENCY CAKESTAND 3 TIER} 0.0108 0.5744681 0.0188 7.412491 108
[87] {GREEN REGENCY TEACUP AND SAUCER,
ROSES REGENCY TEACUP AND SAUCER} => {REGENCY CAKESTAND 3 TIER} 0.0122 0.5470852 0.0223 7.059164 122
[88] {PINK REGENCY TEACUP AND SAUCER} => {REGENCY CAKESTAND 3 TIER} 0.0124 0.5321888 0.0233 6.866953 124
[89] {GREEN REGENCY TEACUP AND SAUCER} => {REGENCY CAKESTAND 3 TIER} 0.0152 0.5083612 0.0299 6.559499 152
[90] {ROSES REGENCY TEACUP AND SAUCER} => {REGENCY CAKESTAND 3 TIER} 0.0167 0.5000000 0.0334 6.451613 167
a)
i) The rule {SHED} -> {KEY FOB} means that customers who buy a SHED also tend to buy a KEY FOB in the same transaction.
ii) The support value for this rule is 0.0100, meaning that SHED and KEY FOB appear together in 1% of transactions (100 of 10,000). The confidence value is 1.0000, meaning that every transaction containing SHED also contains KEY FOB.
iii) The lift is 66.67, meaning that a customer who buys a SHED is about 66.67 times more likely to buy a KEY FOB than a customer chosen at random (KEY FOB appears in only 1.5% of all transactions).
b) Trivial rules may describe things that are already well-known or common sense. Some rules that would be trivial include purchasing Christmas decorations together, and colour pairings of items such as Green Teacups & Saucers with Pink Teacups & Saucers.
c) Actionable rules describe patterns that the business can use to change its strategy. Examples include Sheds with Key Fobs and Back Doors with Key Fobs. The business could offer a free Key Fob with its Sheds and Back Doors to entice customers to shop with it rather than with competitors.
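The lift quoted for {SHED} => {KEY FOB} can be reproduced from the other columns of the sorted output. This is a minimal base R sketch, assuming the usual definition lift = confidence / support(rhs); the overall support of KEY FOB (0.0150) is read from the coverage column of rule [3] {KEY FOB} => {SHED}, and the variable names are illustrative.

```r
# Values read from rule [1] {SHED} => {KEY FOB}
support_rule <- 0.0100  # share of transactions containing SHED and KEY FOB
coverage_lhs <- 0.0100  # share of transactions containing SHED

# Overall support of KEY FOB, taken from the coverage of rule [3] {KEY FOB} => {SHED}
support_rhs <- 0.0150

# Confidence: how often KEY FOB appears when SHED is in the basket
confidence <- support_rule / coverage_lhs

# Lift: confidence relative to the baseline frequency of KEY FOB
lift <- confidence / support_rhs

print(round(c(confidence = confidence, lift = lift), 4))
```

A lift far above 1 indicates that the two items co-occur much more often than they would if purchased independently.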
Question 7
pink_regency_rules <- subset(retail_rules, items %in% "PINK REGENCY TEACUP AND SAUCER")
inspect(pink_regency_rules)
lhs rhs support confidence coverage lift count
[1] {PINK REGENCY TEACUP AND SAUCER} => {GREEN REGENCY TEACUP AND SAUCER} 0.0188 0.8068670 0.0233 26.985517 188
[2] {GREEN REGENCY TEACUP AND SAUCER} => {PINK REGENCY TEACUP AND SAUCER} 0.0188 0.6287625 0.0299 26.985517 188
[3] {PINK REGENCY TEACUP AND SAUCER} => {ROSES REGENCY TEACUP AND SAUCER} 0.0174 0.7467811 0.0233 22.358716 174
[4] {ROSES REGENCY TEACUP AND SAUCER} => {PINK REGENCY TEACUP AND SAUCER} 0.0174 0.5209581 0.0334 22.358716 174
[5] {PINK REGENCY TEACUP AND SAUCER} => {REGENCY CAKESTAND 3 TIER} 0.0124 0.5321888 0.0233 6.866953 124
[6] {GREEN REGENCY TEACUP AND SAUCER,
PINK REGENCY TEACUP AND SAUCER} => {ROSES REGENCY TEACUP AND SAUCER} 0.0153 0.8138298 0.0188 24.366161 153
[7] {PINK REGENCY TEACUP AND SAUCER,
ROSES REGENCY TEACUP AND SAUCER} => {GREEN REGENCY TEACUP AND SAUCER} 0.0153 0.8793103 0.0174 29.408373 153
[8] {GREEN REGENCY TEACUP AND SAUCER,
ROSES REGENCY TEACUP AND SAUCER} => {PINK REGENCY TEACUP AND SAUCER} 0.0153 0.6860987 0.0223 29.446294 153
[9] {GREEN REGENCY TEACUP AND SAUCER,
PINK REGENCY TEACUP AND SAUCER} => {REGENCY CAKESTAND 3 TIER} 0.0108 0.5744681 0.0188 7.412491 108
[10] {PINK REGENCY TEACUP AND SAUCER,
REGENCY CAKESTAND 3 TIER} => {GREEN REGENCY TEACUP AND SAUCER} 0.0108 0.8709677 0.0124 29.129356 108
[11] {GREEN REGENCY TEACUP AND SAUCER,
REGENCY CAKESTAND 3 TIER} => {PINK REGENCY TEACUP AND SAUCER} 0.0108 0.7105263 0.0152 30.494692 108
[12] {PINK REGENCY TEACUP AND SAUCER,
ROSES REGENCY TEACUP AND SAUCER} => {REGENCY CAKESTAND 3 TIER} 0.0102 0.5862069 0.0174 7.563960 102
[13] {PINK REGENCY TEACUP AND SAUCER,
REGENCY CAKESTAND 3 TIER} => {ROSES REGENCY TEACUP AND SAUCER} 0.0102 0.8225806 0.0124 24.628163 102
[14] {REGENCY CAKESTAND 3 TIER,
ROSES REGENCY TEACUP AND SAUCER} => {PINK REGENCY TEACUP AND SAUCER} 0.0102 0.6107784 0.0167 26.213667 102
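If a customer purchases a PINK REGENCY TEACUP AND SAUCER, they are most likely to also buy the following, in descending order of confidence (rules [1], [3] and [5]):
GREEN REGENCY TEACUP AND SAUCER (confidence 0.807)
ROSES REGENCY TEACUP AND SAUCER (confidence 0.747)
REGENCY CAKESTAND 3 TIER (confidence 0.532)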
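Note that items %in% matches the item on either side of a rule, so only some of the 14 rules answer the question "what else does a buyer of this item purchase?". One possible way to rank the relevant rules is sketched below in base R, with the three single-item rules that have PINK REGENCY TEACUP AND SAUCER on the left-hand side copied by hand from the output. The data frame and its name are illustrative; in arules the same restriction could probably be expressed with lhs %in% inside subset().

```r
# Rules [1], [3] and [5] from inspect(pink_regency_rules), copied by hand
pink_lhs_rules <- data.frame(
  rhs = c("GREEN REGENCY TEACUP AND SAUCER",
          "ROSES REGENCY TEACUP AND SAUCER",
          "REGENCY CAKESTAND 3 TIER"),
  confidence = c(0.8068670, 0.7467811, 0.5321888),
  lift = c(26.985517, 22.358716, 6.866953),
  stringsAsFactors = FALSE
)

# Rank the right-hand-side items by how often they accompany the pink teacup
ranked <- pink_lhs_rules[order(pink_lhs_rules$confidence, decreasing = TRUE), ]
print(ranked)
```

Ranking by lift instead of confidence gives the same order for these three rules.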
Part B
Question 1
library(tidyverse)
library(recommenderlab)
Question 2
steam_ratings <- read_csv("steam_ratings.csv")
steam_ratings <- as(steam_ratings, "matrix")
steam_ratings <- as(steam_ratings, "realRatingMatrix")
Question 3
a)
vector_ratings <- as.vector(steam_ratings@data)
table(vector_ratings)
The ratings follow an approximately normal (bell-shaped) distribution centred on 3, which is the most common rating (19,762). Ratings of 2 (12,500) and 4 (10,655) occur at similar frequencies, as do ratings of 1 (4,773) and 5 (4,724).
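The shape of the distribution can be summarised numerically from the counts quoted in the answer. The sketch below is base R only and re-enters the five counts by hand, so it leaves out any zero entries that as.vector() may return for unrated user-game pairs; the variable names are illustrative.

```r
# Rating counts quoted in the answer to part (a)
rating_counts <- c(`1` = 4773, `2` = 12500, `3` = 19762, `4` = 10655, `5` = 4724)

# Total number of observed ratings
total <- sum(rating_counts)

# Share of each rating value
shares <- round(rating_counts / total, 3)

# Mean rating, weighting each value by its count
mean_rating <- sum(as.numeric(names(rating_counts)) * rating_counts) / total

print(shares)
print(round(mean_rating, 3))
```

The mean falls just below 3 because there are somewhat more ratings of 2 than of 4, so the distribution is only approximately symmetric.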
b)
colMeans(steam_ratings) %>%
  tibble::enframe(name = "game", value = "game_rating") %>%
  ggplot() +
  geom_histogram(mapping = aes(x = game_rating), color = "white")
c)
# rowCounts() returns the number of ratings per row, i.e. per user
rowCounts(steam_ratings) %>%
  tibble::enframe(name = "user", value = "n_ratings") %>%
  ggplot() +
  geom_histogram(mapping = aes(x = n_ratings), color = "white")
Question 4
a)
set.seed(101)
b)
eval_games <- evaluationScheme(data = steam_ratings,
                               method = "split",
                               train = 0.8,
                               given = 6,
                               goodRating = 3)
c)
train_games <- getData(eval_games, "train")
known_games <- getData(eval_games, "known")
unknown_games <- getData(eval_games, "unknown")
Question 5
a)
ubcf_model <- Recommender(data = train_games,
method = "UBCF",
parameter = list(normalize = "center", method = "Cosine"))
ubcf_predict <- predict(object = ubcf_model,
newdata = known_games,
type = "ratings")
ubcf_eval <- calcPredictionAccuracy(x = ubcf_predict,
data = unknown_games)
ubcf_eval
     RMSE       MSE       MAE
1.1697655 1.3683514 0.9183398
b) The Mean Absolute Error (MAE) of the model is 0.9183
Question 6
a)
ibcf_model <- Recommender(data = train_games,
method = "IBCF",
parameter = list(normalize = "center", method = "cosine"))
ibcf_predict <- predict(object = ibcf_model,
newdata = known_games,
type = "ratings")
ibcf_eval <- calcPredictionAccuracy(x = ibcf_predict,
data = unknown_games)
ibcf_eval
    RMSE      MSE      MAE
1.500713 2.252139 1.165198
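b) The Mean Absolute Error (MAE) for this model is 1.1652, which is higher than the UBCF model's 0.9183, so the UBCF model predicts ratings more accurately on this test split.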
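To make the three reported metrics concrete, the sketch below computes MAE, MSE and RMSE by hand in base R. The two rating vectors are hypothetical values chosen for illustration and are not taken from the Steam data. The last lines check that the reported UBCF figures are internally consistent, since MSE should equal RMSE squared.

```r
# Hypothetical held-out ratings and predictions (illustration only)
actual    <- c(4, 3, 5, 2, 1)
predicted <- c(3.5, 3.0, 4.0, 3.0, 2.0)

errors <- predicted - actual

mae  <- mean(abs(errors))  # average size of the error, in rating points
mse  <- mean(errors^2)     # squares the errors, so large misses count more
rmse <- sqrt(mse)          # back on the original rating scale

print(c(RMSE = rmse, MSE = mse, MAE = mae))

# Consistency check on the UBCF figures reported in Question 5
ubcf <- c(RMSE = 1.1697655, MSE = 1.3683514, MAE = 0.9183398)
ubcf_gap <- abs(unname(ubcf["RMSE"])^2 - unname(ubcf["MSE"]))
print(ubcf_gap < 1e-5)
```

An MAE of 0.92 therefore means that the UBCF predictions are off by a little under one rating point on average on the 1 to 5 scale.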
Question 7
ubcf_recs <- predict(object = ubcf_model,
newdata = known_games,
type = "topNList",
n = 3)
rec_list <- as(ubcf_recs, "list")
rec_list[1:5]
$`0`
[1] "Frozen Hearth" "FINAL FANTASY VII" "HAWKEN"
$`1`
[1] "Loadout Campaign Beta" "Royal Quest" "Villagers and Heroes"
$`2`
[1] "Hitman Blood Money" "Sonic Adventure 2" "Time Clickers"
$`3`
[1] "The Ultimate DOOM" "Door Kickers" "Train Fever"
$`4`
[1] "Sonic Adventure 2" "Quake Live" "Royal Quest"
The top 3 game recommendations for the first 5 users in the test set (known_games) are the following:
User 1: Frozen Hearth, FINAL FANTASY VII, HAWKEN
User 2: Loadout Campaign Beta, Royal Quest, Villagers and Heroes
User 3: Hitman Blood Money, Sonic Adventure 2, Time Clickers
User 4: The Ultimate DOOM, Door Kickers, Train Fever
User 5: Sonic Adventure 2, Quake Live, Royal Quest
Question 8
Steam could use this collaborative filtering model to increase user engagement and sales by recommending games tailored to each user's current behaviour. Personalised recommendations would keep users online and playing, and would surface games they are likely to be interested in buying, giving each user a customised experience.