library(arules)
library(recommenderlab)
library(tidyverse)
Recommendation Engines, Assignment 3
Question 1
1
retail <- read.transactions("retail_transactions_3.csv", sep = ",")
2
summary(retail)
transactions as itemMatrix in sparse format with
10000 rows (elements/itemsets/transactions) and
5479 columns (items) and a density of 0.002744552
most frequent items:
WHITE HANGING HEART T-LIGHT HOLDER    822
REGENCY CAKESTAND 3 TIER              776
JUMBO BAG RED RETROSPOT               663
PARTY BUNTING                         561
ASSORTED COLOUR BIRD ORNAMENT         544
(Other)                            147008
element (itemset/transaction) length distribution:
sizes
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
1658 707 498 413 365 341 316 310 309 290 261 227 227 242 260 227
17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
204 199 233 189 184 149 138 125 104 111 112 98 113 99 78 68
33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48
55 62 65 44 48 41 52 41 44 26 45 27 27 35 30 24
49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64
26 23 22 21 16 19 21 13 11 16 13 14 15 10 13 13
65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80
3 13 13 7 6 9 9 6 6 4 3 8 5 6 3 5
81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96
3 4 5 8 3 5 8 2 4 4 1 3 2 3 1 2
97 98 100 101 102 103 104 105 107 108 109 110 111 112 113 119
5 1 2 2 2 1 2 1 1 3 1 2 1 1 2 1
120 121 122 123 125 127 134 142 146 147 150 154 157 171 193 204
1 1 1 1 1 1 1 2 1 1 1 2 1 2 1 1
235 249
1 1
Min. 1st Qu. Median Mean 3rd Qu. Max.
1.00 3.00 10.00 15.04 21.00 249.00
includes extended item information - examples:
labels
1 1 HANGER
2 10 COLOUR SPACEBOY PEN
3 12 COLOURED PARTY BALLOONS
(A)
10,000
(B)
5,479
(C)
10,000 x 5,479
= 54,790,000
((i))
54,790,000 x 0.002744552
Non-zero values = 150,374
Zero values = 54,790,000 - 150,374 = 54,639,626
(D)
249
(E)
15
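The cell counts in (C) can be checked with a few lines of base R. This is a minimal sketch that copies the row count, column count and density from the summary(retail) output above; the variable names are illustrative only.

```r
# Values copied from the summary(retail) output
n_transactions <- 10000
n_items <- 5479
density <- 0.002744552  # share of cells that are non-zero

n_cells <- n_transactions * n_items    # total cells in the sparse matrix
n_nonzero <- round(n_cells * density)  # cells that record a purchased item
n_zero <- n_cells - n_nonzero          # empty cells

print(c(cells = n_cells, nonzero = n_nonzero, zero = n_zero))
```

The non-zero count agrees with the item frequencies in the summary: 822 + 776 + 663 + 561 + 544 + 147008 = 150374.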
3
itemFrequencyPlot(retail, topN = 20, horiz = TRUE)
4
retail_rules <- apriori(retail, parameter = list(support = 0.01,
confidence = 0.5,
minlen = 2))
Apriori
Parameter specification:
confidence minval smax arem aval originalSupport maxtime support minlen
0.5 0.1 1 none FALSE TRUE 5 0.01 2
maxlen target ext
10 rules TRUE
Algorithmic control:
filter tree heap memopt load sort verbose
0.1 TRUE TRUE FALSE TRUE 2 TRUE
Absolute minimum support count: 100
set item appearances ...[0 item(s)] done [0.00s].
set transactions ...[5479 item(s), 10000 transaction(s)] done [0.06s].
sorting and recoding items ... [384 item(s)] done [0.00s].
creating transaction tree ... done [0.00s].
checking subsets of size 1 2 3 4 done [0.00s].
writing ... [86 rule(s)] done [0.00s].
creating S4 object ... done [0.00s].
summary(retail_rules)
set of 86 rules
rule length distribution (lhs + rhs):sizes
2 3
47 39
Min. 1st Qu. Median Mean 3rd Qu. Max.
2.000 2.000 2.000 2.453 3.000 3.000
summary of quality measures:
support confidence coverage lift
Min. :0.01000 Min. :0.5021 Min. :0.01200 Min. : 6.585
1st Qu.:0.01063 1st Qu.:0.5510 1st Qu.:0.01733 1st Qu.:12.347
Median :0.01160 Median :0.6022 Median :0.02000 Median :15.266
Mean :0.01315 Mean :0.6198 Mean :0.02152 Mean :17.144
3rd Qu.:0.01515 3rd Qu.:0.6652 3rd Qu.:0.02428 3rd Qu.:21.864
Max. :0.02270 Max. :0.8814 Max. :0.03770 Max. :44.055
count
Min. :100.0
1st Qu.:106.2
Median :116.0
Mean :131.5
3rd Qu.:151.5
Max. :227.0
mining info:
data ntransactions support confidence
retail 10000 0.01 0.5
call
apriori(data = retail, parameter = list(support = 0.01, confidence = 0.5, minlen = 2))
(A)
86
(B)
A support threshold of 0.01 means that the items in a rule must appear together in at least 1% of all transactions (here, at least 100 of the 10,000) for the rule to be considered.
(C)
A confidence threshold of 0.5 means that the rule must hold in at least 50% of the transactions where the “if” condition (the left-hand side) is met.
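The two thresholds can be illustrated on a handful of made-up baskets. The sketch below is base R only and does not use the retail data; the baskets, the item names and the helper function `support()` are hypothetical and exist purely to show how support and confidence are calculated for one rule.

```r
# Hypothetical toy baskets (not taken from the retail data)
baskets <- list(
  c("bread", "butter"),
  c("bread", "butter", "jam"),
  c("bread"),
  c("milk"),
  c("bread", "milk")
)

# Share of baskets that contain every item in `items`
support <- function(items) {
  mean(vapply(baskets, function(b) all(items %in% b), logical(1)))
}

lhs <- "bread"
rhs <- "butter"

supp_rule <- support(c(lhs, rhs))      # baskets with both items / all baskets
conf_rule <- supp_rule / support(lhs)  # baskets with both items / baskets with lhs

print(c(support = supp_rule, confidence = conf_rule))
```

Here {bread} => {butter} has support 2/5 = 0.4 and confidence 2/4 = 0.5.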
5
(A)
47 rules have 2 items. 39 rules have 3 items.
(B)
Minimum = 6.585 Maximum = 44.055
6
inspect(sort(retail_rules, by = "lift"))
lhs rhs support confidence coverage lift count
[1] {WOODEN STAR CHRISTMAS SCANDINAVIAN} => {WOODEN HEART CHRISTMAS SCANDINAVIAN} 0.0113 0.7533333 0.0150 44.054581 113
[2] {WOODEN HEART CHRISTMAS SCANDINAVIAN} => {WOODEN STAR CHRISTMAS SCANDINAVIAN} 0.0113 0.6608187 0.0171 44.054581 113
[3] {PINK REGENCY TEACUP AND SAUCER,
ROSES REGENCY TEACUP AND SAUCER} => {GREEN REGENCY TEACUP AND SAUCER} 0.0156 0.8813559 0.0177 28.708662 156
[4] {GREEN REGENCY TEACUP AND SAUCER,
REGENCY CAKESTAND 3 TIER} => {PINK REGENCY TEACUP AND SAUCER} 0.0102 0.6938776 0.0147 28.672626 102
[5] {GREEN REGENCY TEACUP AND SAUCER,
ROSES REGENCY TEACUP AND SAUCER} => {PINK REGENCY TEACUP AND SAUCER} 0.0156 0.6872247 0.0227 28.397714 156
[6] {PINK REGENCY TEACUP AND SAUCER,
REGENCY CAKESTAND 3 TIER} => {GREEN REGENCY TEACUP AND SAUCER} 0.0102 0.8500000 0.0120 27.687296 102
[7] {GARDENERS KNEELING PAD KEEP CALM} => {GARDENERS KNEELING PAD CUP OF TEA} 0.0163 0.5821429 0.0280 26.105061 163
[8] {GARDENERS KNEELING PAD CUP OF TEA} => {GARDENERS KNEELING PAD KEEP CALM} 0.0163 0.7309417 0.0223 26.105061 163
[9] {PINK REGENCY TEACUP AND SAUCER} => {GREEN REGENCY TEACUP AND SAUCER} 0.0192 0.7933884 0.0242 25.843271 192
[10] {GREEN REGENCY TEACUP AND SAUCER} => {PINK REGENCY TEACUP AND SAUCER} 0.0192 0.6254072 0.0307 25.843271 192
[11] {DOLLY GIRL LUNCH BOX} => {SPACEBOY LUNCH BOX} 0.0138 0.6359447 0.0217 25.642931 138
[12] {SPACEBOY LUNCH BOX} => {DOLLY GIRL LUNCH BOX} 0.0138 0.5564516 0.0248 25.642931 138
[13] {GREEN REGENCY TEACUP AND SAUCER,
PINK REGENCY TEACUP AND SAUCER} => {ROSES REGENCY TEACUP AND SAUCER} 0.0156 0.8125000 0.0192 24.399399 156
[14] {REGENCY CAKESTAND 3 TIER,
ROSES REGENCY TEACUP AND SAUCER} => {GREEN REGENCY TEACUP AND SAUCER} 0.0116 0.7483871 0.0155 24.377430 116
[15] {JUMBO BAG PEARS} => {JUMBO BAG APPLES} 0.0111 0.6529412 0.0170 24.272906 111
[16] {GREEN REGENCY TEACUP AND SAUCER,
REGENCY CAKESTAND 3 TIER} => {ROSES REGENCY TEACUP AND SAUCER} 0.0116 0.7891156 0.0147 23.697167 116
[17] {GREEN REGENCY TEACUP AND SAUCER} => {ROSES REGENCY TEACUP AND SAUCER} 0.0227 0.7394137 0.0307 22.204615 227
[18] {ROSES REGENCY TEACUP AND SAUCER} => {GREEN REGENCY TEACUP AND SAUCER} 0.0227 0.6816817 0.0333 22.204615 227
[19] {ROUND SNACK BOXES SET OF 4 FRUITS} => {ROUND SNACK BOXES SET OF4 WOODLAND} 0.0100 0.5524862 0.0181 22.188200 100
[20] {PINK REGENCY TEACUP AND SAUCER} => {ROSES REGENCY TEACUP AND SAUCER} 0.0177 0.7314050 0.0242 21.964113 177
[21] {ROSES REGENCY TEACUP AND SAUCER} => {PINK REGENCY TEACUP AND SAUCER} 0.0177 0.5315315 0.0333 21.964113 177
[22] {ALARM CLOCK BAKELIKE GREEN,
ALARM CLOCK BAKELIKE PINK} => {ALARM CLOCK BAKELIKE RED} 0.0100 0.7575758 0.0132 21.895253 100
[23] {LARGE WHITE HEART OF WICKER} => {SMALL WHITE HEART OF WICKER} 0.0113 0.5159817 0.0219 21.771381 113
[24] {ALARM CLOCK BAKELIKE PINK,
ALARM CLOCK BAKELIKE RED} => {ALARM CLOCK BAKELIKE GREEN} 0.0100 0.6666667 0.0150 21.299255 100
[25] {CHARLOTTE BAG PINK POLKADOT} => {RED RETROSPOT CHARLOTTE BAG} 0.0127 0.6256158 0.0203 20.923604 127
[26] {HOT WATER BOTTLE I AM SO POORLY} => {CHOCOLATE HOT WATER BOTTLE} 0.0102 0.5454545 0.0187 20.661157 102
[27] {STRAWBERRY CHARLOTTE BAG} => {RED RETROSPOT CHARLOTTE BAG} 0.0117 0.6157895 0.0190 20.594966 117
[28] {ALARM CLOCK BAKELIKE GREEN,
ALARM CLOCK BAKELIKE RED} => {ALARM CLOCK BAKELIKE PINK} 0.0100 0.5025126 0.0199 20.344638 100
[29] {ALARM CLOCK BAKELIKE IVORY} => {ALARM CLOCK BAKELIKE RED} 0.0120 0.6896552 0.0174 19.932230 120
[30] {BAKING SET SPACEBOY DESIGN} => {BAKING SET 9 PIECE RETROSPOT} 0.0115 0.6388889 0.0180 19.597819 115
[31] {ALARM CLOCK BAKELIKE GREEN} => {ALARM CLOCK BAKELIKE RED} 0.0199 0.6357827 0.0313 18.375224 199
[32] {ALARM CLOCK BAKELIKE RED} => {ALARM CLOCK BAKELIKE GREEN} 0.0199 0.5751445 0.0346 18.375224 199
[33] {LUNCH BAG CARS BLUE,
LUNCH BAG RED RETROSPOT} => {LUNCH BAG WOODLAND} 0.0101 0.5343915 0.0189 17.578669 101
[34] {ALARM CLOCK BAKELIKE PINK} => {ALARM CLOCK BAKELIKE RED} 0.0150 0.6072874 0.0247 17.551660 150
[35] {CHARLOTTE BAG SUKI DESIGN} => {RED RETROSPOT CHARLOTTE BAG} 0.0114 0.5181818 0.0220 17.330496 114
[36] {WOODLAND CHARLOTTE BAG} => {RED RETROSPOT CHARLOTTE BAG} 0.0111 0.5115207 0.0217 17.107717 111
[37] {ALARM CLOCK BAKELIKE PINK} => {ALARM CLOCK BAKELIKE GREEN} 0.0132 0.5344130 0.0247 17.073896 132
[38] {WOODEN PICTURE FRAME WHITE FINISH} => {WOODEN FRAME ANTIQUE WHITE} 0.0197 0.5487465 0.0359 16.283279 197
[39] {WOODEN FRAME ANTIQUE WHITE} => {WOODEN PICTURE FRAME WHITE FINISH} 0.0197 0.5845697 0.0337 16.283279 197
[40] {LUNCH BAG CARS BLUE,
LUNCH BAG RED RETROSPOT} => {LUNCH BAG PINK POLKADOT} 0.0116 0.6137566 0.0189 16.280016 116
[41] {LUNCH BAG BLACK SKULL,
LUNCH BAG CARS BLUE} => {LUNCH BAG PINK POLKADOT} 0.0108 0.6000000 0.0180 15.915119 108
[42] {LUNCH BAG RED RETROSPOT,
LUNCH BAG SUKI DESIGN} => {LUNCH BAG PINK POLKADOT} 0.0103 0.5953757 0.0173 15.792459 103
[43] {LUNCH BAG RED RETROSPOT,
LUNCH BAG WOODLAND} => {LUNCH BAG CARS BLUE} 0.0101 0.5906433 0.0171 15.301639 101
[44] {LUNCH BAG BLACK SKULL,
LUNCH BAG RED RETROSPOT} => {LUNCH BAG PINK POLKADOT} 0.0120 0.5741627 0.0209 15.229779 120
[45] {LUNCH BAG DOLLY GIRL DESIGN} => {LUNCH BAG SPACEBOY DESIGN} 0.0131 0.5796460 0.0226 15.213806 131
[46] {LUNCH BAG BLACK SKULL,
LUNCH BAG PINK POLKADOT} => {LUNCH BAG CARS BLUE} 0.0108 0.5775401 0.0187 14.962179 108
[47] {LUNCH BAG CARS BLUE,
LUNCH BAG RED RETROSPOT} => {LUNCH BAG BLACK SKULL} 0.0118 0.6243386 0.0189 14.829896 118
[48] {LUNCH BAG RED RETROSPOT,
LUNCH BAG SPACEBOY DESIGN} => {LUNCH BAG PINK POLKADOT} 0.0101 0.5580110 0.0181 14.801354 101
[49] {LUNCH BAG PINK POLKADOT,
LUNCH BAG RED RETROSPOT} => {LUNCH BAG CARS BLUE} 0.0116 0.5686275 0.0204 14.731281 116
[50] {LUNCH BAG BLACK SKULL,
LUNCH BAG RED RETROSPOT} => {LUNCH BAG CARS BLUE} 0.0118 0.5645933 0.0209 14.626769 118
[51] {LUNCH BAG CARS BLUE,
LUNCH BAG PINK POLKADOT} => {LUNCH BAG BLACK SKULL} 0.0108 0.6101695 0.0177 14.493337 108
[52] {LUNCH BAG VINTAGE LEAF DESIGN} => {LUNCH BAG APPLE DESIGN} 0.0122 0.5020576 0.0243 14.468519 122
[53] {LUNCH BAG RED RETROSPOT,
LUNCH BAG WOODLAND} => {LUNCH BAG BLACK SKULL} 0.0102 0.5964912 0.0171 14.168438 102
[54] {LUNCH BAG PINK POLKADOT,
LUNCH BAG RED RETROSPOT} => {LUNCH BAG SUKI DESIGN} 0.0103 0.5049020 0.0204 14.064121 103
[55] {LUNCH BAG RED RETROSPOT,
LUNCH BAG SPACEBOY DESIGN} => {LUNCH BAG BLACK SKULL} 0.0107 0.5911602 0.0181 14.041810 107
[56] {LUNCH BAG RED RETROSPOT,
LUNCH BAG SUKI DESIGN} => {LUNCH BAG BLACK SKULL} 0.0102 0.5895954 0.0173 14.004641 102
[57] {LUNCH BAG PINK POLKADOT,
LUNCH BAG RED RETROSPOT} => {LUNCH BAG BLACK SKULL} 0.0120 0.5882353 0.0204 13.972335 120
[58] {LUNCH BAG CARS BLUE,
LUNCH BAG WOODLAND} => {LUNCH BAG RED RETROSPOT} 0.0101 0.7266187 0.0139 13.735703 101
[59] {LUNCH BAG BLACK SKULL,
LUNCH BAG WOODLAND} => {LUNCH BAG RED RETROSPOT} 0.0102 0.7183099 0.0142 13.578636 102
[60] {LUNCH BAG BLACK SKULL,
LUNCH BAG RED RETROSPOT} => {LUNCH BAG SPACEBOY DESIGN} 0.0107 0.5119617 0.0209 13.437316 107
[61] {LUNCH BAG PINK POLKADOT,
LUNCH BAG SPACEBOY DESIGN} => {LUNCH BAG RED RETROSPOT} 0.0101 0.7013889 0.0144 13.258769 101
[62] {PAINTED METAL PEARS ASSORTED} => {ASSORTED COLOUR BIRD ORNAMENT} 0.0106 0.7210884 0.0147 13.255302 106
[63] {LUNCH BAG BLACK SKULL,
LUNCH BAG CARS BLUE} => {LUNCH BAG RED RETROSPOT} 0.0118 0.6555556 0.0180 12.392355 118
[64] {LUNCH BAG CARS BLUE,
LUNCH BAG PINK POLKADOT} => {LUNCH BAG RED RETROSPOT} 0.0116 0.6553672 0.0177 12.388795 116
[65] {LUNCH BAG BLACK SKULL,
LUNCH BAG SPACEBOY DESIGN} => {LUNCH BAG RED RETROSPOT} 0.0107 0.6524390 0.0164 12.333441 107
[66] {LUNCH BAG PINK POLKADOT,
LUNCH BAG SUKI DESIGN} => {LUNCH BAG RED RETROSPOT} 0.0103 0.6518987 0.0158 12.323227 103
[67] {60 TEATIME FAIRY CAKE CASES} => {PACK OF 72 RETROSPOT CAKE CASES} 0.0134 0.5056604 0.0265 12.184587 134
[68] {LUNCH BAG BLACK SKULL,
LUNCH BAG PINK POLKADOT} => {LUNCH BAG RED RETROSPOT} 0.0120 0.6417112 0.0187 12.130647 120
[69] {LUNCH BAG BLACK SKULL,
LUNCH BAG SUKI DESIGN} => {LUNCH BAG RED RETROSPOT} 0.0102 0.6144578 0.0166 11.615460 102
[70] {LUNCH BAG WOODLAND} => {LUNCH BAG RED RETROSPOT} 0.0171 0.5625000 0.0304 10.633270 171
[71] {LUNCH BAG PINK POLKADOT} => {LUNCH BAG RED RETROSPOT} 0.0204 0.5411141 0.0377 10.228999 204
[72] {LUNCH BAG DOLLY GIRL DESIGN} => {LUNCH BAG RED RETROSPOT} 0.0121 0.5353982 0.0226 10.120950 121
[73] {JUMBO BAG STRAWBERRY} => {JUMBO BAG RED RETROSPOT} 0.0172 0.6515152 0.0264 9.826775 172
[74] {JUMBO BAG SCANDINAVIAN BLUE PAISLEY} => {JUMBO BAG RED RETROSPOT} 0.0108 0.6352941 0.0170 9.582113 108
[75] {JUMBO BAG PINK POLKADOT} => {JUMBO BAG RED RETROSPOT} 0.0223 0.6043360 0.0369 9.115174 223
[76] {CANDLEHOLDER PINK HANGING HEART} => {WHITE HANGING HEART T-LIGHT HOLDER} 0.0111 0.7449664 0.0149 9.062852 111
[77] {JUMBO BAG SPACEBOY DESIGN} => {JUMBO BAG RED RETROSPOT} 0.0113 0.5978836 0.0189 9.017852 113
[78] {JUMBO STORAGE BAG SUKI} => {JUMBO BAG RED RETROSPOT} 0.0186 0.5904762 0.0315 8.906127 186
[79] {RED HANGING HEART T-LIGHT HOLDER} => {WHITE HANGING HEART T-LIGHT HOLDER} 0.0186 0.6888889 0.0270 8.380643 186
[80] {JUMBO BAG BAROQUE BLACK WHITE} => {JUMBO BAG RED RETROSPOT} 0.0147 0.5505618 0.0267 8.304100 147
[81] {JUMBO BAG PINK VINTAGE PAISLEY} => {JUMBO BAG RED RETROSPOT} 0.0128 0.5493562 0.0233 8.285916 128
[82] {JUMBO BAG VINTAGE DOILY} => {JUMBO BAG RED RETROSPOT} 0.0116 0.5155556 0.0225 7.776102 116
[83] {JUMBO SHOPPER VINTAGE RED PAISLEY} => {JUMBO BAG RED RETROSPOT} 0.0152 0.5033113 0.0302 7.591422 152
[84] {JUMBO BAG WOODLAND ANIMALS} => {JUMBO BAG RED RETROSPOT} 0.0101 0.5024876 0.0201 7.578998 101
[85] {GREEN REGENCY TEACUP AND SAUCER,
PINK REGENCY TEACUP AND SAUCER} => {REGENCY CAKESTAND 3 TIER} 0.0102 0.5312500 0.0192 6.846005 102
[86] {GREEN REGENCY TEACUP AND SAUCER,
ROSES REGENCY TEACUP AND SAUCER} => {REGENCY CAKESTAND 3 TIER} 0.0116 0.5110132 0.0227 6.585222 116
((i))
If a customer purchases a Wooden Star Christmas Scandinavian, they are also likely to purchase a Wooden Heart Christmas Scandinavian.
((ii))
Support (0.0113) is the proportion of all transactions in which both items are bought together: 113 of 10,000.
Confidence (0.7533333) is the proportion of transactions containing the first item that also contain the second.
((iii))
A lift of 44.05 means that customers who purchase the Wooden Star Christmas Scandinavian are about 44 times more likely to also purchase the Wooden Heart Christmas Scandinavian than they would be if the two purchases were independent.
(B)
{GARDENERS KNEELING PAD KEEP CALM} => {GARDENERS KNEELING PAD CUP OF TEA}
This is a trivial rule: the two items are variants of the same product, so it is unsurprising that they are purchased together.
(C)
{JUMBO BAG STRAWBERRY} => {JUMBO BAG RED RETROSPOT}
Actionable rules are rules that a store or business can act on. The two items may not seem that closely related at first, but the fact that customers buy them together can offer insight into consumers' spending habits.
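The lift of rule [1] in the sorted list can be reproduced from the other columns of the inspect() output. This sketch assumes the standard definitions (confidence = support of both items / support of the left-hand side, lift = confidence / support of the right-hand side) and reads each single item's support from the coverage column of rules [1] and [2]; the variable names are illustrative.

```r
# Values copied from rules [1] and [2] of the inspect() output
supp_both  <- 0.0113  # star and heart bought together
supp_star  <- 0.0150  # coverage of rule [1]: transactions with the star
supp_heart <- 0.0171  # coverage of rule [2]: transactions with the heart

confidence <- supp_both / supp_star   # share of star buyers who also buy the heart
lift <- confidence / supp_heart       # compared with the heart's overall purchase rate

print(c(confidence = confidence, lift = lift))
```

Both results match the table: confidence 0.7533333 and lift 44.05458.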
7
rose_rule <- subset(retail_rules, items %in% "ROSES REGENCY TEACUP AND SAUCER")
inspect(rose_rule)
lhs rhs support confidence coverage lift count
[1] {PINK REGENCY TEACUP AND SAUCER} => {ROSES REGENCY TEACUP AND SAUCER} 0.0177 0.7314050 0.0242 21.964113 177
[2] {ROSES REGENCY TEACUP AND SAUCER} => {PINK REGENCY TEACUP AND SAUCER} 0.0177 0.5315315 0.0333 21.964113 177
[3] {GREEN REGENCY TEACUP AND SAUCER} => {ROSES REGENCY TEACUP AND SAUCER} 0.0227 0.7394137 0.0307 22.204615 227
[4] {ROSES REGENCY TEACUP AND SAUCER} => {GREEN REGENCY TEACUP AND SAUCER} 0.0227 0.6816817 0.0333 22.204615 227
[5] {GREEN REGENCY TEACUP AND SAUCER,
PINK REGENCY TEACUP AND SAUCER} => {ROSES REGENCY TEACUP AND SAUCER} 0.0156 0.8125000 0.0192 24.399399 156
[6] {PINK REGENCY TEACUP AND SAUCER,
ROSES REGENCY TEACUP AND SAUCER} => {GREEN REGENCY TEACUP AND SAUCER} 0.0156 0.8813559 0.0177 28.708662 156
[7] {GREEN REGENCY TEACUP AND SAUCER,
ROSES REGENCY TEACUP AND SAUCER} => {PINK REGENCY TEACUP AND SAUCER} 0.0156 0.6872247 0.0227 28.397714 156
[8] {GREEN REGENCY TEACUP AND SAUCER,
ROSES REGENCY TEACUP AND SAUCER} => {REGENCY CAKESTAND 3 TIER} 0.0116 0.5110132 0.0227 6.585222 116
[9] {GREEN REGENCY TEACUP AND SAUCER,
REGENCY CAKESTAND 3 TIER} => {ROSES REGENCY TEACUP AND SAUCER} 0.0116 0.7891156 0.0147 23.697167 116
[10] {REGENCY CAKESTAND 3 TIER,
ROSES REGENCY TEACUP AND SAUCER} => {GREEN REGENCY TEACUP AND SAUCER} 0.0116 0.7483871 0.0155 24.377430 116
Green Regency Teacup and Saucer
Pink Regency Teacup and Saucer
The Regency Cakestand 3 Tier is another item that some customers are also likely to buy with the Roses Regency Teacup and Saucer.
Question 2
(1)
library(recommenderlab)
library(tidyverse)
(2)
steam_ratings <- read_csv("steam_ratings.csv")
steam_ratings <- as(steam_ratings, "matrix")
steam_ratings <- as(steam_ratings, "realRatingMatrix")
(3)
vector_ratings <- as.vector(steam_ratings@data)
table(vector_ratings)
vector_ratings
0 1 2 3 4 5
3236066 4773 12500 19762 10655 4724
(B)
colMeans(steam_ratings) %>%
tibble::enframe(name = "games", value = "steam_ratings") %>%
ggplot() +
geom_histogram(mapping = aes(x = steam_ratings), color = "blue") +
scale_x_continuous(limits = c(1, 5), breaks = c(1, 2, 3, 4, 5),
labels = c('1', '2', '3', '4', '5'))
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Warning: Removed 2 rows containing non-finite outside the scale range
(`stat_bin()`).
Warning: Removed 2 rows containing missing values or values outside the scale range
(`geom_bar()`).
(C)
counts <- rowCounts(steam_ratings, value = TRUE, na.rm = FALSE)
ggplot() +
geom_histogram(mapping = aes(x = counts), color = "blue")
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
(4)
set.seed(101)
eval_games <- evaluationScheme(data = steam_ratings,
method = "split",
train = 0.8,
given = 6,
goodRating = 3)
train_games <- getData(eval_games, "train")
known_games <- getData(eval_games, "known")
unknown_games <- getData(eval_games, "unknown")
(5)
(a)
ubcf_model_cc <- Recommender(data = train_games,
method = "UBCF",
parameter = list(normalize = "center", method = "Cosine"))
ubcf_model_ce <- Recommender(data = train_games,
method = "UBCF",
parameter = list(normalize = "center", method = "Euclidean"))
ubcf_model_cp <- Recommender(data = train_games,
method = "UBCF",
parameter = list(normalize = "center", method = "Pearson"))
ubcf_model_zc <- Recommender(data = train_games,
method = "UBCF",
parameter = list(normalize = "z-score", method = "Cosine"))
ubcf_model_ze <- Recommender(data = train_games,
method = "UBCF",
parameter = list(normalize = "z-score", method = "Euclidean"))
ubcf_model_zp <- Recommender(data = train_games,
method = "UBCF",
parameter = list(normalize = "z-score", method = "Pearson"))
ubcf_model_nc <- Recommender(data = train_games,
method = "UBCF",
parameter = list(normalize = NULL, method = "Cosine"))
ubcf_model_ne <- Recommender(data = train_games,
method = "UBCF",
parameter = list(normalize = NULL, method = "Euclidean"))
ubcf_model_np <- Recommender(data = train_games,
method = "UBCF",
parameter = list(normalize = NULL, method = "Pearson"))
(b)
ubcf_predict_cc <- predict(object = ubcf_model_cc,
newdata = known_games,
type = "ratings")
ubcf_cc_eval <- calcPredictionAccuracy(x = ubcf_predict_cc,
data = unknown_games)
ubcf_cc_eval
RMSE MSE MAE
1.1697655 1.3683514 0.9183398
ubcf_predict_ce <- predict(object = ubcf_model_ce,
newdata = known_games,
type = "ratings")
ubcf_ce_eval <- calcPredictionAccuracy(x = ubcf_predict_ce,
data = unknown_games)
ubcf_ce_eval
RMSE MSE MAE
1.1910345 1.4185633 0.9163087
ubcf_predict_cp <- predict(object = ubcf_model_cp,
newdata = known_games,
type = "ratings")
ubcf_cp_eval <- calcPredictionAccuracy(x = ubcf_predict_cp,
data = unknown_games)
ubcf_cp_eval
RMSE MSE MAE
1.1212624 1.2572293 0.8702777
ubcf_predict_zc <- predict(object = ubcf_model_zc,
newdata = known_games,
type = "ratings")
ubcf_zc_eval <- calcPredictionAccuracy(x = ubcf_predict_zc,
data = unknown_games)
ubcf_zc_eval
RMSE MSE MAE
1.184555 1.403170 0.923375
ubcf_predict_ze <- predict(object = ubcf_model_ze,
newdata = known_games,
type = "ratings")
ubcf_ze_eval <- calcPredictionAccuracy(x = ubcf_predict_ze,
data = unknown_games)
ubcf_ze_eval
RMSE MSE MAE
1.2103032 1.4648339 0.9309624
ubcf_predict_zp <- predict(object = ubcf_model_zp,
newdata = known_games,
type = "ratings")
ubcf_zp_eval <- calcPredictionAccuracy(x = ubcf_predict_zp,
data = unknown_games)
ubcf_zp_eval
RMSE MSE MAE
1.1345807 1.2872733 0.8790968
ubcf_predict_nc <- predict(object = ubcf_model_nc,
newdata = known_games,
type = "ratings")
ubcf_nc_eval <- calcPredictionAccuracy(x = ubcf_predict_nc,
data = unknown_games)
ubcf_nc_eval
RMSE MSE MAE
1.0793268 1.1649463 0.8189319
ubcf_predict_ne <- predict(object = ubcf_model_ne,
newdata = known_games,
type = "ratings")
ubcf_ne_eval <- calcPredictionAccuracy(x = ubcf_predict_ne,
data = unknown_games)
ubcf_ne_eval
RMSE MSE MAE
1.0990975 1.2080152 0.8294308
ubcf_predict_np <- predict(object = ubcf_model_np,
newdata = known_games,
type = "ratings")
ubcf_np_eval <- calcPredictionAccuracy(x = ubcf_predict_np,
data = unknown_games)
ubcf_np_eval
RMSE MSE MAE
1.1086429 1.2290892 0.8349371
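For reference, the three error measures reported above can be computed by hand as follows. This is a minimal base R sketch on made-up ratings, assuming the usual definitions of MAE, MSE and RMSE; the vectors `actual` and `predicted` are hypothetical and are not drawn from the Steam data.

```r
# Hypothetical held-out ratings and the model's predictions for them
actual    <- c(4, 3, 5, 2, 1)
predicted <- c(3.5, 3, 4, 2.5, 2)

errors <- predicted - actual

mae  <- mean(abs(errors))  # mean absolute error
mse  <- mean(errors^2)     # mean squared error
rmse <- sqrt(mse)          # root mean squared error, in rating units

print(c(RMSE = rmse, MSE = mse, MAE = mae))
```

All three measure prediction error, so lower values indicate a more accurate model. RMSE is the square root of MSE, which is why the two always rank models in the same order.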
(6)
(A)
#centering#
ibcf_model_cc <- Recommender(data = train_games,
method = "IBCF",
parameter = list(normalize = "center", method = "Cosine"))
ibcf_model_ce <- Recommender(data = train_games,
method = "IBCF",
parameter = list(normalize = "center", method = "Euclidean"))
ibcf_model_cp <- Recommender(data = train_games,
method = "IBCF",
parameter = list(normalize = "center", method = "Pearson"))
#z-score#
ibcf_model_zc <- Recommender(data = train_games,
method = "IBCF",
parameter = list(normalize = "z-score", method = "Cosine"))
ibcf_model_ze <- Recommender(data = train_games,
method = "IBCF",
parameter = list(normalize = "z-score", method = "Euclidean"))
ibcf_model_zp <- Recommender(data = train_games,
method = "IBCF",
parameter = list(normalize = "z-score", method = "Pearson"))
#null#
ibcf_model_nc <- Recommender(data = train_games,
method = "IBCF",
parameter = list(normalize = NULL, method = "Cosine"))
ibcf_model_ne <- Recommender(data = train_games,
method = "IBCF",
parameter = list(normalize = NULL, method = "Euclidean"))
ibcf_model_np <- Recommender(data = train_games,
method = "IBCF",
parameter = list(normalize = NULL, method = "Pearson"))
(b)
ibcf_predict_cc <- predict(object = ibcf_model_cc,
newdata = known_games,
type = "ratings")
ibcf_cc_eval <- calcPredictionAccuracy(x = ibcf_predict_cc,
data = unknown_games)
ibcf_cc_eval
RMSE MSE MAE
1.500713 2.252139 1.165198
ibcf_predict_ce <- predict(object = ibcf_model_ce,
newdata = known_games,
type = "ratings")
ibcf_ce_eval <- calcPredictionAccuracy(x = ibcf_predict_ce,
data = unknown_games)
ibcf_ce_eval
RMSE MSE MAE
1.477274 2.182339 1.142542
ibcf_predict_cp <- predict(object = ibcf_model_cp,
newdata = known_games,
type = "ratings")
ibcf_cp_eval <- calcPredictionAccuracy(x = ibcf_predict_cp,
data = unknown_games)
ibcf_cp_eval
RMSE MSE MAE
1.470169 2.161397 1.158908
ibcf_predict_zc <- predict(object = ibcf_model_zc,
newdata = known_games,
type = "ratings")
ibcf_zc_eval <- calcPredictionAccuracy(x = ibcf_predict_zc,
data = unknown_games)
ibcf_zc_eval
RMSE MSE MAE
1.500976 2.252928 1.163775
ibcf_predict_ze <- predict(object = ibcf_model_ze,
newdata = known_games,
type = "ratings")
ibcf_ze_eval <- calcPredictionAccuracy(x = ibcf_predict_ze,
data = unknown_games)
ibcf_ze_eval
RMSE MSE MAE
1.475157 2.176087 1.141132
ibcf_predict_zp <- predict(object = ibcf_model_zp,
newdata = known_games,
type = "ratings")
ibcf_zp_eval <- calcPredictionAccuracy(x = ibcf_predict_zp,
data = unknown_games)
ibcf_zp_eval
RMSE MSE MAE
1.467355 2.153130 1.158796
ibcf_predict_nc <- predict(object = ibcf_model_nc,
newdata = known_games,
type = "ratings")
ibcf_nc_eval <- calcPredictionAccuracy(x = ibcf_predict_nc,
data = unknown_games)
ibcf_nc_eval
RMSE MSE MAE
1.587257 2.519385 1.239649
ibcf_predict_ne <- predict(object = ibcf_model_ne,
newdata = known_games,
type = "ratings")
ibcf_ne_eval <- calcPredictionAccuracy(x = ibcf_predict_ne,
data = unknown_games)
ibcf_ne_eval
RMSE MSE MAE
1.476175 2.179092 1.140654
ibcf_predict_np <- predict(object = ibcf_model_np,
newdata = known_games,
type = "ratings")
ibcf_np_eval <- calcPredictionAccuracy(x = ibcf_predict_np,
data = unknown_games)
ibcf_np_eval
RMSE MSE MAE
1.456788 2.122230 1.152312
(7)
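MAE by model (lower is better):
UBCF: lowest ubcf_nc - 0.8189319; highest ubcf_ze - 0.9309624
IBCF: lowest ibcf_ne - 1.140654; highest ibcf_nc - 1.239649
The recommendations below were generated with ubcf_model_ze and ibcf_model_nc.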
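Because MAE is an error measure, the most accurate model in each family is the one with the smallest value. A minimal sketch of that comparison, using the MAE figures printed in parts (5) and (6); the vector names are illustrative:

```r
# MAE values copied from the calcPredictionAccuracy() output above
ubcf_mae <- c(cc = 0.9183398, ce = 0.9163087, cp = 0.8702777,
              zc = 0.923375,  ze = 0.9309624, zp = 0.8790968,
              nc = 0.8189319, ne = 0.8294308, np = 0.8349371)
ibcf_mae <- c(cc = 1.165198, ce = 1.142542, cp = 1.158908,
              zc = 1.163775, ze = 1.141132, zp = 1.158796,
              nc = 1.239649, ne = 1.140654, np = 1.152312)

best_ubcf <- names(which.min(ubcf_mae))  # model with the lowest MAE
best_ibcf <- names(which.min(ibcf_mae))

print(c(UBCF = best_ubcf, IBCF = best_ibcf))
```

On these figures the lowest-error models are ubcf_nc (no normalization, cosine) and ibcf_ne (no normalization, Euclidean).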
ubcf_ze_recs <- predict(object = ubcf_model_ze,
newdata = known_games,
type = "topNList",
n = 3)
recommendation_list <- as(ubcf_ze_recs, "list")
recommendation_list[1:5]
$`0`
[1] "Pro Evolution Soccer 2015" "Deadpool"
[3] "Guns of Icarus Online"
$`1`
[1] "Valkyria Chronicles"
[2] "Lara Croft and the Guardian of Light"
[3] "Panzar"
$`2`
[1] "Duke Nukem 3D Megaton Edition" "The Ultimate DOOM"
[3] "Synergy"
$`3`
[1] "Sparkle 2 Evo" "Sang-Froid - Tales of Werewolves"
[3] "The Journey Down Chapter One"
$`4`
[1] "Assassin's Creed" "Sonic Adventure 2"
[3] "Galaxy on Fire 2 Full HD"
User 0: Pro Evolution Soccer 2015, Deadpool, Guns of Icarus Online
User 1: Valkyria Chronicles, Lara Croft and the Guardian of Light, Panzar
User 2: Duke Nukem 3D Megaton Edition, The Ultimate DOOM, Synergy
User 3: Sparkle 2 Evo, Sang-Froid - Tales of Werewolves, The Journey Down Chapter One
User 4: Assassin's Creed, Sonic Adventure 2, Galaxy on Fire 2 Full HD
ibcf_nc_recs <- predict(object = ibcf_model_nc,
newdata = known_games,
type = "topNList",
n = 3)
recommendation_list2 <- as(ibcf_nc_recs, "list")
recommendation_list2[1:5]
$`0`
[1] "Wind of Luck Arena" "404Sight" "8BitMMO"
$`1`
[1] "3DMark" "AdVenture Capitalist" "Age of Wonders III"
$`2`
[1] "Heroes of Might & Magic III - HD Edition"
[2] "Age of Conan Unchained - EU version"
[3] "Alien Rage - Unlimited"
$`3`
[1] "60 Seconds!" "Batla" "Bus Driver"
$`4`
[1] "Anno 1404" "Axis Game Factory's AGFPRO 3.0"
[3] "Blood Bowl Chaos Edition"
User 0: Wind of Luck Arena, 404Sight, 8BitMMO
User 1: 3DMark, AdVenture Capitalist, Age of Wonders III
User 2: Heroes of Might & Magic III - HD Edition, Age of Conan Unchained - EU version, Alien Rage - Unlimited
User 3: 60 Seconds!, Batla, Bus Driver
User 4: Anno 1404, Axis Game Factory's AGFPRO 3.0, Blood Bowl Chaos Edition
(8)
Steam could use this output to increase user engagement and drive sales by giving each user personalized recommendations. With a good recommendation engine, Steam could convert one-time buyers into repeat customers and build a relationship with each of them.