#option: warning=FALSE
library(arules)
## Warning: package 'arules' was built under R version 4.5.2
## Loading required package: Matrix
##
## Attaching package: 'arules'
## The following objects are masked from 'package:base':
##
## abbreviate, write
library(arulesViz)
Streaming platforms and the streamers on them are becoming increasingly popular. As their audiences grow, so do their influence and the revenue they generate, both for themselves and for platforms such as Twitch.
In this project, we will try to analyze the Twitch user dataset to identify the trends that users follow when interacting with live streamers. By examining patterns of co-interaction between users and streamers, the project aims to uncover underlying behavioral structures within the Twitch ecosystem. In particular, we investigate whether users tend to follow groups of streamers with shared characteristics, such as content type, language, or community overlap.
To achieve this, association rule mining is applied to model user behavior in a transaction-based manner. The extracted association rules are then evaluated to determine their potential usefulness in building recommendation systems for streaming platforms. Such systems could assist platforms like Twitch in suggesting relevant streamers to users based on observed interaction patterns, thereby improving user engagement and content discoverability, which is crucial for the business.
There are multiple questions that the project aims to answer:
Which streamers are most frequently co-watched by the same users?
Are there identifiable clusters of streamers that share a common audience?
Do the extracted rules suggest potential streamer recommendation strategies for users?
This is a dataset of users consuming streaming content on Twitch. The authors recorded all streamers, and all users connected to their respective chats, every 10 minutes over 43 days. Link: https://cseweb.ucsd.edu/~jmcauley/datasets.html#twitch
Start and stop times are provided as integers and represent periods of 10 minutes. Stream ID could be used to retrieve a single broadcast segment from a streamer (not used in our work).
User ID (anonymized)
Stream ID
Streamer username
Time start
Time stop
| User ID | Stream ID | Streamer username | Time start | Time stop |
|---|---|---|---|---|
| 1 | 34347669376 | grimnax | 5415 | 5419 |
| 1 | 34391109664 | jtgtv | 5869 | 5870 |
| 1 | 34395247264 | towshun | 5898 | 5899 |
| 1 | 34405646144 | mithrain | 6024 | 6025 |
| 2 | 33848559952 | chfhdtpgus1 | 206 | 207 |
| 2 | 33881429664 | sal_gu | 519 | 524 |
| 2 | 33921292016 | chfhdtpgus1 | 922 | 924 |
data <- read.csv("100k_a.csv")
head(data, 20)
## X1 X33842865744 mithrain X154 X156
## 1 1 33846768288 alptv 166 169
## 2 1 33886469056 mithrain 587 588
## 3 1 33887624992 wtcn 589 591
## 4 1 33890145056 jrokezftw 591 594
## 5 1 33903958784 berkriptepe 734 737
## 6 1 33929318864 kendinemuzisyen 1021 1036
## 7 1 33942837056 wtcn 1165 1167
## 8 1 33955351648 kendinemuzisyen 1295 1297
## 9 1 34060922080 mithrain 2458 2459
## 10 1 34062621584 unlostv 2454 2456
## 11 1 34077379792 mithrain 2601 2603
## 12 1 34078096176 zeon 2603 2604
## 13 1 34079135968 elraenn 2600 2601
## 14 1 34082259232 zeon 2604 2605
## 15 1 34157036272 mithrain 3459 3460
## 16 1 34169481232 kendinemuzisyen 3600 3601
## 17 1 34185325968 unlostv 3739 3743
## 18 1 34188146896 wtcn 3755 3757
## 19 1 34188931888 jahrein 3757 3760
## 20 1 34195515568 mithrain 3874 3875
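After reading the dataset and displaying the first rows, we see that the file has no header line: the first record was interpreted as column names (X1, X33842865744, mithrain, ...). The columns therefore have no proper names, which has to be fixed.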
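As a minimal sketch of how the lost first record could be avoided, the file can be read with `header = FALSE`. The example below uses a two-line temporary file as a stand-in for `100k_a.csv`, and the column names are illustrative assumptions, not names defined by the dataset.

```r
# Toy stand-in for the real file: no header line, five columns.
tmp <- tempfile(fileext = ".csv")
writeLines(c("1,33842865744,mithrain,154,156",
             "1,33846768288,alptv,166,169"), tmp)

# Default behaviour: the first record is consumed as the header.
rows_default <- nrow(read.csv(tmp))

# header = FALSE keeps the first record as data;
# col.names assigns readable (hypothetical) column names.
demo <- read.csv(
  tmp,
  header = FALSE,
  col.names = c("user_id", "stream_id", "streamer_nickname",
                "time_start", "time_stop")
)
rows_fixed <- nrow(demo)

c(default = rows_default, header_false = rows_fixed)
```

Applying the same option to the real file would change the row counts reported later by at most one record, so the results in this report are left as originally computed.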
We only need the user ID and the streamer nickname, so we keep the 1st and 3rd columns. Each row corresponds to one period during which the user was connected to a streamer's chat, so the same streamer can appear multiple times for the same user. Therefore, we keep only unique user-streamer pairs.
data <- data[,c(1,3)]
colnames(data) <- c("user_id", "streamer_nickname")
data <- unique(data)
head(data)
## user_id streamer_nickname
## 1 1 alptv
## 2 1 mithrain
## 3 1 wtcn
## 4 1 jrokezftw
## 5 1 berkriptepe
## 6 1 kendinemuzisyen
Note: Keeping only unique entries removed almost half of the rows, all of which were redundant for this analysis.
Next, we remove streamers with low support (<1%) and keep only those transactions that contain more than 1 streamer. As a result, 45% of the transactions are removed. This does not change the direction of the analysis, because the data was collected over only 43 days and reflects a specific time period rather than the whole picture of Twitch. Therefore, I decided to focus on relatively popular items.
trans <- as(split(data$streamer_nickname, data$user_id), "transactions")
trans <- trans[, itemFrequency(trans) > 0.01]
trans <- trans[size(trans) > 1]
summary(trans)
## transactions as itemMatrix in sparse format with
## 55535 rows (elements/itemsets/transactions) and
## 137 columns (items) and a density of 0.04048542
##
## most frequent items:
## ninja tfue shroud riotgames nickmercs (Other)
## 16213 13896 10289 6975 5894 254758
##
## element (itemset/transaction) length distribution:
## sizes
## 2 3 4 5 6 7 8 9 10 11 12 13 14
## 12878 10104 7567 5556 4090 3120 2414 1902 1513 1284 1011 789 639
## 15 16 17 18 19 20 21 22 23 24 25 26 27
## 507 417 342 268 222 200 156 119 104 77 53 40 37
## 28 29 30 31 32 33 34 35 36 37 38 39 40
## 37 23 16 12 7 7 7 5 3 3 2 1 2
## 42
## 1
##
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2.000 3.000 4.000 5.547 7.000 42.000
##
## includes extended item information - examples:
## labels
## 1167 72hrs
## 1443 a_seagull
## 2559 admiralbahroo
##
## includes extended transaction information - examples:
## transactionID
## 1 1
## 2 2
## 4 4
From the summary, we can already see that the data is plausible: the most popular creators are Ninja, Tfue, and Shroud. All three are game streamers, so I would expect association rules between them. Ninja and Tfue mainly play Fortnite, so they should be more closely connected, while Shroud is a more general video-game streamer. Because transactions with a single streamer were removed, the minimum transaction size is 2. The mean size is about 5.5, which indicates that users are moderately active. One user interacted with 42 different streamers in 43 days, which is very impressive. The density of about 4% means that, out of 55535 x 137 = 7608295 possible user-streamer pairs, roughly 308000 are actually observed.
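The density arithmetic can be checked directly from the numbers printed by `summary(trans)`. This is only a small verification sketch; the variable names are chosen here for illustration.

```r
# Values copied from the summary output above.
n_users     <- 55535        # transactions (rows)
n_streamers <- 137          # items (columns)
density     <- 0.04048542   # share of non-empty cells

possible_pairs <- n_users * n_streamers             # all user-streamer cells
observed_pairs <- round(density * possible_pairs)   # non-empty cells
mean_size      <- observed_pairs / n_users          # average streamers per user

c(possible = possible_pairs, observed = observed_pairs,
  mean_size = round(mean_size, 3))
```

The observed count also equals the sum of the item frequencies listed under "most frequent items", and the resulting mean agrees with the reported mean transaction size of 5.547.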
We display a few example transactions to confirm that the structure is correct.
inspect(trans[1:3])
## items transactionID
## [1] {esl_csgo,
## kendinemuzisyen,
## mithrain,
## wtcn} 1
## [2] {hanryang1125,
## lol_ambition} 2
## [3] {kendinemuzisyen,
## mithrain,
## mrsavage,
## ninja,
## solaryfortnite,
## tfue,
## timthetatman,
## wtcn} 4
itemFrequencyPlot(
trans,
topN = 20,
type = "absolute",
main = "Top 20 Most Watched Streamers"
)
In this stage of the project, association rule mining is applied to discover recurring patterns in user interaction behavior on the Twitch platform. Association rules describe relationships between sets of items that frequently occur together within a collection of transactions. In the context of this project, each transaction represents a single user, while the items correspond to the streamers that the user has interacted with. The objective is to identify combinations of streamers that tend to share the same audience and to evaluate the strength of these relationships.
To assess the relevance and reliability of the extracted rules, three standard measures are used:
Support reflects how often a particular combination of streamers appears across all users in the dataset, indicating the overall prevalence of a pattern.
Confidence measures the likelihood that a user interacting with a given set of streamers will also interact with another specific streamer, thus capturing the predictive strength of the rule.
Lift compares the observed confidence of a rule to the expected probability of the consequent occurring independently, allowing us to determine whether the association represents a meaningful relationship or merely a coincidental overlap. Lift values greater than one indicate a positive association between streamers, while values close to one suggest independence.
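To make the three measures concrete, the sketch below recomputes them by hand for the rule {tfue} => {ninja}, using counts that appear in the summary and rule tables of this report. The variable names are illustrative.

```r
# Counts taken from the summary and rule tables in this report.
n_users <- 55535   # number of transactions
n_tfue  <- 13896   # transactions that contain tfue
n_ninja <- 16213   # transactions that contain ninja
n_both  <- 9131    # transactions that contain both

support    <- n_both / n_users                   # P(tfue and ninja)
confidence <- n_both / n_tfue                    # P(ninja | tfue)
lift       <- confidence / (n_ninja / n_users)   # confidence relative to P(ninja)

round(c(support = support, confidence = confidence, lift = lift), 4)
```

A lift of about 2.25 means that Tfue viewers are a little more than twice as likely to also watch Ninja as an average user in the filtered data.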
To efficiently discover meaningful associations, the Apriori algorithm is used in this project. Apriori is well suited for this type of analysis because it builds candidate item combinations level by level and discards those that do not meet a minimum support requirement. The key idea behind the algorithm is that support can only decrease as items are added: if a group of streamers does not reach the minimum support, then no larger group containing those streamers can reach it either, so such groups never need to be counted.
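The pruning idea can be illustrated on a toy example in base R. This is a sketch of the underlying property only, not of the Apriori implementation in `arules`; the transactions are made up and the helper `itemset_support` is a hypothetical name.

```r
# Toy transactions; streamer names are reused purely for illustration.
toy <- list(
  c("ninja", "tfue", "shroud"),
  c("ninja", "tfue"),
  c("ninja", "shroud"),
  c("tfue"),
  c("ninja", "tfue", "nickmercs")
)

# Fraction of transactions that contain every item of the itemset.
itemset_support <- function(itemset, transactions) {
  mean(vapply(transactions, function(tx) all(itemset %in% tx), logical(1)))
}

s_pair   <- itemset_support(c("ninja", "shroud"), toy)
s_triple <- itemset_support(c("ninja", "shroud", "tfue"), toy)

# Adding an item can never increase support.
s_triple <= s_pair
```

With a minimum support of, say, 0.5, the pair {ninja, shroud} would already be discarded, so the triple that extends it would never have to be examined.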
Now, we run the algorithm to obtain trends by looking at the rules that appear.
rules <- apriori(
trans,
parameter = list(
supp = 0.04,
conf = 0.4
)
)
## Apriori
##
## Parameter specification:
## confidence minval smax arem aval originalSupport maxtime support minlen
## 0.4 0.1 1 none FALSE TRUE 5 0.04 1
## maxlen target ext
## 10 rules TRUE
##
## Algorithmic control:
## filter tree heap memopt load sort verbose
## 0.1 TRUE TRUE FALSE TRUE 2 TRUE
##
## Absolute minimum support count: 2221
##
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[137 item(s), 55535 transaction(s)] done [0.01s].
## sorting and recoding items ... [41 item(s)] done [0.00s].
## creating transaction tree ... done [0.01s].
## checking subsets of size 1 2 3 done [0.00s].
## writing ... [53 rule(s)] done [0.00s].
## creating S4 object ... done [0.00s].
rules_support <- sort(rules, by = "support", decreasing = TRUE)
inspect(rules_support[1:10])
## lhs rhs support confidence coverage lift count
## [1] {tfue} => {ninja} 0.16441883 0.6570956 0.25022058 2.250774 9131
## [2] {ninja} => {tfue} 0.16441883 0.5631900 0.29194202 2.250774 9131
## [3] {nickmercs} => {ninja} 0.08214639 0.7740075 0.10613127 2.651237 4562
## [4] {nickmercs} => {tfue} 0.08124606 0.7655243 0.10613127 3.059398 4512
## [5] {dakotaz} => {ninja} 0.07685244 0.7348485 0.10458270 2.517104 4268
## [6] {fortnite} => {tfue} 0.07415144 0.7333927 0.10110741 2.930985 4118
## [7] {symfuhny} => {tfue} 0.07325110 0.7812560 0.09376069 3.122269 4068
## [8] {fortnite} => {ninja} 0.07206266 0.7127337 0.10110741 2.441354 4002
## [9] {timthetatman} => {ninja} 0.07080220 0.7395148 0.09574142 2.533088 3932
## [10] {symfuhny} => {ninja} 0.07056811 0.7526407 0.09376069 2.578048 3919
rules_conf <- sort(rules, by = "confidence", decreasing = TRUE)
inspect(rules_conf[1:10])
## lhs rhs support confidence coverage lift
## [1] {drlupo, tfue} => {ninja} 0.04013685 0.8937450 0.04490862 3.061378
## [2] {tfue, timthetatman} => {ninja} 0.05016656 0.8847253 0.05670298 3.030483
## [3] {aydan, ninja} => {tfue} 0.04211758 0.8634182 0.04878005 3.450628
## [4] {chap} => {tfue} 0.04582696 0.8416005 0.05445215 3.363434
## [5] {cloakzy} => {tfue} 0.04256775 0.8350406 0.05097686 3.337218
## [6] {cloakzy} => {ninja} 0.04256775 0.8350406 0.05097686 2.860296
## [7] {ninja, symfuhny} => {tfue} 0.05866571 0.8313345 0.07056811 3.322407
## [8] {nickmercs, tfue} => {ninja} 0.06716485 0.8266844 0.08124606 2.831673
## [9] {drlupo} => {ninja} 0.05587467 0.8206824 0.06808319 2.811114
## [10] {tfue, tsm_myth} => {ninja} 0.04208157 0.8200000 0.05131899 2.808777
## count
## [1] 2229
## [2] 2786
## [3] 2339
## [4] 2545
## [5] 2364
## [6] 2364
## [7] 3258
## [8] 3730
## [9] 3103
## [10] 2337
rules_lift <- sort(rules, by = "lift", decreasing = TRUE)
inspect(rules_lift[1:10])
## lhs rhs support confidence coverage
## [1] {asmongold} => {sodapoppin} 0.05371387 0.6642173 0.08086792
## [2] {sodapoppin} => {asmongold} 0.05371387 0.5127191 0.10476276
## [3] {symfuhny} => {nickmercs} 0.04179346 0.4457461 0.09376069
## [4] {ninja, tfue} => {nickmercs} 0.06716485 0.4084985 0.16441883
## [5] {fortnite} => {nickmercs} 0.04116323 0.4071238 0.10110741
## [6] {aydan, ninja} => {tfue} 0.04211758 0.8634182 0.04878005
## [7] {chap} => {tfue} 0.04582696 0.8416005 0.05445215
## [8] {cloakzy} => {tfue} 0.04256775 0.8350406 0.05097686
## [9] {ninja, symfuhny} => {tfue} 0.05866571 0.8313345 0.07056811
## [10] {drdisrespect} => {shroud} 0.04150536 0.6145028 0.06754299
## lift count
## [1] 6.340204 2983
## [2] 6.340204 2983
## [3] 4.199951 2321
## [4] 3.848993 3730
## [5] 3.836040 2286
## [6] 3.450628 2339
## [7] 3.363434 2545
## [8] 3.337218 2364
## [9] 3.322407 3258
## [10] 3.316786 2305
Instead of going through all generated association rules one by one, it is much more informative to analyze them using the key measures of support, confidence, and lift. Each of these metrics highlights a different aspect of the relationships between streamers and helps us better understand user viewing behavior on Twitch.
Rules with the highest support represent streamer combinations that appear most frequently among users. A high support value means that a given pair or group of streamers is commonly watched together by a large portion of the audience. In this dataset, such rules mainly reflect mainstream viewing patterns and popular streamer combinations that could be used for platform-wide recommendations. Tfue and Ninja clearly dominate this group and appear in symmetric rules. This is not surprising, as both streamers focus on similar gaming content and have attracted overlapping communities, largely due to their popularity in Fortnite. Based on these rules, we can infer that users who watch NickMercs, Dakotaz, or the official Fortnite channel are also very likely to watch Tfue or Ninja, with confidence values between roughly 0.71 and 0.77 for the rules shown.
The confidence measure shows how reliably the presence of one streamer (or a group of streamers) predicts the presence of another. High-confidence rules indicate that when users watch the antecedent streamer(s), they are very likely to also watch the consequent streamer. In this analysis, the highest confidence values are usually associated with rules that include multiple streamers on the left-hand side, suggesting more specific and focused viewing behavior. Although these rules often have lower support and therefore apply to a smaller group of users, they are particularly valuable for personalized recommendation systems, where accuracy is more important than reaching a broad audience. A good example of such a rule is that users who watch both DrLupo and Tfue are also very likely to watch Ninja. Similarly, users who watch Ninja together with Aydan tend to also watch Tfue.
The lift metric helps assess whether a relationship between streamers is stronger than what would be expected by chance. Lift values greater than one indicate a positive association, meaning that the streamers are watched together more often than if user choices were independent. In this dataset, the highest lift values are often observed for rules involving less popular streamers that form more niche communities. Even though these rules may have relatively low support, their high lift values suggest strong and meaningful relationships. From a recommendation perspective, such rules are especially interesting, as they can reveal hidden connections between streamers and support content discovery within specific user segments. One example of this type of relationship is the pair Asmongold and Sodapoppin, which did not stand out in the previous analyses based on support or confidence alone.
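The Asmongold and Sodapoppin pair also shows why lift is reported identically for both directions of a rule while confidence is not. The sketch below recomputes the values from the support and coverage columns of the Apriori output; the variable names are illustrative.

```r
# Values from the Apriori rule table: support of the pair and
# coverage (i.e. support) of each single streamer.
supp_pair <- 0.05371387   # both streamers in the same transaction
supp_asm  <- 0.08086792   # asmongold
supp_soda <- 0.10476276   # sodapoppin

conf_asm_to_soda <- supp_pair / supp_asm                # depends on direction
conf_soda_to_asm <- supp_pair / supp_soda               # depends on direction
lift_pair        <- supp_pair / (supp_asm * supp_soda)  # same in both directions

round(c(conf_asm_to_soda, conf_soda_to_asm, lift_pair), 4)
```

Because lift divides the joint support by the product of both individual supports, swapping antecedent and consequent leaves it unchanged, which is why both rules share the value 6.34.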
plot(rules,
measure = c("support", "confidence"),
shading = "lift",
main = "Scatter Plot for Twitch rules")
## To reduce overplotting, jitter is added! Use jitter = 0 to prevent jitter.
From the above plot we can note two rules that combine high confidence with comparatively high support: {nickmercs, tfue} => {ninja} and {nickmercs, ninja} => {tfue}. Both have a confidence above 0.8 and a support of about 0.067, which makes them the most important finding. The simpler rules nickmercs => ninja and nickmercs => tfue have slightly lower confidence (about 0.77) but higher support (about 0.08). We can also see two rules with a very high lift relative to the others. These are the symmetric rules between sodapoppin and asmongold.
rules_top <- head(sort(rules, by = "lift"), 20)
plot(
rules_top,
method = "graph",
engine = "igraph",
# only control parameters accepted by the igraph engine are used here
control = list(
arrowSize = 0.5,
cex = 0.8
),
shading = "lift"
)
From the above graph, we can also identify clusters of streamers. The large cluster in the middle consists of streamers who focus on shooter games such as Fortnite and Call of Duty: Warzone. The cluster at the bottom right contains streamers from an older generation who play a variety of games. At the top right, we have the previously mentioned pair Asmongold and Sodapoppin, who create similar content and are therefore watched together.
rules_focus <- subset(rules, rhs %in% "tfue" | rhs %in% "ninja")
plot(rules_focus, method = "grouped")
The grouped matrix plot highlights rules where Tfue and Ninja appear as consequents. The size of each circle represents how frequently a combination of antecedent streamers occurs (support), while the color indicates the strength of the association relative to chance (lift). This visualization makes it easy to see which streamer combinations are most strongly associated with these popular streamers, and can help identify patterns for targeted recommendations. We can note from the matrix that rules with Tfue as a consequent are especially strong in terms of lift.
Next, I run the Eclat algorithm to check whether it produces different rules or gives additional insight into the data.
itemsets <- eclat(
trans,
parameter = list(
supp = 0.04
)
)
## Eclat
##
## parameter specification:
## tidLists support minlen maxlen target ext
## FALSE 0.04 1 10 frequent itemsets TRUE
##
## algorithmic control:
## sparse sort verbose
## 7 -2 TRUE
##
## Absolute minimum support count: 2221
##
## create itemset ...
## set transactions ...[137 item(s), 55535 transaction(s)] done [0.01s].
## sorting and recoding items ... [41 item(s)] done [0.00s].
## creating sparse bit matrix ... [41 row(s), 55535 column(s)] done [0.00s].
## writing ... [85 set(s)] done [0.02s].
## Creating S4 object ... done [0.00s].
rules <- ruleInduction(itemsets, trans, confidence = 0.4)
inspect(sort(rules, by = "support")[1:10])
## lhs rhs support confidence lift itemset
## [1] {tfue} => {ninja} 0.16441883 0.6570956 2.250774 44
## [2] {ninja} => {tfue} 0.16441883 0.5631900 2.250774 44
## [3] {nickmercs} => {ninja} 0.08214639 0.7740075 2.651237 40
## [4] {nickmercs} => {tfue} 0.08124606 0.7655243 3.059398 41
## [5] {dakotaz} => {ninja} 0.07685244 0.7348485 2.517104 37
## [6] {fortnite} => {tfue} 0.07415144 0.7333927 2.930985 27
## [7] {symfuhny} => {tfue} 0.07325110 0.7812560 3.122269 34
## [8] {fortnite} => {ninja} 0.07206266 0.7127337 2.441354 26
## [9] {timthetatman} => {ninja} 0.07080220 0.7395148 2.533088 30
## [10] {symfuhny} => {ninja} 0.07056811 0.7526407 2.578048 33
inspect(sort(rules, by = "confidence")[1:10])
## lhs rhs support confidence lift itemset
## [1] {drlupo, tfue} => {ninja} 0.04013685 0.8937450 3.061378 17
## [2] {tfue, timthetatman} => {ninja} 0.05016656 0.8847253 3.030483 29
## [3] {aydan, ninja} => {tfue} 0.04211758 0.8634182 3.450628 14
## [4] {chap} => {tfue} 0.04582696 0.8416005 3.363434 11
## [5] {cloakzy} => {ninja} 0.04256775 0.8350406 2.860296 3
## [6] {cloakzy} => {tfue} 0.04256775 0.8350406 3.337218 4
## [7] {ninja, symfuhny} => {tfue} 0.05866571 0.8313345 3.322407 32
## [8] {nickmercs, tfue} => {ninja} 0.06716485 0.8266844 2.831673 39
## [9] {drlupo} => {ninja} 0.05587467 0.8206824 2.811114 18
## [10] {tfue, tsm_myth} => {ninja} 0.04208157 0.8200000 2.808777 20
inspect(sort(rules, by = "lift")[1:10])
## lhs rhs support confidence lift itemset
## [1] {sodapoppin} => {asmongold} 0.05371387 0.5127191 6.340204 7
## [2] {asmongold} => {sodapoppin} 0.05371387 0.6642173 6.340204 7
## [3] {symfuhny} => {nickmercs} 0.04179346 0.4457461 4.199951 35
## [4] {ninja, tfue} => {nickmercs} 0.06716485 0.4084985 3.848993 39
## [5] {fortnite} => {nickmercs} 0.04116323 0.4071238 3.836040 28
## [6] {aydan, ninja} => {tfue} 0.04211758 0.8634182 3.450628 14
## [7] {chap} => {tfue} 0.04582696 0.8416005 3.363434 11
## [8] {cloakzy} => {tfue} 0.04256775 0.8350406 3.337218 4
## [9] {ninja, symfuhny} => {tfue} 0.05866571 0.8313345 3.322407 32
## [10] {drdisrespect} => {shroud} 0.04150536 0.6145028 3.316786 2
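After viewing the rules, we can note that the top ten rules by support, confidence, and lift are the same for Eclat and Apriori, with identical measure values (only the order of tied rules differs). This is expected: for a given minimum support, both algorithms find the same frequent itemsets and differ only in how they search for them.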
This analysis provides clear insights into user behavior on Twitch by addressing the main research questions:
Which streamers are most frequently co-watched by the same users?
Both Apriori and Eclat algorithms revealed that popular streamers like Tfue and Ninja are frequently watched together. Niche pairs, such as Asmongold and Sodapoppin, were also identified, highlighting smaller communities with shared audiences.
Are there identifiable clusters of streamers that share a common audience?
Visualizations of the rules and frequent itemsets revealed distinct clusters of streamers. For example, a large cluster centers around shooter game streamers, while other clusters represent variety content or niche communities, demonstrating patterns of audience overlap.
Do the extracted rules suggest potential streamer recommendation strategies for users?
The rules with high confidence and lift indicate strong associations that could inform recommendation strategies. Users who watch certain streamers are likely to watch specific others, suggesting opportunities to improve personalized content suggestions based on observed co-interaction patterns.
Overall, Apriori and Eclat produced consistent results, which serves as a sanity check on the findings. This analysis shows that association rule mining can effectively discover viewing trends and provide insights for recommendation systems on streaming platforms.