The data frame below represents selected columns from a larger data table that is the by-product of a many-to-many relationship between two indirectly related data sets. The data that we are looking at is video viewing across multiple websites, for the same video asset.
## Warning: NAs introduced by coercion
## ExternalID Series.Title program Views Videostart
## 1 UBC_ANV_2910958 Blindspark NULL 3559 0
## 2 UBC_ANV_2910958 Blindspark NULL 39713 0
## 3 UBC_ANV_2936407 Blindspark Blindspark 60823 82786
## 4 UBC_ANV_2936407 Blindspark Blindspark 666185 82786
## 5 UBC_ANV_2918316 Blindspark Blindspark 20198 21972
## 6 UBC_ANV_2918316 Blindspark Blindspark 188628 21972
## 7 UBC_ANV_2925087 Blindspark Blindspark 54931 64923
## 8 UBC_ANV_2925087 Blindspark Blindspark 524440 64923
## 9 UBC_ANV_2915086 Blindspark Blindspark 12005 15073
## 10 UBC_ANV_2915086 Blindspark Blindspark 118552 15073
## 11 UBC_ANV_2931827 Blindspark Blindspark 106807 128461
## 12 UBC_ANV_2931827 Blindspark Blindspark 1220074 128461
## 13 UBC_ANV_2928783 Blindspark Blindspark 122596 138270
## 14 UBC_ANV_2928783 Blindspark Blindspark 1315377 138270
## 15 UBC_ANV_2904996 Blindspark NULL 1 0
## 16 UBC_ANV_2922034 Blindspark Blindspark 28561 31847
## 17 UBC_ANV_2922034 Blindspark Blindspark 278625 31847
The data.table package allows for factorization of vector values, aggregating across multiple instances of the same value (Video ID) in the original data table.
The list of IDs by Videostarts, are the number of video views generated on the UBC’s own website. The IDs that have zero views, are actually “NULL” values and represent videos that were only available offdomain. These IDs should be excluded from any totals that compare domain and offdomain performance.
## Loading required package: data.table
## ExternalID domainviews
## 1: UBC_ANV_2910958 0
## 2: UBC_ANV_2936407 165572
## 3: UBC_ANV_2918316 43944
## 4: UBC_ANV_2925087 129846
## 5: UBC_ANV_2915086 30146
## 6: UBC_ANV_2931827 256922
## 7: UBC_ANV_2928783 276540
## 8: UBC_ANV_2904996 0
## 9: UBC_ANV_2922034 63694
## ExternalID offdomainviews
## 1: UBC_ANV_2910958 43272
## 2: UBC_ANV_2936407 727008
## 3: UBC_ANV_2918316 208826
## 4: UBC_ANV_2925087 579371
## 5: UBC_ANV_2915086 130557
## 6: UBC_ANV_2931827 1326881
## 7: UBC_ANV_2928783 1437973
## 8: UBC_ANV_2904996 1
## 9: UBC_ANV_2922034 307186
The graphs and data tables below represent total video views, an aggregation by VideoID, Series/Program and Episode Title across UBC’s on website as well as offdomain platforms (i.e. Hulu Classic, Hulu Plus etc.).
The boxplot below, displaying means values, readily displays that the “Plus” subsription no-ad product, has many more views on average than the either of the other two ad supported models - Classir or ondomain. The second graph, a bar chart displays total views by platform, again pointing to the dominance of the subscription “Plus” model. The data table ranked on average views, also tells the same story.
## Loading required package: ggplot2
## Type avgviews sumviews
## 1: Plus 543949.25 4351594
## 2: Classic 45497.89 409481
## 3: ondomain 69047.43 483332
The table below shows a descending order ranker of total views (across all platforms) by individual episode titles for the “Blindspark” series on UBC.
## Episode.Title avgviews sumviews
## 1: Sent On Tour 718986.50 1437973
## 2: Persecute Envoys 485114.00 1455342
## 3: Authentic Flirt 269931.33 809794
## 4: Cede Your Soul 214764.67 644294
## 5: Split the Law 113011.00 339033
## 6: Bone May Rot 76932.67 230798
## 7: Sent on Tour 138270.00 138270
## 8: Eight Slim Grins 48543.33 145630
## 9: A Stray Howl 21636.00 43272
## 10: Series Premiere 1.00 1
The following stats give us an idea of the variablilty across our viewcounts for Blindspark view, demonstrated by the substantial difference in median (73.9K) and mean (218.5K) values.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1 26910 73850 218500 150900 1315000
Source: SQL Data Tables and Transformations.