The data frame below represents selected columns from a larger data table that is the by-product of a many-to-many relationship between two indirectly related data sets. The data that we are looking at is video viewing across multiple websites, for the same video asset.

## Warning: NAs introduced by coercion
##         ExternalID Series.Title    program   Views Videostart
## 1  UBC_ANV_2910958   Blindspark       NULL    3559          0
## 2  UBC_ANV_2910958   Blindspark       NULL   39713          0
## 3  UBC_ANV_2936407   Blindspark Blindspark   60823      82786
## 4  UBC_ANV_2936407   Blindspark Blindspark  666185      82786
## 5  UBC_ANV_2918316   Blindspark Blindspark   20198      21972
## 6  UBC_ANV_2918316   Blindspark Blindspark  188628      21972
## 7  UBC_ANV_2925087   Blindspark Blindspark   54931      64923
## 8  UBC_ANV_2925087   Blindspark Blindspark  524440      64923
## 9  UBC_ANV_2915086   Blindspark Blindspark   12005      15073
## 10 UBC_ANV_2915086   Blindspark Blindspark  118552      15073
## 11 UBC_ANV_2931827   Blindspark Blindspark  106807     128461
## 12 UBC_ANV_2931827   Blindspark Blindspark 1220074     128461
## 13 UBC_ANV_2928783   Blindspark Blindspark  122596     138270
## 14 UBC_ANV_2928783   Blindspark Blindspark 1315377     138270
## 15 UBC_ANV_2904996   Blindspark       NULL       1          0
## 16 UBC_ANV_2922034   Blindspark Blindspark   28561      31847
## 17 UBC_ANV_2922034   Blindspark Blindspark  278625      31847

The data.table package allows for factorization of vector values, aggregating across multiple instances of the same value (Video ID) in the original data table.

The list of IDs by Videostarts, are the number of video views generated on the UBC’s own website. The IDs that have zero views, are actually “NULL” values and represent videos that were only available offdomain. These IDs should be excluded from any totals that compare domain and offdomain performance.

## Loading required package: data.table
##         ExternalID domainviews
## 1: UBC_ANV_2910958           0
## 2: UBC_ANV_2936407      165572
## 3: UBC_ANV_2918316       43944
## 4: UBC_ANV_2925087      129846
## 5: UBC_ANV_2915086       30146
## 6: UBC_ANV_2931827      256922
## 7: UBC_ANV_2928783      276540
## 8: UBC_ANV_2904996           0
## 9: UBC_ANV_2922034       63694
##         ExternalID offdomainviews
## 1: UBC_ANV_2910958          43272
## 2: UBC_ANV_2936407         727008
## 3: UBC_ANV_2918316         208826
## 4: UBC_ANV_2925087         579371
## 5: UBC_ANV_2915086         130557
## 6: UBC_ANV_2931827        1326881
## 7: UBC_ANV_2928783        1437973
## 8: UBC_ANV_2904996              1
## 9: UBC_ANV_2922034         307186

The graphs and data tables below represent total video views, an aggregation by VideoID, Series/Program and Episode Title across UBC’s on website as well as offdomain platforms (i.e. Hulu Classic, Hulu Plus etc.).

The boxplot below, displaying means values, readily displays that the “Plus” subsription no-ad product, has many more views on average than the either of the other two ad supported models - Classir or ondomain. The second graph, a bar chart displays total views by platform, again pointing to the dominance of the subscription “Plus” model. The data table ranked on average views, also tells the same story.

## Loading required package: ggplot2

##        Type  avgviews sumviews
## 1:     Plus 543949.25  4351594
## 2:  Classic  45497.89   409481
## 3: ondomain  69047.43   483332

The table below shows a descending order ranker of total views (across all platforms) by individual episode titles for the “Blindspark” series on UBC.

##        Episode.Title  avgviews sumviews
##  1:     Sent On Tour 718986.50  1437973
##  2: Persecute Envoys 485114.00  1455342
##  3:  Authentic Flirt 269931.33   809794
##  4:   Cede Your Soul 214764.67   644294
##  5:    Split the Law 113011.00   339033
##  6:     Bone May Rot  76932.67   230798
##  7:     Sent on Tour 138270.00   138270
##  8: Eight Slim Grins  48543.33   145630
##  9:     A Stray Howl  21636.00    43272
## 10:  Series Premiere      1.00        1

The following stats give us an idea of the variablilty across our viewcounts for Blindspark view, demonstrated by the substantial difference in median (73.9K) and mean (218.5K) values.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##       1   26910   73850  218500  150900 1315000

Source: SQL Data Tables and Transformations.