This is a first pass at demonstrating what R and its packages can do for exploring Facebook usage patterns of both producers and users. For this example, I’m using the WGN America fan page for the program Salem. I am using the R package “Rfacebook” to grab all publicly available data on every post that WGN America has made to the Salem fan page. Because the data is publicly available, it is limited by the changes Facebook has made to its API. Much richer user data is available to anyone who has access to the page’s API.
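A minimal sketch of what the data pull could look like, assuming the `Rfacebook` package is installed and a valid OAuth token is available. The wrapper name `fetch_page_posts`, the page identifier, and the post cap are placeholders for illustration, not values taken from this analysis.

```r
# Sketch only: wraps Rfacebook::getPage() so that nothing is requested from
# the API until the function is called with a real token.
fetch_page_posts <- function(page_name, token, n_posts = 1000) {
  if (!requireNamespace("Rfacebook", quietly = TRUE)) {
    stop("Install the Rfacebook package first.")
  }
  # getPage() returns one row per post, including created_time, type,
  # likes_count, comments_count and shares_count.
  Rfacebook::getPage(page = page_name, token = token, n = n_posts)
}

# Hypothetical usage (placeholder page name and credentials):
# token <- Rfacebook::fbOAuth(app_id = "APP_ID", app_secret = "APP_SECRET")
# posts <- fetch_page_posts("PAGE_NAME", token)
```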
WGN first posted to the Salem page on 12/19/2013, and since that time has posted a total of 723 times, most recently on 1/21/2016. With this data, we can map out the growth of WGN’s Facebook presence and popularity over time. The figure below displays the average user engagement with WGN’s Facebook content for each month. Beginning in 04/2014 (the month of the premiere) and continuing through the fall of 2015, user engagement shows a consistent trend - the number of likes incrementally increases over time, reaching a high in 09/2015 (~8100 likes per post). As expected, the numbers of user comments and shares are far below the number of likes, but both still show a slight upward trend.
NOTE: It’s important to point out that the black dots represent the mean engagement level across all posts for any given month. WGN did not post anything to its Salem page in either 11/2015 or 12/2015. Thus, the lines connecting the black dots over this period do not actually represent a strong trend increase in the winter of 2015/2016, but instead represent the results of a single post made by WGN in 01/2016. WGN updated the page to let its fans know that Marilyn Manson was going to be a special guest star on the upcoming season. This information proved to be popular and the post is now the 4th most liked post on the page.
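The monthly averages described above can be sketched in base R as follows. The column names follow what `Rfacebook::getPage()` returns, but the four rows are invented toy values, not data from the Salem page. Keeping a per-month post count next to the means makes it easy to spot months where a single post drives the average.

```r
# Toy post-level data (invented values, for illustration only).
posts <- data.frame(
  created_time   = c("2015-09-06T21:00:00+0000", "2015-09-20T21:00:00+0000",
                     "2015-10-04T21:00:00+0000", "2016-01-21T18:00:00+0000"),
  likes_count    = c(100, 300, 250, 900),
  comments_count = c(10, 30, 20, 80),
  shares_count   = c(5, 15, 10, 60),
  stringsAsFactors = FALSE
)

# The first seven characters of the timestamp give the "YYYY-MM" month.
posts$month <- substr(posts$created_time, 1, 7)

# Mean likes, comments and shares per post for each month.
monthly <- aggregate(
  cbind(likes_count, comments_count, shares_count) ~ month,
  data = posts, FUN = mean
)

# Number of posts behind each monthly mean.
monthly$n_posts <- as.vector(table(posts$month)[monthly$month])
print(monthly)
```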
The overall upward trend in user engagement occurs despite WGN not posting to the page consistently throughout the year. WGN slowly ramps up its posting, reaches a peak during each season’s premiere month, and then drops off through the rest of the show’s season.
The table below breaks down the types of posts made during the existence of the Salem page. Just under 90% of posts to the page have some sort of media attached, with photos posted more often than video at approximately a 2:1 ratio.
| event | link | photo | status | video |
|---|---|---|---|---|
| 1 | 74 | 417 | 7 | 224 |
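As a quick check on the shares quoted above, the arithmetic can be reproduced from the counts in the table. Only the five counts come from the post; the variable names are my own.

```r
# Post counts by type, copied from the table above.
type_counts <- c(event = 1, link = 74, photo = 417, status = 7, video = 224)

total_posts    <- sum(type_counts)                                     # 723
media_share    <- sum(type_counts[c("photo", "video")]) / total_posts
photo_to_video <- type_counts[["photo"]] / type_counts[["video"]]

round(100 * media_share, 1)   # 88.7 (percent of posts with a photo or video)
round(photo_to_video, 2)      # 1.86 (photos per video)
```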
The types of posts made to Facebook can be broken out further to provide additional information on which posts gain traction with the audience. (Note that the y-axes on each plot vary.) These plots indicate that in terms of likes, the most consistently popular posts have been photo posts - they have been trending upwards on average since 2014. Video posts show more fluctuation in engagement, and user engagement with video posts has generally declined over the past year (although note the number of shares for video compared to other post types). Link posts draw similar levels of engagement to video posts (the recent Marilyn Manson post is the outlier here). Finally, one can see that standard status updates do not fare well with users and are only sparingly used by WGN.
Caveat: There are a number of factors that are unaccounted for in this breakdown of user engagement. For instance, we don’t know which, if any, of these posts are paid versus organic. Nor do we know if other exogenous factors, such as the popularity of the show, are driving engagement independent of WGN’s media strategy. Without such information, we can note trends and differences between types of posts, but we can’t come to any conclusions as to why the trends are occurring or compare the levels of engagement with other pages on an apples-to-apples basis.
Viewing this data in another way, we break down the type of engagement by type of post and summarize the overall user behavior. The first table is focused solely on link posts. As noted above, the Marilyn Manson post is an outlier in its popularity, but because there are so few link posts, it is also highly influential. The extent of the influence is evident in the link statistics - note the difference in the mean and median engagement levels when the Manson post is kept in the dataset (“with”) and excluded (“without”). (I apologize for the ugly tables.)
| type | metric | with: mean(n) | with: median(n) | without: mean(n) | without: median(n) |
|---|---|---|---|---|---|
| link | comments | 166.6622 | 76.500 | 97.20948 | 70.46875 |
| link | likes | 3586.1454 | 2018.714 | 2576.15578 | 1754.98214 |
| link | shares | 758.1171 | 199.000 | 278.55404 | 145.57143 |
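To illustrate why a single outlier moves the mean far more than the median, here is a small sketch with invented like counts (these are not the Salem figures). It drops the largest value and compares the summaries.

```r
# Invented like counts for five link posts; the last one is the outlier.
link_likes <- c(1200, 1800, 2100, 2500, 30000)
outlier    <- which.max(link_likes)

with_without <- rbind(
  with    = c(mean = mean(link_likes),           median = median(link_likes)),
  without = c(mean = mean(link_likes[-outlier]), median = median(link_likes[-outlier]))
)
print(with_without)
# Mean: 7520 with the outlier, 1900 without.
# Median: 2100 with the outlier, 1950 without.
```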
The summary statistics for all other types of posts are below, and they tell a story that is quite similar overall to the trend plots above - photos garner the largest share of likes, and videos are, on average, the most likely to be shared by users. Note the gap between the mean and median on video shares, however: not every video is shared widely, but when one gains traction with users, the number of shares increases significantly.
| type | metric | mean(n) | median(n) |
|---|---|---|---|
| photo | comments | 108.2275 | 76.00000 |
| photo | likes | 3236.6061 | 2751.40000 |
| photo | shares | 251.5357 | 173.08696 |
| status | comments | 67.5000 | 16.00000 |
| status | likes | 798.5000 | 596.00000 |
| status | shares | 55.5000 | 45.00000 |
| video | comments | 117.2661 | 76.68571 |
| video | likes | 2175.7641 | 1884.96774 |
| video | shares | 378.3414 | 159.92308 |
The figures and tables above break down the average engagement over each month. If each post is looked at individually, however, it is not surprising to find that popular posts are popular across all types of engagement. Breaking down the posts by their rate of likes, comments, and shares, there is a strong correlation (~0.90) between methods of engagement. Thus, if a post is more likely to elicit a “like” from users, it is also more likely to draw a comment or a share than less popular posts are.
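A post-level correlation of this kind can be sketched with base R’s `cor()`. The data frame below is invented toy data, and Pearson correlation on raw counts is my assumption; the exact definition of the engagement “rate” used for the ~0.90 figure is not spelled out here.

```r
# Invented per-post engagement counts, for illustration only.
post_engagement <- data.frame(
  likes    = c(500, 1200, 2500, 4000, 9000),
  comments = c(20, 45, 80, 150, 400),
  shares   = c(30, 90, 200, 310, 1100)
)

# Pairwise Pearson correlations between the three engagement measures.
engagement_cor <- cor(post_engagement)
print(round(engagement_cor, 2))
```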
WGN Salem boasts over 400,000 fans of its page, and its 723 posts to the page have been liked almost 2 million times, commented on approximately 78,000 times, and shared almost 200,000 times. Likes and shares are both fairly simple actions for Facebook users - each takes one click. But comments are of interest - how many people are commenting on WGN posts, how often, and what type of engagement are they finding from other users? To begin, we note that the approximately 78,000 comments posted come from only about 34,000 commenters. This helps to explain the mean comment rate just above 2 in the summary statistics table below.
The table below provides summary statistics on the users that are commenting on WGN’s posts. Of those who have commented on a post, the median number of comments left on the page is 1.
| Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. |
|---|---|---|---|---|---|
| 1.000 | 1.000 | 1.000 | 2.119 | 2.000 | 320.000 |
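A summary like the one above can be produced by counting comments per user and passing the counts to `summary()`. The sketch below assumes a comment-level data frame with a `from_id` column, as returned in the `comments` element of `Rfacebook::getPost()`; the eight rows are invented.

```r
# Invented comment-level data: one row per comment, keyed by commenter id.
comments <- data.frame(
  from_id = c("u1", "u1", "u1", "u2", "u3", "u3", "u4", "u5"),
  stringsAsFactors = FALSE
)

# Number of comments left by each distinct commenter.
comments_per_user <- as.vector(table(comments$from_id))

# Min, quartiles, median, mean and max of comments per commenter.
print(summary(comments_per_user))
```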
But there is a select group of fans that is truly engaged - way, way, way out in the tail of the fan population is a special group that is going above and beyond the norm. The top 150 commenters to the page are contributing 10% of all comments. These 150 users represent roughly 0.4% of the commenters on the page, and roughly 0.04% of the overall fans of the page.
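The arithmetic behind these shares, using the rounded totals quoted in the text (150 top commenters, roughly 34,000 commenters, roughly 400,000 fans), is sketched below. The `top_share` helper is a hypothetical function of my own for computing the share of comments contributed by the top n commenters from a vector of per-user counts.

```r
top_n      <- 150
commenters <- 34000    # approximate, from the text
fans       <- 400000   # approximate, from the text

pct_of_commenters <- 100 * top_n / commenters
pct_of_fans       <- 100 * top_n / fans

round(pct_of_commenters, 2)   # 0.44 (percent of commenters)
round(pct_of_fans, 4)         # 0.0375 (percent of fans)

# Share of all comments written by the n most active commenters.
top_share <- function(counts, n) {
  ranked <- sort(counts, decreasing = TRUE)
  sum(head(ranked, n)) / sum(counts)
}

top_share(c(10, 5, 3, 1, 1), n = 1)   # 0.5 on this toy vector
```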
Further, this select group of commenters is also drawing out engagement from other users. The approximately 78,000 user comments have drawn a collective 106,000 likes from users. 15% of all of these user comment likes are earned by these top 150 posters.
These 150 users can be further broken down by their level of activity. If these 150 are the top users, then the top 10 commenters are the super users - posting between 320 times (rank #1) and 96 times (rank #10). Once we move away from the top 10 commenters, engagement drops off considerably and then flattens out into a range of 25 to 50 posts. In comparison to the top 10 posters, 25 posts is minimal, but when compared to the rest of the fans, 25 posts is still in the 99th percentile of all users.
While the comment totals of the top 150 fit a nice decline curve, there is far less order to the likes that the top 150 commenters receive. The correlation between comments and likes for the top 150 is only a moderate 0.45. Unfortunately, it’s difficult to dig much deeper into either the likes or shares of users. As the API is currently constituted, neither of these can be tied directly to users. Further, users can’t like a post more than once, and individual share rates are not tracked.
While I would normally not recommend using a word cloud in any type of analytics work, its usage in this case allows a quick and dirty comparison of how fans of the Salem page, across levels of engagement, are discussing and conversing about the show. This is merely a time saver at this point, but further text analysis can be completed with a bit more work.
In making the word clouds, users were segmented by their comment rate - the top 150 compared to the rest. For each group, a random sample of 1000 comments was selected, and word clouds were created from the terms that were used at least five times in each sample. Visual comparison of the two word clouds indicates that the language users are posting to the Salem page is heavily show-centric, and the two groups appear to be using common terms.
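The sampling and term-frequency step can be sketched in base R as below. This is not the code used for the clouds above: the function name `term_counts`, the crude letters-only tokenization, and the fixed seed are all my assumptions. The resulting counts could then be handed to a plotting package of your choice.

```r
# Sample up to `sample_size` comments and keep the terms that appear at
# least `min_count` times in the sample.
term_counts <- function(comments, sample_size = 1000, min_count = 5, seed = 1) {
  set.seed(seed)
  n       <- min(sample_size, length(comments))
  sampled <- sample(comments, n)
  # Lower-case, then replace anything that is not a letter or space.
  cleaned <- gsub("[^a-z ]", " ", tolower(sampled))
  words   <- unlist(strsplit(cleaned, " +"))
  words   <- words[nchar(words) > 0]
  counts  <- sort(table(words), decreasing = TRUE)
  counts[counts >= min_count]
}

# Toy usage with invented comments and a lower threshold.
toy_comments <- c("Salem is great", "love Salem", "Salem!", "great show", "Salem rocks")
print(term_counts(toy_comments, sample_size = 10, min_count = 2))
```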
Also of interest is the length over which WGN posts still draw comments. Obviously, WGN wants their posts to draw lots of engagement, but they should also not want their fans to quickly lose interest.
The overall duration of fan engagement can be found with this data and is summarized below. Because there are so few status posts (7 in total), we ignore those. Link posts draw attention from users for the shortest amount of time - a median of 3 days between WGN posting to the account and the last comment from a user. Photos do better, with a median duration of 7 days before comments cease; however, as the mean duration of roughly 44 days shows, some posts still draw comments over a much longer period. Videos have the longest tail of the three: the median is 28 days and the mean is roughly 105 days, so while the typical video goes quiet within a month, a subset of video posts keeps drawing user comments for far longer, in some cases after 250 days.
| type | mean(max_tail) | median(max_tail) |
|---|---|---|
| link | 32.34722 days | 3 days |
| photo | 43.86747 days | 7 days |
| status | 147.28571 days | 195 days |
| video | 104.60987 days | 28 days |
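A sketch of how a tail like `max_tail` could be computed: take the time between each post and its last comment, then summarize by post type. The timestamps below are invented, the column names are my own, and the timestamp format is assumed to match what `Rfacebook` returns (UTC, with a trailing `+0000`).

```r
# Parse the first 19 characters ("YYYY-MM-DDTHH:MM:SS") as UTC.
parse_fb_time <- function(x) {
  as.POSIXct(substr(x, 1, 19), format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")
}

# Invented data: one row per post, with the time of its last comment.
tails <- data.frame(
  type         = c("link", "link", "link", "video", "video", "video"),
  posted       = rep("2015-04-01T12:00:00+0000", 6),
  last_comment = c("2015-04-02T12:00:00+0000", "2015-04-04T12:00:00+0000",
                   "2015-04-09T12:00:00+0000", "2015-04-11T12:00:00+0000",
                   "2015-04-29T12:00:00+0000", "2015-12-07T12:00:00+0000"),
  stringsAsFactors = FALSE
)

# Days between the post and its last comment.
tails$max_tail <- as.numeric(difftime(parse_fb_time(tails$last_comment),
                                      parse_fb_time(tails$posted),
                                      units = "days"))

mean_days   <- tapply(tails$max_tail, tails$type, mean)
median_days <- tapply(tails$max_tail, tails$type, median)

tail_summary <- data.frame(
  type        = names(mean_days),
  mean_days   = as.vector(mean_days),
  median_days = as.vector(median_days)
)
print(tail_summary)
```

On this toy data the link posts have a mean tail of 4 days and a median of 3, while the video posts have a mean of 96 days and a median of 28, which shows how one long-lived post pulls the mean well above the median.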
This was intended to be a quick synopsis of R and Facebook data. Other questions that can be investigated include sentiment and content analysis of both the users and the content producer, as well as some simple segmentation of users based upon their engagement, although these projects would require a little more time to carry out.
Were we to have access to the producer’s API, there is a wealth of data that would expand segmentation capabilities. I’m not sure if the new API allows the app developer to see user networks, but if it does, then there is also the opportunity for social network analysis.