Original


Visual Capitalist. (2021).


Objective

A number of companies launched streaming services several years ago, but the number of subscribers increased rapidly during the Covid-19 pandemic. The visualisation shows the number of subscribers between 2019 and 2020 in order to represent the percentage growth in subscribers.

The target audience of this visualisation includes record labels, film producers, and people who are interested in online marketing and advertising. The visualisation is useful to them because it can help them decide which platform to use to stream their products.

The visualisation chosen had the following three main issues:

  • Difficult to compare percentages across services. For example, finding the services with the highest and lowest subscriber growth takes a long time because the reader has to read the percentage for every service.
  • Color issue. Some text is almost the same color as the background, such as the percentages for Pandora and Deezer. This may affect people with visual impairments.
  • Area and size issue. The areas in the visualisation are not balanced: the space given to Netflix is far larger than that given to the New York Times, which makes the smaller services difficult to read.

Reference

Code

The following code was used to fix the issues identified in the original.

library(ggplot2)
library(readxl)
library(magrittr)
library(dplyr)
library(tidyr)

stream <- read_xlsx("StreamingService2020.xlsx")
stream_per <- read_xlsx("StreamingServicePercent.xlsx")

stream_service <- stream %>% left_join(stream_per, by = c("Service","Type"))

# Calculating Subscribers in 2019
stream_service$SubQ4_2020 <- as.numeric(stream_service$SubQ4_2020)

stream_service <- stream_service %>% mutate(SubQ4_2019 =
               # Disney+ and Apple TV+ are new services, so they have no 2019 subscribers
               ifelse(Service %in% c("Disney+", "Apple TV+"), 0,
               # Otherwise invert the growth rate: Sub2019 = Sub2020 * 100 / (growth + 100)
               round(SubQ4_2020 * 100 / (PercentGrowth2019 + 100))))


# Bar Plot 
stream_sub <- stream_service %>% gather(key = "Year", value = "Subscribers", SubQ4_2019, SubQ4_2020 )

sub_plot <- ggplot(stream_sub, aes(x = reorder(Service, Subscribers), y = Subscribers, fill = Year)) + 
  geom_bar( position = "dodge", stat = "identity", width = 0.75) +
  coord_flip() + 
  scale_y_continuous(breaks = c(0, 20, 40, 60, 80, 100, 120, 140, 160, 180, 200, 220)) +
  scale_fill_manual(values=c("#67A8E8","#E86779"), labels = c("2019", "2020")) +
  # Dodge width matches the bar width (0.75) so each label sits on its own bar
  geom_text(aes(label = Subscribers), position = position_dodge(width = 0.75), color = "black", hjust = -0.2, size = 2.1) +
  facet_grid(rows = vars(Type), scales = "free", space = "free") +
  xlab("Streaming services") +
  ylab("Subscribers (million)") +
  labs(title = "Streaming services' subscribers between 2019 and 2020") +
  theme(plot.title = element_text(size = 13, hjust = 0.5, face = 'bold'), strip.text.y = element_text(angle = 0)) 

                                                                           

# Line Plot
per_plot <- stream_service %>% ggplot(aes(x = Service, y = PercentGrowth2019, group = 1)) +
  geom_line(position = "identity", stat = "identity", color = "#781A19", inherit.aes = TRUE) +
  geom_point(color = "#781A19") +
  labs(title = "Percentage of subscribers' growth between 2019 and 2020", 
       x = "Streaming services", y = "Growth (%)") +
  theme(axis.text.x = element_text(vjust = 0.5, angle = 90, hjust = 1),
        plot.title = element_text(size = 13, hjust = 0.5, face = 'bold'))
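
The 2019 figures in the code above are not read from the data files; they are back-calculated from the 2020 subscribers and the growth percentage. The sketch below illustrates that calculation in base R on a small made-up table. The service names and numbers are hypothetical, and marking a new service with `NA` growth is an assumption made for this example only (the report's code identifies new services by name instead).

```r
# Hypothetical figures for illustration only; not taken from the report's data.
example <- data.frame(
  Service = c("Service A", "Service B", "New service"),
  SubQ4_2020 = c(120, 55, 30),        # subscribers in millions
  PercentGrowth2019 = c(20, 10, NA),  # growth in percent; NA marks a service with no 2019 base (assumption)
  stringsAsFactors = FALSE
)

# Growth is defined as (sub_2020 - sub_2019) / sub_2019 * 100,
# so solving for sub_2019 gives sub_2020 * 100 / (growth + 100).
back_calculate <- function(sub_2020, growth_pct) {
  round(sub_2020 * 100 / (growth_pct + 100))
}

# New services get 0 subscribers for 2019; all others are back-calculated.
is_new <- is.na(example$PercentGrowth2019)
example$SubQ4_2019 <- ifelse(
  is_new,
  0,
  back_calculate(example$SubQ4_2020, example$PercentGrowth2019)
)

print(example)
```

Because the result is rounded to whole millions, the back-calculated 2019 values are approximations rather than reported figures.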

Data Reference

Reconstruction

The following plot fixes the main issues in the original.