Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original


Source: RAJYA SABHA SESSION – 249 STARRED QUESTION NO. 136; BSNL: Bharat Sanchar Nigam Limited.


Objective

The objective of the visualization is to showcase the different internet service providers in India in terms of their market share.

Target Audience: Members of the Rajya Sabha (Parliament) and the general public

The visualisation chosen had the following three main issues:

  • Pie charts encode proportions as angles, which works well with only a few categories (ideally 3-5). With as many categories as in this case, the chart becomes congested and the individual slices can no longer be read clearly.

  • Although an interactive chart could expose the data for the smaller slices, the slices here are so small that selecting a particular provider is difficult, which defeats the purpose of interactivity.

  • The chart does not support comparison at first glance: only the top three providers are labelled with their shares, the rest must be compared by estimating slice areas, and the slices for providers with small market shares are barely visible.
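To make the first point concrete, the sketch below converts shares into the central angles a pie chart would draw. The share values are hypothetical and chosen only for illustration; they are not the figures from the source.

```r
# Hypothetical market shares in percent (illustrative values, not the source data)
shares <- c(A = 50, B = 30, C = 15, D = 4, E = 1)

# A pie chart maps each share to a central angle out of 360 degrees
angles <- shares / sum(shares) * 360

print(round(angles, 1))
```

Under these assumed values, a provider with a 1% share receives a slice of only 3.6 degrees, which is hard to see and harder to select.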

Reference

Code

The following code was used to fix the issues identified in the original.

library(dplyr)
library(ggplot2)
library(readr)

#importing the dataset
df <- read_csv("market_Share.csv")

#Renaming the columns to syntactic names
df <- df %>% rename(Provider = `Internet Service Provider`,
                    Share_Percent = `Share (%)`)

#Keep the first 10 rows; the rows below them are totals and grand totals
df2 <- head(df, 10)

#Row 12, below the totals, holds the 'Others' category
df3 <- df[12, ]

#Append the 'Others' row to the data frame used for the plot
df2 <- rbind(df2, df3)

#Sorting the data in ascending market share values 
df2$Provider <- factor(df2$Provider,levels = df2$Provider[order(df2$Share_Percent)])

#Code to plot the graph
p <- ggplot(data=df2, aes(x=Provider, y=Share_Percent)) +
  geom_col(fill="steelblue") +
  geom_text(aes(label=Share_Percent), vjust=-0.3, size=3.5)

#we rename the xticks for better clarity
p <- p + scale_x_discrete(labels=c("Reliance JioInfocomm Ltd" = "JIO",
                              "Vodafone Idea Limited" = "Idea_Voda",
                              "Bharti Airtel Ltd." = "Airtel",
                              "Bharat Sanchar Nigam Ltd."="BSNL",
                              "Tata Teleservices Limited" = "Tata",
                              "Atria Convergence Technologies Pvt. Ltd." = "Atria",
                              "Mahanagar Telephone Nigam Ltd" = "MTNL",
                              "Hathway Cable & Datacom Pvt. Ltd." = "Hathway",
                              "You Broadband India Pvt. Ltd." = "YOU",
                              "GTPL Broadband Pvt. Ltd." = "GTPL",
                              "Others" ="Others"))

#adding the title and labels
p <- p + ggtitle("Market Share of Internet Subscribers of Various Internet Service Providers in India as on 31st March, 2019") +
  labs(y="Market Share in %", x = "Internet Provider")

#Display the plot
p
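The code above selects rows by position (the first 10 rows and row 12), which depends on the exact layout of the file. As a hedged alternative, the sketch below drops summary rows by name instead and then applies the same ascending ordering of factor levels. The data frame is a hypothetical stand-in, and the sketch assumes that every summary row contains the word "Total" in its provider name, which may not hold for the real file.

```r
# Hypothetical stand-in for the imported table: provider rows mixed with
# summary rows (all names and values are illustrative, not the source data)
df <- data.frame(
  Provider = c("Provider A", "Provider B", "Provider C",
               "Total of top providers", "Others", "Grand Total"),
  Share_Percent = c(55, 25, 12, 92, 8, 100),
  stringsAsFactors = FALSE
)

# Assumption: summary rows contain "total" in their name (case-insensitive).
# Dropping them keeps "Others" without relying on its row number.
is_total <- grepl("total", df$Provider, ignore.case = TRUE)
df2 <- df[!is_total, ]

# Order the factor levels by ascending share so the bars plot in sorted order
df2$Provider <- factor(df2$Provider,
                       levels = df2$Provider[order(df2$Share_Percent)])

print(levels(df2$Provider))
```

Filtering by name keeps working if rows are added or reordered in the file, at the cost of depending on how the summary rows are labelled.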

Data Reference

Reconstruction

The following plot fixes the main issues in the original.