Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.
Objective
This visualization covers the U.S. insurance market. Its objective is to compare the top 10 companies in each insurance sector and to examine whether their ranking is based on direct premiums written.
The visualization chosen had the following three main issues:
Reference
Ranking Insurance Companies by Direct Premiums Written in 2020. https://howmuch.net/articles/insurance-companies-ranking-by-direct-premiums-written-in-the-US
Data source: NAIC data, sourced from S&P Global Market Intelligence, Insurance Information Institute (III). https://www.iii.org/fact-statistic/facts-statistics-insurance-company-rankings
The following code was used to fix the issues identified in the original.
# install.packages("tidytext") # Run once if needed; used for reordering within facets
library(dplyr) # For filtering and transforming data
library(readr) # For reading CSV files
library(magrittr) # For pipes, including %<>%
library(here) # For sensible file paths in the project folder
library(ggplot2) # For plotting
library(scales) # For formatting axis labels
library(RColorBrewer) # For colour palettes
library(tidytext) # For reorder_within() and scale_y_reordered()
# Import the data
insurancedata <- read_csv(here("insurancedata.csv"))
# Define market share categories.
# Conditions are checked top to bottom, so each value falls into exactly one
# category and values between the printed bounds (e.g. 0.1495) are not lost.
insurancedata <- mutate(insurancedata, marketshare_level = case_when(
  Marketshare >= 0.15 ~ "15% or more",
  Marketshare >= 0.10 ~ "10% to 14.9%",
  Marketshare >= 0.05 ~ "5% to 9.9%",
  Marketshare < 0.05 ~ "Less than 5%"
)) # Missing market shares stay NA rather than becoming the string "NA"
# Order market share levels from highest to lowest
insurancedata %<>%
  mutate(marketshare_level = factor(marketshare_level,
    levels = c("15% or more", "10% to 14.9%", "5% to 9.9%", "Less than 5%")
    # , ordered = TRUE
  ))
str(insurancedata$marketshare_level)
## Factor w/ 4 levels "15% or more",..: 3 3 3 3 3 4 4 4 4 4 ...
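The binning step above can also be written with base R's cut(), which makes the bin edges explicit. The following is a minimal sketch, not the report's own code: the `shares` vector holds made-up values chosen to sit on or near the bin edges, and `levels_high_to_low` is a hypothetical helper name.

```r
# Hypothetical market shares (not from insurancedata.csv), placed on or near bin edges
shares <- c(0.031, 0.0495, 0.05, 0.0999, 0.10, 0.1495, 0.15, 0.22)

# Same labels as in the report, ordered from highest to lowest share
levels_high_to_low <- c("15% or more", "10% to 14.9%", "5% to 9.9%", "Less than 5%")

# right = FALSE gives half-open intervals [a, b), so every value falls in exactly one bin
binned <- cut(shares,
              breaks = c(-Inf, 0.05, 0.10, 0.15, Inf),
              labels = rev(levels_high_to_low), # cut() expects labels in ascending order
              right = FALSE)

# Reorder the factor levels from highest to lowest for the legend
binned <- factor(binned, levels = levels_high_to_low)

print(data.frame(share = shares, level = binned))
```

Because the intervals share their edges, a value such as 0.1495 is assigned to "10% to 14.9%" instead of falling between two categories.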
# Faceted bar chart for the 7 insurance sectors, by company
data_to_plot <- insurancedata %>%
  # Order companies by premiums separately within each sector
  mutate(Company = reorder_within(Company, Directpremiumswritten, InsuranceSector))
p1 <- ggplot(data_to_plot, aes(y = Company, x = Directpremiumswritten, fill = marketshare_level)) +
  geom_col() + # Equivalent to geom_bar(stat = "identity")
  scale_x_continuous(labels = label_number(suffix = " M", scale = 1e-6)) + # millions
  facet_grid(rows = vars(InsuranceSector), scales = "free") +
  labs(title = "U.S. Insurance Companies Ranking by Direct Premiums Written",
       subtitle = "Top 10 Companies in 7 Different Insurance Sectors",
       x = "Direct Premiums Written (in Millions) >>",
       y = "Insurance Company >>",
       fill = "Market Share") +
  scale_y_reordered() + # Strips the sector suffix added by reorder_within()
  scale_fill_manual(values = c("darkblue", "steelblue4", "steelblue3", "steelblue1"),
                    labels = c("15% or more", "10% to 14.9%", "5% to 9.9%", "Less than 5%"))
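To see why the plot pairs reorder_within() with scale_y_reordered(), the idea can be sketched in base R. This is an illustration under stated assumptions rather than tidytext's actual source: the company names, sectors and premium values are made up, and the "___" separator is assumed to match tidytext's default.

```r
# Made-up example data: one company appears in two sectors
company  <- c("Alpha", "Beta", "Alpha", "Gamma")
sector   <- c("Life", "Life", "Auto", "Auto")
premiums <- c(10, 30, 50, 20)

# Step 1: make each company/sector pair a distinct key, then order keys by premiums
key <- paste(company, sector, sep = "___")
ordered_key <- reorder(key, premiums)
print(levels(ordered_key))

# Step 2: strip the sector suffix again so axis labels show only the company name
axis_labels <- sub("___.+$", "", levels(ordered_key))
print(axis_labels)
```

Without the combined key, "Alpha" would have a single position in the factor and could not be ranked differently in the Life and Auto panels.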
Data Reference
The following plot fixes the main issues in the original.