Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.
Objective
The original visualisation uses an area chart to show the evolution of several variables, in this case the popularity of nine American baby names. The same visualisation is also used at https://www.r-graph-gallery.com/223-faceting-with-ggplot2.html as an example to explain what faceting is.
The targeted audience is anyone interested in these data visualisation techniques. Given the title “Popularity of American baby names in previous 30 years”, it also includes anyone with at least a passing interest in baby names in America.
The visualisation above has the following three main issues:
Misleading
The title says “Popularity of American baby names in previous 30 years”, but the code (available alongside the original visualisation) limits the chart to an arbitrary list of nine female names. It is not clear why these nine names were selected, and a viewer may assume they are the most popular girls' names in America. The Y axis is also misleading due to range issues.
Failing to answer a practical question
The visualisation does not answer the practical question implied by its title: “Which name is popular in America at any given time?” The compressed X axis makes it difficult to tell even which of these nine names was popular at a given point in time.
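To make the practical question concrete, here is a minimal base R sketch of "which name is most popular in a given year". The data frame `toy` and the helper `top_names()` are hypothetical and exist only for illustration; the names and counts are placeholders, not real figures from the babynames data.

```r
# Hypothetical toy data: placeholder names and counts, not real figures
toy <- data.frame(
  year = c(2000, 2000, 2000, 2001, 2001, 2001),
  name = c("NameA", "NameB", "NameC", "NameA", "NameB", "NameC"),
  n    = c(50, 80, 30, 90, 60, 40),
  stringsAsFactors = FALSE
)

# Return the k most frequent names for one year, most popular first
top_names <- function(data, selected_year, k = 1) {
  one_year <- data[data$year == selected_year, ]
  one_year <- one_year[order(-one_year$n), ]
  head(one_year$name, k)
}

top_names(toy, 2000)
top_names(toy, 2001, k = 2)
```

Filtering to a single year before ranking is what lets a chart answer the question directly, rather than asking the viewer to compare nine compressed time series by eye.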
Arrangement
The visualisation uses the default alphabetical ordering instead of a meaningful ordering, for example by rank.
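A rank-based ordering can be sketched in base R by setting the factor levels of the name column explicitly, since ggplot2 draws discrete axes and facets in factor-level order. The data frame below is hypothetical: the names and counts are placeholders, not real figures.

```r
# Hypothetical toy data: placeholder names and counts, not real figures
toy <- data.frame(
  name = c("NameB", "NameC", "NameA"),
  n    = c(120, 300, 45),
  stringsAsFactors = FALSE
)

# Default behaviour: factor levels are sorted alphabetically
alphabetical <- factor(toy$name)

# Rank-based ordering: levels follow descending count
ranked <- factor(toy$name, levels = toy$name[order(-toy$n)])

levels(alphabetical)
levels(ranked)
```

With the `ranked` factor, the most frequent name comes first, so the position of each name in the plot carries information instead of reflecting only its spelling.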
Reference
* Original data visualisation was found on https://www.data-to-viz.com/graph/area.html.
The following code was used to fix the issues identified in the original.
To see the Shiny app output, scroll inside the window (Re-size to smaller window in case Warning: Error in
library(babynames)
library(shiny)
library(tidyverse)
# ================================================================================================
# ================================================================================================
# ui
# ================================================================================================
# ================================================================================================
ui <- fluidPage(
  fluidRow(
    # Set the filters in the UI
    column(width = 12,
      wellPanel(
        titlePanel("Top 10 popular American Baby Names by year - 1888 to 2017"),
        column(12,
          sliderInput(inputId = "year",
                      label = h3("Select year"),
                      min = 1888,
                      max = 2017,
                      value = 2005)
        ),
        column(4, offset = 0,
          radioButtons(inputId = "sex", label = h3("Select gender"),
                       choices = list("male" = "M", "female" = "F"), selected = "M")
        )
      )
    )
  ),
  # Main plot
  column(12,
    plotOutput("distPlot")
  )
)
# ================================================================================================
# ================================================================================================
# Server
# ================================================================================================
# ================================================================================================
server <- function(input, output, session) {
  # Reactive expression: top 10 names for the selected sex and year
  don_reactive <- reactive({
    don_filter <- babynames %>%
      filter(sex == input$sex) %>%
      filter(year == input$year) %>%
      slice_max(n, n = 10) %>%
      arrange(n)
    # Fix the factor levels in ascending order of count so that, after
    # coord_flip(), the most popular name is drawn at the top
    don_filter$name <- factor(don_filter$name, levels = don_filter$name)
    # Return the data frame without printing it to the console
    don_filter
  })
  output$distPlot <- renderPlot({
    # Create the plot using a colour-blind-friendly palette
    cbPalette <- c("#999999", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7")
    p <- ggplot(data = don_reactive(), aes(x = name, y = n)) +
      ggtitle(paste("Top 10",
                    case_when(
                      input$sex == "M" ~ "boy",
                      input$sex == "F" ~ "girl"),
                    "baby names in", input$year)) +
      coord_flip()
    # Draw the bars once; geom_col() is equivalent to geom_bar(stat = "identity")
    p <- p + geom_col(colour = "black", fill = cbPalette[3])
    # Name inside each bar, count at the end of each bar
    p <- p + geom_text(aes(label = name), position = position_stack(vjust = 0.5))
    p <- p + geom_label(aes(label = n))
    p <- p + theme(
      plot.title = element_text(color = "black", size = 26, face = "bold.italic")
    )
    p <- p + theme(axis.title = element_blank(), axis.text.y = element_blank())
    p <- p + scale_x_discrete(labels = NULL, breaks = NULL) + labs(x = "")
    p
  })
}
# ================================================================================================
# ================================================================================================
# Shiny app construction
# ================================================================================================
# ================================================================================================
shinyApp(ui=ui, server=server)
Data Reference
The following plot fixes the main issues in the original.