Please carefully read the statements below and check each box if you agree with the declaration. If you do not check all boxes, your assignment will not be marked. If you make a false declaration on any of these points, you may be investigated for academic misconduct. Students found to have breached academic integrity may receive official warnings and/or serious academic penalties. Please read more about academic integrity here. If you are unsure about any of these points or feel your assessment might breach academic integrity, please contact your course coordinator for support. It is important that you DO NOT submit any assessment until you can complete the declaration truthfully.
By checking the boxes below, I declare the following:
I have not impersonated, or allowed myself to be impersonated by, any person for the purposes of this assessment
This assessment is my original work and no part of it has been copied from any other source except where due acknowledgement is made. Due acknowledgement means the following:
No part of this assessment has been written for me by any other person except where such collaboration has been authorised by the lecturer/teacher concerned.
I have not used generative “AI” tools for the purposes of this assessment.
Where this work is being submitted for individual assessment, I declare that it is my original work and that no part has been contributed by, produced by or in conjunction with another student.
I give permission for my assessment response to be reproduced, communicated, compared and archived for the purposes of detecting plagiarism.
I give permission for a copy of my assessment to be retained by the university for review and comparison, including review by external examiners.
I understand that:
Plagiarism is the presentation of the work, idea or creation of another person or machine as though it is your own. It is a form of cheating and is a very serious academic offence that may lead to exclusion from the University. Plagiarised material can be drawn from, and presented in, written, graphic and visual form, including electronic data and oral presentations. Plagiarism occurs when the origin of the material used is not appropriately cited.
Plagiarism includes the act of assisting or allowing another person to plagiarise or to copy my work.
I agree and acknowledge that:
I have read and understood the Declaration and Statement of Authorship above.
If I do not agree to the Declaration and Statement of Authorship in this context and all boxes are not checked, the assessment outcome is not valid for assessment purposes and will not be included in my final result for this course.
The original data visualisation selected for the assignment was as follows:
The objective and audience of the original data visualisation chosen can be summarised as follows:
Objective To compare USA household income thresholds at important points in two different years. We look at the income levels. These levels mark specific income percentiles. This shows income changes over time.
Audience This content is for people interested in income distribution. It is for students, journalists, or the general public.
The original data visualisation chosen could be improved in the three following ways:
The following code was used to improve the original.
library(tidyverse)
library(scales)
library(plotly)
df <- tribble(
~percentile, ~year, ~cutoff,
"10%", 2017, 14280, "10%", 2018, 13488,
"25%", 2017, 30001, "25%", 2018, 29010,
"50%", 2017, 61822, "50%", 2018, 56849,
"75%", 2017,111160, "75%", 2018,105680,
"90%", 2017,178793, "90%", 2018,170432,
"95%", 2017,236360, "95%", 2018,225186,
"99%", 2017,434455, "99%", 2018,430600
) %>%
mutate(percentile = factor(percentile, levels = c("10%","25%","50%","75%","90%","95%","99%")))
wide <- df %>%
pivot_wider(names_from = year, values_from = cutoff) %>%
mutate(delta = `2018` - `2017`,
dir = ifelse(delta >= 0, "Up", "Down"))
p_new <- ggplot(wide, aes(y = percentile)) +
geom_segment(aes(x = `2017`, xend = `2018`, yend = percentile),
linewidth = 2, colour = "grey70") +
geom_point(aes(x = `2017`), colour = "blue", size = 2.6) + # slightly smaller
geom_point(aes(x = `2018`), colour = "red", size = 2.6) +
geom_text(aes(x = pmax(`2017`, `2018`) + 10000, # was +15000
label = scales::dollar(delta, accuracy = 1),
colour = dir),
hjust = 0, size = 2.8, show.legend = FALSE) + # smaller text
scale_colour_manual(values = c("Up"="red","Down"="blue"), guide = "none") +
scale_x_continuous(
breaks = seq(0, 450000, 100000), # roomy ticks
labels = scales::dollar_format(accuracy = 1),
limits = c(0, 460000), # a touch more room
expand = expansion(mult = c(0, 0.02))
) +
coord_cartesian(clip = "off") +
labs(
title = "US Household Income Cut-offs by Percentile",
subtitle = "Comparison of 2017 (blue) vs 2018 (red); label shows change (2018–2017)",
x = "Dollar cutoff", y = "Percentile",
caption = "Source: IPUMS-CPS ASEC (University of Minnesota). Weighted household income percentiles."
) +
theme_minimal(base_size = 12) +
theme(panel.grid.minor = element_blank(),
plot.margin = margin(10, 90, 10, 10))
p_interactive <- ggplotly(p_new, tooltip = c("x","y")) %>%
layout(
xaxis = list(tickformat = "$~s", dtick = 100000, automargin = TRUE),
margin = list(l = 70, r = 120, b = 70, t = 60)
)
for (i in seq_along(p_interactive$x$data)) {
if (p_interactive$x$data[[i]]$mode == "text") {
p_interactive$x$data[[i]]$showlegend <- FALSE
}
}
The following plot improves the original data visualisation in the three ways previously explained.
The reference to the original data visualisation choose, the data source(s) used for the reconstruction and any other sources used for this assignment are as follows: