Following the directions on the Coursera assignment page, you will make four original visualizations. Note that the CCES and CEL data are imported in code in the R Markdown file.
Using ggplot, create a scatter plot, a distribution figure (box plot, histogram, or density plot), a bar plot, and a line plot.
Use colors and faceting in at least one of your figures.
Add a text annotation to at least one of your figures.
Use a ggtheme for one of your figures.
Make sure your figures have titles and appropriately labeled axes.
Explain what you are visualizing here: Number of respondents who think religion is either very important or somewhat important. This visualization shows raw counts and does not account for each race's share of the sample or the response rate.
Put your figure here:
Explain what you are visualizing here: Relationship between respondents' education level and their income, faceted and colored by region. Used geom_jitter() to spread out overlapping points.
*Humbly sharing my code here – it seems pretty bulky and I would like to fix this. If anyone can suggest how to recode, convert to a factor, and set the levels at the same time, please hit me up!
Put your figure here:
library(dplyr)
library(ggplot2)

# Recode the numeric income and education codes into readable labels.
cces_reg_inc <- cces_relig_race %>%
  mutate(Income = recode(faminc_new,
    "1" = "Less than $10,000", "2" = "$10,000 - $19,999", "3" = "$20,000 - $29,999",
    "4" = "$30,000 - $39,999", "5" = "$40,000 - $49,999", "6" = "$50,000 - $59,999",
    "7" = "$60,000 - $69,999", "8" = "$70,000 - $79,999", "9" = "$80,000 - $99,999",
    "10" = "$100,000 - $119,999", "11" = "$120,000 - $149,999", "12" = "$150,000 - $199,999",
    "13" = "$200,000 - $249,999", "14" = "$250,000 - $349,999", "15" = "$350,000 - $499,999",
    "16" = "$500,000 or more")) %>%
  mutate(Education = recode(educ,
    "1" = "No high school", "2" = "High school graduate", "3" = "Some college",
    "4" = "2-year college", "5" = "4-year college", "6" = "Post-grad"))

# Convert to factors so the axes follow the bracket order instead of alphabetical order.
cces_reg_inc$Income <- factor(cces_reg_inc$Income, levels = c(
  "Less than $10,000", "$10,000 - $19,999", "$20,000 - $29,999", "$30,000 - $39,999",
  "$40,000 - $49,999", "$50,000 - $59,999", "$60,000 - $69,999", "$70,000 - $79,999",
  "$80,000 - $99,999", "$100,000 - $119,999", "$120,000 - $149,999", "$150,000 - $199,999",
  "$200,000 - $249,999", "$250,000 - $349,999", "$350,000 - $499,999", "$500,000 or more"))
cces_reg_inc$Education <- factor(cces_reg_inc$Education, levels = c(
  "No high school", "High school graduate", "Some college",
  "2-year college", "4-year college", "Post-grad"))

# geom_jitter() already draws every point, so a separate geom_point() layer
# would only plot each observation a second time without jitter.
ggplot(cces_reg_inc, aes(x = Education, y = Income, color = Region)) +
  geom_jitter(show.legend = FALSE) +
  facet_wrap(~ Region) +
  labs(x = "Education Level", title = "Income by Education Level") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
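On the question above: one possible way to recode, convert to a factor, and set the level order in a single step is factor() with both `levels` and `labels`. This is only a minimal sketch, assuming faminc_new and educ hold the plain numeric codes 1-16 and 1-6. The toy tibble and the names `income_labels`, `education_labels`, and `toy_labeled` are made up for illustration, and any code outside those ranges would become NA.

```r
library(dplyr)

# Toy stand-in for cces_relig_race (hypothetical values, not real CCES rows).
toy <- tibble(faminc_new = c(1, 16, 3), educ = c(6, 1, 4))

# Labels listed once, in the order of their numeric codes.
income_labels <- c(
  "Less than $10,000", "$10,000 - $19,999", "$20,000 - $29,999", "$30,000 - $39,999",
  "$40,000 - $49,999", "$50,000 - $59,999", "$60,000 - $69,999", "$70,000 - $79,999",
  "$80,000 - $99,999", "$100,000 - $119,999", "$120,000 - $149,999", "$150,000 - $199,999",
  "$200,000 - $249,999", "$250,000 - $349,999", "$350,000 - $499,999", "$500,000 or more")
education_labels <- c(
  "No high school", "High school graduate", "Some college",
  "2-year college", "4-year college", "Post-grad")

# levels = the codes in the data, labels = what each code should display as.
# The resulting factor keeps the label order, so no second factor() call is needed.
toy_labeled <- toy %>%
  mutate(
    Income = factor(faminc_new, levels = 1:16, labels = income_labels),
    Education = factor(educ, levels = 1:6, labels = education_labels)
  )
```

Because each label vector is written only once, adding or renaming a bracket means editing a single place.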
Explain what you are visualizing here: Intuitive income range by region. While all regions show a similar pattern in IQR and median, a respondent who makes $500,000 or more is considered an outlier in the South region. I call this intuitive because the points fall into preset income brackets rather than exact amounts, but it still gives a sense of relatively low income in the South and a wider range in the West. Used geom_hline() to show the overall median.
Put your figure here:
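The code for this figure is not shown, so the following is only a rough sketch of the technique described (a box plot per region with geom_hline() marking the overall median). The data frame `toy`, its columns `region` and `income_code`, and all values are made up for illustration. The sketch assumes income is kept as the numeric 1-16 bracket code.

```r
library(ggplot2)

# Made-up data for illustration only; income_code is the 1-16 bracket code.
toy <- data.frame(
  region = rep(c("Northeast", "Midwest", "South", "West"), each = 5),
  income_code = c(4, 6, 7, 9, 11,
                  3, 5, 6, 8, 10,
                  2, 4, 5, 6, 16,
                  3, 6, 8, 11, 14)
)

# Median across all regions, drawn as one horizontal reference line.
overall_median <- median(toy$income_code)

p <- ggplot(toy, aes(x = region, y = income_code)) +
  geom_boxplot() +
  geom_hline(yintercept = overall_median, linetype = "dashed") +
  labs(x = "Region", y = "Family income bracket (1-16)",
       title = "Income bracket by region")
```

Printing `p` draws the figure. Each box can then be compared against the dashed line to see whether a region's median sits above or below the overall one.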
Explain what you are visualizing here: From the Center for Effective Lawmaking data: percentage of African-American members of Congress. Used scale_y_continuous(labels = scales::percent) so the y axis shows percentages. The theme is theme_solarized() from {ggthemes}.
Put your figure here:
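The code for this figure is not shown either. As a hedged sketch of the two pieces mentioned, the percent-formatted y axis and the solarized theme can be combined like this. The data frame `toy`, its columns `year` and `share`, and the values are placeholders and not CEL results.

```r
library(ggplot2)
library(ggthemes)

# Placeholder proportions for illustration only (not real CEL numbers).
toy <- data.frame(
  year = c(1, 2, 3, 4),
  share = c(0.10, 0.20, 0.30, 0.40)
)

p <- ggplot(toy, aes(x = year, y = share)) +
  geom_line() +
  # The data stay as proportions (0-1); only the axis labels are shown as percentages.
  scale_y_continuous(labels = scales::percent) +
  labs(x = "Year", y = "Share of members", title = "Share of members over time") +
  theme_solarized()
```

Keeping the variable as a proportion and formatting only the labels avoids multiplying by 100 in the data itself.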