Install and load the “dslabs” package
# install.packages("dslabs")  # run once if the package is not yet installed
library(dslabs)
data(package="dslabs")
list.files(system.file("script", package = "dslabs"))
##  [1] "make-admissions.R"                   
##  [2] "make-brca.R"                         
##  [3] "make-brexit_polls.R"                 
##  [4] "make-death_prob.R"                   
##  [5] "make-divorce_margarine.R"            
##  [6] "make-gapminder-rdas.R"               
##  [7] "make-greenhouse_gases.R"             
##  [8] "make-historic_co2.R"                 
##  [9] "make-mnist_27.R"                     
## [10] "make-movielens.R"                    
## [11] "make-murders-rda.R"                  
## [12] "make-na_example-rda.R"               
## [13] "make-nyc_regents_scores.R"           
## [14] "make-olive.R"                        
## [15] "make-outlier_example.R"              
## [16] "make-polls_2008.R"                   
## [17] "make-polls_us_election_2016.R"       
## [18] "make-reported_heights-rda.R"         
## [19] "make-research_funding_rates.R"       
## [20] "make-stars.R"                        
## [21] "make-temp_carbon.R"                  
## [22] "make-tissue-gene-expression.R"       
## [23] "make-trump_tweets.R"                 
## [24] "make-weekly_us_contagious_diseases.R"
## [25] "save-gapminder-example-csv.R"

Work with the Research Funding Rates dataset

data(research_funding_rates)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.1     ✓ purrr   0.3.4
## ✓ tibble  3.0.1     ✓ dplyr   1.0.0
## ✓ tidyr   1.1.0     ✓ stringr 1.4.0
## ✓ readr   1.3.1     ✓ forcats 0.5.0
## ── Conflicts ────────────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(ggthemes)
library(ggrepel)
view(research_funding_rates)

Focus on Success Rates for Men and Women across all disciplines

Use the gather() function to convert the data from wide to long format

data("research_funding_rates")
research_funding_rates %>%
  select(discipline, success_rates_women, success_rates_men) %>%
  gather("gender", "success_rate", "success_rates_men":"success_rates_women")
##             discipline              gender success_rate
## 1    Chemical sciences   success_rates_men         26.5
## 2    Physical sciences   success_rates_men         19.3
## 3              Physics   success_rates_men         26.9
## 4           Humanities   success_rates_men         14.3
## 5   Technical sciences   success_rates_men         15.9
## 6    Interdisciplinary   success_rates_men         11.4
## 7  Earth/life sciences   success_rates_men         24.4
## 8      Social sciences   success_rates_men         15.3
## 9     Medical sciences   success_rates_men         18.8
## 10   Chemical sciences success_rates_women         25.6
## 11   Physical sciences success_rates_women         23.1
## 12             Physics success_rates_women         22.2
## 13          Humanities success_rates_women         19.3
## 14  Technical sciences success_rates_women         21.0
## 15   Interdisciplinary success_rates_women         21.8
## 16 Earth/life sciences success_rates_women         14.3
## 17     Social sciences success_rates_women         11.5
## 18    Medical sciences success_rates_women         11.2
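
In tidyr 1.0.0 and later, gather() is superseded by pivot_longer(). The following is a minimal sketch of the same reshaping, assuming tidyr 1.0.0 or newer is installed. The data frame `rates_wide` is a hypothetical stand-in built from three rows of the output above, so the example runs without dslabs.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for research_funding_rates: three rows copied
# from the output above.
rates_wide <- data.frame(
  discipline = c("Chemical sciences", "Physical sciences", "Physics"),
  success_rates_women = c(25.6, 23.1, 22.2),
  success_rates_men = c(26.5, 19.3, 26.9),
  stringsAsFactors = FALSE
)

# Stack the two success-rate columns into a gender / success_rate pair.
rates_long <- rates_wide %>%
  pivot_longer(
    cols = c(success_rates_men, success_rates_women),
    names_to = "gender",
    values_to = "success_rate"
  )

rates_long
```

Unlike gather(), pivot_longer() keeps the rows for each discipline next to each other, so the row order differs from the output above while the values stay the same.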

Create barplot using ggplot

Put “Discipline” on the x-axis and “Success Rate” on the y-axis, and map “Gender” to the fill color shown in the legend. Add a title, label the x-axis and y-axis, rename the legend labels, and change the theme and colors.

data("research_funding_rates")
research_funding_rates_long <- research_funding_rates %>%
  select(discipline, success_rates_women, success_rates_men) %>%
  gather("gender", "success_rate", "success_rates_men":"success_rates_women")

research_funding_rates_plot <-research_funding_rates_long %>% 
  ggplot() +
  geom_bar(aes(x=discipline, y=success_rate, fill = gender),
      position = "dodge", stat = "identity") +
  ggtitle("Gender Bias in Research Funding in the Netherlands") +
  xlab("Discipline") +
  ylab("Success Rate") + 
  labs(fill = "Gender")
  scale_y_continuous(limits = c(10,30))
## <ScaleContinuousPosition>
##  Range:  
##  Limits:   10 --   30
  scale_fill_discrete(name = "Gender", labels = c("Men", "Women"))
## <ggproto object: Class ScaleDiscrete, Scale, gg>
##     aesthetics: fill
##     axis_order: function
##     break_info: function
##     break_positions: function
##     breaks: waiver
##     call: call
##     clone: function
##     dimension: function
##     drop: TRUE
##     expand: waiver
##     get_breaks: function
##     get_breaks_minor: function
##     get_labels: function
##     get_limits: function
##     guide: legend
##     is_discrete: function
##     is_empty: function
##     labels: Men Women
##     limits: NULL
##     make_sec_title: function
##     make_title: function
##     map: function
##     map_df: function
##     n.breaks.cache: NULL
##     na.translate: TRUE
##     na.value: grey50
##     name: Gender
##     palette: function
##     palette.cache: NULL
##     position: left
##     range: <ggproto object: Class RangeDiscrete, Range, gg>
##         range: NULL
##         reset: function
##         train: function
##         super:  <ggproto object: Class RangeDiscrete, Range, gg>
##     rescale: function
##     reset: function
##     scale_name: hue
##     train: function
##     train_df: function
##     transform: function
##     transform_df: function
##     super:  <ggproto object: Class ScaleDiscrete, Scale, gg>
  theme_economist(base_size = 10, base_family = "Verdana", horizontal = TRUE, dkpanel = FALSE)
## List of 40
##  $ line                :List of 6
##   ..$ colour       : chr "black"
##   ..$ size         : NULL
##   ..$ linetype     : NULL
##   ..$ lineend      : NULL
##   ..$ arrow        : logi FALSE
##   ..$ inherit.blank: logi TRUE
##   ..- attr(*, "class")= chr [1:2] "element_line" "element"
##  $ rect                :List of 5
##   ..$ fill         : Named chr NA
##   .. ..- attr(*, "names")= chr NA
##   ..$ colour       : logi NA
##   ..$ size         : NULL
##   ..$ linetype     : num 1
##   ..$ inherit.blank: logi TRUE
##   ..- attr(*, "class")= chr [1:2] "element_rect" "element"
##  $ text                :List of 11
##   ..$ family       : NULL
##   ..$ face         : NULL
##   ..$ colour       : chr "black"
##   ..$ size         : NULL
##   ..$ hjust        : NULL
##   ..$ vjust        : NULL
##   ..$ angle        : NULL
##   ..$ lineheight   : NULL
##   ..$ margin       : NULL
##   ..$ debug        : NULL
##   ..$ inherit.blank: logi TRUE
##   ..- attr(*, "class")= chr [1:2] "element_text" "element"
##  $ axis.title          :List of 11
##   ..$ family       : NULL
##   ..$ face         : NULL
##   ..$ colour       : NULL
##   ..$ size         : 'rel' num 1
##   ..$ hjust        : NULL
##   ..$ vjust        : NULL
##   ..$ angle        : NULL
##   ..$ lineheight   : NULL
##   ..$ margin       : NULL
##   ..$ debug        : NULL
##   ..$ inherit.blank: logi TRUE
##   ..- attr(*, "class")= chr [1:2] "element_text" "element"
##  $ axis.title.x        :List of 11
##   ..$ family       : NULL
##   ..$ face         : NULL
##   ..$ colour       : NULL
##   ..$ size         : NULL
##   ..$ hjust        : NULL
##   ..$ vjust        : NULL
##   ..$ angle        : NULL
##   ..$ lineheight   : NULL
##   ..$ margin       : NULL
##   ..$ debug        : NULL
##   ..$ inherit.blank: logi TRUE
##   ..- attr(*, "class")= chr [1:2] "element_text" "element"
##  $ axis.title.y        :List of 11
##   ..$ family       : NULL
##   ..$ face         : NULL
##   ..$ colour       : NULL
##   ..$ size         : NULL
##   ..$ hjust        : NULL
##   ..$ vjust        : NULL
##   ..$ angle        : num 90
##   ..$ lineheight   : NULL
##   ..$ margin       : NULL
##   ..$ debug        : NULL
##   ..$ inherit.blank: logi TRUE
##   ..- attr(*, "class")= chr [1:2] "element_text" "element"
##  $ axis.text           :List of 11
##   ..$ family       : NULL
##   ..$ face         : NULL
##   ..$ colour       : NULL
##   ..$ size         : 'rel' num 1
##   ..$ hjust        : NULL
##   ..$ vjust        : NULL
##   ..$ angle        : NULL
##   ..$ lineheight   : NULL
##   ..$ margin       : NULL
##   ..$ debug        : NULL
##   ..$ inherit.blank: logi TRUE
##   ..- attr(*, "class")= chr [1:2] "element_text" "element"
##  $ axis.text.x         :List of 11
##   ..$ family       : NULL
##   ..$ face         : NULL
##   ..$ colour       : NULL
##   ..$ size         : NULL
##   ..$ hjust        : NULL
##   ..$ vjust        : num 0
##   ..$ angle        : NULL
##   ..$ lineheight   : NULL
##   ..$ margin       : 'margin' num [1:4] 10points 0points 0points 0points
##   .. ..- attr(*, "unit")= int 8
##   ..$ debug        : NULL
##   ..$ inherit.blank: logi TRUE
##   ..- attr(*, "class")= chr [1:2] "element_text" "element"
##  $ axis.text.y         :List of 11
##   ..$ family       : NULL
##   ..$ face         : NULL
##   ..$ colour       : NULL
##   ..$ size         : NULL
##   ..$ hjust        : num 0
##   ..$ vjust        : NULL
##   ..$ angle        : NULL
##   ..$ lineheight   : NULL
##   ..$ margin       : 'margin' num [1:4] 0points 10points 0points 0points
##   .. ..- attr(*, "unit")= int 8
##   ..$ debug        : NULL
##   ..$ inherit.blank: logi TRUE
##   ..- attr(*, "class")= chr [1:2] "element_text" "element"
##  $ axis.ticks          :List of 6
##   ..$ colour       : NULL
##   ..$ size         : NULL
##   ..$ linetype     : NULL
##   ..$ lineend      : NULL
##   ..$ arrow        : logi FALSE
##   ..$ inherit.blank: logi TRUE
##   ..- attr(*, "class")= chr [1:2] "element_line" "element"
##  $ axis.ticks.y        : list()
##   ..- attr(*, "class")= chr [1:2] "element_blank" "element"
##  $ axis.ticks.length   : 'simpleUnit' num -5points
##   ..- attr(*, "unit")= int 8
##  $ axis.line           :List of 6
##   ..$ colour       : NULL
##   ..$ size         : 'rel' num 0.8
##   ..$ linetype     : NULL
##   ..$ lineend      : NULL
##   ..$ arrow        : logi FALSE
##   ..$ inherit.blank: logi TRUE
##   ..- attr(*, "class")= chr [1:2] "element_line" "element"
##  $ axis.line.y         : list()
##   ..- attr(*, "class")= chr [1:2] "element_blank" "element"
##  $ legend.background   :List of 5
##   ..$ fill         : NULL
##   ..$ colour       : NULL
##   ..$ size         : NULL
##   ..$ linetype     : num 0
##   ..$ inherit.blank: logi TRUE
##   ..- attr(*, "class")= chr [1:2] "element_rect" "element"
##  $ legend.spacing      : 'simpleUnit' num 15points
##   ..- attr(*, "unit")= int 8
##  $ legend.key          :List of 5
##   ..$ fill         : NULL
##   ..$ colour       : NULL
##   ..$ size         : NULL
##   ..$ linetype     : num 0
##   ..$ inherit.blank: logi TRUE
##   ..- attr(*, "class")= chr [1:2] "element_rect" "element"
##  $ legend.key.size     : 'simpleUnit' num 1.2lines
##   ..- attr(*, "unit")= int 3
##  $ legend.key.height   : NULL
##  $ legend.key.width    : NULL
##  $ legend.text         :List of 11
##   ..$ family       : NULL
##   ..$ face         : NULL
##   ..$ colour       : NULL
##   ..$ size         : 'rel' num 1.25
##   ..$ hjust        : NULL
##   ..$ vjust        : NULL
##   ..$ angle        : NULL
##   ..$ lineheight   : NULL
##   ..$ margin       : NULL
##   ..$ debug        : NULL
##   ..$ inherit.blank: logi TRUE
##   ..- attr(*, "class")= chr [1:2] "element_text" "element"
##  $ legend.text.align   : NULL
##  $ legend.title        :List of 11
##   ..$ family       : NULL
##   ..$ face         : NULL
##   ..$ colour       : NULL
##   ..$ size         : 'rel' num 1
##   ..$ hjust        : num 0
##   ..$ vjust        : NULL
##   ..$ angle        : NULL
##   ..$ lineheight   : NULL
##   ..$ margin       : NULL
##   ..$ debug        : NULL
##   ..$ inherit.blank: logi TRUE
##   ..- attr(*, "class")= chr [1:2] "element_text" "element"
##  $ legend.title.align  : NULL
##  $ legend.position     : chr "top"
##  $ legend.direction    : NULL
##  $ legend.justification: chr "center"
##  $ panel.background    :List of 5
##   ..$ fill         : NULL
##   ..$ colour       : NULL
##   ..$ size         : NULL
##   ..$ linetype     : num 0
##   ..$ inherit.blank: logi TRUE
##   ..- attr(*, "class")= chr [1:2] "element_rect" "element"
##  $ panel.border        : list()
##   ..- attr(*, "class")= chr [1:2] "element_blank" "element"
##  $ panel.spacing       : 'simpleUnit' num 0.25lines
##   ..- attr(*, "unit")= int 3
##  $ panel.grid.major    :List of 6
##   ..$ colour       : chr "white"
##   ..$ size         : 'rel' num 1.75
##   ..$ linetype     : NULL
##   ..$ lineend      : NULL
##   ..$ arrow        : logi FALSE
##   ..$ inherit.blank: logi TRUE
##   ..- attr(*, "class")= chr [1:2] "element_line" "element"
##  $ panel.grid.minor    : list()
##   ..- attr(*, "class")= chr [1:2] "element_blank" "element"
##  $ plot.background     :List of 5
##   ..$ fill         : Named chr "#d5e4eb"
##   .. ..- attr(*, "names")= chr "blue-gray"
##   ..$ colour       : logi NA
##   ..$ size         : NULL
##   ..$ linetype     : NULL
##   ..$ inherit.blank: logi TRUE
##   ..- attr(*, "class")= chr [1:2] "element_rect" "element"
##  $ plot.title          :List of 11
##   ..$ family       : NULL
##   ..$ face         : chr "bold"
##   ..$ colour       : NULL
##   ..$ size         : 'rel' num 1.5
##   ..$ hjust        : num 0
##   ..$ vjust        : NULL
##   ..$ angle        : NULL
##   ..$ lineheight   : NULL
##   ..$ margin       : NULL
##   ..$ debug        : NULL
##   ..$ inherit.blank: logi TRUE
##   ..- attr(*, "class")= chr [1:2] "element_text" "element"
##  $ plot.margin         : 'simpleUnit' num [1:4] 12points 10points 12points 10points
##   ..- attr(*, "unit")= int 8
##  $ strip.background    :List of 5
##   ..$ fill         : Named chr NA
##   .. ..- attr(*, "names")= chr NA
##   ..$ colour       : logi NA
##   ..$ size         : NULL
##   ..$ linetype     : num 0
##   ..$ inherit.blank: logi TRUE
##   ..- attr(*, "class")= chr [1:2] "element_rect" "element"
##  $ strip.text          :List of 11
##   ..$ family       : NULL
##   ..$ face         : NULL
##   ..$ colour       : NULL
##   ..$ size         : 'rel' num 1.25
##   ..$ hjust        : NULL
##   ..$ vjust        : NULL
##   ..$ angle        : NULL
##   ..$ lineheight   : NULL
##   ..$ margin       : NULL
##   ..$ debug        : NULL
##   ..$ inherit.blank: logi TRUE
##   ..- attr(*, "class")= chr [1:2] "element_text" "element"
##  $ strip.text.x        :List of 11
##   ..$ family       : NULL
##   ..$ face         : NULL
##   ..$ colour       : NULL
##   ..$ size         : NULL
##   ..$ hjust        : NULL
##   ..$ vjust        : NULL
##   ..$ angle        : NULL
##   ..$ lineheight   : NULL
##   ..$ margin       : NULL
##   ..$ debug        : NULL
##   ..$ inherit.blank: logi TRUE
##   ..- attr(*, "class")= chr [1:2] "element_text" "element"
##  $ strip.text.y        :List of 11
##   ..$ family       : NULL
##   ..$ face         : NULL
##   ..$ colour       : NULL
##   ..$ size         : NULL
##   ..$ hjust        : NULL
##   ..$ vjust        : NULL
##   ..$ angle        : num -90
##   ..$ lineheight   : NULL
##   ..$ margin       : NULL
##   ..$ debug        : NULL
##   ..$ inherit.blank: logi TRUE
##   ..- attr(*, "class")= chr [1:2] "element_text" "element"
##  $ panel.grid.major.x  : list()
##   ..- attr(*, "class")= chr [1:2] "element_blank" "element"
##  - attr(*, "class")= chr [1:2] "theme" "gg"
##  - attr(*, "complete")= logi TRUE
##  - attr(*, "validate")= logi TRUE
  scale_color_economist()
## <ggproto object: Class ScaleDiscrete, Scale, gg>
##     aesthetics: colour
##     axis_order: function
##     break_info: function
##     break_positions: function
##     breaks: waiver
##     call: call
##     clone: function
##     dimension: function
##     drop: TRUE
##     expand: waiver
##     get_breaks: function
##     get_breaks_minor: function
##     get_labels: function
##     get_limits: function
##     guide: legend
##     is_discrete: function
##     is_empty: function
##     labels: waiver
##     limits: NULL
##     make_sec_title: function
##     make_title: function
##     map: function
##     map_df: function
##     n.breaks.cache: NULL
##     na.translate: TRUE
##     na.value: NA
##     name: waiver
##     palette: function
##     palette.cache: NULL
##     position: left
##     range: <ggproto object: Class RangeDiscrete, Range, gg>
##         range: NULL
##         reset: function
##         train: function
##         super:  <ggproto object: Class RangeDiscrete, Range, gg>
##     rescale: function
##     reset: function
##     scale_name: economist
##     train: function
##     train_df: function
##     transform: function
##     transform_df: function
##     super:  <ggproto object: Class ScaleDiscrete, Scale, gg>
research_funding_rates_plot

A few things seem to have gone wrong here. The Economist theme and colors aren’t showing up, the legend labels haven’t changed, and the x-axis labels overlap, making them impossible to read. I will now try to fix these things, or try an entirely different type of visualization and see if it works.
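
Judging from the console output printed above, one likely cause is that there is no `+` after `labs(fill = "Gender")`. The plot object ends at that line, and each later call (the scales and the theme) is evaluated on its own and printed instead of being added to the plot. Below is a sketch of a corrected version under some assumptions of mine, not a definitive fix:

- It uses scale_fill_economist() rather than scale_color_economist(), because the bars are colored through the fill aesthetic.
- It passes the legend name and labels to that single fill scale, since a second fill scale would replace the first.
- It zooms with coord_cartesian() instead of scale limits, because bars start at zero and would be dropped by limits = c(10, 30).
- It rotates the x-axis labels so they no longer overlap.

`rates_long` is a hypothetical stand-in built from the values printed earlier, and the font family is left at its default.

```r
library(ggplot2)
library(ggthemes)

# Hypothetical stand-in for research_funding_rates_long, copied from the
# gather() output shown earlier.
rates_long <- data.frame(
  discipline = rep(c("Chemical sciences", "Physical sciences", "Physics",
                     "Humanities", "Technical sciences", "Interdisciplinary",
                     "Earth/life sciences", "Social sciences",
                     "Medical sciences"), times = 2),
  gender = rep(c("success_rates_men", "success_rates_women"), each = 9),
  success_rate = c(26.5, 19.3, 26.9, 14.3, 15.9, 11.4, 24.4, 15.3, 18.8,
                   25.6, 23.1, 22.2, 19.3, 21.0, 21.8, 14.3, 11.5, 11.2),
  stringsAsFactors = FALSE
)

# Every layer is joined with `+`, so all of them become part of the plot.
funding_plot <- ggplot(rates_long,
                       aes(x = discipline, y = success_rate, fill = gender)) +
  geom_col(position = "dodge") +
  labs(title = "Gender Bias in Research Funding in the Netherlands",
       x = "Discipline", y = "Success Rate") +
  # Zoom the y-axis without discarding the bars.
  coord_cartesian(ylim = c(10, 30)) +
  theme_economist(base_size = 10) +
  # One fill scale carries the palette, the legend title, and the labels.
  scale_fill_economist(name = "Gender", labels = c("Men", "Women")) +
  # Applied after the theme so the rotation is not overwritten.
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

funding_plot
```

The labels are given in the order “Men”, “Women” because ggplot2 sorts the two gender values alphabetically, which puts success_rates_men first.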

Load RColorBrewer and highcharter

library(RColorBrewer)
library(highcharter)
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo
## Highcharts (www.highcharts.com) is a Highsoft software product which is
## not free for commercial and Governmental use
## 
## Attaching package: 'highcharter'
## The following object is masked from 'package:dslabs':
## 
##     stars

Use highcharter to create a barplot

data("research_funding_rates")
research_funding_rates_long <- research_funding_rates %>%
  select(discipline, success_rates_women, success_rates_men) %>%
  gather("gender", "success_rate", "success_rates_men":"success_rates_women")
  
research_funding_rates_plot2 <-research_funding_rates_long %>% 
highchart() %>%
  hc_add_series(data = research_funding_rates_long,
                   type = "column", hcaes(x = discipline,
                   y = success_rate, 
                   group = gender))
## Warning: `parse_quosure()` is deprecated as of rlang 0.2.0.
## Please use `parse_quo()` instead.
## This warning is displayed once per session.
## Warning: `group_by_()` is deprecated as of dplyr 0.7.0.
## Please use `group_by()` instead.
## See vignette('programming') for more help
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_warnings()` to see where this warning was generated.
## Warning: `select_()` is deprecated as of dplyr 0.7.0.
## Please use `select()` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_warnings()` to see where this warning was generated.
## Warning: `as_data_frame()` is deprecated as of tibble 2.0.0.
## Please use `as_tibble()` instead.
## The signature and semantics have changed, see `?as_tibble`.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_warnings()` to see where this warning was generated.
## Warning: `rename_()` is deprecated as of dplyr 0.7.0.
## Please use `rename()` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_warnings()` to see where this warning was generated.
research_funding_rates_plot2

I’m not sure what happened here; there are far more columns than there should be. I’m going to try again.

data("research_funding_rates")
research_funding_rates_long <- research_funding_rates %>%
  select(discipline, success_rates_women, success_rates_men) %>%
  gather("gender", "success_rate", "success_rates_men":"success_rates_women")

research_funding_rates_plot3 <-research_funding_rates_long %>% 
 hchart(type = "column", hcaes(x = discipline, y = success_rate, group = gender)) %>%
 hc_title(text = "Gender Bias in Research Funding in the Netherlands") %>%
 hc_yAxis(title = list(text = "Success Rate")) %>%
 hc_xAxis(title = list(text = "Discipline")) %>%
 hc_legend(align = "right", 
            verticalAlign = "top") %>%
 hc_add_theme(hc_theme_gridlight())
research_funding_rates_plot3

This is definitely the most successful visualization yet. Here I have used highcharter to create a barplot. I used a highcharter theme called “gridlight,” moved the legend to the top right, renamed the x-axis and y-axis, and added a title to the top of the graph. I still can’t figure out how to change the labels “success_rates_men” and “success_rates_women” to say “Men” and “Women.”
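
One possible way to get “Men” and “Women” in the legend is to recode the gender column before plotting. This sketch assumes that highcharter names each series after the values of the grouping variable. The data frame `rates_long` is a hypothetical stand-in with a few rows copied from the earlier output, and recode() comes from dplyr.

```r
library(dplyr)

# Hypothetical stand-in for research_funding_rates_long (three disciplines).
rates_long <- data.frame(
  discipline = rep(c("Chemical sciences", "Physical sciences", "Physics"),
                   times = 2),
  gender = rep(c("success_rates_men", "success_rates_women"), each = 3),
  success_rate = c(26.5, 19.3, 26.9, 25.6, 23.1, 22.2),
  stringsAsFactors = FALSE
)

# Replace the column-derived names with readable labels.
rates_labeled <- rates_long %>%
  mutate(gender = recode(gender,
                         success_rates_men = "Men",
                         success_rates_women = "Women"))

unique(rates_labeled$gender)

# The relabeled data can then go through the same hchart() call as before
# (only run when highcharter is installed).
if (requireNamespace("highcharter", quietly = TRUE)) {
  library(highcharter)
  labeled_plot <- rates_labeled %>%
    hchart(type = "column",
           hcaes(x = discipline, y = success_rate, group = gender))
}
```

Recoding the data rather than the chart keeps the fix independent of the plotting library, so the same relabeled column would also work with ggplot.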

Final Results

It took many tries to create a visualization that showed the data in an interesting and understandable way, but I am happy with the results. Although I couldn’t figure out how to rename the gender labels from “success_rates_men” and “success_rates_women” to “Men” and “Women,” I think the data in the visualization is still easy to understand. Highcharter was a bit intimidating at first, and I was pretty confused when I initially tried it, but I think it ended up being easier to tweak and customize than ggplot. I really enjoyed flipping through some of the available themes, and the hover tooltips make it much easier to take in the data and understand what is being measured. Being able to customize the plot makes data visualization and communication much more enjoyable to learn and work with.

The results of my barplot were pretty interesting. It is well known that women are still paid less and recognized less in almost every field; however, in this dataset, women had higher success rates than men in 4 of the 9 disciplines. It is important to note that men still had the highest success rates overall, but it was interesting to see women having more success than men almost half the time, across a variety of disciplines, from the hard sciences to the humanities. I partially chose this dataset because I studied for a semester at the University of Amsterdam in the social sciences department, so I was curious to see these results. It was sad to see that men had more success in getting awards and funding in the social sciences!

Overall, I think this plot presents the data clearly and in a visually appealing way. Going forward, now that I know how to create and customize plots to this extent, I can focus on learning new ways to create visualizations for datasets.
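
The count of disciplines in which women had the higher success rate can be checked directly from the numbers. This is a small sketch in base R; `rates_wide` is a hypothetical stand-in that copies the nine pairs of success rates from the gather() output earlier in this document.

```r
# Hypothetical stand-in for the two success-rate columns of
# research_funding_rates, copied from the output shown earlier.
rates_wide <- data.frame(
  discipline = c("Chemical sciences", "Physical sciences", "Physics",
                 "Humanities", "Technical sciences", "Interdisciplinary",
                 "Earth/life sciences", "Social sciences", "Medical sciences"),
  success_rates_men = c(26.5, 19.3, 26.9, 14.3, 15.9, 11.4, 24.4, 15.3, 18.8),
  success_rates_women = c(25.6, 23.1, 22.2, 19.3, 21.0, 21.8, 14.3, 11.5, 11.2),
  stringsAsFactors = FALSE
)

# TRUE for each discipline where the women's rate exceeds the men's rate.
women_higher <- rates_wide$success_rates_women > rates_wide$success_rates_men

rates_wide$discipline[women_higher]
# "Physical sciences" "Humanities" "Technical sciences" "Interdisciplinary"

sum(women_higher)
# 4
```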