knitr::opts_chunk$set(echo = FALSE,
                      # warning = FALSE,
                      # message = FALSE,
                      fig.align = "center")

# Load the tidyverse, skimr, and gt packages
pacman::p_load(tidyverse, skimr, gt)

Instructions:

Data Description:

The supers2.csv file has 12 variables on 6961 characters with super abilities. The 12 variables are:

  1. Character: The primary name of the character
  2. Creator: The publisher that owns the character, such as Marvel Comics or DC Comics
  3. Alignment: Whether the character is considered good, neutral, or bad
  4. Alter_Egos: Whether the character has an alter ego (e.g., Spider-Man = Peter Parker)
  5. Eye_color and Hair_color: The character’s eye and hair colors
  6. Species: The species of the character
  7. IQ: The character’s IQ
  8. Intelligence, Strength, Combat, Durability: Attribute scores on a scale from 1 to 100

Read in the “supers2.csv” data set and save it as a global object named supers.
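A minimal sketch of the read-in step, assuming base R's read.csv(). It writes a tiny stand-in file first so that it runs anywhere; demo_path and supers_demo are hypothetical names, and the three rows of columns shown are made up. In the assignment itself the file is "supers2.csv" and the object is supers. The `<int>` column types in the expected output for Part 1A are consistent with read.csv(); readr's read_csv() would report whole-number columns as `<dbl>`.

```r
# Hypothetical stand-in file so the sketch runs anywhere;
# the assignment itself reads "supers2.csv" from the working directory.
demo_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(Character = c("A-Bomb", "Abby"),
             Creator   = c("Marvel Comics", "DC Comics"),
             IQ        = c(130L, 106L)),
  demo_path,
  row.names = FALSE
)

# Assigning at the top level of the script creates a global object
supers_demo <- read.csv(demo_path)
str(supers_demo)
```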

Question 1) Keeping only Marvel and DC characters

Part 1A) Comics data set

Create a data set named comics that has:

  1. Creator is either Marvel Comics or DC Comics
  2. Alignment of the characters is not missing (NA).

If done correctly, you should have 1899 rows. Display the resulting data set using the tibble() function.

## # A tibble: 1,899 × 12
##    Character     Creator Alignment Alter_Egos Eye_color Hair_color Species    IQ
##    <chr>         <chr>   <chr>     <chr>      <chr>     <chr>      <chr>   <int>
##  1 3-D Man       Marvel… Good      None       <NA>      <NA>       None      110
##  2 A-Bomb        Marvel… Good      None       Yellow (… No Hair    Human     130
##  3 A.I.M. Agent  Marvel… Bad       None       None      None       Human      70
##  4 A.M.A.Z.O     DC Com… Bad       None       Red       Orange     Android   115
##  5 A.M.A.Z.O.    DC Com… Bad       None       Red       Brown      Android   190
##  6 Abby          DC Com… Good      None       Black     White      Animal    106
##  7 Abel Cuvier   DC Com… None      None       None      None       None      160
##  8 Abomination   Marvel… Bad       None       <NA>      <NA>       Human     130
##  9 Above All Ot… Marvel… Neutral   None       None      None       Cosmic…   200
## 10 Abraxas       Marvel… Bad       None       Blue      Black      Cosmic…   200
## # ℹ 1,889 more rows
## # ℹ 4 more variables: Intelligence <int>, Strength <int>, Combat <int>,
## #   Durability <int>
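The two conditions in Part 1A can be sketched with dplyr::filter() on a toy data set. The rows below are made up for illustration, and supers_demo and comics_demo are hypothetical names standing in for supers and comics.

```r
library(dplyr)

# Toy stand-in for supers (hypothetical rows, not the real data)
supers_demo <- tibble(
  Character = c("Hero A", "Villain B", "Hero C", "Drifter D"),
  Creator   = c("Marvel Comics", "DC Comics", "Other Publisher", "DC Comics"),
  Alignment = c("Good", "Bad", "Good", NA)
)

comics_demo <- supers_demo %>%
  filter(
    Creator %in% c("Marvel Comics", "DC Comics"),  # keep only the two publishers
    !is.na(Alignment)                              # drop rows with a missing alignment
  )

# Display the result as a tibble
tibble(comics_demo)
```

Conditions separated by commas inside filter() are combined with AND, so a row must pass both checks to be kept. Here "Hero C" fails the publisher check and "Drifter D" fails the missing-value check.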

Part 1B) Comics IQ

Using the comics data set created in 1A), create the density plots seen in the pdf in Brightspace. Make sure the area under the curves is partly transparent.
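Transparency of the filled area is controlled by the alpha argument of geom_density(). The sketch below uses simulated IQ values, and mapping fill to Creator is an assumption; the pdf in Brightspace determines which variables the real plot uses. comics_demo and p_iq are hypothetical names.

```r
library(ggplot2)

# Simulated stand-in data (not the real comics data set)
set.seed(1)
comics_demo <- data.frame(
  Creator = rep(c("Marvel Comics", "DC Comics"), each = 50),
  IQ      = c(rnorm(50, mean = 110, sd = 20), rnorm(50, mean = 120, sd = 25))
)

# alpha below 1 makes the filled area partly transparent,
# so overlapping curves remain visible
p_iq <- ggplot(comics_demo, aes(x = IQ, fill = Creator)) +
  geom_density(alpha = 0.5)

p_iq
```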

Part 1C) Improved Density Plots

Using one of the dplyr verbs, create a new data set named comics2, with:

  • Alignment having 3 groups:

    1. Good = Good
    2. Neutral = Neutral or None
    3. Bad = Bad
    • Hint: Look at the slides for mutate for the appropriate dplyr verb for this question!
  • Alignment groups should be in order of Good, Neutral, Bad

  • Physical = (Strength + Combat + Durability)/3

  • Remove the word “ Comics” from the Creator column using the str_remove() function

The resulting data set should still have 1899 characters. After saving comics2, verify the results by calling skim() on only the three columns mentioned above!

Data summary

Name                     dplyr::select(comics2, Cr…
Number of rows           1899
Number of columns        3
_______________________
Column type frequency:
  character              1
  factor                 1
  numeric                1
________________________
Group variables          None

Variable type: character

skim_variable  n_missing  complete_rate  min  max  empty  n_unique  whitespace
Creator                0              1    2    6      0         2           0

Variable type: factor

skim_variable  n_missing  complete_rate  ordered  n_unique  top_counts
Alignment              0              1  FALSE           3  Goo: 727, Bad: 593, Neu: 579

Variable type: numeric

skim_variable  n_missing  complete_rate   mean     sd  p0    p25  p50    p75  p100  hist
Physical               0              1  61.14  30.03   1  34.33   60  91.67   100  ▂▆▃▂▇
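The three changes in Part 1C can be sketched in a single mutate() call. This version uses if_else() to fold "None" into "Neutral", which is only one possible choice; the slides may point to a different helper. The toy rows are made up, and comics_demo and comics2_demo are hypothetical names.

```r
library(dplyr)
library(stringr)

# Toy stand-in for comics (hypothetical rows, not the real data)
comics_demo <- tibble(
  Creator    = c("Marvel Comics", "DC Comics", "DC Comics", "Marvel Comics"),
  Alignment  = c("Good", "None", "Bad", "Neutral"),
  Strength   = c(90, 30, 60, 45),
  Combat     = c(60, 30, 90, 45),
  Durability = c(90, 30, 60, 45)
)

comics2_demo <- comics_demo %>%
  mutate(
    # Fold "None" into "Neutral", then fix the order of the groups
    Alignment = if_else(Alignment == "None", "Neutral", Alignment),
    Alignment = factor(Alignment, levels = c("Good", "Neutral", "Bad")),
    # Average of the three physical attributes
    Physical  = (Strength + Combat + Durability) / 3,
    # Drop the trailing " Comics" (note the leading space in the pattern)
    Creator   = str_remove(Creator, " Comics")
  )

comics2_demo %>% select(Creator, Alignment, Physical)
```

On the real data, the same three selected columns can be passed to skimr::skim() instead of being printed.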

Part 1D) IQ Density Plot

Recreate the density plot in 1B using comics2 for characters with an IQ at or above 50. Don’t save a new data set for the characters with an IQ of at least 50!
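One way to avoid saving an intermediate data set is to pipe the filtered rows straight into ggplot(). The sketch below uses simulated IQ values and assumes the plot maps fill to Creator; comics2_demo and p_iq50 are hypothetical names.

```r
library(dplyr)
library(ggplot2)

# Simulated stand-in data (not the real comics2 data set)
set.seed(2)
comics2_demo <- tibble(
  Creator = rep(c("Marvel", "DC"), each = 50),
  IQ      = c(rnorm(50, mean = 110, sd = 30), rnorm(50, mean = 120, sd = 35))
)

# The filtered rows flow directly into ggplot(); comics2_demo itself is unchanged.
# Note the switch from %>% to + once ggplot() has been called.
p_iq50 <- comics2_demo %>%
  filter(IQ >= 50) %>%
  ggplot(aes(x = IQ, fill = Creator)) +
  geom_density(alpha = 0.5)

p_iq50
```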

Question 2) Average Physical by Creator and Alignment

Calculate the sample size, mean, median, and standard deviation of the Physical rating for each Creator and Alignment combination. Round each calculated summary to the nearest whole number, then sort the data set from largest to smallest average. Save it as comic_phys.

Display the resulting data set in the knitted document using the gt() function.

Creator  Alignment  characters  phys_avg  phys_med  phys_sd
Marvel   Neutral           304        70        79       28
DC       Bad               238        65        73       31
Marvel   Bad               355        60        55       29
Marvel   Good              470        58        50       30
DC       Good              257        58        53       29
DC       Neutral           275        58        55       33
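The grouped summary in Question 2 can be sketched with group_by(), summarise(), and arrange(). The toy values below are made up, and comics2_demo and comic_phys_demo are hypothetical names; the output column names follow the table above.

```r
library(dplyr)

# Toy stand-in for comics2 (hypothetical rows, not the real data)
comics2_demo <- tibble(
  Creator   = c("Marvel", "Marvel", "Marvel", "DC", "DC", "DC"),
  Alignment = factor(c("Good", "Good", "Good", "Bad", "Bad", "Bad"),
                     levels = c("Good", "Neutral", "Bad")),
  Physical  = c(40, 50, 63, 70, 80, 96)
)

comic_phys_demo <- comics2_demo %>%
  group_by(Creator, Alignment) %>%
  summarise(
    characters = n(),                       # sample size per group
    phys_avg   = round(mean(Physical)),
    phys_med   = round(median(Physical)),
    phys_sd    = round(sd(Physical)),
    .groups    = "drop"                     # return an ungrouped result
  ) %>%
  arrange(desc(phys_avg))                   # largest average first

comic_phys_demo
```

In the knitted document, the summarized data frame would be passed to gt::gt() rather than printed.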

Question 3) Average Character Scores Across Alignment and Creator

Part 3A) Creating the summarized data set

Using comics2 created in 1C and the different dplyr verbs seen so far, create a data set with 4 columns:

  1. Creator: Either DC or Marvel
  2. Alignment: Good/Neutral/Bad (in that order)
  3. Attribute: Intelligence, Strength, Combat, Durability (in that order) (No Physical)
  • Hint: Use as_factor() as a quick way to change the order of the groups!
  4. score_avg: Average score for the attribute in column 3 for each creator and alignment combination rounded to the nearest whole number

Save the results as supers_attr. Display the first 10 rows using tibble(). See the pdf in Brightspace for what the final data frame should look like.

## # A tibble: 24 × 4
##    Creator Alignment Attribute    score_avg
##    <chr>   <fct>     <fct>            <dbl>
##  1 Marvel  Good      Intelligence        73
##  2 Marvel  Good      Strength            51
##  3 Marvel  Good      Combat              68
##  4 Marvel  Good      Durability          54
##  5 Marvel  Bad       Intelligence        74
##  6 Marvel  Bad       Strength            57
##  7 Marvel  Bad       Combat              67
##  8 Marvel  Bad       Durability          56
##  9 DC      Bad       Intelligence        78
## 10 DC      Bad       Strength            62
## # ℹ 14 more rows
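One possible route to the four-column layout is to stack the attribute columns with tidyr::pivot_longer() and then summarize. This is a sketch under that assumption, not necessarily the approach behind the output above, so the row order may differ. The toy values are made up, and comics2_demo and supers_attr_demo are hypothetical names.

```r
library(dplyr)
library(tidyr)
library(forcats)

# Toy stand-in for comics2 (hypothetical rows, not the real data)
comics2_demo <- tibble(
  Creator      = c("Marvel", "Marvel", "DC", "DC"),
  Alignment    = factor(c("Good", "Good", "Bad", "Bad"),
                        levels = c("Good", "Neutral", "Bad")),
  Intelligence = c(70, 80, 60, 90),
  Strength     = c(50, 60, 40, 50),
  Combat       = c(65, 75, 55, 65),
  Durability   = c(40, 60, 70, 90)
)

supers_attr_demo <- comics2_demo %>%
  # Stack the four attribute columns into name/value pairs
  pivot_longer(cols = Intelligence:Durability,
               names_to = "Attribute", values_to = "score") %>%
  # as_factor() keeps levels in order of first appearance
  mutate(Attribute = as_factor(Attribute)) %>%
  group_by(Creator, Alignment, Attribute) %>%
  summarise(score_avg = round(mean(score)), .groups = "drop")

supers_attr_demo
```

Because as_factor() orders levels by first appearance, the Attribute levels follow the column order used in pivot_longer(): Intelligence, Strength, Combat, Durability.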

Part 3B) Dumbbell Plots

Create the graph seen in the pdf in Brightspace.
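The target graph is only shown in the pdf, so the following is a generic dumbbell-plot sketch rather than a reproduction of it. It assumes one point per Creator for each Attribute, joined by a line; the scores are made up, and supers_attr_demo and p_dumbbell are hypothetical names.

```r
library(ggplot2)

attr_levels <- c("Intelligence", "Strength", "Combat", "Durability")

# Made-up scores for illustration only
supers_attr_demo <- data.frame(
  Creator   = rep(c("Marvel", "DC"), each = 4),
  Attribute = factor(rep(attr_levels, times = 2), levels = attr_levels),
  score_avg = c(70, 50, 65, 55, 78, 60, 62, 58)
)

p_dumbbell <- ggplot(supers_attr_demo, aes(x = score_avg, y = Attribute)) +
  # The "bar" of the dumbbell: one line per attribute joining its two points
  geom_line(aes(group = Attribute), color = "grey50") +
  # The "weights": one point per creator
  geom_point(aes(color = Creator), size = 3)

p_dumbbell
```

Drawing the line layer first keeps the points on top of the connecting segments. With the real supers_attr data, a facet by Alignment (for example with facet_wrap()) would be one way to show all three groups, if the pdf calls for it.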