See the Brightspace post for a description of the data.

If a question asks for any calculations (means, medians, tables, proportions, etc.) or graphs, make sure they appear in the knitted document.

The final document should not show any warnings.

Question 1: Skimming the data set

Skim the data set.

skim(bones)
Data summary
Name bones
Number of rows 1531
Number of columns 7
_______________________
Column type frequency:
character 2
numeric 5
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
sex 0 1 4 6 0 2 0
age 0 1 3 5 0 4 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
humerus 161 0.89 303.88 22.97 229.5 287.5 303.5 319 376.0 ▁▅▇▅▁
radius 217 0.86 233.14 18.98 179.0 219.0 233.0 247 290.5 ▁▅▇▅▁
femur 117 0.92 427.28 31.46 345.0 405.0 428.0 449 531.0 ▂▆▇▃▁
tibia 135 0.91 353.21 28.06 276.0 333.0 353.0 372 446.0 ▁▆▇▃▁
iliac 66 0.96 262.55 18.05 184.0 251.0 263.0 274 324.0 ▁▂▇▆▁

Answer the following questions:

There are two categorical variables: sex and age.

The five other variables are numeric: humerus, radius, femur, tibia, and iliac.

iliac has the fewest missing values (66).

radius has the most missing values (217).
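
The missing-value counts reported by skim() can also be checked directly. The sketch below is a minimal illustration that assumes a small made-up data frame named toy (hypothetical values, not the real bones data): is.na() flags the missing cells and colSums() counts them per column.

```r
# Hypothetical toy data standing in for bones (values are made up)
toy <- data.frame(
  humerus = c(289.0, NA, 305.5, 287.0),
  radius  = c(223.5, 197.5, NA, NA),
  iliac   = c(240.5, 246.5, 295.0, 279.0)
)

# Count the missing values in each column
n_missing <- colSums(is.na(toy))
n_missing

# Columns with the most and the fewest missing values
names(which.max(n_missing))
names(which.min(n_missing))
```

With the real data, colSums(is.na(bones)) should reproduce the n_missing column of the skim output.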

Question 2: Arm Length

Start by creating a new column in bones called arm, the combined length of the humerus and radius bones. Then display the first 6 rows of the bones data set.

bones$arm <- bones$humerus + bones$radius

head(bones)
##      sex   age humerus radius femur tibia iliac   arm
## 1   Male 20-29   289.0  223.5 398.0   312 240.5 512.5
## 2 Female 20-29      NA  197.5 376.0   288 246.5    NA
## 3   Male 20-29   305.5  228.5 421.0   337 295.0 534.0
## 4   Male 20-29   287.0  221.0 407.5   320 279.0 508.0
## 5   Male 20-29   352.5  259.0 513.5   377 287.0 611.5
## 6   Male 20-29   294.5  228.0 403.0   318 269.0 522.5

What happens in R when you add a missing value to a non-missing value?

If either value is missing, the sum is NA (missing). Row 2 of the output above shows this: the humerus is NA, so arm is NA as well.
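
The same behavior can be seen on a few values typed in by hand. This is a small sketch that reuses the humerus and radius values from the first three rows printed above; the vector names are only for illustration.

```r
# Arithmetic with a missing value propagates the NA
289.0 + 223.5   # both values present
NA + 197.5      # one value missing: the result is NA

# Element-wise addition on vectors behaves the same way, row by row
humerus <- c(289.0, NA, 305.5)
radius  <- c(223.5, 197.5, 228.5)
arm <- humerus + radius
arm

# Summary functions need na.rm = TRUE to skip the missing entries
mean(arm)                # NA, because one element is missing
mean(arm, na.rm = TRUE)  # mean of the two complete rows
```

This is why the later calls to mean() and median() on arm pass na.rm = TRUE.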

Part 2A: Graphs for Arm

2A i) Blank graph

Start by creating a blank graph saved as gg_arm with:

  • arm on the x-axis

  • A white background with grey grid lines

  • The x-axis labelled as “Arm Length (mm)”

gg_arm <- 
  ggplot(
    data = bones,
    mapping = aes(x = arm)
  ) + 
  
  theme_bw() +
  
  labs(x = "Arm Length (mm)")
  
gg_arm

Part 2A ii) Histogram

Create and save a histogram named gg_arm_hist with:

  • Bars colored with “seagreen”

  • A black outline for each bar

  • Each bin 10 millimeters wide

gg_arm_hist <- 
  gg_arm +
  geom_histogram(
    color = "black",
    binwidth = 10,
    fill = "seagreen",
    # Silently drop the 299 rows with a missing arm so no warning is knitted
    na.rm = TRUE
  )

gg_arm_hist

Part 2A iii) Density Plot

Create and save a density plot as gg_arm_den with the region under the curve shaded in a partly transparent “seagreen”.

gg_arm_den <- 
  gg_arm +
  geom_density(
    fill = "seagreen",
    alpha = 0.75,
    # Silently drop the 299 rows with a missing arm so no warning is knitted
    na.rm = TRUE
  )

gg_arm_den

Part 2B: Arm Shape

  • Using the graphs created in part 2A, describe any important features of the arm variable.

  • How do you expect the mean and median to compare to each other?

Part 2C: Measures of Center

Calculate the mean and median of arm.

mean(x = bones$arm, na.rm = TRUE)
## [1] 537.4375
median(x = bones$arm, na.rm = TRUE)
## [1] 537.5

  • Do they meet your expectations from your answers in part 2B?
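
As a general illustration of how shape affects this comparison, the sketch below uses two made-up vectors (hypothetical values, not the bones data). In a symmetric set the mean and median agree, while a single large value pulls the mean above the median.

```r
# Hypothetical, made-up measurements
symmetric    <- c(500, 520, 540, 560, 580)
right_skewed <- c(500, 520, 540, 560, 900)

# Symmetric data: mean and median coincide
mean(symmetric)
median(symmetric)

# Right-skewed data: the large value raises the mean but not the median
mean(right_skewed)
median(right_skewed)
```

For arm, the mean (537.4375) and the median (537.5) differ by less than 0.1 mm.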

Question 3: Iliac Width

Part 3A: Five Number Summary

Calculate the 5 number summary for iliac. Are there any unusually narrow or wide iliacs?

fivenum(x = bones$iliac, na.rm = TRUE)
## [1] 184 251 263 274 324
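
One common convention for judging "unusually" narrow or wide values is the 1.5 × IQR rule, which boxplots also use to flag outliers. The sketch below applies it to the five-number summary printed above. It assumes the hinges returned by fivenum() are close enough to the quartiles for this purpose, and the variable names are only for illustration.

```r
# Five-number summary of iliac, copied from the output above
five <- c(min = 184, q1 = 251, median = 263, q3 = 274, max = 324)

# Interquartile range and the 1.5 * IQR fences
iqr <- five[["q3"]] - five[["q1"]]
lower_fence <- five[["q1"]] - 1.5 * iqr
upper_fence <- five[["q3"]] + 1.5 * iqr
c(lower = lower_fence, upper = upper_fence)

# Are the extremes beyond the fences?
five[["min"]] < lower_fence
five[["max"]] > upper_fence
```

Under this rule the fences are 216.5 and 308.5, so the minimum (184) and the maximum (324) both fall outside them. At least one iliac in each tail would be flagged as unusual.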

Part 3B: Boxplot for Iliac

Create a boxplot for iliac using ggplot so that it matches the plot on Blackboard. The color of the box can be approximate. Are there any unusually narrow or wide iliacs? Explain your answer!

ggplot(
  data = bones,
  mapping = aes(x = iliac)
) + 
  
  # na.rm = TRUE silently drops the 66 missing iliac values, so no warning is knitted
  geom_boxplot(fill = "orchid", na.rm = TRUE) + 
  
  theme_light() + 
  
  labs(
    x = "Width of Iliac (mm)",
    title = "Iliac Width Boxplot"
  ) +
  
  # Remove the labels on the y-axis
  scale_y_continuous(breaks = NULL)

Part 3C: Iliac by Sex

Create boxplots as they appear in the attached pdf comparing iliac width between males and females. Describe any important differences between the two sexes.

ggplot(
  data = bones,
  mapping = aes(
    y = iliac,
    x = sex,
    fill = sex
  )
) + 
  
  # na.rm = TRUE silently drops the 66 missing iliac values, so no warning is knitted
  geom_boxplot(show.legend = FALSE, na.rm = TRUE) + 
  
  labs(
    y = "Width of Iliac",
    x = NULL
  ) + 
  
  theme_light()

Part 3D: Iliac by sex and age

Create vertical boxplots for iliac with the age ranges on the x-axis and the fill color of the boxes representing sex. Describe any associations you notice in the boxplots.

ggplot(
  data = bones,
  mapping = aes(
    y = iliac,
    fill = sex,
    x = age
  )
) + 
  
  # na.rm = TRUE silently drops the 66 missing iliac values, so no warning is knitted
  geom_boxplot(na.rm = TRUE) + 
  
  labs(
    y = "Width of Iliac",
    x = NULL,
    fill = NULL
  ) + 
  
  theme_bw()