HW 1 Instructions

Due Wednesday, September 3, 2025 at 11:59 PM

Purpose

This assignment will give you experience with:

  • Creating an R Quarto Project Directory

  • Creating img and data folders within R project to store files

  • Editing and using a Quarto (.qmd) file

  • Reviewing some basic R syntax from Week 1

  • Saving your work and exporting files to the img and data folders.

NOTES:

  • Familiarity with R, RStudio, and Quarto projects and files is necessary for you to succeed in this course.

  • Throughout this course you will be managing and manipulating data for different goals.

  • You will not be tested directly on file management, but you cannot manage data well until you can manage files well on your computer.

  • Good file management is necessary for good data management and will make all of your work more efficient.

HW 1 - First Steps

Instructions

  1. Create a BUA455 folder on your computer, NOT in Downloads.

    • ALL work for this class will be saved to this folder

    • In this example, my folder is BUA 455.

    • Create and name your first R Quarto project for your first HW Assignment:

    1. File > New Project > New Directory > Quarto Project

    2. In the box, name this project:

    • HW 1 <first name> <last name>

    • My project name: HW 1 Penelope Pooler

    3. Click Create Project.
  2. Notice that when you create this project, a Quarto (.qmd) file with the same name is created.

    • Here are the default files shown in the created HW 1 Quarto Project:

Quarto Project default Files
  3. Use the create folder button to create two additional folders:

    • data, img

    • After creating the Quarto project with your name and adding these folders, your files should appear as follows, with your name instead of mine:

HW 1 files after Step 3
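
The same two folders can also be created from the R console instead of with the create folder button. This is a minimal sketch using base R only. The helper name `make_project_folders` is hypothetical (it is not part of the assignment), and the demo runs in a temporary directory so that it does not touch your project.

```r
# Hypothetical helper: create the data and img folders under a project root.
make_project_folders <- function(root = ".", folders = c("data", "img")) {
  for (f in folders) {
    path <- file.path(root, f)
    if (!dir.exists(path)) dir.create(path)  # only create missing folders
  }
  # report which of the requested folders now exist
  dir.exists(file.path(root, folders))
}

# Demo in a temporary directory (an assumption made for illustration)
demo_root <- file.path(tempdir(), "hw1_demo")
dir.create(demo_root, showWarnings = FALSE)
created <- make_project_folders(demo_root)
created
```

Inside your own project, calling `make_project_folders()` with no arguments would act on the current working directory, which is the project root when the project is open in RStudio.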
  4. Edit the header of the Quarto (.qmd) file.

    • The default header in the .qmd file only includes the title.

Default Quarto File Header
    • Replace this full header, including the dashed lines, with the text below.

    • Change `Penelope Pooler` to your name.
---
title: "HW 1 Penelope Pooler"
date: last-modified
toc: true
toc-depth: 3
toc-location: left
toc-title: "Table of Contents"
toc-expand: 1
format:
  html:
    code-line-numbers: true
    code-fold: true
    code-tools: true
execute:
  echo: fenced 
---
  5. Delete the default text, then copy and paste the Setup header and R chunk into the document.

    • The default Quarto file text includes a header, a little text, and one R chunk.

Default Quarto Text
    • Delete all of this text and this R chunk and replace them with the setup header (`## Setup`) and the `setup` R chunk from the provided text file, `HW1_SETUP.txt`.

    • Note that the provided `setup` R code chunk already includes a label, but in subsequent steps you will create new empty code chunks and add a label to each.
  6. Run the setup chunk.

    • Click the green triangle in the upper right corner of the setup chunk to run it.

    • Examine the output of the setup chunk to verify that these 13 packages were loaded (and installed if needed).

    • If the knitr package (necessary for Quarto) or these other packages are not installed, this may take a few minutes.

List of Loaded Packages
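
If you want to confirm which packages are already installed before or after running the setup chunk, a short base R check can help. This is only a sketch: the helper name `missing_packages` is hypothetical, and the package names passed to it are placeholders rather than the 13 packages from `HW1_SETUP.txt`.

```r
# Hypothetical helper: return the names of packages that are NOT installed.
# requireNamespace() returns TRUE if the package can be loaded, else FALSE.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Placeholder names; substitute the packages listed in your setup chunk
missing_packages(c("stats", "utils"))
```

An empty result means every package you listed is available.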

HW 1 - Part 2

NOTES:

  • HW 1 - Part 1 includes the questions about the syllabus, hardware requirements and R/RStudio versions.

  • HW 1 - Part 2 includes three chunks that we review in the Week 1 Lecture material.

  • The first two chunks include provided code that you will run and use to answer questions about the cars dataset.

  • The third chunk includes only comments; you will add the code specified in Blackboard Questions 6 and 7.

Instructions

  1. Add this heading. Note that the ## creates a level 2 header (second largest).
## HW 1 - Part 2
  2. Create the first empty chunk, add the label, code, and comments to the chunk, and run it by clicking the green triangle.

    • Create an R chunk by typing Ctrl+Alt+I or clicking the green C button.

    • Note that the glimpse command is shown twice, without piping and then with piping.

#|label: import and examine cars data   
  
# cars is an internal R dataset
# this code saves a copy of the cars data in the Global Environment
my_cars <- cars

# examine the dataset my_cars using glimpse
glimpse(my_cars)

# same command with piping:
# read |> as 'is sent to' or 'goes into'
my_cars |> glimpse()
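
To see that piping changes only how the code reads and not the result, here is a small sketch using base R functions. The numbers are arbitrary. The native pipe `|>` requires R 4.1 or later.

```r
# Without piping: read from the inside out
sum(sqrt(c(1, 4, 9)))

# With piping: read left to right
c(1, 4, 9) |> sqrt() |> sum()

# Save both results and confirm they are the same value
piped  <- c(1, 4, 9) |> sqrt() |> sum()
nested <- sum(sqrt(c(1, 4, 9)))
identical(piped, nested)
```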
  3. Create the second empty chunk, add the label, code, and comments to the chunk, and run it by clicking the green triangle.
#|label: square bracket examples
  
my_cars[3:5,]  # select rows 3, 4 and 5 both columns
my_cars[,1]    # select all rows of column 1
my_cars[10:12, 1] # select obs 10, 11, and 12 within col 1
my_cars[c(20,30,40),2] # select obs 20, 30, and 40, within col 2
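
The general pattern is `data[rows, columns]`, where leaving a position blank means "all". The sketch below uses the built-in cars dataset directly to show what the square brackets return. The object names `col_vec` and `col_df` are chosen only for illustration.

```r
dim(cars)      # number of rows, then number of columns
names(cars)    # column names

col_vec <- cars[, 1]                # one column comes back as a plain vector
col_df  <- cars[, 1, drop = FALSE]  # drop = FALSE keeps it as a data frame

is.numeric(col_vec)
is.data.frame(col_df)
nrow(cars[3:5, ])                   # number of rows selected by 3:5
```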
  4. Create the third empty chunk, add the label and comments, and create code lines as specified in Blackboard Questions 6 and 7.
#|label: square bracket exercises
  
# HW 1 Question 6                                                              

# HW 1 Question 7     

HW 1 - Part 3

NOTES:

  • In the Week 1 lectures, we also worked with the starwars dataset.

    • Recall that the starwars dataset is internal to the dplyr package (included in tidyverse) that we loaded in our setup chunk above.
  • Code from the R chunks we cover in Week 1 of BUA 455 is included below and more code is added.

    • Provided R code below will not run unless the setup chunk is run first.


  • These chunks are provided as a reference to show how to do basic R data manipulations and summaries.

  • R code with and without piping is included as a reference.

  • In future work, most R code will be shown with piping only.

Instructions

  1. Add this heading. Note that the ## creates a level 2 header (second largest).
## HW 1 - Part 3
  2. Create five new empty chunks.

  3. Label the first chunk by adding this code on the line below the top of the chunk:

    • Note that #| is referred to as a fence and it is used to add labels and options to the chunk

    • Some options can also be added within the curly brackets to modify output.

#|label: starwars1
  4. Add label fences to each of the four other chunks, labeling them starwars2, starwars3, starwars4, and starwars5.

    • You will not receive full credit if the chunks are not all labeled correctly.
  5. Add the provided code and comments to each of the five chunks and run them to answer the Blackboard Questions.

R Code for starwars 1 Chunk

# save a copy of starwars data to your global environment
my_starwars <- starwars

# examine the data with glimpse
glimpse(my_starwars)

# same glimpse command with piping
my_starwars |> glimpse()
  

R Code for starwars 2 Chunk

# examine the list of species with unique command
# to specify a variable WITHIN a dataset we use $, the accessor operator
unique(my_starwars$species)

# same unique command using piping and pull command
my_starwars |> pull(species) |> unique()

# examine the list of hair colors
# again, we use $ to specify hair_color within this dataset
unique(my_starwars$hair_color)
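
The sketch below repeats the `$` and `unique` pattern on a tiny made-up data frame so that the result is easy to check by eye. The `pets` data are hypothetical and are not part of the starwars dataset.

```r
# Hypothetical toy data for illustration only
pets <- data.frame(
  name    = c("Rex", "Tom", "Ace", "Kit"),
  species = c("dog", "cat", "dog", NA)
)

pets$species                  # $ pulls one column out as a vector
unique(pets$species)          # each distinct value once; NA is kept
length(unique(pets$species))  # number of distinct values, counting NA
```

Note that `unique` keeps a missing value (NA) as its own entry, which is why NA can appear in the lists of species and hair colors.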
  

R Code for starwars 3 Chunk

# use table to summarize starwars data by sex and hair color
table(my_starwars$sex, my_starwars$hair_color)

# save this table as an object
sw_gender_hrclr_smry <- table(my_starwars$sex, my_starwars$hair_color)
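
By default, `table` leaves out observations with a missing value in either variable. The sketch below uses made-up vectors (not the starwars data) to show the difference that the `useNA` argument makes.

```r
# Hypothetical values for illustration only
sex  <- c("male", "female", "male", NA, "female")
hair <- c("brown", "brown", "none", "none", NA)

table(sex, hair)                   # observations with an NA are dropped
table(sex, hair, useNA = "ifany")  # NA gets its own row/column when present

counts <- table(sex, hair)
counts["male", "brown"]            # one cell of the table
sum(counts)                        # observations counted without NAs
```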
  

R Code for starwars 4 Chunk

# summarize height of starwars characters
summary(my_starwars$height)

# calculate the mean height (also included in summary above)
# NA's must be excluded with na.rm=T
mean(my_starwars$height, na.rm=T)

# same mean command with piping and pull command
my_starwars |> pull(height) |> mean(na.rm=T)

# NA's must also be excluded for min(), max(), median(), sd(), etc.
sd(my_starwars$height, na.rm=T)

# same sd command with piping and pull command
my_starwars |> pull(height) |> sd(na.rm=T)
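
To see why `na.rm` is needed, compare the two calls in this sketch. The `heights` vector is made up for illustration.

```r
# Hypothetical heights with one missing value
heights <- c(172, 167, NA, 202)

mean(heights)                # result is NA because one value is missing
mean(heights, na.rm = TRUE)  # missing value is removed before averaging

avg <- mean(heights, na.rm = TRUE)
avg
```

`na.rm=T` and `na.rm = TRUE` behave the same here, but `TRUE` is a reserved word while `T` is an ordinary variable that can be overwritten, so spelling out `TRUE` is the safer habit.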
  

R Code for starwars 5 Chunk

# summarize sex
summary(my_starwars$sex)

# summarize sex using table
table(my_starwars$sex)

# summarize sex but use as.factor
summary(as.factor(my_starwars$sex))

# save this last summary as an object and print it to screen
(sw_sex_smry <- summary(as.factor(my_starwars$sex)))
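
The sketch below shows, on a small made-up vector, why converting to a factor matters for `summary`.

```r
# Hypothetical values for illustration only
sex_chr <- c("male", "female", "male", NA)

summary(sex_chr)             # character vector: only Length, Class, and Mode
summary(as.factor(sex_chr))  # factor: a count for each level, plus missing values

sex_counts <- summary(as.factor(sex_chr))
sex_counts[["male"]]
```

Wrapping an assignment in parentheses, as in the last line of the chunk above, saves the object and prints it in one step.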
   

HW 1 Exporting Files

Data File Export

  1. Add these two headers. Note that the ## creates a level 2 header (second largest) and ### creates a level 3 header.
## HW 1 Exporting Files

### Data File Export
  2. Create an empty chunk, add the label, code, and comments to the chunk, and run it by clicking the green triangle.

    • You are not responsible for understanding this code now, but it will be a good reference later.

    • If your file structure is correct, this code will:

      • create the specified dataset.

      • save the dataset as a .csv file to the data folder in your HW 1 R Quarto Project.

#|label: create export dataset
  
# select, filter, and mutate commands are part of tidyverse suite
# bmi = weight(kg)/height(m)^2

my_starwars_plot_dat <- my_starwars |>         # my_starwars_plot_dat created for plot
  select(species, sex, height, mass) |>        # select specific variables
  filter(species %in% c("Human", "Droid")) |>  # filter data to humans and droids only
  mutate(bmi = mass/((height/100))^2) |>       # use mutate to create new variable, bmi
  filter(!is.na(bmi)) |>                       # filter data to remove missing BMI values
  mutate(sexF = factor(sex,                    # create factor variable, sexF
                       levels = c("male", "female", "none"),     # specify order (levels)
                       labels =c("Male", "Female", "None"))) |>  # specify labels
  write_csv("data/Star_Wars_Week1.csv")
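
The BMI line in the chunk converts height from centimeters to meters before squaring it. This sketch works through the arithmetic for a single made-up character. The variable names and values are assumptions for illustration.

```r
# Hypothetical character: 172 cm tall with a mass of 77 kg
height_cm <- 172
mass_kg   <- 77

height_m <- height_cm / 100       # convert centimeters to meters
bmi      <- mass_kg / height_m^2  # ^ is applied before /

# same calculation written the way it appears in the chunk
bmi_chunk_form <- mass_kg / ((height_cm / 100))^2
all.equal(bmi, bmi_chunk_form)
```

Because `^` is evaluated before `/`, only the height is squared, which matches the formula bmi = weight(kg)/height(m)^2.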
   

Plot Export

  1. Add this level 3 header.
### Plot Export
  2. Create an empty chunk, add the label, code, and comments to the chunk, and run it by clicking the green triangle.

    • You are not responsible for understanding this code now, but it will be a good reference later.

    • If your file structure is correct, this code will:

      • create the specified formatted plot.

      • save the plot as a .png file to the img folder in your HW 1 R Quarto Project.

#|label: create explot boxplot

sw_box_final <- my_starwars_plot_dat |>                        # first 3 lines create
  ggplot() +                                                   # basic minimal plot
  geom_boxplot(aes(x=species, y=bmi, fill=sexF)) + 
  theme_classic() + 
  labs(title="Comparison of Human and Droid BMI",              # labs specifies text labels
       subtitle="22 Humans and 4 Droids from Star Wars Universe",
       caption="Data Source: dplyr package in R",
       x="",y="BMI", fill="Sex") + 
  theme(plot.title = element_text(size = 20),                  # theme formats plot elements
        plot.subtitle = element_text(size = 15),
        axis.title = element_text(size=18),
        axis.text = element_text(size=15),
        plot.caption = element_text(size = 10),
        legend.text = element_text(size = 12),
        legend.title = element_text(size = 15),
        panel.border = element_rect(colour = "lightgrey", fill=NA, linewidth=2),
        plot.background = element_rect(colour = "darkgrey", fill="grey100", linewidth=2))

ggsave("img/Star_Wars_Week_1_Boxplot.png")                    # exports last plot created
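
After running both export chunks, you can confirm from the console that the two files were written. This is a minimal base R sketch. The helper name `check_exports` is hypothetical, and the paths assume your working directory is the project root.

```r
# Hypothetical helper: report which of the expected files exist.
check_exports <- function(paths) {
  found <- file.exists(paths)
  names(found) <- paths  # label each TRUE/FALSE with its path
  found
}

# Paths are relative to the project root
check_exports(c("data/Star_Wars_Week1.csv",
                "img/Star_Wars_Week_1_Boxplot.png"))
```

A FALSE for either file usually means that the data or img folder is missing or that the chunk was run from a different working directory.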
  

HW 1 - Parts 2 and 3 - Final Steps

Instructions

  1. Save your HW 1 Quarto File

    • You are NOT required to render this file to create an HTML file.

    • You are welcome to try, and a TA or I can assist you.

  2. Answer all 12 Blackboard questions associated with this assignment.

  3. Zip your entire Project Directory into a compressed file and submit it.

    • NOTE: In HW 2 and subsequent HW assignments, you will be asked to create a README file for your R Project, but that is NOT required for this assignment.

Grading Criteria

Pre-class Survey (10%)

These required questions help me gauge your prior knowledge of R and other information that is helpful.

HW 1 - Part 1 (20%)

Completing this set of Blackboard Questions is required and verifies your knowledge of the syllabus and the course hardware and software requirements.

HW 1 - Parts 2 and 3 (70%)

The Blackboard (BB) Questions (1-12) and the submitted R project together make up 70% of HW 1.

  • (12 pts.) Each Blackboard question for HW 1 - Parts 2 and 3 is worth 1 point.

  • (2 pts.) Completing HW 1 - First Steps as specified.

  • (4 pts.) Part 2:

    • 2 points for creating the first two chunks with provided labels, comments, and code

    • 2 points for creating the third chunk with provided label and comments and correctly writing the code as specified in Questions 6 and 7 on Blackboard.

  • (4 pts.) Part 3: Full credit for correctly creating FIVE new chunks, adding labels and adding provided code and comments.

  • (2 pts.) Exporting Files: 2 points for creating the final chunks as specified that save a plot to the img folder and a dataset to the data folder.

  • (2 pts.) Completing the HW 1 - Final Steps and correctly submitting your zipped HW 1 project directory.