Combining Skills and Quiz Review

Author

Penelope Pooler Eisenbies

Published

February 6, 2025

Housekeeping

Reminders from Week 3

HW 3 is Due 2/5/25

  • Practice Questions from Spring of 2023 are Posted.

    • Note: I updated BUA 455 this summer

    • I will talk more about the test on Thursday

    • I have created a second UPDATED version of all practice questions that reflects the test adaptations due to AI availability.

Quiz 1 on Thursday 2/13/25

  • Weeks 1 - 4 (Lectures 1 - 8)

  • HW 1 - 3

Side Trip about piping

%>% vs. |>

  • What’s the difference?

  • For your purposes they are interchangeable, but |> is newer

  • %>% requires the magrittr package, but |> is built into R (version 4.1.0 and later) and needs no package

    • I load magrittr anyway as a precaution in case I need other pipe functions
  • |> may give you an error if you are working on a machine with an old version of R or RStudio

  • |> is slightly more efficient because what the computer does behind the scenes is slightly different

  • More information for those who are interested (not required)
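
As a minimal sketch of the point above, the two pipes give the same result for a simple chain of functions. The values in x are made up for illustration and are not course data.

```r
library(magrittr)    # provides %>%; the native pipe |> needs R 4.1.0 or later

x <- c(1, 4, 9, 16)  # made-up values for illustration

native_result   <- x |> sqrt() |> sum()      # native pipe
magrittr_result <- x %>% sqrt() %>% sum()    # magrittr pipe

identical(native_result, magrittr_result)    # compare the two results
```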

Review: In-class Exercise from Week 3

```{r}
#|label: import and prep bom2023

mojo_23_fall_wknd <- read_csv("data/Box_Office_Mojo_Week3_HW3.csv", show_col_types=F) |>   # import data
  mutate(Month = factor(month,                                                        # create factors
                         levels=c("Jan", "Feb", "Mar", "Apr", "May", "Jun",
                                  "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")),
         Day = factor(day,     
                      levels=c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"),
                      labels= c("M", "T", "W", "Th", "F", "Sa", "Su"))) |>
  select(Month, Day, top10gross) |>                                                  # select variables
  group_by(Month, Day) |>                                                            # group by category
  summarize(max_top10g = max(top10gross, na.rm=T)) |>                                # summarize
  ungroup() |>                                                                       # ungroup
  filter(Day %in% c("F", "Sa", "Su") & Month %in% c("Sep", "Oct", "Nov", "Dec"))     # filter fall wknds
```
`summarise()` has grouped output by 'Month'. You can override using the
`.groups` argument.
```{r}
mojo_23_fall_wknd |> kable()
```
Month Day max_top10g
Sep F 27292508
Sep Sa 32699795
Sep Su 27575389
Oct F 53212742
Oct Sa 47653696
Oct Su 32681165
Nov F 43491965
Nov Sa 41719271
Nov Su 29342997
Dec F 39891139
Dec Sa 40050370
Dec Su 23078184

Completed code from Week 3 Exercise

  • An alternative to the code below is to round the data as desired in the mutate statement before reshaping it with pivot_wider.
```{r}
#|label:  completed code wk 3 exercise

mojo_23_fall_wknd_wide <- mojo_23_fall_wknd |>
  mutate(max_top10g = (max_top10g/1000000) |> round(4)) |>                # convert to millions
  pivot_wider(id_cols=Month, names_from = Day, values_from = max_top10g)  # reshape data

mojo_23_fall_wknd_wide[,2:4] <- round(mojo_23_fall_wknd_wide[,2:4],1)     # round cols 2-4 to one decimal

# mojo_23_fall_wknd_wide[,2:4] <- round(mojo_23_fall_wknd_wide[,2:4])     # round cols 2-4 to whole numbers

mojo_23_fall_wknd_wide |> write_csv("data/Week_4_In_Class_First_Name_Last_Name.csv") # export as .csv 

mojo_23_fall_wknd_wide |> kable()       # create kable table (was not required in Week 3)
```
Month F Sa Su
Sep 27.3 32.7 27.6
Oct 53.2 47.7 32.7
Nov 43.5 41.7 29.3
Dec 39.9 40.1 23.1
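
The alternative mentioned above, rounding inside mutate before reshaping, might look like the sketch below. To stay self-contained, it rebuilds a small tibble from the Sep and Oct rows of the long table shown earlier. The names fall_wknd_small and fall_wknd_small_wide are made up for this example.

```r
library(dplyr)
library(tidyr)

# small stand-in for mojo_23_fall_wknd (Sep and Oct rows only)
fall_wknd_small <- tibble(
  Month = factor(c("Sep", "Sep", "Sep", "Oct", "Oct", "Oct"),
                 levels = c("Sep", "Oct")),
  Day = factor(c("F", "Sa", "Su", "F", "Sa", "Su"),
               levels = c("F", "Sa", "Su")),
  max_top10g = c(27292508, 32699795, 27575389,
                 53212742, 47653696, 32681165)
)

fall_wknd_small_wide <- fall_wknd_small |>
  mutate(max_top10g = round(max_top10g / 1000000, 1)) |>   # millions, one decimal
  pivot_wider(id_cols = Month, names_from = Day,
              values_from = max_top10g)                     # reshape data

fall_wknd_small_wide
```

Because the rounding happens before pivot_wider, no separate rounding step on columns 2-4 is needed afterwards.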

Week 4 In-class Exercises - Q1

NOT ON PointSolutions

If all the columns in a dataset are numeric, you can round the whole dataset at once with the command round(<name of dataset>).

Why wouldn’t that work for the dataset in the previous exercise, mojo_23_fall_wknd_wide?

Hint: To answer this question, you are encouraged to

  • try running the command round(mojo_23_fall_wknd_wide)

  • examine the data using glimpse

Review - Week 1

  • R, RStudio, R Projects, and Quarto files

    • Creating an R Project OR an R Quarto Project

    • Creating data and img folders.

  • Selecting data rows and columns by location using square brackets

  • Examining data using summary, unique, and table

  • Data types:

    • numeric (<dbl>, <int>)

    • character (<chr>)

    • logical (<lgl>)

    • factor (<fct>, <ord>)

      • In Week 3 we discussed how to convert character variables to factors.

      • Numeric variables can also be converted to factors.
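
A short sketch of these Week 1 skills, using the built-in mtcars data. The name my_cars and the choice of the gear variable are only for illustration.

```r
my_cars <- mtcars                          # hypothetical name for this sketch

my_cars[1:3, c(1, 10)]                     # rows 1-3, columns 1 (mpg) and 10 (gear)

summary(my_cars$mpg)                       # numeric summary of one variable
unique(my_cars$gear)                       # distinct values
table(my_cars$gear)                        # count of each value

my_cars$gear_f <- factor(my_cars$gear)     # convert a numeric variable to a factor
class(my_cars$gear)                        # "numeric"
class(my_cars$gear_f)                      # "factor"
```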

Review - Week 2

  • Review of Week 1 PLUS

  • dplyr package commands to select, modify, and summarize data:

    • select - used to select variables

    • filter - used to filter observations by their values

      • Can be used with

        • numeric values

        • character values

        • factor levels

    • slice - used to filter or select observations by location

    • mutate - used to modify variables or create new variables

    • factor - used to create a factor variable from another variable (a base R function, typically used inside mutate)
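
The commands above can be chained together. The sketch below uses the built-in mtcars data; the names cars_small and cars_first2 and the cutoff of 25 mpg are made up for illustration.

```r
library(dplyr)

cars_small <- mtcars |>
  select(mpg, hp, gear) |>                 # select variables
  filter(mpg > 25) |>                      # keep observations by value
  mutate(hp_per_mpg = hp / mpg,            # create a new variable
         gear = factor(gear))              # convert numeric to factor

cars_first2 <- cars_small |> slice(1:2)    # keep observations by location (rows 1-2)
```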

Review - Week 3

  • Review of Weeks 1 and 2 PLUS

  • Coercion commands to coerce a variable to the type needed

    • as.integer, as.numeric, as.character
    • HW 3 included as.integer
    • Week 3 included a preview demo of as.numeric
  • dplyr commands

    • group_by and filter
    • group_by and summarize
  • Commands to reshape data:

    • pivot_wider and pivot_longer
  • Display data table using kable()
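
A compact sketch of the Week 3 skills, again with the built-in mtcars data. The names mpg_smry, mpg_wide, and mpg_long and the am_ column prefix are made up for this example.

```r
library(dplyr)
library(tidyr)

# coercion
as.integer("42")       # character to integer
as.numeric("3.5")      # character to numeric
as.character(10)       # numeric to character

# group_by and summarize; .groups = "drop" ungroups the result
# and avoids the message about grouped output
mpg_smry <- mtcars |>
  group_by(gear, am) |>
  summarize(mean_mpg = mean(mpg), .groups = "drop")

# reshape wide: one column per value of am
mpg_wide <- mpg_smry |>
  pivot_wider(id_cols = gear, names_from = am,
              values_from = mean_mpg, names_prefix = "am_")

# reshape back to long format
mpg_long <- mpg_wide |>
  pivot_longer(cols = c(am_0, am_1),
               names_to = "am", values_to = "mean_mpg")

knitr::kable(mpg_wide)   # display the wide table
```

Combinations of gear and am that do not occur in the data appear as NA in the wide table.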

Review - ggplot

  • ggplot geometries (geom) covered so far:

    • boxplot: geom_boxplot
    • barplot: geom_bar
    • scatterplot: geom_point
    • line plot: geom_line
    • area plot: geom_area
```{r eval=F}
#|label: ggplot review
#|include: false
set.seed(999)                  # set seed so the random sample is reproducible
my_diamonds <- diamonds |> 
  slice(sample(1:53940, 1000)) # example dataset

# print to screen without saving
my_diamonds |> ggplot() + 
  geom_point(aes(x=carat, y=price, 
                 color=clarity))

# save but don't print to screen
diamonds_plot <- my_diamonds |> ggplot() + 
  geom_point(aes(x=carat, y=price, 
                 color=clarity))

# save AND print to screen, 
# enclose all plot code in parentheses
(diamonds_plot <- my_diamonds |> ggplot() + 
  geom_point(aes(x=carat, y=price, 
                 color=clarity)))

# export most recent ggplot to img folder
ggsave("img/diamonds_plot_Week4.png", 
       width=6, height=4)
```

Format of Quiz 1

  • Students will have 70 minutes

  • Students with a time accommodation: we will schedule an alternative.

    • Tentative Time: Friday 2/17 at 1:00 PM
  • All students must work alone.

  • The quiz is intended to be long, and students may not finish.

    • All questions are equally weighted and independent.
    • There will be approximately 7-9 multi-part questions on Blackboard.
    • Each question will have multiple versions.
    • The questions will include instructions and may include some partial R code embedded within the Blackboard question.

Format of Quiz 1 Continued

  • You will be provided with an R project that contains a Quarto file template and data and img folders to complete your work.

  • For each question you will:

    • Copy and paste the provided R code into the provided Quiz 1 template in the provided Quiz 1 R project.
    • Complete the R code in the Quiz 1 template and save your work.
    • Answer the question on Blackboard.

Grading of Quiz 1

  • Grading will take a little time. In addition to your Blackboard answers you are required to submit

    • your Quarto file
    • specified data or plot files
    • NOTE: YOU DO NOT SUBMIT A ZIPPED PROJECT for the quiz.
  • I cannot give you full credit if you do not show your work in the provided Quarto file.

  • Reminder: You can use different code than what is taught and will receive full credit if the result is correct.

  • For each question, the grade will be tallied as follows:

    • R code (.qmd file 10%) - quick check
    • Blackboard answers (90%)
  • Quiz 1 is worth 22.5% of your final grade in this course.

Practice Questions for Quiz 1

  • There is a set of 14 practice questions posted on Blackboard.

  • Quiz 1 questions will be similar to these and will use these same datasets or similar ones.

  • If other data are used:

    • I will post an announcement, so you can examine the data and documentation before the quiz.

Before Thursday

  • Download and save the available practice questions

  • Look over Blackboard Questions

    • Take notes on what is not clear for you.
    • I will answer questions on Thursday.
  • Video Playlist of available questions is posted

  • I will add a few new versions of these questions written to address AI.

    • Videos of new questions will be posted this weekend.

Before Quiz 1:

  • Work through all the practice questions and write the code with comments to make sure you understand it.

  • Quiz 1 is open notes so you can use the code you create for the Practice Questions.

  • Make sure that your laptop has up-to-date versions of R and RStudio

  • Make sure all packages listed in the setup for Quiz 1 are successfully installed and loaded in R on your laptop.

  • You can use AI during the test, but again, many questions will be modified so that AI cannot answer the question without your understanding.
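
One way to check the R version and package installation mentioned in the list above is sketched below. The package list here is a hypothetical placeholder; use the packages named in the Quiz 1 setup chunk instead.

```r
# R version: the native pipe |> needs R 4.1.0 or later
R.version.string
getRversion() >= "4.1.0"

# hypothetical package list; replace with the packages in the Quiz 1 setup chunk
pkgs <- c("tidyverse", "knitr", "magrittr")

# TRUE for each package that is installed
installed <- sapply(pkgs, requireNamespace, quietly = TRUE)
installed

pkgs[!installed]   # packages listed here still need install.packages()
```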

Examining Data and R Help files

Throughout the practice questions, you are asked to:

  • Examine data help files
  • Examine data using glimpse
  • Examine Data in the Global Environment
  • Examine and sort the data.
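
Sorting can also be done in code instead of by clicking. The sketch below is one possible approach, assuming the tibble package is installed; the name cars_by_hp and the choice to sort by hp are made up for illustration.

```r
library(dplyr)
library(tibble)

cars_by_hp <- mtcars |>
  rownames_to_column("model") |>     # car names are stored as row names in mtcars
  arrange(desc(hp)) |>               # sort from highest to lowest horsepower
  select(model, hp)

cars_by_hp |> glimpse()              # compact view of variables and types
cars_by_hp |> slice(1:3)             # first three rows after sorting
```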

Examining data help files

  • Type ?mtcars in the R Console (lower left pane) and press Enter.

  • Documentation will appear in the lower right Help window.

Examining Data with glimpse

```{r mtcars}
my_mtcars <- mtcars |> glimpse()                # save data to Global Environment and examine
# ?mtcars                                       # request data help file
```
Rows: 32
Columns: 11
$ mpg  <dbl> 21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19.2, 17.8,…
$ cyl  <dbl> 6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, 4, 4, 8,…
$ disp <dbl> 160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, 146.7, 140.8, 16…
$ hp   <dbl> 110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 123, 180, 180, 180…
$ drat <dbl> 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3.92, 3.92,…
$ wt   <dbl> 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, 3.150, 3.…
$ qsec <dbl> 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.00, 22.90, 18…
$ vs   <dbl> 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0,…
$ am   <dbl> 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0,…
$ gear <dbl> 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4, 3, 3,…
$ carb <dbl> 4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4, 1, 2, 1, 1, 2,…

Week 4 In-class Exercises Q2-Q3

Session ID: bua455s25

Once the dataset is saved to the Global Environment you can click on it.

  • Click on dataset name in Global Environment.

    • This will open the dataset in a tab in the upper left pane (in the default RStudio layout).

    • Click on tab to view data.

    • Click on variables to sort them.

Q2. The mtcars dataset is saved as my_mtcars (my_pq1_cars) in the Global Environment.

This dataset has ____ observations (rows).


Q3. Examine this data set in the Global Environment to answer this question:

The car with the LOWEST fuel efficiency (mpg) is the _______.

Week 4 In-class Exercises Q4

Session ID: bua455s25


How many categories are in the cyl (cylinder) variable in the my_mtcars (my_pq1_cars) dataset you created?

Datasets in Quiz 1 Practice Questions

  • Chunk 1 of your Practice Questions Template Quarto file is the setup chunk.

    • Running Chunk 1 will load (and install if needed) the required packages.
  • The code below is Chunk 2.

    • Running Chunk 2 will save these datasets to your Global Environment.

```{r}
#|label: save datasets to global environment                                                            

# save R datasets for Quiz 1 to Global Environment
my_mtcars <- mtcars
my_diamonds <- diamonds
my_starwars <- starwars
my_orange <- Orange

# import these two summary datasets 
# mn_numreleases is the mean number of releases in movie_smry_w1
movie_smry_w1 <- read_csv("data/Movie_Summary_Wide_1.csv", 
                          show_col_types = F)
# mn_top10gross is the mean gross of the top 10 movies in movie_smry_w2
movie_smry_w2 <- read_csv("data/Movie_Summary_Wide_2.csv", 
                          show_col_types = F)

# mojo1999 is the full year of movie data from 1999
mojo1999 <- read_csv("data/Box_Office_Mojo_1999.csv", show_col_types = F)
```

Overview Of Practice Questions

  • 2 Questions about mtcars saved as my_mtcars (my_pq1_cars)

  • 3 Questions about diamonds saved as my_diamonds (my_pq1_diamonds)

  • 1 Question about starwars saved as my_starwars (my_pq1_stwars)

  • 1 Question about Stwars_smry_pq1.csv imported as starwars_smry (stwars_smry_pq1)

  • 3 Questions about Orange saved as my_orange (my_pq1_orange)

  • 2 Questions about 2 imported Movie Datasets

    • Movie_Summary_Wide_1.csv imported as movie_smry_w1
    • Movie_Summary_Wide_2.csv imported as movie_smry_w2
  • 2 Questions about 1 imported full year of Movie data, imported as mojo1999.

Questions are designed to be short, but Quiz questions will be a little shorter.

Practice Question 1 (abridged)

Examine the my_mtcars dataset using glimpse.

  • Save a new version of my_mtcars to a new name such as my_mtcars1

  • Filter the new dataset to only include cars with BOTH a straight engine (variable is vs) and an automatic transmission (variable is am).

  • Examine the new filtered data set in the Global Environment or using glimpse.

  1. How many rows are in the original my_mtcars dataset?

  2. mtcars is an older dataset, so although there are many kinds of variables, they are all coded as one type of variable for simplicity.

    • All of the variables in the my_mtcars dataset are type ____?
  3. How many rows are in this new filtered dataset you created?

  4. Within the new filtered dataset, the highest mpg (miles per gallon) is ____.

Practice Question 1 (UPDATED)

  • Examine the my_pq1_cars dataset.

    • How many rows are in this dataset?

    • What type of variables are in this dataset?

    • What type of values are these?

  • Save a new version of my_pq1_cars to a new name and filter the new dataset to only include cars with BOTH a straight engine and an automatic transmission.

  • Then examine the new dataset you saved.

    • How many rows are in the new dataset?

    • Within the new dataset, what is the highest miles per gallon?

Plan for Thursday

  • I posted a video playlist for the practice questions

  • You should

    • Finish HW 3

    • Read through the practice questions.

      • Take notes of questions you don’t understand

      • Let me know if you see typos

    • Note the new questions I will post before Thursday.

Time for Questions

  • Questions about Practice Questions and Quiz 1

  • Individual Questions about HW 3

Key Points from This Week

Review of Weeks 1 - 3:

  • Use Practice Questions to Guide your Review

  • ALSO review Lecture Notes and HW assignments

  • Make sure you are comfortable with downloading, unzipping and saving R projects to your computer.

  • Come with questions about any and all skills and concepts we have covered

  • If there are few to no questions

    • I will use the Practice Questions to guide the lecture
    • I will remind you of key details
  • There will be polling questions

You may submit an ‘Engagement Question’ about each lecture until midnight on the day of the lecture. A minimum of four submissions is required during the semester.