HW 5 - Part 1 Instructions
Due 3/26/25
Introduction
HW Assignment 5 - Part 1 will give you experience with:
- Creating/modifying a Quarto dashboard based on a similar example.
- Citing source data.
- Executing data management tasks such as:
  - Reshaping data using `pivot_wider` and `pivot_longer` (Review).
  - Filtering rows and selecting variables (Review).
  - Modifying variable formats using `ifelse` and `factor` (`ifelse` is new).
- Summarizing and presenting data in a table using `kable` (Review).
Instructions
HW 5.1 - First Steps
Steps to Follow:
- Download, unzip, and save to your laptop the provided HW 5 - Part 1 R project.
- Rename the project folder to include your name, e.g. `HW 5 - Part 1 - Penelope Pooler`.
- This R project has the data saved to the data folder and has a `custom.scss` file that can be used to modify dashboard defaults (advanced).
- Notice that a template Quarto (`.qmd`) file is also included in the R project.
- Create a duplicate of this template Quarto file within the project.
  - Select the Quarto file, click the `More` option in the `Files` pane, and then click `Copy...`.
  - You will delete the original template when you are done, but it is good to have in case you corrupt the provided code.
- Change the file name of the copied `.qmd` file to be the following with your name: `HW5_Part1_Dashboard_FirstName_LastName`.
- In the header of the dashboard, change the title to be `HW 5 - Part 1 - FirstName LastName`.
  - Note that in the video demo playlist I add my name to the .qmd file in the ‘Final Steps’ video.
- Copy and paste chunks one through five from the provided text file, `setup_data_mgmt_chunks.txt`, directly under the header, before the line that says `## Nextflix and Amazon Stock Values`.
- Run all five of these chunks.
  - This code was covered in class.
  - Notice that the chunk options for a dashboard are different because all code and extraneous output are hidden.
- At the end of Chunk 5:
  - Remove the `#` before `nflx_mv <-`.
  - Use the code directly above this line as an example to filter the data to only movies:
    - filter condition: `type == "Movie"`
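The filter step above can be sketched on a small made-up dataset. This is only an illustration: `nflx_toy` and `nflx_mv_toy` are hypothetical names, and the real data and object names come from the provided Chunk 5.

```r
library(dplyr)

# Hypothetical toy data standing in for the Netflix titles data.
nflx_toy <- tibble(
  title = c("Show A", "Film B", "Film C", "Show D"),
  type  = c("TV Show", "Movie", "Movie", "TV Show")
)

# Keep only the rows where type is exactly "Movie".
nflx_mv_toy <- nflx_toy %>%
  filter(type == "Movie")

# dim() reports the number of rows, then the number of columns.
dim(nflx_mv_toy)
```

Running `dim()` (or `glimpse()`) on your own `nflx_mv` dataset is one way to check its size.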
BB Question 1
Fill in the blanks:
The `nflx_mv` dataset has ____ rows and ____ columns.
HW 5.1 - Page 1
Stock Data
Steps to Follow:
The first page of the example dashboard compares Netflix stock data to Amazon. You are going to change the Amazon stock information to AMC because your dashboard will focus on movie content.
- Change the page header to read `# Nextflix and AMC Stock Values`.
- In Chunk 6, `import stock data`, change `AMZN` to `AMC` throughout this chunk.
- In the text after `## Row`, update the text to indicate the dashboard will show AMC stock data instead of Amazon stock data.
- In Chunk 9, the third value box chunk, change `Amazon` to `AMC` and `AMZN` to `AMC`.
- In Chunk 11, `pg1 amzn stock trends`, change `amzn` to `amc` and `AMZN` to `AMC` throughout this chunk, including in the header.
BB Question 2
What type of dataset are the NFLX data when they are imported into R from Yahoo Finance?
BB Question 3
Given that these data are a time series, where is the time or date information located in the dataset?
BB Question 4
On June 2, 2021, AMC’s stock was at its highest value in this timespan. On June 2, 2021, which stock, Netflix or AMC, was valued higher?
- Use the plots to answer this question.
HW 5.1 - Page 2 - Part 1
Bar Chart Data Mgmt.
Steps to Follow:
- Change `tv` to `mv` in chunk headers and throughout these two chunks and change the page header to: `# Bar Chart of Movie Trends`.
- In Chunk 12, `pg2 nflx mv release period data mgmt`:
  - Complete the `release_period = ifelse()` statement in the mutate command to group data from `"2001-2005"` and `"2006-2010"` into one category, `"2001-2010"`:
    `mutate(release_period = ifelse(release_period %in% c("2001-2005", "2006-2010"), "2001-2010", release_period))`
  - Notice that in the Netflix TV dashboard, data are filtered to the three most recent release periods: “2001-2010”, “2011-2015”, and “2016-2021”.
  - In the filter command in the data management chunk, add one more release period: `"1981-2000"`.
  - Create a factor variable, `min_ageF`, from the variable `min_age` with these factor levels: `levels = c(0, 7, 13, 17)`.
  - The current levels and labels for `genreF` are not in order of prevalence for movies.
  - The correct order (based on most recent time period) for the movies data from most prevalent to least is:
    - International
    - Drama
    - Comedies
    - Documentaries
    - Kids
    - Action and Adventure
  - current levels: `"international", "dramas", "action_adventr", "comedies", "kids", "docs"`
  - current labels: `"Int","Dr","A/A","C","K","Do"`
  - Reorder the genre `levels` and `labels` in the R code so that the categories are in the order of prevalence shown above.
- In Chunk 13, change `tv` to `mv` in the header so that the header title is `pg2 nflx mv release period bar chart`, and then make the following changes to the `labs` command in the plot code:
  - In the plot subtitle in the `labs` command:
    - Update the order of the genres. Note there are 3 spaces between each genre in the subtitle.
    - Change `Docuseries` to `Documentaries`.
  - In the plot title and y-axis label, change ‘TV Shows’ to ‘Movies’.
NOTES:
- After completing the `pg2 nflx mv release period data mgmt` chunk (Chunk 12), remove `eval=F` from BOTH the bar plot chunk (Chunk 13) and the summary table chunk (Chunk 14).
- Figure dimensions, `fig.dim = c(10, 5)`, should be left as is to utilize available space.
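The Chunk 12 data management steps (recode with `ifelse`, filter to chosen periods, build factors) can be sketched on made-up data. The dataset `toy`, the variable `size`, and its levels and labels are hypothetical and chosen only to show the mechanics; they are not the assignment's genre values.

```r
library(dplyr)

# Hypothetical toy data; release_period and min_age mirror the names used above.
toy <- tibble(
  release_period = c("1981-2000", "2001-2005", "2006-2010", "2011-2015", "1961-1980"),
  min_age        = c(13, 0, 17, 7, 13),
  size           = c("large", "small", "medium", "small", "large")
)

toy_plot <- toy %>%
  # ifelse(test, yes, no): merge two periods into one, leave all others unchanged.
  mutate(release_period = ifelse(release_period %in% c("2001-2005", "2006-2010"),
                                 "2001-2010", release_period)) %>%
  # Keep only the listed release periods; "1961-1980" is dropped here.
  filter(release_period %in% c("1981-2000", "2001-2010", "2011-2015")) %>%
  mutate(
    # levels sets the category order of the factor.
    min_ageF = factor(min_age, levels = c(0, 7, 13, 17)),
    # labels are matched to levels by position, so both must be reordered together.
    sizeF = factor(size,
                   levels = c("small", "medium", "large"),
                   labels = c("S", "M", "L"))
  )

levels(toy_plot$min_ageF)
levels(toy_plot$sizeF)
table(toy_plot$release_period)
```

Because labels follow levels by position, moving a level without moving its label would mislabel the bars, which is why the grading criteria require labels to match levels.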
HW 5.1 - Page 2 - Part 2
Summary Table
Steps to Follow:
- Change `tv` to `mv` in this chunk (Chunk 14).
- Complete the summary table code chunk so that the summary table appears in the right side panel next to the bar plot:
  - Complete the `select` command to select these variables: `release_period, genreF, n`
  - Complete the `group_by` command to group data by these variables: `release_period, genreF`
  - Complete the `summarize` command to sum n (number of movies): `n=sum(n)`
  - Complete the `pivot_wider` command to:
    - Maintain release_period as is: `id_cols = release_period`
    - Create a column for each genre: `names_from = genreF`
    - Use the values from n for each genre column: `values_from = n`
    - Note that these options in pivot_wider must all be separated by commas.
  - Enter the name of the summary dataset, `nflx_smry1`, in the `kable()` command to output the table.
  - See the completed example from class using Netflix TV Data and the demo video.
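The select, group, summarize, and pivot sequence described above can be sketched as follows. The data `toy_counts` and the result `toy_smry` are hypothetical, and `.groups = "drop"` is an addition not mentioned in the steps; it only removes the leftover grouping after summarizing.

```r
library(dplyr)
library(tidyr)
library(knitr)

# Hypothetical toy counts; the real input is the plot dataset built in Chunk 12.
toy_counts <- tibble(
  release_period = c("2011-2015", "2011-2015", "2011-2015", "2016-2021", "2016-2021"),
  genreF   = factor(c("Int", "Int", "Dr", "Int", "Dr"), levels = c("Int", "Dr")),
  min_ageF = factor(c(0, 13, 13, 17, 7), levels = c(0, 7, 13, 17)),
  n        = c(2, 3, 4, 5, 6)
)

toy_smry <- toy_counts %>%
  select(release_period, genreF, n) %>%          # drop min_ageF
  group_by(release_period, genreF) %>%           # one group per period and genre
  summarize(n = sum(n), .groups = "drop") %>%    # total count within each group
  pivot_wider(id_cols = release_period,          # one row per release period
              names_from = genreF,               # one column per genre
              values_from = n)                   # cells hold the summed counts

# kable() formats the summary dataset as a table.
kable(toy_smry)
```

Summing before pivoting matters here: the minimum age categories are collapsed so that each period and genre pair contributes exactly one cell to the wide table.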
BB Question 5
The final filtered dataset used to create the barplot is `nflx_mv_plot1`. This dataset has:
- ____ categories in the `release_period` variable
- ____ categories in `min_ageF`, the minimum age factor variable
- ____ categories in `genreF`, the genre factor variable
BB Question 6:
Based on the barplot and summary table, which genre has the most movies in the three most recent release periods?
HW 5.1 - Page 3
Area Plot
Steps to Follow:
- Change `tv` to `mv` in these last two chunks (Chunks 15 and 16) and change the page header to: `# Netflix Movies Added Each Year`.
- Run the provided data management code chunk, Chunk 15, `pg3 nflx mv area plot data mgmt`.
- Answer Blackboard Question 7 (BB Question 7) based on the dataset used to create the plot in Panel 3, `nflx_mv_plot2`.
BB Question 7:
After completing the data management steps, the final dataset used for the plot, `nflx_mv_plot2`, has
- ____ rows.
- ____ columns.
- ____ different years in the `year_added` variable.
- Complete the `geom_area()` statement as follows:
  - Add the aesthetic command within the parentheses: `aes()`.
  - Within the aesthetic command, `aes()`, specify the following:
    - x is year_added: `x = year_added`
    - y is total: `y = total`
    - fill is min_ageF: `fill = min_ageF`
  - NOTE: `x`, `y`, and `fill` should be separated by commas.
  - See the completed example from class using Netflix TV Data and the demo video.
- Complete the `scale_x_continuous()` command with a `breaks` option:
  - Within the parentheses add: `breaks =`
  - The x-axis should show every year from 2013 to 2021.
  - One solution: use the `seq()` command, e.g. `seq(2013, 2021, 1)`
  - See the completed example from class using Netflix TV Data and the demo video.
- In the plot title and in the y-axis label in the `labs` command, change ‘TV Shows’ to ‘Movies’.
- Remember to remove `eval=F` from this chunk (Chunk 16, `pg3 nflx tv area plot`).
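A minimal sketch of a stacked area plot with yearly x-axis breaks, assuming made-up data: `toy_area`, its values, and the axis labels are hypothetical, and the provided chunk may contain additional layers not shown here.

```r
library(ggplot2)

# Hypothetical toy data with the three variables named in the steps above.
toy_area <- data.frame(
  year_added = rep(2013:2021, times = 2),
  min_ageF   = factor(rep(c(0, 13), each = 9), levels = c(0, 7, 13, 17)),
  total      = c(1, 2, 4, 6, 9, 12, 15, 14, 10,
                 2, 3, 5, 8, 12, 16, 20, 18, 13)
)

toy_area_plot <- ggplot(toy_area) +
  # x, y, and fill are mapped inside aes(), separated by commas.
  geom_area(aes(x = year_added, y = total, fill = min_ageF)) +
  # seq(2013, 2021, 1) yields one break per year.
  scale_x_continuous(breaks = seq(2013, 2021, 1)) +
  labs(x = "Year Added", y = "Number of Titles", fill = "Minimum Age")

toy_area_plot
```

Without the `breaks` option, ggplot2 picks its own tick positions, which typically would not show every year.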
Optional Extra Credit (2 pts.)
NOTE: There is no partial credit on this extra credit, but this is not required.
The purpose of this Extra Credit is to experiment with dashboard themes, plot themes, and colors to examine choices and see what works well.
For 2 Extra Points:
- Change the dashboard theme.
  - In the header, the current theme is United: `theme: [United, custom.scss]`.
  - NOTE: For HW 5, do not choose a dark theme, which is not compatible with the default highchart.
- Change these two aspects in the Page 2 and Page 3 plots (plots must match):
  - the plot theme (chosen theme should NOT be theme_classic OR default)
  - the `palette =` option in the scale_fill_brewer commands (should not be “Spectral” OR the default palette)
- The plot theme for both plots must match and fit each plot, i.e., not obscure any plot elements in either plot.
- The palette chosen must show all 4 colors clearly and cannot be the R default.
- There is no “right answer”, but if you choose a theme that makes some of the plot elements, e.g. legend, titles, not visible, you will not get credit.
- If you choose a palette with colors that are not clearly visible or distinguishable, you will not get credit.
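As a rough sketch of where the plot theme and palette are set, the example below uses made-up data. `theme_bw()` and the `"Dark2"` palette are arbitrary picks for illustration only, not recommended answers, and `toy_bars` is a hypothetical dataset.

```r
library(ggplot2)

# Hypothetical toy data with four minimum age categories.
toy_bars <- data.frame(
  genre    = rep(c("A", "B"), each = 4),
  min_ageF = factor(rep(c(0, 7, 13, 17), times = 2), levels = c(0, 7, 13, 17)),
  n        = c(3, 5, 2, 4, 6, 1, 4, 2)
)

toy_bar_plot <- ggplot(toy_bars) +
  geom_col(aes(x = genre, y = n, fill = min_ageF)) +
  # palette = selects a ColorBrewer palette by name.
  scale_fill_brewer(palette = "Dark2") +
  # A complete theme changes the background, grid lines, and text styling.
  theme_bw()

toy_bar_plot
```

Rendering the plot after each change is the simplest way to confirm that all four colors, the legend, and the titles remain visible.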
Here are some helpful links:
HW 5.1 - Final Steps
- Once all code is complete and runs without errors, render the Quarto (`.qmd`) file to create the dashboard.
  - Don’t forget to remove `eval = F` from chunk headers (Chunks 13, 14, and 16).
- Verify that your project folder includes:
  - HW 5 - Part 1 Quarto (`.qmd`) file to create dashboard saved with your name.
  - HW 5 - Part 1 Dashboard (`.html`) file saved with your name.
  - a `custom.scss` file that adjusts the box size and font size of the value boxes.
  - a `data` folder that contains the data file, `netflix_titles.csv`.
  - an empty `img` folder.
  - an `.Rproj` file.
- Save the provided README template to your project folder and update it to list all of the files and folders above.
  - You do not have to list files or folders that are not listed above.
- Zip (Compress) the project directory to submit it.
- Answer all Blackboard Questions (7 Questions).
Grading Criteria
- (14 pts.) Each Blackboard question for this assignment is worth 2 points.
Dashboard Creation Steps:
- (3 pts.) Completing HW 5.1 - First Steps as specified.
- (2 pts.) Page 1 - Stock Page: Full credit for
  - correctly updating Chunks 6, 9, and 11 from Amazon data to AMC data.
  - timespan should be 2013-01-01 to 2021-12-31
- (4 pts.) Page 2 - Part 1 - Bar Plot: Full credit for correctly following all steps to
  - create the barplot showing each movie genre in a separate bar
  - have movie genres ordered by prevalence in the 2016-2021 release period
  - have bars correctly labeled as specified (labels must match levels)
  - have stacked colors showing movies for each minimum age category
  - have a different panel for each release period (4 panels)
  - have all plot text and accompanying text appearing correctly in dashboard
- (4 pts.) Page 2 - Part 2 - Summary Table: Full credit for correctly following all steps to
  - create correctly formatted and labeled table
  - place table and accompanying text correctly in right panel next to plot
- (4 pts.) Page 3 - Area Plot: Full credit for correctly following all steps to
  - create an area plot with a correctly labeled X-axis (each year showing)
  - show each minimum age category
  - have all parts of the plot labeled correctly
  - have accompanying text appearing correctly in dashboard
- (2 pts.) Completing OPTIONAL EXTRA CREDIT as specified.
  - There is no partial credit for the extra credit but this is not required.
- (2 pts.) Completing the HW 5 - Part 1 - Final Steps as specified and correctly submitting your zipped project directory.