Weeks 7 and 8

Time Series, Data Formats, Output Formats, Project Introduction

Author

Penelope Pooler Eisenbies

Published

October 8, 2025

Housekeeping

Final grading in this course:
- adheres to Whitman grading policy, but is fairly gentle.
- takes into account assignments, course project, and class participation.
Quiz 2 will be during Week 11 and will combine previous skills with material from Weeks 6 through 10.
- It will be similar to Quiz 1 but may have more questions and more steps in multi-step tasks.
If you have questions about your quiz, please let me know.

HW 4 is posted and is due on Wednesday, 10/15/25.

HW 4 - Part 1 is due on Wednesday 10/8/25 and is required in order for you to complete this course.
There are no office hours on Thursday this week, 10/9/25.

BUA 455 Group Dashboard Project

Group Assignments

Complete HW 4 - Part 1 TODAY, 10/8! (This should only take 5 min.)
Note: If you do not complete this survey, I will not put you in a project group and you cannot pass this class.
Groups of 5 or 6 will be determined and posted (hopefully by Monday).
If you have a request to work with someone, include that information in your survey (Not required).
Friday, 10/10, is the last day I will accept any group requests.
I cannot guarantee that requests will be honored, but I will try.
I control group assignments to maintain some balance in skill level among groups.

BUA 455 Group Dashboard Project Information

Project Description
Interesting Data
- Students are also required to use AI tools to find data.
- I will provide a short demo of going from an obscure idea to a good semi-related dataset using AI.
Last year, I adapted the course to use the Quarto Dashboard because it became available in the late spring of 2024.
- I have posted examples from last year to give you ideas.
- Quarto provides a lot of flexibility BUT requires a little patience and iterative editing.
- Preview of HW 5 - Part 1 Example Using Quarto Dashboard

Upcoming Dates

Groups assigned by Wednesday, 10/15 at the latest.
Thu. 10/30 at 5:00 PM: Draft Proposals Due - NO GRACE PERIOD
- Proposals should be in bullet point format and include links to data sources
- It should take me 5 minutes to read your proposed ideas and check your data.
Proposal Meetings:
- Recommended but not required: Come with questions and be prepared to answer my questions (5-15 min. per group)
- Meetings will take place outside of class. See sign-up sheet when it is posted.
Wed. 10/29: HW 5 - Part 1 Due
Thu. 11/6: Quiz 2
Tue. 11/11: Final Proposals Due
- Not much longer than the draft proposal and also in bullet point format.
- Questions and issues discussed during meeting should be addressed.

Reminders about HW 4

In Chunk 6 (Part 5), the chunk header in the template appears as follows:

The eval=F prevents this chunk from being evaluated when it is knit.
eval=F was included in the template because original code was incomplete.
Remember to remove the text eval=F
Other helpful chunk header options for dashboard: echo=F, include=F
Chunk options can also be included as #| lines at the top of a chunk:
- e.g. #|label: import data and #|echo: false. See Quarto Cheat Sheet (https://rstudio.github.io/cheatsheets/quarto.pdf)
NOTE: If two chunks have the SAME name or label, the file will not render.
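
As a rough sketch of the #| option style, the chunk body below carries its options as comment lines at the top. The label import-data and the toy data frame are illustrative assumptions and are not taken from the HW 4 template.

```r
#| label: import-data
#| echo: false
#| eval: true

# The three lines above give the chunk a unique label, hide the code in the
# rendered output, and make sure the chunk runs (the opposite of eval=F).
demo <- data.frame(id = 1:3, value = c(10, 20, 30))  # hypothetical toy data
nrow(demo)
```

Every chunk in a file needs its own label, so a second chunk would need a different name such as import-data-2.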

Quarto Output Formats

So far, all Quarto files in this course have been rendered as HTML (.html) files or slides
- All slides for this course are created in Quarto.
Other common formats are Word documents, PDF documents, PowerPoint slides, and dashboards
- This Quarto Reference site (https://quarto.org/docs/reference/) shows all the possible formats and provides details.
We will use the dashboard (next slide) format in HW 5 and in your projects.
Groups will also write their two project memos in Quarto and publish them as Word documents.
- Writing the memos in Quarto files simplifies formatting citations for R, RStudio, and packages.

Quarto Dashboards

REQUIRED: Download the latest version of Quarto here: https://quarto.org/docs/get-started/
- You will not be able to complete HW 5 without having Quarto installed on your computer.
Quarto Dashboard is a new feature of Quarto that is extremely flexible and straightforward to use.
The Quarto Dashboard Gallery includes example dashboards made with R, Python, and other languages.
- In this course I will provide a simple template for HW 5 that can be used to build your dashboard.
- Once you understand how to add pages, rows, columns, and tabsets, and modify them as needed, you are welcome to tailor the template to your project.
- A Quarto dashboard is a flexible blank canvas that you can tailor to your project and future endeavors.

Types of Time Series Data in R

In recent weeks, we have worked with Box Office Mojo and Bureau of Labor Statistics Data
These datasets are time series data.
They all include a date variable and another quantitative variable that changes at each time period.
So far we have worked with data in an R format called a tibble.
Two common data formats in R, tibble and data.frame, are needed for creating ggplots of time series.
- tibble is the more modern format and is more compatible with tidyverse commands to manage data.
Today, we’ll discuss a third data format, xts, that can be used specifically for time series data.
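
To make the three formats concrete, here is a minimal sketch that stores the same three made-up prices as a data.frame, a tibble, and an xts object. The dates, values, and object names are hypothetical and chosen only for illustration.

```r
library(tibble)
library(xts)

dates <- as.Date(c("2025-01-02", "2025-01-03", "2025-01-06"))
price <- c(10.5, 10.8, 10.2)  # hypothetical values

px_df  <- data.frame(date = dates, price = price)  # base R format
px_tbl <- tibble(date = dates, price = price)      # tidyverse format
px_xts <- xts(cbind(price = price), order.by = dates)  # time series format

class(px_df)   # data.frame
class(px_tbl)  # a tibble is also a data.frame
class(px_xts)  # xts builds on the zoo class

ncol(px_tbl)   # 2: date is a column in the tibble
ncol(px_xts)   # 1: the dates are stored as the index instead
```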

Importing Stock Data as `xts` using `tidyquant` Package

Yahoo Finance, the Federal Reserve Bank, the Wall Street Journal, and others are excellent data sources that can be directly imported into R.
- The default source for getSymbols (a quantmod function that is loaded with the tidyquant package) is Yahoo Finance.
- The data format is xts, which we will cover today.

```{r}
#|label: importing data from yahoo finance
#|output: false

# download data for Netflix, Amazon, Disney
# time series starts the day after the `from` date specified
# time series ends the day before the `to` date specified
 
getSymbols("NFLX", from = "2016-01-01", to = "2025-10-01")
getSymbols("AMZN", from = "2016-01-01", to = "2025-10-01")
getSymbols("DIS", from = "2016-01-01", to = "2025-10-01")
```

[1] "NFLX"
[1] "AMZN"
[1] "DIS"

Example of `hchart` for One Stock

hchart in the highcharter package is one way to plot xts data

This chunk is not compatible with published slides or a published HTML file, but the code will work in a published dashboard (see posted examples).

```{r hchart of 1 stock, fig.dim=c(15,4.5), echo=T, eval=F}
(hc_nflx <- hchart(NFLX$NFLX.Adjusted, name="Adjusted", color="green") |>   # plot adj. close
  hc_add_series(NFLX$NFLX.High, name="High" , color="blue") |>             # add daily high
  hc_add_series(NFLX$NFLX.Low, name="Low" , color="red"))                   # add daily low
```

R code for Multi-Panel `hcharts` display

Stocks can be shown in separate plots, arranged side by side or in one stacked column.
The command hw_grid is used to display them, and ncol sets the number of columns.

```{r separate stock plots, echo=T, eval=F}
nflx_plt <- hchart(NFLX$NFLX.Adjusted, name="Adjusted", color="green") |>
  hc_add_series(NFLX$NFLX.High, name="High" , color="darkgreen") |>
  hc_add_series(NFLX$NFLX.Low, name="Low" , color="lightgreen")

amzn_plt <- hchart(AMZN$AMZN.Adjusted, name="Adjusted", color="blue") |>
  hc_add_series(AMZN$AMZN.High, name="High" , color="darkblue") |>
  hc_add_series(AMZN$AMZN.Low, name="Low" , color="lightblue")

dis_plt <- hchart(DIS$DIS.Adjusted, name="Adjusted", color="mediumpurple") |>
  hc_add_series(DIS$DIS.High, name="High" , color="purple4") |>
  hc_add_series(DIS$DIS.Low, name="Low" , color="plum")
```

Multi-Panel `hcharts` Display

This chunk is not compatible with published slides or a published HTML file, but the code will work in a published dashboard (see posted examples).

```{r fig.dim=c(15,6), echo=T, eval=F}
#|label: display of hcharts
hw_grid(nflx_plt, amzn_plt, dis_plt, ncol=3)
```

Week 7 In-class Exercises - Q1

Poll Everywhere - My User Name: penelopepoolereisenbies685

In the example above, we use the hw_grid command to create a multi-plot composition of hcharts.

Previously, we covered another command to create a composition of non-interactive ggplots of tibble data.

What is that other command?

Hints:

This very useful command is in the gridExtra package which is loaded.

If gridExtra is loaded in R, start typing grid in the console, and the command and others will appear.

Week 7 In-class Exercises - Q2

Poll Everywhere - My User Name: penelopepoolereisenbies685

Use the provided example of getSymbols code to write code to import the stock time series for Apple (AAPL)
- Use these dates: from = "2017-01-01", to = "2025-10-06"
Open the imported xts dataset by clicking on it in the Global Environment
Sort the AAPL.Adjusted column by clicking on it.
Answer Question:
- On what recent date did Apple (AAPL) report its highest adjusted closing value?

```{r}
#|label: import aapl data
```

More Information about `xts`

When these stock datasets are imported, they are in xts format.
xts stands for eXtensible Time Series; these datasets are 'self-aware' in the sense that each one stores its own dates.
The key feature is that date is NOT a variable, but instead the dates become row IDs.
- Any dataset with a date variable can be converted to an xts dataset.
- Any xts dataset can be converted to a tibble or data.frame (two common R data formats).

```{r}
#|label: examine xts data
head(NFLX)
```

           NFLX.Open NFLX.High NFLX.Low NFLX.Close NFLX.Volume NFLX.Adjusted
2016-01-04    109.00    110.00   105.21     109.96    20794800        109.96
2016-01-05    110.45    110.58   105.85     107.66    17664600        107.66
2016-01-06    105.29    117.91   104.96     117.68    33045700        117.68
2016-01-07    116.36    122.18   112.29     114.56    33636700        114.56
2016-01-08    116.33    117.72   111.10     111.39    18067100        111.39
2016-01-11    112.13    116.79   111.20     114.97    21920400        114.97
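
The idea that dates are row IDs rather than a variable can be checked directly. The sketch below uses a small made-up xts object (toy and its values are hypothetical) so it runs without downloading any stock data.

```r
library(xts)

dates <- as.Date(c("2016-01-04", "2016-01-05", "2016-02-01"))
toy <- xts(cbind(close = c(10, 11, 12)), order.by = dates)  # made-up prices

index(toy)       # the dates live in the index, not in a column
coredata(toy)    # the data values as a plain matrix
colnames(toy)    # only "close"; there is no date column
toy["2016-01"]   # rows can be selected with a date string (January 2016 here)
```

Because the dates are the index, date-based subsetting like the last line does not require a filter on a date column.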

Merging `xts` datasets using `merge`

Converting xts to a tibble or data.frame (R data formats) is required if you want to create a ggplot or use other methods covered previously.
A good first step is to create a merged xts dataset of the desired variables.

```{r}
#|label: merge xts stock data 

# data are merged by matching dates
nflx_amzn_dis <- merge(NFLX$NFLX.Adjusted,
                       AMZN$AMZN.Adjusted,
                       DIS$DIS.Adjusted) 
head(nflx_amzn_dis)
```

           NFLX.Adjusted AMZN.Adjusted DIS.Adjusted
2016-01-04        109.96       31.8495     95.56268
2016-01-05        107.66       31.6895     93.63250
2016-01-06        117.68       31.6325     93.13140
2016-01-07        114.56       30.3970     92.33334
2016-01-08        111.39       30.3525     92.10133
2016-01-11        114.97       30.8870     92.72308
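
The three stocks above trade on the same days, so every date matches. As a hedged sketch of what merge does when dates do not line up, the made-up series below (a and b are hypothetical) differ by one date. By default merge keeps every date and fills gaps with NA; the join = "inner" argument keeps only the shared dates.

```r
library(xts)

a <- xts(cbind(a = c(1, 2, 3)),
         order.by = as.Date(c("2025-01-02", "2025-01-03", "2025-01-06")))
b <- xts(cbind(b = c(10, 30)),
         order.by = as.Date(c("2025-01-02", "2025-01-06")))

both_outer <- merge(a, b)                  # keeps all dates, NA where b is missing
both_inner <- merge(a, b, join = "inner")  # keeps only dates found in both

both_outer
both_inner
```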

Converting `xts` datasets to tibble format

There are a few ways to convert an xts to a tibble.
In the code below, I show the conversion and then rename the new date variable as date.

```{r convert xts to tibble}
# converting data to a tibble requires a couple lines of code 
# I prefer to rename the index as date 
nflx_amzn_dis_tibble <- nflx_amzn_dis |> 
  fortify.zoo() |> as_tibble(.name_repair = "minimal") |>
  rename("date" = "Index") 
head(nflx_amzn_dis_tibble)
```

# A tibble: 6 × 4
  date       NFLX.Adjusted AMZN.Adjusted DIS.Adjusted
  <date>             <dbl>         <dbl>        <dbl>
1 2016-01-04          110.          31.8         95.6
2 2016-01-05          108.          31.7         93.6
3 2016-01-06          118.          31.6         93.1
4 2016-01-07          115.          30.4         92.3
5 2016-01-08          111.          30.4         92.1
6 2016-01-11          115.          30.9         92.7
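
As one of the other possible conversions, the sketch below pulls the values and the dates out of the xts object separately and puts them back together as a tibble. It is an alternative shown for comparison, not the method used in the lecture code, and toy_xts with its column names is a hypothetical example.

```r
library(xts)
library(dplyr)
library(tibble)

dates <- as.Date(c("2016-01-04", "2016-01-05", "2016-01-06"))
toy_xts <- xts(cbind(stock_a = c(10, 11, 12), stock_b = c(20, 19, 21)),
               order.by = dates)  # made-up prices

toy_tibble <- as_tibble(coredata(toy_xts)) |>   # values become columns
  mutate(date = index(toy_xts), .before = 1)    # index becomes a date column

toy_tibble
```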

Converting tibble datasets to `xts`

Any dataset with a date formatted variable can be converted to an xts dataset.
This means that we can create an hchart or dygraph (next topic) for any dataset with a date variable.

```{r}
#|label: convert tibble to xts
exp_imp <- read_csv("data/export_import_tidy.csv", show_col_types=F)
exp_imp_xts <- xts(x=exp_imp[,2:3], order.by=exp_imp$date) # order.by must be a date variable
```
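
The requirement that order.by be a date variable matters when dates arrive as text. The sketch below assumes a small made-up dataset (toy, with hypothetical index values) whose date column is stored as character, and converts it with as.Date before building the xts object.

```r
library(xts)

toy <- data.frame(date = c("2024-01-01", "2024-02-01", "2024-03-01"),
                  exp_indx = c(100, 101, 103),   # hypothetical values
                  imp_indx = c(100, 99, 102))

class(toy$date)                 # "character": not yet usable for order.by
toy$date <- as.Date(toy$date)   # convert text to a Date variable

toy_xts <- xts(x = toy[, c("exp_indx", "imp_indx")], order.by = toy$date)
toy_xts
```

read_csv usually recognizes dates written as YYYY-MM-DD on import, so this extra step is mainly needed for other date layouts or other import functions.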

```{r}
#|label: hchart code export import xts
exp_imp_hchart <- hchart(exp_imp_xts$exp_indx, 
                         name="Export Price Index", color="blue") |>
   hc_add_series(exp_imp_xts$imp_indx, 
                 name="Import Price Index" , color="red")
```

Export Import HighChart (`hchart`)

```{r fig.dim=c(15,4)}
#|label: display of hchart
exp_imp_hchart
```

Dygraphs - An Alternative to `hchart`

dygraph is a more flexible alternative to hchart.
- Straightforward to modify, add reference lines and shaded regions
- Both dygraph and hchart allow viewer to interactively select date range

Here is the dataset we will use:

```{r}
#|label: dataset for dygraphs example
three_stocks <- merge(AMZN$AMZN.Adjusted, DIS$DIS.Adjusted, NFLX$NFLX.Adjusted) 
names(three_stocks) <- c("AMZN.adj", "DIS.adj", "NFLX.adj")
head(three_stocks, 3) # print first three rows only
```

           AMZN.adj  DIS.adj NFLX.adj
2016-01-04  31.8495 95.56268   109.96
2016-01-05  31.6895 93.63250   107.66
2016-01-06  31.6325 93.13140   117.68

Basic unformatted plot of three stocks with the range selector option

```{r fig.dim=c(15,4)}
#|label: dygraph with range selector
(dy3 <- dygraph(three_stocks, main="Streaming Company Stock Trends") |>
  dySeries("AMZN.adj", label="AMZN", color= "green") |>
  dySeries("DIS.adj", label="DIS", color= "red") |>
  dySeries("NFLX.adj", label="NFLX", color= "blue") |>
  dyRangeSelector())
```

Two useful formatting options (shown below) to make the plot more readable are:
- Removing the grid lines
- Formatting the axis labels

```{r fig.dim=c(15,3.5)}
#|label: dygraph with axes labeled and gridlines removed
(dy3 <- dy3 |>
  dyAxis("y", label = "Adjusted Close", drawGrid = FALSE) |>
  dyAxis("x", label = "Date", drawGrid = FALSE))
```

Vertical lines can be added at specific dates and can be labeled and formatted.

```{r fig.dim=c(15,4)}
#|label: dygraph with event lines
(dy3 <- dy3 |>
  dyEvent("2020-3-12", label = "Theaters Closed", labelLoc = "bottom") |>
  dyEvent("2021-6-15", label = "Restrictions End", labelLoc = "bottom", strokePattern = "solid"))
```

Alternatively, it may be helpful to shade the plot for a specific time range.

```{r fig.dim=c(15,4)}
#|label:  dygraph with shaded region
(dy3 <- dy3 |>
  dyShading(from = "2020-3-12", to = "2021-6-15", axis = "x", color = "lightgrey"))
```

Review: `bls_tidy` Function - Labor Data

Before using our function on new data, we ALWAYS examine the .csv files
The number of rows to skip for these three labor datasets is 11.

```{r run bls_tidy and import labor data}
bls_tidy <- function(data_file, skip_num, var_name){
  read_csv(data_file, skip = skip_num, show_col_types = F) |> 
  pivot_longer(cols = Jan:Dec,                      
               names_to = "month", 
               values_to = "value") |>
  filter(!is.na(value)) |>                    
  rename({{var_name}} := "value")                             
}

labor_force <- bls_tidy("data/bls_civ_lf.csv", skip_num=11, var_name="lf")
unemp <- bls_tidy("data/bls_civ_unemp.csv", skip_num=11, var_name="unemp")
emp <- bls_tidy("data/bls_civ_emp.csv", skip_num=11, var_name="emp")

head(unemp)
```

# A tibble: 6 × 3
   Year month unemp
  <dbl> <chr> <dbl>
1  2014 Jan   10202
2  2014 Feb   10349
3  2014 Mar   10380
4  2014 Apr    9702
5  2014 May    9859
6  2014 Jun    9460

Joining More than Two Datasets

Last week and in HW 4, we covered joining TWO datasets.
The commands we covered (there are 4) all have the same limitation: datasets must be joined two at a time.

Joining with Piping

```{r}
#|label: joining 3 datasets with pipes
# with piping
lf_all <- labor_force |>
  full_join(emp) |>
  full_join(unemp) |>
  write_csv("data/labor_tidy.csv") #export
```

Joining with `by = join_by(Year, month)`
Joining with `by = join_by(Year, month)`

```{r}
head(lf_all)
```

# A tibble: 6 × 5
   Year month     lf    emp unemp
  <dbl> <chr>  <dbl>  <dbl> <dbl>
1  2014 Jan   155352 145150 10202
2  2014 Feb   155483 145134 10349
3  2014 Mar   156028 145648 10380
4  2014 Apr   155369 145667  9702
5  2014 May   155684 145825  9859
6  2014 Jun   155707 146247  9460

Joining without Piping

```{r}
#|label: joining 3 datasets without pipes
lf_all <- full_join(labor_force, emp) 
```

Joining with `by = join_by(Year, month)`

```{r}
lf_all <- full_join(lf_all, unemp) 
```

Joining with `by = join_by(Year, month)`

```{r}
head(lf_all)
```

# A tibble: 6 × 5
   Year month     lf    emp unemp
  <dbl> <chr>  <dbl>  <dbl> <dbl>
1  2014 Jan   155352 145150 10202
2  2014 Feb   155483 145134 10349
3  2014 Mar   156028 145648 10380
4  2014 Apr   155369 145667  9702
5  2014 May   155684 145825  9859
6  2014 Jun   155707 146247  9460
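
When there are many datasets, the two-at-a-time joins can be repeated automatically. The sketch below is one possible approach, not something covered in HW 4: it uses reduce from the purrr package to apply full_join down a list, and it states the join columns explicitly so no "Joining with" message is printed. The three small tibbles and their values are hypothetical.

```r
library(dplyr)
library(purrr)

lf_toy    <- tibble(Year = 2014, month = c("Jan", "Feb"), lf = c(100, 101))
emp_toy   <- tibble(Year = 2014, month = c("Jan", "Feb"), emp = c(90, 92))
unemp_toy <- tibble(Year = 2014, month = c("Jan", "Feb"), unemp = c(10, 9))

# reduce joins the first two datasets, then joins that result to the third
all_three <- list(lf_toy, emp_toy, unemp_toy) |>
  reduce(\(x, y) full_join(x, y, by = join_by(Year, month)))

all_three
```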

Review: Dates and Plot Data

The chunk below includes code that is similar to Parts 3 and 4 of HW 4.
BONUS: Code modified to show how to get ‘End of Month’ (eom) date.
- Useful Link

```{r}
#|label: dates and data mod for plot
lf_plt <- lf_all |>
  mutate(date_som = ym(paste(Year, month)),         # create som date var
         date = ceiling_date(date_som, "month")-1,  # create eom month date var
         empM = (emp/1000) |> round(2),             # convert counts to millions
         unempM = (unemp/1000) |> round(2)) |>
  select(date, empM, unempM) |>                     # select vars and reshape
  pivot_longer(cols=empM:unempM, names_to = "type", values_to = "count") |>
  mutate(type = factor(type,                        # create factor var for plot
                       levels = c("unempM", "empM"),
                       labels = c("Unemployed", "Employed"))) 

head(lf_plt, 4) # examine first 4 rows
```

# A tibble: 4 × 3
  date       type       count
  <date>     <fct>      <dbl>
1 2014-01-31 Employed   145. 
2 2014-01-31 Unemployed  10.2
3 2014-02-28 Employed   145. 
4 2014-02-28 Unemployed  10.4
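
The end-of-month step can be tried on its own. This short sketch applies the same ym and ceiling_date calls to three hand-picked months (hypothetical inputs) to show that month lengths and leap years are handled.

```r
library(lubridate)

# start-of-month dates built from year and month text
som <- ym(paste(c(2023, 2024, 2024), c("Feb", "Feb", "Dec")))

# round up to the first day of the next month, then step back one day
eom <- ceiling_date(som, "month") - 1

som  # 2023-02-01, 2024-02-01, 2024-12-01
eom  # 2023-02-28, 2024-02-29, 2024-12-31
```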

Code for Polished Area Plot for Slides

Useful for data that sum to a whole: Employed + Unemployed = Total Labor Force

```{r plot code for lf area plot}
lf_area_plt_slides <- lf_plt |>
  ggplot() +
  geom_area(aes(x=date, y=count, fill=type)) +
  theme_classic() +
  theme(legend.position="bottom") +
  scale_fill_manual(values=c("red", "blue")) + 
  scale_x_date(date_breaks = "year", date_labels = "%Y") +
  labs(x="Date", y = "Number of People (Millions)", fill="",
       title="Total Labor Force: Employed and Unemployed", 
       subtitle="Jan. 2014 - June 2024",
       caption="Data Source: www.bls.gov") + 
  theme(plot.title = element_text(size = 20),                    
        plot.subtitle = element_text(size = 15),
        axis.title = element_text(size=18),
        axis.text = element_text(size=15),
        plot.caption = element_text(size = 10),
        legend.text = element_text(size = 12),
        panel.border = element_rect(colour = "lightgrey", fill=NA, linewidth=2),
        plot.background = element_rect(colour = "darkgrey", fill=NA, linewidth=2))
```

Area Plot Formatted for Slides

Area Plot for HTML, Documents and Export

The additional formatting shown in the previous slides can always be added.
The plot is exported using ggsave, which by default exports the last plot created.

```{r}
#|label: simpler plot code with ggsave export 

lf_area_plt <- lf_plt |>
  ggplot() +
  geom_area(aes(x=date, y=count, fill=type)) +
  theme_classic() +
  theme(legend.position="bottom") +
  scale_fill_manual(values=c("red", "blue")) + 
  scale_x_date(date_breaks = "year", date_labels = "%Y") +
  labs(x="Date", y = "Number of People (Millions)", fill="",
       title="Total Labor Force: Employed and Unemployed", 
       subtitle="Jan. 2014 - Jun. 2024",
       caption="Data Source: www.bls.gov") + 
  theme(plot.title = element_text(size = 20),                    
        plot.subtitle = element_text(size = 15),
        axis.title = element_text(size=18),
        axis.text = element_text(size=15),
        plot.caption = element_text(size = 10),
        legend.text = element_text(size = 12))
ggsave("img/labor_force_area_plot.png", width=6,height=4)
```

Exported Plot

Looks fine in HTML notes but not slides
May be fine in Word Document or Dashboard
If not, previous code shows additional options for formatting
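
Since ggsave exports the last plot created by default, it can be safer to name the plot explicitly when several plots are made in one file. The sketch below assumes a made-up plot (toy_plt) and saves to a temporary folder so it does not overwrite any project files.

```r
library(ggplot2)

toy <- data.frame(x = 1:5, y = c(2, 4, 3, 6, 5))  # hypothetical data
toy_plt <- ggplot(toy) +
  geom_line(aes(x = x, y = y))

# plot = names the plot to export instead of relying on the last plot created
out_file <- file.path(tempdir(), "toy_plot.png")
ggsave(out_file, plot = toy_plt, width = 6, height = 4)

file.exists(out_file)  # confirms the file was written
```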

Week 8 In-class Exercise

In this exercise we will:

Import labor_tidy.csv, convert the variables to millions, round to 2 decimal places, and select two variables. (Review)

OPTIONAL: use provided example to create an END of Month (eom) date variable and use that.

```{r}
#|label: import labor_tidy and modify variables
labor_new <- read_csv("data/labor_tidy.csv", show_col_types=F) |>
  mutate(date = ym(paste(Year,month)),
         lfM = (lf/1000) |> round(2),
         empM = (emp/1000) |> round(2))|>
  select(date, lfM, empM)
```

Convert labor_new to an xts format, labor_xts

```{r}
#|label: create labor_xts
```

In-class Exercise Cont’d

Create an unformatted hchart OR a dygraph with two variables
- Plot lfM and empM and save it as labor_hc or labor_dy

```{r}
#|label: create and display labor hchart
# (labor_hc <- hchart())  or   (labor_dy <- dygraph())
```

Basic `hchart`

Basic `dygraph`

In-class Exercise - Final Steps

Submit screenshots of plot from Viewer pane.
Save R code as an R Script. In the R project folder I have saved an R Script for your work (Updated October 2025).

Copy and paste code into the provided R Script and use Save As to save the file with your name, e.g. Week_8_In_Class_Penelope_Pooler.R
R Script should include:
- code I provided to import and modify data
- tibble to xts conversion of labor dataset
- hchart OR dygraph plot code with comments
Submit final script on Blackboard (counts towards class participation for Week 8)
Due by Friday 10/17. No late submissions are accepted for In-class Exercises.

Quarto, R Markdown files and R Scripts

Quarto and Markdown files are ‘smart’, i.e. aware of where they are located.
R Scripts (older common file type) are useful BUT not aware of file location.
- User must specify working directory
- The script I provided is saved to your working directory
To check working directory: getwd()
To set the working directory to the code_data_output folder (when working in an R Script):
- Click Session > Set Working Directory > To Source File Location

NOTES:

R users and developers do not recommend setting working directories within code which would have to be changed for each laptop.
Whenever possible, use R Projects and ‘smart’ files such as .qmd and .Rmd files.
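
A quick way to confirm that an R Script is pointed at the right folder is to check whether a known data file can be found from the working directory. This is only a sketch; the path matches the relative path used in the lecture code, and the result depends on your own folder setup.

```r
getwd()  # folder R is currently working in

# relative path: interpreted starting from the working directory
csv_path <- file.path("data", "labor_tidy.csv")

file.exists(csv_path)  # TRUE only when the working directory contains data/labor_tidy.csv
```

If the last line returns FALSE, the working directory is probably not the project folder.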

Key Points from Weeks 7 and 8

Time Series Data

Importing stock data from Yahoo Finance as xts
Converting between xts and tibble
Plotting options include area plots, hcharts and dygraphs
dygraphs and hcharts are useful tools for understanding, managing, and curating time series data.
HW 4 due Wednesday, 10/15.
- Grace period in effect.
- TAs and I are available to assist if you have questions.

You may submit an ‘Engagement Question’ about each lecture until midnight on the day of the lecture. A minimum of four submissions is required during the semester.

--- title: "Weeks 7 and 8" subtitle: "Time Series, Data Formats, Output Formats, Project Introduction" author: "Penelope Pooler Eisenbies" date: last-modified lightbox: true toc: true toc-depth: 3 toc-location: left toc-title: "Table of Contents" toc-expand: 1 format: html: code-line-numbers: true code-fold: true code-tools: true execute: echo: fenced --- ## Housekeeping ```{r include=F} #|label: setup knitr::opts_chunk$set(echo=T, highlight=T) # specifies default options for all chunks options(scipen=100) # suppress scientific notation # install pacman if needed if (!require("pacman")) install.packages("pacman", repos = "http://lib.stat.cmu.edu/R/CRAN/") pacman::p_load(pacman, tidyverse, gridExtra, magrittr, kableExtra, tidyquant, highcharter, dygraphs, htmlwidgets, widgetframe, js) # install and load required packages p_loaded() # verify loaded packages ``` - Final grading in this course: - adheres to Whitman grading policy, but is fairly gentle. - takes into account assignments, course project, and class particpation. - Quiz 2 will be during Week 11 and will combine previous skills with material from weeks 6 through 10 - It will be similar to Quiz 1 but may have more questions and more steps in multi-step tasks. - If you have questions about your quiz, please let me know. ::: fragment **HW 4 is posted and is due on Wednesday, 10/15/25.** ::: - **HW 4 - Part 1** is due on Wednesday 10/8/25 and is required in order for you to complete this course. - **There are no office hours on Thursday this week, 10/9/25.** ## BUA 455 Group Dashboard Project ::: fragment **Group Assignments** ::: - Complete HW 4 - Part 1 TODAY, 10/8! (This should only take 5 min.) - **Note:** If you do not complete this Survey, I will not put you in a project group and you can not pass this class. - Groups of 5 or 6 will be determined and posted (Hopefully by Monday) - If you have a request to work with someone, include that information in your survey (Not required). 
- **Friday, 10/10, is the last day I will accept any group requests.** - I cannot guarantee that requests will be honored, but I will try. - I control group assignments to maintain some balance in skill level among groups. ## ### BUA 455 Group Dashboard Project Information - [Project Description](https://docs.google.com/document/d/1U-DJ3yeHPpxcg1o12Cg2qc2Besb6Jw6UiyAo6gR4S2I/edit?usp=sharing){target="_blank"} - [Interesting Data](https://penelope2040.quarto.pub/bua-455-semester/#interesting-data){target="_blank"} - Students are also required to use AI tools to find data. - I will provide a short demo of going from an obscure idea to good semi-related dataset using AI - Last year, I adapted the course to use the `Quarto Dashboard` because it became available in the late spring of 2024. - I have posted examples from last year to give you ideas. - Quarto provides a lot of flexibility BUT requires a little patience and iterative editing. - [Preview of HW 5 - Part 1 Example Using Quarto Dashboard](https://rpubs.com/PeneLope_PE/1229274) ## Upcoming Dates - **Groups assigned by Wednesday, 10/15 at the latest.** - **Thu. 10/30 at 5:00 PM:** Draft Proposals Due - NO GRACE PERIOD - Proposals should be in bullet point format and include links to data sources - It should take me 5 minutes to read your proposed ideas and check your data. - **Proposal Meetings:** - Recommended but not required: Come with questions and be prepared to answer my questions (5-15 min. per groups) - Meetings will take place outside of class. See sign-up sheet when it is posted. - **Wed. 10/29:** HW 5 - Part 1 Due - **Thu. 11/6:** Quiz 2 - **Tue. 11/11:** Final Proposals Due - Not much longer than draft proposal and also in bullet point. - Questions and issues discussed during meeting should be addressed. 
## Reminders about HW 4 - In Chunk 6 (Part 5), the chunk header in the the template appears as follows: ::: fragment ![](img/HW4_Chunk6_Header.png){fig-align="center"} ::: - The `eval=F` prevents this chunk from being evaluated when it is knit. - `eval=F` was included in the template because original code was incomplete. - Remember to remove the text `eval=F` - Other helpful chunk header options for dashboard: `echo=F`, `include=F` - Chunk options can also be included as fences: - e.g. `#|label: import data` and `#|echo: false`. See [Quarto Cheat Sheet](https://rstudio.github.io/cheatsheets/quarto.pdf){target="_blank"} - **NOTE:** If two chunks have the SAME name or label, the file will not render. ## Quarto Output Formats - So far, all Quarto files in this course have been rendered as HTML (.html) files or slides - All slides for this course are created in Quarto. - Other common formats are Word documents, PDF documents, Powerpoint Slides, and **dashboards** - This [Quarto Reference site](https://quarto.org/docs/reference/) shows all the possible formats and provides details. - We will use the dashboard (next slide) format in HW 5 and in your projects. - Groups will also write their two project memos in Quarto and publish them as word documents. - Writing the memos in Quarto files simplifies formatting R, RStudio and packages citations. ## Quarto Dashboards - REQUIRED: [Download the latest version of Quarto here](https://quarto.org/docs/get-started/) - You will not be able to complete HW 5 without having Quarto installed on your computer. - [Quarto Dashboard](https://quarto.org/docs/dashboards/) is a new feature of Quarto that is extremely flexible and straightforward to use. - The [Quarto Dashboard Gallery](https://quarto.org/docs/gallery/#dashboards) includes example dashboards made with R, Python, and other langaugages. - In this course I will provide a simple template for HW 5 that can be used to build your dashboard. 
- Once you understand how to add pages, rows, column, tabsets, and modify as needed you are welcome to tailor the template to your project. - **A Quarto dashboard is a flexible blank canvas that you can tailor to your project and future endeavors.** ## Types of Time Series Data in R - In recent weeks, we have worked with Box Office Mojo and Bureau of Labor Statistics Data - These datasets are time series data. - They all include a date variable and another quantitative variable that changes at each time period. - So far we have worked with data in an R format called a `tibble`. - Two common data formats in R, `tibble` and `data.frame` are needed for creating ggplots of time series. - `tibble` is the more modern format and is more compatible with `tidyverse` commands to manage data. - Today, we'll discuss a third data format, `xts` that can be used specifically for time series data. ## ### Importing Stock Data as `xts` using `tidyquant` Package - [Yahoo Finance](https://finance.yahoo.com/), the Federal Reserve Bank, the Wall Street Journal, and others are excellent data sources that can be directly imported into R. - The default for `getsymbols` in the `tidyquant` package is Yahoo Finance. - Data format is `xts` which we will cover today ::: fragment ```{r} #|label: importing data from yahoo finance #|output: false # download data from Netflix, Amazon, Disney # time series starts day after from date specified # time series ends day before to date specified getSymbols("NFLX", from = "2016-01-01", to = "2025-10-01") getSymbols("AMZN", from = "2016-01-01", to = "2025-10-01") getSymbols("DIS", from = "2016-01-01", to = "2025-10-01") ``` ::: ## Example of `hchart` for One Stock `hchart` in the `highcharter` package is one way to plot `xts` data This chunk not compatible with published slides or published html file but this code will work in a published dashboard (see posted examples). 
```{r hchart of 1 stock, fig.dim=c(15,4.5), echo=T, eval=F} (hc_nflx <- hchart(NFLX$NFLX.Adjusted, name="Adjusted", color="green") |> # plot adj. close hc_add_series(NFLX$NFLX.High, name="High" , color="blue") |> # add daily high hc_add_series(NFLX$NFLX.Low, name="Low" , color="red")) # add daily low ``` ## R code for Multi-Panel `hcharts` display - Stocks can be shown in separate plots that can be shown side by side or in one stacked column - The command `hw_grid` is used to display them and `ncol` indicates how many columns. ::: fragment ```{r separate stock plots, echo=T, eval=F} nflx_plt <- hchart(NFLX$NFLX.Adjusted, name="Adjusted", color="green") |> hc_add_series(NFLX$NFLX.High, name="High" , color="darkgreen") |> hc_add_series(NFLX$NFLX.Low, name="Low" , color="lightgreen") amzn_plt <- hchart(AMZN$AMZN.Adjusted, name="Adjusted", color="blue") |> hc_add_series(AMZN$AMZN.High, name="High" , color="darkblue") |> hc_add_series(AMZN$AMZN.Low, name="Low" , color="lightblue") dis_plt <- hchart(DIS$DIS.Adjusted, name="Adjusted", color="mediumpurple") |> hc_add_series(DIS$DIS.High, name="High" , color="purple4") |> hc_add_series(DIS$DIS.Low, name="Low" , color="plum") ``` ::: ## Multi-Panel `hcharts` Display This chunk not compatible with published slides or published html file but this code will work in a published dashboard (see posted examples). ```{r fig.dim=c(15,6), echo=T, eval=F} #|label: display of hcharts hw_grid(nflx_plt, amzn_plt, dis_plt, ncol=3) ``` ## Week 7 In-class Exercises - Q1 [***Poll Everywhere***](https://pollev.com/penelopepoolereisenbies685){target="_blank"} - My User Name: **penelopepoolereisenbies685** In the example above, we use the `hw_grid` command to create a multi-plot composition of hcharts. Previously, we covered another command to create a composition of non-interactive ggplots of `tibble` data. <br> **What is that other command?** **Hints:** This very useful command is in the `gridExtra` package which is loaded. 

If `gridExtra` is loaded in R, start typing `grid` in the console, and the command and others will appear.

## Week 7 In-class Exercises - Q2

[***Poll Everywhere***](https://pollev.com/penelopepoolereisenbies685){target="_blank"} - My User Name: **penelopepoolereisenbies685**

1. Use the provided example of `getSymbols` code to write code to import the stock time series for Apple (`AAPL`)

   - Use these dates: from = "2017-01-01", to = "2025-10-06"

2. Open the imported `xts` file by clicking on it in the `Global Environment`

3. Sort the `AAPL.Adjusted` column by clicking on it.

4. Answer Question:

   - On what recent date did Apple (AAPL) report its highest adjusted closing value?

::: fragment
```{r}
#|label: import aapl data

```
:::

## More Information about `xts`

- When these stock datasets are imported, they are in `xts` format.
- `xts` stands for **Extensible Time Series**, which means these datasets are self-aware.
- The key feature is that `date` is NOT a variable, but instead the dates become row IDs.
- Any dataset with a `date` variable can be converted to an `xts` dataset.
- Any `xts` dataset can be converted to a tibble or data.frame (two common R data formats).

::: fragment
```{r}
#|label: examine xts data
head(NFLX)
```
:::

## Merging `xts` datasets using merge

- Converting xts to a tibble or dataframe (R data formats) is required if you want to create a ggplot or use other methods covered previously.
- A good first step is to create a merged `xts` dataset of the desired variables.

::: fragment
```{r}
#|label: merge xts stock data
# data are merged by matching dates
nflx_amzn_dis <- merge(NFLX$NFLX.Adjusted, AMZN$AMZN.Adjusted, DIS$DIS.Adjusted)
head(nflx_amzn_dis)
```
:::

## Converting `xts` datasets to tibble format

- There are a few ways to convert an xts to a tibble.
- In the code below I show the conversion and then I rename the new date variable as `date`.

::: fragment
```{r convert xts to tibble}
# converting data to a tibble requires a couple lines of code
# I prefer to rename the index as date
nflx_amzn_dis_tibble <- nflx_amzn_dis |>
  fortify.zoo() |>
  as_tibble(.name_repair = "minimal") |>
  rename("date" = "Index")
head(nflx_amzn_dis_tibble)
```
:::

## Converting tibble datasets to `xts`

- Any dataset with a date-formatted variable can be converted to an `xts` dataset.
- This means that we can create an `hchart` or `dygraph` (next topic) for any dataset with a `date` variable.

::: fragment
```{r}
#|label: convert tibble to xts
exp_imp <- read_csv("data/export_import_tidy.csv", show_col_types=F)
exp_imp_xts <- xts(x=exp_imp[,2:3], order.by=exp_imp$date) # order.by must be a date variable
```
:::

::: fragment
```{r}
#|label: hchart code export import xts
exp_imp_hchart <- hchart(exp_imp_xts$exp_indx, name="Export Price Index", color="blue") |>
  hc_add_series(exp_imp_xts$imp_indx, name="Import Price Index", color="red")
```
:::

## Export Import HighChart (`hchart`)

```{r fig.dim=c(15,4)}
#|label: display of hchart
exp_imp_hchart
```

## Dygraphs - An Alternative to `hchart`

:::: panel-tabset
### [Background]{style="color:blue;"}

- `dygraph` is a more flexible alternative to `hchart`.
- Straightforward to modify, add reference lines and shaded regions
- Both `dygraph` and `hchart` allow the viewer to interactively select a date range

::: fragment
Here is the dataset we will use:

```{r}
#|label: dataset for dygraphs example
three_stocks <- merge(AMZN$AMZN.Adjusted, DIS$DIS.Adjusted, NFLX$NFLX.Adjusted)
names(three_stocks) <- c("AMZN.adj", "DIS.adj", "NFLX.adj")
head(three_stocks, 3) # print first three rows only
```
:::

### [Unformatted]{style="color:blue;"}

Basic unformatted plot of three stocks with the range selector option

```{r fig.dim=c(15,4)}
#|label: dygraph with range selector
(dy3 <- dygraph(three_stocks, main="Streaming Company Stock Trends") |>
  dySeries("AMZN.adj", label="AMZN", color= "green") |>
  dySeries("DIS.adj", label="DIS", color= "red") |>
  dySeries("NFLX.adj", label="NFLX", color= "blue") |>
  dyRangeSelector())
```

### [Grid & Axes]{style="color:blue;"}

Two useful formatting options (shown below) to make the plot more readable are:

- Removing the grid lines
- Formatting the axis labels

```{r fig.dim=c(15,3.5)}
#|label: dygraph with axes labeled and gridlines removed
(dy3 <- dy3 |>
  dyAxis("y", label = "Adjusted Close", drawGrid = FALSE) |>
  dyAxis("x", label = "Date", drawGrid = FALSE))
```

### [Event Lines]{style="color:blue;"}

Vertical lines can be added at specific dates and can be labeled and formatted.

```{r fig.dim=c(15,4)}
#|label: dygraph with event lines
(dy3 <- dy3 |>
  dyEvent("2020-3-12", label = "Theaters Closed", labelLoc = "bottom") |>
  dyEvent("2021-6-15", label = "Restrictions End", labelLoc = "bottom",
          strokePattern = "solid"))
```

### [Shading]{style="color:blue;"}

Alternatively, it may be helpful to shade the plot for a specific time range.

```{r fig.dim=c(15,4)}
#|label: dygraph with shaded region
(dy3 <- dy3 |>
  dyShading(from = "2020-3-12", to = "2021-6-15", axis = "x", color = "lightgrey"))
```
::::

## Review: `bls_tidy` Function - Labor Data

- Before using our function on new data, we **ALWAYS** examine the .csv files.
- The number of rows to skip for these three labor datasets is **11**.

::: fragment
```{r run bls_tidy and import labor data}
bls_tidy <- function(data_file, skip_num, var_name){
  read_csv(data_file, skip = skip_num, show_col_types = F) |>
    pivot_longer(cols = Jan:Dec, names_to = "month", values_to = "value") |>
    filter(!is.na(value)) |>
    rename({{var_name}} := "value")
}

labor_force <- bls_tidy("data/bls_civ_lf.csv", skip_num=11, var_name="lf")
unemp <- bls_tidy("data/bls_civ_unemp.csv", skip_num=11, var_name="unemp")
emp <- bls_tidy("data/bls_civ_emp.csv", skip_num=11, var_name="emp")
head(unemp)
```
:::

## Joining More than Two Datasets

- Last week and in HW 4 we covered joining TWO datasets.
- The commands we covered (there are 4) all have the same limitation: **datasets must be joined two at a time.**

:::::::: columns
:::: {.column width="48%"}
::: fragment
**Joining with Piping**

```{r}
#|label: joining 3 datasets with pipes
# with piping
lf_all <- labor_force |>
  full_join(emp) |>
  full_join(unemp) |>
  write_csv("data/labor_tidy.csv")   # export
head(lf_all)
```
:::
::::

::: {.column width="4%"}
:::

:::: {.column width="48%"}
::: fragment
**Joining without Piping**

```{r}
#|label: joining 3 datasets without pipes
lf_all <- full_join(labor_force, emp)
lf_all <- full_join(lf_all, unemp)
head(lf_all)
```
:::
::::
::::::::

## Review: Dates and Plot Data

- Chunk below includes code that is similar to Parts 3 and 4 of HW 4.
- BONUS: Code modified to show how to get 'End of Month' (eom) date.
- [**Useful Link**](https://www.statology.org/lubridate-first-last-day-of-month/)

::: fragment
```{r}
#|label: dates and data mod for plot
lf_plt <- lf_all |>
  mutate(date_som = ym(paste(Year, month)),          # create som date var
         date = ceiling_date(date_som, "month")-1,   # create eom date var
         empM = (emp/1000) |> round(2),              # convert counts to millions
         unempM = (unemp/1000) |> round(2)) |>
  select(date, empM, unempM) |>                      # select vars and reshape
  pivot_longer(cols=empM:unempM, names_to = "type", values_to = "count") |>
  mutate(type = factor(type,                         # create factor var for plot
                       levels = c("unempM", "empM"),
                       labels = c("Unemployed", "Employed")))
head(lf_plt, 4) # examine first 4 rows
```
:::

## Code for Polished Area Plot for Slides

- Useful for data that sum to a whole: **Employed + Unemployed = Total Labor Force**

::: fragment
```{r plot code for lf area plot}
lf_area_plt_slides <- lf_plt |>
  ggplot() +
  geom_area(aes(x=date, y=count, fill=type)) +
  theme_classic() +
  theme(legend.position="bottom") +
  scale_fill_manual(values=c("red", "blue")) +
  scale_x_date(date_breaks = "year", date_labels = "%Y") +
  labs(x="Date", y = "Number of People (Millions)", fill="",
       title="Total Labor Force: Employed and Unemployed",
       subtitle="Jan. 2014 - June 2024",
       caption="Data Source: www.bls.gov") +
  theme(plot.title = element_text(size = 20),
        plot.subtitle = element_text(size = 15),
        axis.title = element_text(size=18),
        axis.text = element_text(size=15),
        plot.caption = element_text(size = 10),
        legend.text = element_text(size = 12),
        panel.border = element_rect(colour = "lightgrey", fill=NA, linewidth=2),
        plot.background = element_rect(colour = "darkgrey", fill=NA, linewidth=2))
```
:::

## Area Plot Formatted for Slides

```{r echo=F, fig.dim=c(15,7)}
#|label: display of final area plot
lf_area_plt_slides
```

## 

### Area Plot for HTML, Documents and Export

- Additional formatting in previous slides can always be added.
- Plot exported using `ggsave`, which by default exports the last plot created.

::: fragment
```{r}
#|label: simpler plot code with ggsave export
lf_area_plt <- lf_plt |>
  ggplot() +
  geom_area(aes(x=date, y=count, fill=type)) +
  theme_classic() +
  theme(legend.position="bottom") +
  scale_fill_manual(values=c("red", "blue")) +
  scale_x_date(date_breaks = "year", date_labels = "%Y") +
  labs(x="Date", y = "Number of People (Millions)", fill="",
       title="Total Labor Force: Employed and Unemployed",
       subtitle="Jan. 2014 - Jun. 2024",
       caption="Data Source: www.bls.gov") +
  theme(plot.title = element_text(size = 20),
        plot.subtitle = element_text(size = 15),
        axis.title = element_text(size=18),
        axis.text = element_text(size=15),
        plot.caption = element_text(size = 10),
        legend.text = element_text(size = 12))

ggsave("img/labor_force_area_plot.png", width=6, height=4)
```
:::

## Exported Plot

- Looks fine in HTML notes but not slides
- May be fine in Word Document or Dashboard
- If not, previous code shows additional options for formatting

::: fragment
```{r fig.dim=c(15,6), echo=F}
#|label: display of exported plot
lf_area_plt
```
:::

## Week 8 In-class Exercise

In this exercise we will:

1. Import `labor_tidy.csv`, convert variables to millions, round to 2 decimal places, and select two variables.
   (Review)

   - OPTIONAL: use the provided example to create an END of Month (eom) date variable and use that.

::: fragment
```{r}
#|label: import labor_tidy and modify variables
labor_new <- read_csv("data/labor_tidy.csv", show_col_types=F) |>
  mutate(date = ym(paste(Year,month)),
         lfM = (lf/1000) |> round(2),
         empM = (emp/1000) |> round(2)) |>
  select(date, lfM, empM)
```
:::

2. Convert `labor_new` to an `xts` format, `labor_xts`

::: fragment
```{r}
#|label: create labor_xts

```

```{r solution, echo=F}
#|label: create labor_xts sol'n
# call xts() directly: piping labor_new in would pass it as an extra, unintended argument
labor_xts <- xts(x=labor_new[,2:3], order.by=labor_new$date)
```
:::

## 

### In-class Exercise Cont'd

4. Create an unformatted `hchart` OR a `dygraph` with two variables

   - Plot `lfM` and `empM` and save it as `labor_hc` or `labor_dy`

::: fragment
```{r}
#|label: create and display labor hchart
# (labor_hc <- hchart()) or (labor_dy <- dygraph())

```
:::

## Basic `hchart`

```{r echo=F, fig.dim=c(15,5)}
#|label: create and display labor hchart sol'n
# create labor and emp plot and print to screen
(labor_hc <- hchart(labor_xts$lfM, name="Tot. Labor Force (mill.)", color="red") |>
  hc_add_series(labor_xts$empM, name="Employed (mill.)", color="blue"))
```

## Basic `dygraph`

```{r echo=F, fig.dim=c(15,5)}
#|label: create and display basic dygraph sol'n
(labor_dy <- dygraph(labor_xts, main="Total Labor and Employed") |>
  dySeries("lfM", label="Total Labor", color= "red") |>
  dySeries("empM", label="Employed", color= "blue") |>
  dyRangeSelector())
```

## 

### In-class Exercise - Final Steps

5. Submit screenshots of plot from `Viewer` pane.

6. Save R code as an R Script. In the R project folder I have saved an R Script for your work (Updated October 2025).

   - Copy and paste code into the provided R Script and use `save as` to save the file with your name, e.g.
     `Week_8_In_Class_Penelope_Pooler.R`

   - **R Script should include:**

     - **code I provided** to import and modify data
     - **tibble to xts conversion of labor dataset**
     - **hchart OR dygraph plot** code with comments

   - Submit final script on Blackboard (counts towards class participation for Week 8)

   - Due by Friday 10/17. No late submissions accepted for In-class Exercises.

## Quarto, R Markdown files and R Scripts

- Quarto and Markdown files are 'smart', i.e. aware of where they are located.
- R Scripts (older common file type) are useful BUT not aware of file location.
  - User must specify working directory
  - The script I provided is saved to your working directory
  - To check working directory: `getwd()`
  - To set working directory to code_data_output folder: (for working in an R Script)
    - Click Session \> Set Working Directory \> To Source File Location

::: fragment
**NOTES:**
:::

- R users and developers do not recommend setting working directories within code, because the path would have to be changed for each laptop.
- Whenever possible, use R Projects and 'smart' files such as `.qmd` and `.Rmd` files.

## 

### Key Points from Weeks 7 and 8

::: fragment
**Time Series Data**
:::

- Importing stock data from Yahoo Finance as `xts`
- Converting between `xts` and `tibble`
- Plotting options include area plots, hcharts and dygraphs
- `dygraphs` and `hcharts` are useful tools for understanding, managing, and curating time series data.
- HW 4 due Wednesday, 10/15.
  - Grace period in effect.
  - TAs and I are available to assist if you have questions.

<br>

::: fragment
You may submit an 'Engagement Question' about each lecture until midnight on the day of the lecture. **A minimum of four submissions are required during the semester.**
:::
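
The conversions listed in the key points can be tried without any data files. The sketch below is an illustration only: the tibble `toy` and its values are made up (they are not the labor data), and the names `toy_xts` and `toy_back` are hypothetical. It assumes the `tibble`, `dplyr`, `lubridate`, and `xts` packages are installed, and it reuses the same calls shown in the lecture: `ym()` and `ceiling_date()` for an end-of-month date, `xts()` for tibble to `xts`, and `fortify.zoo()` for `xts` back to a tibble.

```r
library(tibble)
library(dplyr)
library(lubridate)
library(xts)    # attaches zoo, which provides fortify.zoo()

# made-up monthly values, for illustration only (NOT real labor data)
toy <- tibble(Year  = c(2024, 2024, 2024),
              month = c("Jan", "Feb", "Mar"),
              lfM   = c(10.5, 10.7, 10.6),
              empM  = c(10.0, 10.1, 10.2)) |>
  mutate(date_som = ym(paste(Year, month)),              # start of month date
         date = ceiling_date(date_som, "month") - 1) |>  # end of month date
  select(date, lfM, empM)

# tibble -> xts: the dates become the row index instead of a column
toy_xts <- xts(x = toy[, 2:3], order.by = toy$date)

# xts -> tibble: fortify.zoo() returns the index as a column named Index
toy_back <- toy_xts |>
  fortify.zoo() |>
  as_tibble() |>
  rename("date" = "Index")

head(toy_xts)
head(toy_back)
```

Once a dataset is in `xts` format like `toy_xts`, it can be passed to `hchart` or `dygraph` in the same way as the stock and labor examples.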
