This document demonstrates how to reshape stock price data from wide
format to long format using the tidyverse
package in R. The
dataset contains weekly closing prices for five US-listed companies
throughout 2019.
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.1 ✔ stringr 1.5.2
## ✔ ggplot2 4.0.0 ✔ tibble 3.3.0
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
## ✔ purrr 1.1.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(dplyr)
stock_df <- read_csv("C:/Users/dell/Downloads/stock_df.csv")
## Rows: 5 Columns: 106
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): company
## dbl (105): 2019_week1, 2019_week2, 2019_week3, 2019_week4, 2019_week5, 2019_...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
stock_df
## # A tibble: 5 × 106
## company `2019_week1` `2019_week2` `2019_week3` `2019_week4` `2019_week5`
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Amazon 1848. 1641. 1696. 1671. 1626.
## 2 Apple 73.4 38.1 39.2 39.4 41.6
## 3 Facebook 205. 144. 150. 149. 166.
## 4 Google 1337. 1057. 1098. 1091. 1111.
## 5 Microsoft 158. 103. 108. 107. 103.
## # ℹ 100 more variables: `2019_week6` <dbl>, `2019_week7` <dbl>,
## # `2019_week8` <dbl>, `2019_week9` <dbl>, `2019_week10` <dbl>,
## # `2019_week11` <dbl>, `2019_week12` <dbl>, `2019_week13` <dbl>,
## # `2019_week14` <dbl>, `2019_week15` <dbl>, `2019_week16` <dbl>,
## # `2019_week17` <dbl>, `2019_week18` <dbl>, `2019_week19` <dbl>,
## # `2019_week20` <dbl>, `2019_week21` <dbl>, `2019_week22` <dbl>,
## # `2019_week23` <dbl>, `2019_week24` <dbl>, `2019_week25` <dbl>, …
The dataset contains:
2019_week1
, 2019_week2
, etc.stock_df_long <- stock_df %>%
pivot_longer(
cols = !company,
names_to = c("year", "week"),
names_sep = "_week",
names_transform = list(year = as.integer, week = as.integer),
values_to = "price"
)
stock_df_long
## # A tibble: 525 × 4
## company year week price
## <chr> <int> <int> <dbl>
## 1 Amazon 2019 1 1848.
## 2 Amazon 2019 2 1641.
## 3 Amazon 2019 3 1696.
## 4 Amazon 2019 4 1671.
## 5 Amazon 2019 5 1626.
## 6 Amazon 2019 6 1588.
## 7 Amazon 2019 7 1608.
## 8 Amazon 2019 8 1632.
## 9 Amazon 2019 9 1672.
## 10 Amazon 2019 10 1621.
## # ℹ 515 more rows
The transformed dataset now has:
company
, year
,
week
, and price
summary(stock_df_long)
## company year week price
## Length:525 Min. :2019 Min. : 1.00 Min. : 38.07
## Class :character 1st Qu.:2019 1st Qu.:14.00 1st Qu.: 136.62
## Mode :character Median :2020 Median :27.00 Median : 212.48
## Mean :2020 Mean :26.75 Mean : 804.30
## 3rd Qu.:2020 3rd Qu.:40.00 3rd Qu.:1479.23
## Max. :2020 Max. :53.00 Max. :3401.80
dim(stock_df_long)
## [1] 525 4
head(stock_df_long, 10)
## # A tibble: 10 × 4
## company year week price
## <chr> <int> <int> <dbl>
## 1 Amazon 2019 1 1848.
## 2 Amazon 2019 2 1641.
## 3 Amazon 2019 3 1696.
## 4 Amazon 2019 4 1671.
## 5 Amazon 2019 5 1626.
## 6 Amazon 2019 6 1588.
## 7 Amazon 2019 7 1608.
## 8 Amazon 2019 8 1632.
## 9 Amazon 2019 9 1672.
## 10 Amazon 2019 10 1621.
ggplot(stock_df_long, aes(x = week, y = price, color = company)) +
geom_line(linewidth = 1) +
labs(
title = "Weekly Stock Prices for Five US Companies in 2019",
x = "Week",
y = "Price (USD)",
color = "Company"
) +
theme_minimal() +
theme(
plot.title = element_text(hjust = 0.5, face = "bold"),
legend.position = "bottom"
)
The pivot_longer()
function transforms the data by:
We successfully transformed the stock price data from wide format (105 columns) to long format (4 columns), making it easier to analyze and visualize.
sessionInfo()
## R version 4.5.1 (2025-06-13 ucrt)
## Platform: x86_64-w64-mingw32/x64
## Running under: Windows 11 x64 (build 22631)
##
## Matrix products: default
## LAPACK version 3.12.1
##
## locale:
## [1] LC_COLLATE=English_United States.utf8
## [2] LC_CTYPE=English_United States.utf8
## [3] LC_MONETARY=English_United States.utf8
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United States.utf8
##
## time zone: Asia/Ulaanbaatar
## tzcode source: internal
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] lubridate_1.9.4 forcats_1.0.1 stringr_1.5.2 dplyr_1.1.4
## [5] purrr_1.1.0 readr_2.1.5 tidyr_1.3.1 tibble_3.3.0
## [9] ggplot2_4.0.0 tidyverse_2.0.0
##
## loaded via a namespace (and not attached):
## [1] bit_4.6.0 gtable_0.3.6 jsonlite_2.0.0 crayon_1.5.3
## [5] compiler_4.5.1 tidyselect_1.2.1 parallel_4.5.1 jquerylib_0.1.4
## [9] scales_1.4.0 yaml_2.3.10 fastmap_1.2.0 R6_2.6.1
## [13] labeling_0.4.3 generics_0.1.4 knitr_1.50 bslib_0.9.0
## [17] pillar_1.11.0 RColorBrewer_1.1-3 tzdb_0.5.0 rlang_1.1.6
## [21] utf8_1.2.6 stringi_1.8.7 cachem_1.1.0 xfun_0.53
## [25] sass_0.4.10 S7_0.2.0 bit64_4.6.0-1 timechange_0.3.0
## [29] cli_3.6.5 withr_3.0.2 magrittr_2.0.3 digest_0.6.37
## [33] grid_4.5.1 vroom_1.6.6 rstudioapi_0.17.1 hms_1.1.3
## [37] lifecycle_1.0.4 vctrs_0.6.5 evaluate_1.0.5 glue_1.8.0
## [41] farver_2.1.2 rmarkdown_2.29 tools_4.5.1 pkgconfig_2.0.3
## [45] htmltools_0.5.8.1