Problem Statement
Create an animated time-series plot to visualize stock price changes over time for five different companies.
Step 1: Library Initialization
- Required packages are loaded to enable data manipulation, visualization, and animation.
- ggplot2 provides a layered plotting system.
- dplyr enables efficient data transformation.
- gganimate adds animation capabilities to plots.
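library(ggplot2)
library(dplyr)
library(gganimate)  # provides transition_reveal(), animate(), anim_save()
library(gifski)     # GIF backend used by gifski_renderer()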
Attaching package: 'dplyr'
The following objects are masked from 'package:stats':
filter, lag
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
Warning: package 'gganimate' was built under R version 4.5.3
Warning: package 'gifski' was built under R version 4.5.3
Step 2: Data Import
- The dataset is read from a CSV file into a dataframe.
- The dataframe stores structured tabular data for analysis.
- This step initializes the dataset for further processing.
data <- read.csv("C:/Users/hp/Downloads/World-Stock-Prices-Dataset.csv")
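Later steps depend on how read.csv() types each column, so a small sketch of the import may help. The temporary file and its two rows are made up for illustration; only read.csv() itself comes from the step above.

```r
# Hypothetical stand-in for the real CSV: two rows written to a temp file
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "Date,Close,Ticker",
  "2025-07-03 00:00:00-04:00,213.55,AAPL",
  "2025-07-03 00:00:00-04:00,223.41,AMZN"
), tmp)

demo <- read.csv(tmp, stringsAsFactors = FALSE)

# Timestamps arrive as character and prices as numeric
sapply(demo, class)
```

Because the Date column is read as text, it has to be converted before it can be used on a time axis.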
Step 3: Data Inspection
- Column names are examined to understand available variables.
- A preview of the dataset is generated to verify correct loading.
- This ensures proper referencing of variables in later steps.
colnames(data)
[1] "Date" "Open" "High" "Low"
[5] "Close" "Volume" "Brand_Name" "Ticker"
[9] "Industry_Tag" "Country" "Dividends" "Stock.Splits"
[13] "Capital.Gains"
head(data)
                       Date    Open   High      Low  Close   Volume Brand_Name
1 2025-07-03 00:00:00-04:00   6.630   6.74   6.6150   6.64  4209664    peloton
2 2025-07-03 00:00:00-04:00 106.750 108.37 106.3301 107.34   560190      crocs
3 2025-07-03 00:00:00-04:00 122.630 123.05 121.5500 121.93    36600     adidas
4 2025-07-03 00:00:00-04:00 221.705 224.01 221.3600 223.41 29295154     amazon
5 2025-07-03 00:00:00-04:00 212.145 214.65 211.8101 213.55 34697317      apple
6 2025-07-03 00:00:00-04:00  76.265  77.03  75.5800  76.39 11545304       nike
  Ticker Industry_Tag Country Dividends Stock.Splits Capital.Gains
1   PTON      fitness     usa         0            0            NA
2   CROX     footwear     usa         0            0            NA
3  ADDYY      apparel germany         0            0            NA
4   AMZN   e-commerce     usa         0            0            NA
5   AAPL   technology     usa         0            0            NA
6    NKE      apparel     usa         0            0            NA
Step 4: Data Extraction and Preparation
- Relevant variables (Date, Ticker, Close) are selected from the dataset.
- Unnecessary columns are removed to simplify analysis.
- Ticker and Close are renamed to Company and Price for better readability.
Description of Variables
- Date: the specific trading day on which stock data is recorded.
- Ticker: a unique symbol assigned to each company in the stock market (e.g., AAPL for Apple, TSLA for Tesla).
- Close: the closing price of a stock on a given day.
How These Variables Interlink
Each Date corresponds to a specific record of a company’s stock.
Ticker links that record to a particular company.
Close gives the stock price for that company on that date.
Together, they form a relationship: Price of a company (Ticker) at a given time (Date).
data_clean <- data %>%
  select(Date, Ticker, Close)
colnames(data_clean) <- c("Date", "Company", "Price")
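Selection and renaming can also be done in a single call, since select() accepts new_name = old_name pairs. The sketch below uses a made-up two-row data frame (toy) in place of the real dataset.

```r
library(dplyr)

# Toy stand-in for the full dataset (values are illustrative)
toy <- data.frame(
  Date   = c("2025-07-03", "2025-07-03"),
  Open   = c(212.145, 221.705),
  Close  = c(213.55, 223.41),
  Ticker = c("AAPL", "AMZN")
)

# Keep three columns and rename two of them in one step
toy_clean <- toy %>%
  select(Date, Company = Ticker, Price = Close)

names(toy_clean)
```

This yields the same three column names as the two-statement version above.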
Step 5: Data Type Conversion
- The Date variable is converted from text into Date format; the time of day and UTC offset in each string are dropped.
- This enables proper chronological ordering.
- It is required for accurate time-series visualization and animation.
data_clean$Date <- as.Date(data_clean$Date)
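The raw strings carry a time and a UTC offset after the calendar date. One way to make the conversion explicit, sketched here on two illustrative strings in the same layout as the data preview, is to keep only the first ten characters and state the format.

```r
# Timestamp strings in the layout shown in the data preview (second value is illustrative)
ts <- c("2025-07-03 00:00:00-04:00", "2023-01-03 00:00:00-05:00")

# Keep the YYYY-MM-DD part and parse it with an explicit format
d <- as.Date(substr(ts, 1, 10), format = "%Y-%m-%d")

class(d)   # the vector now has class "Date"
sort(d)    # Date values sort chronologically, earliest first
```

Stating the format guards against silent misparsing if the file ever uses a different date layout.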
Step 6: Exploration of Unique Entities
- Unique company identifiers are extracted from the dataset.
- This supports selection of valid companies for analysis.
- Ensures that chosen companies exist in the dataset.
unique(data_clean$Company)
[1] "PTON" "CROX" "ADDYY" "AMZN" "AAPL" "NKE" "TGT" "GOOGL" "SPOT"
[10] "ZM" "DIS" "RBLX" "DAL" "COST" "LUV" "AEO" "TSLA" "SBUX"
[19] "NVDA" "CRM" "HMC" "CL" "HSY" "CMG" "PINS" "LOGI" "SHOP"
[28] "AMD" "AXP" "COIN" "MA" "MCD" "ADBE" "UL" "CSCO" "JPM"
[37] "ABNB" "MAR" "TM" "HLT" "HD" "JNJ" "UBER" "PG" "FDX"
[46] "MMM" "PHG" "FL" "KO" "MSFT" "V" "LVMUY" "ZI" "UBSFY"
[55] "NFLX" "PMMAF" "NTDOY" "BAMXF" "POAHY" "TWTR" "JWN" "SQ"
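A quick check that every chosen ticker is present can be sketched with setdiff(). The available vector below is a shortened stand-in for unique(data_clean$Company), and "GOOG" is included on purpose as a symbol that is not in it.

```r
# Shortened, illustrative stand-in for unique(data_clean$Company)
available <- c("PTON", "CROX", "AAPL", "TSLA", "MSFT")

# Hypothetical wish list; "GOOG" is deliberately absent from `available`
wanted <- c("AAPL", "TSLA", "GOOG")

# Tickers that were requested but do not exist in the data
missing <- setdiff(wanted, available)
missing
```

An empty result would mean all requested companies exist in the dataset.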
Step 7: Data Filtering (Company Selection)
- The dataset is filtered to include only five selected companies.
- %in% tests each Company value against the vector of selected tickers, so several companies can be matched in one condition.
- This reduces clutter and improves visualization clarity.
data_filtered <- data_clean %>%
  filter(Company %in% c("AAPL", "TSLA", "GOOGL", "AMZN", "MSFT"))
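To see what %in% returns before it is passed to filter(), here is a minimal base R sketch on a short illustrative vector of tickers.

```r
tickers  <- c("PTON", "AAPL", "TSLA", "NKE", "MSFT")     # illustrative subset
selected <- c("AAPL", "TSLA", "GOOGL", "AMZN", "MSFT")

# One TRUE/FALSE per element of `tickers`
keep <- tickers %in% selected
keep

# Logical indexing keeps only the matching tickers
tickers[keep]
```

filter() applies the same logical vector row by row, keeping the rows where it is TRUE.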
Step 8: Temporal Filtering
- The dataset is restricted to a specific time range.
- This reduces data size and improves performance.
- Ensures smoother animation and clearer trends.
# Keep only recent data (from 2023 onward) for clearer and faster analysis
data_filtered <- data_filtered %>%
  filter(Date >= as.Date("2023-01-01"))
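The date cutoff can be sketched on a three-row toy data frame (values are illustrative). Comparing against an explicit Date object avoids relying on implicit conversion of the text "2023-01-01".

```r
library(dplyr)

toy <- data.frame(
  Date  = as.Date(c("2022-12-30", "2023-01-03", "2024-06-14")),
  Price = c(129.9, 125.1, 212.5)   # illustrative values
)

cutoff <- as.Date("2023-01-01")

# Rows dated before the cutoff are dropped
recent <- toy %>%
  filter(Date >= cutoff)

range(recent$Date)
```

Checking range() afterwards is a cheap way to confirm the filter did what was intended.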
Step 9: Visualization and Animation Construction
- A time-series plot is created using ggplot2.
- geom_line() is used to represent stock price trends.
- labs() defines titles and axis labels; {frame_along} in the subtitle is replaced with the date reached in each frame.
- theme_minimal() improves visual clarity.
- transition_reveal() introduces animation over time by revealing the lines gradually along Date.
- animate() renders the animated visualization.
p <- ggplot(data_filtered, aes(x = Date, y = Price, color = Company)) +
  geom_line(linewidth = 1) +  # linewidth replaces size for lines since ggplot2 3.4.0
  labs(
    title = "Stock Price Changes Over Time",
    subtitle = "Date: {frame_along}",
    x = "Date",
    y = "Closing Price",
    color = "Company"
  ) +
  theme_minimal() +
  transition_reveal(Date)
# Printing p renders the animation with default settings in R Markdown
p
Step 10: Exporting Animation
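- The generated animation is saved as a GIF file.
- animate() renders the plot, here with gifski_renderer() so that the result is a GIF.
- anim_save() exports the animation for submission; it falls back to last_animation() (the most recently rendered animation) when no animation is passed, but here anim is passed explicitly.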
anim <- animate(p, renderer = gifski_renderer())
anim_save("stock_animation.gif", anim)
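Rendering the full dataset can be slow, so it may be worth trying the export on a small series first. The sketch below uses a made-up ten-day series; the nframes, fps, width and height values and the temporary output path are arbitrary choices for illustration, not settings from the original analysis.

```r
library(ggplot2)
library(gganimate)

# Toy series: one company, ten days (illustrative values)
toy <- data.frame(
  Date  = as.Date("2023-01-02") + 0:9,
  Price = c(100, 102, 101, 105, 107, 106, 110, 112, 111, 115)
)

p_toy <- ggplot(toy, aes(x = Date, y = Price)) +
  geom_line(linewidth = 1) +
  transition_reveal(Date)

# nframes and fps control length and smoothness; width/height are passed to the device
anim_toy <- animate(p_toy, nframes = 20, fps = 10,
                    width = 480, height = 320,
                    renderer = gifski_renderer())

# Write to a temporary location instead of the working directory
out <- file.path(tempdir(), "toy_animation.gif")
anim_save(out, animation = anim_toy)
```

Fewer frames and a smaller canvas shorten rendering time at the cost of a choppier, lower-resolution GIF.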
Step 11: Statistical Aggregation (Average Price)
- The dataset is grouped by Company for analysis.
- summarise() computes the average closing price of each company over the filtered period.
- This provides a summary measure for comparison.
data_filtered %>%
  group_by(Company) %>%
  summarise(avg_price = mean(Price))
# A tibble: 5 × 2
  Company avg_price
  <chr>       <dbl>
1 AAPL         197.
2 AMZN         167.
3 GOOGL        150.
4 MSFT         382.
5 TSLA         247.
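The same grouping pattern extends to several summaries at once. This sketch runs on a made-up four-row data frame with one missing price; na.rm = TRUE and the n_days column are additions for illustration and are not part of the original analysis.

```r
library(dplyr)

toy <- data.frame(
  Company = c("AAPL", "AAPL", "MSFT", "MSFT"),
  Price   = c(190, 210, 380, NA)   # illustrative values, one missing
)

toy_summary <- toy %>%
  group_by(Company) %>%
  summarise(
    avg_price = mean(Price, na.rm = TRUE),  # skip missing prices
    n_days    = n(),                        # rows per company, including missing
    .groups   = "drop"                      # return an ungrouped tibble
  )

toy_summary
```

Without na.rm = TRUE, a single missing price would turn that company's average into NA.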
Step 12: Derived Metric (Daily Price Change)
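- A new variable Change is created using mutate().
- lag() retrieves the price from the preceding row within each company.
- The difference is a day-over-day change only when rows are sorted by ascending Date; the preview shows the most recent date first, so sorting beforehand is advisable.
- The first row of each company has no preceding row, so its Change is NA.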
data_filtered <- data_filtered %>%
  group_by(Company) %>%
  mutate(Change = Price - lag(Price))
head(data_filtered)
# A tibble: 6 × 4
# Groups:   Company [5]
  Date       Company Price Change
  <date>     <chr>   <dbl>  <dbl>
1 2025-07-03 AMZN     223.     NA
2 2025-07-03 AAPL     214.     NA
3 2025-07-03 GOOGL    180.     NA
4 2025-07-03 TSLA     315.     NA
5 2025-07-03 MSFT     499.     NA
6 2025-07-03 AAPL     214.      0
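Because lag() works on row order rather than on dates, the output above pairs each row with whichever row precedes it in the file. A minimal sketch of one way to get a true day-over-day change, assuming one row per company per day, is to sort by date first. The toy prices below are made up.

```r
library(dplyr)

# Toy data in descending date order, like the preview above (illustrative prices)
toy <- data.frame(
  Date    = as.Date(c("2025-07-03", "2025-07-02", "2025-07-01")),
  Company = "AAPL",
  Price   = c(214, 212, 208)
)

toy_change <- toy %>%
  arrange(Company, Date) %>%             # oldest row first within each company
  group_by(Company) %>%
  mutate(Change = Price - lag(Price)) %>%  # today's price minus the previous day's
  ungroup()

toy_change
```

If the real data contains repeated rows for the same company and date, as the Change of 0 in the last row above suggests, removing them (for example with distinct()) before this step would be a reasonable extra precaution.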