LA2

Problem Statement

Create an animated time-series plot to visualize stock price changes over time for five different companies.

Step 1: Library Initialization

  • Required packages are loaded for data manipulation, visualization, and animation.
  • ggplot2 provides a layered plotting system.
  • dplyr provides data-transformation verbs such as filter(), mutate(), and summarise().
  • gganimate adds animation capabilities to ggplot2 plots.
  • gifski supplies the GIF renderer used when the animation is exported.
library(ggplot2)
library(dplyr)

library(gganimate)
library(gifski)

Step 2: Data Import

  • read.csv() reads the CSV file into a data frame named data.
  • Each row holds one company's prices for one trading day.
  • The file path is specific to the author's machine and must be adjusted when the code is run elsewhere.
data <- read.csv("C:/Users/hp/Downloads/World-Stock-Prices-Dataset.csv")

Step 3: Data Inspection

  • Column names are examined to understand available variables.
  • A preview of the dataset is generated to verify correct loading.
  • This ensures proper referencing of variables in later steps.
colnames(data)
 [1] "Date"          "Open"          "High"          "Low"          
 [5] "Close"         "Volume"        "Brand_Name"    "Ticker"       
 [9] "Industry_Tag"  "Country"       "Dividends"     "Stock.Splits" 
[13] "Capital.Gains"
head(data)
                       Date    Open   High      Low  Close   Volume Brand_Name
1 2025-07-03 00:00:00-04:00   6.630   6.74   6.6150   6.64  4209664    peloton
2 2025-07-03 00:00:00-04:00 106.750 108.37 106.3301 107.34   560190      crocs
3 2025-07-03 00:00:00-04:00 122.630 123.05 121.5500 121.93    36600     adidas
4 2025-07-03 00:00:00-04:00 221.705 224.01 221.3600 223.41 29295154     amazon
5 2025-07-03 00:00:00-04:00 212.145 214.65 211.8101 213.55 34697317      apple
6 2025-07-03 00:00:00-04:00  76.265  77.03  75.5800  76.39 11545304       nike
  Ticker Industry_Tag Country Dividends Stock.Splits Capital.Gains
1   PTON      fitness     usa         0            0            NA
2   CROX     footwear     usa         0            0            NA
3  ADDYY      apparel germany         0            0            NA
4   AMZN   e-commerce     usa         0            0            NA
5   AAPL   technology     usa         0            0            NA
6    NKE      apparel     usa         0            0            NA

Step 4: Data Extraction and Preparation

  • Relevant variables (Date, Ticker, Close) are selected from the dataset.
  • The remaining ten columns are dropped to simplify analysis.
  • Ticker is renamed to Company and Close to Price, the names used in later steps; the result is stored as data_clean.
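
Later steps refer to a data frame named data_clean with the columns Date, Company, and Price. The following is a minimal sketch of how such a frame could be built, assuming Ticker becomes Company and Close becomes Price. The name data_clean_sketch and the two-row raw frame are illustrative stand-ins, not the author's code.

```r
library(dplyr)

# Two sample rows with a subset of the raw columns (stand-in for the full CSV)
raw <- data.frame(
  Date    = c("2025-07-03 00:00:00-04:00", "2025-07-03 00:00:00-04:00"),
  Open    = c(221.705, 212.145),
  Close   = c(223.41, 213.55),
  Ticker  = c("AMZN", "AAPL"),
  Country = c("usa", "usa")
)

# select() keeps only the listed columns; new_name = old_name renames while selecting
data_clean_sketch <- raw %>%
  select(Date, Company = Ticker, Price = Close)

names(data_clean_sketch)
# "Date" "Company" "Price"
```

Selecting and renaming in one select() call keeps the step to a single expression; a separate rename() call would work equally well.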

Description of Variables

  • Date
    Represents the specific trading day on which stock data is recorded.

  • Ticker
    A unique symbol assigned to each company in the stock market (e.g., AAPL for Apple, TSLA for Tesla).

  • Close
    Represents the closing price of a stock on a given day.

Step 5: Data Type Conversion

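  • The Date column, read as text such as "2025-07-03 00:00:00-04:00", is converted to R's Date class; the time and UTC offset are dropped.
  • Date values sort and compare chronologically.
  • This is required for the date filter and the time-series animation in later steps.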
data_clean$Date <- as.Date(data_clean$Date)
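
As a small illustration of what this conversion does to one timestamp string from the preview: as.Date() reads the leading year-month-day part and ignores the trailing characters. The time and offset are discarded rather than converted, so this sketch assumes the calendar date in the string is the trading day of interest.

```r
# One timestamp string in the form shown by head(data)
stamp <- "2025-07-03 00:00:00-04:00"

# The default formats parse the leading "%Y-%m-%d" part; the rest is ignored
parsed <- as.Date(stamp)

class(parsed)   # "Date"
format(parsed)  # "2025-07-03"

# Date values compare chronologically
parsed >= as.Date("2023-01-01")  # TRUE
```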

Step 6: Exploration of Unique Entities

  • Unique company identifiers are extracted from the dataset.
  • This supports selection of valid companies for analysis.
  • Ensures that chosen companies exist in the dataset.
unique(data_clean$Company)
 [1] "PTON"  "CROX"  "ADDYY" "AMZN"  "AAPL"  "NKE"   "TGT"   "GOOGL" "SPOT" 
[10] "ZM"    "DIS"   "RBLX"  "DAL"   "COST"  "LUV"   "AEO"   "TSLA"  "SBUX" 
[19] "NVDA"  "CRM"   "HMC"   "CL"    "HSY"   "CMG"   "PINS"  "LOGI"  "SHOP" 
[28] "AMD"   "AXP"   "COIN"  "MA"    "MCD"   "ADBE"  "UL"    "CSCO"  "JPM"  
[37] "ABNB"  "MAR"   "TM"    "HLT"   "HD"    "JNJ"   "UBER"  "PG"    "FDX"  
[46] "MMM"   "PHG"   "FL"    "KO"    "MSFT"  "V"     "LVMUY" "ZI"    "UBSFY"
[55] "NFLX"  "PMMAF" "NTDOY" "BAMXF" "POAHY" "TWTR"  "JWN"   "SQ"   

Step 7: Data Filtering (Company Selection)

  • The dataset is filtered to include only five selected companies.
  • %in% tests each Company value against the vector of five tickers and returns TRUE for the rows to keep.
  • This reduces clutter and improves visualization clarity.
data_filtered <- data_clean %>%
  filter(Company %in% c("AAPL", "TSLA", "GOOGL", "AMZN", "MSFT"))
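
The matching behaviour of %in% can be seen on a short vector. This sketch uses base R only; the variable names tickers, selected, keep, and missing_tickers are illustrative. The setdiff() line is an optional check that every selected ticker is present in the data.

```r
# A few tickers as they might appear in the Company column
tickers  <- c("PTON", "AAPL", "NKE", "TSLA")
selected <- c("AAPL", "TSLA", "GOOGL", "AMZN", "MSFT")

# One logical value per element of tickers
keep <- tickers %in% selected
keep           # FALSE TRUE FALSE TRUE
tickers[keep]  # "AAPL" "TSLA"

# Selected tickers that do not occur in the data at all
missing_tickers <- setdiff(selected, unique(tickers))
missing_tickers  # "GOOGL" "AMZN" "MSFT"
```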

Step 8: Temporal Filtering

  • The dataset is restricted to trading days on or after 2023-01-01.
  • Fewer rows mean fewer points to draw, so the animation renders faster.
  • A shorter window also makes recent trends easier to read.
data_filtered <- data_filtered %>%
  filter(Date >= as.Date("2023-01-01"))  # keep data from 2023 onward

Step 9: Visualization and Animation Construction

  • A time-series plot is created using ggplot2.
  • geom_line() draws one line per company, colored by Company.
  • labs() defines the title, subtitle, axis labels, and legend title.
  • theme_minimal() applies a theme with little non-data ink.
  • transition_reveal(Date) reveals each line progressively along the Date axis.
  • The {frame_along} placeholder in the subtitle is replaced with the date reached in each frame.
p <- ggplot(data_filtered, aes(x = Date, y = Price, color = Company)) +
  geom_line(linewidth = 1) +  # linewidth replaces size for lines since ggplot2 3.4.0
  labs(
    title = "Stock Price Changes Over Time",
    subtitle = "Date: {frame_along}",
    x = "Date",
    y = "Closing Price",
    color = "Company"
  ) +
  theme_minimal() +
  transition_reveal(Date)
# Show static plot for R Markdown
p

Step 10: Exporting Animation

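  • animate() renders the animation, using gifski_renderer() to produce a GIF.
  • anim_save() writes the rendered animation to stock_animation.gif in the working directory.
  • If no animation object is passed, anim_save() falls back to last_animation(), the most recently rendered animation.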
anim <- animate(p, renderer = gifski_renderer())
anim_save("stock_animation.gif", anim)

Step 11: Statistical Aggregation (Average Price)

  • The dataset is grouped by Company for analysis.
  • summarise() computes the mean closing price per company over the filtered period (2023 onward).
  • This gives one summary value per company for comparison.
data_filtered %>%
  group_by(Company) %>%
  summarise(avg_price = mean(Price))
# A tibble: 5 × 2
  Company avg_price
  <chr>       <dbl>
1 AAPL         197.
2 AMZN         167.
3 GOOGL        150.
4 MSFT         382.
5 TSLA         247.

Step 12: Derived Metric (Daily Price Change)

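  • A new variable Change is created using mutate().
  • lag() returns the Price from the preceding row within each company.
  • The difference is the day-over-day change only when rows are sorted by ascending Date within each company. The preview starts at 2025-07-03, so the rows are not in ascending order and should be sorted with arrange(Company, Date) before mutate().
  • The first row of each company has no preceding row, so its Change is NA.
  • AAPL appears twice on 2025-07-03 (rows 2 and 6), which suggests duplicate rows in the source data.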
data_filtered <- data_filtered %>%
  group_by(Company) %>%
  mutate(Change = Price - lag(Price))
head(data_filtered)
# A tibble: 6 × 4
# Groups:   Company [5]
  Date       Company Price Change
  <date>     <chr>   <dbl>  <dbl>
1 2025-07-03 AMZN     223.     NA
2 2025-07-03 AAPL     214.     NA
3 2025-07-03 GOOGL    180.     NA
4 2025-07-03 TSLA     315.     NA
5 2025-07-03 MSFT     499.     NA
6 2025-07-03 AAPL     214.      0
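
Because lag() works on row order, a common pattern is to sort each company's rows by date before taking the difference. The following is a minimal sketch on toy data; the prices are invented for illustration. The distinct() step is an assumption prompted by AAPL appearing twice on 2025-07-03 in the preview above, and the names toy and toy_changes are hypothetical.

```r
library(dplyr)

# Toy data in newest-first order with one repeated row (illustrative values)
toy <- data.frame(
  Date    = as.Date(c("2025-07-03", "2025-07-02", "2025-07-01",
                      "2025-07-03", "2025-07-02", "2025-07-01",
                      "2025-07-03")),
  Company = c("AAPL", "AAPL", "AAPL", "MSFT", "MSFT", "MSFT", "AAPL"),
  Price   = c(213, 212, 208, 499, 492, 493, 213)
)

toy_changes <- toy %>%
  distinct(Date, Company, .keep_all = TRUE) %>%  # drop repeated Date/Company rows
  arrange(Company, Date) %>%                     # oldest first within each company
  group_by(Company) %>%
  mutate(Change = Price - lag(Price)) %>%        # subtract the previous day's price
  ungroup()

toy_changes$Change
# NA  4  1 NA -1  7
```

With the rows sorted oldest first, each Change value is the current day's price minus the previous trading day's price, and only the earliest row per company is NA.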