Executive Summary

This report analyzes S&P 500 stocks from 2020-2024 to identify “Compounders” - companies with consistent earnings growth and stable Return on Equity (ROE). The analysis includes:

  • Task 1: Data transformation from wide to long format for time-series analysis
  • Task 2: Financial logic calculations to classify growth status
  • Task 3: Interactive visualizations showing ROE trends and valuation metrics

Data Loading and Preparation

# Load required libraries
library(tidyverse)
library(plotly)
library(DT)
library(scales)
library(knitr)

# Read the data
data <- read.csv("sp500_sampled_data_2020_2024.txt", stringsAsFactors = FALSE)

# Display raw data
kable(head(data, 10), caption = "Sample of Raw Stock Data (Wide Format)")
Sample of Raw Stock Data (Wide Format)
Symbol Company Sector PE_Current ROE_2020 ROE_2021 ROE_2022 ROE_2023 ROE_2024 EPS_2020 EPS_2021 EPS_2022 EPS_2023 EPS_2024
AAPL Apple Inc. Technology 28.5 0.73 0.85 1.20 1.35 1.45 3.20 4.5 5.60 6.1 6.5
MSFT Microsoft Corp. Technology 35.2 0.35 0.38 0.40 0.42 0.44 5.10 6.2 8.05 9.6 11.1
AMZN Amazon.com Inc. Consumer Disc. 75.0 0.20 0.22 0.05 0.12 0.15 1.15 2.1 0.50 1.5 2.8
GOOGL Alphabet Inc. Technology 24.5 0.18 0.25 0.22 0.24 0.27 2.80 4.1 4.50 5.2 6.1
TSLA Tesla Inc. Consumer Disc. 68.5 0.05 0.15 0.25 0.20 0.21 0.25 0.8 1.90 2.4 3.1
JPM JPMorgan Chase Financials 11.5 0.12 0.15 0.14 0.16 0.17 8.50 10.1 11.50 12.8 14.5
JNJ Johnson & Johnson Health Care 15.8 0.22 0.24 0.23 0.25 0.26 6.50 7.1 7.50 8.1 8.5
PG Procter & Gamble Consumer Staples 24.5 0.28 0.29 0.30 0.31 0.32 4.80 5.1 5.40 5.9 6.4
XOM Exxon Mobil Energy 10.2 -0.05 0.08 0.25 0.20 0.18 -0.50 3.5 8.10 7.2 6.8
NVDA NVIDIA Corp. Technology 95.0 0.25 0.35 0.45 0.55 0.65 0.80 1.2 2.50 4.1 8.5

Task 1: Data Wrangling - The “Pivot”

Objective

Transform data from wide format (years as columns) to long format (years as rows) to enable time-series visualization.

Wide Format (Before)

kable(data[1:3, c("Symbol", "Company", "ROE_2020", "ROE_2021", "ROE_2022", "ROE_2023", "ROE_2024")],
      caption = "Wide Format - Years as Columns")
Wide Format - Years as Columns
Symbol Company ROE_2020 ROE_2021 ROE_2022 ROE_2023 ROE_2024
AAPL Apple Inc. 0.73 0.85 1.20 1.35 1.45
MSFT Microsoft Corp. 0.35 0.38 0.40 0.42 0.44
AMZN Amazon.com Inc. 0.20 0.22 0.05 0.12 0.15

Transformation Code

# Pivot ROE columns from wide to long
roe_long <- data %>%
  pivot_longer(
    cols = starts_with("ROE_"),
    names_to = "Year",
    values_to = "ROE"
  ) %>%
  mutate(Year = as.numeric(gsub("ROE_", "", Year)))

# Pivot EPS columns
eps_long <- data %>%
  pivot_longer(
    cols = starts_with("EPS_"),
    names_to = "Year",
    values_to = "EPS"
  ) %>%
  mutate(Year = as.numeric(gsub("EPS_", "", Year)))

# Combine both metrics
financial_trends <- roe_long %>%
  left_join(
    eps_long %>% select(Symbol, Year, EPS),
    by = c("Symbol", "Year")
  )

Long Format (After)

kable(head(financial_trends, 10), 
      caption = "Long Format - Perfect for Time-Series Analysis")
Long Format - Perfect for Time-Series Analysis
Symbol Company Sector PE_Current EPS_2020 EPS_2021 EPS_2022 EPS_2023 EPS_2024 Year ROE EPS
AAPL Apple Inc. Technology 28.5 3.2 4.5 5.60 6.1 6.5 2020 0.73 3.20
AAPL Apple Inc. Technology 28.5 3.2 4.5 5.60 6.1 6.5 2021 0.85 4.50
AAPL Apple Inc. Technology 28.5 3.2 4.5 5.60 6.1 6.5 2022 1.20 5.60
AAPL Apple Inc. Technology 28.5 3.2 4.5 5.60 6.1 6.5 2023 1.35 6.10
AAPL Apple Inc. Technology 28.5 3.2 4.5 5.60 6.1 6.5 2024 1.45 6.50
MSFT Microsoft Corp. Technology 35.2 5.1 6.2 8.05 9.6 11.1 2020 0.35 5.10
MSFT Microsoft Corp. Technology 35.2 5.1 6.2 8.05 9.6 11.1 2021 0.38 6.20
MSFT Microsoft Corp. Technology 35.2 5.1 6.2 8.05 9.6 11.1 2022 0.40 8.05
MSFT Microsoft Corp. Technology 35.2 5.1 6.2 8.05 9.6 11.1 2023 0.42 9.60
MSFT Microsoft Corp. Technology 35.2 5.1 6.2 8.05 9.6 11.1 2024 0.44 11.10

Why This Matters: Long format enables ggplot2 to easily create line charts showing trends over time.


Task 2: Financial Logic Calculation

Growth Status Classification

# Calculate growth metrics and status
stock_metrics <- data %>%
  mutate(
    # TASK 2 REQUIREMENT: Growth Status
    Growth_Status = ifelse(EPS_2024 > EPS_2020, "Growing", "Declining"),
    
    # Additional metrics for deeper analysis
    EPS_Growth_5Y = ((EPS_2024 - EPS_2020) / abs(EPS_2020)) * 100,
    Avg_ROE = (ROE_2020 + ROE_2021 + ROE_2022 + ROE_2023 + ROE_2024) / 5,
    ROE_Stability = NA,
    Compounder_Score = NA
  )

# Calculate ROE stability (standard deviation)
for(i in 1:nrow(stock_metrics)) {
  roe_values <- as.numeric(stock_metrics[i, c("ROE_2020", "ROE_2021", "ROE_2022", "ROE_2023", "ROE_2024")])
  stock_metrics$ROE_Stability[i] <- sd(roe_values, na.rm = TRUE)
}

# Calculate Compounder Score
stock_metrics <- stock_metrics %>%
  mutate(
    ROE_Score = scale(Avg_ROE, center = FALSE),
    Stability_Score = scale(-ROE_Stability, center = FALSE),
    Growth_Score = scale(ifelse(is.infinite(EPS_Growth_5Y), NA, EPS_Growth_5Y), center = FALSE),
    Compounder_Score = (ROE_Score + Stability_Score + Growth_Score) / 3
  )

Growth Status Summary

growth_summary <- stock_metrics %>%
  group_by(Growth_Status) %>%
  summarise(
    Count = n(),
    Avg_EPS_Growth = mean(EPS_Growth_5Y[!is.infinite(EPS_Growth_5Y)], na.rm = TRUE),
    Avg_ROE = mean(Avg_ROE, na.rm = TRUE),
    Avg_PE = mean(PE_Current, na.rm = TRUE)
  ) %>%
  mutate(
    Avg_EPS_Growth = round(Avg_EPS_Growth, 1),
    Avg_ROE = round(Avg_ROE * 100, 1),
    Avg_PE = round(Avg_PE, 1)
  )

kable(growth_summary, 
      caption = "Growth Status Summary - Growing vs Declining Stocks",
      col.names = c("Growth Status", "Count", "Avg EPS Growth (%)", "Avg ROE (%)", "Avg P/E"))
Growth Status Summary - Growing vs Declining Stocks
Growth Status Count Avg EPS Growth (%) Avg ROE (%) Avg P/E
Declining 1 -111.1 11.6 85.0
Growing 19 279.0 27.1 31.4

Key Insight: 19 stocks (out of 20) are classified as “Growing” based on 5-year EPS growth.


Task 3: Interactive Dashboard - Visual Analysis

3.1: Individual Company Analysis

Selected Company: Apple Inc. (AAPL)

# Select Apple for detailed analysis
selected_stock <- stock_metrics %>% filter(Symbol == "AAPL")
selected_trends <- financial_trends %>% filter(Symbol == "AAPL")

# Display key metrics
aapl_metrics <- data.frame(
  Metric = c("Current P/E Ratio", "Average ROE (5Y)", "EPS Growth (5Y)", "Growth Status"),
  Value = c(
    paste0(round(selected_stock$PE_Current, 1)),
    paste0(round(selected_stock$Avg_ROE * 100, 1), "%"),
    paste0(round(selected_stock$EPS_Growth_5Y, 1), "%"),
    selected_stock$Growth_Status
  )
)

kable(aapl_metrics, caption = "Apple Inc. (AAPL) - Key Metrics")
Apple Inc. (AAPL) - Key Metrics
Metric Value
Current P/E Ratio 28.5
Average ROE (5Y) 111.6%
EPS Growth (5Y) 103.1%
Growth Status Growing

5-Year ROE Trend - Apple Inc.

plot_ly(selected_trends, x = ~Year, y = ~ROE, type = 'scatter', mode = 'lines+markers',
        marker = list(size = 10, color = '#667eea'),
        line = list(color = '#667eea', width = 3),
        hovertemplate = paste('<b>Year:</b> %{x}<br>',
                            '<b>ROE:</b> %{y:.2%}<br>',
                            '<extra></extra>')) %>%
  layout(
    title = list(text = "Apple Inc. - 5-Year ROE Trend", font = list(size = 18, color = '#333')),
    xaxis = list(title = "Year", dtick = 1),
    yaxis = list(title = "Return on Equity", tickformat = ".0%"),
    hovermode = 'closest',
    plot_bgcolor = '#f8f9fa',
    paper_bgcolor = 'white'
  )

Analysis: Apple shows increasing ROE trend from 73% (2020) to 145% (2024).

5-Year EPS Trend - Apple Inc.

plot_ly(selected_trends, x = ~Year, y = ~EPS, type = 'scatter', mode = 'lines+markers',
        marker = list(size = 10, color = '#28a745'),
        line = list(color = '#28a745', width = 3),
        hovertemplate = paste('<b>Year:</b> %{x}<br>',
                            '<b>EPS:</b> $%{y:.2f}<br>',
                            '<extra></extra>')) %>%
  layout(
    title = list(text = "Apple Inc. - 5-Year EPS Trend", font = list(size = 18, color = '#333')),
    xaxis = list(title = "Year", dtick = 1),
    yaxis = list(title = "Earnings Per Share ($)"),
    hovermode = 'closest',
    plot_bgcolor = '#f8f9fa',
    paper_bgcolor = 'white'
  )

Analysis: EPS grew from $3.2 (2020) to $6.5 (2024), representing a 103.1% increase.


3.2: Market Overview - All Companies

P/E Ratio vs. 5-Year EPS Growth

scatter_data <- stock_metrics %>%
  filter(!is.na(PE_Current), !is.infinite(EPS_Growth_5Y), !is.na(EPS_Growth_5Y))

plot_ly(scatter_data, 
        x = ~EPS_Growth_5Y, 
        y = ~PE_Current,
        type = 'scatter',
        mode = 'markers',
        marker = list(
          size = ~abs(Avg_ROE) * 80,
          color = ~Avg_ROE,
          colorscale = 'Viridis',
          showscale = TRUE,
          colorbar = list(title = "Avg ROE"),
          line = list(color = 'white', width = 1)
        ),
        text = ~paste0("<b>", Company, "</b><br>",
                      "Sector: ", Sector, "<br>",
                      "P/E: ", round(PE_Current, 1), "<br>",
                      "EPS Growth: ", round(EPS_Growth_5Y, 1), "%<br>",
                      "Avg ROE: ", round(Avg_ROE * 100, 1), "%"),
        hoverinfo = 'text') %>%
  layout(
    title = list(text = "P/E Ratio vs. 5-Year EPS Growth - All Stocks", font = list(size = 18)),
    xaxis = list(title = "5-Year EPS Growth (%)"),
    yaxis = list(title = "Current P/E Ratio"),
    hovermode = 'closest',
    plot_bgcolor = '#f8f9fa',
    paper_bgcolor = 'white'
  )

Investment Sweet Spot: Look for stocks in the bottom-right quadrant (high growth, low P/E) - these represent potential value opportunities.

ROE Consistency vs. Average ROE

plot_ly(stock_metrics, 
        x = ~Avg_ROE, 
        y = ~ROE_Stability,
        type = 'scatter',
        mode = 'markers',
        marker = list(
          size = 12,
          color = ~ifelse(Growth_Status == "Growing", '#28a745', '#dc3545'),
          line = list(color = 'white', width = 1)
        ),
        text = ~paste0("<b>", Company, "</b><br>",
                      "Avg ROE: ", round(Avg_ROE * 100, 1), "%<br>",
                      "ROE Volatility: ", round(ROE_Stability, 3), "<br>",
                      "Status: ", Growth_Status),
        hoverinfo = 'text') %>%
  layout(
    title = list(text = "ROE Consistency Analysis", font = list(size = 18)),
    xaxis = list(title = "Average ROE (5 Years)", tickformat = ".0%"),
    yaxis = list(title = "ROE Volatility (Lower is Better)"),
    hovermode = 'closest',
    plot_bgcolor = '#f8f9fa',
    paper_bgcolor = 'white'
  )

Quality Zone: Stocks in the top-right area (high ROE, low volatility) represent consistent performers - true “compounders”.


3.3: Stock Screener - Top Compounders

screener_results <- stock_metrics %>%
  select(
    Symbol,
    Company,
    Sector,
    PE_Current,
    Avg_ROE,
    ROE_Stability,
    EPS_Growth_5Y,
    Growth_Status,
    Compounder_Score
  ) %>%
  arrange(desc(Compounder_Score)) %>%
  mutate(
    PE_Current = round(PE_Current, 1),
    Avg_ROE = paste0(round(Avg_ROE * 100, 1), "%"),
    ROE_Stability = round(ROE_Stability, 3),
    EPS_Growth_5Y = ifelse(is.infinite(EPS_Growth_5Y) | is.na(EPS_Growth_5Y), 
                           "N/A", 
                           paste0(round(EPS_Growth_5Y, 1), "%")),
    Compounder_Score = round(Compounder_Score, 2)
  )

datatable(
  screener_results,
  colnames = c("Symbol", "Company", "Sector", "P/E", "Avg ROE", "ROE Vol.", "EPS Growth", "Status", "Score"),
  options = list(
    pageLength = 20,
    order = list(list(8, 'desc')),
    columnDefs = list(
      list(className = 'dt-center', targets = '_all')
    )
  ),
  caption = "Top Compounders - Ranked by Compounder Score",
  rownames = FALSE,
  class = 'cell-border stripe hover'
) %>%
  formatStyle(
    'Growth_Status',
    backgroundColor = styleEqual(
      c('Growing', 'Declining'),
      c('#d4edda', '#f8d7da')
    ),
    fontWeight = 'bold'
  )

Sector Analysis

sector_summary <- stock_metrics %>%
  group_by(Sector) %>%
  summarise(
    Count = n(),
    Avg_ROE = mean(Avg_ROE, na.rm = TRUE),
    Avg_Growth = mean(EPS_Growth_5Y[!is.infinite(EPS_Growth_5Y)], na.rm = TRUE),
    Growing_Pct = sum(Growth_Status == "Growing") / n() * 100
  ) %>%
  arrange(desc(Avg_ROE))

plot_ly(sector_summary, 
        x = ~reorder(Sector, Avg_ROE), 
        y = ~Avg_ROE,
        type = 'bar',
        marker = list(
          color = ~Avg_ROE,
          colorscale = 'Viridis',
          showscale = TRUE,
          colorbar = list(title = "Avg ROE")
        ),
        text = ~paste0(round(Avg_ROE * 100, 1), "%"),
        textposition = 'outside',
        hovertemplate = paste('<b>%{x}</b><br>',
                            'Avg ROE: %{y:.1%}<br>',
                            '<extra></extra>')) %>%
  layout(
    title = list(text = "Average ROE by Sector (2020-2024)", font = list(size = 18)),
    xaxis = list(title = ""),
    yaxis = list(title = "Average Return on Equity", tickformat = ".0%"),
    plot_bgcolor = '#f8f9fa',
    paper_bgcolor = 'white'
  )

Top Performers by Category

Highest ROE Stocks

top_roe <- stock_metrics %>%
  arrange(desc(Avg_ROE)) %>%
  head(5) %>%
  select(Symbol, Company, Avg_ROE, ROE_Stability, Growth_Status) %>%
  mutate(Avg_ROE = paste0(round(Avg_ROE * 100, 1), "%"),
         ROE_Stability = round(ROE_Stability, 3))

kable(top_roe, 
      caption = "Top 5 Stocks by Average ROE",
      col.names = c("Symbol", "Company", "Avg ROE", "ROE Volatility", "Growth Status"))
Top 5 Stocks by Average ROE
Symbol Company Avg ROE ROE Volatility Growth Status
AAPL Apple Inc. 111.6% 0.313 Growing
HD Home Depot 50.4% 0.036 Growing
NVDA NVIDIA Corp. 45% 0.158 Growing
KO Coca-Cola Co. 40% 0.038 Growing
MSFT Microsoft Corp. 39.8% 0.035 Growing

Fastest Growing (EPS)

top_growth <- stock_metrics %>%
  filter(!is.infinite(EPS_Growth_5Y), !is.na(EPS_Growth_5Y)) %>%
  arrange(desc(EPS_Growth_5Y)) %>%
  head(5) %>%
  select(Symbol, Company, EPS_2020, EPS_2024, EPS_Growth_5Y) %>%
  mutate(EPS_2020 = round(EPS_2020, 2),
         EPS_2024 = round(EPS_2024, 2),
         EPS_Growth_5Y = paste0(round(EPS_Growth_5Y, 1), "%"))

kable(top_growth, 
      caption = "Top 5 Fastest Growing Stocks (EPS)",
      col.names = c("Symbol", "Company", "EPS 2020", "EPS 2024", "5Y Growth"))
Top 5 Fastest Growing Stocks (EPS)
Symbol Company EPS 2020 EPS 2024 5Y Growth
XOM Exxon Mobil -0.50 6.8 1460%
TSLA Tesla Inc. 0.25 3.1 1140%
NVDA NVIDIA Corp. 0.80 8.5 962.5%
DIS Walt Disney -0.80 4.5 662.5%
NFLX Netflix Inc. 5.10 12.8 151%

Best Value (Low P/E + High Growth)

top_value <- stock_metrics %>%
  filter(!is.na(PE_Current), !is.infinite(EPS_Growth_5Y), EPS_Growth_5Y > 0) %>%
  mutate(Value_Score = EPS_Growth_5Y / PE_Current) %>%
  arrange(desc(Value_Score)) %>%
  head(5) %>%
  select(Symbol, Company, PE_Current, EPS_Growth_5Y, Avg_ROE) %>%
  mutate(PE_Current = round(PE_Current, 1),
         EPS_Growth_5Y = paste0(round(EPS_Growth_5Y, 1), "%"),
         Avg_ROE = paste0(round(Avg_ROE * 100, 1), "%"))

kable(top_value, 
      caption = "Top 5 Value Stocks (Growth per P/E Point)",
      col.names = c("Symbol", "Company", "P/E Ratio", "EPS Growth", "Avg ROE"))
Top 5 Value Stocks (Growth per P/E Point)
Symbol Company P/E Ratio EPS Growth Avg ROE
XOM Exxon Mobil 10.2 1460% 13.2%
TSLA Tesla Inc. 68.5 1140% 17.2%
DIS Walt Disney 45.0 662.5% 5%
NVDA NVIDIA Corp. 95.0 962.5% 45%
JPM JPMorgan Chase 11.5 70.6% 14.8%

Investment Strategy Recommendations

Quality Compounders (Top 3)

top_compounders <- stock_metrics %>%
  arrange(desc(Compounder_Score)) %>%
  head(3) %>%
  select(Symbol, Company, Sector, PE_Current, Avg_ROE, EPS_Growth_5Y, Compounder_Score)

for(i in 1:nrow(top_compounders)) {
  stock <- top_compounders[i,]
  cat("\n### ", i, ". ", stock$Company, " (", stock$Symbol, ")\n\n", sep="")
  cat("- **Sector**: ", stock$Sector, "\n")
  cat("- **P/E Ratio**: ", round(stock$PE_Current, 1), "\n")
  cat("- **Average ROE**: ", round(stock$Avg_ROE * 100, 1), "%\n", sep="")
  cat("- **EPS Growth**: ", round(stock$EPS_Growth_5Y, 1), "%\n", sep="")
  cat("- **Compounder Score**: ", round(stock$Compounder_Score, 2), "\n\n", sep="")
}
## 
## ### 1. Exxon Mobil (XOM)
## 
## - **Sector**:  Energy 
## - **P/E Ratio**:  10.2 
## - **Average ROE**: 13.2%
## - **EPS Growth**: 1460%
## - **Compounder Score**: 0.7
## 
## 
## ### 2. Tesla Inc. (TSLA)
## 
## - **Sector**:  Consumer Disc. 
## - **P/E Ratio**:  68.5 
## - **Average ROE**: 17.2%
## - **EPS Growth**: 1140%
## - **Compounder Score**: 0.66
## 
## 
## ### 3. NVIDIA Corp. (NVDA)
## 
## - **Sector**:  Technology 
## - **P/E Ratio**:  95 
## - **Average ROE**: 45%
## - **EPS Growth**: 962.5%
## - **Compounder Score**: 0.54

Investment Thesis: These stocks combine high returns on equity, consistent performance, and strong earnings growth - the hallmarks of quality compounders.


Methodology

Data Transformation (Task 1)

Used tidyr::pivot_longer() to transform wide-format data (years as columns) into long format (years as rows), enabling time-series visualization with ggplot2.

Growth Classification (Task 2)

Applied financial logic using dplyr::mutate() and ifelse(): - Growing: If EPS_2024 > EPS_2020 - Declining: Otherwise

Compounder Score Calculation

Composite metric combining three normalized factors:

  1. ROE Score: Average ROE over 5 years (higher is better)
  2. Stability Score: Inverse of ROE standard deviation (lower volatility is better)
  3. Growth Score: 5-year EPS growth rate (higher is better)

\[\text{Compounder Score} = \frac{\text{ROE Score} + \text{Stability Score} + \text{Growth Score}}{3}\]


Conclusion

This analysis successfully demonstrates:

Task 1 Complete: Data transformed from wide to long format using pivot_longer()

Task 2 Complete: Growth status calculated based on EPS comparison (2024 vs 2020)

Task 3 Complete: Interactive visualizations created showing: - Individual company ROE trends (line charts) - Market-wide P/E vs Growth analysis (scatter plots) - Compounder rankings (interactive tables)

Key Findings

  1. 19 out of 20 stocks classified as “Growing”
  2. Technology sector shows highest average ROE at 43.3%
  3. Top 3 compounders: XOM, TSLA, NVDA based on composite scoring

Disclaimer

This analysis is for educational purposes only and does not constitute investment advice. Always conduct thorough research and consult with a financial advisor before making investment decisions.


Report Generated: 2025-12-23 19:40:38.791241

Data Source: S&P 500 sampled data (2020-2024)

Analysis Tool: R with tidyverse, plotly, and DT packages