This report analyzes S&P 500 stocks from 2020-2024 to identify “Compounders” - companies with consistent earnings growth and stable Return on Equity (ROE). The analysis includes:
# Load required libraries
library(tidyverse)
library(plotly)
library(DT)
library(scales)
library(knitr)
# Read the data
data <- read.csv("sp500_sampled_data_2020_2024.txt", stringsAsFactors = FALSE)
# Display raw data
kable(head(data, 10), caption = "Sample of Raw Stock Data (Wide Format)")
| Symbol | Company | Sector | PE_Current | ROE_2020 | ROE_2021 | ROE_2022 | ROE_2023 | ROE_2024 | EPS_2020 | EPS_2021 | EPS_2022 | EPS_2023 | EPS_2024 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AAPL | Apple Inc. | Technology | 28.5 | 0.73 | 0.85 | 1.20 | 1.35 | 1.45 | 3.20 | 4.5 | 5.60 | 6.1 | 6.5 |
| MSFT | Microsoft Corp. | Technology | 35.2 | 0.35 | 0.38 | 0.40 | 0.42 | 0.44 | 5.10 | 6.2 | 8.05 | 9.6 | 11.1 |
| AMZN | Amazon.com Inc. | Consumer Disc. | 75.0 | 0.20 | 0.22 | 0.05 | 0.12 | 0.15 | 1.15 | 2.1 | 0.50 | 1.5 | 2.8 |
| GOOGL | Alphabet Inc. | Technology | 24.5 | 0.18 | 0.25 | 0.22 | 0.24 | 0.27 | 2.80 | 4.1 | 4.50 | 5.2 | 6.1 |
| TSLA | Tesla Inc. | Consumer Disc. | 68.5 | 0.05 | 0.15 | 0.25 | 0.20 | 0.21 | 0.25 | 0.8 | 1.90 | 2.4 | 3.1 |
| JPM | JPMorgan Chase | Financials | 11.5 | 0.12 | 0.15 | 0.14 | 0.16 | 0.17 | 8.50 | 10.1 | 11.50 | 12.8 | 14.5 |
| JNJ | Johnson & Johnson | Health Care | 15.8 | 0.22 | 0.24 | 0.23 | 0.25 | 0.26 | 6.50 | 7.1 | 7.50 | 8.1 | 8.5 |
| PG | Procter & Gamble | Consumer Staples | 24.5 | 0.28 | 0.29 | 0.30 | 0.31 | 0.32 | 4.80 | 5.1 | 5.40 | 5.9 | 6.4 |
| XOM | Exxon Mobil | Energy | 10.2 | -0.05 | 0.08 | 0.25 | 0.20 | 0.18 | -0.50 | 3.5 | 8.10 | 7.2 | 6.8 |
| NVDA | NVIDIA Corp. | Technology | 95.0 | 0.25 | 0.35 | 0.45 | 0.55 | 0.65 | 0.80 | 1.2 | 2.50 | 4.1 | 8.5 |
Transform data from wide format (years as columns) to long format (years as rows) to enable time-series visualization.
kable(data[1:3, c("Symbol", "Company", "ROE_2020", "ROE_2021", "ROE_2022", "ROE_2023", "ROE_2024")],
caption = "Wide Format - Years as Columns")
| Symbol | Company | ROE_2020 | ROE_2021 | ROE_2022 | ROE_2023 | ROE_2024 |
|---|---|---|---|---|---|---|
| AAPL | Apple Inc. | 0.73 | 0.85 | 1.20 | 1.35 | 1.45 |
| MSFT | Microsoft Corp. | 0.35 | 0.38 | 0.40 | 0.42 | 0.44 |
| AMZN | Amazon.com Inc. | 0.20 | 0.22 | 0.05 | 0.12 | 0.15 |
# Pivot ROE columns from wide to long
roe_long <- data %>%
pivot_longer(
cols = starts_with("ROE_"),
names_to = "Year",
values_to = "ROE"
) %>%
mutate(Year = as.numeric(gsub("ROE_", "", Year)))
# Pivot EPS columns
eps_long <- data %>%
pivot_longer(
cols = starts_with("EPS_"),
names_to = "Year",
values_to = "EPS"
) %>%
mutate(Year = as.numeric(gsub("EPS_", "", Year)))
# Combine both metrics
financial_trends <- roe_long %>%
left_join(
eps_long %>% select(Symbol, Year, EPS),
by = c("Symbol", "Year")
)
kable(head(financial_trends, 10),
caption = "Long Format - Perfect for Time-Series Analysis")
| Symbol | Company | Sector | PE_Current | EPS_2020 | EPS_2021 | EPS_2022 | EPS_2023 | EPS_2024 | Year | ROE | EPS |
|---|---|---|---|---|---|---|---|---|---|---|---|
| AAPL | Apple Inc. | Technology | 28.5 | 3.2 | 4.5 | 5.60 | 6.1 | 6.5 | 2020 | 0.73 | 3.20 |
| AAPL | Apple Inc. | Technology | 28.5 | 3.2 | 4.5 | 5.60 | 6.1 | 6.5 | 2021 | 0.85 | 4.50 |
| AAPL | Apple Inc. | Technology | 28.5 | 3.2 | 4.5 | 5.60 | 6.1 | 6.5 | 2022 | 1.20 | 5.60 |
| AAPL | Apple Inc. | Technology | 28.5 | 3.2 | 4.5 | 5.60 | 6.1 | 6.5 | 2023 | 1.35 | 6.10 |
| AAPL | Apple Inc. | Technology | 28.5 | 3.2 | 4.5 | 5.60 | 6.1 | 6.5 | 2024 | 1.45 | 6.50 |
| MSFT | Microsoft Corp. | Technology | 35.2 | 5.1 | 6.2 | 8.05 | 9.6 | 11.1 | 2020 | 0.35 | 5.10 |
| MSFT | Microsoft Corp. | Technology | 35.2 | 5.1 | 6.2 | 8.05 | 9.6 | 11.1 | 2021 | 0.38 | 6.20 |
| MSFT | Microsoft Corp. | Technology | 35.2 | 5.1 | 6.2 | 8.05 | 9.6 | 11.1 | 2022 | 0.40 | 8.05 |
| MSFT | Microsoft Corp. | Technology | 35.2 | 5.1 | 6.2 | 8.05 | 9.6 | 11.1 | 2023 | 0.42 | 9.60 |
| MSFT | Microsoft Corp. | Technology | 35.2 | 5.1 | 6.2 | 8.05 | 9.6 | 11.1 | 2024 | 0.44 | 11.10 |
Why This Matters: Long format enables
ggplot2 to easily create line charts showing trends over
time.
# Calculate growth metrics and status
stock_metrics <- data %>%
mutate(
# TASK 2 REQUIREMENT: Growth Status
Growth_Status = ifelse(EPS_2024 > EPS_2020, "Growing", "Declining"),
# Additional metrics for deeper analysis
EPS_Growth_5Y = ((EPS_2024 - EPS_2020) / abs(EPS_2020)) * 100,
Avg_ROE = (ROE_2020 + ROE_2021 + ROE_2022 + ROE_2023 + ROE_2024) / 5,
ROE_Stability = NA,
Compounder_Score = NA
)
# Calculate ROE stability (standard deviation)
for(i in 1:nrow(stock_metrics)) {
roe_values <- as.numeric(stock_metrics[i, c("ROE_2020", "ROE_2021", "ROE_2022", "ROE_2023", "ROE_2024")])
stock_metrics$ROE_Stability[i] <- sd(roe_values, na.rm = TRUE)
}
# Calculate Compounder Score
stock_metrics <- stock_metrics %>%
mutate(
ROE_Score = scale(Avg_ROE, center = FALSE),
Stability_Score = scale(-ROE_Stability, center = FALSE),
Growth_Score = scale(ifelse(is.infinite(EPS_Growth_5Y), NA, EPS_Growth_5Y), center = FALSE),
Compounder_Score = (ROE_Score + Stability_Score + Growth_Score) / 3
)
growth_summary <- stock_metrics %>%
group_by(Growth_Status) %>%
summarise(
Count = n(),
Avg_EPS_Growth = mean(EPS_Growth_5Y[!is.infinite(EPS_Growth_5Y)], na.rm = TRUE),
Avg_ROE = mean(Avg_ROE, na.rm = TRUE),
Avg_PE = mean(PE_Current, na.rm = TRUE)
) %>%
mutate(
Avg_EPS_Growth = round(Avg_EPS_Growth, 1),
Avg_ROE = round(Avg_ROE * 100, 1),
Avg_PE = round(Avg_PE, 1)
)
kable(growth_summary,
caption = "Growth Status Summary - Growing vs Declining Stocks",
col.names = c("Growth Status", "Count", "Avg EPS Growth (%)", "Avg ROE (%)", "Avg P/E"))
| Growth Status | Count | Avg EPS Growth (%) | Avg ROE (%) | Avg P/E |
|---|---|---|---|---|
| Declining | 1 | -111.1 | 11.6 | 85.0 |
| Growing | 19 | 279.0 | 27.1 | 31.4 |
Key Insight: 19 stocks (out of 20) are classified as “Growing” based on 5-year EPS growth.
# Select Apple for detailed analysis
selected_stock <- stock_metrics %>% filter(Symbol == "AAPL")
selected_trends <- financial_trends %>% filter(Symbol == "AAPL")
# Display key metrics
aapl_metrics <- data.frame(
Metric = c("Current P/E Ratio", "Average ROE (5Y)", "EPS Growth (5Y)", "Growth Status"),
Value = c(
paste0(round(selected_stock$PE_Current, 1)),
paste0(round(selected_stock$Avg_ROE * 100, 1), "%"),
paste0(round(selected_stock$EPS_Growth_5Y, 1), "%"),
selected_stock$Growth_Status
)
)
kable(aapl_metrics, caption = "Apple Inc. (AAPL) - Key Metrics")
| Metric | Value |
|---|---|
| Current P/E Ratio | 28.5 |
| Average ROE (5Y) | 111.6% |
| EPS Growth (5Y) | 103.1% |
| Growth Status | Growing |
plot_ly(selected_trends, x = ~Year, y = ~ROE, type = 'scatter', mode = 'lines+markers',
marker = list(size = 10, color = '#667eea'),
line = list(color = '#667eea', width = 3),
hovertemplate = paste('<b>Year:</b> %{x}<br>',
'<b>ROE:</b> %{y:.2%}<br>',
'<extra></extra>')) %>%
layout(
title = list(text = "Apple Inc. - 5-Year ROE Trend", font = list(size = 18, color = '#333')),
xaxis = list(title = "Year", dtick = 1),
yaxis = list(title = "Return on Equity", tickformat = ".0%"),
hovermode = 'closest',
plot_bgcolor = '#f8f9fa',
paper_bgcolor = 'white'
)
Analysis: Apple shows increasing ROE trend from 73% (2020) to 145% (2024).
plot_ly(selected_trends, x = ~Year, y = ~EPS, type = 'scatter', mode = 'lines+markers',
marker = list(size = 10, color = '#28a745'),
line = list(color = '#28a745', width = 3),
hovertemplate = paste('<b>Year:</b> %{x}<br>',
'<b>EPS:</b> $%{y:.2f}<br>',
'<extra></extra>')) %>%
layout(
title = list(text = "Apple Inc. - 5-Year EPS Trend", font = list(size = 18, color = '#333')),
xaxis = list(title = "Year", dtick = 1),
yaxis = list(title = "Earnings Per Share ($)"),
hovermode = 'closest',
plot_bgcolor = '#f8f9fa',
paper_bgcolor = 'white'
)
Analysis: EPS grew from $3.2 (2020) to $6.5 (2024), representing a 103.1% increase.
scatter_data <- stock_metrics %>%
filter(!is.na(PE_Current), !is.infinite(EPS_Growth_5Y), !is.na(EPS_Growth_5Y))
plot_ly(scatter_data,
x = ~EPS_Growth_5Y,
y = ~PE_Current,
type = 'scatter',
mode = 'markers',
marker = list(
size = ~abs(Avg_ROE) * 80,
color = ~Avg_ROE,
colorscale = 'Viridis',
showscale = TRUE,
colorbar = list(title = "Avg ROE"),
line = list(color = 'white', width = 1)
),
text = ~paste0("<b>", Company, "</b><br>",
"Sector: ", Sector, "<br>",
"P/E: ", round(PE_Current, 1), "<br>",
"EPS Growth: ", round(EPS_Growth_5Y, 1), "%<br>",
"Avg ROE: ", round(Avg_ROE * 100, 1), "%"),
hoverinfo = 'text') %>%
layout(
title = list(text = "P/E Ratio vs. 5-Year EPS Growth - All Stocks", font = list(size = 18)),
xaxis = list(title = "5-Year EPS Growth (%)"),
yaxis = list(title = "Current P/E Ratio"),
hovermode = 'closest',
plot_bgcolor = '#f8f9fa',
paper_bgcolor = 'white'
)
Investment Sweet Spot: Look for stocks in the bottom-right quadrant (high growth, low P/E) - these represent potential value opportunities.
plot_ly(stock_metrics,
x = ~Avg_ROE,
y = ~ROE_Stability,
type = 'scatter',
mode = 'markers',
marker = list(
size = 12,
color = ~ifelse(Growth_Status == "Growing", '#28a745', '#dc3545'),
line = list(color = 'white', width = 1)
),
text = ~paste0("<b>", Company, "</b><br>",
"Avg ROE: ", round(Avg_ROE * 100, 1), "%<br>",
"ROE Volatility: ", round(ROE_Stability, 3), "<br>",
"Status: ", Growth_Status),
hoverinfo = 'text') %>%
layout(
title = list(text = "ROE Consistency Analysis", font = list(size = 18)),
xaxis = list(title = "Average ROE (5 Years)", tickformat = ".0%"),
yaxis = list(title = "ROE Volatility (Lower is Better)"),
hovermode = 'closest',
plot_bgcolor = '#f8f9fa',
paper_bgcolor = 'white'
)
Quality Zone: Stocks in the top-right area (high ROE, low volatility) represent consistent performers - true “compounders”.
screener_results <- stock_metrics %>%
select(
Symbol,
Company,
Sector,
PE_Current,
Avg_ROE,
ROE_Stability,
EPS_Growth_5Y,
Growth_Status,
Compounder_Score
) %>%
arrange(desc(Compounder_Score)) %>%
mutate(
PE_Current = round(PE_Current, 1),
Avg_ROE = paste0(round(Avg_ROE * 100, 1), "%"),
ROE_Stability = round(ROE_Stability, 3),
EPS_Growth_5Y = ifelse(is.infinite(EPS_Growth_5Y) | is.na(EPS_Growth_5Y),
"N/A",
paste0(round(EPS_Growth_5Y, 1), "%")),
Compounder_Score = round(Compounder_Score, 2)
)
datatable(
screener_results,
colnames = c("Symbol", "Company", "Sector", "P/E", "Avg ROE", "ROE Vol.", "EPS Growth", "Status", "Score"),
options = list(
pageLength = 20,
order = list(list(8, 'desc')),
columnDefs = list(
list(className = 'dt-center', targets = '_all')
)
),
caption = "Top Compounders - Ranked by Compounder Score",
rownames = FALSE,
class = 'cell-border stripe hover'
) %>%
formatStyle(
'Growth_Status',
backgroundColor = styleEqual(
c('Growing', 'Declining'),
c('#d4edda', '#f8d7da')
),
fontWeight = 'bold'
)
sector_summary <- stock_metrics %>%
group_by(Sector) %>%
summarise(
Count = n(),
Avg_ROE = mean(Avg_ROE, na.rm = TRUE),
Avg_Growth = mean(EPS_Growth_5Y[!is.infinite(EPS_Growth_5Y)], na.rm = TRUE),
Growing_Pct = sum(Growth_Status == "Growing") / n() * 100
) %>%
arrange(desc(Avg_ROE))
plot_ly(sector_summary,
x = ~reorder(Sector, Avg_ROE),
y = ~Avg_ROE,
type = 'bar',
marker = list(
color = ~Avg_ROE,
colorscale = 'Viridis',
showscale = TRUE,
colorbar = list(title = "Avg ROE")
),
text = ~paste0(round(Avg_ROE * 100, 1), "%"),
textposition = 'outside',
hovertemplate = paste('<b>%{x}</b><br>',
'Avg ROE: %{y:.1%}<br>',
'<extra></extra>')) %>%
layout(
title = list(text = "Average ROE by Sector (2020-2024)", font = list(size = 18)),
xaxis = list(title = ""),
yaxis = list(title = "Average Return on Equity", tickformat = ".0%"),
plot_bgcolor = '#f8f9fa',
paper_bgcolor = 'white'
)
top_roe <- stock_metrics %>%
arrange(desc(Avg_ROE)) %>%
head(5) %>%
select(Symbol, Company, Avg_ROE, ROE_Stability, Growth_Status) %>%
mutate(Avg_ROE = paste0(round(Avg_ROE * 100, 1), "%"),
ROE_Stability = round(ROE_Stability, 3))
kable(top_roe,
caption = "Top 5 Stocks by Average ROE",
col.names = c("Symbol", "Company", "Avg ROE", "ROE Volatility", "Growth Status"))
| Symbol | Company | Avg ROE | ROE Volatility | Growth Status |
|---|---|---|---|---|
| AAPL | Apple Inc. | 111.6% | 0.313 | Growing |
| HD | Home Depot | 50.4% | 0.036 | Growing |
| NVDA | NVIDIA Corp. | 45% | 0.158 | Growing |
| KO | Coca-Cola Co. | 40% | 0.038 | Growing |
| MSFT | Microsoft Corp. | 39.8% | 0.035 | Growing |
top_growth <- stock_metrics %>%
filter(!is.infinite(EPS_Growth_5Y), !is.na(EPS_Growth_5Y)) %>%
arrange(desc(EPS_Growth_5Y)) %>%
head(5) %>%
select(Symbol, Company, EPS_2020, EPS_2024, EPS_Growth_5Y) %>%
mutate(EPS_2020 = round(EPS_2020, 2),
EPS_2024 = round(EPS_2024, 2),
EPS_Growth_5Y = paste0(round(EPS_Growth_5Y, 1), "%"))
kable(top_growth,
caption = "Top 5 Fastest Growing Stocks (EPS)",
col.names = c("Symbol", "Company", "EPS 2020", "EPS 2024", "5Y Growth"))
| Symbol | Company | EPS 2020 | EPS 2024 | 5Y Growth |
|---|---|---|---|---|
| XOM | Exxon Mobil | -0.50 | 6.8 | 1460% |
| TSLA | Tesla Inc. | 0.25 | 3.1 | 1140% |
| NVDA | NVIDIA Corp. | 0.80 | 8.5 | 962.5% |
| DIS | Walt Disney | -0.80 | 4.5 | 662.5% |
| NFLX | Netflix Inc. | 5.10 | 12.8 | 151% |
top_value <- stock_metrics %>%
filter(!is.na(PE_Current), !is.infinite(EPS_Growth_5Y), EPS_Growth_5Y > 0) %>%
mutate(Value_Score = EPS_Growth_5Y / PE_Current) %>%
arrange(desc(Value_Score)) %>%
head(5) %>%
select(Symbol, Company, PE_Current, EPS_Growth_5Y, Avg_ROE) %>%
mutate(PE_Current = round(PE_Current, 1),
EPS_Growth_5Y = paste0(round(EPS_Growth_5Y, 1), "%"),
Avg_ROE = paste0(round(Avg_ROE * 100, 1), "%"))
kable(top_value,
caption = "Top 5 Value Stocks (Growth per P/E Point)",
col.names = c("Symbol", "Company", "P/E Ratio", "EPS Growth", "Avg ROE"))
| Symbol | Company | P/E Ratio | EPS Growth | Avg ROE |
|---|---|---|---|---|
| XOM | Exxon Mobil | 10.2 | 1460% | 13.2% |
| TSLA | Tesla Inc. | 68.5 | 1140% | 17.2% |
| DIS | Walt Disney | 45.0 | 662.5% | 5% |
| NVDA | NVIDIA Corp. | 95.0 | 962.5% | 45% |
| JPM | JPMorgan Chase | 11.5 | 70.6% | 14.8% |
top_compounders <- stock_metrics %>%
arrange(desc(Compounder_Score)) %>%
head(3) %>%
select(Symbol, Company, Sector, PE_Current, Avg_ROE, EPS_Growth_5Y, Compounder_Score)
for(i in 1:nrow(top_compounders)) {
stock <- top_compounders[i,]
cat("\n### ", i, ". ", stock$Company, " (", stock$Symbol, ")\n\n", sep="")
cat("- **Sector**: ", stock$Sector, "\n")
cat("- **P/E Ratio**: ", round(stock$PE_Current, 1), "\n")
cat("- **Average ROE**: ", round(stock$Avg_ROE * 100, 1), "%\n", sep="")
cat("- **EPS Growth**: ", round(stock$EPS_Growth_5Y, 1), "%\n", sep="")
cat("- **Compounder Score**: ", round(stock$Compounder_Score, 2), "\n\n", sep="")
}
##
## ### 1. Exxon Mobil (XOM)
##
## - **Sector**: Energy
## - **P/E Ratio**: 10.2
## - **Average ROE**: 13.2%
## - **EPS Growth**: 1460%
## - **Compounder Score**: 0.7
##
##
## ### 2. Tesla Inc. (TSLA)
##
## - **Sector**: Consumer Disc.
## - **P/E Ratio**: 68.5
## - **Average ROE**: 17.2%
## - **EPS Growth**: 1140%
## - **Compounder Score**: 0.66
##
##
## ### 3. NVIDIA Corp. (NVDA)
##
## - **Sector**: Technology
## - **P/E Ratio**: 95
## - **Average ROE**: 45%
## - **EPS Growth**: 962.5%
## - **Compounder Score**: 0.54
Investment Thesis: These stocks combine high returns on equity, consistent performance, and strong earnings growth - the hallmarks of quality compounders.
Used tidyr::pivot_longer() to transform wide-format data
(years as columns) into long format (years as rows), enabling
time-series visualization with ggplot2.
Applied financial logic using dplyr::mutate() and
ifelse(): - Growing: If EPS_2024 >
EPS_2020 - Declining: Otherwise
Composite metric combining three normalized factors:
\[\text{Compounder Score} = \frac{\text{ROE Score} + \text{Stability Score} + \text{Growth Score}}{3}\]
This analysis successfully demonstrates:
✅ Task 1 Complete: Data transformed from wide to
long format using pivot_longer()
✅ Task 2 Complete: Growth status calculated based on EPS comparison (2024 vs 2020)
✅ Task 3 Complete: Interactive visualizations created showing: - Individual company ROE trends (line charts) - Market-wide P/E vs Growth analysis (scatter plots) - Compounder rankings (interactive tables)
This analysis is for educational purposes only and does not constitute investment advice. Always conduct thorough research and consult with a financial advisor before making investment decisions.
Report Generated: 2025-12-23 19:40:38.791241
Data Source: S&P 500 sampled data (2020-2024)
Analysis Tool: R with tidyverse, plotly, and DT packages