1. Brief Info

This is an R Markdown document for the master's thesis titled ANALYZING STOCK MARKET BEHAVIOR AND PRICE PREDICTIONS ACROSS GLOBAL MARKETS. It covers the following steps of the analysis: data selection, data preparation, and exploratory data analysis. The subsequent steps, modeling and results, are not included in this file; they are implemented in a Python environment.

2. Library Installation

# Install packages if you haven't already
# install.packages("quantmod")
# install.packages("dplyr")
# install.packages("ggplot2")
# install.packages("tidyverse")
# install.packages("tidyquant")
# install.packages("readxl")

# Load packages into R environment
library(quantmod)
library(dplyr)
library(ggplot2)
library(tidyverse)
library(tidyquant)
library(readxl)

3. Data Import (from stockanalysis.com, retrieved January 2025)

stocks_details <- read_excel("C:/Users/User/Desktop/Master Thesis/Data from Stockanalysis.com (3225 stocks)/All Stocks.xlsx")

#View(stocks_details)
colnames(stocks_details)
## [1] "No"                 "Symbol"             "Company Name"      
## [4] "Sector"             "Market Cap"         "% Change"          
## [7] "Volume"             "Revenue"            "Market Cap Numeric"

View of the data

head(stocks_details)
## # A tibble: 6 × 9
##      No Symbol `Company Name`      Sector `Market Cap` `% Change` Volume Revenue
##   <dbl> <chr>  <chr>               <chr>  <chr>        <chr>      <chr>  <chr>  
## 1     1 AAPL   Apple Inc.          Techn… 3,698.01B    -0.0014    10694… 391.04B
## 2     1 GE     General Electric C… Indus… 187.38B      0.0039     24289… 69.95B 
## 3     1 LLY    Eli Lilly and Comp… Healt… 707.21B      0.0245     41386… 40.86B 
## 4     1 BRK.B  Berkshire Hathaway… Finan… 973.14B      0.0087     32551  369.89B
## 5     1 WMT    Walmart Inc.        Consu… 730.40B      -0.0056    27341… 673.82B
## 6     1 XOM    Exxon Mobil Corpor… Energ… 477.57B      -0.0135    13925… 343.82B
## # ℹ 1 more variable: `Market Cap Numeric` <dbl>
nrow(stocks_details)
## [1] 3225
ncol(stocks_details)
## [1] 9

Initially, 3225 companies are taken from stockanalysis.com.
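
In the preview above, `Market Cap` is stored as text (for example "3,698.01B"), while `Market Cap Numeric` holds a numeric value used later for the cutoffs. The source does not say how the numeric column was produced. The following is a minimal sketch of one way such strings could be converted, assuming the suffixes K, M, B, and T stand for thousand, million, billion, and trillion. The helper name `parse_market_cap` and the "950.5M" input are hypothetical.

```r
# Hypothetical helper: convert strings such as "3,698.01B" to plain numbers.
# Assumption: a trailing K/M/B/T means thousand/million/billion/trillion.
parse_market_cap <- function(x) {
  x <- gsub(",", "", trimws(x))                       # drop thousands separators
  suffix <- toupper(substr(x, nchar(x), nchar(x)))    # last character
  multiplier <- c(K = 1e3, M = 1e6, B = 1e9, T = 1e12)[suffix]
  has_suffix <- !is.na(multiplier)
  digits <- ifelse(has_suffix, substr(x, 1, nchar(x) - 1), x)
  number <- suppressWarnings(as.numeric(digits))      # NA for unparseable text
  unname(number * ifelse(has_suffix, multiplier, 1))
}

# The first two values appear in the preview; the third is made up.
caps <- parse_market_cap(c("3,698.01B", "187.38B", "950.5M"))
print(caps)
```

Comparing the result with `Market Cap Numeric` on the real data would be a quick consistency check before relying on either column.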

Number of stocks per sector

stocksbysector <- stocks_details %>% 
  group_by(Sector) %>% 
  summarize(count = n(), .groups = "drop")

stocksbysector
## # A tibble: 6 × 2
##   Sector                 count
##   <chr>                  <int>
## 1 Consumer Staple Sector   245
## 2 Energy Sector            250
## 3 Financial Sector         603
## 4 Healthcare Sector       1164
## 5 Industrials Sector       453
## 6 Technology Sector        510

Nine symbols are removed because their data could not be fetched from Yahoo Finance.

# Remove rows with problematic symbols
# They must be removed because they are not available on Yahoo Finance

stocks_details <- stocks_details %>%
  filter(!Symbol %in% c("BRK.B", "PBR.A", "AGM.A", "VG", "AKO.A", "MKC.V","BF.A","DRDB","HEI.A"))

# Print the updated dataframe to confirm the removal
head(stocks_details)
## # A tibble: 6 × 9
##      No Symbol `Company Name`      Sector `Market Cap` `% Change` Volume Revenue
##   <dbl> <chr>  <chr>               <chr>  <chr>        <chr>      <chr>  <chr>  
## 1     1 AAPL   Apple Inc.          Techn… 3,698.01B    -0.0014    10694… 391.04B
## 2     1 GE     General Electric C… Indus… 187.38B      0.0039     24289… 69.95B 
## 3     1 LLY    Eli Lilly and Comp… Healt… 707.21B      0.0245     41386… 40.86B 
## 4     1 WMT    Walmart Inc.        Consu… 730.40B      -0.0056    27341… 673.82B
## 5     1 XOM    Exxon Mobil Corpor… Energ… 477.57B      -0.0135    13925… 343.82B
## 6     2 NVDA   NVIDIA Corporation  Techn… 3,494.48B    -0.0451    16666… 113.27B
## # ℹ 1 more variable: `Market Cap Numeric` <dbl>
# Optionally, confirm the removal by checking that none of these symbols remain
removed_symbols <- c("BRK.B", "PBR.A", "AGM.A", "VG", "AKO.A", "MKC.V", "BF.A", "DRDB", "HEI.A")
cat("Checking for presence of problematic symbols in the dataset:\n")
## Checking for presence of problematic symbols in the dataset:
if (any(stocks_details$Symbol %in% removed_symbols)) {
  cat("Problematic Symbols are still present.\n")
} else {
  cat("Problematic Symbols have been successfully removed.\n")
}
## Problematic Symbols have been successfully removed.
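
Seven of the removed symbols contain a dot that separates the share class (for example BRK.B). Yahoo Finance generally writes such tickers with a hyphen instead (BRK-B), so a notation mismatch is one possible cause of the fetching problem; the source does not confirm this, and the analysis simply drops the symbols. The sketch below shows how the conversion could look. The helper name `to_yahoo_symbol` is hypothetical, and whether each converted ticker actually resolves on Yahoo Finance is untested here.

```r
# Hypothetical helper: replace the share-class dot with a hyphen.
# Assumption: Yahoo Finance uses "BRK-B"-style tickers for share classes.
to_yahoo_symbol <- function(symbol) {
  gsub(".", "-", symbol, fixed = TRUE)
}

# The dotted symbols from the removal list above
dotted <- c("BRK.B", "PBR.A", "AGM.A", "AKO.A", "MKC.V", "BF.A", "HEI.A")
yahoo_candidates <- to_yahoo_symbol(dotted)
print(yahoo_candidates)
```

The two remaining symbols, VG and DRDB, contain no dot, so this conversion would not affect them.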

4. Defining Market Leaders

### ADDING MARKET LEADER VARIABLE TO THE DATASET BASED ON MEGA and LARGE CAP CRITERION

# Define a new column 'MarketLeader' to indicate whether a stock is a mega cap
stocks_details <- stocks_details %>%
  mutate(MarketLeader = `Market Cap Numeric` > 200e9)  # 200 billion in scientific notation

# Print the updated dataframe to check the new column
head(stocks_details)
## # A tibble: 6 × 10
##      No Symbol `Company Name`      Sector `Market Cap` `% Change` Volume Revenue
##   <dbl> <chr>  <chr>               <chr>  <chr>        <chr>      <chr>  <chr>  
## 1     1 AAPL   Apple Inc.          Techn… 3,698.01B    -0.0014    10694… 391.04B
## 2     1 GE     General Electric C… Indus… 187.38B      0.0039     24289… 69.95B 
## 3     1 LLY    Eli Lilly and Comp… Healt… 707.21B      0.0245     41386… 40.86B 
## 4     1 WMT    Walmart Inc.        Consu… 730.40B      -0.0056    27341… 673.82B
## 5     1 XOM    Exxon Mobil Corpor… Energ… 477.57B      -0.0135    13925… 343.82B
## 6     2 NVDA   NVIDIA Corporation  Techn… 3,494.48B    -0.0451    16666… 113.27B
## # ℹ 2 more variables: `Market Cap Numeric` <dbl>, MarketLeader <lgl>
#view(stocks_details)


### COUNTING MARKET LEADERS IN EACH SECTOR

# Calculate the number of market leaders in each sector
market_leader_count <- stocks_details %>%
  group_by(Sector) %>%
  summarise(NumberOfMarketLeaders = sum(MarketLeader, na.rm = TRUE),
            .groups = 'drop')  # This line ensures that the grouping is dropped after summarise

# Print the table of market leaders count by sector
print(market_leader_count)
## # A tibble: 6 × 2
##   Sector                 NumberOfMarketLeaders
##   <chr>                                  <int>
## 1 Consumer Staple Sector                     5
## 2 Energy Sector                              3
## 3 Financial Sector                           8
## 4 Healthcare Sector                         12
## 5 Industrials Sector                         0
## 6 Technology Sector                         14

Decreasing the cutoff from 200B to 100B.

# Lowering the threshold from 200B to 100B

# Redefine 'MarketLeader' using the 100 billion cutoff
stocks_details <- stocks_details %>%
  mutate(MarketLeader = `Market Cap Numeric` > 100e9)  # 100 billion in scientific notation

### COUNTING MARKET LEADERS IN EACH SECTOR

# Calculate the number of market leaders in each sector
market_leader_count <- stocks_details %>%
  group_by(Sector) %>%
  summarise(NumberOfMarketLeaders = sum(MarketLeader, na.rm = TRUE),
            .groups = 'drop')  # This line ensures that the grouping is dropped after summarise

# Print the table of market leaders count by sector
print(market_leader_count)
## # A tibble: 6 × 2
##   Sector                 NumberOfMarketLeaders
##   <chr>                                  <int>
## 1 Consumer Staple Sector                     7
## 2 Energy Sector                              5
## 3 Financial Sector                          23
## 4 Healthcare Sector                         25
## 5 Industrials Sector                        10
## 6 Technology Sector                         33

Decreasing the cutoff from 100B to 50B.

# Lowering the threshold from 100B to 50B

# Redefine 'MarketLeader' using the 50 billion cutoff
stocks_details <- stocks_details %>%
  mutate(MarketLeader = `Market Cap Numeric` > 50e9)  # 50 billion in scientific notation

### COUNTING MARKET LEADERS IN EACH SECTOR

# Calculate the number of market leaders in each sector
market_leader_count <- stocks_details %>%
  group_by(Sector) %>%
  summarise(NumberOfMarketLeaders = sum(MarketLeader, na.rm = TRUE),
            .groups = 'drop')  # This line ensures that the grouping is dropped after summarise

# Print the table of market leaders count by sector
print(market_leader_count)
## # A tibble: 6 × 2
##   Sector                 NumberOfMarketLeaders
##   <chr>                                  <int>
## 1 Consumer Staple Sector                    15
## 2 Energy Sector                             21
## 3 Financial Sector                          52
## 4 Healthcare Sector                         44
## 5 Industrials Sector                        33
## 6 Technology Sector                         53

Decreasing the cutoff from 50B to 10B.

# Lowering the threshold from 50B to 10B

# Redefine 'MarketLeader' using the 10 billion cutoff
stocks_details <- stocks_details %>%
  mutate(MarketLeader = `Market Cap Numeric` > 10e9)  # 10 billion in scientific notation

### COUNTING MARKET LEADERS IN EACH SECTOR

# Calculate the number of market leaders in each sector
market_leader_count <- stocks_details %>%
  group_by(Sector) %>%
  summarise(NumberOfMarketLeaders = sum(MarketLeader, na.rm = TRUE),
            .groups = 'drop')  # This line ensures that the grouping is dropped after summarise

# Print the table of market leaders count by sector
print(market_leader_count)
## # A tibble: 6 × 2
##   Sector                 NumberOfMarketLeaders
##   <chr>                                  <int>
## 1 Consumer Staple Sector                    53
## 2 Energy Sector                             50
## 3 Financial Sector                         134
## 4 Healthcare Sector                         85
## 5 Industrials Sector                       114
## 6 Technology Sector                        150

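The $10 billion cutoff is appropriate for this analysis: it leaves a workable number of market leaders in every sector, and it matches the conventional large-cap threshold used in the literature. Companies with a market capitalization of $10 billion or more cover both the large-cap and mega-cap categories.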
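
The four cutoff runs above repeat the same grouping step with a different threshold each time. As a sketch only, the comparison could be condensed into one helper that counts leaders per sector for several cutoffs at once. The function name `count_leaders`, the column name `MarketCapNumeric`, and the toy data are hypothetical and stand in for `stocks_details`; base R is used to keep the example self-contained.

```r
# Hypothetical helper: count stocks above each cutoff, per sector.
count_leaders <- function(df, cutoffs) {
  per_cutoff <- lapply(cutoffs, function(cutoff) {
    is_leader <- df$MarketCapNumeric > cutoff
    counts <- tapply(is_leader, df$Sector, sum, na.rm = TRUE)
    data.frame(
      Sector = names(counts),
      Cutoff = cutoff,
      NumberOfMarketLeaders = as.integer(counts)
    )
  })
  do.call(rbind, per_cutoff)  # one row per sector and cutoff
}

# Toy data, not taken from the thesis dataset
toy <- data.frame(
  Sector = c("Tech", "Tech", "Energy", "Energy", "Energy"),
  MarketCapNumeric = c(3000e9, 60e9, 250e9, 12e9, 5e9)
)

sweep <- count_leaders(toy, cutoffs = c(200e9, 100e9, 50e9, 10e9))
print(sweep)
```

Unlike the runs above, this version does not overwrite `MarketLeader` on each pass, so the final cutoff can be set once after the comparison.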

### COUNTING MARKET LEADERS AND NON-MARKET LEADERS IN EACH SECTOR

# Calculate the number of market leaders and non-market leaders in each sector
sector_leader_counts <- stocks_details %>%
  group_by(Sector) %>%
  summarise(NumberOfMarketLeaders = sum(MarketLeader, na.rm = TRUE),
            NumberOfNonMarketLeaders = sum(!MarketLeader, na.rm = TRUE),
            .groups = 'drop')  # This line ensures that the grouping is dropped after summarise

# Print the table of market leaders and non-market leaders count by sector
print(sector_leader_counts)
## # A tibble: 6 × 3
##   Sector                 NumberOfMarketLeaders NumberOfNonMarketLeaders
##   <chr>                                  <int>                    <int>
## 1 Consumer Staple Sector                    53                      189
## 2 Energy Sector                             50                      198
## 3 Financial Sector                         134                      466
## 4 Healthcare Sector                         85                     1079
## 5 Industrials Sector                       114                      338
## 6 Technology Sector                        150                      360
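
The raw counts are hard to compare because the sectors differ greatly in size. The sketch below derives the share of market leaders per sector from the counts printed above; the values are copied from that table, while the column names `Total` and `LeaderShare` are introduced here for illustration.

```r
# Counts copied from the printed table above
sector_leader_counts <- data.frame(
  Sector = c("Consumer Staple Sector", "Energy Sector", "Financial Sector",
             "Healthcare Sector", "Industrials Sector", "Technology Sector"),
  NumberOfMarketLeaders = c(53, 50, 134, 85, 114, 150),
  NumberOfNonMarketLeaders = c(189, 198, 466, 1079, 338, 360)
)

# Total stocks per sector and the fraction that are market leaders
sector_leader_counts$Total <- sector_leader_counts$NumberOfMarketLeaders +
  sector_leader_counts$NumberOfNonMarketLeaders
sector_leader_counts$LeaderShare <- round(
  sector_leader_counts$NumberOfMarketLeaders / sector_leader_counts$Total, 3
)

print(sector_leader_counts[, c("Sector", "Total", "LeaderShare")])
```

The totals add up to 3216, which is the original 3225 stocks minus the nine removed symbols.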

Market leaders are now defined. As a next step, the volatility levels will be defined.

————————————————————————————-

5. Importing all defined stocks from Yahoo Finance

# Load necessary libraries
library(quantmod)
library(lubridate)

# Ensure all stock symbols from stocks_details are unique and non-NA
symbols <- unique(na.omit(stocks_details$Symbol))

# Initialize a list to store the stock data
symbol_data_list <- list()

# Set the date range explicitly (roughly the last 25 years)
# Alternative, relative to today: start_date <- Sys.Date() %m-% years(25)
start_date <- as.Date("2000-01-01")
end_date <- Sys.Date()



# Loop through each symbol and fetch the data
for (symbol in symbols) {
  # Print the symbol being fetched for tracking progress
  cat("Fetching data for:", symbol, "\n")
  
  # Try to fetch the data, continue even if some symbols fail
  tryCatch({
    # Fetch stock data from Yahoo Finance
    stock_data <- getSymbols(symbol, src = 'yahoo', from = start_date, to = end_date, auto.assign = FALSE)
    # Store the data in the list with the symbol as the key
    symbol_data_list[[symbol]] <- stock_data
  }, error = function(e) {
    cat("Error fetching data for symbol:", symbol, "- Error message:", e$message, "\n")
  })
}

Most of the defined stocks are fetched successfully; a few could not be fetched because of download errors, which are reported and skipped.
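
The loop above prints each failure but does not keep a record of which symbols failed. As a sketch under stated assumptions, the same pattern can be wrapped in a helper that returns both the fetched data and the failed symbols. The names `fetch_with_log` and `fake_fetch` are hypothetical; `fake_fetch` only simulates a download so that the example runs without a network connection.

```r
# Hypothetical helper: fetch every symbol and remember the ones that fail.
fetch_with_log <- function(symbols, fetch_fun) {
  results <- list()
  failed <- character(0)
  for (s in symbols) {
    out <- tryCatch(fetch_fun(s), error = function(e) NULL)
    if (is.null(out)) {
      failed <- c(failed, s)     # keep the symbol for later inspection
    } else {
      results[[s]] <- out
    }
  }
  list(data = results, failed = failed)
}

# Stand-in for a real download: fails for the made-up symbol "BAD"
fake_fetch <- function(symbol) {
  if (symbol == "BAD") stop("no data available")
  data.frame(Close = c(1, 2))
}

res <- fetch_with_log(c("AAA", "BAD", "BBB"), fake_fetch)
print(names(res$data))
print(res$failed)

# With quantmod loaded, the fetcher from the loop above could be passed instead:
# yahoo_fetch <- function(s) getSymbols(s, src = "yahoo", from = start_date,
#                                       to = end_date, auto.assign = FALSE)
# res <- fetch_with_log(symbols, yahoo_fetch)
```

Keeping `res$failed` makes it possible to retry or document the missing symbols instead of reading them off the console output.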

# Check the contents of one of the symbols to verify (pick a symbol known to be in the list)
if(length(symbol_data_list) > 0) {
  print(head(symbol_data_list[[names(symbol_data_list)[1]]]))
} else {
  cat("No data fetched. Check symbol validity or connection settings.")
}
##            AAPL.Open AAPL.High AAPL.Low AAPL.Close AAPL.Volume AAPL.Adjusted
## 2000-01-03  0.936384  1.004464 0.907924   0.999442   535796800     0.8421506
## 2000-01-04  0.966518  0.987723 0.903460   0.915179   512377600     0.7711489
## 2000-01-05  0.926339  0.987165 0.919643   0.928571   778321600     0.7824333
## 2000-01-06  0.947545  0.955357 0.848214   0.848214   767972800     0.7147228
## 2000-01-07  0.861607  0.901786 0.852679   0.888393   460734400     0.7485784
## 2000-01-10  0.910714  0.912946 0.845982   0.872768   505064000     0.7354123
# Check and print the number of stocks fetched
cat("Number of stocks successfully fetched:", length(symbol_data_list), "\n")
## Number of stocks successfully fetched: 3174
# Optionally, you can also list the symbols for which data has been fetched
if(length(symbol_data_list) > 0) {
  cat("Symbols fetched successfully include:\n")
  print(names(symbol_data_list))
} else {
  cat("No stocks have been successfully fetched.")
}
## Symbols fetched successfully include:
##    [1] "AAPL"  "GE"    "LLY"   "WMT"   "XOM"   "NVDA"  "CAT"   "UNH"   "JPM"  
##   [10] "COST"  "CVX"   "MSFT"  "RTX"   "NVO"   "V"     "PG"    "SHEL"  "AVGO" 
##   [19] "HON"   "JNJ"   "MA"    "KO"    "TTE"   "TSM"   "UNP"   "ABBV"  "BAC"  
##   [28] "PEP"   "COP"   "ORCL"  "ETN"   "MRK"   "WFC"   "PM"    "ENB"   "CRM"  
##   [37] "BA"    "TMO"   "AXP"   "UL"    "ASML"  "DE"    "ABT"   "BX"    "BUD"  
##   [46] "PBR"   "SAP"   "LMT"   "AZN"   "MS"    "MO"    "BP"    "CSCO"  "UPS"  
##   [55] "ISRG"  "GS"    "BTI"   "EOG"   "ACN"   "TT"    "NVS"   "HSBC"  "MDLZ" 
##   [64] "EPD"   "NOW"   "RELX"  "DHR"   "RY"    "CL"    "WMB"   "IBM"   "PH"   
##   [73] "SYK"   "SPGI"  "DEO"   "ET"    "AMD"   "WM"    "BSX"   "BLK"   "TGT"  
##   [82] "KMI"   "ADBE"  "CTAS"  "AMGN"  "HDB"   "MNST"  "CNQ"   "QCOM"  "ITW"  
##   [91] "PFE"   "MUFG"  "KR"    "EQNR"  "TXN"   "TRI"   "SNY"   "PGR"   "KMB"  
##  [100] "OKE"   "INTU"  "MMM"   "BMY"   "C"     "KDP"   "SLB"   "PLTR"  "GD"   
##  [109] "GILD"  "SCHW"  "STZ"   "ARM"   "CP"    "MDT"   "KKR"   "KVUE"  "MPLX" 
##  [118] "ANET"  "TDG"   "VRTX"  "CB"    "SYY"   "LNG"   "AMAT"  "EMR"   "ELV"  
##  [127] "IBN"   "KHC"   "FANG"  "SHOP"  "FDX"   "CI"    "MMC"   "CCEP"  "PSX"  
##  [136] "UBER"  "NOC"   "HCA"   "UBS"   "GIS"   "TRP"   "SONY"  "CNI"   "ZTS"  
##  [145] "SMFG"  "HSY"   "SU"    "ADP"   "RSG"   "MCK"   "TD"    "ABEV"  "MPC"  
##  [154] "MU"    "CSX"   "REGN"  "APO"   "FMX"   "OXY"   "FI"    "CARR"  "BDX"  
##  [163] "PYPL"  "K"     "TRGP"  "PANW"  "PCAR"  "GSK"   "BN"    "EL"    "BKR"  
##  [172] "ADI"   "CPRT"  "CVS"   "MCO"   "CHD"   "HES"   "APP"   "NSC"   "COR"  
##  [181] "ICE"   "ADM"   "VLO"   "MRVL"  "JCI"   "ALC"   "CME"   "TSN"   "E"    
##  [190] "LRCX"  "GWW"   "A"     "PNC"   "MKC"   "IMO"   "KLAC"  "CMI"   "HLN"  
##  [199] "USB"   "EQT"   "INFY"  "VRT"   "TAK"   "AON"   "CLX"   "TPL"   "CRWD" 
##  [208] "PWR"   "EW"    "SAN"   "CQP"   "INTC"  "HWM"   "GEHC"  "AJG"   "HRL"  
##  [217] "WDS"   "MSTR"  "URI"   "ARGX"  "BMO"   "DG"    "CVE"   "DELL"  "WCN"  
##  [226] "IQV"   "COF"   "DLTR"  "EXE"   "APH"   "AXON"  "RMD"   "COIN"  "KOF"  
##  [235] "HAL"   "CDNS"  "AME"   "VEEV"  "BNS"   "USFD"  "CCJ"   "MSI"   "FAST" 
##  [244] "ALNY"  "MFG"   "SFM"   "DVN"   "SNPS"  "DAL"   "IDXX"  "CM"    "PFGC" 
##  [253] "PBA"   "FTNT"  "LHX"   "HUM"   "TFC"   "CAG"   "CTRA"  "WDAY"  "VRSK" 
##  [262] "DXCM"  "BBVA"  "CPB"   "TS"    "TEAM"  "ODFL"  "CNC"   "AFL"   "PRMB" 
##  [271] "YPF"   "ADSK"  "OTIS"  "CAH"   "MET"   "BJ"    "EC"    "TTD"   "IR"   
##  [280] "BNTX"  "BK"    "SJM"   "WES"   "NXPI"  "FERG"  "MTD"   "ARES"  "TAP"  
##  [289] "PAA"   "WAB"   "PHG"   "TRV"   "FTI"   "ROP"   "UAL"   "WST"   "MFC"  
##  [298] "AR"    "COKE"  "SNOW"  "HEI"   "TEVA"  "NU"    "OVV"   "ACI"   "PAYX" 
##  [307] "ROK"   "WAT"   "AMP"   "DTM"   "BG"    "DDOG"  "FER"   "ONC"   "ING"  
##  [316] "VNOM"  "PPC"   "FICO"  "EFX"   "NTRA"  "ALL"   "KNTK"  "EDU"   "TEL"  
##  [325] "GPN"   "ZBH"   "BCS"   "RRC"   "BRBR"  "FIS"   "XYL"   "ILMN"  "ITUB" 
##  [334] "APA"   "LW"    "GLW"   "STE"   "MSCI"  "HESM"  "INGR"  "GRMN"  "DOV"  
##  [343] "BIIB"  "AIG"   "CHRD"  "ELF"   "NET"   "VLTO"  "LH"    "DFS"   "AM"   
##  [352] "FRPT"  "IT"    "HUBB"  "PODD"  "NDAQ"  "MTDR"  "CELH"  "CTSH"  "LII"  
##  [361] "COO"   "PRU"   "SUN"   "OLLI"  "HUBS"  "EME"   "RPRX"  "LYG"   "POST" 
##  [370] "WIT"   "RYAAY" "SMMT"  "NWG"   "DINO"  "BRFS"  "HPQ"   "LUV"   "ALGN" 
##  [379] "HOOD"  "NOV"   "COTY"  "MCHP"  "JBHT"  "MOH"   "ACGL"  "TGS"   "HIMS" 
##  [388] "MPWR"  "DGX"   "DB"    "LB"    "TAL"   "HPE"   "AER"   "BAX"   "OWL"  
##  [397] "NFG"   "DAR"   "ANSS"  "SNA"   "UTHR"  "SLF"   "NE"    "CALM"  "KEYS" 
##  [406] "WSO"   "ICLR"  "RJF"   "WFRD"  "LANC"  "ZS"    "GFL"   "HOLX"  "MTB"  
##  [415] "CHX"   "LOPE"  "ON"    "CSL"   "MRNA"  "HIG"   "VIST"  "LRN"   "GDDY" 
##  [424] "RBA"   "NBIX"  "TW"    "CRK"   "FLO"   "ERIC"  "BLDR"  "AVTR"  "WTW"  
##  [433] "CRC"   "NWL"   "FTV"   "BAH"   "RVTY"  "FCNCA" "WHD"   "IPAR"  "BR"   
##  [442] "J"     "FMS"   "FCNCO" "IEP"   "FIZZ"  "ZM"    "PNR"   "INSM"  "BRO"  
##  [451] "SM"    "GHC"   "GIB"   "FTAI"  "INCY"  "STT"   "MGY"   "SMPL"  "CDW"  
##  [460] "FIX"   "DVA"   "FITB"  "CIVI"  "ATGE"  "GFS"   "XPO"   "ITCI"  "SYF"  
##  [469] "AROC"  "TBBB"  "IOT"   "IEX"   "GMAB"  "TROW"  "GLNG"  "SAM"   "TYL"  
##  [478] "EXPD"  "VTRS"  "LPLA"  "MUR"   "JJSF"  "CPAY"  "MAS"   "SOLV"  "IX"   
##  [487] "CNX"   "RLX"   "NTAP"  "SYM"   "THC"   "HBAN"  "PAGP"  "PSMT"  "NOK"  
##  [496] "ZTO"   "GMED"  "TPG"   "NXE"   "LAUR"  "STM"   "RKLB"  "UHS"   "BAM"  
##  [505] "CRGY"  "NOMD"  "ASX"   "OC"    "TECH"  "CINF"  "NOG"   "TER"   "ACM"  
##  [514] "RDY"   "MKL"   "KGS"   "WDC"   "CNH"   "BMRN"  "WRB"   "RIG"   "CENT" 
##  [523] "PTC"   "TXT"   "PCVX"  "KB"    "LBRT"  "SPB"   "ALAB"  "GGG"   "SRPT" 
##  [532] "RF"    "FRO"   "TR"    "TDY"   "UHAL"  "SNN"   "RKT"   "ARLP"  "STRA" 
##  [541] "UI"    "CW"    "MEDP"  "ERIE"  "PTEN"  "CENTA" "TOST"  "POOL"  "DOCS" 
##  [550] "PUK"   "VRN"   "CCU"   "FSLR"  "ATR"   "IBKR"  "VAL"   "UTZ"   "SMCI" 
##  [559] "SWK"   "PEN"   "NTRS"  "HP"    "CHEF"  "PSTG"  "CLH"   "EXAS"  "CBOE" 
##  [568] "UEC"   "COCO"  "ZBRA"  "CHRW"  "WBA"   "CFG"   "GPOR"  "WMK"   "VRSN" 
##  [577] "RTO"   "QGEN"  "BBD"   "PBF"   "THS"   "CHKP"  "SAIA"  "CG"    "BSM"  
##  [586] "UNFI"  "AFRM"  "ESLT"  "BIO"   "PFG"   "STR"   "PRDO"  "LDOS"  "NDSN" 
##  [595] "EHC"   "L"     "DNUT"  "GRAB"  "ITT"   "HSIC"  "FDS"   "EE"    "VITL" 
##  [604] "STX"   "AAL"   "RGEN"  "TRU"   "TDW"   "GO"    "SSNC"  "NVT"   "HQY"  
##  [613] "CRBG"  "UGP"   "EPC"   "MDB"   "TFII"  "EXEL"  "NMR"   "USAC"  "KLG"  
##  [622] "KSPI"  "MTZ"   "MASI"  "KEY"   "OII"   "FDP"   "DOCU"  "ALLE"  "GKOS" 
##  [631] "SHG"   "CSAN"  "ANDE"  "TRMB"  "TTEK"  "BRKR"  "JEF"   "SDRL"  "UTI"  
##  [640] "JBL"   "WWD"   "CRL"   "SOFI"  "AESI"  "COUR"  "CYBR"  "BWXT"  "TFX"  
##  [649] "RYAN"  "STNG"  "HELE"  "TWLO"  "LECO"  "CHE"   "EG"    "BTU"   "AFYA" 
##  [658] "NTNX"  "RRX"   "ROIV"  "FNF"   "DKL"   "UVV"   "GEN"   "APG"   "TEM"  
##  [667] "EQH"   "BTE"   "FLEX"  "CNM"   "ENSG"  "BAP"   "SOC"   "UDMY"  "MANH" 
##  [676] "AOS"   "ASND"  "BSBR"  "NEXT"  "DOLE"  "UMC"   "ULS"   "RVMD"  "RGA"  
##  [685] "TRMD"  "IMKTA" "AZPN"  "AAON"  "JAZZ"  "ARCC"  "CMBT"  "JBSS"  "DT"   
##  [694] "ARMK"  "BPMC"  "MORN"  "BKV"   "TPB"   "ENTG"  "GNRC"  "MDGL"  "UNM"  
##  [703] "CLMT"  "AGRO"  "COHR"  "AIT"   "BBIO"  "EWBC"  "DNN"   "NGVC"  "AUR"  
##  [712] "PAC"   "HALO"  "RNR"   "CVI"   "AVO"   "FFIV"  "AYI"   "LNTH"  "CNA"  
##  [721] "HPK"   "STKL"  "SWKS"  "RBC"   "LEGN"  "GGAL"  "INSW"  "MGPI"  "BSY"  
##  [730] "BLD"   "WAY"   "HLI"   "SEI"   "DAO"   "OKTA"  "MLI"   "CORT"  "BCH"  
##  [739] "KOS"   "DDL"   "GWRE"  "CRS"   "MMSI"  "AFG"   "TALO"  "HLF"   "AKAM" 
##  [748] "WCC"   "BLCO"  "EVR"   "MNR"   "USNA"  "DUOL"  "WMS"   "CYTK"  "SF"   
##  [757] "WTTR"  "BRCC"  "EPAM"  "CR"    "ELAN"  "ALLY"  "DMLP"  "HNST"  "LOGI" 
##  [766] "STN"   "STVN"  "FUTU"  "WKC"   "SPTN"  "JNPR"  "KNX"   "NUVL"  "FHN"  
##  [775] "DHT"   "WEST"  "JKHY"  "FLR"   "GH"    "AIZ"   "GLP"   "BGS"   "CRDO" 
##  [784] "DRS"   "TLX"   "SEIC"  "VET"   "SENEB" "WIX"   "FBIN"  "GRFS"  "BEN"  
##  [793] "KRP"   "GOTU"  "RBRK"  "GTLS"  "INSP"  "KNSL"  "XPRO"  "SENEA" "CIEN" 
##  [802] "ALK"   "IONS"  "AEG"   "NRP"   "SNDL"  "CLS"   "LTM"   "OPCH"  "BNT"  
##  [811] "TNK"   "HAIN"  "PAYC"  "DCI"   "ALKS"  "WBS"   "HLX"   "FC"    "DAY"  
##  [820] "ATI"   "AXSM"  "GL"    "FLNG"  "LINC"  "MNDY"  "TTC"   "ITGR"  "PRI"  
##  [829] "LEU"   "LMNR"  "PCOR"  "ASR"   "RDNT"  "UWMC"  "RES"   "VLGEA" "KVYO" 
##  [838] "JOBY"  "WAL"   "GEL"   "CVGW"  "PCTY"  "FLS"   "VRNA"  "PNFP"  "EFXT" 
##  [847] "OTLY"  "IONQ"  "SARO"  "TGTX"  "BSAC"  "ACDC"  "CDXC"  "YMM"   "KBR"  
##  [856] "KRYS"  "CFR"   "VTLE"  "LND"   "NICE"  "CAE"   "ICUI"  "ORI"   "DK"   
##  [865] "APEI"  "ENPH"  "LPX"   "OGN"   "BMA"   "NVGS"  "EWCZ"  "SNX"   "HII"  
##  [874] "OSCR"  "WTFC"  "UUUU"  "NUS"   "ESTC"  "DLB"   "ACHC"  "MKTX"  "MRC"  
##  [883] "LWAY"  "GTLB"  "MIDD"  "SRRK"  "CBSH"  "LPG"   "ZVIA"  "MTSI"  "TREX" 
##  [892] "RARE"  "CMA"   "INVX"  "YSG"   "DSGX"  "ESAB"  "BHVN"  "HLNE"  "PUMP" 
##  [901] "ISPR"  "U"     "RHI"   "PRCT"  "ZION"  "VTOL"  "NATR"  "PSN"   "ERJ"  
##  [910] "BTSG"  "CIB"   "BORR"  "AFRI"  "ONTO"  "LOAR"  "ALVO"  "FRHC"  "SBR"  
##  [919] "MAMA"  "DOX"   "AGCO"  "XRAY"  "WF"    "PARR"  "NAMI"  "CFLT"  "FCN"  
##  [928] "SHC"   "IVZ"   "PDS"   "BYND"  "FOUR"  "WTS"   "VKTX"  "SNV"   "CLB"  
##  [937] "GLOB"  "SPXC"  "ADMA"  "AXS"   "GRNT"  "SKIL"  "ALTR"  "AZEK"  "PBH"  
##  [946] "SSB"   "DEC"   "ALCO"  "CACI"  "CWST"  "APLS"  "RLI"   "CAPL"  "WILC" 
##  [955] "DBX"   "R"     "CRSP"  "ONB"   "NESR"  "SKIN"  "APPF"  "NVST"  "PB"   
##  [964] "VTS"   "VSTA"  "TTAN"  "CRNX"  "BOKF"  "REPX"  "BILL"  "MSA"   "PTCT" 
##  [973] "STEP"  "TXO"   "CHGG"  "KD"    "ADT"   "HAE"   "BPOP"  "EU"    "FTLF" 
##  [982] "FN"    "WSC"   "RYTM"  "JHG"   "CLNE"  "PHH"   "PEGA"  "BECN"  "RNA"  
##  [991] "VOYA"  "NPKI"  "HFFG"  "VERX"  "ZWS"   "IRTC"  "JXN"   "NGL"   "ACU"  
## [1000] "INFA"  "VMI"   "ACLX"  "OMF"   "TK"    "CLEU"  "LSCC"  "AWI"   "VCYT" 
## [1009] "CADE"  "NBR"   "DSY"   "OSK"   "IMVT"  "PJT"   "NOA"   "QSG"   "MKSI" 
## [1018] "KEX"   "PRGO"  "MARA"  "NAT"   "BRLS"  "G"     "CSWI"  "TWST"  "FAF"  
## [1027] "PBT"   "COE"   "OTEX"  "LSTR"  "DNLI"  "XP"    "TEN"   "LGCY"  "PATH" 
## [1036] "SITE"  "WRBY"  "FSK"   "GPRK"  "STG"   "OLED"  "BE"    "ACAD"  "MTG"  
## [1045] "TTI"   "BRID"  "S"     "FSS"   "LFST"  "COOP"  "HNRG"  "PAVS"  "WEX"  
## [1054] "SMR"   "XENE"  "SLM"   "GFR"   "IH"    "EXLS"  "GXO"   "AMED"  "OBDC" 
## [1063] "CLCO"  "VFF"   "CCCS"  "GATX"  "CON"   "ESNT"  "EGY"   "DIT"   "AL"   
## [1072] "QDEL"  "COLB"  "SD"    "LSF"   "QRVO"  "AMTM"  "RXRX"  "CACC"  "OBE"  
## [1081] "MYND"  "SMTC"  "HRI"   "MLTX"  "GBCI"  "URG"   "BEDU"  "SPSC"  "GTES" 
## [1090] "KYMR"  "QFIN"  "SMC"   "BRFH"  "CVLT"  "HXL"   "FOLD"  "MC"    "SGU"  
## [1099] "GROV"  "ASTS"  "DY"    "ALHC"  "BBAR"  "PNRG"  "UG"    "AMKR"  "TKR"  
## [1108] "ZLAB"  "CRVL"  "RNGR"  "CLNN"  "SOUN"  "SNDR"  "NVCR"  "SIGI"  "NGS"  
## [1117] "GNS"   "OS"    "STRL"  "BHC"   "AMG"   "BRY"   "FARM"  "CWAN"  "ROAD" 
## [1126] "VCEL"  "HOMB"  "OIS"   "SHOT"  "QXO"   "ACHR"  "MRUS"  "UPST"  "UROY" 
## [1135] "ORIS"  "SATS"  "BBU"   "PDCO"  "THG"   "TBN"   "COOT"  "CGNX"  "AEIS" 
## [1144] "PRVA"  "VIRT"  "REI"   "SOWG"  "NVMI"  "ACA"   "CPRX"  "LNC"   "FTK"  
## [1153] "DTCK"  "NXT"   "MMS"   "AMRX"  "UMBF"  "GTE"   "AACG"  "BMI"   "AVAV" 
## [1162] "SWTX"  "DNB"   "EP"    "WVVI"  "RMBS"  "IESC"  "EWTX"  "FG"    "AMPY" 
## [1171] "NAII"  "WK"    "MATX"  "LIVN"  "FNB"   "WTI"   "MSS"   "LFUS"  "TNET" 
## [1180] "ARWR"  "MAIN"  "NC"    "MTEX"  "LITE"  "MSM"   "SGRY"  "PFSI"  "KGEI" 
## [1189] "RAY"   "ARW"   "EXPO"  "JANX"  "FFIN"  "FET"   "FEDU"  "LYFT"  "FELE" 
## [1198] "SEM"   "UBSI"  "DLNG"  "SANW"  "QTWO"  "KTOS"  "NAMS"  "VLY"   "SJT"  
## [1207] "JVA"   "TSEM"  "SKYW"  "NEOG"  "OZK"   "EPM"   "LOCL"  "SAIC"  "PRIM" 
## [1216] "AZTA"  "ACT"   "PROP"  "SDOT"  "VNT"   "KAI"   "CLOV"  "WTM"   "BROG" 
## [1225] "BOF"   "CRUS"  "RXO"   "TNDM"  "ESGR"  "PHX"   "EDTK"  "SRAD"  "CBZ"  
## [1234] "CGON"  "HWC"   "AMTX"  "YHC"   "NOVT"  "VRRM"  "ADUS"  "RDN"   "EPSN" 
## [1243] "RMCF"  "SITM"  "BCO"   "HCM"   "FCFS"  "MMLP"  "YQ"    "PONY"  "SPR"  
## [1252] "IBRX"  "PIPR"  "PTLE"  "NCRA"  "ACIW"  "GEO"   "MESO"  "AGO"   "GEOS" 
## [1261] "DDC"   "GDS"   "GVA"   "APGE"  "SFBS"  "DTI"   "EEIQ"  "RGTI"  "CPA"  
## [1270] "MIRM"  "CNS"   "ABVE"  "QLYS"  "UNF"   "PTGX"  "BGC"   "SLNG"  "VINE" 
## [1279] "VRNS"  "ENS"   "PACS"  "LAZ"   "SND"   "SBEV"  "NSIT"  "MDU"   "LMAT" 
## [1288] "BWIN"  "LSE"   "CTCX"  "IDCC"  "MIR"   "WGS"   "ABCB"  "KAVL"  "ITRI" 
## [1297] "NPO"   "LGND"  "KMPR"  "IMPP"  "WAFU"  "FRSH"  "ECG"   "CNTA"  "RIOT" 
## [1306] "KLXE"  "IMG"   "INTA"  "OMAB"  "HRMY"  "AB"    "VOC"   "RKDA"  "INGM" 
## [1315] "BRC"   "PINC"  "GBDC"  "NCSM"  "IBG"   "ASAN"  "CXT"   "VERA"  "AX"   
## [1324] "PED"   "TWG"   "BRZE"  "MWA"   "CERT"  "IBOC"  "RCON"  "AMBO"  "TENB" 
## [1333] "KFY"   "SUPN"  "VCTR"  "CRT"   "ATPC"  "CLBT"  "GFF"   "CNMD"  "FLG"  
## [1342] "USEG"  "GV"    "ALGM"  "REZI"  "SLNO"  "ASB"   "NINE"  "LXEH"  "AI"   
## [1351] "ATMU"  "BEAM"  "CNO"   "PVL"   "GSUN"  "BDC"   "CAAP"  "UFPT"  "GSHD" 
## [1360] "DWSN"  "EDBL"  "AVT"   "HAYW"  "IDYA"  "NNI"   "PRT"   "CASK"  "RELY" 
## [1369] "GMS"   "TMDX"  "BANF"  "NRT"   "BTTR"  "BOX"   "ABM"   "ARDT"  "UCB"  
## [1378] "PXS"   "EEFT"  "FA"    "OMCL"  "AUB"   "VIVK"  "AGRI"  "ZETA"  "ATKR" 
## [1387] "IART"  "MCY"   "INDO"  "AQB"   "ST"    "TEX"   "ATRC"  "TCBI"  "TOPS" 
## [1396] "TCTM"  "PLXS"  "CAR"   "AGIO"  "WU"    "CGBS"  "SNAX"  "PI"    "LUNR" 
## [1405] "NEO"   "FULT"  "BATL"  "SISI"  "SLAB"  "TRN"   "IOVA"  "TFSL"  "BANL" 
## [1414] "XXII"  "SANM"  "POWL"  "GDRX"  "EBC"   "CKX"   "FAMI"  "CAMT"  "ATS"  
## [1423] "PGNY"  "FIBK"  "MXC"   "GNLN"  "GBTG"  "HAFN"  "TARS"  "IFS"   "SKYQ" 
## [1432] "TANH"  "PYCR"  "NSP"   "BLTE"  "CATY"  "HUSA"  "STKH"  "CORZ"  "ENVX" 
## [1441] "WVE"   "APAM"  "BRN"   "PAY"   "PLUG"  "AKRO"  "SNEX"  "MTR"   "NCNO" 
## [1450] "MGRC"  "ASTH"  "FHB"   "BPT"   "RUM"   "HUBG"  "GERN"  "HGTY"  "MARPS"
## [1459] "ASGN"  "MAN"   "SDGR"  "CBU"   "EONR"  "YOU"   "JBLU"  "TXG"   "HTGC" 
## [1468] "TPET"  "WRD"   "AZZ"   "TVTX"  "CLSK"  "BLKB"  "MRCY"  "AMPH"  "WD"   
## [1477] "DXC"   "ENOV"  "TDOC"  "WSFS"  "CLVT"  "ZIM"   "IRON"  "FHI"   "PAYO" 
## [1486] "ENR"   "DVAX"  "BFH"   "FORM"  "MNKD"  "FBP"   "ALIT"  "CXW"   "IMCR" 
## [1495] "GNW"   "BL"    "SEB"   "PHR"   "CVBF"  "POWI"  "MYRG"  "ARQT"  "BUR"  
## [1504] "ALKT"  "NPWR"  "GPCR"  "BHF"   "ZI"    "HNI"   "FTRE"  "BKU"   "AGYS" 
## [1513] "SXI"   "CLDX"  "NMIH"  "FROG"  "ARCB"  "BCRX"  "PLMR"  "ESE"   "WERN" 
## [1522] "GLPG"  "BOH"   "DLO"   "EPAC"  "ATEC"  "PRK"   "LIF"   "VSTS"  "EVO"  
## [1531] "SFNC"  "DV"    "ICFI"  "AGL"   "INDB"  "IPGP"  "HI"    "INDV"  "BANC" 
## [1540] "AMBA"  "AIR"   "PRAX"  "WAFD"  "DOCN"  "ALG"   "AAPG"  "LMND"  "SYNA" 
## [1549] "AGX"   "NHC"   "LPL"   "XMTR"  "NVAX"  "ENVA"  "RNG"   "WOR"   "VIR"  
## [1558] "IREN"  "AVPT"  "CDLR"  "VRDN"  "FFBC"  "DIOD"  "HURN"  "INMD"  "TBBK" 
## [1567] "GRND"  "GBX"   "KNSA"  "TOWN"  "CNXC"  "VSEC"  "MRVI"  "AVAL"  "TDC"  
## [1576] "ULCC"  "NRIX"  "SPNT"  "BTDR"  "HLMN"  "DYN"   "PFS"   "ALRM"  "KMT"  
## [1585] "XNCR"  "HUT"   "FIVN"  "PCT"   "SPRY"  "PPBI"  "PRGS"  "CMPR"  "GRDN" 
## [1594] "WULF"  "OSIS"  "GOGL"  "AHCO"  "MRX"   "KLIC"  "ROCK"  "USPH"  "GCMG" 
## [1603] "RUN"   "PRG"   "RXST"  "FBK"   "PAR"   "ALGT"  "ARDX"  "BANR"  "APPN" 
## [1612] "BWLP"  "MDXG"  "FRME"  "TTMI"  "SBLK"  "BLFS"  "RNST"  "ODD"   "HEES" 
## [1621] "AORT"  "LION"  "STNE"  "EVEX"  "SYRE"  "SBCF"  "KC"    "REVG"  "OCUL" 
## [1630] "NBTB"  "RPD"   "CODI"  "ARVN"  "TRMK"  "LSPD"  "DSGR"  "DAWN"  "WSBC" 
## [1639] "ACLS"  "DAC"   "SNDX"  "LU"    "NATL"  "TNC"   "LQDA"  "AGM"   "BB"   
## [1648] "CMRE"  "MD"    "EFSC"  "SWI"   "VVX"   "STAA"  "SYBT"  "VSH"   "HLIO" 
## [1657] "RCUS"  "CALX"  "KRNT"  "CDNA"  "TFIN"  "FLYW"  "AMRC"  "HROW"  "TRUP" 
## [1666] "VICR"  "ATSG"  "ADPT"  "PWP"   "VIAV"  "BV"    "WEAV"  "TSLX"  "EXTR" 
## [1675] "CMPO"  "INVA"  "INTR"  "QUBT"  "TGI"   "MYGN"  "OFG"   "EVTC"  "SFL"  
## [1684] "EVH"   "FIHL"  "CXM"   "NMM"   "OMI"   "HG"    "QBTS"  "WLFC"  "AUPH" 
## [1693] "PSEC"  "AVDX"  "DNOW"  "ANIP"  "CIFR"  "APLD"  "TILE"  "PCRX"  "HTH"  
## [1702] "IBTA"  "APOG"  "NTLA"  "SUPV"  "VYX"   "BLBD"  "RCKT"  "LC"    "WNS"  
## [1711] "LZ"    "PLSE"  "SKWD"  "PAGS"  "PBI"   "EMBC"  "CASH"  "MQ"    "CDRE" 
## [1720] "TLRY"  "STC"   "PLUS"  "LNN"   "ELVN"  "FBNC"  "RAMP"  "DXPE"  "AMN"  
## [1729] "PAX"   "EVCM"  "NSSC"  "COLL"  "LOB"   "DAVA"  "TPC"   "REPL"  "GB"   
## [1738] "MRTN"  "BGM"   "FINV"  "ROG"   "PL"    "ABCL"  "LKFN"  "SIMO"  "HTZ"  
## [1747] "PHVS"  "CHCO"  "JAMF"  "NVEE"  "ZYME"  "BBUC"  "CNXN"  "CRAI"  "OPK"  
## [1756] "FCF"   "KN"    "EOSE"  "BKDT"  "NWBI"  "SPT"   "UP"    "BKD"   "MBIN" 
## [1765] "GDYN"  "NNE"   "HSTM"  "CLBK"  "SONO"  "AMSC"  "FNA"   "NBHC"  "UCTT" 
## [1774] "ARLO"  "GRAL"  "NTB"   "NABL"  "RDW"   "GYRE"  "TWFG"  "MXL"   "CECO" 
## [1783] "PNTG"  "NIC"   "NN"    "ULH"   "OCS"   "HMN"   "SEMR"  "CTOS"  "AVAH" 
## [1792] "CUBI"  "DFIN"  "ASPN"  "COGT"  "VRTS"  "BHE"   "NX"    "ESTA"  "PX"   
## [1801] "VRNT"  "BBSI"  "AVBP"  "STEL"  "VECO"  "CCEC"  "EOLS"  "SASR"  "UPBD" 
## [1810] "LMB"   "BCYC"  "HOPE"  "PD"    "JBI"   "AVXL"  "SEZL"  "DBD"   "CMCO" 
## [1819] "PAHC"  "NAVI"  "PLAB"  "KFRC"  "CVAC"  "GSBD"  "CTS"   "DLX"   "NUVB" 
## [1828] "STBA"  "GRC"   "PRTA"  "VBTX"  "THR"   "ETNB"  "SRCE"  "HLIT"  "TRNS" 
## [1837] "TYRA"  "TCBK"  "NTCT"  "EH"    "BFLY"  "WT"    "TH"    "HCSG"  "WABC" 
## [1846] "ADEA"  "CRESY" "CBLL"  "BLX"   "AAOI"  "PRLB"  "SEPN"  "DCOM"  "MLNK" 
## [1855] "DCO"   "RLAY"  "QCRH"  "JKS"   "FWRD"  "AXGN"  "BUSE"  "TASK"  "GIC"  
## [1864] "RBCAA" "KARO"  "FBYD"  "IRMD"  "CET"   "SPNS"  "HSII"  "QURE"  "MFIC" 
## [1873] "VNET"  "VLRS"  "OPT"   "OCSL"  "HIMX"  "HY"    "AVDL"  "BY"    "CSGS" 
## [1882] "ERII"  "SANA"  "TIGR"  "AMPL"  "FIP"   "CRMD"  "EIG"   "DQ"    "HTLD" 
## [1891] "DH"    "NMFC"  "FSLY"  "SERV"  "CSTL"  "SAFT"  "SABR"  "SNCY"  "CTKB" 
## [1900] "HCI"   "FORTY" "BXC"   "ORIC"  "BHLB"  "DCBO"  "PSIX"  "CRON"  "CCB"  
## [1909] "ATEN"  "SKYH"  "OFIX"  "GABC"  "ENFN"  "SPLP"  "TECX"  "PFBC"  "VSAT" 
## [1918] "EVTL"  "AVNS"  "ROOT"  "COHU"  "MATW"  "SIBN"  "BCSF"  "OLO"   "GSL"  
## [1927] "DNA"   "ECPG"  "CRCT"  "ADSE"  "SLP"   "FSUN"  "NYAX"  "GLDD"  "LENZ" 
## [1936] "FBMS"  "SMWB"  "SWIM"  "CAPR"  "PEBO"  "RCAT"  "ASTE"  "SENS"  "SII"  
## [1945] "SCSC"  "WNC"   "VERV"  "IGIC"  "AOSL"  "MEG"   "IMTX"  "CSWC"  "DSP"  
## [1954] "MVST"  "PLRX"  "BOW"   "ICHR"  "CVLG"  "BVS"   "OCFC"  "INOD"  "ECO"  
## [1963] "KIDS"  "OBK"   "TIXT"  "ZIP"   "EBS"   "BRKL"  "PDFS"  "NVRI"  "BCAX" 
## [1972] "AMAL"  "COMM"  "NPK"   "AUTL"  "BBDC"  "VTEX"  "JELD"  "ABUS"  "BFC"  
## [1981] "DAVE"  "MAGN"  "DNTH"  "AMSF"  "XRX"   "ZJK"   "IMNM"  "TMP"   "BELFA"
## [1990] "PLPC"  "KURA"  "LX"    "VMEO"  "BYRN"  "PHAR"  "SBSI"  "DGII"  "GNK"  
## [1999] "CRGX"  "BRDG"  "EXOD"  "BLDP"  "CGEM"  "CTBI"  "PENG"  "RGR"   "LAB"  
## [2008] "PFLT"  "ARRY"  "RYI"   "AUNA"  "NCDL"  "SEDG"  "EVLV"  "EYPT"  "CGBD" 
## [2017] "SHLS"  "CASS"  "IRWD"  "AMTB"  "PSFE"  "ATRO"  "HUMA"  "BELFB" "FLX"  
## [2026] "CCRN"  "BHRB"  "PRO"   "SBC"   "VREX"  "SLRC"  "DMRC"  "EBF"   "SVRA" 
## [2035] "FUFU"  "BBAI"  "ASC"   "TMCI"  "GHLD"  "ACMR"  "WLDN"  "STOK"  "MTAL" 
## [2044] "WOLF"  "KODK"  "ACCD"  "CNOB"  "BLND"  "KELYB" "UPB"   "TRIN"  "ML"   
## [2053] "KELYA" "ERAS"  "FMBH"  "TUYA"  "IIIN"  "SRDX"  "BITF"  "INDI"  "ACCO" 
## [2062] "TALK"  "UVSP"  "RDWR"  "GHM"   "FLGT"  "HFWA"  "ETWO"  "AIRJ"  "AKBA" 
## [2071] "ATLC"  "CRNC"  "NWPX"  "TBPH"  "PRAA"  "CSIQ"  "KE"    "XERS"  "OSBC" 
## [2080] "PRTH"  "SWBI"  "SMLR"  "PRA"   "CINT"  "BWMN"  "ANAB"  "MCBS"  "BASE" 
## [2089] "EAF"   "MNMD"  "NBN"   "GCT"   "BOC"   "DCTH"  "TYG"   "HCKT"  "TWI"  
## [2098] "RAPP"  "EGBN"  "YEXT"  "TRC"   "INNV"  "CFFN"  "AIOT"  "OFLX"  "MXCT" 
## [2107] "AC"    "CRSR"  "ACTG"  "PROK"  "NBBK"  "RZLV"  "CYRX"  "DCGO"  "TCPC" 
## [2116] "IIIV"  "ALT"   "CPF"   "DJCO"  "RR"    "MLYS"  "TIPT"  "DAKT"  "SHYF" 
## [2125] "RNAC"  "FCBC"  "RSKD"  "AMPX"  "ANGO"  "CEVA"  "NL"    "CLPT"  "BFST" 
## [2134] "NTGR"  "LNZA"  "MGTX"  "ABL"   "MLAB"  "SB"    "OMER"  "FDUS"  "NVTS" 
## [2143] "PAMT"  "NNOX"  "EQBK"  "ADTN"  "BLDE"  "KROS"  "IBCP"  "BBCP"  "QTRX" 
## [2152] "SLQT"  "PGY"   "SPIR"  "PHAT"  "CCAP"  "OUST"  "MTRX"  "ARCT"  "MBWM" 
## [2161] "SKYT"  "LXFR"  "PACB"  "UFCS"  "PUBM"  "PKOH"  "SIGA"  "AACT"  "RPAY" 
## [2170] "QUAD"  "ABSI"  "ORRF"  "KULR"  "ASLE"  "GHRS"  "GSBC"  "OSPN"  "MEC"  
## [2179] "TRDA"  "HBNC"  "IMOS"  "TITN"  "YI"    "NOAH"  "SSYS"  "SATL"  "MREO" 
## [2188] "HAFC"  "YALA"  "CIX"   "CYH"   "HBT"   "CTLP"  "MTW"   "TRML"  "LPRO" 
## [2197] "IMXI"  "RLGT"  "ORGO"  "OPY"   "ITRN"  "ARQ"   "OLMA"  "EZPW"  "CGNT" 
## [2206] "RGP"   "ANNX"  "MCB"   "CNDT"  "FSTR"  "NPCE"  "AMRK"  "CAN"   "TATT" 
## [2215] "PSNL"  "GLAD"  "MGIC"  "PKE"   "SAGE"  "VINP"  "NOVA"  "ORN"   "URGN" 
## [2224] "SMBC"  "HKD"   "NVX"   "CVRX"  "FSBC"  "CLMB"  "FORR"  "SNDA"  "INV"  
## [2233] "LASR"  "MG"    "ETON"  "ESQ"   "LYTS"  "AMBI"  "ZVRA"  "HIPO"  "AEHR" 
## [2242] "FCEL"  "TKNO"  "OPFI"  "RXT"   "TG"    "KALV"  "VEL"   "NNDM"  "PANL" 
## [2251] "CELC"  "CAC"   "BIGC"  "ESEA"  "NRC"   "TRST"  "FARO"  "FLYX"  "KOD"  
## [2260] "WRLD"  "PRCH"  "AZUL"  "PRTC"  "CCBG"  "CLFD"  "GENC"  "ATXS"  "CION" 
## [2269] "DDD"   "QSI"   "ALTI"  "RDVT"  "TCMD"  "WASH"  "MITK"  "KMDA"  "LDI"  
## [2278] "ARQQ"  "ESPR"  "MOFG"  "XPER"  "EHAB"  "ACIC"  "ATOM"  "PGEN"  "AMBC" 
## [2287] "CCSI"  "ALLO"  "MTLS"  "ABVX"  "UVE"   "BAND"  "PRME"  "HTBK"  "AIP"  
## [2296] "CDXS"  "HIFS"  "CRNT"  "ZIMV"  "SPFI"  "UIS"   "NYXH"  "MPB"   "POET" 
## [2305] "AURA"  "BTBT"  "MEI"   "RIGL"  "PGC"   "ALNT"  "RGNX"  "NVEC"  "VALN" 
## [2314] "THFF"  "API"   "OABI"  "TRAK"  "TREE"  "LSAK"  "TERN"  "QD"    "ILLR" 
## [2323] "AHG"   "SMBK"  "KLTR"  "CERS"  "FISI"  "ARBE"  "ENGN"  "GDOT"  "OOMA" 
## [2332] "CMPX"  "BCAL"  "BKKT"  "ALMS"  "SHBI"  "GILT"  "MBX"   "FFWM"  "LGTY" 
## [2341] "LXRX"  "FMNB"  "IBEX"  "CMRX"  "DGICA" "HCAT"  "PFIS"  "EGHT"  "KRRO" 
## [2350] "CCNE"  "BKSY"  "TRVI"  "PSBD"  "EB"    "NMRA"  "DGICB" "FRGT"  "CRVS" 
## [2359] "GBLI"  "MVIS"  "GLUE"  "MSBI"  "SMRT"  "TBRG"  "HONE"  "BLZE"  "SPOK" 
## [2368] "NFBK"  "VUZI"  "VMD"   "GAIN"  "NRDY"  "TNGX"  "ALRS"  "VLN"   "SMTI" 
## [2377] "GLRE"  "TSSI"  "ZBIO"  "VALU"  "VPG"   "XOMA"  "EBTC"  "LAW"   "TSHA" 
## [2386] "PNNT"  "EXFY"  "VYGR"  "VBNK"  "AUDC"  "ALDX"  "BHB"   "PERF"  "IMMP" 
## [2395] "CBNK"  "ASUR"  "ATYR"  "GCBC"  "AEVA"  "OSUR"  "ANSC"  "IMMR"  "AIRS" 
## [2404] "AROW"  "ONTF"  "CADL"  "HIVE"  "DOMO"  "CTNM"  "EQV"   "PDYN"  "ELMD" 
## [2413] "AAM"   "ICG"   "DRTS"  "KRNY"  "TSAT"  "AQST"  "TCBX"  "ALLT"  "DSGN" 
## [2422] "WDH"   "ATGL"  "YMAB"  "ITIC"  "SVCO"  "ITOS"  "YRD"   "MAPS"  "SOPH" 
## [2431] "DHIL"  "RZLT"  "FFIC"  "HRTX"  "NRIM"  "BDMD"  "UNTY"  "INGN"  "NETD" 
## [2440] "VSTM"  "RWAY"  "CGC"   "BSVN"  "CATX"  "ASA"   "ACIU"  "BSRR"  "DMAC" 
## [2449] "CARE"  "SLN"   "WALD"  "ACRS"  "XYF"   "AVIR"  "GNTY"  "ZYXI"  "CCIX" 
## [2458] "CMPS"  "MFH"   "ATAI"  "BMRC"  "ELDN"  "SCM"   "FMAO"  "AMRN"  "FBIZ" 
## [2467] "VNDA"  "HRZN"  "CCCC"  "GPAT"  "MNPR"  "ALF"   "BNTC"  "MBAV"  "RCEL" 
## [2476] "BWB"   "AMLX"  "MBI"   "THRD"  "CWBC"  "SGMO"  "HBCP"  "LUNG"  "RBB"  
## [2485] "NAUT"  "WTBA"  "HITI"  "RRBI"  "ANIK"  "USCB"  "TVGN"  "PMTS"  "LFCR" 
## [2494] "FRBA"  "NUTX"  "HRTG"  "ARAY"  "ACNB"  "ABEO"  "RM"    "LRMR"  "GIG"  
## [2503] "OCGN"  "NEWT"  "CRDF"  "SAR"   "DRUG"  "GOCO"  "FHTX"  "VACH"  "FULC" 
## [2512] "HYAC"  "NGNE"  "FSBW"  "PRQR"  "CCIR"  "LFMD"  "CIVB"  "SERA"  "NODK" 
## [2521] "CGEN"  "SSBK"  "FDMT"  "CUB"   "MDWD"  "BEAG"  "UTMD"  "SIMA"  "ACB"  
## [2530] "COFS"  "PROF"  "EHTH"  "GOSS"  "TPVG"  "IVA"   "OBT"   "CDTX"  "MLAC" 
## [2539] "BWAY"  "SFST"  "AVR"   "ALDF"  "MGNX"  "POLE"  "OBIO"  "VCIC"  "ZYBT" 
## [2548] "PDLB"  "MYO"   "FNLC"  "CYBN"  "FOA"   "BTMD"  "GRAF"  "OGI"   "GSRT" 
## [2557] "SLRN"  "INBK"  "ACTU"  "HOND"  "MOLN"  "CZFS"  "BIOA"  "LIEN"  "NVRO" 
## [2566] "LPBB"  "INBX"  "BCML"  "PHLT"  "PBFS"  "KRMD"  "SNFCA" "IKT"   "JMSB" 
## [2575] "INMB"  "LPAA"  "PROC"  "CZNC"  "STXS"  "BRBS"  "LXEO"  "NECB"  "HURA" 
## [2584] "FRST"  "VOR"   "HIT"   "PLX"   "PCB"   "GNFT"  "CBAN"  "VXRT"  "LNKB" 
## [2593] "VERU"  "TDAC"  "LYEL"  "PLBC"  "NKTX"  "ALEC"  "FDBC"  "INFU"  "FLIC" 
## [2602] "GLSI"  "MVBF"  "ADCT"  "GHI"   "FENC"  "HLXB"  "ACRV"  "SBXD"  "JYNT" 
## [2611] "SAMG"  "NBTX"  "BACQ"  "AMWL"  "SBT"   "STRO"  "SCPH"  "CLLS"  "NKTR" 
## [2620] "BMEA"  "RAPT"  "VTYX"  "CSBR"  "ZNTL"  "TLSI"  "KYTX"  "TIL"   "AKYA" 
## [2629] "FATE"  "ADAP"  "IPHA"  "BDSX"  "TELO"  "SGHT"  "MCRB"  "CKPT"  "PBYI" 
## [2638] "ARTV"  "CHRS"  "TCRX"  "RENB"  "SEER"  "BDTX"  "LCTX"  "ZJYL"  "NOTV" 
## [2647] "IFRX"  "CRBU"  "MIST"  "SGMT"  "GNLX"  "TSVT"  "PDEX"  "QIPT"  "SLDB" 
## [2656] "CRBP"  "ANRO"  "NVCT"  "CABA"  "CLYM"  "ZOM"   "SAVA"  "CRDL"  "EUDA" 
## [2665] "ZTEK"  "ENTA"  "ATOS"  "SRTS"  "EDIT"  "ALGS"  "ZURA"  "ECOR"  "XBIT" 
## [2674] "MGX"   "LNSR"  "ELUT"  "EPRX"  "ALLK"  "TELA"  "MLSS"  "CAMP"  "ACHV" 
## [2683] "FONR"  "PETS"  "GUTS"  "IGMS"  "OPRX"  "ABOS"  "COYA"  "MGRM"  "FBRX" 
## [2692] "TNYA"  "PYXS"  "HLVX"  "SY"    "NTRB"  "MDXH"  "CUE"   "DBVT"  "ASMB" 
## [2701] "TARA"  "ADVM"  "MNOV"  "IMUX"  "ORMP"  "MASS"  "ANIX"  "INZY"  "ME"   
## [2710] "VIGL"  "CTOR"  "RVPH"  "ENTX"  "KPTI"  "IMAB"  "DERM"  "TLSA"  "RANI" 
## [2719] "JSPR"  "ADAG"  "RBOT"  "SKYE"  "AGEN"  "ELTX"  "LPTX"  "VTGN"  "CBUS" 
## [2728] "XFOR"  "RGLS"  "XTNT"  "SCLX"  "EDAP"  "ASRT"  "ALXO"  "MRSN"  "THTX" 
## [2737] "AVTX"  "INCR"  "INO"   "EPIX"  "ARMP"  "AADI"  "GALT"  "IPSC"  "ANL"  
## [2746] "ACET"  "GNTA"  "HBIO"  "BLUE"  "OKUR"  "BNR"   "AVTE"  "MODV"  "APLT" 
## [2755] "STIM"  "NSPR"  "VANI"  "OWLT"  "IKNA"  "HYPR"  "TNXP"  "RMTI"  "SLS"  
## [2764] "PIII"  "PMVP"  "XGN"   "CLSD"  "WOK"   "QNCX"  "CTMX"  "BYSI"  "FORA" 
## [2773] "ICCM"  "ANVS"  "CNTX"  "MURA"  "PRE"   "PRLD"  "OPTN"  "VRCA"  "ABP"  
## [2782] "IMRX"  "ICAD"  "PEPG"  "KRON"  "UNCY"  "SGN"   "GBIO"  "RGC"   "IOBT" 
## [2791] "HOWL"  "CCEL"  "NEUE"  "ONCY"  "NVNO"  "FBLG"  "GANX"  "RPID"  "LTRN" 
## [2800] "PDSB"  "VNRX"  "TOI"   "OSTX"  "CTSO"  "IVVD"  "BEAT"  "IMMX"  "CCLD" 
## [2809] "CELU"  "APYX"  "STTK"  "CNTB"  "BOLD"  "MODD"  "OVID"  "RPTX"  "FGEN" 
## [2818] "PTHL"  "ALVR"  "FBIO"  "LSB"   "LUCD"  "OTLK"  "SHLT"  "MAIA"  "RNTX" 
## [2827] "ACHL"  "ATRA"  "SPRO"  "JUNS"  "KZR"   "ICCC"  "OM"    "VTVT"  "SER"  
## [2836] "DYAI"  "QTTB"  "CLGN"  "ITRM"  "TPST"  "CASI"  "ESLA"  "VYNE"  "BRNS" 
## [2845] "SCYX"  "ATNM"  "UBX"   "ANEB"  "DXR"   "GRCE"  "CMMB"  "ELEV"  "DTIL" 
## [2854] "NRXP"  "SCNX"  "SABS"  "ONMD"  "MDAI"  "PASG"  "KALA"  "ANTX"  "RADX" 
## [2863] "INKT"  "NMTC"  "CALC"  "SRZN"  "INTS"  "XCUR"  "RLYB"  "CPIX"  "XLO"  
## [2872] "IRD"   "DHAI"  "ZCMD"  "OKYO"  "NXL"   "CVKD"  "MBOT"  "BIVI"  "OCX"  
## [2881] "ENZ"   "RNXT"  "COCH"  "MOVE"  "POCI"  "SNYR"  "XAIR"  "CGTX"  "IXHL" 
## [2890] "CVM"   "LEXX"  "MLEC"  "LGHL"  "GDTC"  "RAIN"  "AXDX"  "PMN"   "ERNA" 
## [2899] "BCAB"  "CRIS"  "AKTX"  "NSYS"  "CTXR"  "NRXS"  "IBIO"  "DARE"  "CARA" 
## [2908] "IRIX"  "ATHE"  "DRIO"  "GELS"  "PULM"  "FEMY"  "HOOK"  "DRRX"  "ENLV" 
## [2917] "IGC"   "VSEE"  "CODX"  "NRSN"  "RVP"   "LPCN"  "NXGL"  "LVTX"  "LGVN" 
## [2926] "PLUR"  "SLGL"  "COEP"  "EQ"    "CYTH"  "NXTC"  "VVOS"  "MRKR"  "APRE" 
## [2935] "BLRX"  "ALUR"  "LSTA"  "PTN"   "TENX"  "PYPD"  "SNTI"  "COCP"  "ATHA" 
## [2944] "BFRG"  "CCM"   "AMS"   "TRAW"  "IBO"   "INAB"  "CARM"  "GOVX"  "BOLT" 
## [2953] "IPA"   "KAPA"  "CRVO"  "MEIP"  "MIRA"  "NKGN"  "CDIO"  "IINN"  "AFMD" 
## [2962] "BCTX"  "GLYC"  "STRM"  "BSGM"  "NEPH"  "LFWD"  "BTAI"  "SYBX"  "MTVA" 
## [2971] "COSM"  "CING"  "BRTX"  "FLGC"  "NNVC"  "CANF"  "SPRB"  "XTLB"  "HCWB" 
## [2980] "DWTX"  "NERV"  "PHGE"  "ACXP"  "CHRO"  "ADGM"  "OCEA"  "CLDI"  "LUCY" 
## [2989] "EKSO"  "IMNN"  "AIM"   "TRIB"  "SNSE"  "INDP"  "TXMD"  "PRPH"  "BCLI" 
## [2998] "APTO"  "SSKN"  "BNGO"  "IMRN"  "LYRA"  "EVGN"  "RLMD"  "TLPH"  "KPRX" 
## [3007] "PMCB"  "BIAF"  "AIMD"  "DOMH"  "BMRA"  "CJJD"  "CLRB"  "TSBX"  "MNDR" 
## [3016] "AYTU"  "APLM"  "MBIO"  "JAGX"  "CSCI"  "BCDA"  "ENSC"  "POAI"  "PRPO" 
## [3025] "APDN"  "ELAB"  "HOTH"  "CUTR"  "VRAX"  "PPBT"  "EVOK"  "KLTO"  "MYNZ" 
## [3034] "AEMD"  "SILO"  "SYRA"  "NURO"  "BFRI"  "MHUA"  "RDHL"  "KDLY"  "EDSA" 
## [3043] "CHEK"  "MGRX"  "GLTO"  "KZIA"  "OTRK"  "PSTV"  "APM"   "PAVM"  "NEUP" 
## [3052] "EVAX"  "STRR"  "ICU"   "AWH"   "ADXN"  "VIRX"  "CYCN"  "XBIO"  "IMCC" 
## [3061] "SNGX"  "ABVC"  "EYEN"  "QNTM"  "INBS"  "ALZN"  "SSY"   "ADIL"  "XWEL" 
## [3070] "PHIO"  "CNSP"  "PLRZ"  "ALLR"  "CMND"  "GNPX"  "SBFM"  "SYRS"  "AEON" 
## [3079] "PBM"   "TNON"  "SONN"  "GRI"   "CDT"   "LIXT"  "GTBP"  "TTOO"  "ONVO" 
## [3088] "MBRX"  "NUWE"  "FOXO"  "CERO"  "PRTG"  "SNPX"  "HCTI"  "NCNA"  "GLMD" 
## [3097] "SINT"  "CELZ"  "TOVX"  "ONCO"  "CPHI"  "ARTL"  "SNOA"  "OGEN"  "SLRX" 
## [3106] "RNAZ"  "BPTH"  "ATXI"  "NIVF"  "AMIX"  "NBY"   "VINC"  "ZVSA"  "LIPO" 
## [3115] "NAYA"  "STSS"  "XYLO"  "TTNP"  "HSCS"  "PTPI"  "VTAK"  "TNFA"  "YCBD" 
## [3124] "SHPH"  "WORX"  "RSLS"  "ATNF"  "XRTX"  "QLGN"  "ENVB"  "MTNB"  "KTTA" 
## [3133] "THAR"  "INM"   "PTIX"  "NDRA"  "ICCT"  "SCNI"  "TCRT"  "NLSP"  "TCBP" 
## [3142] "BDRX"  "BBLG"  "HSDT"  "VERO"  "LGMK"  "DRMA"  "BJDX"  "ENTO"  "APVO" 
## [3151] "AKAN"  "CYCC"  "TIVC"  "PCSA"  "SPRC"  "ISPC"  "SXTP"  "AZTR"  "PALI" 
## [3160] "PRFX"  "QNRX"  "AVGR"  "ADTX"  "WINT"  "GCTK"  "NAOV"  "SXTC"  "BACK" 
## [3169] "VRPX"  "SCPX"  "HEPA"  "SLXN"  "REVB"  "ACON"

Three example companies are inspected to verify the fetched data.

# VIEW FOR 3 EXAMPLE STOCKS

# Print the first few rows of data for Apple, Microsoft, and Medpace
if("AAPL" %in% names(symbol_data_list)) {
  cat("Data for Apple (AAPL):\n")
  print(head(symbol_data_list[["AAPL"]]))
} else {
  cat("Data for AAPL not fetched or does not exist.\n")
}
## Data for Apple (AAPL):
##            AAPL.Open AAPL.High AAPL.Low AAPL.Close AAPL.Volume AAPL.Adjusted
## 2000-01-03  0.936384  1.004464 0.907924   0.999442   535796800     0.8421506
## 2000-01-04  0.966518  0.987723 0.903460   0.915179   512377600     0.7711489
## 2000-01-05  0.926339  0.987165 0.919643   0.928571   778321600     0.7824333
## 2000-01-06  0.947545  0.955357 0.848214   0.848214   767972800     0.7147228
## 2000-01-07  0.861607  0.901786 0.852679   0.888393   460734400     0.7485784
## 2000-01-10  0.910714  0.912946 0.845982   0.872768   505064000     0.7354123
if("MSFT" %in% names(symbol_data_list)) {
  cat("Data for Microsoft (MSFT):\n")
  print(head(symbol_data_list[["MSFT"]]))
} else {
  cat("Data for MSFT not fetched or does not exist.\n")
}
## Data for Microsoft (MSFT):
##            MSFT.Open MSFT.High MSFT.Low MSFT.Close MSFT.Volume MSFT.Adjusted
## 2000-01-03  58.68750  59.31250 56.00000   58.28125    53228400      35.79231
## 2000-01-04  56.78125  58.56250 56.12500   56.31250    54119000      34.58325
## 2000-01-05  55.56250  58.18750 54.68750   56.90625    64059600      34.94788
## 2000-01-06  56.09375  56.93750 54.18750   55.00000    54976600      33.77718
## 2000-01-07  54.31250  56.12500 53.65625   55.71875    62013600      34.21859
## 2000-01-10  56.71875  56.84375 55.68750   56.12500    44963600      34.46809
if("MEDP" %in% names(symbol_data_list)) {
  cat("Data for Medpace (MEDP):\n")
  print(head(symbol_data_list[["MEDP"]]))
} else {
  cat("Data for MEDP not fetched or does not exist.\n")
}
## Data for Medpace (MEDP):
##            MEDP.Open MEDP.High MEDP.Low MEDP.Close MEDP.Volume MEDP.Adjusted
## 2016-08-11     28.15    28.740   27.100      27.79     5356300         27.79
## 2016-08-12     27.55    28.500   27.130      28.04      472700         28.04
## 2016-08-15     28.29    29.600   27.907      29.22      620600         29.22
## 2016-08-16     29.56    29.980   29.200      29.32      315200         29.32
## 2016-08-17     29.43    29.790   28.120      28.14      507900         28.14
## 2016-08-18     28.23    28.843   27.720      27.77      330700         27.77
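
The three near-identical checks above could be folded into one small function. The following is a minimal sketch, not the code used in this analysis: `preview_symbol` and the toy list are hypothetical names introduced only for illustration.

```r
# Hypothetical helper: print the first rows for one symbol, or a notice if it is missing.
# Returns TRUE/FALSE invisibly so the result can also be checked programmatically.
preview_symbol <- function(data_list, symbol, label = symbol, n = 6) {
  if (symbol %in% names(data_list)) {
    cat("Data for ", label, " (", symbol, "):\n", sep = "")
    print(head(data_list[[symbol]], n))
    invisible(TRUE)
  } else {
    cat("Data for ", symbol, " not fetched or does not exist.\n", sep = "")
    invisible(FALSE)
  }
}

# Toy stand-in for symbol_data_list (assumption: any named list works the same way)
toy_list <- list(AAPL = data.frame(Close = c(0.999442, 0.915179)))

found     <- preview_symbol(toy_list, "AAPL", "Apple")
not_found <- preview_symbol(toy_list, "MEDP", "Medpace")
```

With the real list, the same function could be called once per symbol of interest instead of repeating the if/else block.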

Finding the stocks that failed during the fetching process so they can be excluded

# Get the list of successfully fetched symbols
fetched_symbols <- names(symbol_data_list)

# Find symbols that failed to fetch
failed_symbols <- setdiff(symbols, fetched_symbols)

# Print them
cat("Failed to fetch data for the following symbols:\n")
## Failed to fetch data for the following symbols:
print(failed_symbols)
##  [1] "SQ"     "TAP.A"  "ENLC"   "WSO.B"  "AKO.B"  "UHAL.B" "BIO.B"  "CEIX"  
##  [9] "ASAI"   "NARI"   "MOG.B"  "MOG.A"  "SMAR"   "HCP"    "PFIE"   "AE"    
## [17] "EAST"   "B"      "HTLF"   "OBDE"   "MTTR"   "INFN"   "ZUO"    "CDMO"  
## [25] "PFC"    "FREY"   "CFB"    "SCWX"   "HTBI"   "CRD.A"  "CRD.B"  "RVNC"  
## [33] "HEAR"   "MPLN"   "OMIC"   "MRNS"   "OMGA"   "QTI"    "RHE"
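
Several of the failed symbols (for example TAP.A, MOG.A, CRD.B) are share-class tickers written with a dot. Assuming the prices were requested from Yahoo Finance, which is the default source of `quantmod::getSymbols` (the fetching code is not shown in this part), such tickers are written there with a hyphen instead of a dot, so a retry with a converted symbol may succeed. The sketch below only shows the conversion; `to_hyphenated_symbol` is a hypothetical helper, and symbols without a dot are left unchanged and would need a separate check.

```r
# Hypothetical helper (not part of the original analysis): rewrite share-class
# tickers such as "MOG.A" into the hyphenated form "MOG-A".
to_hyphenated_symbol <- function(symbols) {
  sub("\\.([A-Z])$", "-\\1", symbols)
}

# A few of the failed symbols listed above
failed_examples <- c("SQ", "TAP.A", "MOG.A", "CRD.B")
retry_symbols <- to_hyphenated_symbol(failed_examples)
print(retry_symbols)
```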

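The 39 symbols listed above are absent from symbol_data_list and are therefore excluded from the analysis; all other stocks were fetched and prepared successfully.

————————————————————————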

6. Data Filtering for start date <= 2000

More than 3,000 stocks were fetched, but most of them have no data going back to 2000 and would therefore contain missing values over the study period. The filter below keeps only the companies whose price history starts on the first trading day of 2000 (2000-01-03).

# Filter stocks that start exactly on 2000-01-03
filtered_symbol_data_list <- list()

for (symbol in names(symbol_data_list)) {
  stock_data <- symbol_data_list[[symbol]]
  
  # Get the first available date
  start_date <- index(stock_data)[1]
  
  # Keep only those starting exactly on 2000-01-03
  if (start_date == as.Date("2000-01-03")) {
    filtered_symbol_data_list[[symbol]] <- stock_data
  }
}

# Show how many passed the filter
cat("✅ Number of stocks starting exactly on 2000-01-03:", length(filtered_symbol_data_list), "\n")
## ✅ Number of stocks starting exactly on 2000-01-03: 966
# Show a few for verification
cat("📅 Start dates preview:\n")
## 📅 Start dates preview:
for (symbol in head(names(filtered_symbol_data_list), 10)) {
  cat(symbol, "starts at:", as.character(index(filtered_symbol_data_list[[symbol]])[1]), "\n")
}
## AAPL starts at: 2000-01-03 
## GE starts at: 2000-01-03 
## LLY starts at: 2000-01-03 
## WMT starts at: 2000-01-03 
## XOM starts at: 2000-01-03 
## NVDA starts at: 2000-01-03 
## CAT starts at: 2000-01-03 
## UNH starts at: 2000-01-03 
## JPM starts at: 2000-01-03 
## COST starts at: 2000-01-03
# Replace the original list if everything looks fine
symbol_data_list <- filtered_symbol_data_list

More than 3,000 stocks were fetched; after keeping only those whose data start on 2000-01-03, 966 stocks remain.
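
The start-date filter can also be written without an explicit loop. The sketch below is an illustration on a toy list, assuming each element is an xts object as returned by quantmod; `toy_list`, `target_start`, and `starts_on_target` are hypothetical names. It additionally guards against empty series, for which the first index would be missing.

```r
library(xts)

# Toy stand-in for symbol_data_list: one series starting in 2000, one in 2016
toy_list <- list(
  OLD = xts(c(10, 11, 12),
            order.by = as.Date(c("2000-01-03", "2000-01-04", "2000-01-05"))),
  NEW = xts(c(28, 29),
            order.by = as.Date(c("2016-08-11", "2016-08-12")))
)

target_start <- as.Date("2000-01-03")

# TRUE for every non-empty series whose first observation is on the target date
starts_on_target <- vapply(
  toy_list,
  function(x) NROW(x) > 0 && index(x)[1] == target_start,
  logical(1)
)

filtered_toy <- toy_list[starts_on_target]
print(names(filtered_toy))
```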

————————————————————————

7. Defining the Volatility

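The volatility of each stock is defined as the annualized volatility of its daily log returns over the most recent year: the standard deviation of the daily returns multiplied by the square root of 252 trading days.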

# Load necessary libraries
library(quantmod)
library(dplyr)
library(lubridate)

# Initialize the volatility data frame
volatility_data <- data.frame(Symbol = character(), 
                              Sector = character(), 
                              MarketLeader = logical(), 
                              Volatility = numeric(), 
                              stringsAsFactors = FALSE)

# Loop through each symbol to calculate 1-year annualized volatility
for (symbol in names(symbol_data_list)) {
  # Extract the stock data
  stock_data <- symbol_data_list[[symbol]]
  
  # Calculate daily returns using logarithmic return
  daily_returns <- dailyReturn(Cl(stock_data), type = 'log')
  
  # Filter the last 1 year of returns data
  last_1_year <- daily_returns[index(daily_returns) >= (Sys.Date() %m-% years(1))]
  
  # Compute annualized volatility
  annualized_volatility <- sd(last_1_year, na.rm = TRUE) * sqrt(252)
  
  # Get sector and market leader status from 'stocks_details'
  stock_info <- stocks_details %>% 
    filter(Symbol == symbol) %>% 
    select(Sector, MarketLeader) %>% 
    slice(1)
  
  # Append to the volatility data frame
  volatility_data <- rbind(volatility_data, data.frame(Symbol = symbol,
                                                       Sector = stock_info$Sector,
                                                       MarketLeader = stock_info$MarketLeader,
                                                       Volatility = annualized_volatility))
}

# Calculate the mean annualized volatility
mean_volatility <- mean(volatility_data$Volatility, na.rm = TRUE)
print(paste("Mean Annualized Volatility (Last 1 Year):", round(mean_volatility, 4)))
## [1] "Mean Annualized Volatility (Last 1 Year): 0.4179"
# View the first few rows of the volatility data frame
head(volatility_data)
##   Symbol                 Sector MarketLeader Volatility
## 1   AAPL      Technology Sector         TRUE  0.3214633
## 2     GE     Industrials Sector         TRUE  0.3463606
## 3    LLY      Healthcare Sector         TRUE  0.3769383
## 4    WMT Consumer Staple Sector         TRUE  0.2470392
## 5    XOM          Energy Sector         TRUE  0.2415347
## 6   NVDA      Technology Sector         TRUE  0.5970071
nrow(volatility_data)
## [1] 966
ncol(volatility_data)
## [1] 4
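
The volatility measure used above (standard deviation of daily log returns times sqrt(252)) can be reproduced on toy prices in base R. This is a sketch under stated assumptions: `annualized_vol` and the price vectors are hypothetical, and `diff(log(close))` is used in place of `dailyReturn`, which may treat the first observation differently. Collecting the results with `vapply` also avoids growing a data frame with `rbind` inside a loop.

```r
# Hypothetical helper: annualized volatility from a vector of closing prices
annualized_vol <- function(close, trading_days = 252) {
  log_returns <- diff(log(close))          # daily log returns
  sd(log_returns, na.rm = TRUE) * sqrt(trading_days)
}

# Made-up price paths: one calm, one with large swings
toy_prices <- list(
  CALM = c(100, 100.5, 100.2, 100.8, 100.6, 101.0),
  WILD = c(100, 108, 95, 112, 90, 104)
)

# One row per symbol, built in a single step
toy_volatility <- data.frame(
  Symbol = names(toy_prices),
  Volatility = vapply(toy_prices, annualized_vol, numeric(1)),
  row.names = NULL
)
print(toy_volatility)
```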

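The average annualized volatility of the 966 stocks is 0.4179, i.e. roughly 42% per year. This figure on its own does not show that the stock market is a safe market to invest in; such a claim would require a comparison with other asset classes.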

First classification, based on the thresholds: below 0.2 (Stable), from 0.2 to below 0.5 (Moderate), and 0.5 or above (Volatile).

# Load the dplyr library if not already loaded
library(dplyr)

# Add a new column for volatility classification
volatility_data <- volatility_data %>%
  mutate(VolatilityClass = case_when(
    Volatility < 0.2 ~ "Stable",
    Volatility >= 0.2 & Volatility < 0.5 ~ "Moderate",
    Volatility >= 0.5 ~ "Volatile"
  ))

# View the first few rows of the updated volatility data frame
head(volatility_data)
##   Symbol                 Sector MarketLeader Volatility VolatilityClass
## 1   AAPL      Technology Sector         TRUE  0.3214633        Moderate
## 2     GE     Industrials Sector         TRUE  0.3463606        Moderate
## 3    LLY      Healthcare Sector         TRUE  0.3769383        Moderate
## 4    WMT Consumer Staple Sector         TRUE  0.2470392        Moderate
## 5    XOM          Energy Sector         TRUE  0.2415347        Moderate
## 6   NVDA      Technology Sector         TRUE  0.5970071        Volatile
# Load the dplyr library if not already loaded
library(dplyr)

# Group by Sector, MarketLeader, and VolatilityClass, then count the number of stocks
sector_leader_volatility_counts <- volatility_data %>%
  group_by(Sector, MarketLeader, VolatilityClass) %>%
  summarise(StockCount = n(), .groups = 'drop')  # 'drop' ensures the grouping is dropped after summarise

# Print all rows of the grouped and counted data
print(sector_leader_volatility_counts, n = Inf)  # 'Inf' indicates to print all rows
## # A tibble: 31 × 4
##    Sector                 MarketLeader VolatilityClass StockCount
##    <chr>                  <lgl>        <chr>                <int>
##  1 Consumer Staple Sector FALSE        Moderate                36
##  2 Consumer Staple Sector FALSE        Volatile                12
##  3 Consumer Staple Sector TRUE         Moderate                29
##  4 Consumer Staple Sector TRUE         Stable                   8
##  5 Consumer Staple Sector TRUE         Volatile                 1
##  6 Energy Sector          FALSE        Moderate                30
##  7 Energy Sector          FALSE        Volatile                24
##  8 Energy Sector          TRUE         Moderate                27
##  9 Energy Sector          TRUE         Stable                   2
## 10 Energy Sector          TRUE         Volatile                 1
## 11 Financial Sector       FALSE        Moderate               161
## 12 Financial Sector       FALSE        Stable                   2
## 13 Financial Sector       FALSE        Volatile                 8
## 14 Financial Sector       TRUE         Moderate                66
## 15 Financial Sector       TRUE         Stable                   3
## 16 Healthcare Sector      FALSE        Moderate                59
## 17 Healthcare Sector      FALSE        Stable                   2
## 18 Healthcare Sector      FALSE        Volatile                78
## 19 Healthcare Sector      TRUE         Moderate                25
## 20 Healthcare Sector      TRUE         Stable                   3
## 21 Healthcare Sector      TRUE         Volatile                 2
## 22 Industrials Sector     FALSE        Moderate               106
## 23 Industrials Sector     FALSE        Volatile                39
## 24 Industrials Sector     TRUE         Moderate                73
## 25 Industrials Sector     TRUE         Stable                   2
## 26 Industrials Sector     TRUE         Volatile                 1
## 27 Technology Sector      FALSE        Moderate                67
## 28 Technology Sector      FALSE        Volatile                31
## 29 Technology Sector      TRUE         Moderate                53
## 30 Technology Sector      TRUE         Stable                   2
## 31 Technology Sector      TRUE         Volatile                13
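
The three-way thresholds used above can also be expressed with base R's `cut()`. This is an alternative sketch on made-up values, not the code used for the table; with `right = FALSE` the intervals are closed on the left, so 0.2 falls into "Moderate" and 0.5 into "Volatile", matching the `case_when()` rules.

```r
# Made-up volatility values, including both boundary cases
volatility <- c(0.15, 0.20, 0.49, 0.50, 0.60)

volatility_class <- cut(
  volatility,
  breaks = c(-Inf, 0.2, 0.5, Inf),
  labels = c("Stable", "Moderate", "Volatile"),
  right  = FALSE   # intervals are [lower, upper)
)

print(data.frame(volatility, volatility_class))
```

Unlike `case_when()`, `cut()` returns a factor whose levels keep the order Stable, Moderate, Volatile, which can be convenient for plotting.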

Second classification, which is the one used in the rest of the analysis: stocks with volatility of 0.25 or below are labelled Stable, and stocks above 0.25 are labelled Volatile.

# Reclassify volatility using updated thresholds
volatility_data <- volatility_data %>%
  mutate(VolatilityClass = case_when(
    Volatility <= 0.25 ~ "Stable",
    Volatility > 0.25 ~ "Volatile"
  ))
# Recalculate stock counts based on the new VolatilityClass
sector_leader_volatility_counts <- volatility_data %>%
  group_by(Sector, MarketLeader, VolatilityClass) %>%
  summarise(StockCount = n(), .groups = 'drop')

# Print all rows of the grouped and counted data
print(sector_leader_volatility_counts, n = Inf)
## # A tibble: 24 × 4
##    Sector                 MarketLeader VolatilityClass StockCount
##    <chr>                  <lgl>        <chr>                <int>
##  1 Consumer Staple Sector FALSE        Stable                   3
##  2 Consumer Staple Sector FALSE        Volatile                45
##  3 Consumer Staple Sector TRUE         Stable                  24
##  4 Consumer Staple Sector TRUE         Volatile                14
##  5 Energy Sector          FALSE        Stable                   2
##  6 Energy Sector          FALSE        Volatile                52
##  7 Energy Sector          TRUE         Stable                   7
##  8 Energy Sector          TRUE         Volatile                23
##  9 Financial Sector       FALSE        Stable                   8
## 10 Financial Sector       FALSE        Volatile               163
## 11 Financial Sector       TRUE         Stable                  20
## 12 Financial Sector       TRUE         Volatile                49
## 13 Healthcare Sector      FALSE        Stable                   8
## 14 Healthcare Sector      FALSE        Volatile               131
## 15 Healthcare Sector      TRUE         Stable                   9
## 16 Healthcare Sector      TRUE         Volatile                21
## 17 Industrials Sector     FALSE        Stable                   6
## 18 Industrials Sector     FALSE        Volatile               139
## 19 Industrials Sector     TRUE         Stable                  18
## 20 Industrials Sector     TRUE         Volatile                58
## 21 Technology Sector      FALSE        Stable                   1
## 22 Technology Sector      FALSE        Volatile                97
## 23 Technology Sector      TRUE         Stable                  12
## 24 Technology Sector      TRUE         Volatile                56
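
To read a table like the one above as proportions rather than counts, the share of stable stocks per group can be computed directly. A minimal base-R sketch on made-up data follows (the data frame `toy` and its values are hypothetical); note that a volatility of exactly 0.25 counts as Stable under the rule above.

```r
# Made-up example: two sectors with a handful of stocks each
toy <- data.frame(
  Sector     = c("A", "A", "A", "B", "B"),
  Volatility = c(0.20, 0.25, 0.40, 0.10, 0.26)
)

# Same two-class rule as above: <= 0.25 is Stable, otherwise Volatile
toy$VolatilityClass <- ifelse(toy$Volatility <= 0.25, "Stable", "Volatile")

# Share of stable stocks per sector (mean of a logical vector = proportion TRUE)
stable_share <- tapply(toy$VolatilityClass == "Stable", toy$Sector, mean)
print(stable_share)
```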

—————————————————————————————–

8. EXPLORATORY DATA ANALYSIS

Table 1

This table shows the average volatility of each sector.

# Load necessary library
library(dplyr)

# Calculate average annualized volatility score by sector
average_volatility_by_sector <- volatility_data %>%
  group_by(Sector) %>%
  summarise(AverageVolatility = mean(Volatility, na.rm = TRUE))

# View the result
print(average_volatility_by_sector)
## # A tibble: 6 × 2
##   Sector                 AverageVolatility
##   <chr>                              <dbl>
## 1 Consumer Staple Sector             0.342
## 2 Energy Sector                      0.431
## 3 Financial Sector                   0.331
## 4 Healthcare Sector                  0.584
## 5 Industrials Sector                 0.387
## 6 Technology Sector                  0.449
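
As a consistency check, the sector averages in Table 1 can be combined with the number of stocks per sector (the leader and non-leader counts printed under Plot 3, summed per sector) to recover the overall mean reported earlier. The sketch below uses only figures printed in this document; because the sector averages are rounded to three decimals, the match is approximate.

```r
sector_summary <- data.frame(
  Sector = c("Consumer Staple", "Energy", "Financial",
             "Healthcare", "Industrials", "Technology"),
  AverageVolatility = c(0.342, 0.431, 0.331, 0.584, 0.387, 0.449),  # Table 1
  StockCount        = c(86, 84, 240, 169, 221, 166)                 # leaders + non-leaders
)

# Mean over all stocks = sector means weighted by sector size
weighted_mean   <- weighted.mean(sector_summary$AverageVolatility,
                                 sector_summary$StockCount)
# Simple mean of the six sector averages, for comparison
unweighted_mean <- mean(sector_summary$AverageVolatility)

print(round(c(weighted = weighted_mean, unweighted = unweighted_mean), 3))
```

The weighted mean comes out at about 0.418, in line with the overall mean of 0.4179, while the simple mean of the six sector averages is slightly higher (about 0.421) because the sectors differ in size.
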
Plot 1

This bar plot visualizes Table 1: the average volatility of the companies in each sector.

# Load necessary libraries
library(ggplot2)
library(dplyr)

# Custom fill colors
custom_colors <- c(
  "#B22222",  # Dark Red
  "#006400",  # Dark Green
  "#00008B",  # Dark Blue
  "#FF8C00",  # Orange
  "#FFD700",  # Yellow
  "#FF69B4",  # Pink
  "#008080"   # Teal
)

# Create a bar plot with white background and custom dark colors
ggplot(average_volatility_by_sector, aes(x = reorder(Sector, AverageVolatility), y = AverageVolatility, fill = Sector)) +
  geom_bar(stat = "identity", color = "gray50", show.legend = FALSE) +
  geom_text(aes(label = round(AverageVolatility, 2)), vjust = -0.5, color = "black", size = 4) +
  scale_fill_manual(values = custom_colors) +
  theme_minimal(base_size = 14) +
  theme(
    panel.background = element_rect(fill = "white", color = NA),
    plot.background = element_rect(fill = "white", color = NA),
    axis.text.x = element_text(color = "black"),
    axis.text.y = element_text(color = "black"),
    axis.title.x = element_text(color = "black", size = 14),
    axis.title.y = element_text(color = "black", size = 14),
    plot.title = element_text(color = "black", size = 16, face = "bold", hjust = 0.5),
    panel.grid.major = element_line(color = "gray80", linewidth = 0.5),
    panel.grid.minor = element_line(color = "gray90", linewidth = 0.25),
    axis.ticks = element_line(color = "black"),
    axis.line = element_line(color = "black")
  ) +
  labs(
    title = "Average Volatility by Sector",
    x = "Sector",
    y = "Average Volatility"
  ) +
  coord_flip()

Plot 2

This code chunk again shows the average volatility of each sector, this time also grouped by market leaders and non-market leaders.

# 2. Calculate average volatility score by sector and market leader status
average_volatility_by_sector_leader <- volatility_data %>%
  group_by(Sector, MarketLeader) %>%
  summarise(AverageVolatility = mean(Volatility, na.rm = TRUE), .groups = 'drop')

# View the result
print(average_volatility_by_sector_leader)
## # A tibble: 12 × 3
##    Sector                 MarketLeader AverageVolatility
##    <chr>                  <lgl>                    <dbl>
##  1 Consumer Staple Sector FALSE                    0.410
##  2 Consumer Staple Sector TRUE                     0.257
##  3 Energy Sector          FALSE                    0.495
##  4 Energy Sector          TRUE                     0.317
##  5 Financial Sector       FALSE                    0.350
##  6 Financial Sector       TRUE                     0.283
##  7 Healthcare Sector      FALSE                    0.643
##  8 Healthcare Sector      TRUE                     0.312
##  9 Industrials Sector     FALSE                    0.429
## 10 Industrials Sector     TRUE                     0.306
## 11 Technology Sector      FALSE                    0.491
## 12 Technology Sector      TRUE                     0.388
# Load necessary libraries
library(ggplot2)
library(dplyr)

# Define custom colors
custom_colors <- c(
  "#B22222",  # Dark Red
  "#006400",  # Dark Green
  "#00008B",  # Dark Blue
  "#FF8C00",  # Orange
  "#FFD700",  # Yellow
  "#FF69B4",  # Pink
  "#008080"   # Teal
)

# Plot 2.1: Market Leaders' Average Volatility
market_leader_plot <- average_volatility_by_sector_leader %>%
  filter(MarketLeader == TRUE) %>%
  ggplot(aes(x = Sector, y = AverageVolatility, fill = Sector)) +
  geom_bar(stat = "identity", show.legend = FALSE, color = "gray50") +
  geom_text(aes(label = round(AverageVolatility, 2)), vjust = -0.3, size = 5, fontface = "bold", color = "black") +
  scale_fill_manual(values = custom_colors) +
  theme_minimal(base_size = 15) +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1, size = 12, color = "black"),
    axis.text.y = element_text(color = "black"),
    axis.title = element_text(color = "black"),
    plot.title = element_text(color = "black", size = 16, face = "bold", hjust = 0.5),
    plot.background = element_rect(fill = "white", color = NA),
    panel.background = element_rect(fill = "white", color = NA),
    panel.grid.major = element_line(color = "gray80"),
    panel.grid.minor = element_line(color = "gray90")
  ) +
  labs(
    title = "Market Leaders' Average Volatility",
    x = "Sector",
    y = "Average Volatility"
  )

# Plot 2.2: Non-Market Leaders' Average Volatility
non_market_leader_plot <- average_volatility_by_sector_leader %>%
  filter(MarketLeader == FALSE) %>%
  ggplot(aes(x = Sector, y = AverageVolatility, fill = Sector)) +
  geom_bar(stat = "identity", show.legend = FALSE, color = "gray50") +
  geom_text(aes(label = round(AverageVolatility, 2)), vjust = -0.3, size = 8, fontface = "bold", color = "black") +
  scale_fill_manual(values = custom_colors) +
  theme_minimal(base_size = 15) +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1, size = 14, color = "black"),
    axis.text.y = element_text(color = "black"),
    axis.title = element_text(color = "black"),
    plot.title = element_text(color = "black", size = 16, face = "bold", hjust = 0.5),
    plot.background = element_rect(fill = "white", color = NA),
    panel.background = element_rect(fill = "white", color = NA),
    panel.grid.major = element_line(color = "gray80"),
    panel.grid.minor = element_line(color = "gray90")
  ) +
  labs(
    title = "Non-Market Leaders' Average Volatility",
    x = "Sector",
    y = "Average Volatility"
  )

# Show plots
print(market_leader_plot)

print(non_market_leader_plot)
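
The two plots above differ only in the filtered data, the title, and a few text sizes, so the shared layers could live in one function. The following is a sketch rather than the plotting code used for the thesis figures: `plot_sector_volatility` is a hypothetical helper, `geom_col()` is used as the shorthand for `geom_bar(stat = "identity")`, and the data frame repeats the market-leader averages printed above.

```r
library(ggplot2)

# Hypothetical helper: one bar chart of average volatility per sector
plot_sector_volatility <- function(data, title, label_size = 5) {
  ggplot(data, aes(x = Sector, y = AverageVolatility, fill = Sector)) +
    geom_col(color = "gray50", show.legend = FALSE) +
    geom_text(aes(label = round(AverageVolatility, 2)),
              vjust = -0.3, size = label_size, fontface = "bold") +
    theme_minimal(base_size = 15) +
    theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
    labs(title = title, x = "Sector", y = "Average Volatility")
}

# Market-leader averages as printed in the table above
leaders <- data.frame(
  Sector = c("Consumer Staple Sector", "Energy Sector", "Financial Sector",
             "Healthcare Sector", "Industrials Sector", "Technology Sector"),
  AverageVolatility = c(0.257, 0.317, 0.283, 0.312, 0.306, 0.388)
)

leader_plot <- plot_sector_volatility(leaders, "Market Leaders' Average Volatility")
```

Calling the same function on the non-leader rows with a different title would produce the second panel with identical styling.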

Plot 3

This plot shows the share of market leaders and non-market leaders in each sector; the table below gives the underlying counts.

# Calculate the count of market leaders and non-leaders per sector
market_leader_distribution <- volatility_data %>%
  group_by(Sector, MarketLeader) %>%
  summarise(Count = n(), .groups = 'drop')

# View the result
print(market_leader_distribution)
## # A tibble: 12 × 3
##    Sector                 MarketLeader Count
##    <chr>                  <lgl>        <int>
##  1 Consumer Staple Sector FALSE           48
##  2 Consumer Staple Sector TRUE            38
##  3 Energy Sector          FALSE           54
##  4 Energy Sector          TRUE            30
##  5 Financial Sector       FALSE          171
##  6 Financial Sector       TRUE            69
##  7 Healthcare Sector      FALSE          139
##  8 Healthcare Sector      TRUE            30
##  9 Industrials Sector     FALSE          145
## 10 Industrials Sector     TRUE            76
## 11 Technology Sector      FALSE           98
## 12 Technology Sector      TRUE            68
# Convert MarketLeader to factor explicitly
market_leader_distribution$MarketLeader <- factor(
  market_leader_distribution$MarketLeader,
  levels = c(FALSE, TRUE),
  labels = c("Non-Leader", "Leader")
)

# Plot with white background and correct manual colors
ggplot(market_leader_distribution, aes(x = Sector, y = Count, fill = MarketLeader)) +
  geom_bar(stat = "identity", position = "fill", color = "gray50") +
  scale_fill_manual(
    values = c("Non-Leader" = "#B22222", "Leader" = "#00008B")
  ) +
  scale_y_continuous(labels = scales::percent_format(accuracy = 1)) +
  theme_minimal(base_size = 16) +
  theme(
    plot.background = element_rect(fill = "white", color = NA),
    panel.background = element_rect(fill = "white", color = NA),
    text = element_text(color = "black"),
    axis.title = element_text(color = "black"),
    axis.text = element_text(color = "black"),
    axis.text.x = element_text(angle = 45, hjust = 1),
    legend.title = element_text(color = "black"),
    legend.text = element_text(color = "black")
  ) +
  labs(
    title = "Market Leadership Distribution by Sector",
    x = "Sector",
    y = "Percentage of Stocks",
    fill = "Market Leader"
)
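Because the bars are drawn with position = "fill", the chart displays within-sector proportions rather than raw counts. The following is a minimal sketch of that arithmetic, with the counts re-entered by hand from the printed market_leader_distribution table above. The names leader_counts, leader_share and Share are assumptions introduced for illustration and are not part of the original analysis.

```r
library(dplyr)

# Counts copied by hand from the printed market_leader_distribution table
leader_counts <- data.frame(
  Sector = rep(c("Consumer Staple Sector", "Energy Sector", "Financial Sector",
                 "Healthcare Sector", "Industrials Sector", "Technology Sector"),
               each = 2),
  MarketLeader = rep(c(FALSE, TRUE), times = 6),
  Count = c(48, 38, 54, 30, 171, 69, 139, 30, 145, 76, 98, 68)
)

# Share of each class within its sector (what position = "fill" displays)
leader_share <- leader_counts %>%
  group_by(Sector) %>%
  mutate(Share = Count / sum(Count)) %>%
  ungroup()

print(leader_share)
```

For example, leaders make up 38 of 86 stocks (about 44%) in the Consumer Staple Sector, but only 30 of 169 (about 18%) in the Healthcare Sector.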

Plot 4

Volatile Stocks

This plot shows the proportion of stable companies and risky (volatile) companies in each sector.

# Calculate the count of volatile and stable stocks per sector
volatility_distribution <- volatility_data %>%
  group_by(Sector, VolatilityClass) %>%
  summarise(Count = n(), .groups = 'drop')

# View the result
print(volatility_distribution)
## # A tibble: 12 × 3
##    Sector                 VolatilityClass Count
##    <chr>                  <chr>           <int>
##  1 Consumer Staple Sector Stable             27
##  2 Consumer Staple Sector Volatile           59
##  3 Energy Sector          Stable              9
##  4 Energy Sector          Volatile           75
##  5 Financial Sector       Stable             28
##  6 Financial Sector       Volatile          212
##  7 Healthcare Sector      Stable             17
##  8 Healthcare Sector      Volatile          152
##  9 Industrials Sector     Stable             24
## 10 Industrials Sector     Volatile          197
## 11 Technology Sector      Stable             13
## 12 Technology Sector      Volatile          153
# Convert VolatilityClass to factor with meaningful labels
volatility_distribution$VolatilityClass <- factor(
  volatility_distribution$VolatilityClass,
  levels = c("Stable", "Volatile")  # Adjust these if your data uses TRUE/FALSE or 0/1
)

# Bar plot with white background and correct colors
ggplot(volatility_distribution, aes(x = Sector, y = Count, fill = VolatilityClass)) +
  geom_bar(stat = "identity", position = "fill", color = "gray50") +
  scale_fill_manual(
    values = c("Stable" = "#FFD700", "Volatile" = "#00008B")
  ) +
  scale_y_continuous(labels = scales::percent_format()) +
  theme_minimal(base_size = 16) +
  theme(
    plot.background = element_rect(fill = "white", color = NA),
    panel.background = element_rect(fill = "white", color = NA),
    text = element_text(color = "black"),
    axis.title = element_text(color = "black"),
    axis.text = element_text(color = "black"),
    axis.text.x = element_text(angle = 45, hjust = 1),
    legend.title = element_text(color = "black"),
    legend.text = element_text(color = "black")
  ) +
  labs(
    title = "Volatility Distribution by Sector",
    x = "Sector",
    y = "Percentage",
    fill = "Volatility Class"
)

Table 5

It is important to identify the market giants, the companies that rank at the top by market capitalization, and also to be aware of the most stable companies in each industry. This table shows the top 3 and bottom 3 companies by market capitalization in each of the 6 sectors.

library(dplyr)
# Merge the market capitalization data with the original volatility data
full_data <- left_join(volatility_data, stocks_details, by = "Symbol")

# Remove duplicate columns after the merge and keep only the necessary ones
full_data <- full_data %>%
  select(-c(Sector.x, MarketLeader.x)) %>%  # Remove Sector.x and MarketLeader.x
  rename(Sector = Sector.y, MarketLeader = MarketLeader.y)  # Rename Sector.y and MarketLeader.y

# Now, find the top 3 and bottom 3 Market Capitalizations per sector
max_min_market_cap <- full_data %>%
  group_by(Sector) %>%
  arrange(Sector, desc(`Market Cap Numeric`)) %>%
  slice_head(n = 3) %>%
  bind_rows(
    full_data %>%
      group_by(Sector) %>%
      arrange(Sector, `Market Cap Numeric`) %>%
      slice_head(n = 3)
  ) %>%
  select(Sector, `Company Name`, Symbol, `Market Cap Numeric`) %>%
  arrange(Sector, `Market Cap Numeric`)

# Print the results for market capitalization
print(max_min_market_cap, n = Inf)
## # A tibble: 36 × 4
## # Groups:   Sector [6]
##    Sector                 `Company Name`             Symbol `Market Cap Numeric`
##    <chr>                  <chr>                      <chr>                 <dbl>
##  1 Consumer Staple Sector Rocky Mountain Chocolate … RMCF               15450000
##  2 Consumer Staple Sector Mannatech, Incorporated    MTEX               23090000
##  3 Consumer Staple Sector Natural Alternatives Inte… NAII               25020000
##  4 Consumer Staple Sector The Procter & Gamble Comp… PG             380390000000
##  5 Consumer Staple Sector Costco Wholesale Corporat… COST           408390000000
##  6 Consumer Staple Sector Walmart Inc.               WMT            730400000000
##  7 Energy Sector          Marine Petroleum Trust     MARPS               8180000
##  8 Energy Sector          BP Prudhoe Bay Royalty Tr… BPT                11660000
##  9 Energy Sector          Mesa Royalty Trust         MTR                11720000
## 10 Energy Sector          Shell plc                  SHEL           202800000000
## 11 Energy Sector          Chevron Corporation        CVX            279720000000
## 12 Energy Sector          Exxon Mobil Corporation    XOM            477570000000
## 13 Financial Sector       Greystone Housing Impact … GHI               259460000
## 14 Financial Sector       The First of Long Island … FLIC              260380000
## 15 Financial Sector       Fidelity D & D Bancorp, I… FDBC              265120000
## 16 Financial Sector       Wells Fargo & Company      WFC            241850000000
## 17 Financial Sector       Bank of America Corporati… BAC            355560000000
## 18 Financial Sector       JPMorgan Chase & Co.       JPM            687570000000
## 19 Healthcare Sector      Windtree Therapeutics, In… WINT                1610000
## 20 Healthcare Sector      Titan Pharmaceuticals, In… TTNP                3310000
## 21 Healthcare Sector      Becton, Dickinson and Com… BDX                 3920000
## 22 Healthcare Sector      Novo Nordisk A/S           NVO            389380000000
## 23 Healthcare Sector      UnitedHealth Group Incorp… UNH            490060000000
## 24 Healthcare Sector      Eli Lilly and Company      LLY            707210000000
## 25 Industrials Sector     Tredegar Corporation       TG                254590000
## 26 Industrials Sector     FuelCell Energy, Inc.      FCEL              266660000
## 27 Industrials Sector     Forrester Research, Inc.   FORR              279240000
## 28 Industrials Sector     RTX Corporation            RTX            152670000000
## 29 Industrials Sector     Caterpillar Inc.           CAT            176130000000
## 30 Industrials Sector     General Electric Company   GE             187380000000
## 31 Technology Sector      Immersion Corporation      IMMR              278860000
## 32 Technology Sector      Asure Software, Inc.       ASUR              288770000
## 33 Technology Sector      AudioCodes Ltd.            AUDC              302490000
## 34 Technology Sector      Microsoft Corporation      MSFT          3175510000000
## 35 Technology Sector      NVIDIA Corporation         NVDA          3494480000000
## 36 Technology Sector      Apple Inc.                 AAPL          3698010000000
Table 6

This table shows the 3 most stable and the 3 most volatile companies in each sector.

# Find the top 3 and bottom 3 Volatility scores per sector
max_min_volatility <- full_data %>%
  group_by(Sector) %>%
  arrange(Sector, desc(Volatility)) %>%
  slice_head(n = 3) %>%
  bind_rows(
    full_data %>%
      group_by(Sector) %>%
      arrange(Sector, Volatility) %>%
      slice_head(n = 3)
  ) %>%
  select(Sector, `Company Name`, Symbol, Volatility) %>%
  arrange(Sector, Volatility)

# Print the results for volatility
print(max_min_volatility, n = Inf)
## # A tibble: 36 × 4
## # Groups:   Sector [6]
##    Sector                 `Company Name`                       Symbol Volatility
##    <chr>                  <chr>                                <chr>       <dbl>
##  1 Consumer Staple Sector The Coca-Cola Company                KO          0.165
##  2 Consumer Staple Sector Unilever PLC                         UL          0.187
##  3 Consumer Staple Sector Altria Group, Inc.                   MO          0.190
##  4 Consumer Staple Sector Rocky Mountain Chocolate Factory, I… RMCF        0.684
##  5 Consumer Staple Sector Newell Brands Inc.                   NWL         0.699
##  6 Consumer Staple Sector Mannatech, Incorporated              MTEX        0.732
##  7 Energy Sector          Enbridge Inc.                        ENB         0.172
##  8 Energy Sector          Enterprise Products Partners L.P.    EPD         0.193
##  9 Energy Sector          National Fuel Gas Company            NFG         0.208
## 10 Energy Sector          BP Prudhoe Bay Royalty Trust         BPT         0.879
## 11 Energy Sector          U.S. Energy Corp.                    USEG        0.897
## 12 Energy Sector          Centrus Energy Corp.                 LEU         0.964
## 13 Financial Sector       Enstar Group Limited                 ESGR        0.142
## 14 Financial Sector       Central Securities Corporation       CET         0.157
## 15 Financial Sector       Marsh & McLennan Companies, Inc.     MMC         0.176
## 16 Financial Sector       MBIA Inc.                            MBI         0.617
## 17 Financial Sector       Banco BBVA Argentina S.A.            BBAR        0.668
## 18 Financial Sector       TeraWulf Inc.                        WULF        1.21 
## 19 Healthcare Sector      Amedisys, Inc.                       AMED        0.137
## 20 Healthcare Sector      Cencora, Inc.                        COR         0.188
## 21 Healthcare Sector      Johnson & Johnson                    JNJ         0.188
## 22 Healthcare Sector      Soligenix, Inc.                      SNGX        1.63 
## 23 Healthcare Sector      Windtree Therapeutics, Inc.          WINT        2.00 
## 24 Healthcare Sector      Kazia Therapeutics Limited           KZIA        2.06 
## 25 Industrials Sector     Waste Connections, Inc.              WCN         0.171
## 26 Industrials Sector     Republic Services, Inc.              RSG         0.177
## 27 Industrials Sector     Waste Management, Inc.               WM          0.202
## 28 Industrials Sector     American Superconductor Corporation  AMSC        0.891
## 29 Industrials Sector     Plug Power Inc.                      PLUG        0.956
## 30 Industrials Sector     FuelCell Energy, Inc.                FCEL        1.02 
## 31 Technology Sector      Juniper Networks, Inc.               JNPR        0.147
## 32 Technology Sector      Automatic Data Processing, Inc.      ADP         0.191
## 33 Technology Sector      Amdocs Limited                       DOX         0.204
## 34 Technology Sector      MicroVision, Inc.                    MVIS        0.998
## 35 Technology Sector      Innodata Inc.                        INOD        1.20 
## 36 Technology Sector      Wolfspeed, Inc.                      WOLF        1.51
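Tables 5 and 6 both pick the 3 lowest and 3 highest values per sector by sorting and then taking the first rows. As a hedged alternative, the same idea can be sketched with dplyr's slice_min() and slice_max(). The data frame toy and its symbols and values are invented for illustration only.

```r
library(dplyr)

# Hypothetical toy data; sectors, symbols and values are invented
toy <- data.frame(
  Sector = rep(c("A", "B"), each = 8),
  Symbol = paste0("S", 1:16),
  Volatility = c(0.15, 0.22, 0.31, 0.48, 0.52, 0.67, 0.73, 0.90,
                 0.11, 0.19, 0.27, 0.35, 0.41, 0.58, 0.84, 1.20)
)

# 3 lowest and 3 highest volatility values per sector;
# with_ties = FALSE keeps exactly 3 rows per side even if values tie
top_bottom <- bind_rows(
  toy %>% group_by(Sector) %>% slice_min(Volatility, n = 3, with_ties = FALSE),
  toy %>% group_by(Sector) %>% slice_max(Volatility, n = 3, with_ties = FALSE)
) %>%
  ungroup() %>%
  arrange(Sector, Volatility)

print(top_bottom)
```

With either approach, a sector with fewer than 6 companies would contribute the same company to both the top and the bottom list, so the result may contain duplicate rows in that case.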

————————————————————————————–

9. DATA SELECTION

Simple View of the Full data

head(full_data)
##   Symbol Volatility VolatilityClass No             Company Name
## 1   AAPL  0.3214633        Volatile  1               Apple Inc.
## 2     GE  0.3463606        Volatile  1 General Electric Company
## 3    LLY  0.3769383        Volatile  1    Eli Lilly and Company
## 4    WMT  0.2470392          Stable  1             Walmart Inc.
## 5    XOM  0.2415347          Stable  1  Exxon Mobil Corporation
## 6   NVDA  0.5970071        Volatile  2       NVIDIA Corporation
##                   Sector Market Cap % Change    Volume Revenue
## 1      Technology Sector  3,698.01B  -0.0014  10694412 391.04B
## 2     Industrials Sector    187.38B   0.0039   2428942  69.95B
## 3      Healthcare Sector    707.21B   0.0245   4138602  40.86B
## 4 Consumer Staple Sector    730.40B  -0.0056   2734176 673.82B
## 5          Energy Sector    477.57B  -0.0135  13925409 343.82B
## 6      Technology Sector  3,494.48B  -0.0451 166664997 113.27B
##   Market Cap Numeric MarketLeader
## 1        3.69801e+12         TRUE
## 2        1.87380e+11         TRUE
## 3        7.07210e+11         TRUE
## 4        7.30400e+11         TRUE
## 5        4.77570e+11         TRUE
## 6        3.49448e+12         TRUE

The column names of the full data

colnames(full_data)
##  [1] "Symbol"             "Volatility"         "VolatilityClass"   
##  [4] "No"                 "Company Name"       "Sector"            
##  [7] "Market Cap"         "% Change"           "Volume"            
## [10] "Revenue"            "Market Cap Numeric" "MarketLeader"
Random selection of 3 stocks from each group.

There are 24 groups (6 sectors × 2 market leadership statuses × 2 volatility classes), so 24 × 3 = 72 stocks are expected.

# Set the seed for reproducibility
set.seed(42)

# Load required libraries
library(dplyr)

# Coerce the grouping columns to character so that groups are formed consistently
full_data <- full_data %>%
  mutate(
    MarketLeader = as.character(MarketLeader), # or as.factor if you prefer
    VolatilityClass = as.character(VolatilityClass),
    Sector = as.character(Sector)
  )

# Group by Sector, MarketLeader, VolatilityClass and sample 3 from each group
selected_stocks <- full_data %>%
  group_by(Sector, MarketLeader, VolatilityClass) %>%
  slice_sample(n = 3) %>%
  ungroup()

# View the result
print(selected_stocks, n = Inf)
## # A tibble: 69 × 12
##    Symbol Volatility VolatilityClass    No `Company Name`    Sector `Market Cap`
##    <chr>       <dbl> <chr>           <dbl> <chr>             <chr>  <chr>       
##  1 INGR        0.248 Stable             58 Ingredion Incorp… Consu… 8.74B       
##  2 TR          0.247 Stable             91 Tootsie Roll Ind… Consu… 2.23B       
##  3 FLO         0.208 Stable             73 Flowers Foods, I… Consu… 4.27B       
##  4 STKL        0.512 Volatile          123 SunOpta Inc.      Consu… 879.90M     
##  5 JJSF        0.291 Volatile           82 J&J Snack Foods … Consu… 2.87B       
##  6 WILC        0.396 Volatile          162 G. Willi-Food In… Consu… 226.03M     
##  7 TSN         0.234 Stable             32 Tyson Foods, Inc. Consu… 20.15B      
##  8 CHD         0.206 Stable             30 Church & Dwight … Consu… 25.07B      
##  9 GIS         0.221 Stable             24 General Mills, I… Consu… 34.15B      
## 10 ADM         0.269 Volatile           31 Archer-Daniels-M… Consu… 24.10B      
## 11 EL          0.508 Volatile           29 The Estée Laude…  Consu… 27.15B      
## 12 HSY         0.267 Volatile           25 The Hershey Comp… Consu… 33.98B      
## 13 NFG         0.208 Stable             68 National Fuel Ga… Energ… 5.53B       
## 14 SBR         0.231 Stable            156 Sabine Royalty T… Energ… 960.27M     
## 15 NRT         0.505 Volatile          233 North European O… Energ… 42.84M      
## 16 LEU         0.964 Volatile          141 Centrus Energy C… Energ… 1.31B       
## 17 REPX        0.484 Volatile          165 Riley Exploratio… Energ… 693.68M     
## 18 SHEL        0.232 Stable              3 Shell plc         Energ… 202.80B     
## 19 TTE         0.234 Stable              4 TotalEnergies SE  Energ… 132.32B     
## 20 XOM         0.242 Stable              1 Exxon Mobil Corp… Energ… 477.57B     
## 21 EOG         0.290 Volatile           10 EOG Resources, I… Energ… 73.82B      
## 22 WMB         0.263 Volatile           12 The Williams Com… Energ… 72.62B      
## 23 VLO         0.375 Volatile           31 Valero Energy Co… Energ… 44.20B      
## 24 ESGR        0.142 Stable            209 Enstar Group Lim… Finan… 4.80B       
## 25 WTM         0.235 Stable            208 White Mountains … Finan… 4.88B       
## 26 RLI         0.233 Stable            162 RLI Corp.         Finan… 7.22B       
## 27 BUSE        0.325 Volatile          350 First Busey Corp… Finan… 1.31B       
## 28 CNOB        0.397 Volatile          397 ConnectOne Banco… Finan… 858.98M     
## 29 GL          0.294 Volatile          140 Globe Life Inc.   Finan… 9.39B       
## 30 RY          0.187 Stable             12 Royal Bank of Ca… Finan… 171.61B     
## 31 AFL         0.226 Stable             46 Aflac Incorporat… Finan… 57.31B      
## 32 BK          0.246 Stable             48 The Bank of New … Finan… 56.72B      
## 33 USB         0.294 Volatile           34 U.S. Bancorp      Finan… 76.61B      
## 34 ING         0.280 Volatile           54 ING Groep N.V.    Finan… 49.73B      
## 35 BLK         0.259 Volatile           14 BlackRock, Inc.   Finan… 153.97B     
## 36 DGX         0.219 Stable             67 Quest Diagnostic… Healt… 17.33B      
## 37 UTMD        0.190 Stable            558 Utah Medical Pro… Healt… 213.40M     
## 38 ATR         0.219 Stable             95 AptarGroup, Inc.  Healt… 10.53B      
## 39 ERNA        1.19  Volatile          885 Eterna Therapeut… Healt… 29.08M      
## 40 PETS        0.706 Volatile          671 PetMed Express, … Healt… 101.93M     
## 41 RMD         0.341 Volatile           40 ResMed Inc.       Healt… 36.79B      
## 42 ABT         0.207 Stable              8 Abbott Laborator… Healt… 216.86B     
## 43 MDT         0.216 Stable             20 Medtronic plc     Healt… 115.65B     
## 44 COR         0.188 Stable             31 Cencora, Inc.     Healt… 48.03B      
## 45 NVO         0.430 Volatile            3 Novo Nordisk A/S  Healt… 389.38B     
## 46 BIO         0.384 Volatile          101 Bio-Rad Laborato… Healt… 9.93B       
## 47 RDNT        0.452 Volatile          142 RadNet, Inc.      Healt… 4.74B       
## 48 BBSI        0.224 Stable            335 Barrett Business… Indus… 1.07B       
## 49 CWST        0.249 Stable            162 Casella Waste Sy… Indus… 6.67B       
## 50 MMS         0.248 Stable            197 Maximus, Inc.     Indus… 4.67B       
## 51 AIT         0.352 Volatile          121 Applied Industri… Indus… 9.55B       
## 52 EBF         0.285 Volatile          394 Ennis, Inc.       Indus… 537.24M     
## 53 AAON        0.532 Volatile          118 AAON, Inc.        Indus… 9.86B       
## 54 ITW         0.218 Stable             16 Illinois Tool Wo… Indus… 73.91B      
## 55 RSG         0.177 Stable             26 Republic Service… Indus… 63.50B      
## 56 CP          0.250 Stable             20 Canadian Pacific… Indus… 71.08B      
## 57 BA          0.398 Volatile            7 The Boeing Compa… Indus… 129.56B     
## 58 LUV         0.409 Volatile           65 Southwest Airlin… Indus… 19.68B      
## 59 CHRW        0.287 Volatile           98 C.H. Robinson Wo… Indus… 12.38B      
## 60 DOX         0.204 Stable            157 Amdocs Limited    Techn… 9.52B       
## 61 ADTN        0.589 Volatile          423 ADTRAN Holdings,… Techn… 750.34M     
## 62 POWI        0.442 Volatile          259 Power Integratio… Techn… 3.63B       
## 63 CTLP        0.457 Volatile          435 Cantaloupe, Inc.  Techn… 637.17M     
## 64 ROP         0.215 Stable             51 Roper Technologi… Techn… 54.74B      
## 65 TDY         0.236 Stable             92 Teledyne Micropa… Techn… 21.65B      
## 66 MSI         0.225 Stable             42 Motorola Solutio… Techn… 76.60B      
## 67 GLW         0.340 Volatile           58 Corning Incorpor… Techn… 41.19B      
## 68 MU          0.640 Volatile           27 Micron Technolog… Techn… 116.67B     
## 69 STM         0.520 Volatile           86 STMicroelectroni… Techn… 23.63B      
## # ℹ 5 more variables: `% Change` <chr>, Volume <chr>, Revenue <chr>,
## #   `Market Cap Numeric` <dbl>, MarketLeader <chr>

Although 72 stocks were expected, only 69 were selected. As the output shows, one group (Technology Sector, non-leader, stable) contains only 1 stock and another (Energy Sector, non-leader, stable) contains only 2, so fewer than 3 stocks could be drawn from those two groups.
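The shortfall can be anticipated by counting the stocks available in each group before sampling. The sketch below uses a small invented data frame (toy, with hypothetical symbols and sectors) rather than full_data. It relies on the dplyr behaviour that slice_sample(n = 3) without replacement returns the whole group when the group has fewer than 3 rows.

```r
library(dplyr)

set.seed(42)

# Hypothetical toy data: one group has 5 stocks, one has 2, one has 1
toy <- data.frame(
  Symbol = paste0("S", 1:8),
  Sector = c(rep("A", 5), rep("B", 2), "C"),
  MarketLeader = c(rep("TRUE", 5), rep("FALSE", 2), "FALSE"),
  VolatilityClass = "Stable"
)

# Number of stocks available in each group
group_sizes <- toy %>%
  count(Sector, MarketLeader, VolatilityClass, name = "Available")

# Groups that cannot supply 3 stocks
small_groups <- group_sizes %>% filter(Available < 3)
print(small_groups)

# Stratified sample: at most 3 stocks per group
sampled <- toy %>%
  group_by(Sector, MarketLeader, VolatilityClass) %>%
  slice_sample(n = 3) %>%
  ungroup()

# Expected sample size: each group contributes min(Available, 3) rows
expected_rows <- sum(pmin(group_sizes$Available, 3))
print(c(expected = expected_rows, actual = nrow(sampled)))
```

Applied to the thesis data, the same count would flag the two small groups and explain the total of 72 - 2 - 1 = 69.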

library(dplyr)

# 1. See all unique company names
unique_companies <- unique(selected_stocks$`Company Name`)
print(unique_companies)
##  [1] "Ingredion Incorporated"                 
##  [2] "Tootsie Roll Industries, Inc."          
##  [3] "Flowers Foods, Inc."                    
##  [4] "SunOpta Inc."                           
##  [5] "J&J Snack Foods Corp."                  
##  [6] "G. Willi-Food International Ltd."       
##  [7] "Tyson Foods, Inc."                      
##  [8] "Church & Dwight Co., Inc."              
##  [9] "General Mills, Inc."                    
## [10] "Archer-Daniels-Midland Company"         
## [11] "The Estée Lauder Companies Inc."        
## [12] "The Hershey Company"                    
## [13] "National Fuel Gas Company"              
## [14] "Sabine Royalty Trust"                   
## [15] "North European Oil Royalty Trust"       
## [16] "Centrus Energy Corp."                   
## [17] "Riley Exploration Permian, Inc."        
## [18] "Shell plc"                              
## [19] "TotalEnergies SE"                       
## [20] "Exxon Mobil Corporation"                
## [21] "EOG Resources, Inc."                    
## [22] "The Williams Companies, Inc."           
## [23] "Valero Energy Corporation"              
## [24] "Enstar Group Limited"                   
## [25] "White Mountains Insurance Group, Ltd."  
## [26] "RLI Corp."                              
## [27] "First Busey Corporation"                
## [28] "ConnectOne Bancorp, Inc."               
## [29] "Globe Life Inc."                        
## [30] "Royal Bank of Canada"                   
## [31] "Aflac Incorporated"                     
## [32] "The Bank of New York Mellon Corporation"
## [33] "U.S. Bancorp"                           
## [34] "ING Groep N.V."                         
## [35] "BlackRock, Inc."                        
## [36] "Quest Diagnostics Incorporated"         
## [37] "Utah Medical Products, Inc."            
## [38] "AptarGroup, Inc."                       
## [39] "Eterna Therapeutics Inc."               
## [40] "PetMed Express, Inc."                   
## [41] "ResMed Inc."                            
## [42] "Abbott Laboratories"                    
## [43] "Medtronic plc"                          
## [44] "Cencora, Inc."                          
## [45] "Novo Nordisk A/S"                       
## [46] "Bio-Rad Laboratories, Inc."             
## [47] "RadNet, Inc."                           
## [48] "Barrett Business Services, Inc."        
## [49] "Casella Waste Systems, Inc."            
## [50] "Maximus, Inc."                          
## [51] "Applied Industrial Technologies, Inc."  
## [52] "Ennis, Inc."                            
## [53] "AAON, Inc."                             
## [54] "Illinois Tool Works Inc."               
## [55] "Republic Services, Inc."                
## [56] "Canadian Pacific Kansas City Limited"   
## [57] "The Boeing Company"                     
## [58] "Southwest Airlines Co."                 
## [59] "C.H. Robinson Worldwide, Inc."          
## [60] "Amdocs Limited"                         
## [61] "ADTRAN Holdings, Inc."                  
## [62] "Power Integrations, Inc."               
## [63] "Cantaloupe, Inc."                       
## [64] "Roper Technologies, Inc."               
## [65] "Teledyne Micropac, Inc"                 
## [66] "Motorola Solutions, Inc."               
## [67] "Corning Incorporated"                   
## [68] "Micron Technology, Inc."                
## [69] "STMicroelectronics N.V."
# 2. Create the summary table
company_table <- selected_stocks %>%
  select(Sector, MarketLeader, VolatilityClass, `Company Name`, Symbol) %>%
  mutate(
    `Market leadership status` = ifelse(MarketLeader == "TRUE", "Leader", "Non-Leader"),  # MarketLeader was converted to character above
    `Volatility status` = VolatilityClass
  ) %>%
  select(Sector, `Market leadership status`, `Volatility status`, `Company Name`, Symbol) %>%
  distinct()  # remove duplicates if any

# View the summary table
print(company_table, n = Inf)
## # A tibble: 69 × 5
##    Sector       Market leadership st…¹ `Volatility status` `Company Name` Symbol
##    <chr>        <chr>                  <chr>               <chr>          <chr> 
##  1 Consumer St… Non-Leader             Stable              Ingredion Inc… INGR  
##  2 Consumer St… Non-Leader             Stable              Tootsie Roll … TR    
##  3 Consumer St… Non-Leader             Stable              Flowers Foods… FLO   
##  4 Consumer St… Non-Leader             Volatile            SunOpta Inc.   STKL  
##  5 Consumer St… Non-Leader             Volatile            J&J Snack Foo… JJSF  
##  6 Consumer St… Non-Leader             Volatile            G. Willi-Food… WILC  
##  7 Consumer St… Leader                 Stable              Tyson Foods, … TSN   
##  8 Consumer St… Leader                 Stable              Church & Dwig… CHD   
##  9 Consumer St… Leader                 Stable              General Mills… GIS   
## 10 Consumer St… Leader                 Volatile            Archer-Daniel… ADM   
## 11 Consumer St… Leader                 Volatile            The Estée La…  EL    
## 12 Consumer St… Leader                 Volatile            The Hershey C… HSY   
## 13 Energy Sect… Non-Leader             Stable              National Fuel… NFG   
## 14 Energy Sect… Non-Leader             Stable              Sabine Royalt… SBR   
## 15 Energy Sect… Non-Leader             Volatile            North Europea… NRT   
## 16 Energy Sect… Non-Leader             Volatile            Centrus Energ… LEU   
## 17 Energy Sect… Non-Leader             Volatile            Riley Explora… REPX  
## 18 Energy Sect… Leader                 Stable              Shell plc      SHEL  
## 19 Energy Sect… Leader                 Stable              TotalEnergies… TTE   
## 20 Energy Sect… Leader                 Stable              Exxon Mobil C… XOM   
## 21 Energy Sect… Leader                 Volatile            EOG Resources… EOG   
## 22 Energy Sect… Leader                 Volatile            The Williams … WMB   
## 23 Energy Sect… Leader                 Volatile            Valero Energy… VLO   
## 24 Financial S… Non-Leader             Stable              Enstar Group … ESGR  
## 25 Financial S… Non-Leader             Stable              White Mountai… WTM   
## 26 Financial S… Non-Leader             Stable              RLI Corp.      RLI   
## 27 Financial S… Non-Leader             Volatile            First Busey C… BUSE  
## 28 Financial S… Non-Leader             Volatile            ConnectOne Ba… CNOB  
## 29 Financial S… Non-Leader             Volatile            Globe Life In… GL    
## 30 Financial S… Leader                 Stable              Royal Bank of… RY    
## 31 Financial S… Leader                 Stable              Aflac Incorpo… AFL   
## 32 Financial S… Leader                 Stable              The Bank of N… BK    
## 33 Financial S… Leader                 Volatile            U.S. Bancorp   USB   
## 34 Financial S… Leader                 Volatile            ING Groep N.V. ING   
## 35 Financial S… Leader                 Volatile            BlackRock, In… BLK   
## 36 Healthcare … Non-Leader             Stable              Quest Diagnos… DGX   
## 37 Healthcare … Non-Leader             Stable              Utah Medical … UTMD  
## 38 Healthcare … Non-Leader             Stable              AptarGroup, I… ATR   
## 39 Healthcare … Non-Leader             Volatile            Eterna Therap… ERNA  
## 40 Healthcare … Non-Leader             Volatile            PetMed Expres… PETS  
## 41 Healthcare … Non-Leader             Volatile            ResMed Inc.    RMD   
## 42 Healthcare … Leader                 Stable              Abbott Labora… ABT   
## 43 Healthcare … Leader                 Stable              Medtronic plc  MDT   
## 44 Healthcare … Leader                 Stable              Cencora, Inc.  COR   
## 45 Healthcare … Leader                 Volatile            Novo Nordisk … NVO   
## 46 Healthcare … Leader                 Volatile            Bio-Rad Labor… BIO   
## 47 Healthcare … Leader                 Volatile            RadNet, Inc.   RDNT  
## 48 Industrials… Non-Leader             Stable              Barrett Busin… BBSI  
## 49 Industrials… Non-Leader             Stable              Casella Waste… CWST  
## 50 Industrials… Non-Leader             Stable              Maximus, Inc.  MMS   
## 51 Industrials… Non-Leader             Volatile            Applied Indus… AIT   
## 52 Industrials… Non-Leader             Volatile            Ennis, Inc.    EBF   
## 53 Industrials… Non-Leader             Volatile            AAON, Inc.     AAON  
## 54 Industrials… Leader                 Stable              Illinois Tool… ITW   
## 55 Industrials… Leader                 Stable              Republic Serv… RSG   
## 56 Industrials… Leader                 Stable              Canadian Paci… CP    
## 57 Industrials… Leader                 Volatile            The Boeing Co… BA    
## 58 Industrials… Leader                 Volatile            Southwest Air… LUV   
## 59 Industrials… Leader                 Volatile            C.H. Robinson… CHRW  
## 60 Technology … Non-Leader             Stable              Amdocs Limited DOX   
## 61 Technology … Non-Leader             Volatile            ADTRAN Holdin… ADTN  
## 62 Technology … Non-Leader             Volatile            Power Integra… POWI  
## 63 Technology … Non-Leader             Volatile            Cantaloupe, I… CTLP  
## 64 Technology … Leader                 Stable              Roper Technol… ROP   
## 65 Technology … Leader                 Stable              Teledyne Micr… TDY   
## 66 Technology … Leader                 Stable              Motorola Solu… MSI   
## 67 Technology … Leader                 Volatile            Corning Incor… GLW   
## 68 Technology … Leader                 Volatile            Micron Techno… MU    
## 69 Technology … Leader                 Volatile            STMicroelectr… STM   
## # ℹ abbreviated name: ¹​`Market leadership status`

Although 72 stocks were expected, the grouping yields only 69, because two groups contain fewer than 3 stocks; the 3 missing stocks are therefore left out. The remaining parts of the analysis (data preparation for modeling, modeling, model evaluation, and results) are conducted in Python.