Overview

I acquired, transformed, and analyzed historical stock price data and reported earnings per share (EPS) for Apple (AAPL), AppLovin (APP), and Nvidia (NVDA). I then performed a hypothesis test on the average percent price change per quarter.

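Hypothesis: The quarter following a reported EPS surprise above 30% shows greater price appreciation than a quarter following an EPS surprise below 30%.

\[ H_0: \mu_{30} = \mu \]

\[ H_a: \mu_{30} > \mu \]

Here \(\mu_{30}\) is the mean quarterly percent price change for the >30% EPS surprise group, and \(\mu\) is the overall mean quarterly percent price change that serves as the baseline.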

Project Motivation

I wanted to deepen my understanding of stock price action to better inform my own investment choices, and to practice my data science skills on a data set I find challenging.

Data Acquisition - Price History

Stock price data is available for download at nasdaq.com.

# Reading in price history raw data csv from github
appl_ph <- read.csv("https://raw.githubusercontent.com/Chung-Brandon/607/refs/heads/main/AAPL_ph.csv")

Data Acquisition - Reported Earnings per Share

EPS history is scraped from alphaquery.com with rvest

library(rvest)

# Download and parse the earnings-history page
url <- "https://www.alphaquery.com/stock/NVDA/earnings-history"
webpage <- read_html(url)

# Extract every table on the page and keep the first one
earnings_table <- html_nodes(webpage, 'table')
nvda.eps <- html_table(earnings_table)[[1]]
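The table-extraction step can be tried offline before pointing it at a live page. The sketch below is illustrative only: the HTML is hand-written, the column names and values are made up, and `page` and `demo_eps` are hypothetical names. It uses `minimal_html()` and `html_elements()` from rvest 1.0 or later.

```r
library(rvest)

# Hypothetical, hand-written HTML standing in for a scraped earnings page
page <- minimal_html('
  <table>
    <tr><th>Announcement Date</th><th>Estimated EPS</th><th>Actual EPS</th></tr>
    <tr><td>2024-01-15</td><td>$1.00</td><td>$1.40</td></tr>
    <tr><td>2024-04-15</td><td>$1.10</td><td>$1.05</td></tr>
  </table>')

tables   <- html_elements(page, "table")  # every <table> node on the page
demo_eps <- html_table(tables)[[1]]       # first table, returned as a tibble
demo_eps
```

The EPS columns come back as character strings because of the dollar signs, which is why a numeric conversion step is needed later.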

Data Transformation - Clean and Mutate PH

library(dplyr)

# Add the company ticker as the first column
# (the same step applies to appl_ph and nvda_ph)
app_ph <- app_ph %>%
  mutate(company = "app") %>%
  relocate(company, .before = 1)

# Bind price history tables together
combined_ph <- rbind(app_ph, appl_ph, nvda_ph)

# Keep the first three columns (company, date, close) and rename
combined_ph <- combined_ph[,(1:3)] %>%
  rename(close = Close.Last,
         date = Date)
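Tagging each table with its ticker is repeated once per company, so it can be wrapped in a small function. This is only a sketch: `add_ticker()` is a hypothetical helper, and the toy tables with their `Volume` column are made up, assuming the raw files share the `Date` and `Close.Last` column names used above.

```r
library(dplyr)

# Hypothetical helper: tag a price-history table with its ticker
add_ticker <- function(df, ticker) {
  df %>%
    mutate(company = ticker) %>%
    relocate(company, .before = 1)
}

# Toy stand-ins for the downloaded tables (made-up values)
aaa_raw <- data.frame(Date = c("01/02/2024", "01/03/2024"),
                      Close.Last = c("$10.00", "$10.50"),
                      Volume = c(100, 120))
bbb_raw <- data.frame(Date = "01/02/2024",
                      Close.Last = "$20.00",
                      Volume = 300)

# Bind the tagged tables, then keep and rename the needed columns by name
demo_ph <- bind_rows(add_ticker(aaa_raw, "aaa"), add_ticker(bbb_raw, "bbb")) %>%
  select(company, date = Date, close = Close.Last)
demo_ph
```

Selecting columns by name rather than by position keeps the step working if the column order of a source file changes.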

Data Transformation - Clean and Mutate EPS

# Converting EPS data into numeric data for calculations
# (the pattern keeps digits, the decimal point, and the minus sign so that
# negative EPS values are not turned positive)
combined_eps <- combined_eps %>%
  mutate(estimated_eps = as.numeric(gsub("[^0-9.-]", "", estimated_eps)),
         actual_eps = as.numeric(gsub("[^0-9.-]", "", actual_eps)))

# Mutating in EPS surprise %
combined_eps <- combined_eps %>%
  mutate(eps.surprise.p = (((actual_eps - estimated_eps) / estimated_eps) * 100))

# Removing unnecessary columns
combined_eps <- combined_eps[,-c(3,4,5)]
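A worked example makes the EPS surprise formula concrete. The values below are illustrative and not taken from the project data. `to_number()` is a hypothetical helper, and its pattern keeps a minus sign, which is an assumption about how negative EPS values are written in the source.

```r
# Illustrative values only, not taken from the project data
demo <- data.frame(
  estimated_eps = c("$1.00", "$0.50", "$2.00"),
  actual_eps    = c("$1.40", "-$0.25", "$2.00")
)

# Hypothetical helper: drop everything except digits, decimal point, and minus
to_number <- function(x) as.numeric(gsub("[^0-9.-]", "", x))
demo$estimated_eps <- to_number(demo$estimated_eps)
demo$actual_eps    <- to_number(demo$actual_eps)

# EPS surprise as a percentage of the estimate
demo$eps.surprise.p <-
  (demo$actual_eps - demo$estimated_eps) / demo$estimated_eps * 100
demo$eps.surprise.p  # approximately 40, -150, 0
```

Only the first row would count as a surprise above 30% under the hypothesis.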

Data Transformation - Merging and Forming Tidy Table

library(lubridate)

# Standardizing date columns in both combined data frames
combined_eps <- combined_eps %>%
  rename(date = announcement_date)

combined_eps$date <- ymd(combined_eps$date)
combined_ph$date <- mdy(combined_ph$date)

# Using inner_join to keep only the observations with dates of EPS announcement
tidy.df <- combined_ph %>%
  inner_join(combined_eps, by = c("company", "date"))
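The effect of the date parsing and the inner join is easiest to see on a tiny example. The two tables below are made up, and `demo_ph`, `demo_eps`, and `demo_tidy` are hypothetical names. Only price rows whose date matches an announcement date survive the join.

```r
library(dplyr)
library(lubridate)

# Toy tables; the values are made up for illustration
demo_ph <- data.frame(
  company = c("aaa", "aaa", "aaa"),
  date    = c("02/21/2024", "02/22/2024", "05/22/2024"),
  close   = c("$100.00", "$101.00", "$120.00")
)
demo_eps <- data.frame(
  company        = c("aaa", "aaa"),
  date           = c("2024-02-21", "2024-05-22"),
  eps.surprise.p = c(35, 10)
)

demo_ph$date  <- mdy(demo_ph$date)   # month/day/year strings -> Date
demo_eps$date <- ymd(demo_eps$date)  # year-month-day strings -> Date

# Keep only the price rows that fall on an announcement date
demo_tidy <- inner_join(demo_ph, demo_eps, by = c("company", "date"))
demo_tidy
```

The 02/22/2024 price row is dropped because no announcement falls on that date.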

Tidy Data

# Converting close variable to numeric data
tidy.df <- tidy.df %>%
  mutate(close = as.numeric(gsub("[^0-9.]", "", close)))

# Calculating the percentage change from EPS announcement to next EPS announcement
tidy.df <- tidy.df %>%
  group_by(company) %>%
  arrange(company, date) %>%
  mutate( percentage_change = (lead(close) - close) / close * 100 )
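The `lead()` calculation pairs each announcement-day close with the next one for the same company. The sketch below uses made-up prices and hypothetical tickers to show that the last announcement for each company has no following close and therefore gets `NA`.

```r
library(dplyr)

# Made-up closes on announcement dates for two hypothetical tickers
demo <- data.frame(
  company = c("aaa", "aaa", "aaa", "bbb", "bbb"),
  date    = as.Date(c("2024-01-01", "2024-04-01", "2024-07-01",
                      "2024-01-01", "2024-04-01")),
  close   = c(100, 110, 99, 50, 75)
)

demo <- demo %>%
  arrange(company, date) %>%
  group_by(company) %>%
  mutate(percentage_change = (lead(close) - close) / close * 100) %>%
  ungroup()

demo$percentage_change  # 10, -10, NA, 50, NA
```

Grouping by company before calling `lead()` prevents the last close of one company from being compared with the first close of the next.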

Tidy Data

The tidy table has 95 observations of 5 variables, split across the three companies as follows:

## # A tibble: 3 × 2
##   company count
##   <chr>   <int>
## 1 app        15
## 2 appl       40
## 3 nvda       40

Analysis

\(H_0\): The percentage change for the period after a >30% EPS surprise is the same as for a <30% EPS surprise.

\(H_a\): The percentage change for the period after a >30% EPS surprise is greater than for a <30% EPS surprise.

\(\mu\) (mean percentage change, all observations): 14.00

\(\mu_{30}\) (mean percentage change, eps.30 "Yes" group): 51.467
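One way to test these hypotheses directly is a one-sided two-sample t-test comparing the two groups. The sketch below is not the method used in this analysis, which compares a group mean against a confidence interval. It runs on synthetic data generated only to show the call, and the vector names are hypothetical.

```r
set.seed(607)  # synthetic data, generated only to show the call

# Hypothetical quarterly percent changes for the two groups
surprise_over_30  <- rnorm(20, mean = 15, sd = 20)
surprise_under_30 <- rnorm(60, mean = 5,  sd = 20)

# One-sided Welch two-sample t-test: is the first group's mean greater?
res <- t.test(surprise_over_30, surprise_under_30,
              alternative = "greater", conf.level = 0.95)

res$p.value  # reject H_0 at the 5% level when this is below 0.05
```

`t.test()` uses the Welch variant by default, which does not assume that the two groups have equal variance.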

Analysis

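# Flag each observation as "Yes"/"No" and summarise the mean change per group.
# Note: the condition below tests percentage_change; grouping by the EPS
# surprise itself would test eps.surprise.p > 30 instead.
tidy.df <- tidy.df %>%
  mutate(eps.30 = ifelse(percentage_change > 30, "Yes", "No"))
tidy.df %>%
  group_by(eps.30) %>%
  summarise(mean_pchange = mean(percentage_change),
            count = n())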

## # A tibble: 3 × 3
##   eps.30 mean_pchange count
##   <chr>         <dbl> <int>
## 1 No             2.22    70
## 2 Yes           51.5     22
## 3 <NA>          NA        3
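Since the hypothesis is stated in terms of the EPS surprise, the same summary can also be grouped on `eps.surprise.p`. The sketch below uses made-up rows with the column names from the tidy table. Because rows with a missing `percentage_change` then fall inside a group, `na.rm = TRUE` is needed.

```r
library(dplyr)

# Made-up rows using the column names from the tidy table
demo <- data.frame(
  eps.surprise.p    = c(45, 10, 35, -5, 20, 60),
  percentage_change = c(12, 3, NA, -4, 8, 20)
)

demo_summary <- demo %>%
  mutate(eps.30 = ifelse(eps.surprise.p > 30, "Yes", "No")) %>%
  group_by(eps.30) %>%
  summarise(mean_pchange = mean(percentage_change, na.rm = TRUE),
            count = n())
demo_summary
```

In this toy data the "Yes" group averages 16 over its two non-missing rows, and each group holds three rows.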

Analysis

mean_p.change <- mean(tidy.df$percentage_change, na.rm = TRUE)
mean_p.change

## [1] 13.99818

Analysis

Calculating a 95% confidence interval for the overall mean percentage change, as a baseline for the eps.30 “Yes” group mean

# Creating a CI with the t.test function (one-sample, all observations)

t_test_result <- t.test(tidy.df$percentage_change, conf.level = 0.95)
conf_interval <- t_test_result$conf.int
cat("95% Confidence Interval: [", conf_interval[1], ", ", conf_interval[2], "]\n")

## 95% Confidence Interval: [ 7.930708 ,  20.06565 ]
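The interval from `t.test()` is the sample mean plus or minus a t critical value times the standard error. The sketch below reproduces it by hand on a made-up sample; `x`, `manual_ci`, and `auto_ci` are hypothetical names.

```r
# Made-up sample of percent changes
x <- c(4.2, -3.1, 12.5, 7.8, 0.4, 15.0, -6.3, 9.9)

n      <- length(x)
se     <- sd(x) / sqrt(n)        # standard error of the mean
t_crit <- qt(0.975, df = n - 1)  # two-sided 95% critical value

manual_ci <- mean(x) + c(-1, 1) * t_crit * se

# Same interval as reported by t.test()
auto_ci <- as.numeric(t.test(x, conf.level = 0.95)$conf.int)
all.equal(manual_ci, auto_ci)  # TRUE
```

The interval describes uncertainty about the mean of the sample passed in, so it is centered on that sample's mean.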

Analysis visualization

Conclusion

Because the >30% group (eps.30 “Yes”) has a mean percent change of about 51.5, well above the 95% confidence interval for the overall mean (7.93 to 20.07), we reject the null hypothesis at the 5% significance level. We conclude that the quarter following an EPS surprise of 30% or greater is likely to show greater percent stock appreciation.

607 Final Project: Stock Price Movement vs Earnings
