- In this presentation, we analyze the relationship between trading volume and stock price using Simple Linear Regression.
- Goal: Predict the stock's adjusted closing price from its trading volume.
- Company stock being analyzed: Apple (AAPL).
2025-02-06
\[ y = \beta_0 + \beta_1 x + \epsilon \]
where:
- \(y\) = Adjusted (stock closing price)
- \(x\) = Volume (trading volume)
| Date | Open | High | Low | Close | Volume | Adjusted |
|---|---|---|---|---|---|---|
| 2020-01-02 | 74.0600 | 75.1500 | 73.7975 | 75.0875 | 135480400 | 72.79602 |
| 2020-01-03 | 74.2875 | 75.1450 | 74.1250 | 74.3575 | 146322800 | 72.08831 |
| 2020-01-06 | 73.4475 | 74.9900 | 73.1875 | 74.9500 | 118387200 | 72.66270 |
| 2020-01-07 | 74.9600 | 75.2250 | 74.3700 | 74.5975 | 108872000 | 72.32098 |
| 2020-01-08 | 74.2900 | 76.1100 | 74.2900 | 75.7975 | 132079200 | 73.48435 |
| 2020-01-09 | 76.8100 | 77.6075 | 76.5500 | 77.4075 | 170108400 | 75.04521 |
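As a minimal sketch of how the regression model above could be fitted in R, the snippet below runs `lm()` on only the six rows shown in the table. The name `df_head` is an assumption introduced for illustration, and six observations are far too few for inference, so the output will not match a fit on the full data set.

```r
# Six AAPL rows copied from the table above (Date, Volume, Adjusted only).
df_head <- data.frame(
  Date = as.Date(c("2020-01-02", "2020-01-03", "2020-01-06",
                   "2020-01-07", "2020-01-08", "2020-01-09")),
  Volume = c(135480400, 146322800, 118387200,
             108872000, 132079200, 170108400),
  Adjusted = c(72.79602, 72.08831, 72.66270,
               72.32098, 73.48435, 75.04521)
)

# Fit y = beta_0 + beta_1 * x + epsilon with y = Adjusted and x = Volume.
fit_head <- lm(Adjusted ~ Volume, data = df_head)

# Coefficient table: estimate, standard error, t value, and p value
# for the intercept and for Volume.
coef_table <- summary(fit_head)$coefficients
print(coef_table)
```

Swapping `df_head` for the full data frame would give the same kind of coefficient table for the whole sample.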
Below is the code chunk used to produce the scatter plot on the previous slide:
ggplot(df, aes(x = Volume, y = Adjusted)) + geom_point(color = "blue", alpha = 0.6) + geom_smooth(method = "lm", color = "red") + ggtitle("Trading Volume vs. Adjusted Closing Price")
Where df is:
df <- getSymbols("AAPL", src = "yahoo", from = "2020-01-01", to = "2024-01-01", auto.assign = FALSE)
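`getSymbols()` returns an xts object whose columns carry the ticker as a prefix (`AAPL.Open` through `AAPL.Adjusted`), while the plot code and the data table use plain names such as `Volume` and `Adjusted`. The presentation does not show how that conversion was done. The following is one possible sketch, using a hypothetical helper `strip_symbol_prefix()` and a two-row stand-in data frame instead of a live download.

```r
# Hypothetical helper (not from the presentation): drop a "<SYMBOL>."
# prefix from any column name that has it, leaving other names untouched.
strip_symbol_prefix <- function(data, symbol) {
  prefix <- paste0(symbol, ".")
  has_prefix <- startsWith(names(data), prefix)
  names(data)[has_prefix] <- substring(names(data)[has_prefix],
                                       nchar(prefix) + 1)
  data
}

# Stand-in for the downloaded data after turning the xts object into a
# data frame, for example (assumption) with:
#   raw <- data.frame(Date = zoo::index(x), zoo::coredata(x))
raw <- data.frame(
  Date = as.Date(c("2020-01-02", "2020-01-03")),
  AAPL.Open = c(74.0600, 74.2875),
  AAPL.High = c(75.1500, 75.1450),
  AAPL.Low = c(73.7975, 74.1250),
  AAPL.Close = c(75.0875, 74.3575),
  AAPL.Volume = c(135480400, 146322800),
  AAPL.Adjusted = c(72.79602, 72.08831)
)

df_small <- strip_symbol_prefix(raw, "AAPL")
print(names(df_small))
```

Matching the prefix with `startsWith()` instead of a regular expression avoids surprises with tickers that contain a dot.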
| | Estimate | Std. Error | t value | p value |
|---|---|---|---|---|
| (Intercept) | 178.2571868 | 1.673066 | 106.54523 | 0 |
| Volume | -0.0000004 | 0.000000 | -26.96953 | 0 |
What this means
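One hedged way to read the coefficient table is to plug the estimates into the fitted line. The sketch below does that arithmetic in R. The function `predict_price()` and the 100-million-share input are illustrative assumptions, and because the table rounds the slope to a single significant digit, the results are rough.

```r
# Estimates copied from the coefficient table (the slope is heavily rounded).
intercept <- 178.2571868
slope <- -0.0000004

# Fitted line: predicted adjusted close for a given daily trading volume.
predict_price <- function(volume) {
  intercept + slope * volume
}

# Change in predicted price per additional one million shares traded.
change_per_million <- slope * 1e6
print(change_per_million)   # about -0.4

# Predicted price on a day with 100 million shares traded.
price_100m <- predict_price(100e6)
print(price_100m)           # about 138.26
```

Under these rounded estimates, each additional million shares traded goes with a predicted price roughly 0.40 lower, and a zero-volume day would be predicted at the intercept.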
Another method that could be used to analyze the relationship between trading volume and stock price is the Pearson correlation coefficient, which measures the strength and direction of the linear relationship between two variables:
\[ r = \frac{ \sum (x_i - \bar{x}) (y_i - \bar{y}) }{ \sqrt{ \sum (x_i - \bar{x})^2 } \sqrt{ \sum (y_i - \bar{y})^2 } } \]
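where:
- \(x_i\) and \(y_i\) are the observed trading volume and adjusted closing price on day \(i\)
- \(\bar{x}\) and \(\bar{y}\) are their sample means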
This method provides insight into how strongly trading volume and price are correlated, rather than simply modeling price as a function of volume.
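To make the formula concrete, the sketch below computes \(r\) term by term and compares it with R's built-in `cor()`. It uses only the six rows from the data table, so the value is illustrative and not the correlation for the full sample. The function name `pearson_r()` is an assumption introduced here.

```r
# Volume and Adjusted values for the six rows shown in the data table.
volume <- c(135480400, 146322800, 118387200,
            108872000, 132079200, 170108400)
adjusted <- c(72.79602, 72.08831, 72.66270,
              72.32098, 73.48435, 75.04521)

# Pearson correlation written out exactly as in the formula above.
pearson_r <- function(x, y) {
  dx <- x - mean(x)   # deviations of x from its mean
  dy <- y - mean(y)   # deviations of y from its mean
  sum(dx * dy) / (sqrt(sum(dx^2)) * sqrt(sum(dy^2)))
}

r_manual <- pearson_r(volume, adjusted)
r_builtin <- cor(volume, adjusted, method = "pearson")

# The two values should agree up to floating-point error.
print(c(manual = r_manual, builtin = r_builtin))
```

In a simple linear regression with one predictor, the square of this coefficient equals the model's R-squared, and its sign matches the sign of the slope.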