(1) Overview

Based on my personal reading about investing, I decided to test the notion that stock price returns are normally distributed. To do so, I obtained monthly closing prices from Yahoo Finance for $SPY, an exchange-traded fund that tracks the S&P 500 index and was launched on January 22, 1993. This gives roughly 27 years of data (collected through August 2020). I will use the mean and standard deviation of the monthly returns to investigate whether returns are normally distributed, and use the findings to answer the question:


Would it make sense to include $SPY in my investment portfolio?

(2) Analyzing the Data

The data contains only closing prices and volume figures, which leaves me with the task of calculating monthly returns manually. I also saved the mean and standard deviation as variables to use later.

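# Monthly percentage return from consecutive closing prices.
# The first month has no prior close, so its return stays at 0
# (this placeholder is included in the mean and SD below).
n <- nrow(data)
data$return <- 0
for (i in 2:n) {
  data$return[i] <- (data$Close[i] - data$Close[i - 1]) / data$Close[i - 1] * 100
}
data$return <- round(data$return, digits = 2)
MEAN <- mean(data$return)
SD <- sd(data$return)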
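The same calculation can be written without an explicit loop. The sketch below is one possible vectorized version, not the code used for the results in this post; the `Close` values are made-up prices chosen only to make the arithmetic easy to follow.

```r
# Hypothetical closing prices (not the SPY data), chosen for easy arithmetic.
data <- data.frame(Close = c(100, 105, 102.9, 108.045))

# diff() gives Close[i] - Close[i-1]; head(x, -1) drops the last price,
# so each difference is divided by the previous month's close.
# The leading 0 mirrors the placeholder used for the first month in the loop.
data$return <- c(0, diff(data$Close) / head(data$Close, -1) * 100)
data$return <- round(data$return, digits = 2)

# Summary statistics with and without the first-month placeholder.
mean_with_placeholder <- mean(data$return)
mean_without_placeholder <- mean(data$return[-1])

print(data$return)
print(mean_with_placeholder)
print(mean_without_placeholder)
```

Dropping the first row with `data$return[-1]` keeps the artificial 0 out of the mean and standard deviation; with several hundred months of data the difference is small, but it is worth knowing which version a reported figure uses.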


The average return and standard deviation are shown below, along with the range of monthly returns.

## [1] "The average monthly return is  0.705120481927711"
## [1] "The standard deviation for the monthly returns is  4.20926689341531"
## [1] "The max. (+) and min.(-) value is  -16.52"
## [2] "The max. (+) and min.(-) value is  12.7"

Visualizing the Data

## [1] -0.6278495

Figure 1: The data is negatively skewed (skewness of about -0.63, printed with the chart), so the left tail is longer and the bulk of the returns sits to the right of the mean. In general we are therefore more likely to see a modest positive return, while the occasional large loss is more extreme than the largest gains.
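For readers who want to see what the skewness figure measures, here is a minimal base R sketch of the moment-based sample skewness (third central moment divided by the 1.5 power of the second). The function name `sample_skewness` and the example vector are hypothetical; the value reported above may well have come from a package function instead, which is an assumption on my part.

```r
# Moment-based sample skewness: m3 / m2^(3/2), using central moments.
sample_skewness <- function(x) {
  x <- x[!is.na(x)]
  centered <- x - mean(x)
  m2 <- mean(centered^2)  # second central moment
  m3 <- mean(centered^3)  # third central moment
  m3 / m2^1.5
}

# Hypothetical returns in percent (not the SPY data):
# one large loss produces a long left tail.
example_returns <- c(1.2, 0.8, 2.1, -0.5, 1.5, 0.9, -9.0, 1.1)

skew <- sample_skewness(example_returns)
print(skew)

# With a long left tail, the median sits above the mean.
print(median(example_returns) > mean(example_returns))
```

A negative result indicates a longer left tail, which is the pattern described for the monthly returns in Figure 1.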


Figure 2: Most of the returns fall under the fitted normal distribution curve, so the data closely resembles a normal distribution.

Figure 3: When we plot the values on a Q-Q plot, we see a few outliers that do not follow the 45-degree blue reference line. Deviations from this line represent differences from the normal distribution. The data departs from normal in the tails, so it may be slightly non-normal, but it is close enough to normal to support a deeper analysis.
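Charts like Figures 2 and 3 can be produced with base R graphics. The sketch below is only an illustration: it uses simulated returns (an assumption, since the original data frame is not reproduced here), and the plotting choices such as the number of breaks and the colors are my own.

```r
set.seed(42)

# Simulated stand-in for data$return: 332 normal draws using
# approximately the mean and SD reported earlier.
returns <- rnorm(332, mean = 0.705, sd = 4.209)

# Histogram on the density scale with a fitted normal curve on top.
hist(returns, breaks = 30, freq = FALSE,
     main = "Monthly returns", xlab = "Return (%)")
curve(dnorm(x, mean = mean(returns), sd = sd(returns)),
      add = TRUE, col = "blue")

# Normal Q-Q plot. qqline() draws a reference line through the first
# and third quartiles; points far from it in the tails suggest
# departures from normality.
qq <- qqnorm(returns, main = "Normal Q-Q plot")
qqline(returns, col = "blue")
```

Replacing `returns` with the actual `data$return` column would reproduce the comparison on the real data.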

Understanding Probabilities

Judging by the graphs we’ve plotted so far, it is highly unlikely that investing in this index will land you a monthly return higher than 10%. Similarly, a monthly loss greater than 10% is possible but rare. How likely are both of these scenarios?

## [1] "The probability of achieving a monthly return greater than 10% is  1.36 %"
## [1] "The probability of achieving a monthly loss greater than 10% is  0.55 %"
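These two figures are consistent with tail probabilities of a normal distribution that uses the mean and standard deviation reported earlier. A minimal sketch with `pnorm()`, assuming that is how they were obtained (the variable names `p_gain` and `p_loss` are mine):

```r
# Mean and SD of monthly returns (in %), copied from the output above.
MEAN <- 0.705120481927711
SD <- 4.20926689341531

# P(return > 10%): upper tail of the normal distribution.
p_gain <- pnorm(10, mean = MEAN, sd = SD, lower.tail = FALSE)

# P(return < -10%): lower tail of the normal distribution.
p_loss <- pnorm(-10, mean = MEAN, sd = SD)

print(paste("P(monthly return > 10%):", round(p_gain * 100, 2), "%"))
print(paste("P(monthly return < -10%):", round(p_loss * 100, 2), "%"))
```

Because the Q-Q plot shows heavier tails than a normal distribution, the true chance of an extreme month may be somewhat higher than these estimates.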

Under the normal approximation, it’s very unlikely that we would see such a dramatic gain or loss in any given month. As noted earlier, the distribution of returns leans slightly toward positive returns. To illustrate, I plotted the Cumulative Distribution Function (CDF) below.


Figure 4: The CDF gives the probability of obtaining a value smaller than a given value, in this case a given monthly return. According to the graph, there is roughly a 40% chance of a monthly loss, which means the chance of a positive return is roughly 60%.
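The probability of a losing month can also be read directly from the normal CDF at zero rather than from the chart. The sketch below assumes the normal model with the reported mean and standard deviation; the plot settings are illustrative only.

```r
# Mean and SD of monthly returns (in %), copied from the output above.
MEAN <- 0.705120481927711
SD <- 4.20926689341531

# CDF evaluated at 0 gives P(return < 0); the complement is P(return > 0).
p_negative <- pnorm(0, mean = MEAN, sd = SD)
p_positive <- 1 - p_negative

print(round(c(loss = p_negative, gain = p_positive), 3))

# Plot the normal CDF and mark the zero-return point.
curve(pnorm(x, mean = MEAN, sd = SD), from = -20, to = 15,
      xlab = "Monthly return (%)", ylab = "Cumulative probability",
      main = "Normal CDF of monthly returns")
abline(v = 0, lty = 2)
abline(h = p_negative, lty = 2)
```

Under these assumptions the chance of a loss comes out near 43%, in the same neighborhood as the roughly 40% read off the chart. With the actual data, `ecdf(data$return)(0)` would give the observed share of months at or below zero.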

Conclusion

We’ve analyzed the monthly returns for $SPY for the period between January 1993 and August 2020. The returns span about 30 percentage points (from -16.52% to 12.7%), and the average monthly return is approximately 0.7%, which compounds to roughly an 8.8% average annual return. Furthermore, there is about a 60% chance that any given monthly return will be positive. I’d include $SPY in my portfolio, given that the odds are in my favor of obtaining a positive return. In addition, the monthly standard deviation of about 4.2% does not, in my view, make this a volatile investment.
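The annual figure follows from compounding the average monthly return over twelve months. A short sketch of that arithmetic, treating the average as a constant monthly rate (a simplification):

```r
# Average monthly return in percent, copied from the output above.
monthly_mean <- 0.705120481927711

# Compounded over 12 months: (1 + r)^12 - 1.
annual_compounded <- ((1 + monthly_mean / 100)^12 - 1) * 100

# Simple (non-compounded) annualization for comparison: 12 * r.
annual_simple <- monthly_mean * 12

print(round(annual_compounded, 1))
print(round(annual_simple, 1))
```

The compounded figure rounds to 8.8%, while simply multiplying by twelve gives about 8.5%.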