A Time Series Analysis of Amazon’s Monthly Opening Stock Price

2025-12-11

What Did I Do and Why?

Thesis: Analyzed Amazon’s monthly opening price
Scope: Jan 2009 to Oct 2021
Personal Motivation: Improved financial literacy, valuable market knowledge
Target Audience: Investors, economists, media analysts

Methods and Technology Used

R and RStudio (knitr, forecast, lubridate)
Exponential Smoothing
Holt / Holt-Winters

“Weighted Averaging” ~ think a step above Moving Average

Background of Dataset

Sourced from Kaggle
High data workability
- No missing values, consistent data entry and variable encoding
Came with a daily observation for every market-open day from May 15, 1997 to October 27, 2021
- First few observations seen below

Date	Open	High	Low	Close	Adj Close	Volume
1997-05-15	2.437500	2.500000	1.927083	1.958333	1.958333	72156000
1997-05-16	1.968750	1.979167	1.708333	1.729167	1.729167	14700000
1997-05-19	1.760417	1.770833	1.625000	1.708333	1.708333	6106800
1997-05-20	1.729167	1.750000	1.635417	1.635417	1.635417	5467200
1997-05-21	1.635417	1.645833	1.375000	1.427083	1.427083	18853200
1997-05-22	1.437500	1.447917	1.312500	1.395833	1.395833	11776800

My Procedure

Research Question: What is the history and how can I predict the future?

Constraints: Only looking at the market open on the first day of each month, from January 2009 to October 2021

Opening price = widely understood, even colloquially
Day-to-day = too many observations
Year-to-year = too few observations
Chosen timeline: n = 154

First few observations of dataset after programming manipulation

Month	Year	Open	Season
Jan	2009	52.01	Winter
Feb	2009	62.87	Winter
Mar	2009	68.35	Spring
Apr	2009	78.28	Spring
May	2009	77.84	Spring
Jun	2009	83.19	Summer
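The reduction from daily rows to one row per month can be sketched as follows. This is a minimal Python illustration, not the R code used in the analysis. The function names season_of and first_open_per_month are hypothetical, and so is the season mapping beyond what the table above confirms (Jan/Feb as Winter, Mar to May as Spring, Jun as Summer); December as Winter and the label "Fall" are assumptions. The input rows are the first daily observations shown earlier.

```python
from datetime import date


def season_of(month):
    """Map a month number to a season (assumed meteorological seasons)."""
    if month in (12, 1, 2):
        return "Winter"
    if month in (3, 4, 5):
        return "Spring"
    if month in (6, 7, 8):
        return "Summer"
    return "Fall"


def first_open_per_month(rows):
    """Keep the opening price from the earliest market-open day of each month."""
    firsts = {}
    for day, open_price in sorted(rows):
        # setdefault only stores the first (earliest) day seen for each month
        firsts.setdefault((day.year, day.month), open_price)
    return [
        (month, year, open_price, season_of(month))
        for (year, month), open_price in sorted(firsts.items())
    ]


# First daily observations from the original dataset (Date, Open)
daily = [
    (date(1997, 5, 15), 2.437500),
    (date(1997, 5, 16), 1.968750),
    (date(1997, 5, 19), 1.760417),
]
monthly = first_open_per_month(daily)
print(monthly)
```

Each output tuple follows the Month, Year, Open, Season layout of the table above, with the month as a number rather than an abbreviation.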

Data Visualization

Priority 1: Observe Trend
What we see from plot below:
- Positive
- Not linear
- (Increase ~ 2017 to 2022) >>> (Increase ~ 2009 to 2016)

Data Visualization

Priority 2: Observe Seasonality
- Seasonal differences grow larger as time goes on

Training Vs Testing Data

Training Data ≠ Testing Data
- Testing on the same data used for training poses risk of overfitting
- It also fails to reflect true “real world” application
Training → Jan 2009 to Oct 2020 (142 month period)
Testing → Nov 2020 to Oct 2021 (12 month period)
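The split above can be checked with a short sketch. This is a Python illustration under stated assumptions, not the code used in the analysis: the helper month_labels is hypothetical, and the holdout is taken simply as the last 12 months of the timeline.

```python
def month_labels(start_year, start_month, end_year, end_month):
    """List every (year, month) pair from start to end, inclusive."""
    labels = []
    year, month = start_year, start_month
    while (year, month) <= (end_year, end_month):
        labels.append((year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return labels


months = month_labels(2009, 1, 2021, 10)
horizon = 12  # length of the testing period

# A time series is split chronologically, never shuffled
train, test = months[:-horizon], months[-horizon:]

print(len(months), len(train), len(test))
print(train[-1], test[0])
```

The counts agree with n = 154, a 142 month training period ending Oct 2020, and a 12 month testing period starting Nov 2020.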

Model Creation - Exponential Smoothing and Holt

Credit these ideas to Charles Holt, Professor of Business and Finance at the University of Texas at Austin
Developed throughout the mid to late 1950s
Holt expanded on the foundations of SES (discovered in previous decade) by considering trend

Model Type	Explanation	Trend?	Seasonality?	Assumptions
Simple Exponential Smoothing (SES)	Weighted average; More recent observations = More weight	No	No	Values are not changing much from what they’ve recently been
Holt Additive	Generalized SES, but ADDS trend parameter	Yes	No	Values increasing/decreasing in LINEAR fashion
Holt Additive W/Damp	Generalized SES, ADDS trend parameter, but DAMPS parameter to reduce abs[trend’s impact]	Yes	No	Linear increase/decrease, but will weaken in strength over time
Holt Multiplicative W/Damp	Like additive iteration, but MULTIPLIES trend parameter	Yes	No	Values increasing/decreasing by a RATE

Note: Holt Multiplicative without Damp is rarely ever used
- Approaches $\infty$
- Computationally unstable
- Not realistic
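To make the table concrete, here is a minimal Python sketch of SES and of Holt's additive method with an optional damping parameter. It is not the forecast package's implementation: the initial level and trend are simple assumptions, the smoothing parameters are passed in by hand rather than estimated, and the function names are hypothetical. The multiplicative variant is not shown.

```python
def ses_forecast(series, alpha, horizon):
    """Simple exponential smoothing: every forecast equals the last level."""
    level = series[0]  # assumption: start the level at the first observation
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return [level] * horizon


def holt_forecast(series, alpha, beta, horizon, phi=1.0):
    """Holt's additive trend method; phi < 1 damps the trend, phi = 1 does not."""
    level = series[0]
    trend = series[1] - series[0]  # assumption: crude initial trend
    for y in series[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (prev_level + phi * trend)
        trend = beta * (level - prev_level) + (1 - beta) * phi * trend
    forecasts, damp_sum = [], 0.0
    for h in range(1, horizon + 1):
        damp_sum += phi ** h  # phi + phi^2 + ... + phi^h
        forecasts.append(level + damp_sum * trend)
    return forecasts


toy = [10, 12, 14, 16, 18]  # illustrative series only, not Amazon data
print(ses_forecast(toy, 0.5, 3))
print(holt_forecast(toy, 0.5, 0.5, 3))
print(holt_forecast(toy, 0.5, 0.5, 3, phi=0.8))
```

SES returns the same value at every horizon, the undamped Holt forecast keeps climbing in a straight line, and the damped forecast climbs by smaller steps each period.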

Model Creation - Holt-Winters

Holt and a student of his, Peter Winters, then went even further and developed a method that also factored in seasonality
Winters officially published these findings in a 1960 paper “Forecasting Sales by Exponentially Weighted Moving Averages”

Model Type	Explanation	Trend?	Seasonality?	Assumptions
Holt-Winters (HW) Additive	Adds Data’s trend and constant oscillation due to seasonality	Yes	Yes	Linear upward/downward overall trend; seasonal fluctuations from a constant
HW Additive W/Damp	Adds a DAMPED trend and constant seasonality	Yes	Yes	Upward/downward overall trend appears linear, but its strength is decreasing; seasonal fluctuations remain constant
HW Multiplicative	Multiplies Data’s trend and proportional oscillation due to seasonality	Yes	Yes	Upward/downward overall trend per a certain rate; seasonal fluctuations behave proportionally
HW Multiplicative W/Damp	Multiplies a DAMPED trend and proportional seasonality	Yes	Yes	Upward/downward overall trend per a certain rate, but will decrease in magnitude; proportional season fluctuations
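For reference, the rows above can be written in the standard textbook component form. This is a sketch, not a statement of exactly which variant R fit here. The symbols are assumptions not defined in this presentation: $\ell_t$ is the level, $b_t$ the trend, $s_t$ the seasonal component, $m = 12$ the seasonal period, $\alpha$, $\beta$, $\gamma$ the smoothing parameters, and $k = \lfloor (h-1)/m \rfloor$. Setting $\phi = 1$ gives the non-damped versions.

Additive seasonality:

$$
\begin{aligned}
\hat{y}_{t+h|t} &= \ell_t + (\phi + \phi^2 + \dots + \phi^h)\, b_t + s_{t+h-m(k+1)} \\
\ell_t &= \alpha (y_t - s_{t-m}) + (1-\alpha)(\ell_{t-1} + \phi b_{t-1}) \\
b_t &= \beta (\ell_t - \ell_{t-1}) + (1-\beta)\, \phi b_{t-1} \\
s_t &= \gamma (y_t - \ell_{t-1} - \phi b_{t-1}) + (1-\gamma)\, s_{t-m}
\end{aligned}
$$

Multiplicative seasonality:

$$
\begin{aligned}
\hat{y}_{t+h|t} &= \left[\ell_t + (\phi + \phi^2 + \dots + \phi^h)\, b_t\right] s_{t+h-m(k+1)} \\
\ell_t &= \alpha \frac{y_t}{s_{t-m}} + (1-\alpha)(\ell_{t-1} + \phi b_{t-1}) \\
b_t &= \beta (\ell_t - \ell_{t-1}) + (1-\beta)\, \phi b_{t-1} \\
s_t &= \gamma \frac{y_t}{\ell_{t-1} + \phi b_{t-1}} + (1-\gamma)\, s_{t-m}
\end{aligned}
$$

In the additive form the seasonal term is added as a fixed amount, while in the multiplicative form it scales the level, which is why its swings grow with the price.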

Model Forecasts

Model Forecasts for Monthly Amazon Stock Openings ($USD)
Month	True Values	SES	Holt Add	Holt Add W/ Damp	Holt Multiply W/ Damp	HW Add	HW Multiply	HW Add W/ Damp	HW Multiply W/ Damp
Nov 2020	3147.33	3241.35	3340.39	3289.48	3300.31	3331.56	3212.56	3318.53	3197.97
Dec 2020	3199.93	3241.35	3439.42	3327.95	3360.33	3419.13	3237.60	3389.56	3200.23
Jan 2021	3206.54	3241.35	3538.45	3358.72	3409.14	3526.80	3386.01	3470.33	3326.71
Feb 2021	3267.66	3241.35	3637.48	3383.34	3448.69	3636.39	3473.67	3557.02	3472.25
Mar 2021	3074.58	3241.35	3736.51	3403.03	3480.66	3719.15	3493.00	3614.45	3430.89
Apr 2021	3347.73	3241.35	3835.53	3418.79	3506.45	3847.23	3780.15	3711.32	3696.34
May 2021	3261.31	3241.35	3934.56	3431.39	3527.23	3979.00	3984.88	3811.29	3830.22
Jun 2021	3360.01	3241.35	4033.59	3441.48	3543.93	4087.03	4274.96	3880.84	3946.92
Jul 2021	3612.71	3241.35	4132.62	3449.54	3557.35	4207.22	4678.05	3961.56	4258.44
Aug 2021	3310.76	3241.35	4231.65	3456.00	3568.13	4293.29	4670.11	4008.19	4273.38
Sept 2021	3432.44	3241.35	4330.68	3461.16	3576.77	4395.84	4695.49	4064.64	4165.20
Oct 2021	3325.98	3241.35	4429.70	3465.29	3583.70	4479.89	4599.32	4099.32	4050.32

The non-damped Holt and HW models drastically overestimated
The damped HW models overestimated but not as much
Most accurate were SES and damped Holt models

Model Evaluation

Before plotting, wanted to use a table to review each model’s accuracy measures
Using R’s accuracy function (forecast package), calculated error metrics below
- Note that calculations are based on training data
Models with the best ranking in each metric are marked

Model Accuracy Summary Table
	ME	RMSE	MAE	MPE	MAPE	MASE	ACF1
SES	22.4633	78.0381	43.8711	2.6342	5.8786	0.1942	0.2736
Holt Add	6.3619	72.6341	39.5756	0.3233	5.5699	0.1752	0.14
Holt Add W/ Damp	10.27	72.0711	39.6539	1.124	5.7019	0.1755	-0.0207
Holt Multiply W/ Damp	8.728	71.712	39.3588	0.9515	5.7406	0.1742	-0.0186
HW Add	6.1619	70.4673	41.937	0.3457	7.9333	0.1856	0.1267
HW Multiply	6.1625	67.4218	41.9806	0.1399	6.6633	0.1858	0.0722
HW Add W/ Damp	9.0695	70.5768	42.2699	0.7657	8.0253	0.1871	0.1107
HW Multiply W/ Damp	11.6052	61.579	39.8682	0.9885	6.2923	0.1765	0.1686

Damped Holt Multiplicative model had best ranking in three of the seven error measurements (MAE, MASE, ACF1)

Model Plotting

Next step was to look at plots (real and recorded values versus what models predicted)

SES and damped Holt models are the only ones that remotely resemble the true values

Choosing Between Top Models

Prediction and Error Breakdown for SES and Additive Holt Models
Month	True Values	SES	SES Errors	Holt Add W/ Damp	Holt Errors
Nov 2020	3147.33	3241.35	94.02	3289.48	142.15
Dec 2020	3199.93	3241.35	41.42	3327.95	128.02
Jan 2021	3206.54	3241.35	34.81	3358.72	152.18
Feb 2021	3267.66	3241.35	-26.31	3383.34	115.68
Mar 2021	3074.58	3241.35	166.77	3403.03	328.45
Apr 2021	3347.73	3241.35	-106.38	3418.79	71.06
May 2021	3261.31	3241.35	-19.96	3431.39	170.08
Jun 2021	3360.01	3241.35	-118.66	3441.48	81.47
Jul 2021	3612.71	3241.35	-371.36	3449.54	-163.17
Aug 2021	3310.76	3241.35	-69.41	3456	145.24
Sept 2021	3432.44	3241.35	-191.09	3461.16	28.72
Oct 2021	3325.98	3241.35	-84.63	3465.29	139.31
Average	3295.58	3241.35	-54.23	3407.18	111.6

The SES model:
- Four overestimations
- Eight underestimations
- Range of Error [19.96 to 371.36]
The Holt Additive Damped model:
- Eleven overestimations
- One underestimation
- Range of Error [28.72 to 328.45]
Slight edge to SES model, still need to dig further
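The counts and ranges above can be reproduced from the two error columns of the breakdown table. This is a small Python sketch; the function name summarize is hypothetical, and a positive error is read as an overestimation (forecast above the true value), which is the sign convention the table appears to use.

```python
# Error columns copied from the breakdown table (Nov 2020 to Oct 2021)
ses_errors = [94.02, 41.42, 34.81, -26.31, 166.77, -106.38,
              -19.96, -118.66, -371.36, -69.41, -191.09, -84.63]
holt_errors = [142.15, 128.02, 152.18, 115.68, 328.45, 71.06,
               170.08, 81.47, -163.17, 145.24, 28.72, 139.31]


def summarize(errors):
    """Return (overestimations, underestimations, smallest |error|, largest |error|)."""
    over = sum(e > 0 for e in errors)
    under = sum(e < 0 for e in errors)
    magnitudes = [abs(e) for e in errors]
    return over, under, min(magnitudes), max(magnitudes)


print("SES:", summarize(ses_errors))
print("Holt Add W/ Damp:", summarize(holt_errors))
```

The SES errors are split across both signs, while the damped Holt errors are almost all positive, which points to a systematic upward bias in that model.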

Choosing Between Top Models

Will compare forecasts from each of these two models against the known values (test data) to make the final decision
Two measures of absolute error (ME and MSE)
Two measures of relative error (MPE and MAPE)

	SES	Holt Add W/ Damp
ME	-54.23	111.60
MSE	21039.21	24131.76
MPE	-1.49	3.51
MAPE	3.28	4.26
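The four measures in the table can be recomputed from the forecast table with the sketch below. It is written in Python rather than R, and error_metrics is a hypothetical helper. Errors are taken as forecast minus actual so that the signs match the error columns shown earlier. That convention is an assumption drawn from the table, and it is the opposite sign from measures defined as actual minus forecast.

```python
from math import fsum

# True values and forecasts for Nov 2020 to Oct 2021, copied from the forecast table
actual = [3147.33, 3199.93, 3206.54, 3267.66, 3074.58, 3347.73,
          3261.31, 3360.01, 3612.71, 3310.76, 3432.44, 3325.98]
ses = [3241.35] * 12
holt_damped = [3289.48, 3327.95, 3358.72, 3383.34, 3403.03, 3418.79,
               3431.39, 3441.48, 3449.54, 3456.00, 3461.16, 3465.29]


def error_metrics(forecasts, actuals):
    """ME, MSE, MPE and MAPE with errors defined as forecast minus actual."""
    errors = [f - a for f, a in zip(forecasts, actuals)]
    pct = [100 * e / a for e, a in zip(errors, actuals)]
    n = len(errors)
    return {
        "ME": fsum(errors) / n,
        "MSE": fsum(e * e for e in errors) / n,
        "MPE": fsum(pct) / n,
        "MAPE": fsum(abs(p) for p in pct) / n,
    }


for name, forecasts in (("SES", ses), ("Holt Add W/ Damp", holt_damped)):
    rounded = {k: round(v, 2) for k, v in error_metrics(forecasts, actual).items()}
    print(name, rounded)
```

ME and MPE keep their sign, so they show the direction of the bias, while MSE and MAPE measure the size of the misses regardless of direction.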

Final Model Selection

Simple Exponential Smoothing (SES) was the most effective and accurate for the purposes of this analysis

SES Accuracy Measures Calculated via Testing Data

ME	-54.23
MSE	21039.21
MPE	-1.49
MAPE	3.28

Conclusions/Explanations

Why did the simplest model perform the best?

1.) The massive spike in the rate of Amazon’s stock increase from about 2018 to October of 2021

Because of the “out of norm” rate of increase, all the non-damped models (Holt and HW) drastically overforecasted

Conclusions/Explanations

2.) Seasonal oscillations mirrored the overall data trend, disparities grew at exponential rate over the last ~ 4 years of data set

Damped HW models assume depreciation over time in trend, not in season-by-season disparities
- This is because seasonal oscillations are usually constant (Additive) or proportional (Multiplicative), regardless of whether the trend is increasing or decreasing

Conclusions/Explanations

3.) Damping parameter phi (pronounced “fee”) ($\phi$) was too high

$\phi$ always lies between 0 and 1
- If $\phi$ = 1, then the damp is non-existent
Automatically calculated via R’s forecast function
Per R’s $ selector, $\phi$ ~ 0.80 for damped Holt Additive and damped Holt Multiplicative models

I did attempt to manually rebuild these two Holt models with a lower phi value, but was consistently met with errors
- The values of phi, alpha and beta are all interdependent
- I could not even lower phi’s value to 0.79
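The effect of phi can be sketched as follows. In a damped additive trend model, the h-step forecast adds the last trend estimate multiplied by phi + phi^2 + ... + phi^h, which is a standard textbook result. The Python below is a minimal illustration with hypothetical function names, and it also recovers phi from the first three damped Holt Additive forecasts in the forecast table.

```python
def damped_trend_multiplier(phi, horizon):
    """phi + phi^2 + ... + phi^horizon: trend steps added to an h-step forecast."""
    return sum(phi ** k for k in range(1, horizon + 1))


def long_run_multiplier(phi):
    """Limit of the multiplier as the horizon grows (requires 0 < phi < 1)."""
    return phi / (1 - phi)


# Damped Holt Additive forecasts for Nov 2020, Dec 2020 and Jan 2021 (forecast table)
f1, f2, f3 = 3289.48, 3327.95, 3358.72

# Successive forecast increments shrink by a factor of phi
implied_phi = (f3 - f2) / (f2 - f1)

print(round(implied_phi, 2))
print(damped_trend_multiplier(0.8, 12), long_run_multiplier(0.8))
```

With phi near 0.8, the forecast can never rise by more than about four times the last trend estimate, which is why the damped Holt forecasts level off. If the forecast package restricts phi to a default range whose lower limit is 0.8 (an assumption worth checking in the package documentation), that could also explain why a value of 0.79 was rejected.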

Takeaways / Lessons

Subsetting into training vs testing sets is very important
“More complex” does not necessarily = better fit for situation at hand
- Sometimes more accuracy is retained by not considering all possible factors
The value of investing long term in companies you believe in
- Say you bought 100 shares of Amazon in January of 2009 for $5,201
  - Those shares would be worth $30,288 by January of 2015
  - By January of 2018, they would be worth $130,138
  - If you continued to hold, they would be worth $332,598 by October of 2021

Changes/Ideas for Future Analyses

Perform an associative and external analysis, looking at what specific factors across timeline (Jan 2009 - October 2021) contributed to stock’s monthly increase/decrease
Perform a time series analysis on more recent data and on a tighter timeline for Amazon
- Ex: Jan 2022 to Dec 2025, observations at weekly intervals (sample size ~ 150)
  - As we saw, the year of 2020 had a massive impact on both trend and seasonality
Perform a time series analysis on the same timeline but with a different company

References

Original Dataset Source:

https://www.kaggle.com/datasets/kannan1314/amazon-stock-price-all-time/data

My Complete Report:

https://rpubs.com/Chris_Bahm/1372994