Team 8:
Ankit
Uday Kolli
Sai Charan Pappala
Vamshidhar Reddy Kanamanthareddy
Stock prices are highly volatile, and every investor wants to anticipate their future behavior in order to invest profitably. Predicting and analyzing stock prices is among the most difficult tasks to get right, because prices respond to many factors, such as market fluctuations, news, customer sentiment, and events like a pandemic. We used some of these factors as predictors to forecast the stock price of IBM.
In your proposal you mentioned that factors like customer sentiment and news can have a big impact on stock prices. It may be helpful to pull in secondary data, such as the most trending news headlines at certain times, to provide context for different stock price changes.
It may also be helpful to consider the period between the close on a Friday and the open on a Monday, since news or events are more likely to occur over this gap than over the gap between two weekdays.
You mentioned that you will be predicting stock prices. How many years of stock data are you planning to use? On what basis will you choose the predictors, and how will you weight them?
Are you planning to incorporate seasonality factors in your time series model?
With intraday trading, are you going to focus on one stock's pattern over time?
Since stock market values are dynamic in nature, how do you plan to collect the data?
How are you planning to merge secondary data coming from different sources?
Can you explain the models being used?
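The Friday-close to Monday-open suggestion above can be checked directly once daily open and close prices are available. The following is a minimal sketch rather than the team's method: the `bars` values are made-up illustrative numbers (not IBM prices), and `overnight_gaps` is a hypothetical helper that compares each day's open with the previous trading day's close and flags gaps that span more than one calendar day.

```python
from datetime import date

# Hypothetical daily bars (date, open, close); values are illustrative only.
bars = [
    (date(2021, 1, 21), 100.0, 101.0),   # Thursday
    (date(2021, 1, 22), 101.5, 102.0),   # Friday
    (date(2021, 1, 25), 104.0, 103.0),   # Monday
    (date(2021, 1, 26), 103.25, 103.5),  # Tuesday
]


def overnight_gaps(bars):
    """Return (date, gap, spans_non_trading_days) for every bar after the first.

    gap is today's open minus the previous trading day's close.
    spans_non_trading_days is True when more than one calendar day separates
    the two bars (a weekend or a holiday).
    """
    ordered = sorted(bars)
    result = []
    for (prev_day, _, prev_close), (day, open_price, _) in zip(ordered, ordered[1:]):
        gap = open_price - prev_close
        spans = (day - prev_day).days > 1
        result.append((day, gap, spans))
    return result


gaps = overnight_gaps(bars)
for day, gap, spans in gaps:
    label = "weekend/holiday" if spans else "weekday"
    print(day.isoformat(), f"{gap:+.2f}", label)
```

With real data, comparing the distribution of absolute gaps in the two groups would show whether weekend gaps are in fact larger for the stock in question.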
## 'data.frame': 20365 obs. of 30 variables:
## $ time : chr "1/27/2021 15:00" "1/27/2021 9:45" "1/27/2021 15:30" "1/27/2021 15:15" ...
## $ open : num 112 110 112 111 109 ...
## $ high : num 112 111 112 112 109 ...
## $ low : num 111 109 111 111 108 ...
## $ close : num 112 110 111 112 109 ...
## $ volume : int 471830 1364954 431226 434875 2090 21911 16405 645 1140 4480 ...
## $ grossProfit : num 10523000000 10523000000 10523000000 10523000000 10523000000 ...
## $ totalRevenue : num 20367000000 20367000000 20367000000 20367000000 20367000000 ...
## $ costOfRevenue : num 9844000000 9844000000 9844000000 9844000000 9844000000 ...
## $ costofGoodsAndServicesSold : int 181000000 181000000 181000000 181000000 181000000 181000000 181000000 181000000 181000000 181000000 ...
## $ operatingIncome : num -4000000000 -4000000000 -4000000000 -4000000000 -4000000000 -4000000000 -4000000000 -4000000000 -4000000000 -4000000000 ...
## $ sellingGeneralAndAdministrative : num 7233000000 7233000000 7233000000 7233000000 7233000000 ...
## $ researchAndDevelopment : int 1611000000 1611000000 1611000000 1611000000 1611000000 1611000000 1611000000 1611000000 1611000000 1611000000 ...
## $ operatingExpenses : num 2000000000 2000000000 2000000000 2000000000 2000000000 2000000000 2000000000 2000000000 2000000000 2000000000 ...
## $ netInterestIncome : int -317000000 -317000000 -317000000 -317000000 -317000000 -317000000 -317000000 -317000000 -317000000 -317000000 ...
## $ interestIncome : int 15000000 15000000 15000000 15000000 15000000 15000000 15000000 15000000 15000000 15000000 ...
## $ interestExpense : int 317000000 317000000 317000000 317000000 317000000 317000000 317000000 317000000 317000000 317000000 ...
## $ otherNonOperatingIncome : int -247000000 -247000000 -247000000 -247000000 -247000000 -247000000 -247000000 -247000000 -247000000 -247000000 ...
## $ depreciation : int 1089000000 1089000000 1089000000 1089000000 1089000000 1089000000 1089000000 1089000000 1089000000 1089000000 ...
## $ depreciationAndAmortization : int 610000000 610000000 610000000 610000000 610000000 610000000 610000000 610000000 610000000 610000000 ...
## $ incomeBeforeTax : int 1380000000 1380000000 1380000000 1380000000 1380000000 1380000000 1380000000 1380000000 1380000000 1380000000 ...
## $ incomeTaxExpense : int 24000000 24000000 24000000 24000000 24000000 24000000 24000000 24000000 24000000 24000000 ...
## $ interestAndDebtExpense : int 317000000 317000000 317000000 317000000 317000000 317000000 317000000 317000000 317000000 317000000 ...
## $ netIncomeFromContinuingOperations: int 1264000000 1264000000 1264000000 1264000000 1264000000 1264000000 1264000000 1264000000 1264000000 1264000000 ...
## $ comprehensiveIncomeNetOfTax : num 603000000 603000000 603000000 603000000 603000000 603000000 603000000 603000000 603000000 603000000 ...
## $ ebit : num 1697000000 1697000000 1697000000 1697000000 1697000000 ...
## $ ebitda : num 2307000000 2307000000 2307000000 2307000000 2307000000 ...
## $ netIncome : num 1356000000 1356000000 1356000000 1356000000 1356000000 ...
## $ RETAIL_SALES_VALUE : int 464362 464362 464362 464362 464362 464362 464362 464362 464362 464362 ...
## $ CONSUMER_SENTIMENT_VALUE : num 79 79 79 79 79 79 79 79 79 79 ...
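The structure listing shows `time` stored as character strings in month/day/year order, and the first few rows are not in chronological order. Before any time series modeling, a column like this would typically be parsed into real timestamps and sorted. A minimal Python sketch, assuming the `M/D/YYYY H:MM` layout seen in the four sample values holds for the whole column:

```python
from datetime import datetime

# The four sample timestamps shown in the structure listing.
raw_times = ["1/27/2021 15:00", "1/27/2021 9:45", "1/27/2021 15:30", "1/27/2021 15:15"]

# Assumed layout: month/day/4-digit year, 24-hour clock.
TIME_FORMAT = "%m/%d/%Y %H:%M"


def parse_time(text):
    """Convert one timestamp string into a datetime object."""
    return datetime.strptime(text, TIME_FORMAT)


# Sort on the parsed values, not on the raw strings.
parsed = sorted(parse_time(t) for t in raw_times)
for ts in parsed:
    print(ts.isoformat(sep=" "))
```

Sorting the raw strings would give the wrong order here, because `"9:45"` compares greater than `"15:30"` character by character.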
Close Price vs Time
Results Summary
## Model Details:
## ==============
##
## H2ORegressionModel: glm
## Model ID: metalearner_AUTO_StackedEnsemble_BestOfFamily_1_AutoML_1_20220427_145008
## GLM Model: summary
## family link regularization
## 1 gaussian identity Elastic Net (alpha = 0.5, lambda = 0.02387 )
## lambda_search
## 1 nlambda = 100, lambda.max = 238.7, lambda.min = 0.02387, lambda.1se = 4.7961
## number_of_predictors_total number_of_active_predictors number_of_iterations
## 1 2 2 100
## training_frame
## 1 levelone_training_StackedEnsemble_BestOfFamily_1_AutoML_1_20220427_145008
##
## Coefficients: glm coefficients
## names coefficients standardized_coefficients
## 1 Intercept -0.057496 118.350181
## 2 GLM_1_AutoML_1_20220427_145008 0.517151 5.648339
## 3 GBM_1_AutoML_1_20220427_145008 0.483332 5.277908
##
## H2ORegressionMetrics: glm
## ** Reported on training data. **
##
## MSE: 0.8560418
## RMSE: 0.9252253
## MAE: 0.220518
## RMSLE: 0.007848946
## Mean Residual Deviance : 0.8560418
## R^2 : 0.9928826
## Null Deviance :1966126
## Null D.o.F. :16346
## Residual Deviance :13993.71
## Residual D.o.F. :16344
## AIC :43857.86
##
##
## H2ORegressionMetrics: glm
## ** Reported on validation data. **
##
## MSE: 1.083884
## RMSE: 1.041098
## MAE: 0.2086107
## RMSLE: 0.008941312
## Mean Residual Deviance : 1.083884
## R^2 : 0.9909645
## Null Deviance :482192.2
## Null D.o.F. :4016
## Residual Deviance :4353.962
## Residual D.o.F. :4014
## AIC :11731.33
##
##
## H2ORegressionMetrics: glm
## ** Reported on cross-validation data. **
## ** 5-fold cross-validation on training data (Metrics computed for combined holdout predictions) **
##
## MSE: 0.8564391
## RMSE: 0.92544
## MAE: 0.2207876
## RMSLE: 0.007850558
## Mean Residual Deviance : 0.8564391
## R^2 : 0.9928793
## Null Deviance :1966253
## Null D.o.F. :16346
## Residual Deviance :14000.21
## Residual D.o.F. :16344
## AIC :43865.45
##
##
## Cross-Validation Metrics Summary:
## mean sd cv_1_valid cv_2_valid
## mae 0.221593 0.019105 0.219056 0.216726
## mean_residual_deviance 0.856413 0.387850 0.722927 0.799693
## mse 0.856413 0.387850 0.722927 0.799693
## null_deviance 393250.700000 6534.264600 399864.000000 383053.560000
## r2 0.992897 0.003197 0.994094 0.993280
## residual_deviance 2795.913000 1266.910000 2361.081000 2574.213100
## rmse 0.903576 0.223504 0.850251 0.894256
## rmsle 0.007678 0.001820 0.007334 0.007628
## cv_3_valid cv_4_valid cv_5_valid
## mae 0.252671 0.219343 0.200170
## mean_residual_deviance 1.357798 1.075568 0.326075
## mse 1.357798 1.075568 0.326075
## null_deviance 394160.560000 397642.660000 391532.660000
## r2 0.988715 0.991163 0.997234
## residual_deviance 4448.146000 3513.881300 1082.243900
## rmse 1.165246 1.037096 0.571030
## rmsle 0.009819 0.008677 0.004932
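The headline metrics in the summary above are related to one another, which gives a quick consistency check. For a Gaussian model the residual deviance is the sum of squared errors, so MSE is approximately the residual deviance divided by the number of rows, RMSE is its square root, and R^2 is approximately one minus the ratio of residual to null deviance. The sketch below applies these relations to the training-data figures; taking the row count as null degrees of freedom plus one is an assumption based on the usual intercept-only null model.

```python
import math

# Values copied from the "Reported on training data" block.
residual_deviance = 13993.71
null_deviance = 1966126.0
null_dof = 16346  # assumed to equal n - 1 (intercept-only null model)

n = null_dof + 1
mse = residual_deviance / n
rmse = math.sqrt(mse)
r2 = 1.0 - residual_deviance / null_deviance

print(f"rows = {n}")
print(f"MSE  = {mse:.7f}  (reported 0.8560418)")
print(f"RMSE = {rmse:.7f}  (reported 0.9252253)")
print(f"R^2  = {r2:.7f}  (reported 0.9928826)")
```

RMSE is expressed in the same units as the close price, so it is the easiest of these figures to interpret against the price levels shown in the data listing.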
Actual Vs Predicted
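The predicted values in this comparison come from the stacked ensemble whose metalearner is summarized under Results Summary. Assuming the non-standardized `coefficients` column applies directly to the base-model predictions (the usual reading for a GLM with an identity link), the ensemble output is a nearly equal weighted sum of the GLM and GBM predictions. The function name `ensemble_prediction` and the example inputs below are hypothetical.

```python
# Metalearner coefficients copied from the model summary.
INTERCEPT = -0.057496
W_GLM = 0.517151
W_GBM = 0.483332


def ensemble_prediction(glm_pred, gbm_pred):
    """Combine two base-model predictions the way a linear metalearner would."""
    return INTERCEPT + W_GLM * glm_pred + W_GBM * gbm_pred


# The two weights sum to almost exactly one, so the ensemble is close to
# a weighted average of the base models.
weight_sum = W_GLM + W_GBM
print(f"weight sum = {weight_sum:.6f}")

# Hypothetical base-model predictions for a single row.
example = ensemble_prediction(110.0, 110.0)
print(f"ensemble prediction = {example:.6f}")
```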
| variable | relative_importance | scaled_importance | percentage |
|---|---|---|---|
| open | 40419040.0000 | 1.0000000 | 0.4830200 |
| interestAndDebtExpense | 13699305.0000 | 0.3389320 | 0.1637109 |
| low | 11024121.0000 | 0.2727457 | 0.1317416 |
| netInterestIncome | 7085295.0000 | 0.1752960 | 0.0846715 |
| comprehensiveIncomeNetOfTax | 5702817.5000 | 0.1410924 | 0.0681504 |
| high | 3845381.2500 | 0.0951379 | 0.0459535 |
| time | 955244.2500 | 0.0236335 | 0.0114155 |
| C1 | 386200.6875 | 0.0095549 | 0.0046152 |
| interestExpense | 131670.7500 | 0.0032576 | 0.0015735 |
| CONSUMER_SENTIMENT_VALUE | 88099.7656 | 0.0021797 | 0.0010528 |
| ebit | 62750.7617 | 0.0015525 | 0.0007499 |
| RETAIL_SALES_VALUE | 59967.8086 | 0.0014837 | 0.0007166 |
| volume | 41056.0781 | 0.0010158 | 0.0004906 |
| researchAndDevelopment | 35900.2773 | 0.0008882 | 0.0004290 |
| costOfRevenue | 26065.8438 | 0.0006449 | 0.0003115 |
| sellingGeneralAndAdministrative | 25070.3418 | 0.0006203 | 0.0002996 |
| depreciationAndAmortization | 24752.1914 | 0.0006124 | 0.0002958 |
| totalRevenue | 22012.6387 | 0.0005446 | 0.0002631 |
| ebitda | 10004.7773 | 0.0002475 | 0.0001196 |
| interestIncome | 8479.3467 | 0.0002098 | 0.0001013 |
| operatingIncome | 8211.8125 | 0.0002032 | 0.0000981 |
| costofGoodsAndServicesSold | 4510.1973 | 0.0001116 | 0.0000539 |
| otherNonOperatingIncome | 2490.0085 | 0.0000616 | 0.0000298 |
| incomeBeforeTax | 2267.3567 | 0.0000561 | 0.0000271 |
| operatingExpenses | 1908.6106 | 0.0000472 | 0.0000228 |
| depreciation | 1867.4905 | 0.0000462 | 0.0000223 |
| incomeTaxExpense | 1649.4888 | 0.0000408 | 0.0000197 |
| netIncome | 1628.8761 | 0.0000403 | 0.0000195 |
| grossProfit | 1529.6710 | 0.0000378 | 0.0000183 |
| netIncomeFromContinuingOperations | 551.2494 | 0.0000136 | 0.0000066 |
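The `scaled_importance` and `percentage` columns are both derived from `relative_importance`: the first divides each value by the largest one, and the second divides by the column total. The sketch below recomputes them from the table's own numbers, which also makes it easy to see how much of the total the top few predictors account for.

```python
# relative_importance values copied from the table above.
relative_importance = {
    "open": 40419040.0000,
    "interestAndDebtExpense": 13699305.0000,
    "low": 11024121.0000,
    "netInterestIncome": 7085295.0000,
    "comprehensiveIncomeNetOfTax": 5702817.5000,
    "high": 3845381.2500,
    "time": 955244.2500,
    "C1": 386200.6875,
    "interestExpense": 131670.7500,
    "CONSUMER_SENTIMENT_VALUE": 88099.7656,
    "ebit": 62750.7617,
    "RETAIL_SALES_VALUE": 59967.8086,
    "volume": 41056.0781,
    "researchAndDevelopment": 35900.2773,
    "costOfRevenue": 26065.8438,
    "sellingGeneralAndAdministrative": 25070.3418,
    "depreciationAndAmortization": 24752.1914,
    "totalRevenue": 22012.6387,
    "ebitda": 10004.7773,
    "interestIncome": 8479.3467,
    "operatingIncome": 8211.8125,
    "costofGoodsAndServicesSold": 4510.1973,
    "otherNonOperatingIncome": 2490.0085,
    "incomeBeforeTax": 2267.3567,
    "operatingExpenses": 1908.6106,
    "depreciation": 1867.4905,
    "incomeTaxExpense": 1649.4888,
    "netIncome": 1628.8761,
    "grossProfit": 1529.6710,
    "netIncomeFromContinuingOperations": 551.2494,
}

largest = max(relative_importance.values())
total = sum(relative_importance.values())

# scaled_importance: relative to the most important variable.
scaled = {name: value / largest for name, value in relative_importance.items()}
# percentage: share of the column total.
percentage = {name: value / total for name, value in relative_importance.items()}

# Cumulative share of the three highest-ranked variables.
top3 = sorted(percentage, key=percentage.get, reverse=True)[:3]
top3_share = sum(percentage[name] for name in top3)

print(top3)
print(f"top-3 share of total importance = {top3_share:.4f}")
```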
Stock investments are subject to market (systematic) risk: there is no way to know with certainty what will happen in the future or whether a given asset will rise or fall in value. Because the market cannot be accurately predicted or completely controlled, no investment is risk-free. Please do your due diligence before investing.