Stock Prediction Proposal

Team 8:
Ankit
Uday Kolli
Sai Charan Pappala
Vamshidhar Reddy Kanamanthareddy

Problem Description

Stock prices are highly volatile, and every investor wants to anticipate their future behavior in order to invest profitably. Predicting and analyzing stock prices is one of the most difficult tasks to carry out, because prices respond to many factors, such as market fluctuations, news, consumer sentiment, and events like the pandemic. We use some of these factors as predictors to forecast the stock price of IBM.

Recap Of Proposed Data Analytics Plan

As part of our analytics plan, we planned to perform the following actions:

Key Peer Comments Summary

Data Summary

## 'data.frame':    20365 obs. of  30 variables:
##  $ time                             : chr  "1/27/2021 15:00" "1/27/2021 9:45" "1/27/2021 15:30" "1/27/2021 15:15" ...
##  $ open                             : num  112 110 112 111 109 ...
##  $ high                             : num  112 111 112 112 109 ...
##  $ low                              : num  111 109 111 111 108 ...
##  $ close                            : num  112 110 111 112 109 ...
##  $ volume                           : int  471830 1364954 431226 434875 2090 21911 16405 645 1140 4480 ...
##  $ grossProfit                      : num  10523000000 10523000000 10523000000 10523000000 10523000000 ...
##  $ totalRevenue                     : num  20367000000 20367000000 20367000000 20367000000 20367000000 ...
##  $ costOfRevenue                    : num  9844000000 9844000000 9844000000 9844000000 9844000000 ...
##  $ costofGoodsAndServicesSold       : int  181000000 181000000 181000000 181000000 181000000 181000000 181000000 181000000 181000000 181000000 ...
##  $ operatingIncome                  : num  -4000000000 -4000000000 -4000000000 -4000000000 -4000000000 -4000000000 -4000000000 -4000000000 -4000000000 -4000000000 ...
##  $ sellingGeneralAndAdministrative  : num  7233000000 7233000000 7233000000 7233000000 7233000000 ...
##  $ researchAndDevelopment           : int  1611000000 1611000000 1611000000 1611000000 1611000000 1611000000 1611000000 1611000000 1611000000 1611000000 ...
##  $ operatingExpenses                : num  2000000000 2000000000 2000000000 2000000000 2000000000 2000000000 2000000000 2000000000 2000000000 2000000000 ...
##  $ netInterestIncome                : int  -317000000 -317000000 -317000000 -317000000 -317000000 -317000000 -317000000 -317000000 -317000000 -317000000 ...
##  $ interestIncome                   : int  15000000 15000000 15000000 15000000 15000000 15000000 15000000 15000000 15000000 15000000 ...
##  $ interestExpense                  : int  317000000 317000000 317000000 317000000 317000000 317000000 317000000 317000000 317000000 317000000 ...
##  $ otherNonOperatingIncome          : int  -247000000 -247000000 -247000000 -247000000 -247000000 -247000000 -247000000 -247000000 -247000000 -247000000 ...
##  $ depreciation                     : int  1089000000 1089000000 1089000000 1089000000 1089000000 1089000000 1089000000 1089000000 1089000000 1089000000 ...
##  $ depreciationAndAmortization      : int  610000000 610000000 610000000 610000000 610000000 610000000 610000000 610000000 610000000 610000000 ...
##  $ incomeBeforeTax                  : int  1380000000 1380000000 1380000000 1380000000 1380000000 1380000000 1380000000 1380000000 1380000000 1380000000 ...
##  $ incomeTaxExpense                 : int  24000000 24000000 24000000 24000000 24000000 24000000 24000000 24000000 24000000 24000000 ...
##  $ interestAndDebtExpense           : int  317000000 317000000 317000000 317000000 317000000 317000000 317000000 317000000 317000000 317000000 ...
##  $ netIncomeFromContinuingOperations: int  1264000000 1264000000 1264000000 1264000000 1264000000 1264000000 1264000000 1264000000 1264000000 1264000000 ...
##  $ comprehensiveIncomeNetOfTax      : num  603000000 603000000 603000000 603000000 603000000 603000000 603000000 603000000 603000000 603000000 ...
##  $ ebit                             : num  1697000000 1697000000 1697000000 1697000000 1697000000 ...
##  $ ebitda                           : num  2307000000 2307000000 2307000000 2307000000 2307000000 ...
##  $ netIncome                        : num  1356000000 1356000000 1356000000 1356000000 1356000000 ...
##  $ RETAIL_SALES_VALUE               : int  464362 464362 464362 464362 464362 464362 464362 464362 464362 464362 ...
##  $ CONSUMER_SENTIMENT_VALUE         : num  79 79 79 79 79 79 79 79 79 79 ...
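The summary shows that `time` is stored as character strings and that the first few rows are not in chronological order (15:00, 9:45, 15:30, 15:15). A minimal sketch of parsing and sorting those timestamps is given here in Python for illustration. The format string is an assumption inferred from the four sample values (month/day/year, 24-hour clock), and `parse_bar_time` is a hypothetical helper, not part of the original analysis.

```python
from datetime import datetime

# Assumed layout, inferred from samples such as "1/27/2021 15:00":
# month/day/year followed by a 24-hour clock time.
TIME_FORMAT = "%m/%d/%Y %H:%M"


def parse_bar_time(text):
    """Convert one value of the `time` column into a datetime object."""
    return datetime.strptime(text, TIME_FORMAT)


# The four sample values shown in the data summary above.
sample_times = [
    "1/27/2021 15:00",
    "1/27/2021 9:45",
    "1/27/2021 15:30",
    "1/27/2021 15:15",
]

# Sort chronologically so that later steps see the bars in time order.
parsed = sorted(parse_bar_time(t) for t in sample_times)
for ts in parsed:
    print(ts.isoformat(sep=" "))
```

Sorting by the parsed value rather than by the raw string matters here, because a plain string sort would place "15:00" before "9:45".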

Data Visualization

Data Visualization (Contd.)

Close Price vs Time

H2O AutoML Results

Results Summary

## Model Details:
## ==============
## 
## H2ORegressionModel: glm
## Model ID:  metalearner_AUTO_StackedEnsemble_BestOfFamily_1_AutoML_1_20220427_145008 
## GLM Model: summary
##     family     link                               regularization
## 1 gaussian identity Elastic Net (alpha = 0.5, lambda = 0.02387 )
##                                                                  lambda_search
## 1 nlambda = 100, lambda.max = 238.7, lambda.min = 0.02387, lambda.1se = 4.7961
##   number_of_predictors_total number_of_active_predictors number_of_iterations
## 1                          2                           2                  100
##                                                              training_frame
## 1 levelone_training_StackedEnsemble_BestOfFamily_1_AutoML_1_20220427_145008
## 
## Coefficients: glm coefficients
##                            names coefficients standardized_coefficients
## 1                      Intercept    -0.057496                118.350181
## 2 GLM_1_AutoML_1_20220427_145008     0.517151                  5.648339
## 3 GBM_1_AutoML_1_20220427_145008     0.483332                  5.277908
## 
## H2ORegressionMetrics: glm
## ** Reported on training data. **
## 
## MSE:  0.8560418
## RMSE:  0.9252253
## MAE:  0.220518
## RMSLE:  0.007848946
## Mean Residual Deviance :  0.8560418
## R^2 :  0.9928826
## Null Deviance :1966126
## Null D.o.F. :16346
## Residual Deviance :13993.71
## Residual D.o.F. :16344
## AIC :43857.86
## 
## 
## H2ORegressionMetrics: glm
## ** Reported on validation data. **
## 
## MSE:  1.083884
## RMSE:  1.041098
## MAE:  0.2086107
## RMSLE:  0.008941312
## Mean Residual Deviance :  1.083884
## R^2 :  0.9909645
## Null Deviance :482192.2
## Null D.o.F. :4016
## Residual Deviance :4353.962
## Residual D.o.F. :4014
## AIC :11731.33
## 
## 
## H2ORegressionMetrics: glm
## ** Reported on cross-validation data. **
## ** 5-fold cross-validation on training data (Metrics computed for combined holdout predictions) **
## 
## MSE:  0.8564391
## RMSE:  0.92544
## MAE:  0.2207876
## RMSLE:  0.007850558
## Mean Residual Deviance :  0.8564391
## R^2 :  0.9928793
## Null Deviance :1966253
## Null D.o.F. :16346
## Residual Deviance :14000.21
## Residual D.o.F. :16344
## AIC :43865.45
## 
## 
## Cross-Validation Metrics Summary: 
##                                 mean          sd    cv_1_valid    cv_2_valid
## mae                         0.221593    0.019105      0.219056      0.216726
## mean_residual_deviance      0.856413    0.387850      0.722927      0.799693
## mse                         0.856413    0.387850      0.722927      0.799693
## null_deviance          393250.700000 6534.264600 399864.000000 383053.560000
## r2                          0.992897    0.003197      0.994094      0.993280
## residual_deviance        2795.913000 1266.910000   2361.081000   2574.213100
## rmse                        0.903576    0.223504      0.850251      0.894256
## rmsle                       0.007678    0.001820      0.007334      0.007628
##                           cv_3_valid    cv_4_valid    cv_5_valid
## mae                         0.252671      0.219343      0.200170
## mean_residual_deviance      1.357798      1.075568      0.326075
## mse                         1.357798      1.075568      0.326075
## null_deviance          394160.560000 397642.660000 391532.660000
## r2                          0.988715      0.991163      0.997234
## residual_deviance        4448.146000   3513.881300   1082.243900
## rmse                        1.165246      1.037096      0.571030
## rmsle                       0.009819      0.008677      0.004932
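The reported metrics can be cross-checked against one another. For a Gaussian model the residual deviance is the sum of squared errors, so MSE should equal the residual deviance divided by the number of rows (null degrees of freedom plus one), RMSE should be its square root, and R^2 should be close to one minus the ratio of residual to null deviance. The sketch below, written in Python for illustration, recomputes these from the figures printed above. The helper `recompute` is hypothetical, and small differences in the last digits are expected because the printed values are rounded.

```python
import math

# Figures copied from the training and validation metric blocks above.
REPORTED = {
    "training": {
        "mse": 0.8560418, "rmse": 0.9252253, "r2": 0.9928826,
        "null_deviance": 1966126.0, "null_dof": 16346,
        "residual_deviance": 13993.71,
    },
    "validation": {
        "mse": 1.083884, "rmse": 1.041098, "r2": 0.9909645,
        "null_deviance": 482192.2, "null_dof": 4016,
        "residual_deviance": 4353.962,
    },
}


def recompute(metrics):
    """Rebuild MSE, RMSE and R^2 from the deviance figures alone."""
    # The intercept-only (null) model uses one degree of freedom.
    n_rows = metrics["null_dof"] + 1
    mse = metrics["residual_deviance"] / n_rows
    r2 = 1.0 - metrics["residual_deviance"] / metrics["null_deviance"]
    return {"mse": mse, "rmse": math.sqrt(mse), "r2": r2}


for split, metrics in REPORTED.items():
    derived = recompute(metrics)
    for name, value in derived.items():
        print(f"{split:10s} {name:4s} "
              f"reported={metrics[name]:.7f} recomputed={value:.7f}")
```

The implied row counts are 16,347 for training and 4,017 for validation.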

H2O AutoML Results (Contd.)

Actual vs Predicted
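The predicted values come from the stacked ensemble described in the Results Summary, whose metalearner is a GLM with two predictors: the outputs of the GLM and GBM base models. A minimal sketch of how such a metalearner combines the two base predictions follows, in Python for illustration. The coefficients are copied from the coefficient table; the base-model predictions passed in at the end are hypothetical numbers chosen only to show the arithmetic.

```python
# Metalearner coefficients copied from the "Coefficients: glm coefficients" table.
INTERCEPT = -0.057496
GLM_WEIGHT = 0.517151
GBM_WEIGHT = 0.483332


def ensemble_prediction(glm_pred, gbm_pred):
    """Linear combination of the two base-model predictions."""
    return INTERCEPT + GLM_WEIGHT * glm_pred + GBM_WEIGHT * gbm_pred


# Hypothetical base-model predictions for a single bar (illustration only).
example_glm_pred = 111.8
example_gbm_pred = 112.1
print(ensemble_prediction(example_glm_pred, example_gbm_pred))
```

The two weights sum to roughly 1.0005 and the intercept is close to zero, so the ensemble output is close to a weighted average of the two base models, with slightly more weight on the GLM.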

Key Takeaways

variable                           relative_importance  scaled_importance  percentage
open                                     40419040.0000          1.0000000   0.4830200
interestAndDebtExpense                   13699305.0000          0.3389320   0.1637109
low                                      11024121.0000          0.2727457   0.1317416
netInterestIncome                         7085295.0000          0.1752960   0.0846715
comprehensiveIncomeNetOfTax               5702817.5000          0.1410924   0.0681504
high                                      3845381.2500          0.0951379   0.0459535
time                                       955244.2500          0.0236335   0.0114155
C1                                         386200.6875          0.0095549   0.0046152
interestExpense                            131670.7500          0.0032576   0.0015735
CONSUMER_SENTIMENT_VALUE                    88099.7656          0.0021797   0.0010528
ebit                                        62750.7617          0.0015525   0.0007499
RETAIL_SALES_VALUE                          59967.8086          0.0014837   0.0007166
volume                                      41056.0781          0.0010158   0.0004906
researchAndDevelopment                      35900.2773          0.0008882   0.0004290
costOfRevenue                               26065.8438          0.0006449   0.0003115
sellingGeneralAndAdministrative             25070.3418          0.0006203   0.0002996
depreciationAndAmortization                 24752.1914          0.0006124   0.0002958
totalRevenue                                22012.6387          0.0005446   0.0002631
ebitda                                      10004.7773          0.0002475   0.0001196
interestIncome                               8479.3467          0.0002098   0.0001013
operatingIncome                              8211.8125          0.0002032   0.0000981
costofGoodsAndServicesSold                   4510.1973          0.0001116   0.0000539
otherNonOperatingIncome                      2490.0085          0.0000616   0.0000298
incomeBeforeTax                              2267.3567          0.0000561   0.0000271
operatingExpenses                            1908.6106          0.0000472   0.0000228
depreciation                                 1867.4905          0.0000462   0.0000223
incomeTaxExpense                             1649.4888          0.0000408   0.0000197
netIncome                                    1628.8761          0.0000403   0.0000195
grossProfit                                  1529.6710          0.0000378   0.0000183
netIncomeFromContinuingOperations             551.2494          0.0000136   0.0000066
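The columns of the importance table are related by simple arithmetic: `scaled_importance` is `relative_importance` divided by the largest value in that column, and `percentage` is each variable's share of the total. The sketch below, in Python for illustration, checks the first relationship on the six highest-ranked rows and sums their reported percentages. All numbers are copied from the table; the variable names in the code are hypothetical.

```python
# (variable, relative_importance, scaled_importance, percentage)
# copied from the six highest-ranked rows of the table above.
TOP_ROWS = [
    ("open", 40419040.0, 1.0000000, 0.4830200),
    ("interestAndDebtExpense", 13699305.0, 0.3389320, 0.1637109),
    ("low", 11024121.0, 0.2727457, 0.1317416),
    ("netInterestIncome", 7085295.0, 0.1752960, 0.0846715),
    ("comprehensiveIncomeNetOfTax", 5702817.5, 0.1410924, 0.0681504),
    ("high", 3845381.25, 0.0951379, 0.0459535),
]

# Scaled importance: each relative importance divided by the maximum.
max_importance = max(row[1] for row in TOP_ROWS)
recomputed_scaled = {row[0]: row[1] / max_importance for row in TOP_ROWS}

# Share of total importance held by the six rows, and by the three
# same-bar price columns (open, low, high) alone.
top_six_share = sum(row[3] for row in TOP_ROWS)
price_share = sum(row[3] for row in TOP_ROWS if row[0] in {"open", "low", "high"})

for name, _, reported_scaled, _ in TOP_ROWS:
    print(f"{name:28s} reported={reported_scaled:.7f} "
          f"recomputed={recomputed_scaled[name]:.7f}")
print(f"top six share: {top_six_share:.4f}")
print(f"price columns share: {price_share:.4f}")
```

By the reported percentages, the six highest-ranked variables account for roughly 98% of total importance, and the three price columns `open`, `low` and `high` account for about 66%.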

Stock investments are subject to market (systematic) risk, because there is no reliable way to know what will happen in the future or whether a given asset will rise or fall in value. Because the market cannot be accurately predicted or completely controlled, no investment is risk-free.
Please do your due diligence before investing.