This report analyzes the direction of intraday 5-minute stock movements
(TargetVariable, coded 0/1) and fits a logistic regression on the four
Variable74 series, using both their levels and their lag-12 differences
as predictors. The response is defined as the 13-period lag of the target.

We check missingness by column and apply median imputation to numeric predictors (the target is not imputed).
| variable | missing_rate |
|---|---|
| Variable167LAST | 0.0133401 |
| Variable168LAST | 0.0133401 |
| Variable169LAST | 0.0133401 |
| Variable170LAST | 0.0133401 |
| Variable171LAST | 0.0133401 |
| Variable172LAST | 0.0133401 |
| Variable173LAST | 0.0133401 |
| Variable174LAST | 0.0133401 |
| Variable175LAST | 0.0133401 |
| Variable176LAST | 0.0133401 |
| Variable177LAST | 0.0133401 |
| Variable178LAST | 0.0133401 |
| Variable179LAST | 0.0133401 |
| Variable180LAST | 0.0133401 |
| Variable157OPEN | 0.0001689 |
| Variable157HIGH | 0.0001689 |
| Variable157LOW | 0.0001689 |
| Variable157LAST | 0.0001689 |
| Timestamp | 0.0000000 |
| TargetVariable | 0.0000000 |
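
The missingness check and median imputation described above can be sketched as follows. This is a minimal illustration in plain Python, not the report's own code: the column names are borrowed from the table, the values are made up, and `columns`, `missing_rate`, and `imputed` are hypothetical names. The report does not say whether medians were computed on the training rows only or on the full data; the sketch uses whatever rows it is given.

```python
from statistics import median

# Toy columns standing in for the report's data; names come from the table
# above, values are invented. None marks a missing observation.
columns = {
    "Variable167LAST": [1.0, None, 3.0, 5.0],
    "Variable157OPEN": [10.0, 11.0, None, 13.0],
    "TargetVariable": [0, 1, 1, 0],
}

# Share of missing values per column.
missing_rate = {
    name: sum(v is None for v in values) / len(values)
    for name, values in columns.items()
}

# Median-impute the predictors only; the target is never imputed.
imputed = {}
for name, values in columns.items():
    if name == "TargetVariable":
        imputed[name] = list(values)
        continue
    fill = median(v for v in values if v is not None)
    imputed[name] = [fill if v is None else v for v in values]
```

Leaving the target out of the imputation loop mirrors the statement that only numeric predictors are filled.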
The test set is the last 2539 periods (each period is 5 minutes).
| test_periods | minutes_in_test | training_prop_target_1 |
|---|---|---|
| 2539 | 12695 | 0.6027195 |
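
The split and the summary row above can be reproduced with a short sketch, assuming the rows are already sorted by Timestamp. The helper names `time_ordered_split` and `share_of_ones` are hypothetical and the toy series is invented; only the constants 2539 and 5 come from the report.

```python
PERIOD_MINUTES = 5
TEST_PERIODS = 2539


def time_ordered_split(rows, test_periods=TEST_PERIODS):
    """Split chronologically sorted rows; the last `test_periods` form the test set."""
    if not 0 < test_periods < len(rows):
        raise ValueError("test_periods must be between 1 and len(rows) - 1")
    return rows[:-test_periods], rows[-test_periods:]


def share_of_ones(labels):
    """Proportion of labels equal to 1 (the training_prop_target_1 column)."""
    return sum(labels) / len(labels)


# Minutes covered by the test set: 2539 periods of 5 minutes each.
minutes_in_test = TEST_PERIODS * PERIOD_MINUTES

# Toy check with a hypothetical 10-row series and a 3-row test set.
train, test = time_ordered_split(list(range(10)), test_periods=3)
```

Splitting by position rather than at random keeps every test observation later in time than every training observation.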

Model specification:

- Y = TargetVariable lagged by 13 periods
- X = the four Variable74 levels (OPEN, HIGH, LOW, LAST_PRICE) plus their lag-12 differences (d74OPEN, d74HIGH, d74LOW, d74LAST)

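
One way to build these variables is sketched below. It assumes the literal reading of "lag": the value at period t is taken from t - 13 for the response, and the difference is the value at t minus the value at t - 12. The report does not show its code, so the shift direction, the helper names `lag` and `lag_diff`, and the toy series are all assumptions.

```python
def lag(series, k):
    """Shift a series by k periods: element t holds series[t - k], or None."""
    if k < 0:
        raise ValueError("k must be non-negative")
    n = len(series)
    return [None] * min(k, n) + list(series[: max(n - k, 0)])


def lag_diff(series, k):
    """Lag-k difference: series[t] - series[t - k], None where history is missing."""
    return [
        None if prev is None else cur - prev
        for cur, prev in zip(series, lag(series, k))
    ]


# Toy data: 20 periods of a hypothetical price level and a 0/1 target.
price = [100.0 + t for t in range(20)]
target = [t % 2 for t in range(20)]

d_price = lag_diff(price, 12)  # plays the role of d74OPEN, d74HIGH, ...
y = lag(target, 13)            # response under the assumed lag direction

# Keep only periods where both the difference and the response are defined.
rows = [
    (price[t], d_price[t], y[t])
    for t in range(len(price))
    if d_price[t] is not None and y[t] is not None
]
```

The first 13 periods drop out because they lack enough history for the lagged response.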
| term | estimate | std.error | statistic | p.value | odds_ratio | conf.low | conf.high |
|---|---|---|---|---|---|---|---|
| (Intercept) | -6.3971 | 1.7212 | -3.7166 | 0.0002 | 1.700000e-03 | 0.0001 | 4.860000e-02 |
| Variable74OPEN | -2.0705 | 9.8477 | -0.2103 | 0.8335 | 1.261000e-01 | 0.0000 | 3.042871e+07 |
| Variable74HIGH | 30.3154 | 10.3353 | 2.9332 | 0.0034 | 1.464923e+13 | 23347.3915 | 9.191598e+21 |
| Variable74LOW | -26.8219 | 10.8340 | -2.4757 | 0.0133 | 0.000000e+00 | 0.0000 | 3.700000e-03 |
| Variable74LAST_PRICE | -0.8619 | 10.3361 | -0.0834 | 0.9335 | 4.224000e-01 | 0.0000 | 2.654536e+08 |
| d74OPEN | -14.4349 | 6.7643 | -2.1340 | 0.0328 | 0.000000e+00 | 0.0000 | 3.082000e-01 |
| d74HIGH | -48.2618 | 7.4886 | -6.4447 | 0.0000 | 0.000000e+00 | 0.0000 | 0.000000e+00 |
| d74LOW | -14.2190 | 7.3647 | -1.9307 | 0.0535 | 0.000000e+00 | 0.0000 | 1.240900e+00 |
| d74LAST | 17.5875 | 7.1894 | 2.4463 | 0.0144 | 4.346584e+07 | 32.9944 | 5.726054e+13 |
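
The derived columns in the coefficient table can be checked from `estimate` and `std.error` alone. The sketch below assumes the odds ratio is exp(estimate) and that the interval is a Wald interval, exp(estimate ± 1.96 × std.error). The tabulated values are consistent with that formula, but the report does not state how they were computed, so treat it as an inference; `odds_ratio_row` is a hypothetical helper.

```python
import math


def odds_ratio_row(estimate, std_error, z=1.96):
    """Exponentiate a logit coefficient and its Wald interval (assumed method)."""
    return {
        "statistic": estimate / std_error,
        "odds_ratio": math.exp(estimate),
        "conf_low": math.exp(estimate - z * std_error),
        "conf_high": math.exp(estimate + z * std_error),
    }


# Estimate and standard error copied from the Variable74HIGH row.
high = odds_ratio_row(30.3154, 10.3353)

# Same check for the intercept row.
intercept = odds_ratio_row(-6.3971, 1.7212)
```

Each odds ratio describes the multiplicative change in the odds for a one-unit increase in the predictor, which is why coefficients of this size produce such extreme values.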
Classification rule: predict 1 when the fitted probability is greater than 0.5, and 0 otherwise.
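
As a minimal sketch of this rule, the function below turns a linear predictor (log-odds) into a probability with the logistic function and applies the strict 0.5 cutoff. The names `sigmoid` and `predict_class` are hypothetical, not taken from the report.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def predict_class(log_odds, threshold=0.5):
    """Return 1 when the fitted probability is strictly above the threshold, else 0."""
    return 1 if sigmoid(log_odds) > threshold else 0
```

Because the comparison is strict, a fitted probability of exactly 0.5 is classified as 0.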
Confusion matrix template:

- a = TP, b = FP, c = FN, d = TN

## $confusion_matrix
##                Actual Value
## Predicted Value    0    1
##               0  812   59
##               1  149 1519
##
## $Accuracy
## [1] 0.918078
##
## $Recall_per_template
## [1] 0.9106715
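
The reported figures can be recomputed from the four cells. The sketch below assumes class 1 is the positive class, so that a = TP = 1519, b = FP = 149, c = FN = 59, and d = TN = 812. Under that mapping, (a + d) / (a + b + c + d) reproduces `$Accuracy` and a / (a + b) reproduces `$Recall_per_template`. Note that a / (a + b) = TP / (TP + FP) is the quantity conventionally called precision, while TP / (TP + FN) comes out at about 0.9626.

```python
# Cell counts read from the confusion matrix, taking class 1 as positive
# (an assumption; the report does not name the positive class explicitly).
a = 1519  # TP: predicted 1, actual 1
b = 149   # FP: predicted 1, actual 0
c = 59    # FN: predicted 0, actual 1
d = 812   # TN: predicted 0, actual 0

total = a + b + c + d          # equals the number of test periods
accuracy = (a + d) / total     # compare with $Accuracy
ratio_a_ab = a / (a + b)       # compare with $Recall_per_template
tp_over_tp_fn = a / (a + c)    # TP / (TP + FN), for comparison

print(f"total={total} accuracy={accuracy:.6f} "
      f"a/(a+b)={ratio_a_ab:.7f} TP/(TP+FN)={tp_over_tp_fn:.4f}")
```

The cell total matching the 2539 test periods is a quick consistency check that no test rows were dropped before scoring.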