Here is a run-down of the different variations of Data, Years, Outcome variable and Tickers that I’ve run through the NB classifier. For all of these I report out-of-sample (validation set) performance in terms of AUC, accuracy, precision and recall.
I have split this into two tables (since it’s quite a few variations).
Table 1 uses the outcome over-reaction = 1(drift < 0).

Table 2 uses the stricter outcome over-reaction = 1(drift < 0 - 1/2 SD(drift)).

Another important distinction is whether or not we filter to major events. Tables 3 and 4 summarize the model variations with the data subset to key events only.
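
The two outcome definitions above can be sketched in a few lines of Python. This is only an illustration, not the code used for the analysis: the function name `label_over_reaction`, the example drift values, and the choice of the sample standard deviation computed over all drifts passed in are assumptions.

```python
from statistics import stdev


def label_over_reaction(drifts, sd_multiple=0.0):
    """Return 1 where drift < 0 - sd_multiple * SD(drift), else 0."""
    # Assumption: SD is the sample standard deviation of the drifts passed in.
    threshold = 0.0 - sd_multiple * stdev(drifts) if sd_multiple else 0.0
    return [1 if d < threshold else 0 for d in drifts]


# Hypothetical drift values, for illustration only.
drifts = [-0.08, -0.01, 0.02, 0.05, -0.03]

loose = label_over_reaction(drifts)                    # 1(drift < 0)
strict = label_over_reaction(drifts, sd_multiple=0.5)  # 1(drift < 0 - 1/2 SD(drift))
```

With the stricter definition, small negative drifts (here -0.01) no longer count as over-reaction, so the positive class becomes rarer.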
I can produce ROC curves and confusion matrices as well (since I’ve saved the predictions), but I think this is already a ton of information.
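
For reference, here is a rough sketch of how the four reported metrics and a confusion matrix could be recomputed from saved predictions. The function names, the 0.5 cutoff, and the example labels and scores are all assumptions made for illustration; AUC is computed as the share of positive/negative pairs that the score ranks correctly, with ties counted as one half.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def threshold_metrics(y_true, scores, cutoff=0.5):
    """Accuracy, precision and recall at a given score cutoff (assumed 0.5)."""
    y_pred = [1 if s >= cutoff else 0 for s in scores]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def auc(y_true, scores):
    """Probability that a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical validation labels and predicted probabilities.
y_true = [1, 0, 1, 0, 1, 0]
scores = [0.9, 0.4, 0.6, 0.7, 0.3, 0.2]

m = threshold_metrics(y_true, scores)
a = auc(y_true, scores)
```
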

Table 1: outcome drift < 0, all events.

| Model.variation | Data | Years | Outcome.variable | Tickers |
|---|---|---|---|---|
| var_1 | Title | All | 90 day drift < 0 | All companies |
| var_2 | Full text | All | 90 day drift < 0 | All companies |
| var_3 | Title | 2015 to 2018 | 90 day drift < 0 | All companies |
| var_4 | Full text | 2015 to 2018 | 90 day drift < 0 | All companies |
| var_5 | Title | All | 90 day drift < 0 | Excluding top 20 |
| var_6 | Full text | All | 90 day drift < 0 | Excluding top 20 |
| var_7 | Title | All | 90 day drift < 0 | Top 20 |
| var_8 | Full text | All | 90 day drift < 0 | Top 20 |
| var_9 | Title | 2015 to 2018 | 90 day drift < 0 | Top 20 |
| var_10 | Full text | 2015 to 2018 | 90 day drift < 0 | Top 20 |
| var_11 | Title | 2015 to 2018 | 90 day drift < 0 | Excluding top 20 |
| var_12 | Full text | 2015 to 2018 | 90 day drift < 0 | Excluding top 20 |
| var_13 | Title | All | 3 day drift < 0 | All companies |
| var_14 | Full text | All | 3 day drift < 0 | All companies |
| var_15 | Title | 2015 to 2018 | 3 day drift < 0 | All companies |
| var_16 | Full text | 2015 to 2018 | 3 day drift < 0 | All companies |
| var_17 | Title | All | 3 day drift < 0 | Excluding top 20 |
| var_18 | Full text | All | 3 day drift < 0 | Excluding top 20 |
| var_19 | Title | All | 3 day drift < 0 | Top 20 |
| var_20 | Full text | All | 3 day drift < 0 | Top 20 |
| var_21 | Title | 2015 to 2018 | 3 day drift < 0 | Top 20 |
| var_22 | Full text | 2015 to 2018 | 3 day drift < 0 | Top 20 |
| var_23 | Title | 2015 to 2018 | 3 day drift < 0 | Excluding top 20 |
| var_24 | Full text | 2015 to 2018 | 3 day drift < 0 | Excluding top 20 |

Table 2: outcome drift < 0-0.5*SD, all events.

| Model.variation | Data | Years | Outcome.variable | Tickers |
|---|---|---|---|---|
| var_25 | Title | All | 90 day drift < 0-0.5*SD | All companies |
| var_26 | Full text | All | 90 day drift < 0-0.5*SD | All companies |
| var_27 | Title | 2015 to 2018 | 90 day drift < 0-0.5*SD | All companies |
| var_28 | Full text | 2015 to 2018 | 90 day drift < 0-0.5*SD | All companies |
| var_29 | Title | All | 90 day drift < 0-0.5*SD | Excluding top 20 |
| var_30 | Full text | All | 90 day drift < 0-0.5*SD | Excluding top 20 |
| var_31 | Title | All | 90 day drift < 0-0.5*SD | Top 20 |
| var_32 | Full text | All | 90 day drift < 0-0.5*SD | Top 20 |
| var_33 | Title | 2015 to 2018 | 90 day drift < 0-0.5*SD | Top 20 |
| var_34 | Full text | 2015 to 2018 | 90 day drift < 0-0.5*SD | Top 20 |
| var_35 | Title | 2015 to 2018 | 90 day drift < 0-0.5*SD | Excluding top 20 |
| var_36 | Full text | 2015 to 2018 | 90 day drift < 0-0.5*SD | Excluding top 20 |
| var_37 | Title | All | 3 day drift < 0-0.5*SD | All companies |
| var_38 | Full text | All | 3 day drift < 0-0.5*SD | All companies |
| var_39 | Title | 2015 to 2018 | 3 day drift < 0-0.5*SD | All companies |
| var_40 | Full text | 2015 to 2018 | 3 day drift < 0-0.5*SD | All companies |
| var_41 | Title | All | 3 day drift < 0-0.5*SD | Excluding top 20 |
| var_42 | Full text | All | 3 day drift < 0-0.5*SD | Excluding top 20 |
| var_43 | Title | All | 3 day drift < 0-0.5*SD | Top 20 |
| var_44 | Full text | All | 3 day drift < 0-0.5*SD | Top 20 |
| var_45 | Title | 2015 to 2018 | 3 day drift < 0-0.5*SD | Top 20 |
| var_46 | Full text | 2015 to 2018 | 3 day drift < 0-0.5*SD | Top 20 |
| var_47 | Title | 2015 to 2018 | 3 day drift < 0-0.5*SD | Excluding top 20 |
| var_48 | Full text | 2015 to 2018 | 3 day drift < 0-0.5*SD | Excluding top 20 |

Table 3: outcome drift < 0, key events only.

| Model.variation | Data | Years | Outcome.variable | Tickers |
|---|---|---|---|---|
| var_49 | Title | All | 90 day drift < 0 | All companies |
| var_50 | Full text | All | 90 day drift < 0 | All companies |
| var_51 | Title | 2015 to 2018 | 90 day drift < 0 | All companies |
| var_52 | Full text | 2015 to 2018 | 90 day drift < 0 | All companies |
| var_53 | Title | All | 90 day drift < 0 | Excluding top 20 |
| var_54 | Full text | All | 90 day drift < 0 | Excluding top 20 |
| var_55 | Title | All | 90 day drift < 0 | Top 20 |
| var_56 | Full text | All | 90 day drift < 0 | Top 20 |
| var_57 | Title | 2015 to 2018 | 90 day drift < 0 | Top 20 |
| var_58 | Full text | 2015 to 2018 | 90 day drift < 0 | Top 20 |
| var_59 | Title | 2015 to 2018 | 90 day drift < 0 | Excluding top 20 |
| var_60 | Full text | 2015 to 2018 | 90 day drift < 0 | Excluding top 20 |
| var_61 | Title | All | 3 day drift < 0 | All companies |
| var_62 | Full text | All | 3 day drift < 0 | All companies |
| var_63 | Title | 2015 to 2018 | 3 day drift < 0 | All companies |
| var_64 | Full text | 2015 to 2018 | 3 day drift < 0 | All companies |
| var_65 | Title | All | 3 day drift < 0 | Excluding top 20 |
| var_66 | Full text | All | 3 day drift < 0 | Excluding top 20 |
| var_67 | Title | All | 3 day drift < 0 | Top 20 |
| var_68 | Full text | All | 3 day drift < 0 | Top 20 |
| var_69 | Title | 2015 to 2018 | 3 day drift < 0 | Top 20 |
| var_70 | Full text | 2015 to 2018 | 3 day drift < 0 | Top 20 |
| var_71 | Title | 2015 to 2018 | 3 day drift < 0 | Excluding top 20 |
| var_72 | Full text | 2015 to 2018 | 3 day drift < 0 | Excluding top 20 |

Table 4: outcome drift < 0-0.5*SD, key events only.

| Model.variation | Data | Years | Outcome.variable | Tickers |
|---|---|---|---|---|
| var_73 | Title | All | 90 day drift < 0-0.5*SD | All companies |
| var_74 | Full text | All | 90 day drift < 0-0.5*SD | All companies |
| var_75 | Title | 2015 to 2018 | 90 day drift < 0-0.5*SD | All companies |
| var_76 | Full text | 2015 to 2018 | 90 day drift < 0-0.5*SD | All companies |
| var_77 | Title | All | 90 day drift < 0-0.5*SD | Excluding top 20 |
| var_78 | Full text | All | 90 day drift < 0-0.5*SD | Excluding top 20 |
| var_79 | Title | All | 90 day drift < 0-0.5*SD | Top 20 |
| var_80 | Full text | All | 90 day drift < 0-0.5*SD | Top 20 |
| var_81 | Title | 2015 to 2018 | 90 day drift < 0-0.5*SD | Top 20 |
| var_82 | Full text | 2015 to 2018 | 90 day drift < 0-0.5*SD | Top 20 |
| var_83 | Title | 2015 to 2018 | 90 day drift < 0-0.5*SD | Excluding top 20 |
| var_84 | Full text | 2015 to 2018 | 90 day drift < 0-0.5*SD | Excluding top 20 |
| var_85 | Title | All | 3 day drift < 0-0.5*SD | All companies |
| var_86 | Full text | All | 3 day drift < 0-0.5*SD | All companies |
| var_87 | Title | 2015 to 2018 | 3 day drift < 0-0.5*SD | All companies |
| var_88 | Full text | 2015 to 2018 | 3 day drift < 0-0.5*SD | All companies |
| var_89 | Title | All | 3 day drift < 0-0.5*SD | Excluding top 20 |
| var_90 | Full text | All | 3 day drift < 0-0.5*SD | Excluding top 20 |
| var_91 | Title | All | 3 day drift < 0-0.5*SD | Top 20 |
| var_92 | Full text | All | 3 day drift < 0-0.5*SD | Top 20 |
| var_93 | Title | 2015 to 2018 | 3 day drift < 0-0.5*SD | Top 20 |
| var_94 | Full text | 2015 to 2018 | 3 day drift < 0-0.5*SD | Top 20 |
| var_95 | Title | 2015 to 2018 | 3 day drift < 0-0.5*SD | Excluding top 20 |
| var_96 | Full text | 2015 to 2018 | 3 day drift < 0-0.5*SD | Excluding top 20 |
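
The 96 variations in the four tables follow a regular grid, and the numbering can be reproduced with a short sketch. The labels "All events" and "Key events" and the key name `Events` are mine (the tables themselves do not carry an events column); everything else is read off the tables above.

```python
from itertools import product

# Slowest-varying factor first, matching the order of var_1 ... var_96.
EVENTS = ["All events", "Key events"]  # assumed labels for the event filter
THRESHOLDS = ["< 0", "< 0-0.5*SD"]
HORIZONS = ["90 day", "3 day"]
YEARS_TICKERS = [
    ("All", "All companies"),
    ("2015 to 2018", "All companies"),
    ("All", "Excluding top 20"),
    ("All", "Top 20"),
    ("2015 to 2018", "Top 20"),
    ("2015 to 2018", "Excluding top 20"),
]
DATA = ["Title", "Full text"]


def build_grid():
    """Enumerate every model variation in the same order as the tables."""
    grid = []
    combos = product(EVENTS, THRESHOLDS, HORIZONS, YEARS_TICKERS, DATA)
    for i, (events, thr, horizon, (years, tickers), data) in enumerate(combos, start=1):
        grid.append({
            "Model.variation": f"var_{i}",
            "Events": events,
            "Data": data,
            "Years": years,
            "Outcome.variable": f"{horizon} drift {thr}",
            "Tickers": tickers,
        })
    return grid


grid = build_grid()
```

Each table holds 24 variations (2 horizons x 6 year/ticker combinations x 2 data sources), and title-only models always get the odd numbers.
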
Here are two plots summarizing the different within-sample metrics for the first iteration of the Naive Bayes classifier. Note that the x-axis is sorted by AUC.
NOTE: The idea behind these plots is to show which variations achieve the highest AUC. The second plot records precision and recall with the x-axis still sorted by AUC, so you can easily look up those metrics for any given AUC.
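
The shared x-axis ordering described above can be sketched as follows. The variation names and metric values are placeholders, not results from this analysis, and sorting in ascending order is an assumption.

```python
# Placeholder metrics, NOT results from this analysis.
results = {
    "var_a": {"auc": 0.55, "precision": 0.40, "recall": 0.60},
    "var_b": {"auc": 0.61, "precision": 0.48, "recall": 0.35},
    "var_c": {"auc": 0.50, "precision": 0.33, "recall": 0.70},
}


def order_by_auc(results):
    """Return variation names sorted by AUC (ascending, by assumption)."""
    return sorted(results, key=lambda name: results[name]["auc"])


# One ordering drives both plots, so positions on the x-axis line up.
order = order_by_auc(results)
auc_series = [results[name]["auc"] for name in order]
precision_series = [results[name]["precision"] for name in order]
recall_series = [results[name]["recall"] for name in order]
```
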
Here I record, for each model variation, the 15 words most predictive of the positive class (i.e. over-reaction) and the 15 most predictive of the negative class. The Class column is color coded to avoid confusion.
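
One common way to get such word lists from a Naive Bayes model is to rank words by the log-ratio of their class-conditional probabilities. The sketch below does this for a tiny multinomial model with Laplace smoothing. It is an assumption about the ranking rule, not necessarily the one used for the tables that follow, and the example headlines are invented.

```python
import math
from collections import Counter


def top_words(docs, labels, k=15, alpha=1.0):
    """Return (top positive words, top negative words) by log-probability ratio."""
    counts = {0: Counter(), 1: Counter()}
    for text, y in zip(docs, labels):
        counts[y].update(text.lower().split())

    vocab = set(counts[0]) | set(counts[1])
    # Laplace-smoothed denominators for P(word | class).
    totals = {c: sum(counts[c].values()) + alpha * len(vocab) for c in (0, 1)}

    def log_ratio(word):
        p_pos = (counts[1][word] + alpha) / totals[1]
        p_neg = (counts[0][word] + alpha) / totals[0]
        return math.log(p_pos) - math.log(p_neg)

    # Highest ratio first; ties broken alphabetically for a stable order.
    ranked = sorted(vocab, key=lambda w: (-log_ratio(w), w))
    return ranked[:k], ranked[::-1][:k]


# Invented headlines, for illustration only (1 = over-reaction).
docs = ["musk solar surge", "musk tesla loans", "steel imports fall", "steel brazil tariff"]
labels = [1, 1, 0, 0]

pos_words, neg_words = top_words(docs, labels, k=1)
```
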
NOTE: Columns in grey are those trained on titles only (the odd-numbered variations).

Top words for model variations: 15 most important words for positive-class prediction

| Rank | Class | var_1 | var_2 | var_3 | var_4 | var_5 | var_6 | var_7 | var_8 | var_9 | var_10 | var_11 | var_12 | var_13 | var_14 | var_15 | var_16 | var_17 | var_18 | var_19 | var_20 | var_21 | var_22 | var_23 | var_24 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | pos | wealth | musk | pattern | musk | rival | silver | helps | musk | said | correction | delay | comcast | pattern | musk | downgrades | musk | iran | wells | pattern | musk | complaint | musk | proxy | wells |
| 1 | pos | success | solar | max | solar | recession | solar | tillerson | xom | getting | musk | engineering | solar | charter | technological | pattern | lithium | mean | western | merck | spotlights | johnson | revolutions | iran | western |
| 2 | pos | bln | life | settles | life | auction | chemical | adviser | hour | slides | peak | broader | chemical | downgrades | drilling | bombardier | kick | patterson | drilling | problems | revolutions | reversal | spotlights | biopharma | chemical |
| 3 | pos | worst | tesla | chrysler | loans | access | life | huawei | netflix | success | netflix | rival | loans | bombardier | phenomenon | charter | spotlights | infosys | chemical | virtual | tickers | problems | kick | expanded | ta |
| 4 | pos | fiat | french | wealth | peak | expeditors | drugs | norway | mining | forecasts | xom | drugs | cable | netflix | iphones | cat | technological | drilling | lithium | joint | steel | adobe | tickers | finish | apple |
| 5 | pos | pattern | loans | holds | mining | copper | loans | hacking | correction | tillerson | device | technical | life | notable | western | super | iphones | uber | metals | industries | technological | merck | technological | aflac | sequentially |
| 6 | pos | getting | round | fiat | wrote | drugs | german | getting | crash | hacking | access | bull | deere | bigger | opportunity | efficiency | wells | eu | carmakers | mother | guides | solar | infosys | series | |
| 7 | pos | comps | exploration | worst | face | challenges | copper | carmakers | peak | huawei | produce | expeditors | drugs | rig | iphone | ubs | iphone | autodesk | poor | estate | kick | pattern | mere | patterson | thomson |
| 8 | pos | access | wave | pound | automakers | old | crisis | faang | tesla | row | autonomous | paypal | factor | oracle | advantage | rig | bigger | viacom | final | oracle | surpassed | decline | gaap | drilling | inventory |
| 9 | pos | charter | mining | 737 | bull | delay | pfizer | fees | battery | cancer | miles | slumps | pfizer | 300 | creating | decline | ibm | 34 | prime | gopro | mere | lost | adjusted | aircraft | highest |
| 10 | pos | 737 | german | police | motors | progress | merck | apps | haven | adviser | automakers | lose | book | cable | gaap | aes | phenomenon | wants | series | decline | solar | joint | mother | ubs | weekly |
| 11 | pos | police | miles | eur | investigation | stop | cable | zone | minister | carmakers | safe | organic | channel | cat | property | universal | advantage | equities | dividends | reversal | revenues | virtual | bigger | pultegroup | bonds |
| 12 | pos | cvx | makers | consumers | changed | slumps | channel | consumers | tariffs | holds | motors | aircraft | rig | 200 | dividend | cable | western | central | quality | summit | bigger | dax | revenues | uber | 5g |
| 13 | pos | plunge | wrote | makers | smart | paypal | recession | buyers | imports | apps | drivers | old | verizon | aluminum | tickers | notable | gaap | fb | exploration | qatar | phenomenon | volume | surpassed | wells | units |
| 14 | pos | consumers | stimulus | access | tesla | dax | blue | worst | bp | bears | class | scripts | face | adobe | operational | 300 | aug | icahn | thomson | turnaround | advantage | qatar | advantage | transportation | housing |

Top words for model variations: 15 most important words for negative-class prediction

| Rank | Class | var_1 | var_2 | var_3 | var_4 | var_5 | var_6 | var_7 | var_8 | var_9 | var_10 | var_11 | var_12 | var_13 | var_14 | var_15 | var_16 | var_17 | var_18 | var_19 | var_20 | var_21 | var_22 | var_23 | var_24 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | neg | mylan | climb | mylan | begun | prologis | steel | narrower | save | members | save | prologis | steel | place | wave | max | pe | anti | macy | police | wave | police | pe | anti | macy |
| 1 | neg | mills | 2007 | controls | pure | deliver | restaurant | lag | begun | narrower | pure | pain | brazil | 737 | pe | place | steel | gt | steel | max | pe | raytheon | kevin | amerisourcebergen | steel |
| 2 | neg | yellen | restaurant | soaring | kevin | railroad | brazil | members | climb | crm | kevin | deliver | chipotle | struggle | ge | 737 | miss | place | imports | models | intel | tool | intel | instead | imports |
| 3 | neg | half | pe | yellen | climb | fmc | chipotle | revisions | drug | half | begun | fmc | restaurant | breaks | steel | hurricane | lockheed | etfc | peg | guide | begun | books | begun | place | shale |
| 4 | neg | buffett | steel | mills | 2007 | free | wait | lines | 2007 | lag | climb | anti | blackrock | disappointing | choice | reit | begun | boom | brazil | lines | choice | candidate | proposal | paypal | utilities |
| 5 | neg | panel | taxes | deliver | miss | regulators | sa | half | esp | memo | 2007 | strike | wait | police | fda | fear | kevin | streaming | utilities | regulatory | campaign | halt | wireless | gt | pe |
| 6 | neg | states | jpmorgan | half | pe | pain | blackrock | hathaway | generation | operating | drug | boom | advisory | lines | canadian | police | ge | stop | morgan | passes | jets | moved | pure | boom | lockheed |
| 7 | neg | missile | defense | blow | restaurant | anti | amounts | snapshot | pe | snapchat | yellen | head | pe | mobil | climb | lines | choice | lululemon | trump | shipping | metal | models | choice | beer | peg |
| 8 | neg | blow | brazil | vietnam | believes | boom | indicators | drug | invest | revisions | generation | struggles | sa | slows | restaurant | obamacare | budget | jwn | insurers | disappointing | brent | max | campaign | lululemon | express |
| 9 | neg | begins | invest | water | defense | strike | screening | begins | chase | 05 | esp | healthy | taxes | advance | budget | states | beijing | copper | reduction | hours | safety | ericsson | compares | roche | military |
| 10 | neg | employees | texas | operating | invest | improved | wizard | fda | taxes | lines | invest | free | class | regulatory | australian | dish | campaign | rife | advertising | pays | canadian | guide | jets | tie | insurers |
| 11 | neg | past | believes | buffett | black | head | units | past | believes | hathaway | pe | regulators | restaurants | jets | easing | mobil | martin | emerging | talks | fear | climb | advance | brent | emerging | nordstrom |
| 12 | neg | strike | vgm | states | steel | panel | restaurants | delta | healthcare | begins | surprises | improved | begun | aum | approach | don | pure | tie | fda | traffic | id | gap | save | jwn | brazil |
| 13 | neg | vale | pegged | begins | taxes | winning | website | estimate | score | employees | believes | estate | utilities | dish | beijing | breaks | compares | lennar | cap | advance | network | pentagon | network | goodyear | hedge |
| 14 | neg | softbank | suggest | employees | generation | mylan | 96 | transfer | airlines | books | healthcare | rtn | estate | fear | monetary | jets | restaurant | sap | shale | breaks | networks | regulatory | website | csco | defense |

NOTE: Columns in grey are those trained on titles only (the odd-numbered variations).

Top words for model variations: 15 most important words for positive-class prediction

| Rank | Class | var_25 | var_26 | var_27 | var_28 | var_29 | var_30 | var_31 | var_32 | var_33 | var_34 | var_35 | var_36 | var_37 | var_38 | var_39 | var_40 | var_41 | var_42 | var_43 | var_44 | var_45 | var_46 | var_47 | var_48 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | pos | success | yahoo | getting | insurers | gopro | chemical | summer | uber | strikes | mining | pattern | chemical | limited | iphones | dump | chemical | pvh | chemical | problems | revolutions | problems | revolutions | lawyer | wells |
| 1 | pos | broader | uber | huge | uber | recession | drugs | weeks | exxon | wary | uber | crisis | drugs | novartis | iphone | limited | iphones | sky | wells | irish | spotlights | party | nvidia | stryker | chemical |
| 2 | pos | event | alibaba | event | yahoo | germany | silver | semiconductor | mining | summer | exxon | extends | wells | summit | technological | properties | iphone | adding | mutual | quiet | tickers | debut | spotlights | pvh | mortgage |
| 3 | pos | taxes | exploration | eur | life | pattern | insurers | slides | eur | summary | xom | germany | saudi | aes | tickers | progressive | technological | pgr | usd | michigan | mother | 04 | esp | sky | yields |
| 4 | pos | spend | life | broader | face | success | bull | details | tariffs | try | lithium | aerospace | insurers | opportunity | breakthrough | appeals | tickers | vol | jun | pipeline | kick | provide | tickers | pgr | unfavorable |
| 5 | pos | battery | survey | battery | mart | cbs | rsi | fewer | yahoo | chicago | netflix | ciena | square | foxconn | musk | abbvie | qualcomm | nsc | mortgage | limited | technological | reveals | mother | nsc | repurchase |
| 6 | pos | steps | mart | spend | wal | 200 | bear | lynch | chevron | getting | gold | cbs | bull | senator | phenomenon | opportunity | watch | live | repurchase | johnson | esp | michigan | kick | shack | article |
| 7 | pos | cbs | changes | extends | changes | note | changes | minister | drivers | weeks | tariffs | adi | life | february | taxes | rig | breakthrough | garden | resource | changed | iphones | quiet | technological | steady | cancer |
| 8 | pos | 1b | wal | steps | safe | aerospace | life | purchase | netflix | ubs | drivers | agrees | exploration | supreme | brexit | aes | ibm | pmi | amounts | nvda | mere | limited | mere | cheap | west |
| 9 | pos | appeals | lithium | wild | peg | nxp | russia | police | swiss | semiconductor | yahoo | petroleum | takes | watchdog | apple | novartis | musk | greece | unemployment | party | qualcomm | pipeline | qualcomm | wholesale | jun |
| 10 | pos | purchase | face | army | tariffs | adi | pharmaceuticals | inches | imports | huge | face | profits | changes | provide | bigger | foxconn | drilling | autodesk | partners | wire | phenomenon | bancorp | iphones | brent | blackrock |
| 11 | pos | retirement | tariffs | funding | mining | ciena | wells | license | face | purchase | safe | spend | shale | securities | esp | straight | fast | shack | article | gs | taxes | johnson | phenomenon | bsx | car |
| 12 | pos | 93 | drugs | gaap | disappoints | german | wary | usd | minister | youtube | nxp | individual | wealth | fast | senator | phenomenon | viacom | employment | debut | iphone | natural | iphone | ntap | drilling | |
| 13 | pos | founder | eur | 1b | device | agrees | fixed | irish | team | barnes | device | steady | germany | appeals | qualcomm | provide | efficiency | old | eu | breakthrough | nvda | advantage | pcar | uncover | |
| 14 | pos | fix | device | cbs | drivers | extends | aircraft | try | alibaba | lynch | life | success | peg | mexican | watch | decline | bigger | impress | western | sharing | solar | gs | breakthrough | centers | gdp |

Top words for model variations: 15 most important words for negative-class prediction

| Rank | Class | var_25 | var_26 | var_27 | var_28 | var_29 | var_30 | var_31 | var_32 | var_33 | var_34 | var_35 | var_36 | var_37 | var_38 | var_39 | var_40 | var_41 | var_42 | var_43 | var_44 | var_45 | var_46 | var_47 | var_48 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | neg | embraer | save | embraer | save | enhances | family | rtn | save | operating | save | enhances | education | 737 | proposed | 737 | proposed | anti | macy | maersk | jets | maersk | jets | anti | nordstrom |
| 1 | neg | hackers | 2007 | barrick | 2007 | transport | amounts | lag | 2007 | rtn | 2007 | tower | family | poor | verizon | csco | steel | rogers | education | fcc | russian | multi | square | 07 | macy |
| 2 | neg | solution | mega | hackers | mega | panel | education | agrees | airlines | hackers | mega | abt | utilities | csco | search | philip | verizon | transfer | steel | multi | wave | forces | russian | rogers | education |
| 3 | neg | barrick | boeing | solution | greatest | pre | beta | chase | lockheed | hailing | matras | solution | chipotle | embraer | interview | roche | chevron | abc | equal | shut | proposed | fcc | chevron | transfer | steel |
| 4 | neg | rtn | toyota | chubb | ge | tower | leverage | revisions | kevin | scores | 2025 | panel | reits | iii | republican | jwn | bitcoin | amerisourcebergen | aum | bigger | south | advance | test | ev | cisco |
| 5 | neg | cpi | ge | rtn | boeing | intuit | mutual | phase | greatest | leasing | names | customers | leverage | deliveries | patients | strike | patients | private | cisco | cheap | search | satellite | orders | amerisourcebergen | verizon |
| 6 | neg | solarcity | kevin | solarcity | kevin | affirmed | transportation | leasing | drug | range | lockheed | therapy | homes | marriott | chevron | deliveries | republican | ev | verizon | commodity | test | goods | proposed | life | promising |
| 7 | neg | shake | gm | shake | names | climate | homes | card | names | phase | kevin | transport | macy | strike | bitcoin | fy18 | search | tex | party | max | chevron | concern | unemployment | dxc | hedge |
| 8 | neg | positioned | transportation | positioned | begun | regulators | blackrock | embraer | silver | revisions | phase | pain | trust | roche | steel | iii | test | txt | promising | military | russia | winners | south | private | aum |
| 9 | neg | bankruptcy | airlines | kb | lockheed | therapy | restaurant | slightly | toyota | lag | greatest | affirmed | transportation | shut | test | turns | orders | life | peg | greece | export | siemens | bid | won | equal |
| 10 | neg | wti | family | climb | snap | chubb | trust | acquire | suggest | memo | airlines | climate | restaurant | prime | wave | marriott | valuation | philip | simple | fails | safety | safe | search | abc | imports |
| 11 | neg | expedia | lockheed | argentina | climb | solution | japanese | memo | pure | chase | suggest | regulators | delivery | refinery | grade | embraer | bid | sues | pe | testing | bid | bigger | rights | instead | percentage |
| 12 | neg | cnx | snap | campaign | transportation | abt | ending | suit | boeing | agrees | pretty | intuit | estate | shipping | advertising | prime | advertising | jwn | trump | books | bad | 737 | goods | centerpoint | bitcoin |
| 13 | neg | revisions | begun | activision | airlines | fe | holiday | desk | ge | card | revisions | content | blackrock | peru | south | peru | wireless | fourth | percentage | working | vote | military | moves | motor | trump |
| 14 | neg | mellon | pe | xl | martin | midstream | pe | spotlight | healthcare | operations | 42 | takata | restaurants | forces | plant | forces | homes | robust | hedge | payrolls | rights | max | hong | txt | programs |

NOTE: Columns in grey are those trained on titles only (the odd-numbered variations).

Top words for model variations: 15 most important words for positive-class prediction

| Rank | Class | var_49 | var_50 | var_51 | var_52 | var_53 | var_54 | var_55 | var_56 | var_57 | var_58 | var_59 | var_60 | var_61 | var_62 | var_63 | var_64 | var_65 | var_66 | var_67 | var_68 | var_69 | var_70 | var_71 | var_72 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | pos | projects | blockchain | tool | solar | amkor | aig | bubble | estimize | problem | hour | fslr | aig | margins | estimize | upgraded | tesla | pxd | estimize | vietnam | buffett | advanced | tesla | pxd | aig |
| 1 | pos | bln | solar | hrl | blockchain | iot | ns | approve | lithium | freeze | chrysler | family | ns | lay | tesla | 200 | merck | stryker | western | leader | alibaba | navy | steel | unions | premiums |
| 2 | pos | fiat | estimize | signals | sa | progress | providers | enter | blockchain | finds | plants | iot | solar | 200 | mining | supreme | favorable | goods | aig | spend | estimize | scrutiny | gold | stryker | wynn |
| 3 | pos | family | merck | comps | providers | slumps | solar | bribery | possibility | visa | peak | comps | etfs | fees | favorable | navy | western | unions | drilling | california | tesla | leader | 61 | away | western |
| 4 | pos | betting | peak | iot | walmart | acquisitions | reform | wealth | walmart | seven | italy | acquisitions | providers | banker | metals | margins | style | tariff | pe | backs | care | subdued | metals | session | wells |
| 5 | pos | canadian | lithium | projects | officials | comps | etfs | giants | game | secures | lines | movil | label | summit | lines | proxy | uber | drilling | wells | lng | mining | country | active | drilling | policies |
| 6 | pos | correction | beijing | lows | minutes | projects | coal | projects | holiday | creates | blockchain | cat | wells | supreme | fuel | create | liquidity | margins | exploration | black | steel | precious | watch | tariff | barrels |
| 7 | pos | approve | dropped | acquisitions | dropped | fslr | german | pairs | eps | dent | enterprise | reits | comcast | navy | community | double | white | ease | wynn | irish | property | veteran | care | actelion | labor |
| 8 | pos | progress | wealth | family | pacific | family | merck | police | sec | bears | savings | retailers | sa | success | western | opel | metals | proxy | grade | stance | plant | freeze | fuel | margins | accenture |
| 9 | pos | police | officials | dismal | construction | bonus | cell | irish | weekend | doubles | question | schwab | election | california | white | patents | fuel | nsc | policies | northrop | fuel | significant | picture | hog | resorts |
| 10 | pos | signals | mcdonald | settles | basket | president | mortgage | improving | ounce | largest | walmart | ntrs | western | spend | merck | breakout | operational | create | metals | summit | community | upgraded | miles | fslr | reform |
| 11 | pos | really | fears | presence | changed | hrl | wells | path | hour | emails | game | hrl | housing | pop | question | technical | 2012 | forex | barrels | safe | watch | renault | mentioned | ciena | favorable |
| 12 | pos | path | anticipated | patents | election | neutrality | housing | bargain | community | approve | australia | tie | reform | reporting | force | lay | yen | ciena | favorable | customers | uber | 28 | eur | shine | inventory |
| 13 | pos | ericsson | stimulus | googl | pharmaceuticals | ntrs | french | peak | war | businesses | reserves | president | apr | november | independent | drilling | reform | stakes | russia | pipeline | institutions | spend | pipeline | february | petroleum |
| 14 | pos | wage | pharmaceuticals | related | asian | tie | obama | old | peak | huge | clearly | actelion | mortgage | retirement | pharmaceuticals | california | bitcoin | michael | style | 200 | musk | johnson | musk | agrees | merck |

Top words for model variations: 15 most important words for negative-class prediction

| Rank | Class | var_49 | var_50 | var_51 | var_52 | var_53 | var_54 | var_55 | var_56 | var_57 | var_58 | var_59 | var_60 | var_61 | var_62 | var_63 | var_64 | var_65 | var_66 | var_67 | var_68 | var_69 | var_70 | var_71 | var_72 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | neg | talc | picking | delta | advice | giants | accenture | thoughts | buffett | computing | samsung | giants | accenture | 2b | wave | gaming | sec | base | tata | worse | wave | instead | vix | 5b | macy |
| 1 | neg | motor | portfolios | talc | chicago | carmax | advice | citibank | aluminum | operating | vix | selects | card | cliff | justice | presence | proposal | nrg | etfs | modi | strategic | counter | sec | spite | ebay |
| 2 | neg | struggles | advice | struggles | picking | sherwin | express | break | yellen | inquiry | yellen | palo | screen | road | ge | does | justice | 2b | payment | led | network | frankfurt | yahoo | nrg | card |
| 3 | neg | recession | yellen | argentina | yellen | stryker | women | valuation | quality | upgraded | transportation | defends | advice | alnylam | telecom | site | cisco | commerce | ns | offerings | shareholder | ipos | electronics | csco | tata |
| 4 | neg | half | profitable | half | recommendation | driving | card | cuba | activities | create | dividends | alto | chicago | shopping | etfs | heat | ge | advance | card | discount | sec | components | cyber | 2b | payment |
| 5 | neg | expedia | chicago | rule | attractive | winning | chicago | struggles | commodities | corporation | immediate | stryker | express | legal | budget | smartphone | budget | crash | retailers | models | campaign | pc | bp | commerce | ns |
| 6 | neg | solution | commodities | motor | centers | self | picking | spy | computing | cook | quality | sherwin | nse | family | payment | analog | yuan | education | uk | officer | budget | daihatsu | immediate | don | solar |
| 7 | neg | tells | chase | tells | profitable | winners | game | moody | specific | moody | india | driving | disclosure | enterprise | ipo | road | payment | csco | solar | books | yahoo | nigeria | samsung | cautious | etfs |
| 8 | neg | lam | disclosure | stay | portfolios | awarded | portfolios | stance | shale | selling | buffett | self | southern | fedex | settlement | legal | aerospace | mccormick | telecom | soon | eventually | concern | ecb | education | utility |
| 9 | neg | continental | attractive | confirms | immediate | constellation | profitable | venture | common | hours | aluminum | awarded | game | attacks | shareholder | weaker | bo | don | utility | shutdown | family | shutdown | shareholder | customer | actually |
| 10 | neg | moody | storage | shell | estate | rocket | southern | cliff | premium | accenture | attractive | rogers | profitable | cbre | ecb | forces | solar | retailers | advertising | hours | ge | nominees | family | alcoa | plant |
| 11 | neg | emissions | recommendation | acn | toyota | gaming | interview | reason | copper | stellar | premium | agency | recommendation | dell | fda | modi | retailers | gaming | outperform | shareholder | chance | cheaper | italy | play | chicago |
| 12 | neg | aluminum | brazil | east | buys | flavors | recommendation | tells | devices | computer | moment | rocket | community | site | samsung | political | card | extends | rival | phone | airbus | jitters | martin | autonation | justice |
| 13 | neg | oct | cloud | hasbro | tesla | verdict | advertising | range | score | daihatsu | oct | winning | women | payrolls | hikes | alnylam | uncertainty | bring | uncertainty | moody | items | track | jets | tjx | areas |
| 14 | neg | surgical | toyota | macau | brazil | design | disclosure | federal | india | pc | client | verdict | picking | political | stanley | candidate | shareholder | smaller | department | imf | aerospace | models | phone | dax | pacific |
NOTE: Columns in grey are those trained on titles only.

Top words for model variations: 15 most important words for positive-class prediction

| Rank | Class | var_73 | var_74 | var_75 | var_76 | var_77 | var_78 | var_79 | var_80 | var_81 | var_82 | var_83 | var_84 | var_85 | var_86 | var_87 | var_88 | var_89 | var_90 | var_91 | var_92 | var_93 | var_94 | var_95 | var_96 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | pos | csx | estimize | shared | tariffs | arthritis | bats | fighter | tariffs | rbs | toys | airbus | aig | leadership | steel | funding | tesla | lay | deere | lack | pounds | unknown | musk | jury | fargo |
| 1 | pos | fee | alibaba | truck | solar | government | aig | 64 | alibaba | ultra | tariffs | deere | teva | proxy | opec | lithium | steel | pursuit | estimize | virtual | musk | worm | tesla | tmus | steel |
| 2 | pos | movil | tariffs | drugs | drugs | rough | estimize | national | ounce | national | miles | lymphoma | premiums | extended | tesla | qatar | opec | iv | steel | lear | opec | pollution | opec | britain | wells |
| 3 | pos | chipmaker | drugs | movil | alibaba | ubnt | premiums | hate | eps | swings | exploration | lvs | biotech | summit | estimize | massive | inventory | reporting | wells | shifts | copper | txt | bpd | bear | cigna |
| 4 | pos | losing | solar | hathaway | investigation | ubiquiti | salesforce | hathaway | summit | swing | sellers | related | barrels | feature | copper | reaches | aig | betting | es | qatar | tariffs | ppi | aluminum | israeli | inventory |
| 5 | pos | pairs | ounce | coo | search | little | biotech | problems | game | arrest | cheaper | blow | biogen | cryptocurrency | inventory | reality | automakers | files | aig | tone | alibaba | practices | frank | nov | aig |
| 6 | pos | players | sep | patents | protection | founder | reform | 88 | reserves | surpass | plants | adbe | chip | stryker | summer | makers | tariffs | tmus | inventory | shoe | aluminum | prenatal | metals | mksi | es |
| 7 | pos | astrazeneca | protection | factories | shanghai | lingers | barrels | 7th | uber | supreme | esp | loses | salesforce | transportation | automakers | machines | internal | reduces | restaurant | fees | tesla | prepare | tariffs | mks | restaurant |
| 8 | pos | property | reduced | fee | reduced | nktr | iron | haul | anti | newest | book | gol | wholesale | nsc | yen | broke | uber | israeli | residential | fidelity | asx | tweaks | oct | bce | wynn |
| 9 | pos | fiat | investigation | accounts | reserves | russell | reduced | gainers | ruling | newly | search | golf | nvidia | florida | uber | borrowers | tv | hrb | pivot | let | producer | tussle | probably | fl | residential |
| 10 | pos | betting | bitcoin | months | chemical | leverage | scores | 73 | independent | noc | canada | government | drugs | lost | investigation | bolt | gm | triumph | drilling | thrive | automakers | oceans | automakers | bbby | liquidity |
| 11 | pos | 1b | round | fiat | sa | tribune | drugs | newest | nov | nomura | import | lynparza | reduced | pxd | social | edges | yen | bitcoin | bitcoin | shorting | brands | object | housing | issued | employees |
| 12 | pos | halts | reserves | ways | steel | leidos | versus | accelerate | party | number | australia | looms | reform | guides | association | melanoma | barrels | bites | ratios | tgt | oct | oakland | mining | brown | association |
| 13 | pos | surpass | pharmaceuticals | guilty | iii | tree | label | lyft | wave | e3 | reserves | resistance | solar | highest | alibaba | beverage | bearish | funding | western | fined | barrels | nzd | facility | enrollment | auto |
| 14 | pos | lmt | blockchain | wave | pharmaceuticals | training | solar | settles | leave | oilwell | investigation | resmed | jets | perspective | 07 | leadership | association | little | martin | fitb | bernanke | turns | inventories | stericycle | restaurants |

Top words for model variations: 15 most important words for negative-class prediction

| Rank | Class | var_73 | var_74 | var_75 | var_76 | var_77 | var_78 | var_79 | var_80 | var_81 | var_82 | var_83 | var_84 | var_85 | var_86 | var_87 | var_88 | var_89 | var_90 | var_91 | var_92 | var_93 | var_94 | var_95 | var_96 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | neg | nba | yellen | selects | advice | zumiez | expedia | account | buffett | zodiac | yellen | zumiez | expedia | road | wave | lincoln | sec | iot | bats | polish | wave | nvidia | 737 | orders | cisco |
| 1 | neg | ffo | homes | nba | yellen | reg | tgr | mazda | yellen | final | buffett | forces | screening | apollo | sec | lending | centers | initiatives | biotech | non | sec | chance | exploring | biotech | |
| 2 | neg | ecolab | picking | way | material | bloomberg | chipotle | 95 | arm | tight | rating | february | screen | wage | merger | winners | card | anti | 5g | kill | chance | press | sec | porphyria | 5g |
| 3 | neg | courts | advice | lay | picking | blackberry | screen | a320 | switch | ties | sharp | silver | chipotle | chubb | fda | layoffs | bit | lnc | simple | philip | airbus | presidential | dividends | gas | uk |
| 4 | neg | cuba | material | achc | toyota | biopharma | advice | maersk | common | dutch | electronics | filing | wizard | base | cisco | advance | yahoo | lift | uk | uncertainty | bitcoin | preserve | solutions | jpmorgan | scale |
| 5 | neg | lifting | portfolios | blocks | disclosure | turns | wizard | malware | indexes | california | picking | fines | advice | federal | simple | kroger | justice | leukemia | cisco | phones | client | prep | paris | judge | bit |
| 6 | neg | live | toyota | hopes | immediate | replaces | amounts | aar | activities | textron | bankers | pain | disclosure | singapore | drugs | investments | merger | leaving | roche | jury | strategic | zero | airbus | finance | spectrum |
| 7 | neg | dominion | amounts | horton | homes | florida | silver | massachusetts | user | jnj | hardware | fitch | devry | challenges | shareholder | conmed | dividends | amex | scale | jp | yuan | polish | immediate | enhances | plant |
| 8 | neg | pass | dead | leverage | replace | homes | means | toyota | federal | chair | flight | homes | microchip | yahoo | additional | ford | jd | foods | try | quality | plus | discount | step | lilly | |
| 9 | neg | dead | subject | smartphone | frame | bhp | exploration | location | joint | samsung | florida | movement | wrapup | yuan | constellation | fda | jci | bought | prison | discount | woos | projected | kbh | foods | |
| 10 | neg | confirms | profitable | ddr | portfolios | london | material | mh370 | regional | cad | french | flu | bloomberg | barclays | card | 21 | uk | alto | lilly | problem | citi | plunged | anticipated | narrows | bought |
| 11 | neg | uncertainty | taxes | rock | transactions | bets | bloomberg | action | typically | juncture | toyota | following | enrollment | adi | listed | holding | shareholder | itri | education | trouble | eventually | plugs | africa | narrower | crisis |
| 12 | neg | stryker | nuclear | softbank | visit | uncertainty | ppg | models | nuclear | technologies | activities | fred | leverage | fomc | regulatory | coo | test | allegheny | automation | project | arm | plot | suggest | primary | card |
| 13 | neg | softbank | leverage | cvs | profitable | luv | subject | moody | recommendations | faraday | material | fb | material | strike | verizon | cuba | solution | irish | drugs | promote | smartphone | pledges | yellen | kills | parts |
| 14 | neg | expanded | activities | transunion | community | beer | metric | morris | airbus | rock | advice | fy17 | visit | boss | payment | democrats | drugs | investigation | plant | prospects | export | pledge | shareholder | printing | brexit |
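
For readers who want to reproduce rankings like the ones in these tables, here is a minimal sketch of one common way to pull "most important words" out of a Naive Bayes text classifier. The write-up does not state how importance was computed, so the scoring rule here (the smoothed log-likelihood ratio log P(word | pos) - log P(word | neg)) is an assumption, and the function name `top_words_by_class`, the whitespace tokenizer, and the toy headlines are all hypothetical.

```python
import math
from collections import Counter


def top_words_by_class(docs, labels, k=15, alpha=1.0):
    """Rank words by log P(w | pos) - log P(w | neg) with Laplace smoothing.

    Assumption: this is one plausible importance measure, not necessarily
    the one used to build the tables above.
    """
    # Per-class word counts (label 1 = positive class, 0 = negative class).
    counts = {0: Counter(), 1: Counter()}
    for text, y in zip(docs, labels):
        counts[y].update(text.lower().split())

    vocab = sorted(set(counts[0]) | set(counts[1]))
    # Smoothed denominators: class token count + alpha per vocabulary word.
    totals = {y: sum(counts[y].values()) + alpha * len(vocab) for y in (0, 1)}

    def log_prob(word, y):
        return math.log((counts[y][word] + alpha) / totals[y])

    ratio = {w: log_prob(w, 1) - log_prob(w, 0) for w in vocab}
    ranked = sorted(vocab, key=lambda w: ratio[w], reverse=True)
    # Highest ratios push toward the positive class, lowest toward the negative.
    return {"pos": ranked[:k], "neg": ranked[::-1][:k]}


# Toy example with made-up headlines (not the data used in this post).
docs = ["tariffs hit steel", "steel tariffs rise", "yellen speaks", "buffett buys yellen"]
labels = [1, 1, 0, 0]
result = top_words_by_class(docs, labels, k=2)
print(result)
```

A ratio-based score is used here rather than raw P(word | class) because raw probabilities tend to surface words that are frequent in both classes; the ratio favors words that separate the two classes.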