Downloaded Data DQ Engine — Complete Data Input Reference

Overview

This document describes every data file the DDQ engine can consume, the exact folder structure, file naming conventions, column headers with sample rows, and what to expect from each broker source.


Folder Structure

~/downloaded_data_dq/
└── data/
    └── raw/
        ├── EOD/
        │   ├── Dhan/         ← place BSE_*.csv and NSE_*.csv files here
        │   ├── Kite/
        │   └── Upstox/
        └── INTRADAY/
            ├── Dhan/
            ├── Kite/
            └── Upstox/
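
The layout above can be created or checked with a few lines of Python. This is a minimal sketch: the function names are illustrative and not part of the engine, and the default root is simply the ~/downloaded_data_dq path shown in the tree.

```python
from pathlib import Path

FREQUENCIES = ("EOD", "INTRADAY")
SOURCES = ("Dhan", "Kite", "Upstox")

def expected_dirs(root):
    """Return the six source folders under <root>/data/raw."""
    raw = Path(root) / "data" / "raw"
    return [raw / freq / src for freq in FREQUENCIES for src in SOURCES]

def ensure_layout(root="~/downloaded_data_dq"):
    """Create any missing folders and return all six paths."""
    dirs = expected_dirs(Path(root).expanduser())
    for d in dirs:
        d.mkdir(parents=True, exist_ok=True)
    return dirs
```

Calling ensure_layout() once before copying broker files is enough; it does nothing for folders that already exist.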

File Naming Convention

| Instrument Type | Pattern | Example |
|---|---|---|
| Equity EOD | {EXCH}_{SYMBOL}.csv | BSE_TCS.csv |
| ETF EOD | {EXCH}_{SYMBOL}.csv | NSE_NIFTYBEES.csv |
| Index EOD | {EXCH}_{INDEX}.csv | NSE_NIFTY50.csv |
| Equity Futures | {EXCH}_{SYM}_FUT_{YYYYMMDD}.csv | NSE_TCS_FUT_20250327.csv |
| Index Futures | {EXCH}_{IDX}_FUT_{YYYYMMDD}.csv | NSE_NIFTY50_FUT_20250327.csv |
| Equity Options Call | {EXCH}_{SYM}_CE_{STRIKE}_{YYYYMMDD}.csv | NSE_TCS_CE_4000_20250327.csv |
| Equity Options Put | {EXCH}_{SYM}_PE_{STRIKE}_{YYYYMMDD}.csv | NSE_TCS_PE_4000_20250327.csv |
| Index Options Call | {EXCH}_{IDX}_CE_{STRIKE}_{YYYYMMDD}.csv | NSE_NIFTY50_CE_24000_20250327.csv |
| Index Options Put | {EXCH}_{IDX}_PE_{STRIKE}_{YYYYMMDD}.csv | NSE_NIFTY50_PE_24000_20250327.csv |
| Intraday (all types) | Same naming as EOD | BSE_TCS.csv |

Where:

  • {EXCH} = BSE or NSE
  • {SYMBOL} / {SYM} = uppercase symbol (e.g. TCS, HERO_MOTOCORP, RELIANCE)
  • {INDEX} / {IDX} = index name (e.g. NIFTY50, SENSEX, BANKNIFTY)
  • {YYYYMMDD} = expiry date (e.g. 20250327)
  • {STRIKE} = strike price as an integer (e.g. 24000, 3500)
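
The naming convention can be parsed with a single regular expression. The sketch below is an assumption about how a consumer might do it, not the engine's own parser. The function name and the "CASH" label are hypothetical; because equity, ETF, and index files share one pattern, they cannot be told apart from the name alone. The symbol pattern also assumes only uppercase letters, digits, and underscores.

```python
import re
from datetime import datetime

# Optional suffix: _FUT_<expiry> or _<CE|PE>_<strike>_<expiry>.
_NAME_RE = re.compile(
    r"^(?P<exch>BSE|NSE)_(?P<symbol>[A-Z0-9_]+?)"
    r"(?:_FUT_(?P<fut_expiry>\d{8})"
    r"|_(?P<opt>CE|PE)_(?P<strike>\d+)_(?P<opt_expiry>\d{8}))?"
    r"\.csv$"
)

def parse_filename(name):
    """Split a data file name into exchange, symbol, kind, strike, expiry."""
    m = _NAME_RE.match(name)
    if m is None:
        raise ValueError(f"unrecognised file name: {name!r}")
    expiry_raw = m["fut_expiry"] or m["opt_expiry"]
    if m["fut_expiry"]:
        kind = "FUT"
    elif m["opt"]:
        kind = m["opt"]          # "CE" or "PE"
    else:
        kind = "CASH"            # hypothetical label for equity/ETF/index
    return {
        "exchange": m["exch"],
        "symbol": m["symbol"],
        "kind": kind,
        "strike": int(m["strike"]) if m["strike"] else None,
        "expiry": (datetime.strptime(expiry_raw, "%Y%m%d").date()
                   if expiry_raw else None),
    }
```

The non-greedy symbol group is what lets underscores inside a symbol (HERO_MOTOCORP, NIFTY_MIDCAP_100) survive while the derivative suffix is still peeled off the end.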


═══════════════════════════════════════════════════════════

EOD FILE FORMATS

═══════════════════════════════════════════════════════════

1. Equity / ETF / Index — All Three Sources

All three sources (Dhan, Kite, Upstox) use identical columns for EOD:

Columns:

date, open, high, low, close, adjusted_close, volume, open_interest

Date format: DD-MM-YYYY (e.g. 25-08-2004)

Sample — Dhan BSE TCS (unadjusted, from 2004):

date,open,high,low,close,adjusted_close,volume,open_interest
25-08-2004,123.50,126.00,121.80,124.20,124.20,6536309,11348
26-08-2004,124.10,125.60,121.90,122.35,122.35,2243016,336349
27-08-2004,122.85,123.00,119.80,120.30,120.30,1503317,647483

Sample — Kite/Upstox BSE TCS (split-adjusted, from 2008):

date,open,high,low,close,adjusted_close,volume,open_interest
01-01-2008,266.30,269.50,263.00,263.40,263.40,252568,979504
02-01-2008,264.40,265.70,258.25,262.20,262.20,453472,609991

Index (volume=0, adjusted_close=blank):

date,open,high,low,close,adjusted_close,volume,open_interest
01-01-2008,6140.00,6205.35,6073.70,6144.35,,0,0
02-01-2008,6165.00,6178.20,6093.00,6099.80,,0,0

Key differences per source:

| Source | Prices | History starts | adjusted_close |
|---|---|---|---|
| Dhan | Unadjusted (raw exchange prices) | Earliest (2003–2010 for most) | Equal to close |
| Kite | Split-adjusted | ~2008 | Adjusted for splits/bonus |
| Upstox | Split-adjusted | ~2008 | Adjusted (may be NaN for some symbols) |


2. Equity Futures — EOD

Columns:

date, open, high, low, close, adjusted_close, volume, open_interest, expiry_date

Date format: date is DD-MM-YYYY; expiry_date is YYYY-MM-DD (e.g. 2025-03-27).

Sample — NSE TCS futures:

date,open,high,low,close,adjusted_close,volume,open_interest,expiry_date
01-01-2025,4112.00,4145.00,4095.00,4130.00,,285600,185400,2025-03-27
02-01-2025,4130.00,4155.00,4118.00,4148.00,,312000,190200,2025-03-27

3. Index Futures — EOD

Same as Equity Futures. Example filename: NSE_NIFTY50_FUT_20250327.csv

date,open,high,low,close,adjusted_close,volume,open_interest,expiry_date
01-01-2025,24100.00,24380.00,24050.00,24320.00,,1250000,8500000,2025-03-27

4. Equity Options — EOD

Columns:

date, open, high, low, close, adjusted_close, volume, open_interest,
expiry_date, strike_price, option_type, implied_volatility

Date format: date is DD-MM-YYYY; expiry_date is YYYY-MM-DD (e.g. 2025-03-27).

Sample — NSE TCS 4000 CE:

date,open,high,low,close,adjusted_close,volume,open_interest,expiry_date,strike_price,option_type,implied_volatility
01-01-2025,185.00,210.00,175.00,205.00,,125000,385000,2025-03-27,4000,CE,22.5
02-01-2025,205.00,225.00,195.00,218.00,,142000,392000,2025-03-27,4000,CE,23.1

5. Index Options — EOD

Same as Equity Options. Example filename: NSE_NIFTY50_CE_24000_20250327.csv

date,open,high,low,close,adjusted_close,volume,open_interest,expiry_date,strike_price,option_type,implied_volatility
01-01-2025,485.00,520.00,470.00,510.00,,125000,3850000,2025-03-27,24000,CE,18.5
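
Options files carry the strike, option type, and expiry both in the file name and in every row, so the two can be cross-checked. The sketch below is one possible way to do that; the function name is hypothetical, and it assumes expiry_date is YYYY-MM-DD as in the samples above.

```python
from datetime import datetime

def option_row_matches_filename(filename, row):
    """True if a row's option_type, strike_price and expiry_date agree
    with the values encoded in the file name."""
    stem = filename[:-4] if filename.endswith(".csv") else filename
    # Split from the right so underscores inside the symbol are preserved.
    _, option_type, strike, expiry = stem.rsplit("_", 3)
    expiry_iso = datetime.strptime(expiry, "%Y%m%d").date().isoformat()
    return (
        row["option_type"] == option_type
        and float(row["strike_price"]) == float(strike)
        and row["expiry_date"] == expiry_iso
    )

# Fields taken from the NSE NIFTY50 24000 CE sample row.
row = {"expiry_date": "2025-03-27", "strike_price": "24000", "option_type": "CE"}
ok = option_row_matches_filename("NSE_NIFTY50_CE_24000_20250327.csv", row)
```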

═══════════════════════════════════════════════════════════

INTRADAY FILE FORMATS (1-minute bars)

═══════════════════════════════════════════════════════════

Dhan — Variant A (Equity, standard symbols)

Columns:

open, high, low, close, volume, timestamp, trade_date, security_id

  • timestamp = Unix epoch in seconds (e.g. 1735703100)
  • trade_date = YYYY-MM-DD HH:MM:SS+05:30
  • security_id = Dhan’s internal instrument ID (integer)

Sample:

open,high,low,close,volume,timestamp,trade_date,security_id
4100.00,4107.75,4088.95,4100.95,213,1735703100,2025-01-01 09:15:00+05:30,532540
4101.70,4106.40,4094.75,4106.40,114,1735703160,2025-01-01 09:16:00+05:30,532540
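
In Variant A the timestamp and trade_date columns describe the same instant, which gives a cheap consistency check. A minimal sketch, with hypothetical function names, assuming the epoch is in seconds as stated above:

```python
from datetime import datetime, timedelta, timezone

IST = timezone(timedelta(hours=5, minutes=30))

def epoch_to_ist(epoch_seconds):
    """Convert a Unix epoch (seconds) to a timezone-aware IST datetime."""
    return datetime.fromtimestamp(int(epoch_seconds), tz=IST)

def timestamps_agree(timestamp, trade_date):
    """True if the epoch column and the trade_date string are the same instant."""
    return epoch_to_ist(timestamp) == datetime.fromisoformat(trade_date)

first_bar = epoch_to_ist(1735703100).isoformat(sep=" ")
# first_bar == "2025-01-01 09:15:00+05:30"
```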

Dhan — Variant B (some small-cap symbols like CP_CAP)

Columns (minimal 6-column format):

trade_date, open, high, low, close, volume

Sample:

trade_date,open,high,low,close,volume
2025-01-01 09:15:00+05:30,399.00,399.00,395.20,395.20,1
2025-01-01 09:16:00+05:30,395.20,395.20,395.20,395.20,0

Kite — All symbols

Columns:

date, open, high, low, close, volume

  • date = YYYY-MM-DD HH:MM:SS+05:30

Sample:

date,open,high,low,close,volume
2025-01-01 09:15:00+05:30,4100.00,4107.75,4088.95,4100.95,271
2025-01-01 09:16:00+05:30,4101.60,4105.00,4094.75,4105.00,107

Upstox — All symbols

Columns:

datetime, open, high, low, close, volume, oi, datetime_IST

  • datetime = YYYY-MM-DDTHH:MM:SS+05:30 (ISO format with T)
  • datetime_IST = YYYY-MM-DD HH:MM:SS+05:30 (space format)
  • oi = open interest

Sample:

datetime,open,high,low,close,volume,oi,datetime_IST
2025-01-01T09:15:00+05:30,4100.00,4107.75,4088.95,4101.60,223,0,2025-01-01 09:15:00+05:30
2025-01-01T09:16:00+05:30,4101.70,4106.40,4094.75,4106.40,104,0,2025-01-01 09:16:00+05:30
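
The four intraday layouts differ mainly in the name and format of the time column, so they can be normalised to one record shape. The sketch below is an illustration under that assumption; read_intraday and TIME_COLUMNS are hypothetical names, and extra columns (timestamp, security_id, oi, datetime_IST) are simply ignored.

```python
import csv
import io
from datetime import datetime

# Checked in order: Dhan (both variants) uses trade_date, Upstox uses
# datetime, Kite uses date.
TIME_COLUMNS = ("trade_date", "datetime", "date")

def read_intraday(stream):
    """Return bars as dicts with a timezone-aware `time` plus OHLCV."""
    reader = csv.DictReader(stream)
    fields = reader.fieldnames or []
    time_col = next((c for c in TIME_COLUMNS if c in fields), None)
    if time_col is None:
        raise ValueError(f"no known time column in header: {fields}")
    bars = []
    for rec in reader:
        bars.append({
            # fromisoformat accepts both the "T" and the space separator.
            "time": datetime.fromisoformat(rec[time_col]),
            "open": float(rec["open"]),
            "high": float(rec["high"]),
            "low": float(rec["low"]),
            "close": float(rec["close"]),
            "volume": int(rec["volume"]),
        })
    return bars

kite = io.StringIO(
    "date,open,high,low,close,volume\n"
    "2025-01-01 09:15:00+05:30,4100.00,4107.75,4088.95,4100.95,271\n"
)
upstox = io.StringIO(
    "datetime,open,high,low,close,volume,oi,datetime_IST\n"
    "2025-01-01T09:15:00+05:30,4100.00,4107.75,4088.95,4101.60,223,0,"
    "2025-01-01 09:15:00+05:30\n"
)
kite_bars = read_intraday(kite)
upstox_bars = read_intraday(upstox)
```

Once every source yields the same record shape, bars from different brokers can be aligned on the time field for comparison.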

═══════════════════════════════════════════════════════════

COMPLETE UNIVERSE — ALL FILES REQUIRED

═══════════════════════════════════════════════════════════

Phase 1 (Current — Equity Stocks)

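| Symbol | Exchange | Sources | EOD Files | Intraday Files |
|---|---|---|---|---|
| TCS | BSE + NSE | Dhan, Kite, Upstox | BSE_TCS.csv, NSE_TCS.csv | same names |
| HERO_MOTOCORP | BSE + NSE | Dhan, Kite, Upstox | BSE_HERO_MOTOCORP.csv, NSE_HERO_MOTOCORP.csv | same names |
| CP_CAP | BSE + NSE | Dhan, Kite, Upstox | BSE_CP_CAP.csv, NSE_CP_CAP.csv | same names |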

Minimum to get running: 18 files (3 symbols × 2 exchanges × 3 sources, EOD only)
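
The 18-file minimum can be enumerated directly, which is handy for spotting a missing download. A small sketch; the function name is illustrative and the relative data/raw root is taken from the folder structure above.

```python
from itertools import product
from pathlib import PurePosixPath

SYMBOLS = ("TCS", "HERO_MOTOCORP", "CP_CAP")
EXCHANGES = ("BSE", "NSE")
SOURCES = ("Dhan", "Kite", "Upstox")

def phase1_eod_files(raw_root="data/raw"):
    """List every EOD file needed for the Phase 1 minimum."""
    return [
        PurePosixPath(raw_root) / "EOD" / source / f"{exch}_{symbol}.csv"
        for symbol, exch, source in product(SYMBOLS, EXCHANGES, SOURCES)
    ]

files = phase1_eod_files()
```

Checking each returned path for existence on disk gives the list of files still to be downloaded.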


Phase 2 — Additional Equity Stocks

Add any BSE/NSE listed equity by placing files with the correct naming:

data/raw/EOD/Dhan/BSE_RELIANCE.csv
data/raw/EOD/Kite/BSE_RELIANCE.csv
data/raw/EOD/Upstox/BSE_RELIANCE.csv
data/raw/INTRADAY/Dhan/BSE_RELIANCE.csv
... etc.

Also add the symbol to config/instruments.yaml under the equity: section.


Phase 3 — ETFs

| Symbol | Exchange | Description |
|---|---|---|
| NIFTYBEES | NSE + BSE | Nifty50 ETF (Nippon) |
| BANKBEES | NSE + BSE | Nifty Bank ETF |
| GOLDBEES | NSE + BSE | Gold ETF |
| JUNIORBEES | NSE | Nifty Next 50 ETF |
| LIQUIDBEES | NSE | Liquid / overnight ETF |

File naming: NSE_NIFTYBEES.csv (same format as equity)


Phase 4 — Indices

| Symbol | Exchange | Description |
|---|---|---|
| NIFTY50 | NSE | NSE flagship index |
| SENSEX | BSE | BSE flagship index |
| BANKNIFTY | NSE | Bank Nifty index |
| NIFTY_MIDCAP_100 | NSE | Midcap 100 |
| FINNIFTY | NSE | Financial services index |
| NIFTY_SMALLCAP_100 | NSE | Smallcap index |

Special: Index files have volume=0 and adjusted_close is blank.

data/raw/EOD/Dhan/NSE_NIFTY50.csv
data/raw/EOD/Dhan/BSE_SENSEX.csv
data/raw/EOD/Dhan/NSE_BANKNIFTY.csv

Phase 5 — Equity Futures (per contract)

Each expiry is a separate file. NSE lists three monthly contracts at a time (near, next, and far month).

| Pattern | Example |
|---|---|
| NSE_{SYM}_FUT_{YYYYMMDD}.csv | NSE_TCS_FUT_20250327.csv |

Typical symbols with F&O: TCS, RELIANCE, INFY, HDFCBANK, ICICIBANK, SBIN, WIPRO, TATAMOTORS, BAJFINANCE, AXISBANK, KOTAKBANK, etc.

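Expiry cadence: last Thursday of each month; the March, June, September, and December expiries also serve as quarterly expiries. Files needed per symbol per year: about 12 (one per monthly expiry), plus any near-term contracts that are still open.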
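
The last-Thursday rule can be turned into file names mechanically. The sketch below assumes the rule exactly as stated here and ignores exchange holidays or any rule changes that would move an expiry; both function names are hypothetical.

```python
import calendar
from datetime import date, timedelta

def last_thursday(year, month):
    """Return the date of the last Thursday in the given month."""
    last_day = date(year, month, calendar.monthrange(year, month)[1])
    # Step back from month end to the nearest Thursday (0 to 6 days).
    offset = (last_day.weekday() - calendar.THURSDAY) % 7
    return last_day - timedelta(days=offset)

def futures_filename(exch, symbol, year, month):
    """Build the futures file name for the monthly expiry."""
    expiry = last_thursday(year, month)
    return f"{exch}_{symbol}_FUT_{expiry:%Y%m%d}.csv"

name = futures_filename("NSE", "TCS", 2025, 3)
# name == "NSE_TCS_FUT_20250327.csv"
```

Looping the month from 1 to 12 yields the roughly twelve files per symbol per year mentioned above.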


Phase 6 — Index Futures

| Pattern | Example |
|---|---|
| NSE_{IDX}_FUT_{YYYYMMDD}.csv | NSE_NIFTY50_FUT_20250327.csv |
| BSE_{IDX}_FUT_{YYYYMMDD}.csv | BSE_SENSEX_FUT_20250327.csv |

Indices with futures: NIFTY50, BANKNIFTY, FINNIFTY, NIFTY_MIDCAP_100, SENSEX


Phase 7 — Equity Options (per strike per expiry)

Pattern:

NSE_{SYM}_CE_{STRIKE}_{YYYYMMDD}.csv   ← Call
NSE_{SYM}_PE_{STRIKE}_{YYYYMMDD}.csv   ← Put

Example strikes for TCS (spot ~4000):

NSE_TCS_CE_3600_20250327.csv
NSE_TCS_CE_3800_20250327.csv
NSE_TCS_CE_4000_20250327.csv   ← ATM
NSE_TCS_CE_4200_20250327.csv
NSE_TCS_CE_4400_20250327.csv
NSE_TCS_PE_3600_20250327.csv
... (same strikes for PE)

Strike intervals: ₹100 for most equity options, ₹50 for some.
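
A strike ladder around the at-the-money (ATM) strike can be generated and turned into file names. This is a sketch with hypothetical function names; the ₹200 spacing in the usage lines is chosen only to reproduce the TCS example list above and is not an exchange rule.

```python
def strike_ladder(spot, interval, n_each_side):
    """Strikes from ATM - n*interval to ATM + n*interval, ATM being the
    multiple of `interval` nearest to spot."""
    atm = int(round(spot / interval)) * interval
    return [atm + k * interval for k in range(-n_each_side, n_each_side + 1)]

def option_filenames(exch, symbol, expiry, strikes):
    """Call files first, then put files, one per strike."""
    return [
        f"{exch}_{symbol}_{opt}_{strike}_{expiry}.csv"
        for opt in ("CE", "PE")
        for strike in strikes
    ]

strikes = strike_ladder(spot=4010, interval=200, n_each_side=2)
names = option_filenames("NSE", "TCS", "20250327", strikes)
```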


Phase 8 — Index Options (per strike per expiry)

Pattern:

NSE_NIFTY50_CE_{STRIKE}_{YYYYMMDD}.csv
NSE_NIFTY50_PE_{STRIKE}_{YYYYMMDD}.csv
BSE_SENSEX_CE_{STRIKE}_{YYYYMMDD}.csv

Example NIFTY50 strikes (spot ~24000):

NSE_NIFTY50_CE_22000_20250327.csv
NSE_NIFTY50_CE_22500_20250327.csv
NSE_NIFTY50_CE_23000_20250327.csv
NSE_NIFTY50_CE_23500_20250327.csv
NSE_NIFTY50_CE_24000_20250327.csv   ← ATM
NSE_NIFTY50_CE_24500_20250327.csv
NSE_NIFTY50_CE_25000_20250327.csv
NSE_NIFTY50_CE_25500_20250327.csv
NSE_NIFTY50_CE_26000_20250327.csv
... (mirror for PE)

Strike intervals: ₹50 for NIFTY50, ₹100 for BANKNIFTY, ₹100 for SENSEX. The NIFTY50 list above shows only every tenth strike (₹500 apart).


═══════════════════════════════════════════════════════════

instruments.yaml — HOW TO ADD NEW SYMBOLS

═══════════════════════════════════════════════════════════

Add new equity under the equity: section in config/instruments.yaml:

equity:
  RELIANCE:
    display_name: "Reliance Industries"
    bse_code: "500325"
    nse_symbol: "RELIANCE"
    isin: "INE002A01018"
    instrument_type: Equity
    listing_date: "1977-11-15"
    lot_size: 1
    tick_size: 0.05
    price_band_pct: 20.0
    file_prefix_bse: BSE_RELIANCE
    file_prefix_nse: NSE_RELIANCE
    is_fno: true
    fno_lot_size: 250
    eod_data:
      dhan: { start: "2000-01-03", adjusted: false }
      kite: { start: "2008-01-01", adjusted: true  }
      upstox: { start: "2008-01-01", adjusted: true }

For an ETF, use instrument_type: ETF. For an Index, use instrument_type: Index. For Futures, use instrument_type: Equity_Futures or Index_Futures. For Options, use instrument_type: Equity_Options or Index_Options.
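
Once instruments.yaml has been loaded into a dictionary (with whichever YAML parser the project uses), an entry can be sanity-checked against the conventions in this document. The sketch below is illustrative only: check_entry is a hypothetical helper, and which fields are mandatory is not specified here, so it checks only fields that are present.

```python
from datetime import date

ALLOWED_TYPES = {
    "Equity", "ETF", "Index",
    "Equity_Futures", "Index_Futures",
    "Equity_Options", "Index_Options",
}
KNOWN_SOURCES = {"dhan", "kite", "upstox"}

def check_entry(symbol, entry):
    """Return the names of fields that look inconsistent (empty list if none)."""
    problems = []
    if entry.get("instrument_type") not in ALLOWED_TYPES:
        problems.append("instrument_type")
    # File prefixes should follow the {EXCH}_{SYMBOL} naming convention.
    for exch in ("bse", "nse"):
        prefix = entry.get(f"file_prefix_{exch}")
        if prefix is not None and prefix != f"{exch.upper()}_{symbol}":
            problems.append(f"file_prefix_{exch}")
    for source, meta in entry.get("eod_data", {}).items():
        if source not in KNOWN_SOURCES:
            problems.append(f"eod_data.{source}")
            continue
        try:
            date.fromisoformat(str(meta["start"]))
        except (KeyError, ValueError):
            problems.append(f"eod_data.{source}.start")
    return problems

reliance = {
    "instrument_type": "Equity",
    "file_prefix_bse": "BSE_RELIANCE",
    "file_prefix_nse": "NSE_RELIANCE",
    "eod_data": {
        "dhan": {"start": "2000-01-03", "adjusted": False},
        "kite": {"start": "2008-01-01", "adjusted": True},
        "upstox": {"start": "2008-01-01", "adjusted": True},
    },
}
problems = check_entry("RELIANCE", reliance)
```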


═══════════════════════════════════════════════════════════

REFERENCE DATA FILES (Future / Optional)

═══════════════════════════════════════════════════════════

These are not required now but enable advanced tests when available:

| Path pattern | Example / contents | Purpose |
|---|---|---|
| data/raw/constituents/{INDEX}.csv | e.g. NIFTY50.csv | Index reconstruction test (IDX-001) |
| data/raw/nav/{ETF}.csv | e.g. NIFTYBEES.csv | ETF NAV vs close (ETF-001) |
| data/raw/corporate_actions/{SYMBOL}.csv | Corporate action (CA) dates | Better CA detection |

Constituent file format:

date, constituent, weight, close
2025-01-01, TCS, 0.0412, 4100.50
2025-01-01, RELIANCE, 0.0890, 1280.30

NAV file format:

date, nav
2025-01-01, 244.8500
2025-01-02, 246.1200
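
As an illustration of how the NAV file might feed the ETF NAV vs close comparison, the sketch below reads the NAV layout and computes the usual premium/discount percentage. Both function names are hypothetical, and the exact metric used by ETF-001 is not specified in this document; this is the conventional (close − NAV) / NAV definition.

```python
import csv
import io

def read_nav(stream):
    """Map date string -> NAV."""
    # skipinitialspace handles the space after each comma in the layout above.
    reader = csv.DictReader(stream, skipinitialspace=True)
    return {rec["date"]: float(rec["nav"]) for rec in reader}

def premium_pct(close, nav):
    """Premium (positive) or discount (negative) of close to NAV, in percent."""
    return (close - nav) / nav * 100.0

sample = io.StringIO(
    "date, nav\n"
    "2025-01-01, 244.8500\n"
    "2025-01-02, 246.1200\n"
)
nav = read_nav(sample)
```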

═══════════════════════════════════════════════════════════

EXPECTED FILE COUNTS BY PHASE

═══════════════════════════════════════════════════════════

| Phase | Instrument | Files (approx.) |
|---|---|---|
| 1 | 3 equity stocks (BSE+NSE, 3 sources, EOD+Intraday) | 36 |
| 2 | 30 equity stocks (same) | 360 |
| 3 | 6 ETFs (BSE+NSE, 3 sources, EOD+Intraday) | 72 |
| 4 | 6 indices (primary exchange, 3 sources, EOD+Intraday) | 36 |
| 5 | 30 equity F&O symbols × 12 monthly contracts × 3 sources | 1,080 |
| 6 | 5 index futures × 12 monthly × 3 sources × 2 exchanges | 360 |
| 7 | Equity options: 30 symbols × ~10 strikes × 2 (CE/PE) × 12 expiries × 3 sources | ~21,600 |
| 8 | Index options: 5 indices × ~50 strikes × 2 × 52 weekly × 3 sources | ~78,000 |

Practical note: Phases 7 and 8 generate enormous file counts. For options, it’s common to store only liquid strikes (ATM ± 5 strikes) and near-term expiries (current + next 2 months). That brings Phases 7 and 8 combined down to roughly 3,000 files.
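
The approximate counts in the table are plain products of the factors listed in each row, which the following sketch reproduces. The dictionary names are illustrative; the factors are copied from the table as written.

```python
from math import prod

# Factors per phase, in the order they appear in the table rows.
PHASE_FACTORS = {
    1: (3, 2, 3, 2),        # stocks, exchanges, sources, EOD + intraday
    2: (30, 2, 3, 2),
    3: (6, 2, 3, 2),        # ETFs
    4: (6, 1, 3, 2),        # indices, primary exchange only
    5: (30, 12, 3),         # symbols, monthly contracts, sources
    6: (5, 12, 3, 2),       # index futures, monthly, sources, exchanges
    7: (30, 10, 2, 12, 3),  # symbols, strikes, CE/PE, expiries, sources
    8: (5, 50, 2, 52, 3),   # indices, strikes, CE/PE, weekly, sources
}

PHASE_COUNTS = {phase: prod(f) for phase, f in PHASE_FACTORS.items()}
TOTAL = sum(PHASE_COUNTS.values())
```

Changing a single factor (for example, the number of strikes kept per expiry) shows immediately how sensitive the option phases are to storage policy.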


═══════════════════════════════════════════════════════════

NOTES ON DATA SOURCES

═══════════════════════════════════════════════════════════

Dhan API

  • EOD: /charts/historical endpoint → download as CSV
  • Intraday: /charts/intraday endpoint → 1-min bars
  • Provides: security_id, unadjusted prices
  • History: Some symbols from 2000+

Kite (Zerodha) API

  • EOD: /instruments/historical/{instrument_token}/{interval} endpoint
  • Intraday: Same endpoint with minute interval
  • Provides: split-adjusted prices, from ~2008 for most symbols

Upstox API

  • EOD: /v2/historical-candle/{symbol}/{interval} endpoint
  • Intraday: Same with 1minute interval
  • Provides: split-adjusted, similar history to Kite
  • Known issue: adjusted_close is NaN for some symbols (confirmed for HERO_MOTOCORP)