Collection Schedule
Data collection runs on five named profiles. Four of them (overnight, extended, full, market) are driven by a static 40-slot list in app/config.py:SCHEDULE. The fifth (hot) runs on its own 5-minute timer, only during regular trading hours (RTH).
The five profiles
| Profile | When | Cadence | Sources |
|---|---|---|---|
| overnight | 00:00, 02:00, 04:00, 06:00, 08:00, 20:00, 22:00 ET | ~2h off-hours | news, liquidity |
| extended | 08:30, 09:00, 09:15, 16:30, 17:00, 18:00 ET | Pre / post-market | market, news, vol_structure, liquidity, credit, fred, energy, macro_indicators, cot, aaii |
| full | 09:31, 12:00, 15:45, 16:01 ET | 4× per trading day | All 16 sources |
| market | 09:45 – 15:30 ET, every 15 min | 15 min during RTH | news |
| hot | 09:30 – 16:00 ET, every 5 min | hot_looper (independent) | market, vol_structure, energy |
The static SCHEDULE carries 40 time slots across the four background-looper profiles. The hot profile is not in SCHEDULE — it runs on its own interval and only during RTH.
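The slot count can be reproduced from the table. The sketch below is only an illustration: the `(time, profile)` tuple layout and the `every` helper are assumptions, not the actual shape of app/config.py:SCHEDULE. It also assumes that the 12:00 quarter-hour tick runs as full rather than market, which is one reading under which the four profiles add up to exactly 40 slots.

```python
from collections import Counter

def every(start: str, end: str, step_min: int) -> list[str]:
    """Return HH:MM strings from start to end (inclusive), step_min apart."""
    sh, sm = map(int, start.split(":"))
    eh, em = map(int, end.split(":"))
    return [f"{m // 60:02d}:{m % 60:02d}"
            for m in range(sh * 60 + sm, eh * 60 + em + 1, step_min)]

# Times copied from the profile table (all ET).
OVERNIGHT = ["00:00", "02:00", "04:00", "06:00", "08:00", "20:00", "22:00"]
EXTENDED = ["08:30", "09:00", "09:15", "16:30", "17:00", "18:00"]
FULL = ["09:31", "12:00", "15:45", "16:01"]
# Assumption: a market tick that collides with a full slot (12:00) yields to it.
MARKET = [t for t in every("09:45", "15:30", 15) if t not in FULL]

# Hypothetical layout: a time-sorted list of (time_et, profile) pairs.
SCHEDULE = sorted(
    [(t, "overnight") for t in OVERNIGHT]
    + [(t, "extended") for t in EXTENDED]
    + [(t, "full") for t in FULL]
    + [(t, "market") for t in MARKET]
)

counts = Counter(profile for _, profile in SCHEDULE)
assert len(SCHEDULE) == 40  # 7 overnight + 6 extended + 4 full + 23 market
print(dict(counts))
```

The hot profile is deliberately absent from the list, matching the text: it is driven by its own interval rather than by a slot.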
Why five profiles
Each profile is a tradeoff between freshness and load. The shapes:
- overnight: the markets are closed, but news and Fed liquidity keep moving. A light two-source pull every ~2h keeps the dashboard’s news feed warm and catches overnight Fed balance-sheet operations without hammering yfinance.
- extended: the pre-market (08:30, 09:00, 09:15) and post-market (16:30, 17:00, 18:00) windows. Pulls the macro skeleton plus weekly positioning (cot, aaii) so the open and close both ship with fresh data.
- full: the four anchor cycles, at open (09:31), midday (12:00), pre-close (15:45), and close (16:01). Every source fires, including the slow yfinance breadth batch.
- market: the 15-minute intraday cadence during RTH. Pulls only news, the one genuinely intraday-fast source. credit, liquidity, and fred were dropped from this tick (2026-06): they publish on a daily or weekly cadence, so refetching them every 15 min was wasted work. They stay covered by extended (6 slots) and full (4 slots) through the trading day.
- hot: the 5-minute “fast pulse” during RTH. Three sources only (market, vol_structure, energy) so the dashboard’s health score and intraday sparklines stay near-live without burning the Schwab budget.
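The coverage claim for the sources dropped from the market tick can be checked with a quick tally. This is a minimal sketch using the slot counts and source lists from the profile table. `PROFILE_TABLE`, `ALL`, and `daily_pulls` are hypothetical names, and `ALL` stands in for full's 16 sources, which this page does not enumerate. The market and hot profiles are left out because neither fetches credit, liquidity, or fred.

```python
# Stand-in for "every source"; the 16 names are not listed on this page.
ALL = None

# profile -> (slots per day, sources fetched), copied from the profile table.
PROFILE_TABLE = {
    "overnight": (7, {"news", "liquidity"}),
    "extended": (6, {"market", "news", "vol_structure", "liquidity", "credit",
                     "fred", "energy", "macro_indicators", "cot", "aaii"}),
    "full": (4, ALL),
}

def daily_pulls(source: str) -> dict[str, int]:
    """Map each profile that fetches `source` to its number of daily slots."""
    return {
        name: slots
        for name, (slots, sources) in PROFILE_TABLE.items()
        if sources is ALL or source in sources
    }

for src in ("credit", "liquidity", "fred"):
    pulls = daily_pulls(src)
    print(src, pulls, "total:", sum(pulls.values()))
```

Under these assumptions credit and fred are still fetched 10 times a day (6 extended + 4 full), and liquidity 17 times once the overnight slots are counted, which is ample for data that publishes daily or weekly.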
Cycle locking
generate_report_logic() is wrapped by a module-level asyncio.Lock (_report_lock). The startup invocation and the two loopers all queue on the same lock — a full cycle that runs long won’t get clobbered by the next market tick; the market cycle just waits its turn. Cycle-shared globals (_source_cache, _prev_context) are mutated inside the lock; any code that reads or writes them from outside the cycle must take the lock first. C5 (2026-06): the slow post-capture re-render now runs outside _report_lock (its report.md write is serialized by a dedicated _render_lock), so a long full cycle no longer delays a queued hot cycle by its re-render duration.
Release-aware scheduling (forward-looking)
The signal registry’s release_calendar slot (declared in signal_definitions.json) carries {publishes, weekday, approximate_time_et, lag_days} for every metric. A release-aware scheduler (queued in IDEAS.md) will read these to fire on the upstream publish window rather than the static SCHEDULE — e.g. AAII fires at the Thursday window, not on every slot that happens to include aaii. The current SCHEDULE + PROFILES design is the bootstrap; the registry shape is the destination.
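Since the release-aware scheduler is still only an idea, the following is a speculative sketch of how a release_calendar entry might gate a fetch. The four field names come from the text. The value formats, the placeholder publish time, and the `in_release_window` helper are assumptions, and `lag_days` is carried but not used here.

```python
from datetime import datetime, timedelta

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical registry entry: field names from the text, values are placeholders.
AAII_RELEASE = {
    "publishes": "weekly",
    "weekday": "Thursday",
    "approximate_time_et": "06:00",  # placeholder, not the real publish time
    "lag_days": 1,                   # not used in this sketch
}

def in_release_window(entry: dict, now_et: datetime, window: timedelta) -> bool:
    """True if now_et (naive, already in ET) is within `window` after publish."""
    if entry["publishes"] == "weekly" and WEEKDAYS[now_et.weekday()] != entry["weekday"]:
        return False
    hh, mm = map(int, entry["approximate_time_et"].split(":"))
    publish = now_et.replace(hour=hh, minute=mm, second=0, microsecond=0)
    return publish <= now_et < publish + window

window = timedelta(minutes=30)
thursday = datetime(2026, 6, 4, 6, 10)  # 2026-06-04 is a Thursday
friday = datetime(2026, 6, 5, 6, 10)
print(in_release_window(AAII_RELEASE, thursday, window))
print(in_release_window(AAII_RELEASE, friday, window))
```

With a gate like this, a weekly source would be fetched once near its publish window instead of on every slot whose profile happens to list it.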
See also
- Data sources — what each source pulls and from where.
- Lifecycle — what happens inside one cycle, end-to-end.
- Source health — how cycle outcomes get tracked.
- Code: app/config.py:SCHEDULE / PROFILES, app/main.py:background_looper / hot_looper.