AIQ Deep Research
Multi-step research powered by the NVIDIA AI Q&A (AIQ) Deep Research blueprint: planner → researcher → synthesizer → citer. Fans sub-queries out to Tavily + Finnhub + EDGAR in parallel, then produces a Bloomberg-grade markdown note with inline citations.
What it is · how it works · why it matters
Multi-step research using NVIDIA's AI Q&A (AIQ) Deep Research blueprint. Plans sub-queries, fans them out in parallel, synthesizes a Bloomberg-grade note with inline citations.
Pipeline: planner → researcher → synthesizer → citer. Planner (Nemotron Super) decomposes into 3-5 sub-queries. Researcher fans out to Tavily / Finnhub / EDGAR in parallel. Synthesizer joins; citer inserts inline links.
Single-shot LLM answers tend to produce ungrounded citations. The blueprint's planner/synthesizer separation grounds each claim in retrieval results. Our internal eval gold set measures factuality differences vs. single-shot baselines.
Overview
Two engines, picked at request time:
- AIQ Deep Research adapter (literal blueprint) — straight port of the NVIDIA AI Q&A Deep Research blueprint. Uses the blueprint's planner + synthesizer prompts unchanged.
- In-house variant — adds finance-specific tool calls (Finnhub fundamentals, EDGAR filings, news ranking) on top of the same Deep Research skeleton.
Both live in src/traderspace/agents/, in aiq_deep_research_adapter.py and deep_research_agent.py. The Research workbench uses the in-house variant by default; the chat tool deep_research can pick either.
Pipeline
question
  ↓
PLANNER (Nemotron 3 Super)
  produces N sub-queries, each tagged with a source
  ↓
RESEARCHER (parallel fan-out)
  for each sub-query:
    pick the right source (Tavily for web, Finnhub for fundamentals, EDGAR for filings)
    pull top K results
    summarize per-result
  ↓
SYNTHESIZER (Nemotron 3 Super)
  joins all summaries into one coherent markdown
  ↓
CITER
  inserts inline citation links keyed back to source URLs / filing IDs
  ↓
markdown answer + sources list
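The flow above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code in the agent files: plan, research, and synthesize_and_cite are hypothetical stand-ins that return placeholder data instead of calling Nemotron, Tavily, Finnhub, or EDGAR. The sketch only shows the shape of the parallel fan-out and how citation markers can be keyed back to a sources list.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class SubQuery:
    text: str
    source: str  # "tavily", "finnhub", or "edgar"


@dataclass
class Finding:
    summary: str
    ref: str  # source URL or filing ID


async def plan(question: str) -> list[SubQuery]:
    # Hypothetical stand-in for the planner LLM call: one sub-query per source.
    return [SubQuery(f"{question} ({s})", s) for s in ("tavily", "finnhub", "edgar")]


async def research(sq: SubQuery, k: int = 3) -> list[Finding]:
    # Hypothetical stand-in for a real source client; returns k placeholder findings.
    await asyncio.sleep(0)  # yield control, as a network call would
    return [Finding(f"summary {i} for {sq.text}", f"{sq.source}://result/{i}") for i in range(k)]


def synthesize_and_cite(findings: list[Finding]) -> tuple[str, list[str]]:
    # Stand-in for synthesizer + citer: number each source once, tag each claim with its number.
    sources: list[str] = []
    lines = []
    for f in findings:
        if f.ref not in sources:
            sources.append(f.ref)
        lines.append(f"- {f.summary} [{sources.index(f.ref) + 1}]")
    return "\n".join(lines), sources


async def deep_research(question: str, k: int = 3) -> tuple[str, list[str]]:
    sub_queries = await plan(question)
    # Fan out: all sub-queries run concurrently; gather preserves input order.
    batches = await asyncio.gather(*(research(sq, k) for sq in sub_queries))
    findings = [f for batch in batches for f in batch]
    return synthesize_and_cite(findings)
```

Calling asyncio.run(deep_research("What is driving NVDA short interest?", k=2)) returns the markdown body and the ordered sources list. Keeping the citation index next to each finding until the final join is one way to avoid losing the citation map during synthesis.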
How to use it
From the Research page
Type a question into the DeepResearch box. The agent streams its plan, then each sub-query result as it arrives, then the synthesized answer.
From chat
Ask any sufficiently broad question in the embedded chat. Kimi will route to deep_research if the question looks research-shaped (multi-source, multi-step).
From the REST API
curl -X POST http://127.0.0.1:8015/api/research/deep \
  -H 'content-type: application/json' \
  -d '{"symbol":"NVDA","question":"What is driving short interest the last 5 days?"}'
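The same call can be made from Python with only the standard library. This is a sketch rather than an official client: the helper names build_deep_research_request and stream_lines are made up for illustration, and the host, port, path, and body fields are copied from the curl example. Sending the request requires the local server to be running.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8015"  # host and port taken from the curl example


def build_deep_research_request(symbol: str, question: str) -> urllib.request.Request:
    # Build the POST request without sending it.
    body = json.dumps({"symbol": symbol, "question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/research/deep",
        data=body,
        headers={"content-type": "application/json"},
        method="POST",
    )


def stream_lines(req: urllib.request.Request):
    # Send the request and yield the response line by line as it arrives.
    with urllib.request.urlopen(req) as resp:
        for raw in resp:
            yield raw.decode("utf-8").rstrip("\r\n")
```

A typical use is to loop over stream_lines(build_deep_research_request("NVDA", "What is driving short interest the last 5 days?")) and print each line as it is received.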
What good answers look like
Replies are evaluated on citation grounding. A well-formed DeepResearch reply:
- Has at least 3 inline citations.
- Names the source by domain (e.g. "(seekingalpha.com)", "(NVDA 10-Q · 2025-04-30)").
- Surfaces contradictions between sources when they disagree, and says so explicitly.
- Includes a "what would change my mind" closing paragraph.
If the answer is one long paragraph with no inline citations, the synthesizer pass lost the citation map — usually because the researcher pass returned too many short results for the synthesizer prompt window. Rerun with a tighter question.
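Two of the criteria above are easy to check mechanically. The sketch below is a rough heuristic and not the project's evaluator: it assumes citations appear in parentheses in the two shapes shown in the examples (a bare domain, or ticker + form type + date), and the function name check_reply is hypothetical. Citations rendered as full markdown links would need a different pattern.

```python
import re

# Matches the two citation shapes from the examples; anything else is ignored.
CITATION = re.compile(
    r"\((?:[a-z0-9-]+\.)+[a-z]{2,}\)"                          # (seekingalpha.com)
    r"|\([A-Z]{1,5} (?:10-K|10-Q|8-K) · \d{4}-\d{2}-\d{2}\)"   # (NVDA 10-Q · 2025-04-30)
)


def check_reply(markdown: str, min_citations: int = 3) -> dict:
    # Paragraphs are separated by blank lines; the closing one is the last non-empty block.
    paragraphs = [p for p in markdown.strip().split("\n\n") if p.strip()]
    citations = CITATION.findall(markdown)
    closing = paragraphs[-1].lower() if paragraphs else ""
    return {
        "citation_count": len(citations),
        "enough_citations": len(citations) >= min_citations,
        "has_closing_paragraph": "what would change my mind" in closing,
    }
```

A reply that comes back as one uncited paragraph yields citation_count 0, which is a quick signal to rerun with a tighter question.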
Tuning the pipeline
- Planner sub-query count: N_SUBQUERIES in the agent file. Default 4. Raise for harder questions.
- Per-source results: K_RESULTS. Default 3 Tavily, 5 Finnhub news.
- Model: defaults to Nemotron Super 120B; switch via the model arg.
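To see how these knobs interact, the sketch below models them as a small config object. It is illustrative only: the class DeepResearchTuning, the dictionary keys, and the model string are assumptions, not names from the agent file, which exposes N_SUBQUERIES, K_RESULTS, and a model argument. The helper gives a rough upper bound on how many result summaries reach the synthesizer.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DeepResearchTuning:
    n_subqueries: int = 4  # mirrors the documented N_SUBQUERIES default
    # Mirrors the documented K_RESULTS defaults; the key names are hypothetical.
    k_results: dict = field(default_factory=lambda: {"tavily": 3, "finnhub_news": 5})
    model: str = "nemotron-super-120b"  # placeholder identifier, not a confirmed model name

    def max_results(self) -> int:
        # Upper bound: every sub-query routed to the source with the largest K.
        return self.n_subqueries * max(self.k_results.values())
```

With the defaults the bound is 20 summaries; raising the sub-query count to 6 lifts it to 30. That growth is worth keeping in mind, since too many results can overflow the synthesizer prompt window.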
REST surface
| Verb | Path | Purpose |
|---|---|---|
| POST | /api/research/deep | Body: {symbol, question, engine?}. Streams SSE. |
| GET | /api/research/deep/history?limit=10 | Recent runs (audit). |
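Because the POST endpoint streams Server-Sent Events, a client has to group the incoming lines into events. The parser below is a minimal sketch of the standard SSE framing (event: and data: fields, with a blank line ending each event). The event names this service emits are not documented here, so the names in the usage example are invented.

```python
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[tuple[str, str]]:
    # Yield (event_name, data) pairs; a blank line terminates each event.
    event, data = "message", []
    for line in lines:
        line = line.rstrip("\r\n")
        if line == "":
            if data:
                yield event, "\n".join(data)
            event, data = "message", []  # reset for the next event
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            name, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # the format allows one optional space after the colon
            if name == "event":
                event = value
            elif name == "data":
                data.append(value)
```

For example, feeding it the lines "event: plan", 'data: {"n": 4}', and "" yields a single ("plan", '{"n": 4}') pair. An event that is not followed by a blank line is never dispatched, which matches the SSE framing rules.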