Agent roster (36)
All 36 agent identities, by tier: 36 core agents on the A2A event bus (32 active by default, plus 4 example agents that stay inactive until activated via the Agent Builder).
- Tier 4b includes the Phase 2-6 lifecycle agents: BenchmarkingAgent, ValidationNarrationAgent, ThesisTranslationAgent, RealizedCompareAgent, PostTradeAnalystAgent.
- Tier 4a includes CapitalAllocationAgent + CapitalAllocationApprovalAgent (multi-sleeve capital allocator).
- Tier 8 wires NVIDIA NeMo-RL via NeMoRLTrainingAgent + NeMoRLFeedbackAgent for LLM post-training (DPO/GRPO/SFT).
- Tier 8b (Phase 8 DPO closure) ships PolicyPromotionAgent + InferenceServerAgent: completed training runs auto-validate, auto-promote to the policy registry, and hot-reload vLLM on localhost:8024 so the next narration call hits the trained policy via policy_router.
- Tier 2/3 · Signal Discovery (new in Phase 9) adapts the NVIDIA-AI-Blueprints quantitative-signal-discovery-agent closed loop: SignalGeneratorAgent proposes JSON-AST formulas over a 66-operator vocabulary, SignalCodeGeneratorAgent compiles the AST through a strict whitelist (no exec() on LLM output), OptimizationAdvisorAgent critiques rejections, and SignalDiscoveryOrchestratorAgent drives the iteration loop. Accepted formulas promote to sleeves via Grinold-Kahn α-tilt on cuFOLIO scenarios.
Each agent name links to its full skill card. The Model column links to the model card on build.nvidia.com for agents that call an LLM.
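The "strict whitelist, no exec()" compile step described for SignalCodeGeneratorAgent can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the node shape ({"op": ..., "args": [...]}), the three stand-in operators, and the use of plain Python lists instead of pandas/numpy. Only the names OPERATOR_REGISTRY and the TS_StdDev→TS_Std alias come from the roster; the real registry and AST schema may differ.

```python
from statistics import pstdev

# Hypothetical stand-ins for the 66-operator vocabulary: name -> (arity, function).
# Series are plain lists of floats here; the roster describes pandas/numpy callables.
OPERATOR_REGISTRY = {
    "Add": (2, lambda a, b: [x + y for x, y in zip(a, b)]),
    "Neg": (1, lambda a: [-x for x in a]),
    "TS_Std": (1, lambda a: [pstdev(a)] * len(a)),
}

# Spelling variants resolve to a canonical operator (alias pair taken from the roster).
ALIASES = {"TS_StdDev": "TS_Std"}


def compile_ast(node):
    """Validate a JSON-AST node and return a callable taking a dict of columns."""
    if isinstance(node, str):
        # A bare string names an input column such as "close".
        return lambda data: data[node]
    if not isinstance(node, dict) or set(node) != {"op", "args"}:
        raise ValueError(f"malformed node: {node!r}")
    name, args = node["op"], node["args"]
    if not isinstance(name, str) or not isinstance(args, list):
        raise ValueError(f"malformed node: {node!r}")
    op = ALIASES.get(name, name)
    if op not in OPERATOR_REGISTRY:
        # Anything outside the whitelist is rejected; nothing is ever eval'd or exec'd.
        raise ValueError(f"operator not whitelisted: {name}")
    arity, fn = OPERATOR_REGISTRY[op]
    if len(args) != arity:
        raise ValueError(f"{op} expects {arity} argument(s), got {len(args)}")
    children = [compile_ast(arg) for arg in args]
    return lambda data: fn(*(child(data) for child in children))


# Usage: close - open, expressed as Add(close, Neg(open)).
formula = {"op": "Add", "args": ["close", {"op": "Neg", "args": ["open"]}]}
signal = compile_ast(formula)
spread = signal({"close": [3.0, 5.0], "open": [1.0, 2.0]})
```

Because the compiler only ever looks names up in a fixed table, an LLM-emitted node such as {"op": "__import__", "args": ["os"]} fails validation instead of reaching the interpreter.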
Roster
| tier | agent | identity | model | subscribes | emits | bus? |
|---|---|---|---|---|---|---|
| 1 · Data | DataAgent | I curate market data, flag anomalies, and kick off a HybridRAG refresh whenever fresh data lands. | nvidia/nemotron-3-super-120b-a12b ↗ | Scheduler.tick.eod | DataReady, DataAnomaly, ResearchKickoff | live |
| 1.5 · Engineering | FeatureEngineeringAgent | I turn raw data into model-ready features. | nvidia/nemotron-3-super-120b-a12b ↗ | DataReady | FeaturesReady | live |
| 2 · Research | PredictiveModelingAgent | I train ML models and forecast returns. | nvidia/nemotron-3-super-120b-a12b ↗ | FeaturesReady | PredictionReady | live |
| 2 · Research | DeepResearchAgent | I run multi-step AIQ Deep Research (planner → researcher → synthesizer → citer). | nvidia/nemotron-3-super-120b-a12b ↗ | DeepResearchRequested | DeepResearchComplete | live |
| 2 · Research | FundamentalAgent | I read 10-Ks, 10-Qs, transcripts. | nvidia/nemotron-3-super-120b-a12b ↗ | FeaturesReady | ResearchComplete | live |
| 2 · Research | TechnicalAgent | I read price and volume only. | nvidia/nemotron-3-super-120b-a12b ↗ | FeaturesReady | ResearchComplete | live |
| 2 · Research | SentimentAgent | I read what humans are saying. | nvidia/nemotron-3-super-120b-a12b ↗ | FeaturesReady | ResearchComplete | live |
| 2 · Research | AIFactorAgent | I find latent factors and regimes. | nvidia/nemotron-3-super-120b-a12b ↗ | FeaturesReady | ResearchComplete, RegimeTag | live |
| 3 · Synthesis | SignalAgent | I fuse every research view (technical · fundamental · sentiment · ai-factor · predictive · deep-research · hybridrag) and the regime tag into a per-ticker conviction signal. When top-candidate conviction is weak (< 0.30) I auto-kick a DeepResearchRequested on that name so the next cascade has stronger views. | nvidia/nemotron-3-super-120b-a12b ↗ | ResearchComplete, PredictionReady, DeepResearchComplete, HybridRAGComplete, RegimeTag | SignalProposed, DeepResearchRequested | live |
| 3 · Synthesis | MetaAgent | I challenge today's signal and learn from yesterday's. When my critique fails, I auto-fire a BacktestRequested so we know whether the failure is regime-local or systemic. | nvidia/nemotron-3-super-120b-a12b ↗ | SignalProposed, DailyReportReady | CritiqueClean, CritiqueFailed, StrategyTuned, BacktestRequested | live |
| 4 · Construction | BacktestAgent | I run candidate strategies on history; when a run clears the Sharpe + max-DD floor I promote it as a candidate the PM records. | nvidia/nemotron-3-super-120b-a12b ↗ | BacktestRequested | BacktestReport, BacktestStrategyPromoted | live |
| 4b · Lifecycle | BenchmarkingAgent | I run train/val/test split-aware sweeps; the test fold is evaluated exactly once. The val_test_gap field surfaces likely train/val overfit. | — (no LLM) | BenchmarkRequested | BenchmarkReport | live |
| 4b · Lifecycle | ValidationNarrationAgent | Phase 2. Nemotron 3 Super 120B narrates the 10-check ValidationReport. Routed through policy_router(decision_type="validation_narration") so a DPO'd policy serves when one is promoted; falls back to cloud Nemotron otherwise. Async daemon writes the narrative back atomically so the API returns immediately. | nvidia/nemotron-3-super-120b-a12b ↗ | ValidationReportReady | ValidationNarrationRequested, ValidationNarrationReady | live |
| 4b · Lifecycle | ThesisTranslationAgent | Phase 3. Plain-English thesis → typed StrategySpec via Nemotron 3 Super 120B. Routed through policy_router(decision_type="thesis_translation"). Handles non-numeric thresholds defensively (coerces to description suffix). | nvidia/nemotron-3-super-120b-a12b ↗ | ThesisTranslationRequested | ThesisTranslationReady | live |
| 4b · Lifecycle | RealizedCompareAgent | Phase 4. Triple-side compare: backtest expectation ↔ paper fills ↔ live fills joined by strategy_version_id. Surfaces slippage drift, behavior match, n_warnings. Subscribes to BenchmarkReport for auto-refresh. | — (no LLM) | BenchmarkReport, RealizedCompareRequested | RealizedCompareReady | live |
| 4b · Lifecycle | PostTradeAnalystAgent | Phase 6. Drift detection across 5 dimensions scored to [-1,+1]. Deterministic 7-action recommendation engine (hold / reduce / pause / re_run_validation / retrain / change_params / retire). Nemotron narrates the recommendation verbatim — never overrides the logic. | nvidia/nemotron-3-super-120b-a12b ↗ | RealizedCompareReady, PostTradeAnalysisRequested | PostTradeAnalysisReady | live |
| 10 · Regime | RegimeDetectorAgent | I match the live indicator state against the regime catalog; when nothing matches I draft a new YAML for the operator to review. | nvidia/nemotron-3-super-120b-a12b ↗ | FeaturesReady, RegimeScanRequested | RegimeMatchProposed, RegimeDraftProposed, RegimesChanged | live |
| 4 · Construction | PortfolioOptimizationAgent | I solve the CVaR problem (cuFOLIO) and blend the result with the active NemoRL policy when one is loaded. MetaAgent's CritiqueFailed gates me — I skip the rebalance and emit RebalanceSkipped. | nvidia/nemotron-3-super-120b-a12b ↗ | SignalProposed, CritiqueFailed | RebalanceProposed, RebalanceSkipped | live |
| 4 · Construction | PortfolioConstructionAgent | I make the math executable. | nvidia/nemotron-3-super-120b-a12b ↗ | RebalanceProposed | RebalanceConstructed | live |
| 4 · Construction | CapitalAllocationAgent | I split the household across sleeves before anyone sizes a trade. | nvidia/nemotron-3-super-120b-a12b ↗ | MultiSleeveRebalanceRequested | CapitalAllocationProposed | live |
| 4 · Construction | CapitalAllocationApprovalAgent | When the PM approves an allocation, I gate it through Compliance and make it the household's truth. | nvidia/nemotron-3-super-120b-a12b ↗ | CapitalAllocationApproved | AllocationCleared, AllocationBlocked, CapitalAllocated | live |
| 5 · Compliance | ComplianceAgent | Nothing trades unless it's clean. Phase 7: also runs at the strategy-definition layer on ValidationReportReady — scans the sleeve universe against the restricted list (GME/AMC/SPCE) and universe_size>50, emits ComplianceAdvisory regardless of finding count so the audit log proves a scan ran. | nvidia/nemotron-3-super-120b-a12b ↗ | RebalanceConstructed, CapitalAllocationApproved, ValidationReportReady | RebalanceCleared, RebalanceBlocked, AllocationCleared, AllocationBlocked, ComplianceAdvisory | live |
| 6 · Execution | ExecutionAgent | I place orders and follow the schedule. | nvidia/nemotron-3-super-120b-a12b ↗ | RebalanceApproved | OrderPlaced, OrderFilled, OrderCancelled | live |
| 6 · Execution | LiveMonitorAgent | I watch positions during the day. | nvidia/nemotron-3-super-120b-a12b ↗ | MarketTick | RiskBreach | live |
| 6 · Execution | ReportingAgent | I tell you what happened — KPIs + IS/VWAP/slippage. | nvidia/nemotron-3-super-120b-a12b ↗ | OrderFilled, Scheduler.tick.eod_close | DailyReportReady | live |
| 7 · Oversight | PortfolioManagerAgent | I am the PM. I observe every meaningful event on the A2A bus — research views, signal fusions, compliance verdicts, fills, RL retrains, AutoResearch sessions, HybridRAG retrievals, regime shifts — so when the operator asks me a question I can synthesize across the whole platform. I approve/reject rebalances, auto-fire HybridRAGQuery on >15% concentration risk, halt AutoResearch on hard compliance vetoes, and record candidate strategies. | nvidia/nemotron-3-super-120b-a12b ↗ | ResearchComplete, PredictionReady, RegimeTag, DeepResearchComplete, HybridRAGComplete, SignalProposed, CritiqueClean, CritiqueFailed, BacktestReport, BacktestStrategyPromoted, RebalanceProposed, RebalanceConstructed, RebalanceCleared, RebalanceBlocked, RebalanceSkipped, OrderPlaced, OrderFilled, OrderRejected, TrainNemoRLRequested, PreferenceModelUpdated, PreferenceRecorded, NeMoRLTrainingStarted, NeMoRLTrainingProgress, NeMoRLTrainingComplete, NeMoRLTrainingCancelled, CapitalAllocationProposed, AllocationCleared, AllocationBlocked, CapitalAllocated, DataAnomaly, RegimesChanged, RegimeMatchProposed, RegimeDraftProposed, PMChatQuery | RebalanceApproved, RebalanceRejected, PMChatResponse, CandidateStrategyRecorded, HybridRAGQuery, AutoResearchStop | live |
| 8 · Feedback | NeMoRLTrainingAgent | I launch NVIDIA NeMo-RL training runs (SFT/DPO/PPO/GRPO/DAPO/GDPO/RM/distillation) via the bridge — subprocess into the dedicated Python 3.13 env where nemo-rl 0.6.0 lives. | — (no LLM) | TrainNemoRLRequested | NeMoRLTrainingStarted | live |
| 8 · Feedback | NeMoRLFeedbackAgent | I count preference pairs from PreferenceLearningAgent (approve/reject on rebalances + narrations). Once enough accumulate, I emit TrainNemoRLRequested(algo='dpo') so the Nemotron policy gets a fresh DPO retrain on the latest user feedback. | — (no LLM) | PreferenceRecorded, RebalanceDecided | TrainNemoRLRequested | live |
| 8b · DPO closure | PolicyPromotionAgent | Phase 8. I close the DPO loop. On NeMoRLTrainingComplete I locate the per-run checkpoint, run NeMo-RL's DCP→HF converter as a subprocess, parse the final eval metrics from the run log, compare against baseline (prior active policy or fallback floor), and auto-promote the candidate to the policy registry when the gate passes. Emits PolicyCandidateRegistered on every completed run, PolicyPromoted on gate pass, PolicyPromotionFailed on validation failure. | — (no LLM) | NeMoRLTrainingComplete | PolicyCandidateRegistered, PolicyPromoted, PolicyPromotionFailed | live · new |
| 8b · DPO closure | InferenceServerAgent | Phase 8. I hot-reload vLLM on localhost:8024 whenever a policy is promoted. On PolicyPromoted I read the checkpoint path off the payload, stop the running vLLM (if any), start a fresh one against the new checkpoint, and emit LocalInferenceReloadStarted. During cold-start (~30-60s for an 8B model) policy_router falls back to build.nvidia.com automatically — no downtime in the demo. | — (no LLM — manages vLLM subprocess) | PolicyPromoted | LocalInferenceReloadStarted | live · new |
| 2 · Discovery | SignalGeneratorAgent | Phase 9. I propose alpha-signal formulas as JSON-AST trees over a 66-operator vocabulary (TS_*, CS_*, Rank_*, Decay_*, math, norm, data, cond — adapted from NVIDIA-AI-Blueprints quantitative-signal-discovery-agent calculator.json). Routed through policy_router("signal_generation", temp=0.8). When a prior iteration failed, I see the OptimizationAdvisor's critique + the best-so-far formula in my prompt (Grinold-Kahn-aware ranking) and propose strictly better candidates. | nvidia/nemotron-3-super-120b-a12b ↗ | SignalGenerationRequested | SignalCandidatesGenerated, SignalGenerationFailed | live · new |
| 3 · Discovery | SignalCodeGeneratorAgent | Phase 9. I validate the LLM-emitted JSON AST against a strict operator whitelist and compile each formula to a vectorized pandas/numpy callable via OPERATOR_REGISTRY. No exec() on LLM output — safer than the upstream blueprint's "LLM emits Python" step. Aliases (TS_StdDev→TS_Std, TS_ZScore→TS_Zscore) resolve transparently so common spelling variants don't waste an iteration. | — (no LLM — deterministic compile) | SignalCandidatesGenerated | SignalCodeCompiled, SignalCompilationFailed | live · new |
| 2 · Discovery | OptimizationAdvisorAgent | Phase 9. When a batch fails the acceptance gate (|IC|≥0.02 AND p≤0.05), I write concrete operator-level feedback for the next iteration — "Try TS_Rank instead of CS_Rank on the momentum factor, add TS_Zscore for vol normalization, gate when TS_Std rank is high." Routed through policy_router("optimization_advisor", temp=0.5). Few-shot prompt teaches the canonical critique shape (operator substitution + reasoning + predicted impact). | nvidia/nemotron-3-super-120b-a12b ↗ | IterationFailed | OptimizationAdviceGenerated, OptimizationAdviceFailed | live · new |
| 3 · Discovery | SignalDiscoveryOrchestratorAgent | Phase 9. I drive the closed-loop discovery workflow end-to-end. For each iteration: SignalGenerator → SignalCodeGenerator → evaluate (Mean IC + p-value + IR + decay + spread Sharpe/CAGR on real yfinance bars matching the operator's window) → acceptance gate → if rejected, OptimizationAdvisor critiques and the feedback binds the next generator call. On acceptance the formula persists to data/discovery/signals/<id>.json; promote-to-sleeve writes a configs/sleeves/discovered_*.yaml with the Grinold-Kahn α-tilt wiring for cuFOLIO. Universe is auto-resolved from intent via a fourth LLM role (universe_resolution). | nvidia/nemotron-3-super-120b-a12b ↗ | SignalDiscoveryRequested | SignalDiscoveryStarted, UniverseResolved, SignalCandidatesGenerated, SignalEvaluated, SignalRejected, SignalAccepted, OptimizationAdviceGenerated, SignalDiscoveryComplete, SignalDiscoveryFailed | live · new |
| 8 · Feedback | PreferenceLearningAgent | I turn every Approve/Override/Reject into a DPO training row. | — (no LLM) | OrderFilled, RebalanceApproved, RebalanceRejected | PreferenceModelUpdated | live · new |
| 8 · Feedback | AuditAgent | I write every meaningful bus event to immutable JSONL — full rebalance lifecycle, capital allocator lifecycle, NeMo-RL training lifecycle, the Phase 0-8 lifecycle (Benchmark, Validation, Thesis, RealizedCompare, PostTrade, Compliance advisories, policy candidate/promote/reload), critique verdicts, data anomalies. Phase 7: every audit row is tagged with strategy_version_id, strategy_id, and mode so post-hoc analysis can filter by version without parsing payload internals. 51 events subscribed — the chain is fully replayable. | — (no LLM) | (51 events) Trade lifecycle · NeMo-RL training lifecycle · Capital allocation lifecycle · Backtest/strategy lifecycle · Critique/tuning · Data integrity · Stress-regime lifecycle · Research/prediction · Phase 0-6 lifecycle (BenchmarkReport, BenchmarkRequested, ValidationReportReady, ValidationNarrationRequested, ValidationNarrationReady, ThesisTranslationReady, RealizedCompareReady, RealizedCompareRequested, PostTradeAnalysisReady, PostTradeAnalysisRequested, ComplianceAdvisory) · Phase 8 closure (PolicyCandidateRegistered, PolicyPromoted, PolicyPromotionFailed, LocalInferenceReloadStarted) · Phase 9 discovery (SignalDiscoveryStarted, UniverseResolved, SignalCandidatesGenerated, SignalEvaluated, SignalRejected, SignalAccepted, OptimizationAdviceGenerated, SignalDiscoveryComplete, llm.reasoning.* trace spans) | — | live |
| 2 · Research | MacroRegimeAgent | I tag the cross-asset regime from VIX, 10Y yield, and DXY. | — (no LLM) | FeaturesReady | RegimeTag | live |
| 2 · Research | InsiderActivityAgent | I score symbols by net insider buying from SEC Form 4 in the last 30 days. | — (no LLM) | FeaturesReady | ResearchComplete | live |
| 2 · Research | OptionsFlowAgent | I score symbols by call/put open-interest skew on the options chain. | — (no LLM) | FeaturesReady | ResearchComplete | live |
| 2 · Research | DividendQualityAgent | I score symbols by yield × payout sustainability × 5y dividend growth. | — (no LLM) | FeaturesReady | ResearchComplete | live |
| 2 · Research | HybridRAGAgent | I extract a typed knowledge graph from a company's filings and news, fuse it with vector retrieval, and answer multi-hop questions with citations. | nvidia/nemotron-3-super-120b-a12b ↗ | HybridRAGQuery, ResearchKickoff | HybridRAGComplete | live |
| 8 · Feedback | NemoRLAutoResearchOrchestrator | Karpathy-pattern meta-loop over NeMo-RL. Each iteration, Nemotron 3 Super proposes a typed config edit (KL penalty, learning rate, batch size); the loop launches the inner DPO/GRPO/SFT run via the NeMo-RL bridge, parses the eval metric, keeps or reverts. | nvidia/nemotron-3-super-120b-a12b ↗ | — | TrainNemoRLRequested | live |
Per-agent skill cards
Each agent has a dedicated skill card listing the events it subscribes to, the events it emits, its primary model, and its Python entry points. The cards mirror skills/agents/<id>.md in the repo and are regenerated from identities.py.
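As a rough illustration of how a card could be rendered from an identity record, here is a minimal sketch. The AgentIdentity dataclass, its field names, and the card layout are assumptions, not the actual contents of identities.py or the output of gen_skills.py. Python entry points are left out because the roster does not list them.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class AgentIdentity:
    # Hypothetical record shape; the real identities.py may differ.
    agent_id: str
    tier: str
    identity: str
    model: Optional[str]          # None for agents that do not call an LLM
    subscribes: Tuple[str, ...]
    emits: Tuple[str, ...]


def render_skill_card(agent: AgentIdentity) -> str:
    """Render one agent as a small markdown card."""
    lines = [
        f"# {agent.agent_id}",
        "",
        agent.identity,
        "",
        f"- Tier: {agent.tier}",
        f"- Model: {agent.model or 'none (no LLM)'}",
        f"- Subscribes: {', '.join(agent.subscribes) or 'none'}",
        f"- Emits: {', '.join(agent.emits) or 'none'}",
    ]
    return "\n".join(lines) + "\n"


# Values copied from the FeatureEngineeringAgent row of the roster.
card = render_skill_card(
    AgentIdentity(
        agent_id="FeatureEngineeringAgent",
        tier="1.5 · Engineering",
        identity="I turn raw data into model-ready features.",
        model="nvidia/nemotron-3-super-120b-a12b",
        subscribes=("DataReady",),
        emits=("FeaturesReady",),
    )
)
```

Keeping the record immutable (frozen=True) fits the idea of identities.py as the single source of truth: cards are derived output and are never edited by hand.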
Live bus status · activate / deactivate
Every agent currently registered on the live A2A event bus. Toggle active flags, fire a test trigger, and reach the Agent Builder.
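To make the subscribes/emits columns and the active flag concrete, the following is a minimal in-process sketch of a publish/subscribe bus. It is not the A2A bus implementation: the EventBus class, its method names, and the convention that a handler returns the events it emits are all hypothetical. Only the agent and event names are taken from the roster.

```python
from collections import deque


class EventBus:
    """Toy bus: handlers return a list of (event, payload) pairs to emit next."""

    def __init__(self):
        self._handlers = {}   # event name -> [(agent name, handler)]
        self.active = {}      # agent name -> bool

    def register(self, agent, event, handler, active=True):
        self._handlers.setdefault(event, []).append((agent, handler))
        self.active.setdefault(agent, active)

    def set_active(self, agent, flag):
        if agent not in self.active:
            raise KeyError(agent)
        self.active[agent] = flag

    def publish(self, event, payload):
        """Deliver breadth-first; return the (agent, event) deliveries in order.

        Assumes the subscription graph has no cycles, which holds for this demo.
        """
        delivered = []
        queue = deque([(event, payload)])
        while queue:
            name, body = queue.popleft()
            for agent, handler in self._handlers.get(name, []):
                if not self.active[agent]:
                    continue  # inactive agents stay registered but receive nothing
                delivered.append((agent, name))
                queue.extend(handler(body) or [])
        return delivered


bus = EventBus()
bus.register("FeatureEngineeringAgent", "DataReady",
             lambda p: [("FeaturesReady", p)])
bus.register("PredictiveModelingAgent", "FeaturesReady",
             lambda p: [("PredictionReady", p)])
bus.register("TechnicalAgent", "FeaturesReady",
             lambda p: [("ResearchComplete", p)])

# Fire a test trigger, then deactivate one agent and fire it again.
first_run = bus.publish("DataReady", {"as_of": "eod"})
bus.set_active("TechnicalAgent", False)
second_run = bus.publish("DataReady", {"as_of": "eod"})
```

A single DataReady event fans out through FeaturesReady to both downstream agents. After TechnicalAgent is deactivated, the same trigger reaches only the remaining active subscribers.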
Where to find them
- Identities: src/traderspace/agents/identities.py (single source of truth for the roster).
- Concrete bus agents: src/traderspace/bus/agents/ (21 modules + 4 example agents in examples/).
- Per-agent skill cards (markdown): skills/agents/<id>.md, auto-generated from identities.py.
- Per-agent docs pages (HTML): mockups/docs/agents/<id>.html, rendered from the markdown.
- Regenerate after editing identities:

    PYTHONPATH=src python scripts/gen_skills.py             # markdown skill cards
    PYTHONPATH=src python scripts/gen_agent_skill_docs.py   # HTML docs pages
    PYTHONPATH=src python scripts/gen_docs.py               # roster + all docs