
Agent roster (36)

All 36 agent identities, by tier. The 36 core agents sit on the A2A event bus: 32 are active by default, and 4 example agents stay inactive until activated via the Agent Builder.

Tier 4a includes CapitalAllocationAgent + CapitalAllocationApprovalAgent (multi-sleeve capital allocator).

Tier 4b includes the Phase 2-6 lifecycle agents: BenchmarkingAgent, ValidationNarrationAgent, ThesisTranslationAgent, RealizedCompareAgent, PostTradeAnalystAgent.

Tier 8 wires NVIDIA NeMo-RL via NeMoRLTrainingAgent + NeMoRLFeedbackAgent for LLM post-training (DPO/GRPO/SFT).

Tier 8b (Phase 8 DPO closure) ships PolicyPromotionAgent + InferenceServerAgent: completed training runs auto-validate, auto-promote to the policy registry, and hot-reload vLLM on localhost:8024 so the next narration call hits the trained policy via policy_router.

Tier 2/3 · Signal Discovery (new in Phase 9) adapts the NVIDIA-AI-Blueprints quantitative-signal-discovery-agent closed loop: SignalGeneratorAgent proposes JSON-AST formulas over a 66-operator vocabulary, SignalCodeGeneratorAgent compiles the AST through a strict whitelist (no exec() on LLM output), OptimizationAdvisorAgent critiques rejections, and SignalDiscoveryOrchestratorAgent drives the iteration loop. Accepted formulas are promoted to sleeves via a Grinold-Kahn α-tilt on cuFOLIO scenarios.

Each agent name links to its full skill card. The Model column links to the model card on build.nvidia.com for agents that call an LLM.
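
The whitelist compile step can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the node shape (`{"op": ..., "args": [...]}`), the three-operator registry, the use of a population standard deviation for `TS_Std`, and the names `compile_ast` and `SignalCompilationError` are hypothetical, not the project's actual `OPERATOR_REGISTRY`. Plain Python lists stand in for the pandas/numpy series the real compiler targets. The point it shows is that the LLM's JSON is only ever walked as data and mapped onto pre-registered functions, so nothing is passed to `exec()` or `eval()`.

```python
import json
import statistics
from typing import Callable, Dict, List

Series = List[float]
NAN = float("nan")


def ts_delta(x: Series, window: int) -> Series:
    """x[t] - x[t - window]; NaN until a full window is available."""
    return [x[i] - x[i - window] if i >= window else NAN for i in range(len(x))]


def ts_std(x: Series, window: int) -> Series:
    """Trailing population std over `window` points (an assumed convention)."""
    return [
        statistics.pstdev(x[i - window + 1 : i + 1]) if i >= window - 1 else NAN
        for i in range(len(x))
    ]


def neg(x: Series) -> Series:
    return [-v for v in x]


# Hypothetical three-operator whitelist; the real vocabulary has 66 operators.
OPERATORS: Dict[str, Callable] = {"TS_Delta": ts_delta, "TS_Std": ts_std, "Neg": neg}
# Alias taken from the roster text: TS_StdDev resolves to TS_Std.
ALIASES = {"TS_StdDev": "TS_Std"}


class SignalCompilationError(ValueError):
    """Raised when the AST contains anything outside the whitelist."""


def compile_ast(node) -> Callable[[Dict[str, Series]], object]:
    """Turn a JSON AST into a callable by lookup only, never by exec/eval."""
    if isinstance(node, str):  # leaf: name of a data field, e.g. "close"
        return lambda data: data[node]
    if isinstance(node, (int, float)):  # leaf: numeric constant, e.g. a window
        return lambda data: node
    if not isinstance(node, dict) or "op" not in node:
        raise SignalCompilationError(f"malformed node: {node!r}")
    name = ALIASES.get(node["op"], node["op"])
    if name not in OPERATORS:
        raise SignalCompilationError(f"operator not whitelisted: {node['op']!r}")
    fn = OPERATORS[name]
    args = [compile_ast(child) for child in node.get("args", [])]
    return lambda data: fn(*(arg(data) for arg in args))


# Usage: a formula arrives as JSON text and is compiled, then applied to data.
formula = json.loads('{"op": "Neg", "args": [{"op": "TS_Delta", "args": ["close", 1]}]}')
signal = compile_ast(formula)
result = signal({"close": [10.0, 11.0, 13.0]})  # first entry is NaN
```

An unknown operator name such as `"os.system"` never reaches any interpreter in this design; it fails the dictionary lookup and raises `SignalCompilationError`, which is the kind of failure a `SignalCompilationFailed` event would report.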

Roster

tier	agent	identity	model	subscribes	emits	bus status
1 · Data	`DataAgent`	I curate market data, flag anomalies, and kick off a HybridRAG refresh whenever fresh data lands.	`nvidia/nemotron-3-super-120b-a12b` ↗	`Scheduler.tick.eod`	`DataReady`, `DataAnomaly`, `ResearchKickoff`	live
1.5 · Engineering	`FeatureEngineeringAgent`	I turn raw data into model-ready features.	`nvidia/nemotron-3-super-120b-a12b` ↗	`DataReady`	`FeaturesReady`	live
2 · Research	`PredictiveModelingAgent`	I train ML models and forecast returns.	`nvidia/nemotron-3-super-120b-a12b` ↗	`FeaturesReady`	`PredictionReady`	live
2 · Research	`DeepResearchAgent`	I run multi-step AIQ Deep Research (planner → researcher → synthesizer → citer).	`nvidia/nemotron-3-super-120b-a12b` ↗	`DeepResearchRequested`	`DeepResearchComplete`	live
2 · Research	`FundamentalAgent`	I read 10-Ks, 10-Qs, transcripts.	`nvidia/nemotron-3-super-120b-a12b` ↗	`FeaturesReady`	`ResearchComplete`	live
2 · Research	`TechnicalAgent`	I read price and volume only.	`nvidia/nemotron-3-super-120b-a12b` ↗	`FeaturesReady`	`ResearchComplete`	live
2 · Research	`SentimentAgent`	I read what humans are saying.	`nvidia/nemotron-3-super-120b-a12b` ↗	`FeaturesReady`	`ResearchComplete`	live
2 · Research	`AIFactorAgent`	I find latent factors and regimes.	`nvidia/nemotron-3-super-120b-a12b` ↗	`FeaturesReady`	`ResearchComplete`, `RegimeTag`	live
3 · Synthesis	`SignalAgent`	I fuse every research view (technical · fundamental · sentiment · ai-factor · predictive · deep-research · hybridrag) and the regime tag into a per-ticker conviction signal. When top-candidate conviction is weak (< 0.30) I auto-kick a DeepResearchRequested on that name so the next cascade has stronger views.	`nvidia/nemotron-3-super-120b-a12b` ↗	`ResearchComplete`, `PredictionReady`, `DeepResearchComplete`, `HybridRAGComplete`, `RegimeTag`	`SignalProposed`, `DeepResearchRequested`	live
3 · Synthesis	`MetaAgent`	I challenge today's signal and learn from yesterday's. When my critique fails, I auto-fire a BacktestRequested so we know whether the failure is regime-local or systemic.	`nvidia/nemotron-3-super-120b-a12b` ↗	`SignalProposed`, `DailyReportReady`	`CritiqueClean`, `CritiqueFailed`, `StrategyTuned`, `BacktestRequested`	live
4 · Construction	`BacktestAgent`	I run candidate strategies on history; when a run clears the Sharpe + max-DD floor I promote it as a candidate the PM records.	`nvidia/nemotron-3-super-120b-a12b` ↗	`BacktestRequested`	`BacktestReport`, `BacktestStrategyPromoted`	live
4b · Lifecycle	`BenchmarkingAgent`	I run train/val/test split-aware sweeps; the test fold is evaluated exactly once. The `val_test_gap` field surfaces likely train/val overfit.	— (no LLM)	`BenchmarkRequested`	`BenchmarkReport`	live
4b · Lifecycle	`ValidationNarrationAgent`	Phase 2. Nemotron 3 Super 120B narrates the 10-check ValidationReport. Routed through `policy_router(decision_type="validation_narration")` so a DPO'd policy serves when one is promoted; falls back to cloud Nemotron otherwise. Async daemon writes the narrative back atomically so the API returns immediately.	`nvidia/nemotron-3-super-120b-a12b` ↗	`ValidationReportReady`	`ValidationNarrationRequested`, `ValidationNarrationReady`	live
4b · Lifecycle	`ThesisTranslationAgent`	Phase 3. Plain-English thesis → typed `StrategySpec` via Nemotron 3 Super 120B. Routed through `policy_router(decision_type="thesis_translation")`. Handles non-numeric thresholds defensively (coerces to description suffix).	`nvidia/nemotron-3-super-120b-a12b` ↗	`ThesisTranslationRequested`	`ThesisTranslationReady`	live
4b · Lifecycle	`RealizedCompareAgent`	Phase 4. Triple-side compare: backtest expectation ↔ paper fills ↔ live fills joined by `strategy_version_id`. Surfaces slippage drift, behavior match, n_warnings. Subscribes to `BenchmarkReport` for auto-refresh.	— (no LLM)	`BenchmarkReport`, `RealizedCompareRequested`	`RealizedCompareReady`	live
4b · Lifecycle	`PostTradeAnalystAgent`	Phase 6. Drift detection across 5 dimensions scored to [-1,+1]. Deterministic 7-action recommendation engine (hold / reduce / pause / re_run_validation / retrain / change_params / retire). Nemotron narrates the recommendation verbatim — never overrides the logic.	`nvidia/nemotron-3-super-120b-a12b` ↗	`RealizedCompareReady`, `PostTradeAnalysisRequested`	`PostTradeAnalysisReady`	live
10 · Regime	`RegimeDetectorAgent`	I match the live indicator state against the regime catalog; when nothing matches I draft a new YAML for the operator to review.	`nvidia/nemotron-3-super-120b-a12b` ↗	`FeaturesReady`, `RegimeScanRequested`	`RegimeMatchProposed`, `RegimeDraftProposed`, `RegimesChanged`	live
4 · Construction	`PortfolioOptimizationAgent`	I solve the CVaR problem (cuFOLIO) and blend the result with the active NemoRL policy when one is loaded. MetaAgent's CritiqueFailed gates me — I skip the rebalance and emit RebalanceSkipped.	`nvidia/nemotron-3-super-120b-a12b` ↗	`SignalProposed`, `CritiqueFailed`	`RebalanceProposed`, `RebalanceSkipped`	live
4 · Construction	`PortfolioConstructionAgent`	I make the math executable.	`nvidia/nemotron-3-super-120b-a12b` ↗	`RebalanceProposed`	`RebalanceConstructed`	live
4 · Construction	`CapitalAllocationAgent`	I split the household across sleeves before anyone sizes a trade.	`nvidia/nemotron-3-super-120b-a12b` ↗	`MultiSleeveRebalanceRequested`	`CapitalAllocationProposed`	live
4 · Construction	`CapitalAllocationApprovalAgent`	When the PM approves an allocation, I gate it through Compliance and make it the household's truth.	`nvidia/nemotron-3-super-120b-a12b` ↗	`CapitalAllocationApproved`	`AllocationCleared`, `AllocationBlocked`, `CapitalAllocated`	live
5 · Compliance	`ComplianceAgent`	Nothing trades unless it's clean. Phase 7: also runs at the strategy-definition layer on `ValidationReportReady` — scans the sleeve universe against the restricted list (GME/AMC/SPCE) and universe_size>50, emits `ComplianceAdvisory` regardless of finding count so the audit log proves a scan ran.	`nvidia/nemotron-3-super-120b-a12b` ↗	`RebalanceConstructed`, `CapitalAllocationApproved`, `ValidationReportReady`	`RebalanceCleared`, `RebalanceBlocked`, `AllocationCleared`, `AllocationBlocked`, `ComplianceAdvisory`	live
6 · Execution	`ExecutionAgent`	I place orders and follow the schedule.	`nvidia/nemotron-3-super-120b-a12b` ↗	`RebalanceApproved`	`OrderPlaced`, `OrderFilled`, `OrderCancelled`	live
6 · Execution	`LiveMonitorAgent`	I watch positions during the day.	`nvidia/nemotron-3-super-120b-a12b` ↗	`MarketTick`	`RiskBreach`	live
6 · Execution	`ReportingAgent`	I tell you what happened — KPIs + IS/VWAP/slippage.	`nvidia/nemotron-3-super-120b-a12b` ↗	`OrderFilled`, `Scheduler.tick.eod_close`	`DailyReportReady`	live
7 · Oversight	`PortfolioManagerAgent`	I am the PM. I observe every meaningful event on the A2A bus — research views, signal fusions, compliance verdicts, fills, RL retrains, AutoResearch sessions, HybridRAG retrievals, regime shifts — so when the operator asks me a question I can synthesize across the whole platform. I approve/reject rebalances, auto-fire HybridRAGQuery on >15% concentration risk, halt AutoResearch on hard compliance vetoes, and record candidate strategies.	`nvidia/nemotron-3-super-120b-a12b` ↗	`ResearchComplete`, `PredictionReady`, `RegimeTag`, `DeepResearchComplete`, `HybridRAGComplete`, `SignalProposed`, `CritiqueClean`, `CritiqueFailed`, `BacktestReport`, `BacktestStrategyPromoted`, `RebalanceProposed`, `RebalanceConstructed`, `RebalanceCleared`, `RebalanceBlocked`, `RebalanceSkipped`, `OrderPlaced`, `OrderFilled`, `OrderRejected`, `TrainNemoRLRequested`, `PreferenceModelUpdated`, `PreferenceRecorded`, `NeMoRLTrainingStarted`, `NeMoRLTrainingProgress`, `NeMoRLTrainingComplete`, `NeMoRLTrainingCancelled`, `CapitalAllocationProposed`, `AllocationCleared`, `AllocationBlocked`, `CapitalAllocated`, `DataAnomaly`, `RegimesChanged`, `RegimeMatchProposed`, `RegimeDraftProposed`, `PMChatQuery`	`RebalanceApproved`, `RebalanceRejected`, `PMChatResponse`, `CandidateStrategyRecorded`, `HybridRAGQuery`, `AutoResearchStop`	live
8 · Feedback	`NeMoRLTrainingAgent`	I launch NVIDIA NeMo-RL training runs (SFT/DPO/PPO/GRPO/DAPO/GDPO/RM/distillation) via the bridge — subprocess into the dedicated Python 3.13 env where nemo-rl 0.6.0 lives.	— (no LLM)	`TrainNemoRLRequested`	`NeMoRLTrainingStarted`	live
8 · Feedback	`NeMoRLFeedbackAgent`	I count preference pairs from PreferenceLearningAgent (approve/reject on rebalances + narrations). Once enough accumulate, I emit TrainNemoRLRequested(algo='dpo') so the Nemotron policy gets a fresh DPO retrain on the latest user feedback.	— (no LLM)	`PreferenceRecorded`, `RebalanceDecided`	`TrainNemoRLRequested`	live
8b · DPO closure	`PolicyPromotionAgent`	Phase 8. I close the DPO loop. On `NeMoRLTrainingComplete` I locate the per-run checkpoint, run NeMo-RL's DCP→HF converter as a subprocess, parse the final eval metrics from the run log, compare against baseline (prior active policy or fallback floor), and auto-promote the candidate to the policy registry when the gate passes. Emits `PolicyCandidateRegistered` on every completed run, `PolicyPromoted` on gate pass, `PolicyPromotionFailed` on validation failure.	— (no LLM)	`NeMoRLTrainingComplete`	`PolicyCandidateRegistered`, `PolicyPromoted`, `PolicyPromotionFailed`	live · new
8b · DPO closure	`InferenceServerAgent`	Phase 8. I hot-reload vLLM on localhost:8024 whenever a policy is promoted. On `PolicyPromoted` I read the checkpoint path off the payload, stop the running vLLM (if any), start a fresh one against the new checkpoint, and emit `LocalInferenceReloadStarted`. During cold-start (~30-60s for an 8B model) `policy_router` falls back to build.nvidia.com automatically — no downtime in the demo.	— (no LLM — manages vLLM subprocess)	`PolicyPromoted`	`LocalInferenceReloadStarted`	live · new
2 · Discovery	`SignalGeneratorAgent`	Phase 9. I propose alpha-signal formulas as JSON-AST trees over a 66-operator vocabulary (TS_, CS_, Rank_, Decay_, math, norm, data, cond — adapted from NVIDIA-AI-Blueprints `quantitative-signal-discovery-agent` `calculator.json`). Routed through `policy_router("signal_generation", temp=0.8)`. When a prior iteration failed, I see the OptimizationAdvisor's critique + the best-so-far formula in my prompt (Grinold-Kahn-aware ranking) and propose strictly better candidates.	`nvidia/nemotron-3-super-120b-a12b` ↗	`SignalGenerationRequested`	`SignalCandidatesGenerated`, `SignalGenerationFailed`	live · new
3 · Discovery	`SignalCodeGeneratorAgent`	Phase 9. I validate the LLM-emitted JSON AST against a strict operator whitelist and compile each formula to a vectorized pandas/numpy callable via `OPERATOR_REGISTRY`. No `exec()` on LLM output — safer than the upstream blueprint's "LLM emits Python" step. Aliases (`TS_StdDev`→`TS_Std`, `TS_ZScore`→`TS_Zscore`) resolve transparently so common spelling variants don't waste an iteration.	— (no LLM — deterministic compile)	`SignalCandidatesGenerated`	`SignalCodeCompiled`, `SignalCompilationFailed`	live · new
2 · Discovery	`OptimizationAdvisorAgent`	Phase 9. When a batch fails the acceptance gate (|IC|≥0.02 AND p≤0.05), I write concrete operator-level feedback for the next iteration: "Try TS_Rank instead of CS_Rank on the momentum factor, add TS_Zscore for vol normalization, gate when TS_Std rank is high." Routed through `policy_router("optimization_advisor", temp=0.5)`. Few-shot prompt teaches the canonical critique shape (operator substitution + reasoning + predicted impact).	`nvidia/nemotron-3-super-120b-a12b` ↗	`IterationFailed`	`OptimizationAdviceGenerated`, `OptimizationAdviceFailed`	live · new
3 · Discovery	`SignalDiscoveryOrchestratorAgent`	Phase 9. I drive the closed-loop discovery workflow end-to-end. For each iteration: SignalGenerator → SignalCodeGenerator → evaluate (Mean IC + p-value + IR + decay + spread Sharpe/CAGR on real yfinance bars matching the operator's window) → acceptance gate → if rejected, OptimizationAdvisor critiques and the feedback binds the next generator call. On acceptance the formula persists to `data/discovery/signals/<id>.json`; promote-to-sleeve writes a `configs/sleeves/discovered_*.yaml` with the Grinold-Kahn α-tilt wiring for cuFOLIO. Universe is auto-resolved from intent via a fourth LLM role (`universe_resolution`).	`nvidia/nemotron-3-super-120b-a12b` ↗	`SignalDiscoveryRequested`	`SignalDiscoveryStarted`, `UniverseResolved`, `SignalCandidatesGenerated`, `SignalEvaluated`, `SignalRejected`, `SignalAccepted`, `OptimizationAdviceGenerated`, `SignalDiscoveryComplete`, `SignalDiscoveryFailed`	live · new
8 · Feedback	`PreferenceLearningAgent`	I turn every Approve/Override/Reject into a DPO training row.	— (no LLM)	`OrderFilled`, `RebalanceApproved`, `RebalanceRejected`	`PreferenceModelUpdated`	live · new
8 · Feedback	`AuditAgent`	I write every meaningful bus event to immutable JSONL — full rebalance lifecycle, capital allocator lifecycle, NeMo-RL training lifecycle, the Phase 0-8 lifecycle (Benchmark, Validation, Thesis, RealizedCompare, PostTrade, Compliance advisories, policy candidate/promote/reload), critique verdicts, data anomalies. Phase 7: every audit row is tagged with `strategy_version_id`, `strategy_id`, and `mode` so post-hoc analysis can filter by version without parsing payload internals. 51 events subscribed — the chain is fully replayable.	— (no LLM)	(51 events) Trade lifecycle · NeMo-RL training lifecycle · Capital allocation lifecycle · Backtest/strategy lifecycle · Critique/tuning · Data integrity · Stress-regime lifecycle · Research/prediction · Phase 0-6 lifecycle (`BenchmarkReport`, `BenchmarkRequested`, `ValidationReportReady`, `ValidationNarrationRequested`, `ValidationNarrationReady`, `ThesisTranslationReady`, `RealizedCompareReady`, `RealizedCompareRequested`, `PostTradeAnalysisReady`, `PostTradeAnalysisRequested`, `ComplianceAdvisory`) · Phase 8 closure (`PolicyCandidateRegistered`, `PolicyPromoted`, `PolicyPromotionFailed`, `LocalInferenceReloadStarted`) · Phase 9 discovery (`SignalDiscoveryStarted`, `UniverseResolved`, `SignalCandidatesGenerated`, `SignalEvaluated`, `SignalRejected`, `SignalAccepted`, `OptimizationAdviceGenerated`, `SignalDiscoveryComplete`, `llm.reasoning.*` trace spans)	—	live
2 · Research	`MacroRegimeAgent`	I tag the cross-asset regime from VIX, 10Y yield, and DXY.	— (no LLM)	`FeaturesReady`	`RegimeTag`	live
2 · Research	`InsiderActivityAgent`	I score symbols by net insider buying from SEC Form 4 in the last 30 days.	— (no LLM)	`FeaturesReady`	`ResearchComplete`	live
2 · Research	`OptionsFlowAgent`	I score symbols by call/put open-interest skew on the options chain.	— (no LLM)	`FeaturesReady`	`ResearchComplete`	live
2 · Research	`DividendQualityAgent`	I score symbols by yield × payout sustainability × 5y dividend growth.	— (no LLM)	`FeaturesReady`	`ResearchComplete`	live
2 · Research	`HybridRAGAgent`	I extract a typed knowledge graph from a company's filings and news, fuse it with vector retrieval, and answer multi-hop questions with citations.	`nvidia/nemotron-3-super-120b-a12b` ↗	`HybridRAGQuery`, `ResearchKickoff`	`HybridRAGComplete`	live
8 · Feedback	`NemoRLAutoResearchOrchestrator`	Karpathy-pattern meta-loop over NeMo-RL. Each iteration, Nemotron 3 Super proposes a typed config edit (KL penalty, learning rate, batch size); the loop launches the inner DPO/GRPO/SFT run via the NeMo-RL bridge, parses the eval metric, keeps or reverts.	`nvidia/nemotron-3-super-120b-a12b` ↗	—	`TrainNemoRLRequested`	live
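
The subscribes and emits columns describe a publish/subscribe cascade: an agent wakes on the events it subscribes to and publishes the events it emits. The first three hops of the roster can be sketched in-process as follows. The `Bus` class, the handler signature, the `"Scheduler"` source label, and the payload are hypothetical stand-ins, not the A2A bus API; only the event names and the agent-to-event wiring are taken from the table.

```python
from collections import defaultdict, deque
from typing import Callable, Dict, List, Tuple

# A handler receives a payload and returns the (event, payload) pairs it emits.
Handler = Callable[[dict], List[Tuple[str, dict]]]


class Bus:
    """Toy synchronous event bus (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._subs: Dict[str, List[Tuple[str, Handler]]] = defaultdict(list)
        self.log: List[Tuple[str, str]] = []  # (emitting agent, event name)

    def subscribe(self, agent: str, event: str, handler: Handler) -> None:
        self._subs[event].append((agent, handler))

    def publish(self, source: str, event: str, payload: dict) -> None:
        # Breadth-first delivery: each emitted event is queued and fanned out
        # to its subscribers until the cascade runs dry.
        queue = deque([(source, event, payload)])
        while queue:
            src, name, body = queue.popleft()
            self.log.append((src, name))
            for agent, handler in self._subs[name]:
                for out_name, out_body in handler(body):
                    queue.append((agent, out_name, out_body))


def data_agent(payload: dict) -> List[Tuple[str, dict]]:
    return [("DataReady", payload)]


def feature_engineering_agent(payload: dict) -> List[Tuple[str, dict]]:
    return [("FeaturesReady", payload)]


def predictive_modeling_agent(payload: dict) -> List[Tuple[str, dict]]:
    return [("PredictionReady", payload)]


bus = Bus()
bus.subscribe("DataAgent", "Scheduler.tick.eod", data_agent)
bus.subscribe("FeatureEngineeringAgent", "DataReady", feature_engineering_agent)
bus.subscribe("PredictiveModelingAgent", "FeaturesReady", predictive_modeling_agent)

# One end-of-day tick drives the whole chain (the date is illustrative).
bus.publish("Scheduler", "Scheduler.tick.eod", {"asof": "2024-01-02"})
events = [name for _, name in bus.log]
```

After the single tick, `events` holds `Scheduler.tick.eod`, `DataReady`, `FeaturesReady`, and `PredictionReady` in that order, which is how a row's emits column lines up with the subscribes column of the rows that follow it.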

Per-agent skill cards

Each agent has a dedicated skill card with subscribes, emits, primary model, and Python entry points. The cards mirror skills/agents/<id>.md in the repo and are regenerated from identities.py.
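
As a rough illustration of what generating a card from an identity record could look like, here is a minimal sketch. The `AgentIdentity` dataclass, its field names, the markdown layout, and `render_skill_card` are all hypothetical; the actual schema in identities.py and the output of the generator scripts may differ. The sample values are copied from the MacroRegimeAgent row of the roster.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AgentIdentity:
    """Hypothetical identity record; not the real identities.py schema."""

    id: str
    tier: str
    identity: str
    model: Optional[str] = None  # None for agents that do not call an LLM
    subscribes: List[str] = field(default_factory=list)
    emits: List[str] = field(default_factory=list)


def render_skill_card(agent: AgentIdentity) -> str:
    """Render one identity as a small markdown card (assumed layout)."""

    def bullets(names: List[str]) -> str:
        return "\n".join(f"- `{n}`" for n in names) or "- (none)"

    lines = [
        f"# {agent.id}",
        "",
        f"Tier: {agent.tier}",
        f"Model: {agent.model or 'no LLM'}",
        "",
        agent.identity,
        "",
        "## Subscribes",
        bullets(agent.subscribes),
        "",
        "## Emits",
        bullets(agent.emits),
        "",
    ]
    return "\n".join(lines)


card = render_skill_card(
    AgentIdentity(
        id="MacroRegimeAgent",
        tier="2 · Research",
        identity="I tag the cross-asset regime from VIX, 10Y yield, and DXY.",
        subscribes=["FeaturesReady"],
        emits=["RegimeTag"],
    )
)
```

Keeping the renderer a pure function of the identity record is what would make regeneration safe to rerun: the same roster entry always yields the same card text.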

Browse all 36 skill cards →

Live bus status · activate / deactivate

Every agent currently registered on the live A2A event bus. Toggle active flags, fire a test trigger, and reach the Agent Builder.

＋ Build new agent →


Where to find them

Identities: src/traderspace/agents/identities.py (single source of truth for the roster).
Concrete bus agents: src/traderspace/bus/agents/ (21 modules + 4 example agents in examples/).
Per-agent skill cards (markdown): skills/agents/<id>.md — auto-generated from identities.py.
Per-agent docs pages (HTML): mockups/docs/agents/<id>.html — rendered from the markdown.

Regenerate after editing identities:

PYTHONPATH=src python scripts/gen_skills.py            # markdown skill cards
PYTHONPATH=src python scripts/gen_agent_skill_docs.py  # HTML docs pages
PYTHONPATH=src python scripts/gen_docs.py              # roster + all docs