Agent roster (36)

All 36 agent identities, by tier. The A2A event bus carries 36 core agents: 32 are active by default, and 4 example agents stay inactive until activated via the Agent Builder.

Tier 4a includes CapitalAllocationAgent and CapitalAllocationApprovalAgent (the multi-sleeve capital allocator). Tier 4b includes the Phase 2-6 lifecycle agents: BenchmarkingAgent, ValidationNarrationAgent, ThesisTranslationAgent, RealizedCompareAgent, and PostTradeAnalystAgent.

Tier 8 wires NVIDIA NeMo-RL via NeMoRLTrainingAgent and NeMoRLFeedbackAgent for LLM post-training (DPO/GRPO/SFT). Tier 8b (Phase 8 DPO closure) ships PolicyPromotionAgent and InferenceServerAgent: completed training runs auto-validate, auto-promote to the policy registry, and hot-reload vLLM on localhost:8024 so the next narration call hits the trained policy via policy_router.

Tier 2/3 · Signal Discovery (new in Phase 9) adapts the closed loop from the NVIDIA-AI-Blueprints quantitative-signal-discovery-agent: SignalGeneratorAgent proposes JSON-AST formulas over a 66-operator vocabulary, SignalCodeGeneratorAgent compiles the AST through a strict whitelist (no exec() on LLM output), OptimizationAdvisorAgent critiques rejections, and SignalDiscoveryOrchestratorAgent drives the iteration loop. Accepted formulas promote to sleeves via a Grinold-Kahn α-tilt on cuFOLIO scenarios.

Each agent name links to its full skill card. For agents that call an LLM, the Model column links to the model card on build.nvidia.com.
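The whitelist compile step described for SignalCodeGeneratorAgent can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the AST shape (`{"op": ..., "args": [...]}` and `{"field": ...}`), the three operators and their semantics, and the pure-Python list representation. The real agent is described as compiling to pandas/numpy callables over a 66-operator vocabulary. The sketch only shows the principle: each node is looked up in a registry, and nothing the LLM emits is ever passed to exec().

```python
import statistics
from typing import Callable, Dict, List

Series = List[float]
NAN = float("nan")


class UnknownOperator(ValueError):
    """Raised when an AST node is not covered by the whitelist."""


def ts_delta(x: Series, n: int) -> Series:
    # x[t] - x[t-n]; the first n points have no history, so they are NaN.
    return [NAN] * n + [x[i] - x[i - n] for i in range(n, len(x))]


def ts_std(x: Series, n: int) -> Series:
    # Rolling population std over n points (population vs. sample is an assumption).
    return [NAN] * (n - 1) + [
        statistics.pstdev(x[i - n + 1 : i + 1]) for i in range(n - 1, len(x))
    ]


def div(a: Series, b: Series) -> Series:
    # Element-wise division; a zero denominator yields NaN instead of raising.
    return [p / q if q != 0 else NAN for p, q in zip(a, b)]


# Hypothetical three-operator registry standing in for the real vocabulary.
OPERATOR_REGISTRY: Dict[str, Callable] = {
    "TS_Delta": ts_delta,
    "TS_Std": ts_std,
    "Div": div,
}
# Spelling variants resolve to a canonical name before the whitelist check.
ALIASES = {"TS_StdDev": "TS_Std"}


def compile_ast(node) -> Callable[[Dict[str, Series]], object]:
    """Turn a JSON-AST node into a callable without exec() or eval()."""
    if isinstance(node, (int, float)):
        return lambda data: node  # numeric constant, e.g. a window length
    if isinstance(node, dict) and "field" in node:
        name = node["field"]
        return lambda data: data[name]  # raw data column
    if isinstance(node, dict) and "op" in node:
        op = ALIASES.get(node["op"], node["op"])
        if op not in OPERATOR_REGISTRY:
            raise UnknownOperator(op)
        fn = OPERATOR_REGISTRY[op]
        args = [compile_ast(arg) for arg in node.get("args", [])]
        return lambda data: fn(*(arg(data) for arg in args))
    raise UnknownOperator(repr(node))


# One-day change divided by the close: Div(TS_Delta(close, 1), close).
formula = {
    "op": "Div",
    "args": [
        {"op": "TS_Delta", "args": [{"field": "close"}, 1]},
        {"field": "close"},
    ],
}
signal = compile_ast(formula)
result = signal({"close": [10.0, 11.0, 12.1]})
```

Because an unknown operator raises at compile time, a malformed candidate fails before any market data is touched, which is the property the "no exec() on LLM output" design is after.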

Roster

tier | agent | identity | model | subscribes | emits | bus?
1 · Data | DataAgent | I curate market data, flag anomalies, and kick off a HybridRAG refresh whenever fresh data lands. | nvidia/nemotron-3-super-120b-a12b | Scheduler.tick.eod | DataReady, DataAnomaly, ResearchKickoff | live
1.5 · Engineering | FeatureEngineeringAgent | I turn raw data into model-ready features. | nvidia/nemotron-3-super-120b-a12b | DataReady | FeaturesReady | live
2 · Research | PredictiveModelingAgent | I train ML models and forecast returns. | nvidia/nemotron-3-super-120b-a12b | FeaturesReady | PredictionReady | live
2 · Research | DeepResearchAgent | I run multi-step AIQ Deep Research (planner → researcher → synthesizer → citer). | nvidia/nemotron-3-super-120b-a12b | DeepResearchRequested | DeepResearchComplete | live
2 · Research | FundamentalAgent | I read 10-Ks, 10-Qs, transcripts. | nvidia/nemotron-3-super-120b-a12b | FeaturesReady | ResearchComplete | live
2 · Research | TechnicalAgent | I read price and volume only. | nvidia/nemotron-3-super-120b-a12b | FeaturesReady | ResearchComplete | live
2 · Research | SentimentAgent | I read what humans are saying. | nvidia/nemotron-3-super-120b-a12b | FeaturesReady | ResearchComplete | live
2 · Research | AIFactorAgent | I find latent factors and regimes. | nvidia/nemotron-3-super-120b-a12b | FeaturesReady | ResearchComplete, RegimeTag | live
3 · Synthesis | SignalAgent | I fuse every research view (technical · fundamental · sentiment · ai-factor · predictive · deep-research · hybridrag) and the regime tag into a per-ticker conviction signal. When top-candidate conviction is weak (< 0.30) I auto-kick a DeepResearchRequested on that name so the next cascade has stronger views. | nvidia/nemotron-3-super-120b-a12b | ResearchComplete, PredictionReady, DeepResearchComplete, HybridRAGComplete, RegimeTag | SignalProposed, DeepResearchRequested | live
3 · Synthesis | MetaAgent | I challenge today's signal and learn from yesterday's. When my critique fails, I auto-fire a BacktestRequested so we know whether the failure is regime-local or systemic. | nvidia/nemotron-3-super-120b-a12b | SignalProposed, DailyReportReady | CritiqueClean, CritiqueFailed, StrategyTuned, BacktestRequested | live
4 · Construction | BacktestAgent | I run candidate strategies on history; when a run clears the Sharpe + max-DD floor I promote it as a candidate the PM records. | nvidia/nemotron-3-super-120b-a12b | BacktestRequested | BacktestReport, BacktestStrategyPromoted | live
4b · Lifecycle | BenchmarkingAgent | I run train/val/test split-aware sweeps; the test fold is evaluated exactly once. The val_test_gap field surfaces likely train/val overfit. | (no LLM) | BenchmarkRequested | BenchmarkReport | live
4b · Lifecycle | ValidationNarrationAgent | Phase 2. Nemotron 3 Super 120B narrates the 10-check ValidationReport. Routed through policy_router(decision_type="validation_narration") so a DPO'd policy serves when one is promoted; falls back to cloud Nemotron otherwise. Async daemon writes the narrative back atomically so the API returns immediately. | nvidia/nemotron-3-super-120b-a12b | ValidationReportReady | ValidationNarrationRequested, ValidationNarrationReady | live
4b · Lifecycle | ThesisTranslationAgent | Phase 3. Plain-English thesis → typed StrategySpec via Nemotron 3 Super 120B. Routed through policy_router(decision_type="thesis_translation"). Handles non-numeric thresholds defensively (coerces to description suffix). | nvidia/nemotron-3-super-120b-a12b | ThesisTranslationRequested | ThesisTranslationReady | live
4b · Lifecycle | RealizedCompareAgent | Phase 4. Triple-side compare: backtest expectation ↔ paper fills ↔ live fills joined by strategy_version_id. Surfaces slippage drift, behavior match, n_warnings. Subscribes to BenchmarkReport for auto-refresh. | (no LLM) | BenchmarkReport, RealizedCompareRequested | RealizedCompareReady | live
4b · Lifecycle | PostTradeAnalystAgent | Phase 6. Drift detection across 5 dimensions scored to [-1,+1]. Deterministic 7-action recommendation engine (hold / reduce / pause / re_run_validation / retrain / change_params / retire). Nemotron narrates the recommendation verbatim and never overrides the logic. | nvidia/nemotron-3-super-120b-a12b | RealizedCompareReady, PostTradeAnalysisRequested | PostTradeAnalysisReady | live
10 · Regime | RegimeDetectorAgent | I match the live indicator state against the regime catalog; when nothing matches I draft a new YAML for the operator to review. | nvidia/nemotron-3-super-120b-a12b | FeaturesReady, RegimeScanRequested | RegimeMatchProposed, RegimeDraftProposed, RegimesChanged | live
4 · Construction | PortfolioOptimizationAgent | I solve the CVaR problem (cuFOLIO) and blend the result with the active NemoRL policy when one is loaded. MetaAgent's CritiqueFailed gates me: I skip the rebalance and emit RebalanceSkipped. | nvidia/nemotron-3-super-120b-a12b | SignalProposed, CritiqueFailed | RebalanceProposed, RebalanceSkipped | live
4 · Construction | PortfolioConstructionAgent | I make the math executable. | nvidia/nemotron-3-super-120b-a12b | RebalanceProposed | RebalanceConstructed | live
4 · Construction | CapitalAllocationAgent | I split the household across sleeves before anyone sizes a trade. | nvidia/nemotron-3-super-120b-a12b | MultiSleeveRebalanceRequested | CapitalAllocationProposed | live
4 · Construction | CapitalAllocationApprovalAgent | When the PM approves an allocation, I gate it through Compliance and make it the household's truth. | nvidia/nemotron-3-super-120b-a12b | CapitalAllocationApproved | AllocationCleared, AllocationBlocked, CapitalAllocated | live
5 · Compliance | ComplianceAgent | Nothing trades unless it's clean. Phase 7: also runs at the strategy-definition layer on ValidationReportReady, scanning the sleeve universe against the restricted list (GME/AMC/SPCE) and universe_size>50, and emits ComplianceAdvisory regardless of finding count so the audit log proves a scan ran. | nvidia/nemotron-3-super-120b-a12b | RebalanceConstructed, CapitalAllocationApproved, ValidationReportReady | RebalanceCleared, RebalanceBlocked, AllocationCleared, AllocationBlocked, ComplianceAdvisory | live
6 · Execution | ExecutionAgent | I place orders and follow the schedule. | nvidia/nemotron-3-super-120b-a12b | RebalanceApproved | OrderPlaced, OrderFilled, OrderCancelled | live
6 · Execution | LiveMonitorAgent | I watch positions during the day. | nvidia/nemotron-3-super-120b-a12b | MarketTick | RiskBreach | live
6 · Execution | ReportingAgent | I tell you what happened: KPIs + IS/VWAP/slippage. | nvidia/nemotron-3-super-120b-a12b | OrderFilled, Scheduler.tick.eod_close | DailyReportReady | live
7 · Oversight | PortfolioManagerAgent | I am the PM. I observe every meaningful event on the A2A bus (research views, signal fusions, compliance verdicts, fills, RL retrains, AutoResearch sessions, HybridRAG retrievals, regime shifts) so when the operator asks me a question I can synthesize across the whole platform. I approve/reject rebalances, auto-fire HybridRAGQuery on >15% concentration risk, halt AutoResearch on hard compliance vetoes, and record candidate strategies. | nvidia/nemotron-3-super-120b-a12b | ResearchComplete, PredictionReady, RegimeTag, DeepResearchComplete, HybridRAGComplete, SignalProposed, CritiqueClean, CritiqueFailed, BacktestReport, BacktestStrategyPromoted, RebalanceProposed, RebalanceConstructed, RebalanceCleared, RebalanceBlocked, RebalanceSkipped, OrderPlaced, OrderFilled, OrderRejected, TrainNemoRLRequested, PreferenceModelUpdated, PreferenceRecorded, NeMoRLTrainingStarted, NeMoRLTrainingProgress, NeMoRLTrainingComplete, NeMoRLTrainingCancelled, CapitalAllocationProposed, AllocationCleared, AllocationBlocked, CapitalAllocated, DataAnomaly, RegimesChanged, RegimeMatchProposed, RegimeDraftProposed, PMChatQuery | RebalanceApproved, RebalanceRejected, PMChatResponse, CandidateStrategyRecorded, HybridRAGQuery, AutoResearchStop | live
8 · Feedback | NeMoRLTrainingAgent | I launch NVIDIA NeMo-RL training runs (SFT/DPO/PPO/GRPO/DAPO/GDPO/RM/distillation) via the bridge: a subprocess into the dedicated Python 3.13 env where nemo-rl 0.6.0 lives. | (no LLM) | TrainNemoRLRequested | NeMoRLTrainingStarted | live
8 · Feedback | NeMoRLFeedbackAgent | I count preference pairs from PreferenceLearningAgent (approve/reject on rebalances + narrations). Once enough accumulate, I emit TrainNemoRLRequested(algo='dpo') so the Nemotron policy gets a fresh DPO retrain on the latest user feedback. | (no LLM) | PreferenceRecorded, RebalanceDecided | TrainNemoRLRequested | live
8b · DPO closure | PolicyPromotionAgent | Phase 8. I close the DPO loop. On NeMoRLTrainingComplete I locate the per-run checkpoint, run NeMo-RL's DCP→HF converter as a subprocess, parse the final eval metrics from the run log, compare against baseline (prior active policy or fallback floor), and auto-promote the candidate to the policy registry when the gate passes. Emits PolicyCandidateRegistered on every completed run, PolicyPromoted on gate pass, PolicyPromotionFailed on validation failure. | (no LLM) | NeMoRLTrainingComplete | PolicyCandidateRegistered, PolicyPromoted, PolicyPromotionFailed | live · new
8b · DPO closure | InferenceServerAgent | Phase 8. I hot-reload vLLM on localhost:8024 whenever a policy is promoted. On PolicyPromoted I read the checkpoint path off the payload, stop the running vLLM (if any), start a fresh one against the new checkpoint, and emit LocalInferenceReloadStarted. During cold-start (~30-60s for an 8B model) policy_router falls back to build.nvidia.com automatically, so there is no downtime in the demo. | (no LLM; manages vLLM subprocess) | PolicyPromoted | LocalInferenceReloadStarted | live · new
2 · Discovery | SignalGeneratorAgent | Phase 9. I propose alpha-signal formulas as JSON-AST trees over a 66-operator vocabulary (TS_*, CS_*, Rank_*, Decay_*, math, norm, data, cond; adapted from NVIDIA-AI-Blueprints quantitative-signal-discovery-agent calculator.json). Routed through policy_router("signal_generation", temp=0.8). When a prior iteration failed, I see the OptimizationAdvisor's critique + the best-so-far formula in my prompt (Grinold-Kahn-aware ranking) and propose strictly better candidates. | nvidia/nemotron-3-super-120b-a12b | SignalGenerationRequested | SignalCandidatesGenerated, SignalGenerationFailed | live · new
3 · Discovery | SignalCodeGeneratorAgent | Phase 9. I validate the LLM-emitted JSON AST against a strict operator whitelist and compile each formula to a vectorized pandas/numpy callable via OPERATOR_REGISTRY. No exec() on LLM output, which is safer than the upstream blueprint's "LLM emits Python" step. Aliases (TS_StdDev → TS_Std, TS_ZScore → TS_Zscore) resolve transparently so common spelling variants don't waste an iteration. | (no LLM; deterministic compile) | SignalCandidatesGenerated | SignalCodeCompiled, SignalCompilationFailed | live · new
2 · Discovery | OptimizationAdvisorAgent | Phase 9. When a batch fails the acceptance gate (|IC|≥0.02 AND p≤0.05), I write concrete operator-level feedback for the next iteration: "Try TS_Rank instead of CS_Rank on the momentum factor, add TS_Zscore for vol normalization, gate when TS_Std rank is high." Routed through policy_router("optimization_advisor", temp=0.5). Few-shot prompt teaches the canonical critique shape (operator substitution + reasoning + predicted impact). | nvidia/nemotron-3-super-120b-a12b | IterationFailed | OptimizationAdviceGenerated, OptimizationAdviceFailed | live · new
3 · Discovery | SignalDiscoveryOrchestratorAgent | Phase 9. I drive the closed-loop discovery workflow end-to-end. For each iteration: SignalGenerator → SignalCodeGenerator → evaluate (Mean IC + p-value + IR + decay + spread Sharpe/CAGR on real yfinance bars matching the operator's window) → acceptance gate → if rejected, OptimizationAdvisor critiques and the feedback binds the next generator call. On acceptance the formula persists to data/discovery/signals/<id>.json; promote-to-sleeve writes a configs/sleeves/discovered_*.yaml with the Grinold-Kahn α-tilt wiring for cuFOLIO. Universe is auto-resolved from intent via a fourth LLM role (universe_resolution). | nvidia/nemotron-3-super-120b-a12b | SignalDiscoveryRequested | SignalDiscoveryStarted, UniverseResolved, SignalCandidatesGenerated, SignalEvaluated, SignalRejected, SignalAccepted, OptimizationAdviceGenerated, SignalDiscoveryComplete, SignalDiscoveryFailed | live · new
8 · Feedback | PreferenceLearningAgent | I turn every Approve/Override/Reject into a DPO training row. | (no LLM) | OrderFilled, RebalanceApproved, RebalanceRejected | PreferenceModelUpdated | live · new
8 · Feedback | AuditAgent | I write every meaningful bus event to immutable JSONL: full rebalance lifecycle, capital allocator lifecycle, NeMo-RL training lifecycle, the Phase 0-8 lifecycle (Benchmark, Validation, Thesis, RealizedCompare, PostTrade, Compliance advisories, policy candidate/promote/reload), critique verdicts, data anomalies. Phase 7: every audit row is tagged with strategy_version_id, strategy_id, and mode so post-hoc analysis can filter by version without parsing payload internals. 51 events subscribed, so the chain is fully replayable. | (no LLM) | (51 events) Trade lifecycle · NeMo-RL training lifecycle · Capital allocation lifecycle · Backtest/strategy lifecycle · Critique/tuning · Data integrity · Stress-regime lifecycle · Research/prediction · Phase 0-6 lifecycle (BenchmarkReport, BenchmarkRequested, ValidationReportReady, ValidationNarrationRequested, ValidationNarrationReady, ThesisTranslationReady, RealizedCompareReady, RealizedCompareRequested, PostTradeAnalysisReady, PostTradeAnalysisRequested, ComplianceAdvisory) · Phase 8 closure (PolicyCandidateRegistered, PolicyPromoted, PolicyPromotionFailed, LocalInferenceReloadStarted) · Phase 9 discovery (SignalDiscoveryStarted, UniverseResolved, SignalCandidatesGenerated, SignalEvaluated, SignalRejected, SignalAccepted, OptimizationAdviceGenerated, SignalDiscoveryComplete, llm.reasoning.* trace spans) | | live
2 · Research | MacroRegimeAgent | I tag the cross-asset regime from VIX, 10Y yield, and DXY. | (no LLM) | FeaturesReady | RegimeTag | live
2 · Research | InsiderActivityAgent | I score symbols by net insider buying from SEC Form 4 in the last 30 days. | (no LLM) | FeaturesReady | ResearchComplete | live
2 · Research | OptionsFlowAgent | I score symbols by call/put open-interest skew on the options chain. | (no LLM) | FeaturesReady | ResearchComplete | live
2 · Research | DividendQualityAgent | I score symbols by yield × payout sustainability × 5y dividend growth. | (no LLM) | FeaturesReady | ResearchComplete | live
2 · Research | HybridRAGAgent | I extract a typed knowledge graph from a company's filings and news, fuse it with vector retrieval, and answer multi-hop questions with citations. | nvidia/nemotron-3-super-120b-a12b | HybridRAGQuery, ResearchKickoff | HybridRAGComplete | live
8 · Feedback | NemoRLAutoResearchOrchestrator | Karpathy-pattern meta-loop over NeMo-RL. Each iteration, Nemotron 3 Super proposes a typed config edit (KL penalty, learning rate, batch size); the loop launches the inner DPO/GRPO/SFT run via the NeMo-RL bridge, parses the eval metric, keeps or reverts. | nvidia/nemotron-3-super-120b-a12b | | TrainNemoRLRequested | live
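The subscribes and emits columns in the roster describe a publish/subscribe cascade: an agent wakes on the events it subscribes to and emits new events that wake the next tier. The sketch below is a toy in-process bus, not the platform's A2A implementation. The `Bus` class, the handler functions, and the payload shape are all hypothetical. Only the event names and the DataAgent → FeatureEngineeringAgent → PredictiveModelingAgent chain come from the roster, and each handler emits just one of the events its row lists.

```python
from collections import defaultdict, deque
from typing import Callable, Dict, List, Optional, Tuple

Event = Tuple[str, dict]
Handler = Callable[[dict], Optional[List[Event]]]


class Bus:
    """Toy synchronous event bus; handlers return the events they emit."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)
        self.log: List[str] = []  # every event name, in delivery order

    def subscribe(self, event_name: str, handler: Handler) -> None:
        self._subscribers[event_name].append(handler)

    def publish(self, event_name: str, payload: dict) -> None:
        # Breadth-first delivery: emitted events queue behind pending ones.
        queue = deque([(event_name, payload)])
        while queue:
            name, body = queue.popleft()
            self.log.append(name)
            for handler in self._subscribers[name]:
                queue.extend(handler(body) or [])


def data_agent(payload: dict) -> List[Event]:
    # Roster row: subscribes Scheduler.tick.eod, emits DataReady (among others).
    return [("DataReady", {"as_of": payload["date"]})]


def feature_engineering_agent(payload: dict) -> List[Event]:
    # Roster row: subscribes DataReady, emits FeaturesReady.
    return [("FeaturesReady", payload)]


def predictive_modeling_agent(payload: dict) -> List[Event]:
    # Roster row: subscribes FeaturesReady, emits PredictionReady.
    return [("PredictionReady", payload)]


bus = Bus()
bus.subscribe("Scheduler.tick.eod", data_agent)
bus.subscribe("DataReady", feature_engineering_agent)
bus.subscribe("FeaturesReady", predictive_modeling_agent)

bus.publish("Scheduler.tick.eod", {"date": "2024-01-02"})
print(bus.log)
# ['Scheduler.tick.eod', 'DataReady', 'FeaturesReady', 'PredictionReady']
```

Reading a roster row this way makes the cascade traceable: follow an event from an emits cell to every row that lists it under subscribes.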

Per-agent skill cards

Each agent has a dedicated skill card with subscribes, emits, primary model, and Python entry points. The cards mirror skills/agents/<id>.md in the repo and are regenerated from identities.py.

Browse all 36 skill cards →

Live bus status · activate / deactivate

Every agent currently registered on the live A2A event bus. Toggle active flags, fire a test trigger, and reach the Agent Builder.

NVTrader v0.1.18 · docs · ⚠ Not financial advice