Launch a training run
POST /api/nemo-rl/launch · routes via the NAT A2A bus through NeMoRLTrainingAgent
Training runs — loading
loading runs from /api/nemo-rl/runs…
Select a run on the left
[ live training · 5 traces ]
— points
click a run to stream its log
NemoRL AutoResearch
Karpathy pattern · meta-agent over NeMo-RL
idle
The meta-loop follows the pattern from karpathy/autoresearch. In each iteration, Nemotron 3 Super proposes a typed config edit (KL penalty, batch size, learning rate, …); the loop launches the inner training run via the NeMo-RL bridge, parses the eval metric, and then keeps or reverts the edit based on that metric. Click ⚙ Launch with AutoResearch above to start a session; the meta-agent picks the best config across `budget` iterations.
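The keep-or-revert loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the dashboard's actual implementation: `TrainConfig`, `ConfigEdit`, `auto_research`, and the `propose` / `evaluate` callables are hypothetical names. `propose` stands in for the Nemotron 3 Super call and `evaluate` stands in for launching the inner run through the NeMo-RL bridge and parsing its eval metric. The sketch also assumes a higher metric is better and that the baseline evaluation does not count against `budget`.

```python
from dataclasses import dataclass, fields, replace
from typing import Callable, Union


@dataclass(frozen=True)
class TrainConfig:
    # Hypothetical fields mirroring the knobs named in the text.
    kl_penalty: float = 0.1
    batch_size: int = 32
    learning_rate: float = 1e-5


@dataclass(frozen=True)
class ConfigEdit:
    # A "typed" edit: one known field name and its new value.
    field: str
    value: Union[int, float]


def apply_edit(config: TrainConfig, edit: ConfigEdit) -> TrainConfig:
    """Return a copy of config with the edit applied; reject unknown fields."""
    if edit.field not in {f.name for f in fields(config)}:
        raise ValueError(f"unknown config field: {edit.field}")
    return replace(config, **{edit.field: edit.value})


def auto_research(
    base: TrainConfig,
    propose: Callable[[TrainConfig, list], ConfigEdit],
    evaluate: Callable[[TrainConfig], float],
    budget: int,
):
    """Run `budget` propose/train/evaluate iterations, keeping only improvements."""
    best_config, best_metric = base, evaluate(base)
    history = []
    for _ in range(budget):
        edit = propose(best_config, history)
        candidate = apply_edit(best_config, edit)
        metric = evaluate(candidate)
        kept = metric > best_metric  # assumption: higher is better
        if kept:
            best_config, best_metric = candidate, metric
        # If not kept, best_config is left unchanged, which is the "revert".
        history.append((edit, metric, kept))
    return best_config, best_metric, history


# Toy stand-ins so the sketch runs without a model or a training job.
def demo_evaluate(config: TrainConfig) -> float:
    return -abs(config.learning_rate - 3e-5)  # toy metric, best at 3e-5


def demo_propose(config: TrainConfig, history: list) -> ConfigEdit:
    candidates = [1e-4, 3e-5, 5e-5]
    return ConfigEdit("learning_rate", candidates[len(history) % len(candidates)])


best, metric, history = auto_research(TrainConfig(), demo_propose, demo_evaluate, budget=3)
print(best, [kept for _, _, kept in history])
```

Because each trial starts from the best config so far, a rejected edit leaves no trace in later trials; the `history` list is what a per-trial metric chart would plot.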
[ trials · per-trial metric ]
— trials
events:
TrainNemoRLRequested
NeMoRLTrainingStarted
NeMoRLTrainingProgress
NeMoRLTrainingComplete
PM Agent + AuditAgent subscribe to all of these
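The subscription relationship above can be illustrated with a small in-process publish/subscribe sketch. The four event names come from the list; everything else (`EventBus`, `RecordingAgent`, the `run_id` payload key) is a hypothetical stand-in and says nothing about how the NAT A2A bus actually delivers events.

```python
from collections import defaultdict
from typing import Callable

# Event names as listed in the panel.
TRAINING_EVENTS = (
    "TrainNemoRLRequested",
    "NeMoRLTrainingStarted",
    "NeMoRLTrainingProgress",
    "NeMoRLTrainingComplete",
)


class EventBus:
    """Hypothetical in-process bus: maps event names to handler lists."""

    def __init__(self) -> None:
        self._handlers = defaultdict(list)

    def subscribe(self, event_name: str, handler: Callable[[str, dict], None]) -> None:
        self._handlers[event_name].append(handler)

    def publish(self, event_name: str, payload: dict) -> None:
        # Deliver to every handler registered for this event name.
        for handler in self._handlers.get(event_name, []):
            handler(event_name, payload)


class RecordingAgent:
    """Stand-in for PM Agent / AuditAgent: just records what it receives."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.seen = []

    def handle(self, event_name: str, payload: dict) -> None:
        self.seen.append(event_name)


bus = EventBus()
pm_agent = RecordingAgent("PM Agent")
audit_agent = RecordingAgent("AuditAgent")

# Both agents subscribe to all four training events.
for agent in (pm_agent, audit_agent):
    for event_name in TRAINING_EVENTS:
        bus.subscribe(event_name, agent.handle)

# Simulate one run's lifecycle; the payload shape is an assumption.
for event_name in TRAINING_EVENTS:
    bus.publish(event_name, {"run_id": "demo-run"})

print(pm_agent.seen)
```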
[ DPO LOOP · PROMOTED POLICIES ]
trained checkpoint → local vLLM → next narration call
vLLM
checking…
[ active policy per decision_type ]
[ all registered policies ]
events:
PolicyCandidateRegistered
PolicyPromoted
PolicyPromotionFailed
LocalInferenceReloadStarted
PolicyPromotionAgent → InferenceServerAgent → policy_router
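One way to picture the promotion flow and the two tables above (active policy per decision_type, all registered policies) is the sketch below. It is an assumption-laden illustration: `PolicyRegistry`, its methods, the `reload_ok` flag, the `"base"` fallback, the policy id, and the order in which events are emitted are all hypothetical. Only the four event names and the `decision_type` key come from the panel.

```python
from dataclasses import dataclass, field


@dataclass
class PolicyRegistry:
    # policy_id -> decision_type, for every registered candidate.
    registered: dict = field(default_factory=dict)
    # decision_type -> policy_id, for the currently promoted policy.
    active: dict = field(default_factory=dict)
    # Emitted event names, in order (a stand-in for publishing on a bus).
    events: list = field(default_factory=list)

    def register(self, policy_id: str, decision_type: str) -> None:
        self.registered[policy_id] = decision_type
        self.events.append("PolicyCandidateRegistered")

    def promote(self, policy_id: str, reload_ok: bool = True) -> bool:
        """Try to make policy_id active for its decision_type."""
        decision_type = self.registered.get(policy_id)
        if decision_type is None:
            self.events.append("PolicyPromotionFailed")
            return False
        # Assumed ordering: the local inference reload starts before
        # the promotion is confirmed.
        self.events.append("LocalInferenceReloadStarted")
        if not reload_ok:
            self.events.append("PolicyPromotionFailed")
            return False
        self.active[decision_type] = policy_id
        self.events.append("PolicyPromoted")
        return True

    def route(self, decision_type: str, default: str = "base") -> str:
        """Pick the policy for a call, falling back when nothing is promoted."""
        return self.active.get(decision_type, default)


registry = PolicyRegistry()
registry.register("dpo-ckpt-001", "narration")  # hypothetical id and type
promoted = registry.promote("dpo-ckpt-001")
rejected = registry.promote("never-registered")

print(registry.route("narration"), registry.route("other"), registry.events)
```

Keeping `registered` and `active` as separate maps matches the two tables in the panel: a failed promotion leaves the previous active policy in place while the candidate stays registered.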