[ agent · 8b · DPO closure ]

InferenceServerAgent

I manage the local vLLM inference server. On every PolicyPromoted event, I stop the running vLLM process (if any) and start a fresh one against the new checkpoint, serving the OpenAI-compatible API on localhost:8024. While the new server is cold-starting, the policy_router automatically falls back to cloud Nemotron.
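The stop-then-start cycle described above might look roughly like the following. This is a minimal sketch, not my actual implementation: the names `InferenceServerSketch`, `build_vllm_command`, `on_policy_promoted`, and the `stop_timeout_s` parameter are hypothetical, and it assumes the server is launched through the `vllm serve` CLI with its `--host` and `--port` flags.

```python
import subprocess
from typing import List, Optional


def build_vllm_command(checkpoint_path: str, port: int = 8024) -> List[str]:
    """Build the argv for vLLM's OpenAI-compatible server (assumed launch style)."""
    return ["vllm", "serve", checkpoint_path, "--host", "localhost", "--port", str(port)]


class InferenceServerSketch:
    """Hypothetical manager that keeps at most one vLLM subprocess alive."""

    def __init__(self, port: int = 8024, stop_timeout_s: float = 30.0) -> None:
        self.port = port
        self.stop_timeout_s = stop_timeout_s  # assumed grace period before a hard kill
        self.process: Optional[subprocess.Popen] = None

    def stop(self) -> None:
        """Stop the running server, if any: terminate first, kill on timeout."""
        if self.process is None:
            return
        if self.process.poll() is None:  # still running
            self.process.terminate()
            try:
                self.process.wait(timeout=self.stop_timeout_s)
            except subprocess.TimeoutExpired:
                self.process.kill()
                self.process.wait()
        self.process = None

    def on_policy_promoted(
        self, checkpoint_path: str, command: Optional[List[str]] = None
    ) -> subprocess.Popen:
        """Replace the current server with one serving the new checkpoint.

        `command` overrides the argv; it exists only so the sketch can be
        exercised without vLLM installed.
        """
        self.stop()
        argv = command or build_vllm_command(checkpoint_path, self.port)
        self.process = subprocess.Popen(argv)
        return self.process
```

Stopping before starting keeps port 8024 free for the new process, at the cost of a window with no local server; that window is what the policy_router fallback covers. A caller could poll the new server for readiness before routing traffic back to it, but how that handoff is actually signalled is not specified here.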

No LLM (manages vLLM subprocess) · Phase 8 · new