Service Mode
Purpose
Service mode keeps persistent worker loops alive so you can reuse the same execution context across multiple runs. This avoids paying startup cost for each request and makes health/SLA checks explicit.
When to use it
Use service mode when you need one or more of the following:
repeated requests on the same app with low latency;
controlled worker lifecycle (start/status/health/stop);
machine-readable health output for monitoring.
Fast path in ORCHESTRATE (web interface)
Open ORCHESTRATE and select your project.
In System settings, configure cluster mode, scheduler, and workers.
In Service mode (persistent workers), click START service once.
Use STATUS service to inspect running/pending workers.
Use HEALTH gate to enforce SLA thresholds from app_settings.toml.
Use STOP service before changing topology or ending the session.
Action semantics
action="start": provisions workers and starts persistent loops.
action="status": returns runtime state (running/degraded/idle/stopped/error).
action="health": same status snapshot plus JSON export (schema agi.service.health.v1).
action="stop": requests loop termination and optionally shuts down the Dask cluster.
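As a minimal sketch of how a caller might act on the runtime state returned by action="status", the helper below groups the five documented status strings into coarse buckets. The bucket names ("ok", "attention", "down", "unknown") and the function name classify_status are illustrative assumptions, not part of the service API.

```python
# The five status strings come from the action semantics above.
# The grouping into buckets is an illustrative assumption.
KNOWN_STATUSES = {"running", "degraded", "idle", "stopped", "error"}


def classify_status(snapshot: dict) -> str:
    """Map a status snapshot to a coarse bucket for dashboards or alerts."""
    status = snapshot.get("status")
    if status not in KNOWN_STATUSES:
        return "unknown"
    if status == "running":
        return "ok"
    if status in {"degraded", "idle"}:
        # Service exists but may need a closer look.
        return "attention"
    # "stopped" or "error": no usable service.
    return "down"
```

With a snapshot from AGI.serve(env, action="status"), classify_status(snapshot) could drive a simple alerting rule.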
End-to-end CLI example
import asyncio

from agi_cluster.agi_distributor import AGI
from agi_env import AgiEnv

APPS_PATH = "src/agilab/apps/builtin"
APP = "mycode_project"

async def main():
    env = AgiEnv(apps_path=APPS_PATH, app=APP, verbose=1)

    started = await AGI.serve(env, action="start")
    print("START:", started["status"])

    status = await AGI.serve(env, action="status")
    print("STATUS:", status["status"], status.get("workers_running_count", 0))

    health = await AGI.serve(env, action="health")
    print("HEALTH:", health["status"], health.get("workers_unhealthy_count", 0))

    stopped = await AGI.serve(env, action="stop", shutdown_on_stop=False)
    print("STOP:", stopped["status"])

if __name__ == "__main__":
    asyncio.run(main())
SLA thresholds
Per-app defaults are stored in [cluster.service_health]:
[cluster.service_health]
allow_idle = false
max_unhealthy = 0
max_restart_rate = 0.25
These values are used by the ORCHESTRATE HEALTH gate and by tools/service_health_check.py unless overridden on the command line.
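To make the three thresholds concrete, here is a hedged sketch of how a gate could compare a health snapshot against them. It is not the actual logic of the HEALTH gate or of tools/service_health_check.py. The snapshot key "restart_rate" and the function name gate_violations are hypothetical; "status" and "workers_unhealthy_count" are the keys used in the end-to-end example.

```python
# Mirrors the [cluster.service_health] table shown above.
POLICY = {"allow_idle": False, "max_unhealthy": 0, "max_restart_rate": 0.25}


def gate_violations(snapshot: dict, policy: dict) -> list:
    """Return a list of human-readable violations (empty means the gate passes)."""
    violations = []
    if snapshot.get("status") == "idle" and not policy["allow_idle"]:
        violations.append("service is idle but allow_idle is false")
    unhealthy = snapshot.get("workers_unhealthy_count", 0)
    if unhealthy > policy["max_unhealthy"]:
        violations.append(f"{unhealthy} unhealthy workers > {policy['max_unhealthy']}")
    # "restart_rate" is a hypothetical key name used for illustration only.
    rate = snapshot.get("restart_rate", 0.0)
    if rate > policy["max_restart_rate"]:
        violations.append(f"restart rate {rate} > {policy['max_restart_rate']}")
    return violations
```

Returning the list of violations rather than a single boolean keeps the reason for a failed gate visible in logs.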
Operational checks
Use this checker for automation/monitoring:
uv run python tools/service_health_check.py \
    --app mycode_project \
    --apps-path src/agilab/apps/builtin
Health JSON is written by default to:
${AGI_SHARE_DIR}/service/<app_target>/health.json
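An external monitor could locate and read that file roughly as follows. This is a sketch under assumptions: the helper names health_file and load_health are hypothetical, AGI_SHARE_DIR is assumed to be set in the monitor's environment unless a directory is passed explicitly, and the file is only assumed to contain a JSON object.

```python
import json
import os
from pathlib import Path
from typing import Optional


def health_file(app_target: str, share_dir: Optional[str] = None) -> Path:
    """Build <share_dir>/service/<app_target>/health.json."""
    base = share_dir if share_dir is not None else os.environ["AGI_SHARE_DIR"]
    return Path(base) / "service" / app_target / "health.json"


def load_health(app_target: str, share_dir: Optional[str] = None) -> Optional[dict]:
    """Return the parsed health snapshot, or None if the file does not exist."""
    path = health_file(app_target, share_dir)
    if not path.is_file():
        return None
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)
```

A None result usually means action="health" has not been called yet or the monitor is looking at a different share directory.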
Common pitfalls
Calling start twice without stop first: stop the existing service before restarting.
Health status is idle but policy requires activity: set allow_idle = false and enforce with HEALTH gate.
Missing health file in external monitor: call action="health" and verify permissions on the target output path.
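The first pitfall can be avoided with a small guard that checks the current state before starting. The sketch below takes the serve callable as a parameter so it can be tried without a cluster. The helper name restart_service and the set of statuses treated as "a service already exists" are assumptions.

```python
import asyncio

# Assumption: these statuses mean a service is already provisioned.
ACTIVE_STATUSES = {"running", "degraded", "idle"}


async def restart_service(serve, env):
    """Stop an existing service, if any, then start a fresh one.

    `serve` is expected to behave like AGI.serve(env, action=...).
    """
    current = await serve(env, action="status")
    if current.get("status") in ACTIVE_STATUSES:
        await serve(env, action="stop")
    return await serve(env, action="start")
```

In a real session this would be called as await restart_service(AGI.serve, env), using the env object from the end-to-end example.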