ORCHESTRATE
Introduction
Orchestrate walks through the lifecycle required to ship and operate an AGILab
application. It generates ready-to-run snippets, streams logs back into the UI
and keeps app_settings.toml synchronised so that installs, distribution
checks and runs are reproducible.
Main Content Area
- System settings groups the cluster configuration. Toggle support for pool, cython and rapids, enable the Dask scheduler and provide IP definitions for workers. The calculated mode hint clarifies how the chosen combination will execute, and the settings are written back to app_settings.toml.
- Install renders the install snippet that provisions the project’s virtual environments. INSTALL streams stdout/stderr into Install logs so you know when the worker is ready. A successful install automatically enables the Run section.
- Distribute is split into two parts:
  - <module> args: edit the run arguments managed in app_args.py. You can toggle between the generated form UI and the optional custom snippet saved in app_args_form.py. Saved values update [args] in app_settings.toml.
  - Distribute details: generates the AGI.get_distrib snippet and the CHECK DISTRIBUTE action. When the command succeeds, the Distribution tree expander plots the resulting work plan (DAG or tree) and Workplan lets you reassign partitions to different workers before saving the modified plan.
- Run exposes the AGI.run snippet together with a Benchmark all modes toggle if you want to iterate through every execution path. RUN streams logs into the Run logs expander and stores the output timings in benchmark.json, which is summarised under Benchmark results.
- Service mode (persistent workers) keeps long-lived worker loops alive and lets you trigger START / STATUS / HEALTH gate / STOP without rebuilding the execution context every time.
- LOAD DATA fetches the latest dataframe path configured for the project and shows an in-place preview. The preview is available even after a rerun.
- Prepare Data for Pipeline and Analysis creates (or updates) the CSV that powers the Pipeline and Analysis pages. Use the column selector with Select all support to decide which fields are persisted to ${AGILAB_EXPORT_ABS}/<module>/export.csv.
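To make the export step concrete, here is a minimal sketch of how a selected set of columns could be persisted to ${AGILAB_EXPORT_ABS}/<module>/export.csv. It is not the Orchestrate implementation: the helper names export_path and write_export are hypothetical, and it assumes AGILAB_EXPORT_ABS is available as an environment variable.

```python
import csv
import os
from pathlib import Path


def export_path(module: str) -> Path:
    """Resolve ${AGILAB_EXPORT_ABS}/<module>/export.csv (hypothetical helper)."""
    # Assumption: AGILAB_EXPORT_ABS is set in the environment; KeyError otherwise.
    return Path(os.environ["AGILAB_EXPORT_ABS"]) / module / "export.csv"


def write_export(rows: list[dict], columns: list[str], module: str) -> Path:
    """Write only the selected columns, creating the module folder if needed."""
    target = export_path(module)
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("w", newline="", encoding="utf-8") as handle:
        # extrasaction="ignore" silently drops fields that were not selected.
        writer = csv.DictWriter(handle, fieldnames=columns, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(rows)
    return target
```

Calling write_export(rows, ["a", "c"], "demo") would keep only columns a and c, which mirrors what deselecting a field in the column selector is meant to achieve.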
Snippet Handoff to Pipeline
For newcomers, keep Orchestrate and Pipeline in sync with this workflow:
1. Generate the snippet in Orchestrate (typically AGI.run).
2. On PIPELINE, open Add step (or New step when starting fresh), pick Step source = gen step for a fresh generation, or Step source = an existing snippet (for example AGI_run.py or lab_snippet.py) to import it directly.
3. For app updates, update <module> args in app_settings.toml / [args], then regenerate or re-import the matching snippet in Pipeline.
This avoids running stale code that still references old app argument values.
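One rough way to spot a stale snippet outside the UI is to compare modification times. The sketch below is only a heuristic and not part of AGILab: snippet_is_stale is a hypothetical helper, and it assumes you know where the snippet file and app_settings.toml live on disk.

```python
from pathlib import Path


def snippet_is_stale(snippet: Path, settings: Path) -> bool:
    """Return True when the settings file was modified after the snippet.

    Hypothetical helper: a newer app_settings.toml suggests the snippet may
    still embed old [args] values and should be regenerated or re-imported.
    """
    return settings.stat().st_mtime > snippet.stat().st_mtime
```

A modification-time check cannot tell whether [args] itself changed, so treat a True result as a prompt to regenerate rather than proof of a mismatch.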
Service Mode Health
For a complete operator workflow (web and CLI), see Service Mode.
Use these defaults as a stable baseline for most projects:
- Heartbeat timeout: 10s.
- Done artifacts TTL: 168h (7 days).
- Failed artifacts TTL: 336h (14 days).
- Heartbeat artifacts TTL: 24h.
- Done/Failed max files: 2000 each.
- Heartbeat max files: 1000.
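To illustrate how a TTL and a max-files cap can combine, here is a minimal retention sketch. It is an assumption about the general technique, not AGILab's cleanup code: prune_artifacts is a hypothetical helper that uses file modification time as the artifact age.

```python
import time
from pathlib import Path
from typing import Optional


def prune_artifacts(
    folder: Path,
    ttl_hours: float,
    max_files: int,
    now: Optional[float] = None,
) -> list[Path]:
    """Delete files older than the TTL, then the oldest ones above max_files."""
    now = time.time() if now is None else now
    cutoff = now - ttl_hours * 3600
    # Pair each file with its mtime and sort oldest first.
    aged = sorted(
        ((p.stat().st_mtime, p) for p in folder.iterdir() if p.is_file()),
        key=lambda item: item[0],
    )
    expired = [p for mtime, p in aged if mtime < cutoff]
    fresh = [p for mtime, p in aged if mtime >= cutoff]
    # Among files still within the TTL, drop the oldest until the cap is met.
    overflow = fresh[: max(len(fresh) - max_files, 0)]
    removed = expired + overflow
    for path in removed:
        path.unlink()
    return removed
```

With the baseline above, done artifacts would correspond to ttl_hours=168 and max_files=2000, and heartbeat artifacts to ttl_hours=24 and max_files=1000.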
Health gate defaults are persisted per app in [cluster.service_health]:
- allow_idle (default false).
- max_unhealthy (default 0).
- max_restart_rate (default 0.25).
When STATUS runs, Orchestrate displays a health table:
- worker: Dask worker address.
- healthy: overall health evaluation for that worker loop.
- reason: why the worker is unhealthy (empty when healthy).
- future_state: Dask future state for the loop task.
- heartbeat_state: latest worker heartbeat-reported state.
- heartbeat_age_sec: seconds since latest heartbeat.
Use HEALTH gate to run AGI.serve(..., action="health") and immediately
validate the current state against the per-app SLA thresholds above.
Auto-restart reason values currently include:
- loop-finished / loop-error / loop-cancelled.
- missing-heartbeat.
- stale-heartbeat(<N>s).
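The heartbeat-related reasons can be pictured with a small classifier. This is a sketch under stated assumptions rather than the actual evaluation logic: heartbeat_reason is a hypothetical function, the 10-second threshold is the baseline heartbeat timeout listed above, and the exact formatting of the stale reason string is assumed.

```python
from typing import Optional


def heartbeat_reason(age_sec: Optional[float], timeout_sec: float = 10.0) -> str:
    """Map a heartbeat age to a reason string (empty when the heartbeat is fresh)."""
    if age_sec is None:
        # No heartbeat has been recorded for this worker loop.
        return "missing-heartbeat"
    if age_sec > timeout_sec:
        # Assumed format: whole seconds inside the parentheses.
        return f"stale-heartbeat({int(age_sec)}s)"
    return ""
```

Here age_sec plays the role of heartbeat_age_sec in the health table, with None standing in for a worker that never reported.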
Service health JSON export
Each AGI.serve service action writes a machine-readable health snapshot
(agi.service.health.v1), and action="health" returns that payload
directly.
Default output path:
${AGI_SHARE_DIR}/service/<app_target>/health.json.
Custom output path:
    health = await AGI.serve(
        app_env,
        action="health",
        health_output_path="service/custom_health.json",
    )
    print(health["status"], health["workers_unhealthy_count"])
Field reference:
Troubleshooting and checks
Use these checks if Orchestrate actions do not behave as expected:
- If INSTALL stays stuck, check cluster host reachability, SSH credentials, and whether ~/.agilab/.env still points to valid venv paths.
- If the generated snippet looks wrong, compare [args] in src/<project>/src/app_settings.toml with the values shown in app_args_form.py.
- If RUN returns import errors, verify the target virtual environment contains the same versions as src/<project>/pyproject.toml and re-run install.
- If no logs appear, confirm the log expander is expanded and that the runtime has write access to ~/log/execute/<app>.
- If an external monitor cannot read service health, call AGI.serve(..., action="health") and verify that health.json is written at the expected path.
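For the last check, an external monitor could verify the snapshot with a sketch like the one below. The helper names are hypothetical, and it assumes AGI_SHARE_DIR is exposed as an environment variable. Only the status and workers_unhealthy_count fields shown earlier on this page are read.

```python
import json
import os
from pathlib import Path


def default_health_path(app_target: str) -> Path:
    """Build ${AGI_SHARE_DIR}/service/<app_target>/health.json."""
    # Assumption: AGI_SHARE_DIR is set in the monitor's environment.
    return Path(os.environ["AGI_SHARE_DIR"]) / "service" / app_target / "health.json"


def read_health(path: Path) -> dict:
    """Load the health snapshot, failing loudly when it has not been written."""
    if not path.is_file():
        raise FileNotFoundError(f"health snapshot not found: {path}")
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)


def summarise(health: dict) -> str:
    """One-line summary using the two fields referenced in this page."""
    return f"{health['status']} unhealthy={health['workers_unhealthy_count']}"
```

If read_health raises FileNotFoundError, run the health action once and check whether a custom health_output_path redirected the file elsewhere.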
See also
About AGILab to place Orchestrate in the full page flow.
PIPELINE for running the generated snippet in the Pipeline assistant.
ANALYSIS for launching result views.