Data Connectors
AGILAB data connectors are a lightweight contract for external data systems. They let an app or evidence report reference a named data source without hard-coding local paths, credentials, or provider-specific client details in the app code.
The current public contract is intentionally conservative:
- connector definitions live in plain-text TOML catalogs
- credentials are referenced through environment variables, never embedded
- public evidence validates contracts without opening external networks
- live probes stay operator-triggered and optional
- legacy raw paths can remain available while apps migrate to connector IDs
This is not a second experiment tracker, model registry, or storage UI. It is the data-access contract around AGILAB workflows.
Catalog Shape
The public sample catalog is:
Each connector is a [[connectors]] TOML entry with a stable id, a
kind, a human label, and kind-specific fields.
Supported public kinds are:
| Kind | Typical target | Contract boundary |
|---|---|---|
|  | read-only warehouse or local SQLite proof | validates URI, driver, and |
|  | OpenSearch / ELK index | validates URL, index, and credential reference |
|  | artifact prefixes in cloud object storage | validates provider, bucket/container, prefix, and credential reference |
Object Storage Providers
Object-storage connectors currently support these providers:
| Provider | Target URI shape | Runtime dependency | Credential hint |
|---|---|---|---|
|  |  |  |  |
|  |  |  |  |
|  |  |  |  |
The s3 provider also accepts the aliases aws_s3, amazon_s3, and
s3_compatible. The runtime dependency column describes what an operator
environment needs for live probes; those packages are not required for the
default public contract-validation evidence.
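The aliasing rule can be pictured with a minimal sketch. The alias names come from the text above; the function name and the exact-match behavior are assumptions for illustration, not the facility's actual implementation.

```python
# Aliases listed in this section; all of them map to the canonical "s3".
S3_ALIASES = frozenset({"aws_s3", "amazon_s3", "s3_compatible"})


def normalize_provider(name: str) -> str:
    """Return the canonical provider name for an accepted alias.

    Names that are not known aliases are returned unchanged so that a later
    validation step can decide whether the provider is supported.
    """
    return "s3" if name in S3_ALIASES else name


print(normalize_provider("amazon_s3"))
```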
Account-Free Cloud Emulator Validation
Use the cloud-emulators profile when you need AWS/Azure/GCP connector
confidence without owning cloud accounts:
uv --preview-features extra-build-dependencies run python tools/data_connector_cloud_emulator_report.py --compact
uv --preview-features extra-build-dependencies run python tools/workflow_parity.py --profile cloud-emulators
The profile validates the sample emulator catalog against the same connector facility and runtime-adapter contracts used by real cloud targets. It covers:
| Cloud target | Account-free emulator | Local endpoint | What is proven |
|---|---|---|---|
| AWS S3 / S3-compatible storage | MinIO |  | provider aliasing, bucket/prefix target shape, |
| Azure Blob Storage | Azurite |  | account/container target shape, |
| Google Cloud Storage | fake-gcs-server |  |  |
| Search-index wiring | local OpenSearch or Elasticsearch |  | URL/index contract and explicit credential boundary |
This gives API-contract and emulator-compatible validation only. It does not prove real IAM, cloud firewall rules, private endpoints, regional behavior, quota, or billing. Those remain opt-in live smoke checks in a real operator environment with real credentials.
Credential Rule
Remote connectors must use auth_ref = "env:NAME". The value points to an
environment variable name, not to the credential itself.
Examples:
auth_ref = "env:AWS_PROFILE"
auth_ref = "env:AZURE_STORAGE_CONNECTION_STRING"
auth_ref = "env:GOOGLE_APPLICATION_CREDENTIALS"
The reports deliberately avoid materializing credential values. If a connector contains a raw secret-like value, the facility report marks the catalog invalid.
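One way to picture this rule is the sketch below, which accepts only the `env:NAME` form and reports whether the variable is set without ever reading its value. The function names, the assumed pattern for `NAME`, and the shape of the returned status are hypothetical; the facility report's real checks may differ.

```python
import os
import re

# Assumed shape for NAME: a conventional environment variable identifier.
AUTH_REF_PATTERN = re.compile(r"env:([A-Za-z_][A-Za-z0-9_]*)")


def parse_auth_ref(auth_ref: str) -> str:
    """Return the referenced variable name, or raise if the form is not env:NAME."""
    match = AUTH_REF_PATTERN.fullmatch(auth_ref)
    if match is None:
        # The offending value is deliberately not echoed: it may be a raw secret.
        raise ValueError("auth_ref must have the form env:NAME")
    return match.group(1)


def credential_status(auth_ref: str) -> dict[str, object]:
    """Report whether the referenced variable is set, without reading its value."""
    name = parse_auth_ref(auth_ref)
    return {"env_var": name, "present": name in os.environ}


print(parse_auth_ref("env:AZURE_STORAGE_CONNECTION_STRING"))
```

Checking membership in `os.environ` rather than fetching the value keeps the credential out of any report or log produced from the status.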
Evidence Reports
The public checks are contract-first:
uv --preview-features extra-build-dependencies run python tools/data_connector_facility_report.py --compact
uv --preview-features extra-build-dependencies run python tools/data_connector_resolution_report.py --compact
uv --preview-features extra-build-dependencies run python tools/data_connector_health_report.py --compact
uv --preview-features extra-build-dependencies run python tools/data_connector_health_actions_report.py --compact
uv --preview-features extra-build-dependencies run python tools/data_connector_runtime_adapters_report.py --compact
Use the live endpoint smoke only when you intentionally want to prove the operator-triggered execution path. The default public mode remains network-free.
How To Read The Boundary
- facility proves the catalog is structurally valid.
- resolution proves app/page settings can refer to connector IDs while preserving legacy fallback paths.
- health plans status probes but does not execute them by default.
- health_actions exposes explicit operator-triggered probe actions.
- runtime_adapters maps each connector to the dependency and operation a runtime would need when an operator opts in.
This keeps the first adoption path simple: a new user can run AGILAB without cloud credentials, while an operator can still see exactly which connector, dependency, and environment variable will be needed before enabling live access.
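The resolution boundary, where a setting may name a connector ID or keep a legacy raw path, can be sketched as follows. The `Resolution` type, the function name, and the rule that any value not found in the catalog is treated as a legacy path are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resolution:
    source: str  # "connector" or "legacy_path"
    target: str  # the connector id, or the raw path left unchanged


def resolve_data_source(setting: str, connector_ids: set[str]) -> Resolution:
    """Prefer a known connector id; otherwise keep the value as a legacy path."""
    if setting in connector_ids:
        return Resolution("connector", setting)
    return Resolution("legacy_path", setting)


# Hypothetical ids and paths, used only to show both outcomes.
known_ids = {"example_artifacts"}
print(resolve_data_source("example_artifacts", known_ids))
print(resolve_data_source("/data/legacy/run_01", known_ids))
```

Returning the legacy path unchanged is what lets existing apps keep working while they migrate to connector IDs.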