Provisioning overview

Provisioning is the workflow the platform runs to take a new-tenant request from "click submit" to "tenant is serving traffic at its subdomain." It's the busiest thing the control plane does: a multi-phase, idempotent workflow that you can observe in real time from the audit feed. This page explains how it works so the rest of the tenant-lifecycle docs read cleanly.

The phases

Each phase is idempotent: if a step fails transiently, the workflow retries that step without redoing what has already completed. The audit feed shows the chronology in real time.

Validate the request. The slug isn't taken; the plan exists; the caller's role admits provisioning.
Allocate the tenant's resources. Storage, an identity store, and a traffic boundary are stood up under the tenant's name.
Deploy the tenant's workloads. The services that will serve the tenant's authentication traffic launch and pass their readiness checks.
Mark the tenant active. State flips to active; traffic to https://<slug>-<org>.<domain> starts succeeding.
Send the welcome emails. Best-effort three-email sequence to the first admin: welcome, console URL, onboarding.
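
The phase order above can be sketched as a loop that records each completed phase and skips it on a later run. This is a minimal illustration, not the platform's implementation: the phase labels (other than MarkActive, which the audit events mention), the Tenant record, and the handler functions are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical phase labels; only "MarkActive" is named in the event descriptions.
PHASES = ["Validate", "Allocate", "Deploy", "MarkActive", "SendWelcomeEmails"]


@dataclass
class Tenant:
    slug: str
    state: str = "Provisioning"
    completed: List[str] = field(default_factory=list)


def run_phases(tenant: Tenant, handlers: Dict[str, Callable[[Tenant], None]]) -> Tenant:
    """Run each phase in order, skipping phases already recorded as complete."""
    for name in PHASES:
        if name in tenant.completed:
            continue  # resume point: completed work is not redone
        handlers[name](tenant)
        tenant.completed.append(name)
    return tenant


calls: List[str] = []  # records which handlers actually ran


def make_handler(name: str) -> Callable[[Tenant], None]:
    def handler(tenant: Tenant) -> None:
        calls.append(name)
        if name == "MarkActive":
            tenant.state = "Active"
    return handler


handlers = {name: make_handler(name) for name in PHASES}
tenant = run_phases(Tenant(slug="acme"), handlers)
```

Calling run_phases a second time on the same tenant runs no handlers, because every phase is already in the completed list.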

Total wall-clock time on a Free-tier tenant is typically 30 seconds to a few minutes. Larger plans with more resources take longer.

What you can observe from the console

While the workflow is running, the tenant's detail page is the best place to watch:

The state badge in the header reads Provisioning.
The Provisioning timeline panel fills in step by step with timestamps.
A "Polling..." indicator next to the timeline heading shows the page is live.

When the workflow completes successfully, the state flips to Active, the Open Admin Console link appears, and the timeline gets a final "Tenant activated" entry.

If a phase fails after retries, the workflow stops and the tenant lands in Failed state. The timeline names the failing step; the audit feed carries the tenant.provisioning.failed event with details. See Recover a stuck provisioning saga for the recovery path.

What you can observe from the audit feed

Provisioning emits a small family of events:

tenant.provisioning.started — the saga is running.
tenant.provisioning.step_completed — once per phase, with a human-readable step name in details.
tenant.provisioning.completed — the workflow reached MarkActive.
tenant.provisioning.failed — the workflow gave up after retries.
tenant.provisioning.compensated — a compensation step ran during rollback.

Filtering the audit feed by tenant.provisioning.* gives you the full chronology for any tenant.
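
As an illustration of that filter, the sketch below applies the tenant.provisioning.* pattern to a list of in-memory event records. The record shape (type, tenant, and details keys) and the sample events are assumptions made for the example; the real feed's schema and query interface may differ.

```python
from fnmatch import fnmatchcase

# Hypothetical event records; the field names are assumptions for this sketch.
events = [
    {"type": "tenant.provisioning.started", "tenant": "acme", "details": {}},
    {"type": "tenant.provisioning.step_completed", "tenant": "acme",
     "details": {"step": "Validate the request"}},
    {"type": "tenant.provisioning.started", "tenant": "globex", "details": {}},
    {"type": "tenant.settings.updated", "tenant": "acme", "details": {}},
    {"type": "tenant.provisioning.completed", "tenant": "acme", "details": {}},
]


def provisioning_chronology(feed, tenant, pattern="tenant.provisioning.*"):
    """Return one tenant's provisioning events, preserving feed order."""
    return [
        event for event in feed
        if event["tenant"] == tenant and fnmatchcase(event["type"], pattern)
    ]


chronology = [e["type"] for e in provisioning_chronology(events, "acme")]
```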

Idempotency, briefly

Each phase is designed to be retryable without duplicating side effects. That means:

Workflow retries are safe. If the platform restarts mid-saga, the workflow resumes after the last successful phase and continues.
Activity retries are safe. If a single phase fails transiently (e.g. a brief upstream blip), it retries automatically up to a budget.
Manual retries are safe. If you retry a failed tenant from the console, the workflow resumes at the step that failed rather than starting from scratch.

This is why partial failures don't leave irrecoverable damage: every step knows how to no-op when its work has already been done.
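
A minimal sketch of both ideas, a step that no-ops when its work already exists and a retry loop with a fixed budget, might look like the following. The names (TransientError, retry_with_budget, allocate_storage), the in-memory inventory, and the budget of 3 are all assumptions for illustration; the platform's actual budget and step logic are not documented on this page.

```python
class TransientError(Exception):
    """Stands in for a brief upstream blip."""


def retry_with_budget(step, budget=3):
    """Call step() until it succeeds or the attempt budget is spent."""
    last_error = None
    for _ in range(budget):
        try:
            return step()
        except TransientError as exc:
            last_error = exc
    raise RuntimeError(f"gave up after {budget} attempts") from last_error


inventory = {}     # hypothetical record of what already exists, keyed by slug
create_calls = []  # tracks real creations, to show retries do not duplicate


def allocate_storage(slug):
    """Idempotent step: no-op if the resource already exists."""
    if slug not in inventory:
        create_calls.append(slug)
        inventory[slug] = f"storage-{slug}"
    return inventory[slug]


attempts = []


def flaky_allocate():
    attempts.append(1)
    bucket = allocate_storage("acme")  # the side effect happens first
    if len(attempts) < 3:
        raise TransientError("upstream blip")  # then the call fails
    return bucket


result = retry_with_budget(flaky_allocate, budget=3)
```

Here the step is attempted three times, but the storage is created only once: the second and third attempts find the existing entry and return it.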

When provisioning fails

When a phase fails after exhausting its retries, the workflow parks the tenant in Failed state and rolls back what has been done so far. Compensation runs in reverse order, releasing whatever resources were allocated before the failure point. The audit feed captures every compensation step too, so you can reconstruct what happened.
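
The reverse-order rollback can be sketched as a small saga runner. This is an illustration under stated assumptions: the step tuples, the run_saga function, and the exact ordering of the failed and compensated events are hypothetical, and per-step retries are omitted for brevity.

```python
def run_saga(steps, audit):
    """Run (name, action, compensate) steps; on failure, undo in reverse order."""
    audit.append(("tenant.provisioning.started", None))
    done = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception:
            audit.append(("tenant.provisioning.failed", name))
            # Compensate completed steps, most recent first.
            for prev_name, prev_compensate in reversed(done):
                prev_compensate()
                audit.append(("tenant.provisioning.compensated", prev_name))
            return "Failed"
        done.append((name, compensate))
        audit.append(("tenant.provisioning.step_completed", name))
    audit.append(("tenant.provisioning.completed", None))
    return "Active"


resources = set()  # stands in for allocated tenant resources
audit = []


def fail_deploy():
    raise RuntimeError("readiness check failed")


steps = [
    ("Validate", lambda: None, lambda: None),
    ("Allocate", lambda: resources.add("storage"),
     lambda: resources.discard("storage")),
    ("Deploy", fail_deploy, lambda: None),
]

state = run_saga(steps, audit)
```

In this run, Deploy fails, so the Allocate compensation releases the storage before the Validate compensation (a no-op) runs, and the audit list records each of those steps.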

From the Failed state, you have two paths:

Retry. The workflow resumes at the step that failed. If the underlying cause was transient (a brief upstream outage), this usually succeeds.
Decommission. Treat the tenant as terminal. Release any remaining resources and free the slug for reuse.
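
One way to picture the states on this page is as a small transition table. The sketch below is an assumption-laden model, not the platform's state machine: the Decommissioned state name, the action names, and the idea that a retry moves the tenant back to Provisioning are all hypothetical, and transitions described elsewhere in the lifecycle docs are left out.

```python
from enum import Enum


class State(Enum):
    PROVISIONING = "Provisioning"
    ACTIVE = "Active"
    FAILED = "Failed"
    DECOMMISSIONED = "Decommissioned"  # hypothetical name for the terminal state


# Only the transitions described on this page; action names are illustrative.
TRANSITIONS = {
    (State.PROVISIONING, "complete"): State.ACTIVE,
    (State.PROVISIONING, "fail"): State.FAILED,
    (State.FAILED, "retry"): State.PROVISIONING,
    (State.FAILED, "decommission"): State.DECOMMISSIONED,
}


def apply_action(state: State, action: str) -> State:
    """Return the next state, or raise if the action is not allowed."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"{action!r} is not allowed from {state.value}") from None


after_retry = apply_action(State.FAILED, "retry")
after_decommission = apply_action(State.FAILED, "decommission")
```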

See Recover a stuck provisioning saga for the diagnostic order of operations.