The Identities page aggregates users across every active tenant your organisation owns. When one of those tenants' identity stores is temporarily unreachable (for example, a transient infrastructure issue or a partial outage), the aggregate doesn't fail entirely: it serves the rows it can reach and surfaces a per-tenant error notice for the rest. This page explains what the degraded view looks like and what to do when you see it.
The partial-failure pattern
If everything is healthy, the Identities page shows one big paginated list with rows from every tenant interleaved. When one tenant's identity store can't be reached:
- Rows from every healthy tenant still appear.
- A small banner at the top of the table lists tenants that couldn't be queried, with a "Retry" link.
- The total count in the page header carries an approximate marker (e.g. "120+ identities") rather than an exact number, since the unreachable tenants' counts aren't included.
What you don't see:
- Rows from the unreachable tenant. They're not "missing" from the database — they just couldn't be fetched right now.
- A hard failure. The page renders fully; you can still search, paginate, and click through to reachable tenants' detail pages.
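The pattern above can be sketched as a fan-out that keeps whatever succeeds and records whatever fails. This is a minimal illustration, not the platform's actual aggregator: `aggregate_identities`, `AggregateResult`, and `fake_fetch` are hypothetical names, and the count here is simply the number of rows fetched, which ignores pagination.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class AggregateResult:
    rows: list = field(default_factory=list)         # identities from reachable tenants
    unreachable: list = field(default_factory=list)  # tenant names for the banner

    @property
    def count_label(self) -> str:
        # The total is only a lower bound when some tenants were skipped.
        suffix = "+" if self.unreachable else ""
        return f"{len(self.rows)}{suffix} identities"


def aggregate_identities(tenants, fetch_rows) -> AggregateResult:
    """Query every tenant; keep what succeeds, record what fails."""
    result = AggregateResult()
    with ThreadPoolExecutor(max_workers=8) as pool:
        futures = {name: pool.submit(fetch_rows, name) for name in tenants}
    # Leaving the with-block waits for every query to finish.
    for name, future in futures.items():
        try:
            result.rows.extend(future.result())
        except Exception:
            # One failing tenant must not fail the whole page.
            result.unreachable.append(name)
    return result


def fake_fetch(tenant):
    # Stand-in for a real identity-store query (hypothetical).
    if tenant == "tenant-b":
        raise ConnectionError("identity store unreachable")
    return [f"{tenant}/user-{i}" for i in range(2)]


result = aggregate_identities(["tenant-a", "tenant-b", "tenant-c"], fake_fetch)
print(result.count_label)  # 4+ identities
print(result.unreachable)  # ['tenant-b']
```

The key design choice is that a per-tenant exception is caught and converted into banner data instead of propagating, which is what lets the page render fully.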
What to do when you see partial failure
Three steps, in order:
- Identify which tenant is unreachable. The banner names it. If multiple tenants are affected, the banner lists them all; when there are many, it shows the first 3 followed by "and N more".
- Check the affected tenant's state. Open its detail page at /dashboard/tenants/<id>. If it's in Suspended state, that explains the unreachability: suspended tenants reject identity-store queries. If it's in Active state, the issue is infrastructure-side.
- Click Retry on the banner. Transient issues clear themselves; clicking Retry re-attempts the unreachable tenants. If the second attempt also fails, escalate per the Incident response runbook.
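The banner truncation and the triage order can be written down as two small functions. This is a sketch under assumptions: the function names, the exact banner wording, and the returned strings are illustrative, and `next_step` only distinguishes the Suspended and Active states discussed in these steps.

```python
def banner_tenants(unreachable, limit=3):
    """Name the first `limit` tenants, then summarise the rest."""
    shown = ", ".join(unreachable[:limit])
    extra = len(unreachable) - limit
    return f"{shown} and {extra} more" if extra > 0 else shown


def next_step(tenant_state, retry_failed=False):
    """Map the three triage steps onto a suggested action."""
    if tenant_state == "Suspended":
        # Suspension explains the unreachability; nothing to fix.
        return "expected: suspended tenants reject identity-store queries"
    if not retry_failed:
        return "click Retry: transient issues usually clear themselves"
    return "escalate per the Incident response runbook"


print(banner_tenants(["a", "b", "c", "d", "e"]))  # a, b, c and 2 more
print(next_step("Active", retry_failed=True))     # escalate per the Incident response runbook
```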
When this is normal
Two cases where partial-failure rendering is expected:
- Tenant is in Suspended state. Suspending a tenant gates its identity store too. The Identities page shows the suspension as the reason in the banner, not a generic "unreachable."
- Tenant is currently provisioning or decommissioning. The identity store may be coming up or shutting down; the aggregate skips it cleanly.
In both cases, the banner copy is specific to the state (e.g. "banking-cymmetri is suspended; users not shown until you resume it") so you know it's not a bug.
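One way to picture state-specific banner copy is a lookup from tenant state to a message template, with a generic fallback. Only the Suspended wording below comes from this page; the state keys, the other templates, and `banner_message` are assumptions made for illustration.

```python
BANNER_COPY = {
    "Suspended": "{tenant} is suspended; users not shown until you resume it",
    # The next two wordings are invented for this sketch.
    "Provisioning": "{tenant} is still provisioning; users not shown yet",
    "Decommissioning": "{tenant} is being decommissioned; users not shown",
}
GENERIC_COPY = "{tenant} is unreachable"  # illustrative fallback wording


def banner_message(tenant, state):
    """Prefer a state-specific explanation over the generic notice."""
    return BANNER_COPY.get(state, GENERIC_COPY).format(tenant=tenant)


print(banner_message("banking-cymmetri", "Suspended"))
# banking-cymmetri is suspended; users not shown until you resume it
```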
When it's a real problem
Partial-failure rendering should be rare in production. If you see it persistently:
- One tenant always unreachable. That tenant has a real platform-level issue — its identity-store pod isn't running, its database is down, the network path is broken. Treat as an incident; see Incident response runbook.
- Many tenants intermittently unreachable. This points to an aggregate-layer issue: the Identities aggregator can't fan out reliably. It is likely platform-team territory, so escalate it to the platform team.
- Banner stuck even after Retry. UI cache might be holding the old error. Hard-refresh the page (Ctrl/Cmd+Shift+R). If still stuck, sign out and back in.
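If you note which tenants appear in the banner across several page loads, the first two patterns can be told apart mechanically. The sketch below assumes a hand-collected history; `classify` and its `many` threshold are hypothetical and not a platform feature.

```python
from collections import Counter


def classify(history, many=3):
    """history: list of sets, each the tenants unreachable on one page load."""
    if not history:
        return "no data"
    failures = Counter(t for attempt in history for t in attempt)
    if not failures:
        return "healthy"
    # A tenant that failed on every load points at the tenant itself.
    stuck = sorted(t for t, n in failures.items() if n == len(history))
    if stuck:
        return f"tenant-level incident: {', '.join(stuck)}"
    # Different tenants failing at different times points at the fan-out.
    if len(failures) >= many:
        return "aggregate-layer issue: escalate to the platform team"
    return "transient: retry"


print(classify([{"a"}, {"a"}, {"a"}]))         # tenant-level incident: a
print(classify([{"a"}, {"b"}, {"c"}, set()]))  # aggregate-layer issue: escalate to the platform team
```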
What partial failure doesn't break
A useful mental model: the Identities aggregate degrades, but its inputs don't. Specifically:
- The affected tenants' end-users can still sign in. End-users hit the tenant directly at its subdomain; the Identities aggregate only fans out FROM the control plane TO each tenant. If that fan-out fails, end-user auth is unaffected (assuming the tenant itself is healthy).
- The affected tenants' admins can still operate. Tenant admins sign in to their own admin console, not the platform-level Identities page.
- The audit feed still shows everything. Audit events are written by each tenant directly to the platform; they don't depend on the Identities fan-out.
In other words: partial failure on the Identities page is a degradation of the platform operator's view, not a customer-facing incident, provided the underlying cause is the Identities aggregator and not the tenant itself.