Execution Flow

From authenticated request to Temporal workflow, FastMCP tools, and durable persistence

Last updated: December 2025

🔁 Request lifecycle

Every interaction runs as an org-scoped Temporal workflow so we can pause for hours, resume with new context, and audit every tool call. The chronological flow below mirrors today's production orchestrator.

  1. Channel gateway authenticates the user. The Org Manager identity layer handles signup/signin for each org, minting org_id-scoped JWTs that propagate across chat, voice, webhook, and SDK traffic.
  2. Backend starts a Temporal workflow. The workflow input includes org_id, agent_id, workflow_id, payload, and metadata such as thread_id or channel.
  3. Supervisor agent loop runs inside the workflow. Supervisors reference agent config from Cassandra, fetch knowledge snapshots, and emit plans that specialists execute.
  4. Tool execution happens via FastMCP. The workflow calls the `execute_tool_request` activity with org_id, tool_name, and parameters. The tool gateway enforces zero-trust boundaries and uses Vault-backed credentials.
  5. Elicitation keeps runs alive. If a tool schema check fails, the activity responds with `elicitation_required`. The workflow then either asks the user for clarification (via a Temporal Signal) or fans out to another tool to auto-fill the missing data.
  6. Finalization records observability and reward state. Structured summaries, reward scores, and follow-up decisions persist to Cassandra/Redis and stream back to the UI.

  • Temporal workflows: Hold durable state, retries, HITL pauses, and signals for days or weeks without consuming worker resources.
  • FastMCP Tool Gateway: Single executor interface that validates schemas, routes to 500+ tools, and isolates credentials inside the gateway.
  • Org context everywhere: org_id, agent_id, and workflow_id travel through every activity so secrets, caches, and audit trails stay tenant-scoped.
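
The workflow input and the way org context travels into a tool call can be sketched as follows. This is a minimal illustration in plain Python, not the production orchestrator or the Temporal SDK: the field names come from the steps above, while the `WorkflowInput` class, the `build_tool_request` helper, and the sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class WorkflowInput:
    """Input handed to the workflow when the backend starts a run (hypothetical shape)."""
    org_id: str
    agent_id: str
    workflow_id: str
    payload: dict[str, Any]
    metadata: dict[str, Any] = field(default_factory=dict)  # e.g. thread_id, channel


def build_tool_request(run: WorkflowInput, tool_name: str,
                       parameters: dict[str, Any]) -> dict[str, Any]:
    """Build the argument for the execute_tool_request activity.

    The org context is copied from the workflow input, not from model
    output, so every tool call stays scoped to the tenant that owns the run.
    """
    return {
        "org_id": run.org_id,
        "agent_id": run.agent_id,
        "workflow_id": run.workflow_id,
        "tool_name": tool_name,
        "parameters": dict(parameters),  # copy so later edits do not leak back
    }


run = WorkflowInput(
    org_id="org_1",
    agent_id="agent_1",
    workflow_id="wf_1",
    payload={"text": "Where is my order?"},
    metadata={"channel": "chat"},
)
request = build_tool_request(run, "orders.lookup", {"order_id": "A-100"})
```

Deriving the tenant fields from the run, rather than accepting them as tool parameters, is one way to keep secrets, caches, and audit trails tenant-scoped.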

🧩 Tool execution & elicitation

FastMCP activities validate parameters before any external API call. When a required field is missing or ambiguous, the workflow surfaces a clarification event rather than failing, allowing the run to self-heal.

  • Schema validation: Tool definitions include JSON Schema. Invalid payloads raise deterministic errors instead of letting free-form LLM text through.
  • Elicitation loop: Activities return structured payloads (`status=elicitation_required`) with the missing parameter, enabling the workflow to ask the user or call a lookup sub-agent.
  • Safe retries: Temporal retry policies automatically retry calls that fail transiently (network errors, 429 responses, etc.) and escalate permanent failures to supervisor policy logic.
  • Credential isolation: Each tool call retrieves scoped secrets from Vault based on org_id; raw credentials never reach the LLM or supervisor memory.
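
The validation, elicitation, and retry behavior described above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the FastMCP or Temporal API: `check_parameters`, `resolve_missing`, `failure_action`, the reduced JSON Schema handling, and the set of transient status codes are all hypothetical. Only the `elicitation_required` status comes from the text.

```python
from typing import Any, Callable, Optional

# Subset of JSON Schema type names mapped to Python types (illustrative only).
JSON_TYPES = {"string": str, "integer": int, "number": (int, float),
              "boolean": bool, "object": dict, "array": list}

# Assumed transient HTTP statuses; the text above only names 429 explicitly.
TRANSIENT_STATUSES = {429, 502, 503, 504}


def check_parameters(schema: dict[str, Any], params: dict[str, Any]) -> dict[str, Any]:
    """Return an 'ok' payload, or 'elicitation_required' listing missing fields."""
    missing = [name for name in schema.get("required", [])
               if params.get(name) in (None, "")]
    if missing:
        return {"status": "elicitation_required", "missing": missing}
    for name, spec in schema.get("properties", {}).items():
        expected = JSON_TYPES.get(spec.get("type"))
        if name in params and expected and not isinstance(params[name], expected):
            # Wrong types are a deterministic error, not something to elicit.
            raise ValueError(f"parameter {name!r} must be of type {spec['type']}")
    return {"status": "ok"}


def resolve_missing(schema: dict[str, Any], params: dict[str, Any],
                    lookup: Callable[[str], Optional[Any]]) -> dict[str, Any]:
    """Try to auto-fill missing fields via lookup(); otherwise keep eliciting."""
    result = check_parameters(schema, params)
    filled = dict(params)
    if result["status"] == "ok":
        return {"status": "ok", "parameters": filled}
    for name in result["missing"]:
        value = lookup(name)  # stands in for a lookup sub-agent or another tool
        if value is not None:
            filled[name] = value
    result = check_parameters(schema, filled)
    if result["status"] == "ok":
        return {"status": "ok", "parameters": filled}
    return result  # still incomplete: the workflow would now ask the user


def failure_action(http_status: Optional[int]) -> str:
    """Return 'retry' for transient failures and 'escalate' for permanent ones."""
    if http_status is None or http_status in TRANSIENT_STATUSES:
        return "retry"  # None stands for a network error with no response
    return "escalate"


schema = {
    "required": ["email", "ticket_id"],
    "properties": {"email": {"type": "string"}, "ticket_id": {"type": "integer"}},
}
first_check = check_parameters(schema, {"email": "a@example.com"})
resolved = resolve_missing(schema, {"email": "a@example.com"},
                           lambda name: 42 if name == "ticket_id" else None)
```

In this sketch a missing field produces a structured payload the workflow can act on, while a wrongly typed field raises immediately, which mirrors the split between elicitation and deterministic schema errors.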

🗄️ Workflow persistence & observability

The workflow persistence ADR mandates Cassandra as the operational store for run metadata, observability blocks, and reward payloads. Redis buffers hot, in-flight data, while an optional Elasticsearch projection indexes runs for search and analytics.

  • Cassandra run ledger: Stores workflow_id, org_id, status, timestamps, observability JSON, and reward summaries for audit and dashboards.
  • Redis & streaming buffer: Provides low-latency scratch space for in-flight prompts, streaming text, and clarification states.
  • Elasticsearch projection: Optional near-real-time index for filtering by agent, abstention reason, SLA, or governance flag without replaying events.
  • Org-manager hooks: Reward payloads trigger org-manager follow-ups and nightly schedules so procedural memory stays current.
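
As a rough illustration of the stores above, the sketch below models one run-ledger row and an org-scoped key for the scratch buffer. It assumes nothing about the real Cassandra schema or Redis key layout: `RunRecord`, `to_row`, `scratch_key`, the column names beyond those listed above, and the key format are hypothetical.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Optional


@dataclass
class RunRecord:
    """One run-ledger entry with the fields listed above (hypothetical shape)."""
    workflow_id: str
    org_id: str
    status: str
    started_at: datetime
    finished_at: Optional[datetime] = None
    observability: dict[str, Any] = field(default_factory=dict)
    reward_summary: dict[str, Any] = field(default_factory=dict)

    def to_row(self) -> dict[str, Any]:
        """Flatten to plain values, with nested blocks serialized as JSON text."""
        return {
            "workflow_id": self.workflow_id,
            "org_id": self.org_id,
            "status": self.status,
            "started_at": self.started_at.isoformat(),
            "finished_at": self.finished_at.isoformat() if self.finished_at else None,
            "observability": json.dumps(self.observability, sort_keys=True),
            "reward_summary": json.dumps(self.reward_summary, sort_keys=True),
        }


def scratch_key(org_id: str, workflow_id: str, kind: str) -> str:
    """Build a tenant-scoped key for in-flight data (assumed naming scheme)."""
    return f"org:{org_id}:run:{workflow_id}:{kind}"


record = RunRecord(
    workflow_id="wf_1",
    org_id="org_1",
    status="completed",
    started_at=datetime(2025, 12, 1, tzinfo=timezone.utc),
    reward_summary={"score": 0.9},
)
row = record.to_row()
key = scratch_key("org_1", "wf_1", "stream")
```

Prefixing every scratch key with the org identifier is one simple way to keep buffered data tenant-scoped, in line with the org context carried through each activity.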

🛡️ Error handling & compliance

Reliability layers in the orchestrator ensure that every run either completes successfully, clarifies safely, or records a structured abstention.

  • Planner contract enforcement: Plans are normalized, diffed, and validated before execution to avoid unsafe tool sequences.
  • Clarification lifecycle: Pending clarifications are persisted with both masked and raw candidates and are exposed through REST endpoints for human resolution.
  • Observability summary: Each run emits `observability.summary`, `normalizer.usage_summary`, and reward metrics for Grafana and alerting.
  • Org isolation: Org Manager identities, role enforcement, Vault policies, Temporal task queues, and Kubernetes namespaces keep tenants isolated end-to-end.
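
Planner contract enforcement and structured abstention can be pictured with a small sketch. It is an assumption-laden illustration rather than the orchestrator's actual logic: `normalize_plan`, `validate_plan`, the plan step shape, the allow-list check, and the abstention fields are hypothetical, and the diffing step mentioned above is omitted.

```python
from typing import Any


def normalize_plan(plan: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Trim and lowercase tool names, and drop steps that name no tool."""
    normalized = []
    for step in plan:
        tool = str(step.get("tool", "")).strip().lower()
        if tool:
            normalized.append({
                "tool": tool,
                "parameters": dict(step.get("parameters") or {}),
            })
    return normalized


def validate_plan(plan: list[dict[str, Any]], allowed_tools: set[str]) -> dict[str, Any]:
    """Return runnable steps, or a structured abstention explaining why not."""
    steps = normalize_plan(plan)
    blocked = [step["tool"] for step in steps if step["tool"] not in allowed_tools]
    if blocked:
        return {"status": "abstained", "reason": "disallowed_tool", "tools": blocked}
    if not steps:
        return {"status": "abstained", "reason": "empty_plan", "tools": []}
    return {"status": "ok", "steps": steps}


allowed = {"crm.lookup", "email.send"}
accepted = validate_plan([{"tool": " CRM.Lookup "}, {"tool": ""}], allowed)
rejected = validate_plan([{"tool": "shell.exec"}], allowed)
```

Returning a structured abstention instead of raising keeps the outcome recordable, so a rejected plan still leaves an auditable trace in the run ledger.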
