API endpoints

Base URL: https://api.cortexlayer.dev. All POST and PATCH request bodies are JSON. Responses are JSON, except on streaming endpoints, which return Server-Sent Events.

Authentication

Two credential types — pick by call site:

Credential	Header	Where it’s safe to use	Endpoints
API key (`ck_live_…`)	`Authorization: Bearer <key>`	Server-side only — never ship to a browser	All admin/CRUD endpoints; `/v1/widget/session` mint
Session token (`cs_…`)	`X-Cortex-Session: <token>`	Browser-side, scoped to one agent + one origin, 15 min TTL	`/v1/chat/stream` (widget path)

API keys are stored at rest as HMAC-SHA256 hashes keyed with a server-side pepper. The key prefix is indexed for fast lookup, and the secret is compared in constant time. Session tokens are opaque: they live in Redis and can be revoked by deleting the entry.
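The storage scheme described above can be sketched as follows. This is an illustration only, not the service's actual code: the prefix length, the pepper value, and the in-memory `db` dict are all assumptions made for the example.

```python
import hashlib
import hmac

# Assumptions for illustration: the real pepper lives in secret storage,
# and the real prefix length is not documented.
PEPPER = b"example-pepper-do-not-use"
PREFIX_LEN = 12


def split_key(api_key: str) -> tuple[str, str]:
    """Split a key into its indexed prefix and its secret remainder."""
    return api_key[:PREFIX_LEN], api_key[PREFIX_LEN:]


def hash_secret(secret: str, pepper: bytes = PEPPER) -> str:
    """HMAC-SHA256 of the secret, keyed with the server-side pepper."""
    return hmac.new(pepper, secret.encode("utf-8"), hashlib.sha256).hexdigest()


def store_key(db: dict[str, str], api_key: str) -> None:
    """Persist only the prefix and the peppered hash, never the raw secret."""
    prefix, secret = split_key(api_key)
    db[prefix] = hash_secret(secret)


def verify_key(db: dict[str, str], api_key: str) -> bool:
    """Look up by prefix, then compare hashes in constant time."""
    prefix, secret = split_key(api_key)
    stored = db.get(prefix)
    if stored is None:
        return False
    return hmac.compare_digest(stored, hash_secret(secret))
```

Comparing with `hmac.compare_digest` rather than `==` avoids leaking how many leading characters matched through response timing.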

Agents

`POST /v1/agents`

Create an agent. Auth: API key. Bodies use camelCase and additionalProperties: false is enforced — unknown fields are rejected.

Minimal runnable body (copy/paste into curl):

{
  "name": "Support bot",
  "systemPrompt": "You are a friendly support agent for ACME Inc.",
  "modelPolicy": {
    "provider": "gemini",
    "model": "gemini-2.5-flash"
  },
  "budget": {
    "maxCostUsd": 0.05,
    "maxSteps": 8,
    "wallClockMs": 30000
  },
  "tools": [],
  "allowedDomains": [],
  "allowedOrigins": ["https://your-site.com"]
}

Field reference (omit optional fields entirely — do not send null):

Field	Type	Required	Constraints
`name`	string	yes	1–128 chars
`systemPrompt`	string	yes	1–10000 chars
`modelPolicy.provider`	enum	yes	`"gemini"` | `"openai"` | `"anthropic"`
`modelPolicy.model`	string	yes	see Models; must be in tenant’s plan allowlist
`modelPolicy.temperature`	number	no	0–2
`modelPolicy.maxOutputTokens`	int	no	1–16384
`fallback.provider` / `fallback.model`	object	no	both fields required together if present
`knowledgeBaseId`	uuid	no	must reference a knowledge base owned by the tenant
`budget.maxCostUsd`	number	yes	0–100
`budget.maxSteps`	int	yes	1–32
`budget.wallClockMs`	int	yes	1000–120000
`tools`	array	yes	`ToolDefinition[]`, max 16; pass `[]` for none
`allowedDomains`	array	yes	strings ≤253 chars, max 32; outbound HTTP allowlist for tools
`allowedOrigins`	array	no	full origin URLs (`scheme://host[:port]`), max 32 entries; `/v1/widget/session` validates the request Origin against this list

Returns the created Agent — its id field is a UUID.
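As a rough sketch of the call, the request can be assembled with Python's standard library. The helper name `build_create_agent_request` is hypothetical, and the `Content-Type` header is an assumption based on the "bodies are JSON" rule. The helper only strips top-level `None` values so that optional fields are omitted instead of being sent as null.

```python
import json
import urllib.request

BASE_URL = "https://api.cortexlayer.dev"


def build_create_agent_request(api_key: str, agent: dict) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/agents request."""
    # Optional fields must be omitted, not sent as null, so drop top-level Nones.
    body = {key: value for key, value in agent.items() if value is not None}
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/agents",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumption: not stated in this reference
        },
    )
```

Passing the returned object to `urllib.request.urlopen` sends it; on success the JSON response body holds the created Agent, including its `id`.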

`PATCH /v1/agents/:id`

Partial update. Same body shape, all fields optional. Pass "fallback": null to clear a previously set fallback (vs. omitting, which leaves it untouched).
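Because omitting a field and sending null mean different things here, client code needs a way to tell "not provided" apart from `None`. One common approach is a sentinel value. The sketch below uses a hypothetical helper `build_agent_patch` and covers only three fields for brevity.

```python
import json


class _Unset:
    """Sentinel type meaning 'leave this field untouched'."""

    def __repr__(self) -> str:
        return "UNSET"


UNSET = _Unset()


def build_agent_patch(name=UNSET, system_prompt=UNSET, fallback=UNSET) -> str:
    """Return a JSON PATCH body containing only the fields that were passed.

    Passing fallback=None emits "fallback": null, which clears the fallback.
    Leaving fallback at UNSET omits the key, which leaves it unchanged.
    """
    candidates = {
        "name": name,
        "systemPrompt": system_prompt,
        "fallback": fallback,
    }
    body = {key: value for key, value in candidates.items() if value is not UNSET}
    return json.dumps(body)
```

For example, `build_agent_patch(fallback=None)` produces `{"fallback": null}`, while `build_agent_patch(name="New name")` produces a body with no `fallback` key at all.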

Dashboard playground sessions

POST /v1/agents/:id/playground-session mints a session for dashboard testing only. It does not require an Origin header and is authorized by the agents:write scope instead. It is not meant for public widget embeds; use /v1/widget/session for all production widget integrations.

`POST /v1/widget/session`

Mint a short-lived browser-safe token bound to one agent + the requesting origin. Auth: API key.

{ "agentId": "<agent-uuid>" }

The Origin header is required. The server reads it and validates it against the agent’s allowedOrigins. Requests without it are rejected:

{
  "code": "origin_not_allowed",
  "message": "request is missing the Origin header; widget sessions require a browser origin",
  "http": 403
}

Returns 201 Created:

{
  "sessionToken": "cs_...",
  "expiresAt": "2026-04-21T15:30:00Z",
  "messageCap": 50
}

The token’s TTL is 15 minutes; messageCap is the total messages allowed on this session before a fresh mint is required.
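A client can track both limits locally and mint a fresh token shortly before either one is hit. The sketch below is one way to do that; the `WidgetSession` class, the 30-second safety margin, and the choice to count one unit per sent message are assumptions, since this reference does not say exactly which messages count toward `messageCap`.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


class WidgetSession:
    """Local bookkeeping for a minted session token."""

    def __init__(self, mint_response: dict) -> None:
        self.token = mint_response["sessionToken"]
        # Replace the trailing "Z" so fromisoformat works on older Pythons too.
        self.expires_at = datetime.fromisoformat(
            mint_response["expiresAt"].replace("Z", "+00:00")
        )
        self.remaining = mint_response["messageCap"]

    def record_message(self) -> None:
        """Call once per message sent on this session."""
        self.remaining -= 1

    def needs_refresh(
        self,
        now: Optional[datetime] = None,
        margin: timedelta = timedelta(seconds=30),
    ) -> bool:
        """True when the cap is used up or expiry is within the margin."""
        current = now or datetime.now(timezone.utc)
        return self.remaining <= 0 or current >= self.expires_at - margin
```

Checking `needs_refresh()` before each chat call lets the client re-mint proactively instead of reacting to a rejected request.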

Server-side proxy integration: If you proxy this call from your own backend (Express, Fastify, Django, Go, etc.), forward the inbound Origin header verbatim — don’t strip it or replace it with your server’s origin. The allowlist check is strict and rejects mismatches. See the quickstart for language-specific proxy examples.
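A framework-agnostic sketch of the proxy step: take the inbound request's headers, copy the Origin value unchanged, and attach the API key server-side. The helper name `build_session_mint_request` is hypothetical, and `inbound_headers` stands in for whatever header mapping your framework exposes.

```python
import json
import urllib.request


def build_session_mint_request(
    api_key: str, agent_id: str, inbound_headers: dict
) -> urllib.request.Request:
    """Build the upstream POST /v1/widget/session call for a proxied browser request."""
    # Header names are case-insensitive, so match "origin" in any casing.
    origin = next(
        (value for name, value in inbound_headers.items() if name.lower() == "origin"),
        None,
    )
    if not origin:
        # Without a browser Origin the upstream call would be rejected anyway.
        raise ValueError("inbound request has no Origin header")
    return urllib.request.Request(
        url="https://api.cortexlayer.dev/v1/widget/session",
        data=json.dumps({"agentId": agent_id}).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # stays on the server
            "Content-Type": "application/json",  # assumption: not stated in this reference
            "Origin": origin,  # forwarded verbatim from the browser
        },
    )
```

Failing fast when Origin is missing keeps the error close to its cause instead of surfacing it as an upstream `origin_not_allowed` response.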

Chat

`POST /v1/chat/stream`

Streaming chat (Server-Sent Events). Auth: session token (widget path, header X-Cortex-Session) or API key (server path, header Authorization: Bearer). The sessionToken is also required in the body for widget calls.

Minimal runnable body:

{
  "agentId": "00000000-0000-0000-0000-000000000000",
  "sessionToken": "cs_...",
  "messages": [{ "role": "user", "content": "Hi" }]
}

Field reference:

Field	Type	Required	Constraints
`agentId`	uuid	yes	—
`sessionToken`	string	yes	16–256 chars; same value as the `X-Cortex-Session` header
`messages`	array	yes	1–64 items, each `{ role, content }`
`conversationId`	uuid	no	server creates one if omitted
`model`	string	no	per-call override of `modelPolicy.model`
`provider`	enum	no	per-call override of `modelPolicy.provider`
`temperature`	number	no	0–2
`maxOutputTokens`	int	no	1–16384
`requestId`	string	no	8–128 chars; for client-side correlation
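The widget path sends the session token twice: once in the `X-Cortex-Session` header and once in the body. The sketch below shows the request shape only. A real widget call is made from the browser, and the helper name `build_widget_chat_request` and the `Content-Type` header are assumptions.

```python
import json
import urllib.request


def build_widget_chat_request(
    agent_id: str, session_token: str, user_text: str
) -> urllib.request.Request:
    """Build (but do not send) a widget-path POST /v1/chat/stream request."""
    body = {
        "agentId": agent_id,
        "sessionToken": session_token,  # must equal the header value below
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        url="https://api.cortexlayer.dev/v1/chat/stream",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "X-Cortex-Session": session_token,
            "Content-Type": "application/json",  # assumption: not stated in this reference
        },
    )
```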

The response is always text/event-stream — there is no non-streaming mode. Frame types:

`type`	Payload	Notes
`start`	`requestId`, `runId`, `provider`, `model`, `conversationId`	First frame.
`delta`	`text`	Append to the current assistant bubble.
`tool_call`	`name`, `args`	Tool runtime is about to execute.
`tool_result`	`name`, `output`	Result of the preceding `tool_call`.
`usage`	`inputTokens`, `outputTokens`, `costUsd`	Emitted near end-of-run for cost reporting.
`error`	`code`, `message`	Ends the current run; recoverable in the sense that the client can send a new request.
`done`	`finishReason`	Last frame.

Errors that prevent the stream from starting are returned as JSON with the standard envelope.
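A minimal consumer for the stream might look like the sketch below. It assumes each SSE event carries one JSON object in its `data:` field, with the payload fields sitting next to `type` in that object; this reference does not spell out the wire layout, so treat both as assumptions. `iter_frames` and `collect_reply` are hypothetical names.

```python
import json
from typing import Iterable, Iterator, Optional


def iter_frames(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON frame per SSE event (events end at a blank line)."""
    data_lines: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data_lines:
                yield json.loads("\n".join(data_lines))
                data_lines = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # SSE strips a single leading space after the colon.
            data_lines.append(value[1:] if value.startswith(" ") else value)
        # Other SSE fields and ":" comment lines are ignored in this sketch.


def collect_reply(frames: Iterable[dict]) -> tuple[str, Optional[str]]:
    """Concatenate delta text until done; raise if an error frame arrives."""
    parts: list[str] = []
    finish_reason = None
    for frame in frames:
        kind = frame.get("type")
        if kind == "delta":
            parts.append(frame["text"])
        elif kind == "error":
            raise RuntimeError(f"{frame['code']}: {frame['message']}")
        elif kind == "done":
            finish_reason = frame.get("finishReason")
            break
    return "".join(parts), finish_reason
```

A UI would normally render each `delta` as it arrives instead of collecting them; `collect_reply` gathers the text only to keep the example short.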

Rate limits

Limits stack — the strictest one wins. The numbers below are the production defaults; per-tenant overrides may apply on paid plans.

Scope	Limit
Per-IP (global)	100 req/min
Per-API-key	60 req/min sliding window
Per-IP (chat stream)	20 msg/min
Per-tenant	10 simultaneous streams
Per-tenant daily spend	$2 soft cap (warning header) / $5 hard cap (429)
Per-session	50 messages total over 15 min TTL

A 429 response carries Retry-After (seconds) and the standard error envelope (see below) with code: "rate_limit_exceeded" or code: "plan_limit_exceeded".
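Clients can use the Retry-After value to decide how long to wait before retrying. The helper below is a sketch: the name `retry_after_seconds`, the 1-second fallback, and the 60-second ceiling are choices made for this example, not values defined by the API.

```python
from typing import Optional


def retry_after_seconds(
    status: int, headers: dict, default: float = 1.0, ceiling: float = 60.0
) -> Optional[float]:
    """Return how long to wait before retrying, or None if the status is not 429."""
    if status != 429:
        return None
    # Header names are case-insensitive.
    raw = next(
        (value for name, value in headers.items() if name.lower() == "retry-after"),
        None,
    )
    try:
        delay = float(raw)
    except (TypeError, ValueError):
        # Header missing or not a number of seconds: fall back to the default.
        delay = default
    # Clamp so a bad or very large value cannot stall the client indefinitely.
    return min(max(delay, 0.0), ceiling)
```

A caller would typically `time.sleep()` for the returned number of seconds and then retry, giving up after a small number of attempts.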

You can read current consumption via GET /v1/usage (auth: API key with billing:read scope).

Errors

All error responses share one top-level envelope — fields are not nested under an error object:

{
  "code": "schema_validation_failed", // stable machine-readable code
  "message": "...",                   // human-readable; do not parse
  "http": 400,                        // HTTP status, mirrored for convenience
  "details": { /* code-specific */ }  // optional; e.g. Ajv issues array for validation errors
}

The request id is returned as the x-request-id response header, not in the body — include that header value in support tickets.

Codes are stable across versions; messages are not.
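One way to surface the envelope in client code is to map it onto an exception that keeps the stable `code` and the `x-request-id` header together. The class and function names below are hypothetical.

```python
import json
from typing import Optional


class CortexApiError(Exception):
    """Client-side representation of the error envelope."""

    def __init__(
        self,
        code: str,
        message: str,
        http: int,
        details: Optional[dict] = None,
        request_id: Optional[str] = None,
    ) -> None:
        super().__init__(f"{http} {code}: {message}")
        self.code = code
        self.message = message
        self.http = http
        self.details = details
        self.request_id = request_id


def parse_error(body: bytes, headers: dict) -> CortexApiError:
    """Build a CortexApiError from a JSON error body and the response headers."""
    envelope = json.loads(body)
    # The request id travels in a header, not in the body.
    request_id = next(
        (value for name, value in headers.items() if name.lower() == "x-request-id"),
        None,
    )
    return CortexApiError(
        code=envelope["code"],
        message=envelope["message"],
        http=envelope["http"],
        details=envelope.get("details"),  # optional field
        request_id=request_id,
    )
```

Handlers should branch on `err.code` (for example `"rate_limit_exceeded"`), never on `err.message`, and log `err.request_id` for support tickets.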
