API endpoints

Base URL: https://api.cortexlayer.dev. All POST bodies are JSON; all responses are JSON or, for streaming endpoints, Server-Sent Events.

Authentication

Two credential types — pick by call site:

| Credential | Header | Where it’s safe to use | Endpoints |
| --- | --- | --- | --- |
| API key (ck_live_…) | Authorization: Bearer <key> | Server-side only; never ship to a browser | All admin/CRUD endpoints; /v1/widget/session mint |
| Session token (cs_…) | X-Cortex-Session: <token> | Browser-side, scoped to one agent + one origin, 15 min TTL | /v1/chat/stream (widget path) |

API keys are HMAC-SHA256 hashed at rest with a server-side pepper; the prefix is indexed for fast lookup, the secret is constant-time compared. Session tokens are opaque — they live in Redis and are revocable by deleting the entry.
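
To make the storage scheme above concrete, here is a minimal Python sketch of peppered HMAC-SHA256 hashing with a prefix lookup and a constant-time comparison. It is an illustration, not the service's actual implementation: the pepper value, the 12-character prefix length, the in-memory `store` dict, and the function names are all assumptions.

```python
import hashlib
import hmac

# Hypothetical pepper; a real deployment would load it from a secret store.
PEPPER = b"example-pepper-not-a-real-secret"
PREFIX_LEN = 12  # assumed length of the indexed lookup prefix


def hash_api_key(api_key: str, pepper: bytes = PEPPER) -> str:
    """Return the hex HMAC-SHA256 digest of the key, keyed with the pepper."""
    return hmac.new(pepper, api_key.encode("utf-8"), hashlib.sha256).hexdigest()


def lookup_prefix(api_key: str) -> str:
    """Return the non-secret prefix used to find the stored record."""
    return api_key[:PREFIX_LEN]


def verify_api_key(presented: str, store: dict, pepper: bytes = PEPPER) -> bool:
    """Find the record by prefix, then compare digests in constant time."""
    stored_digest = store.get(lookup_prefix(presented))
    if stored_digest is None:
        return False
    return hmac.compare_digest(hash_api_key(presented, pepper), stored_digest)
```

Only the digest is stored, so a leaked table does not reveal usable keys without the pepper, and `hmac.compare_digest` avoids leaking how many leading characters matched.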

Agents

POST /v1/agents

Create an agent. Auth: API key. Bodies use camelCase and additionalProperties: false is enforced — unknown fields are rejected.

Minimal runnable body (copy/paste into curl):

{
  "name": "Support bot",
  "systemPrompt": "You are a friendly support agent for ACME Inc.",
  "modelPolicy": {
    "provider": "gemini",
    "model": "gemini-2.5-flash"
  },
  "budget": {
    "maxCostUsd": 0.05,
    "maxSteps": 8,
    "wallClockMs": 30000
  },
  "tools": [],
  "allowedDomains": [],
  "allowedOrigins": ["https://your-site.com"]
}

Field reference (omit optional fields entirely — do not send null):

| Field | Type | Required | Constraints |
| --- | --- | --- | --- |
| name | string | yes | 1–128 chars |
| systemPrompt | string | yes | 1–10000 chars |
| modelPolicy.provider | enum | yes | one of "gemini", "openai", "anthropic" |
| modelPolicy.model | string | yes | see Models; must be in tenant’s plan allowlist |
| modelPolicy.temperature | number | no | 0–2 |
| modelPolicy.maxOutputTokens | int | no | 1–16384 |
| fallback.provider / fallback.model | object | no | both fields required together if present |
| knowledgeBaseId | uuid | no | must reference a knowledge base owned by the tenant |
| budget.maxCostUsd | number | yes | 0–100 |
| budget.maxSteps | int | yes | 1–32 |
| budget.wallClockMs | int | yes | 1000–120000 |
| tools | array | yes | ToolDefinition[], max 16; pass [] for none |
| allowedDomains | array | yes | strings ≤253 chars, max 32; outbound HTTP allowlist for tools |
| allowedOrigins | array | no | full origin URLs (scheme://host[:port]), max 32 entries |

Returns the created Agent — its id field is a UUID.
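
As a rough illustration of the call described above, the following Python sketch builds the POST /v1/agents request using only the standard library. The helper name `build_create_agent_request` is hypothetical. Dropping `None` values is a client-side convenience that follows the "omit, do not send null" rule, and it only covers top-level keys.

```python
import json
import urllib.request

BASE_URL = "https://api.cortexlayer.dev"


def build_create_agent_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a create-agent request."""
    # Optional fields must be omitted rather than sent as null, so drop
    # top-level None values before serializing.
    payload = {k: v for k, v in body.items() if v is not None}
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/agents",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

Passing the result to `urllib.request.urlopen` from a server-side process would send it; the JSON response should then carry the new agent's `id`.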

PATCH /v1/agents/:id

Partial update. Same body shape, all fields optional. Pass "fallback": null to clear a previously set fallback (vs. omitting, which leaves it untouched).
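
The difference between omitting a field and sending null is easy to lose in client code. One way to keep it explicit, sketched here in Python with a hypothetical helper and a sentinel value, is to treat "not provided" and `None` as distinct inputs. Only three fields are modeled to keep the example short.

```python
import json

_UNSET = object()  # sentinel meaning "leave this field untouched"


def build_agent_patch(name=_UNSET, system_prompt=_UNSET, fallback=_UNSET) -> str:
    """Serialize a PATCH body containing only the fields that were passed."""
    # Only "fallback" is documented as accepting null (to clear it).
    if name is None or system_prompt is None:
        raise ValueError("only 'fallback' may be set to null")
    fields = {"name": name, "systemPrompt": system_prompt, "fallback": fallback}
    return json.dumps({k: v for k, v in fields.items() if v is not _UNSET})
```

With this shape, `build_agent_patch(fallback=None)` clears the fallback, while `build_agent_patch(name="New name")` leaves it as it was.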

Dashboard playground sessions

POST /v1/agents/:id/playground-session mints a session for dashboard testing only. It does not require an Origin header and is authorized by the agents:write scope instead. It is not meant for public widget embeds; use /v1/widget/session for all production widget integrations.

Widget sessions

POST /v1/widget/session

Mint a short-lived browser-safe token bound to one agent + the requesting origin. Auth: API key.

{ "agentId": "<agent-uuid>" }

The Origin header is required. The server reads it and validates against the agent’s allowedOrigins. Requests without it are rejected:

{
  "code": "origin_not_allowed",
  "message": "request is missing the Origin header; widget sessions require a browser origin",
  "http": 403
}

Returns 201 Created:

{
  "sessionToken": "cs_...",
  "expiresAt": "2026-04-21T15:30:00Z",
  "messageCap": 50
}

The token’s TTL is 15 minutes; messageCap is the total number of messages allowed on this session before a fresh mint is required.

Server-side proxy integration: If you proxy this call from your own backend (Express, Fastify, Django, Go, etc.), forward the inbound Origin header verbatim — don’t strip it or replace it with your server’s origin. The allowlist check is strict and rejects mismatches. See the quickstart for language-specific proxy examples.
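
The proxy rule above can be sketched in framework-neutral Python. The function below is a hypothetical helper, not part of any SDK: it takes the headers of the browser's request to your backend as a plain dict and builds the upstream mint request with the same Origin value.

```python
import json
import urllib.request

BASE_URL = "https://api.cortexlayer.dev"


def build_session_mint_request(
    api_key: str, agent_id: str, inbound_headers: dict
) -> urllib.request.Request:
    """Build the upstream /v1/widget/session call, forwarding Origin verbatim."""
    # Header names are case-insensitive, so match "origin" in any casing.
    origin = next(
        (v for k, v in inbound_headers.items() if k.lower() == "origin"), None
    )
    if not origin:
        # Fail early instead of sending a request the API would reject.
        raise ValueError("inbound request has no Origin header to forward")
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/widget/session",
        data=json.dumps({"agentId": agent_id}).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "Origin": origin,  # the browser's origin, not the proxy's
        },
    )
```

The API key stays on the server; only the returned session token should be handed back to the browser.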

Chat

POST /v1/chat/stream

Streaming chat (Server-Sent Events). Auth: session token (widget path, header X-Cortex-Session) or API key (server path, header Authorization: Bearer). The sessionToken is also required in the body for widget calls.

Minimal runnable body:

{
  "agentId": "00000000-0000-0000-0000-000000000000",
  "sessionToken": "cs_...",
  "messages": [{ "role": "user", "content": "Hi" }]
}

Field reference:

| Field | Type | Required | Constraints |
| --- | --- | --- | --- |
| agentId | uuid | yes | |
| sessionToken | string | yes | 16–256 chars; same value as the X-Cortex-Session header |
| messages | array | yes | 1–64 items, each { role, content } |
| conversationId | uuid | no | server creates one if omitted |
| model | string | no | per-call override of modelPolicy.model |
| provider | enum | no | per-call override of modelPolicy.provider |
| temperature | number | no | 0–2 |
| maxOutputTokens | int | no | 1–16384 |
| requestId | string | no | 8–128 chars; for client-side correlation |

The response is always text/event-stream — there is no non-streaming mode. Frame types:

| type | Payload | Notes |
| --- | --- | --- |
| start | requestId, runId, provider, model, conversationId | First frame. |
| delta | text | Append to the current assistant bubble. |
| tool_call | name, args | Tool runtime is about to execute. |
| tool_result | name, output | Result of the preceding tool_call. |
| usage | inputTokens, outputTokens, costUsd | Emitted near end-of-run for cost reporting. |
| error | code, message | Recoverable; the run is over. |
| done | finishReason | Last frame. |

Errors that prevent the stream from starting are returned as JSON with the standard envelope.
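
A client has to turn the event stream into frames and fold those frames into a reply. The Python sketch below shows one way to do that. This page does not spell out the wire format, so the sketch assumes each frame arrives as one SSE event whose `data:` field is a JSON object carrying a `type` key next to the payload fields listed above. The function names are hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_frames(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON frame per SSE event (assumed wire format)."""
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield json.loads("\n".join(data_parts))
                data_parts = []
        elif line.startswith("data:"):
            value = line[5:]
            # SSE drops a single optional space after the colon.
            data_parts.append(value[1:] if value.startswith(" ") else value)
        # Other SSE fields and comment lines are ignored in this sketch.


def collect_reply(frames: Iterable[dict]) -> dict:
    """Fold frames into the assistant text plus usage, error, and finish data."""
    chunks = []
    result = {"text": "", "usage": None, "error": None, "finishReason": None}
    for frame in frames:
        kind = frame.get("type")
        if kind == "delta":
            chunks.append(frame["text"])
        elif kind == "usage":
            result["usage"] = {k: v for k, v in frame.items() if k != "type"}
        elif kind == "error":
            result["error"] = {"code": frame["code"], "message": frame["message"]}
        elif kind == "done":
            result["finishReason"] = frame.get("finishReason")
            break  # done is documented as the last frame
    result["text"] = "".join(chunks)
    return result
```

In a real widget the `delta` text would be appended to the UI as it arrives rather than collected at the end; the fold is shown here because it is easy to check.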

Rate limits

Limits stack — the strictest one wins. The numbers below are the production defaults; per-tenant overrides may apply on paid plans.

| Scope | Limit |
| --- | --- |
| Per-IP (global) | 100 req/min |
| Per-API-key | 60 req/min sliding window |
| Per-IP (chat stream) | 20 msg/min |
| Per-tenant | 10 simultaneous streams |
| Per-tenant/day | $2 soft (warn header) / $5 hard (429) |
| Per-session | 50 messages total over 15 min TTL |

A 429 response carries Retry-After (seconds) and the standard error envelope (see below) with code: "rate_limit_exceeded" or code: "plan_limit_exceeded".
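
A client can use the Retry-After header to decide how long to pause. The Python sketch below is one possible policy, not a documented requirement: the helper name is hypothetical, and the exponential fallback with a 60-second cap for a missing or unparsable header is an assumption.

```python
from typing import Optional


def retry_delay_seconds(
    status: int, headers: dict, attempt: int, cap: float = 60.0
) -> Optional[float]:
    """Return seconds to wait before retrying, or None if this is not a 429."""
    if status != 429:
        return None
    # Header names are case-insensitive.
    retry_after = next(
        (v for k, v in headers.items() if k.lower() == "retry-after"), None
    )
    try:
        delay = float(retry_after)
    except (TypeError, ValueError):
        # Assumed fallback: exponential backoff when the header is unusable.
        delay = min(cap, 2.0 ** attempt)
    return max(0.0, delay)
```

A retry loop would sleep for the returned value and give up after a fixed number of attempts.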

You can read current consumption via GET /v1/usage (auth: API key with billing:read scope).

Errors

All error responses share one top-level envelope — fields are not nested under an error object:

{
  "code": "schema_validation_failed",   // stable machine-readable code
  "message": "...",                      // human-readable; do not parse
  "http": 400,                           // HTTP status, mirrored for convenience
  "details": { /* code-specific */ }     // optional; e.g. Ajv issues array for validation errors
}

The request id is returned as the x-request-id response header, not in the body — include that header value in support tickets.

Codes are stable across versions; messages are not.
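
Because the envelope is flat and the codes are stable, client code can branch on `code` and keep the request id for support. A minimal Python sketch follows; the `CortexApiError` class and `parse_error` function are hypothetical names for illustration.

```python
import json


class CortexApiError(Exception):
    """Hypothetical exception type mirroring the documented error envelope."""

    def __init__(self, code, message, http, details=None, request_id=None):
        super().__init__(f"{code} (HTTP {http}): {message}")
        self.code = code              # stable; safe to branch on
        self.message = message        # human-readable; not for parsing
        self.http = http
        self.details = details        # optional, code-specific
        self.request_id = request_id  # taken from the x-request-id header


def parse_error(body: bytes, headers: dict) -> CortexApiError:
    """Build an exception from an error response body and its headers."""
    envelope = json.loads(body)
    request_id = next(
        (v for k, v in headers.items() if k.lower() == "x-request-id"), None
    )
    return CortexApiError(
        code=envelope["code"],
        message=envelope["message"],
        http=envelope["http"],
        details=envelope.get("details"),
        request_id=request_id,
    )
```

Callers can then write checks such as `err.code == "rate_limit_exceeded"` instead of matching on message text.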