Error Handling

Understand API error responses and implement robust error handling.

When a non-streaming API request fails, Persly returns a JSON error response. Most runtime and API errors use the error envelope below. FastAPI schema/type validation failures use the standard 422 detail format, while endpoint-level semantic validation errors use the error envelope (typically 400 invalid_request_error).

Error Response Format

{
  "error": {
    "message": "Human-readable error description",
    "type": "invalid_request_error",
    "code": "model_not_found",
    "param": "model"
  }
}

Field	Type	Description
`message`	string	Human-readable explanation of what went wrong
`type`	string	Error category (see table below)
`code`	string	Machine-readable error code
`param`	string | null	The parameter that caused the error, if applicable
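
As an illustration of the envelope fields above, the following is a minimal sketch of reading them into a typed object. The names `ApiError` and `parse_error_envelope` are hypothetical helpers introduced for this example, not part of any Persly SDK.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ApiError:
    """Hypothetical container mirroring the four envelope fields."""
    message: str
    type: str
    code: str
    param: Optional[str] = None


def parse_error_envelope(payload: Any) -> Optional[ApiError]:
    """Return an ApiError if payload has an "error" object, else None."""
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        return None  # e.g. a 422 body, which uses "detail" instead
    return ApiError(
        message=str(error.get("message", "")),
        type=str(error.get("type", "unknown")),
        code=str(error.get("code", "unknown")),
        param=error.get("param"),  # may be None when no parameter applies
    )
```

Returning None for bodies without an `error` object lets the caller fall through to the 422 `detail[]` handling instead of raising on an unexpected shape.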

422 Schema Validation Error Format

FastAPI-level schema/type validation failures (for example missing required fields, wrong types, or query/path constraints enforced by FastAPI) return 422 with this shape. Endpoint-level semantic checks return 400 with the error envelope instead.

{
  "detail": [
    {
      "type": "value_error",
      "loc": ["body"],
      "msg": "Value error, include_domains and exclude_domains are mutually exclusive",
      "input": {
        "...": "..."
      }
    }
  ]
}

detail[] items always include loc, msg, and type, and may additionally include fields like input and ctx.
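
Because a 422 body can carry several `detail[]` items, it may help to flatten them into one readable line each. The sketch below assumes only the `loc`, `msg`, and `type` fields described above; `format_validation_errors` is a hypothetical helper name.

```python
from typing import Any, List


def format_validation_errors(payload: Any) -> List[str]:
    """Turn a 422 body into one "location: message [type]" line per item."""
    detail = payload.get("detail") if isinstance(payload, dict) else None
    if not isinstance(detail, list):
        return []  # not a 422-style body
    lines = []
    for item in detail:
        if not isinstance(item, dict):
            continue
        # loc is a path such as ["body", "messages", 0]; join it with dots
        loc = ".".join(str(part) for part in item.get("loc", []))
        msg = item.get("msg", "Invalid value")
        kind = item.get("type", "unknown")
        lines.append(f"{loc or '<request>'}: {msg} [{kind}]")
    return lines
```

Optional fields such as `input` and `ctx` are ignored here so the output stays short.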

Streaming Error Events (SSE)

For POST /v1/chat/completions with stream: true, failures that occur after streaming starts (for example agent runtime or stream post-processing failures) are sent in-band as SSE events.

Errors detected before streaming starts (authentication, credit balance, request validation, model/parameter checks) still return regular non-2xx JSON error responses.

When an error is delivered in-band:

The HTTP response stays 200 with Content-Type: text/event-stream
The error payload arrives as a data: chunk with type: "error"
The stream then terminates with [DONE] without emitting further events
The error chunk may arrive after partial content chunks; clients should treat it as terminal for that stream

data: {"type":"error","error":{"type":"server_error","code":"internal_error","message":"AI processing failed"}}

data: [DONE]
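
A client could detect the in-band error roughly as follows. This is a sketch under stated assumptions: `StreamError` and `iter_stream_chunks` are hypothetical names, and since the shape of non-error chunks is not described in this section, they are yielded unchanged.

```python
import json
from typing import Any, Dict, Iterable, Iterator


class StreamError(Exception):
    """Hypothetical exception raised for an in-band SSE error chunk."""

    def __init__(self, error: Dict[str, Any]):
        super().__init__(error.get("message", "stream failed"))
        self.code = error.get("code", "unknown")
        self.type = error.get("type", "unknown")


def iter_stream_chunks(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield parsed data: chunks until [DONE]; raise on a type "error" chunk."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data lines
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return  # normal end of stream
        chunk = json.loads(data)
        if isinstance(chunk, dict) and chunk.get("type") == "error":
            # Terminal for this stream, even if content chunks came first
            raise StreamError(chunk.get("error") or {})
        yield chunk
```

With `requests`, the input lines could come from `response.iter_lines(decode_unicode=True)` on a request made with `stream=True`. Any partial content received before the exception should be treated as incomplete.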

Error Codes

400 — Bad Request

Code	Type	Description
`model_not_found`	`invalid_request_error`	The specified model does not exist
`missing_required_field`	`invalid_request_error`	A required value is empty (for example, an empty `messages` array or empty `query`)
`invalid_request`	`invalid_request_error`	The request contains semantically invalid values. FastAPI/Pydantic schema validation failures return `422 detail[]`, while malformed JSON in manually parsed endpoints returns `400` with `error.code = "invalid_request"`
`content_too_long`	`invalid_request_error`	A chat message exceeds the max content length (32,000 chars)
`too_many_messages`	`invalid_request_error`	Stateless chat request exceeds 200 messages
`batch_size_exceeded`	`invalid_request_error`	Embedding batch size exceeds 100 items
`documents_limit_exceeded`	`invalid_request_error`	Rerank document count exceeds 1,000

422 — Validation Error

Code	Type	Description
—	`detail[].type`	FastAPI schema/type validation failed (body, query, or path). `422` responses use `detail[]` and do not include `error.code`; endpoint-level semantic checks use `400` + `error.code`

401 — Unauthorized

Code	Type	Description
`authentication_error`	`authentication_error`	The API key is missing, invalid, or revoked

403 — Forbidden

Code	Type	Description
`permission_denied`	`permission_error`	You do not have permission to perform this action

402 — Payment Required

Code	Type	Description
`insufficient_credits`	`insufficient_credits`	Credit balance is too low to cover the request cost. Purchase more credits at platform.persly.ai

insufficient_credits is a balance check that applies only to the inference endpoints (/v1/chat/completions, /v1/embeddings, /v1/rerank, /v1/finder). It is not a per-second burst throttle.

500 — Server Error

Code	Type	Description
`internal_error`	`server_error`	An unexpected error occurred on our servers

500 internal_error is generated by the global exception handler for unexpected runtime failures. It may occur even when an endpoint's OpenAPI response list only declares expected application-level errors.

503 — Service Unavailable

Code	Type	Description
`service_unavailable`	`server_error`	A dependent service is temporarily unavailable (for example, Domains catalog source)

Error Handling Examples

Handle 422 separately because it uses detail[] (not the error envelope). For other non-2xx responses, parse error.code / error.message. For streaming (stream: true), handle in-band SSE type: "error" events separately.

import requests

response = requests.post(
    "https://api.persly.ai/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "model": "persly-chat-v1",
        "messages": [{"role": "user", "text": "Hello"}],
    },
)

if response.status_code != 200:
    payload = response.json()
    if response.status_code == 422:
        detail = payload.get("detail")
        first = detail[0] if isinstance(detail, list) and detail else {}
        print(
            f"Validation error [{first.get('type', 'value_error')}]: "
            f"{first.get('msg', 'Invalid request body')}"
        )
    else:
        error = payload.get("error", {})
        code = error.get("code", "unknown")
        message = error.get("message", str(payload))
        print(f"Error [{code}]: {message}")
else:
    data = response.json()
    print(data["message"])

const response = await fetch("https://api.persly.ai/v1/chat/completions", {
  method: "POST",
  headers: {
    Authorization: "Bearer YOUR_API_KEY",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "persly-chat-v1",
    messages: [{ role: "user", text: "Hello" }],
  }),
});

if (!response.ok) {
  const payload = await response.json();
  if (response.status === 422) {
    const first = payload.detail?.[0];
    console.error(
      `Validation error [${first?.type ?? "value_error"}]: ${first?.msg ?? "Invalid request body"}`
    );
  } else {
    const error = payload?.error;
    console.error(
      `Error [${error?.code ?? "unknown"}]: ${error?.message ?? JSON.stringify(payload)}`
    );
  }
} else {
  const data = await response.json();
  console.log(data.message);
}

Retry Strategy

For transient errors (5xx), implement exponential backoff:

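import time
import requests

def call_api(payload, max_retries=3):
    """POST to chat completions, retrying 5xx responses with exponential backoff."""
    for attempt in range(max_retries):
        response = requests.post(
            "https://api.persly.ai/v1/chat/completions",
            headers={"Authorization": "Bearer YOUR_API_KEY"},
            json=payload,
        )
        # Below 500 is either a success or a client error that a retry will not fix
        if response.status_code < 500:
            return response

        # Back off before the next attempt, but do not sleep after the final one
        if attempt < max_retries - 1:
            time.sleep(2 ** attempt)  # 1s, then 2s with the default max_retries

    return response  # Last 5xx response after all retries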

Error Type	Retry?	Action
400 Bad Request	No	Fix the request parameters
401 Unauthorized	No	Check your API key
422 Validation Error	No	Fix request schema/field values (body or query parameters)
403 Forbidden	No	Contact support
402 Insufficient Credits	No	Purchase more credits at platform.persly.ai
500 Server Error	Yes	Retry with exponential backoff
503 Service Unavailable	Yes	Retry with exponential backoff
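
The table above can be encoded as a small lookup so that retry decisions live in one place. This is a sketch, not an official client: `recommended_action` is a hypothetical helper, and its handling of statuses that are not in the table is an assumption.

```python
from typing import Tuple

# Statuses and actions copied from the retry table above
_ACTIONS = {
    400: "Fix the request parameters",
    401: "Check your API key",
    402: "Purchase more credits at platform.persly.ai",
    403: "Contact support",
    422: "Fix request schema/field values (body or query parameters)",
    500: "Retry with exponential backoff",
    503: "Retry with exponential backoff",
}


def recommended_action(status_code: int) -> Tuple[bool, str]:
    """Return (should_retry, suggested_action) for an HTTP status code."""
    if status_code in _ACTIONS:
        return status_code >= 500, _ACTIONS[status_code]
    if 500 <= status_code < 600:
        # Assumption: unlisted 5xx statuses are treated as transient too
        return True, "Retry with exponential backoff"
    # Assumption: any other unlisted status is not retried
    return False, "Inspect the response body"
```

For example, `recommended_action(402)` reports that the request should not be retried and points to purchasing credits.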
