Persly API

Error Handling

Understand API error responses and implement robust error handling.

When a non-streaming API request fails, Persly returns a JSON error response. Most runtime and API errors use the error envelope below. FastAPI schema/type validation failures use the standard 422 detail format, while endpoint-level semantic validation errors use the error envelope (typically 400 invalid_request_error).

Error Response Format

{
  "error": {
    "message": "Human-readable error description",
    "type": "invalid_request_error",
    "code": "model_not_found",
    "param": "model"
  }
}

| Field | Type | Description |
| --- | --- | --- |
| message | string | Human-readable explanation of what went wrong |
| type | string | Error category (see table below) |
| code | string | Machine-readable error code |
| param | string or null | The parameter that caused the error, if applicable |
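
As an illustration, the envelope can be parsed into a small typed structure so that calling code does not have to index nested dictionaries. This is a minimal sketch: the `ApiError` class and the `parse_error_envelope` helper are hypothetical names introduced here, not part of any Persly SDK, and the fallback value "unknown" is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class ApiError:
    """Hypothetical container mirroring the four fields of the error envelope."""
    message: str
    type: str
    code: str
    param: Optional[str] = None


def parse_error_envelope(payload: Any) -> Optional[ApiError]:
    """Return an ApiError if payload looks like the error envelope, else None."""
    if not isinstance(payload, dict):
        return None
    error = payload.get("error")
    if not isinstance(error, dict):
        return None  # e.g. a 422 body, which uses "detail" instead of "error"
    return ApiError(
        message=str(error.get("message", "")),
        type=str(error.get("type", "unknown")),
        code=str(error.get("code", "unknown")),
        param=error.get("param"),  # None when no parameter applies
    )
```

Returning None for anything that is not an envelope lets the caller fall through to a separate handler for other body shapes.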

422 Schema Validation Error Format

FastAPI-level schema/type validation failures (for example missing required fields, wrong types, or query/path constraints enforced by FastAPI) return 422 with this shape. Endpoint-level semantic checks return 400 with the error envelope instead.

{
  "detail": [
    {
      "type": "value_error",
      "loc": ["body"],
      "msg": "Value error, include_domains and exclude_domains are mutually exclusive",
      "input": {
        "...": "..."
      }
    }
  ]
}

detail[] items always include loc, msg, and type, and may additionally include fields like input and ctx.
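
Because detail[] can hold several items, it may help to flatten the whole list into readable lines rather than reporting only the first entry. The sketch below assumes nothing beyond the loc, msg, and type fields described above; `format_validation_errors` is a hypothetical helper name and the "<root>" placeholder is an arbitrary choice.

```python
from typing import Any, List


def format_validation_errors(payload: Any) -> List[str]:
    """Turn a 422 body into one readable line per detail[] item."""
    detail = payload.get("detail") if isinstance(payload, dict) else None
    if not isinstance(detail, list):
        return []
    lines = []
    for item in detail:
        if not isinstance(item, dict):
            continue
        # loc is a path such as ["body", "messages", 0]; join it with dots.
        loc = ".".join(str(part) for part in item.get("loc", []))
        lines.append(f"{loc or '<root>'}: {item.get('msg', '')} ({item.get('type', '')})")
    return lines
```

An empty list is returned for bodies without a detail array, so the caller can tell "no validation details" apart from a parsed result.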

Streaming Error Events (SSE)

For POST /v1/chat/completions with stream: true, failures that occur after streaming starts (for example agent runtime or stream post-processing failures) are sent in-band as SSE events.

Errors detected before streaming starts (authentication, credit balance, request validation, model/parameter checks) still return regular non-2xx JSON error responses.

When an error is sent in-band:

  • HTTP response stays 200 with Content-Type: text/event-stream
  • Error payload arrives as a data: chunk with type: "error"
  • Stream then terminates with [DONE] without emitting further events
  • The error chunk may arrive after partial content chunks; clients should treat it as terminal for that stream

data: {"type":"error","error":{"type":"server_error","code":"internal_error","message":"AI processing failed"}}

data: [DONE]
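
One way to consume such a stream is to decode each data: line and stop as soon as an error chunk appears. The following is a sketch under stated assumptions, not a reference client: `StreamError` and `iter_sse_chunks` are hypothetical names, and the function takes already-decoded text lines so it does not depend on any particular HTTP library.

```python
import json
from typing import Any, Dict, Iterable, Iterator


class StreamError(Exception):
    """Hypothetical exception raised when an in-band error event arrives."""

    def __init__(self, error: Dict[str, Any]):
        super().__init__(error.get("message", "stream error"))
        self.code = error.get("code", "unknown")
        self.type = error.get("type", "unknown")


def iter_sse_chunks(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield decoded data: chunks; raise StreamError on a type "error" chunk."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return  # normal end of stream
        chunk = json.loads(data)
        if isinstance(chunk, dict) and chunk.get("type") == "error":
            # Terminal for this stream, even if content chunks were already yielded.
            raise StreamError(chunk.get("error") or {})
        yield chunk
```

With the requests library, the lines could come from `response.iter_lines(decode_unicode=True)` on a request made with `stream=True`. Any content chunks yielded before the exception should be treated as a partial result.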

Error Codes

400 — Bad Request

| Code | Type | Description |
| --- | --- | --- |
| model_not_found | invalid_request_error | The specified model does not exist |
| missing_required_field | invalid_request_error | A required value is empty (for example, an empty messages array or empty query) |
| invalid_request | invalid_request_error | The request contains semantically invalid values. FastAPI/Pydantic schema validation failures return 422 detail[], while malformed JSON in manually parsed endpoints returns 400 with error.code = "invalid_request" |
| content_too_long | invalid_request_error | A chat message exceeds the max content length (32,000 chars) |
| too_many_messages | invalid_request_error | Stateless chat request exceeds 200 messages |
| batch_size_exceeded | invalid_request_error | Embedding batch size exceeds 100 items |
| documents_limit_exceeded | invalid_request_error | Rerank document count exceeds 1,000 |

422 — Validation Error

| Code | Type | Description |
| --- | --- | --- |
| detail[].type | - | FastAPI schema/type validation failed (body, query, or path). 422 responses use detail[] and do not include error.code; endpoint-level semantic checks use 400 + error.code |

401 — Unauthorized

| Code | Type | Description |
| --- | --- | --- |
| authentication_error | authentication_error | The API key is missing, invalid, or revoked |

403 — Forbidden

| Code | Type | Description |
| --- | --- | --- |
| permission_denied | permission_error | You do not have permission to perform this action |

402 — Payment Required

| Code | Type | Description |
| --- | --- | --- |
| insufficient_credits | insufficient_credits | Credit balance is too low to cover the request cost. Purchase more credits at platform.persly.ai |

insufficient_credits is a balance check for inference endpoints only (/v1/chat/completions, /v1/embeddings, /v1/rerank, /v1/finder), not per-second burst-throttling.

500 — Server Error

| Code | Type | Description |
| --- | --- | --- |
| internal_error | server_error | An unexpected error occurred on our servers |

500 internal_error is generated by the global exception handler for unexpected runtime failures. It may occur even when an endpoint's OpenAPI response list only declares expected application-level errors.

503 — Service Unavailable

| Code | Type | Description |
| --- | --- | --- |
| service_unavailable | server_error | A dependent service is temporarily unavailable (for example, Domains catalog source) |

Error Handling Examples

Handle 422 separately because it uses detail[] (not the error envelope). For other non-2xx responses, parse error.code / error.message. For streaming (stream: true), handle in-band SSE type: "error" events separately.

import requests

response = requests.post(
    "https://api.persly.ai/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "model": "persly-chat-v1",
        "messages": [{"role": "user", "text": "Hello"}],
    },
)

if response.status_code != 200:
    payload = response.json()
    if response.status_code == 422:
        # Schema validation failure: the body is {"detail": [...]}, not the error envelope.
        detail = payload.get("detail")
        first = detail[0] if isinstance(detail, list) and detail else {}
        print(
            f"Validation error [{first.get('type', 'value_error')}]: "
            f"{first.get('msg', 'Invalid request body')}"
        )
    else:
        # Other non-2xx responses are expected to use the error envelope: {"error": {...}}.
        error = payload.get("error", {})
        code = error.get("code", "unknown")
        message = error.get("message", str(payload))
        print(f"Error [{code}]: {message}")
else:
    data = response.json()
    print(data["message"])

const response = await fetch("https://api.persly.ai/v1/chat/completions", {
  method: "POST",
  headers: {
    Authorization: "Bearer YOUR_API_KEY",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "persly-chat-v1",
    messages: [{ role: "user", text: "Hello" }],
  }),
});

if (!response.ok) {
  const payload = await response.json();
  if (response.status === 422) {
    const first = payload.detail?.[0];
    console.error(
      `Validation error [${first?.type ?? "value_error"}]: ${first?.msg ?? "Invalid request body"}`
    );
  } else {
    const error = payload?.error;
    console.error(
      `Error [${error?.code ?? "unknown"}]: ${error?.message ?? JSON.stringify(payload)}`
    );
  }
} else {
  const data = await response.json();
  console.log(data.message);
}

Retry Strategy

For transient errors (5xx), implement exponential backoff:

import time
import requests

def call_api(payload, max_retries=3):
    for attempt in range(max_retries):
        response = requests.post(
            "https://api.persly.ai/v1/chat/completions",
            headers={"Authorization": "Bearer YOUR_API_KEY"},
            json=payload,
        )
        # Only 5xx responses are treated as transient; return anything else immediately.
        if response.status_code < 500:
            return response

        # Back off before the next attempt (1s, then 2s); skip the sleep after the last one.
        if attempt < max_retries - 1:
            time.sleep(2 ** attempt)

    return response  # Return last response after all retries

| Status | Retry? | Action |
| --- | --- | --- |
| 400 Bad Request | No | Fix the request parameters |
| 401 Unauthorized | No | Check your API key |
| 422 Validation Error | No | Fix request schema/field values (body or query parameters) |
| 403 Forbidden | No | Contact support |
| 402 Insufficient Credits | No | Purchase more credits at platform.persly.ai |
| 500 Server Error | Yes | Retry with exponential backoff |
| 503 Service Unavailable | Yes | Retry with exponential backoff |
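
The retry decisions in the table above can be encoded as a small helper, which keeps the policy in one place. This is a sketch with hypothetical names (`RETRYABLE_STATUSES`, `should_retry`, `backoff_delays`). It treats only the two statuses listed in the table as retryable, which is narrower than "any 5xx", and the optional jitter parameter is an addition not described in this document (off by default).

```python
import random
from typing import Iterator

# Status codes the table above marks as retryable (assumption: no other 5xx codes).
RETRYABLE_STATUSES = frozenset({500, 503})


def should_retry(status_code: int) -> bool:
    """Return True only for the statuses listed as retryable."""
    return status_code in RETRYABLE_STATUSES


def backoff_delays(max_retries: int = 3, base: float = 1.0, jitter: float = 0.0) -> Iterator[float]:
    """Yield base * 2**attempt seconds per retry, plus optional random jitter."""
    for attempt in range(max_retries):
        yield base * (2 ** attempt) + random.uniform(0.0, jitter)
```

A caller might loop over `backoff_delays()` and sleep for each value only while `should_retry(response.status_code)` is true. A small nonzero jitter spreads out retries from many clients that fail at the same moment.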
