Error Handling
Understand API error responses and implement robust error handling.
When a non-streaming API request fails, Persly returns JSON error responses. Most runtime/API errors use the error envelope below. FastAPI schema/type validation failures use the standard 422 detail format, while endpoint-level semantic validation errors use the error envelope (typically 400 invalid_request_error).
Error Response Format
```json
{
  "error": {
    "message": "Human-readable error description",
    "type": "invalid_request_error",
    "code": "model_not_found",
    "param": "model"
  }
}
```

| Field | Type | Description |
|---|---|---|
| message | string | Human-readable explanation of what went wrong |
| type | string | Error category (see table below) |
| code | string | Machine-readable error code |
| param | string \| null | The parameter that caused the error, if applicable |
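As a minimal sketch of how a client might read the envelope described above, the helper below copies the four documented fields into a small dataclass. The names `ApiError` and `parse_error_envelope` are hypothetical and not part of any Persly SDK; the fallback values for missing fields are an assumption.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ApiError:
    """Hypothetical container for the documented error envelope fields."""
    message: str
    type: str
    code: str
    param: Optional[str] = None


def parse_error_envelope(payload: Any) -> Optional[ApiError]:
    """Return an ApiError if payload looks like {"error": {...}}, else None."""
    if not isinstance(payload, dict):
        return None
    error = payload.get("error")
    if not isinstance(error, dict):
        return None  # e.g. a 422 body, which uses "detail" instead
    return ApiError(
        message=str(error.get("message", "")),
        type=str(error.get("type", "unknown")),    # assumed fallback value
        code=str(error.get("code", "unknown")),    # assumed fallback value
        param=error.get("param"),
    )
```

Returning `None` for bodies without an `error` object lets the caller fall through to other handling, such as the 422 format.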
422 Schema Validation Error Format
FastAPI-level schema/type validation failures (for example missing required fields, wrong types, or query/path constraints enforced by FastAPI) return 422 with this shape. Endpoint-level semantic checks return 400 with the error envelope instead.
```json
{
  "detail": [
    {
      "type": "value_error",
      "loc": ["body"],
      "msg": "Value error, include_domains and exclude_domains are mutually exclusive",
      "input": {
        "...": "..."
      }
    }
  ]
}
```

detail[] items always include loc, msg, and type, and may additionally include fields like input and ctx.
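Because a 422 body can carry several detail[] items, it may help to flatten them into one line per problem. The sketch below is one possible approach; `summarize_validation_errors` is a hypothetical helper name, and it relies only on the loc, msg, and type fields that are documented as always present.

```python
from typing import Any, List


def summarize_validation_errors(payload: Any) -> List[str]:
    """Turn a 422 body into lines of the form '<loc path>: <msg> [<type>]'."""
    detail = payload.get("detail") if isinstance(payload, dict) else None
    if not isinstance(detail, list):
        return []  # not a 422-style body
    lines = []
    for item in detail:
        if not isinstance(item, dict):
            continue
        # loc is a path into the request; join its parts with dots.
        location = ".".join(str(part) for part in item.get("loc", []))
        lines.append(f"{location}: {item.get('msg', '')} [{item.get('type', '')}]")
    return lines
```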
Streaming Error Events (SSE)
For POST /v1/chat/completions with stream: true, failures that occur after streaming starts (for example agent runtime or stream post-processing failures) are sent in-band as SSE events.
Errors detected before streaming starts (authentication, credit balance, request validation, model/parameter checks) still return regular non-2xx JSON error responses.
- HTTP response stays 200 with Content-Type: text/event-stream
- Error payload arrives as a data: chunk with type: "error"
- Stream then terminates with [DONE] without emitting further events
- The error chunk may arrive after partial content chunks; clients should treat it as terminal for that stream
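The behaviour listed above could be handled on the client roughly as follows. This is a sketch under stated assumptions: `StreamError` and `iter_sse_chunks` are hypothetical names, every payload other than [DONE] is assumed to be a JSON document, and the shape of non-error chunks is not assumed at all (they are passed through unchanged).

```python
import json
from typing import Iterable, Iterator, Union


class StreamError(Exception):
    """Hypothetical exception raised for an in-band SSE error event."""

    def __init__(self, error: dict):
        super().__init__(f"[{error.get('code', 'unknown')}] {error.get('message', '')}")
        self.error = error


def iter_sse_chunks(lines: Iterable[Union[str, bytes]]) -> Iterator[dict]:
    """Yield decoded data: chunks; raise StreamError on a type == "error" chunk."""
    for raw in lines:
        if isinstance(raw, bytes):
            raw = raw.decode("utf-8")
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data lines
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return  # normal end of stream
        chunk = json.loads(data)
        if isinstance(chunk, dict) and chunk.get("type") == "error":
            # Terminal for this stream, even after partial content chunks.
            raise StreamError(chunk.get("error", {}))
        yield chunk
```

With the requests library, the lines could come from a call made with `stream=True`, for example `response.iter_lines()`. Any content already received before the error should be treated as incomplete.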
```
data: {"type":"error","error":{"type":"server_error","code":"internal_error","message":"AI processing failed"}}
data: [DONE]
```

Error Codes
400 — Bad Request
| Code | Type | Description |
|---|---|---|
| model_not_found | invalid_request_error | The specified model does not exist |
| missing_required_field | invalid_request_error | A required value is empty (for example, an empty messages array or empty query) |
| invalid_request | invalid_request_error | The request contains semantically invalid values. FastAPI/Pydantic schema validation failures return 422 detail[], while malformed JSON in manually parsed endpoints returns 400 with error.code = "invalid_request" |
| content_too_long | invalid_request_error | A chat message exceeds the max content length (32,000 chars) |
| too_many_messages | invalid_request_error | Stateless chat request exceeds 200 messages |
| batch_size_exceeded | invalid_request_error | Embedding batch size exceeds 100 items |
| documents_limit_exceeded | invalid_request_error | Rerank document count exceeds 1,000 |
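Some of the limits in the table above can be checked on the client before sending a request, which avoids a round trip for obviously oversized input. The sketch below covers the chat limits only. `precheck_chat_request` is a hypothetical helper, and the message field holding the text (`text` here, matching the examples on this page) as well as the way the server counts characters are assumptions.

```python
from typing import List, Optional

# Limits taken from the 400 error table.
MAX_CONTENT_CHARS = 32_000
MAX_MESSAGES = 200


def precheck_chat_request(messages: List[dict], content_key: str = "text") -> Optional[str]:
    """Return the error code the request would likely trigger, or None if it passes."""
    if not messages:
        return "missing_required_field"
    if len(messages) > MAX_MESSAGES:
        return "too_many_messages"
    for message in messages:
        # Assumes the limit applies to Python's character count of the field.
        if len(str(message.get(content_key, ""))) > MAX_CONTENT_CHARS:
            return "content_too_long"
    return None
```

A passing pre-check does not guarantee acceptance; the server remains the source of truth, so the regular error handling is still needed.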
422 — Validation Error
| Code | Type | Description |
|---|---|---|
| — | detail[].type | FastAPI schema/type validation failed (body, query, or path). 422 responses use detail[] and do not include error.code; endpoint-level semantic checks use 400 + error.code |
401 — Unauthorized
| Code | Type | Description |
|---|---|---|
| authentication_error | authentication_error | The API key is missing, invalid, or revoked |
403 — Forbidden
| Code | Type | Description |
|---|---|---|
| permission_denied | permission_error | You do not have permission to perform this action |
402 — Payment Required
| Code | Type | Description |
|---|---|---|
| insufficient_credits | insufficient_credits | Credit balance is too low to cover the request cost. Purchase more credits at platform.persly.ai |
insufficient_credits is a balance check for inference endpoints only (/v1/chat/completions, /v1/embeddings, /v1/rerank, /v1/finder), not per-second burst-throttling.
500 — Server Error
| Code | Type | Description |
|---|---|---|
| internal_error | server_error | An unexpected error occurred on our servers |
500 internal_error is generated by the global exception handler for unexpected runtime failures.
It may occur even when an endpoint's OpenAPI response list only declares expected application-level errors.
503 — Service Unavailable
| Code | Type | Description |
|---|---|---|
| service_unavailable | server_error | A dependent service is temporarily unavailable (for example, Domains catalog source) |
Error Handling Examples
Handle 422 separately because it uses detail[] (not the error envelope). For other non-2xx responses, parse error.code / error.message. For streaming (stream: true), handle in-band SSE type: "error" events separately.
```python
import requests

response = requests.post(
    "https://api.persly.ai/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "model": "persly-chat-v1",
        "messages": [{"role": "user", "text": "Hello"}],
    },
)

if response.status_code != 200:
    payload = response.json()
    if response.status_code == 422:
        # Schema validation failure: the body is {"detail": [...]}
        detail = payload.get("detail")
        first = detail[0] if isinstance(detail, list) and detail else {}
        print(
            f"Validation error [{first.get('type', 'value_error')}]: "
            f"{first.get('msg', 'Invalid request body')}"
        )
    else:
        # All other errors use the {"error": {...}} envelope
        error = payload.get("error", {})
        code = error.get("code", "unknown")
        message = error.get("message", str(payload))
        print(f"Error [{code}]: {message}")
else:
    data = response.json()
    print(data["message"])
```

```javascript
const response = await fetch("https://api.persly.ai/v1/chat/completions", {
  method: "POST",
  headers: {
    Authorization: "Bearer YOUR_API_KEY",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "persly-chat-v1",
    messages: [{ role: "user", text: "Hello" }],
  }),
});

if (!response.ok) {
  const payload = await response.json();
  if (response.status === 422) {
    // Schema validation failure: the body is { detail: [...] }
    const first = payload.detail?.[0];
    console.error(
      `Validation error [${first?.type ?? "value_error"}]: ${first?.msg ?? "Invalid request body"}`
    );
  } else {
    // All other errors use the { error: {...} } envelope
    const error = payload?.error;
    console.error(
      `Error [${error?.code ?? "unknown"}]: ${error?.message ?? JSON.stringify(payload)}`
    );
  }
} else {
  const data = await response.json();
  console.log(data.message);
}
```

Retry Strategy
For transient errors (5xx), implement exponential backoff:
```python
import time

import requests


def call_api(payload, max_retries=3):
    for attempt in range(max_retries):
        response = requests.post(
            "https://api.persly.ai/v1/chat/completions",
            headers={"Authorization": "Bearer YOUR_API_KEY"},
            json=payload,
        )
        # Anything below 500 is either a success or a non-retryable client error
        if response.status_code < 500:
            return response
        # Back off before the next attempt; skip the sleep after the final one
        if attempt < max_retries - 1:
            wait = 2 ** attempt  # 1s, then 2s
            time.sleep(wait)
    return response  # Return last response after all retries
```

| Error Type | Retry? | Action |
|---|---|---|
| 400 Bad Request | No | Fix the request parameters |
| 401 Unauthorized | No | Check your API key |
| 422 Validation Error | No | Fix request schema/field values (body or query parameters) |
| 403 Forbidden | No | Contact support |
| 402 Insufficient Credits | No | Purchase more credits at platform.persly.ai |
| 500 Server Error | Yes | Retry with exponential backoff |
| 503 Service Unavailable | Yes | Retry with exponential backoff |
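The retry table above can also be expressed as a small decision helper. The sketch below is one possible shape, not an official client: `should_retry` and `backoff_delay` are hypothetical names, and the random jitter and the 30-second cap are additions of this sketch. This page itself only calls for exponential backoff.

```python
import random

# Status codes the retry table marks as retryable.
RETRYABLE_STATUS = {500, 503}


def should_retry(status_code: int) -> bool:
    """True only for the statuses the table lists with Retry? = Yes."""
    return status_code in RETRYABLE_STATUS


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff with jitter: a random delay in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0, min(cap, base * 2 ** attempt))
```

Randomizing the delay spreads out retries from many clients that failed at the same moment, and the cap keeps later attempts from waiting arbitrarily long.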