Chat Completions
Generate medical AI chat responses with source citations.
POST https://api.persly.ai/v1/chat/completions
Creates a chat completion for the given messages. Returns a medical AI response with optional source citations, follow-up question suggestions, and streaming support.
Request Body
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| model | string | Yes | — | Model ID. Either persly-chat-v1 or persly-chat-pro-v1 |
| messages | array | Yes | — | Array of input messages. Must be non-empty and at most 200 items. |
| messages[].role | string | Yes | — | Role of the message author: user or assistant |
| messages[].text | string | Yes | — | Message text content. Max 32,000 characters. |
| messages[].image_urls | string[] | No | [] | List of image URLs (JPG, PNG, WEBP) as https:// URLs. Counted against the attachment cap (30 for persly-chat-v1, 60 for persly-chat-pro-v1, combined with rasterized PDF pages across the full messages array). |
| messages[].pdf_urls | string[] | No | [] | List of https:// PDF URLs. The server fetches each PDF, runs OCR per page, and feeds the extracted markdown to the model as text (one citable source per page — see PDF attachments). The combined total of image_urls entries and PDF pages is capped at 30 for persly-chat-v1 and 60 for persly-chat-pro-v1. |
| stream | boolean | No | false | If true, response is streamed as Server-Sent Events |
| include_follow_ups | boolean | No | false | Include follow-up question suggestions |
| include_domains | string[] \| null | No | null | Apply best-effort domain filtering during retrieval. Use Domains to discover common values. |
| exclude_domains | string[] \| null | No | null | Apply best-effort exclusion filtering during retrieval. Use Domains to discover common values. |
| instructions | string \| null | No | null | System-level instructions for answer generation. Max 4,000 characters. |
| language | string \| null | No | null | Output language hint (free-form string). Max 64 characters. |
Request constraints:
- messages can include up to 200 items.
- Supported attachment formats: JPG/JPEG, PNG, WEBP, PDF.
- Attachment limits: persly-chat-v1 up to 30 pages, persly-chat-pro-v1 up to 60 pages. The cap applies to the combined total of image_urls entries and OCR'd PDF pages, summed across every message in the request.
- Counting rule: 1 image = 1 page, 1 PDF page = 1 page.
- Attachment URLs must use https://. Other schemes (http, data:, etc.) return 422 invalid_request.
- PDF fetch is capped at 50 MB and 30 s per file.
- Total attachment payload per request must be 40 MB or less.
- Each messages[].text value must be at most 32,000 characters.
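Several of the constraints above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the Persly API: `check_request` and its `pdf_page_counts` argument are hypothetical names, and PDF page counts are assumed to be supplied by the caller (defaulting to one page per PDF) because the real count is only known after the server runs OCR. It does not check the 40 MB payload limit or the per-file fetch limits.

```python
from urllib.parse import urlparse

# Limits copied from the request constraints listed above.
MAX_MESSAGES = 200
MAX_TEXT_CHARS = 32_000
ATTACHMENT_CAP = {"persly-chat-v1": 30, "persly-chat-pro-v1": 60}


def check_request(payload, pdf_page_counts=None):
    """Return a list of problems found; an empty list means none were found.

    pdf_page_counts is a hypothetical {pdf_url: page_count} mapping supplied
    by the caller. PDFs missing from it are assumed to have one page.
    """
    problems = []
    pdf_page_counts = pdf_page_counts or {}
    messages = payload.get("messages") or []

    if not messages:
        problems.append("messages must be non-empty")
    if len(messages) > MAX_MESSAGES:
        problems.append(f"messages has {len(messages)} items (max {MAX_MESSAGES})")

    pages = 0
    for i, msg in enumerate(messages):
        if len(msg.get("text", "")) > MAX_TEXT_CHARS:
            problems.append(f"messages[{i}].text exceeds {MAX_TEXT_CHARS} characters")
        images = msg.get("image_urls") or []
        pdfs = msg.get("pdf_urls") or []
        for url in [*images, *pdfs]:
            if urlparse(url).scheme != "https":
                problems.append(f"messages[{i}]: attachment URL must use https: {url}")
        # Counting rule: 1 image = 1 page, 1 PDF page = 1 page.
        pages += len(images) + sum(pdf_page_counts.get(u, 1) for u in pdfs)

    cap = ATTACHMENT_CAP.get(payload.get("model"))
    if cap is not None and pages > cap:
        problems.append(f"{pages} attachment pages exceed the cap of {cap}")
    return problems
```

A check like this only saves a round trip; the server remains the authority on what it accepts.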
Requesting include_follow_ups does not guarantee follow-up questions. The
follow-up surcharge is applied only when follow-up generation is actually
invoked.
Chat pricing is additive:
- Base request: persly-chat-v1 = $0.15, persly-chat-pro-v1 = $0.50
- Follow-up generation surcharge: +$0.01 (only when invoked)
- Attachment surcharge:
  - persly-chat-v1: +$0.003 per page (images/PDF), up to 30 pages
  - persly-chat-pro-v1: +$0.01 per page (images/PDF), up to 60 pages
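The additive pricing can be expressed as a small estimator. This is a sketch using the prices listed above; `estimate_cost` is a hypothetical helper, and `follow_ups_invoked` must be decided by the caller, since the surcharge applies only when follow-up generation actually runs. `Decimal` is used to avoid binary floating-point rounding on amounts such as 0.003.

```python
from decimal import Decimal

# Prices copied from the pricing list above (USD).
BASE = {"persly-chat-v1": Decimal("0.15"), "persly-chat-pro-v1": Decimal("0.50")}
PER_PAGE = {"persly-chat-v1": Decimal("0.003"), "persly-chat-pro-v1": Decimal("0.01")}
FOLLOW_UP = Decimal("0.01")


def estimate_cost(model, pages=0, follow_ups_invoked=False):
    """Estimate the cost of one request in USD.

    pages is the combined count of images and PDF pages
    (1 image = 1 page, 1 PDF page = 1 page).
    """
    cost = BASE[model] + PER_PAGE[model] * pages
    if follow_ups_invoked:
        cost += FOLLOW_UP
    return cost


# Two images plus a three-page PDF on persly-chat-v1 count as 5 pages.
print(estimate_cost("persly-chat-v1", pages=5))
```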
For persly-chat-pro-v1, if language is omitted, the internal default is
auto.
include_domains and exclude_domains are mutually exclusive. Sending both
as non-empty arrays returns a 422 validation error.
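One way to avoid that 422 is to enforce the rule while building the request body. The helper below is a hypothetical sketch (`with_domain_filters` is not an API function); it rejects the combination locally and omits empty filters from the payload.

```python
def with_domain_filters(payload, include_domains=None, exclude_domains=None):
    """Return a copy of payload with at most one domain filter applied."""
    if include_domains and exclude_domains:
        # Both non-empty: the API would reject this with a 422.
        raise ValueError("include_domains and exclude_domains are mutually exclusive")
    result = dict(payload)
    if include_domains:
        result["include_domains"] = list(include_domains)
    if exclude_domains:
        result["exclude_domains"] = list(exclude_domains)
    return result
```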
Response Body (Non-streaming)
| Field | Type | Description |
|---|---|---|
| steps | array | AI processing steps |
| steps[].description | string | Step description (e.g. "Searching medical knowledge base") |
| steps[].actions | array | Actions performed during this step |
| steps[].actions[].type | string | Action type (e.g. search_official_source) |
| steps[].actions[].input | object | Action input parameters |
| steps[].actions[].result | array | Action results |
| message | string | The AI-generated response text. Inline citations appear as raw tokens such as [SW1], [PM1], [PF1] — match them against sources[].id. See Citation prefixes for what each prefix means. |
| sources | array \| null | Medical source citations. null when no citations are returned. When pdf_urls are provided, one PF* entry per PDF page is prepended ahead of retrieval sources (OCR-extracted text — see PDF attachments). |
| sources[].id | string | Stable identifier matching inline citations in message (e.g. SW1, PF1). Parse citations with the regex \[([A-Z]{2,}\d+)\] and look each captured token up in sources[*].id. The PF prefix denotes one page of a user-attached PDF (ids PF1, PF2, ... one per page, globally sequential across the request). title is "{filename} (p.{N})", url is the original PDF URL (shared across all pages of the same PDF — clients can group by url), and relevance_score is fixed at 1.0. |
| sources[].title | string | Source document title |
| sources[].url | string | Source URL |
| sources[].relevance_score | number | Relative relevance score as a float (higher means more relevant; do not assume a fixed 0.0–1.0 range). PF* entries are fixed at 1.0. |
| follow_up_questions | array \| null | Suggested follow-up questions. null when include_follow_ups: false or none are generated. |
Inline citations in message are preserved as raw tokens such as [SW1],
[PM1], [PF1]. The API does not renumber or normalize them. To match
citations with their source metadata, scan the message text with the regex
\[([A-Z]{2,}\d+)\] and look each captured token up by sources[*].id.
See Citation prefixes for the meaning of each
two-letter prefix.
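The matching procedure described above can be sketched as follows. The regex is the one given in this section; `resolve_citations` is a hypothetical helper name, and it assumes a parsed non-streaming response body. Tokens with no matching entry in sources are returned with `None` rather than dropped, so the caller can decide how to render them.

```python
import re

# Regex from the documentation: a prefix of two or more capitals, then digits.
CITATION_RE = re.compile(r"\[([A-Z]{2,}\d+)\]")


def resolve_citations(response):
    """Return (token, source_or_None) pairs in order of first appearance."""
    by_id = {s["id"]: s for s in (response.get("sources") or [])}
    resolved = []
    seen = set()
    for token in CITATION_RE.findall(response.get("message") or ""):
        if token in seen:
            continue  # each citation is reported once
        seen.add(token)
        resolved.append((token, by_id.get(token)))
    return resolved
```

Because sources can be null, the lookup table is built from `sources or []`, matching the null handling used in the request examples on this page.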
For stream: true, see the Streaming guide for the SSE protocol.
Examples
Basic Request
curl https://api.persly.ai/v1/chat/completions \
-H "Authorization: Bearer $PERSLY_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "persly-chat-v1",
"messages": [
{"role": "user", "text": "What are the first-line treatments for type 2 diabetes?"}
]
}'

import requests

response = requests.post(
    "https://api.persly.ai/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "model": "persly-chat-v1",
        "messages": [
            {
                "role": "user",
                "text": "What are the first-line treatments for type 2 diabetes?",
            }
        ],
    },
)
data = response.json()
print(data["message"])
for source in (data.get("sources") or []):
    print(f" Source: {source['title']} ({source['url']})")

const response = await fetch("https://api.persly.ai/v1/chat/completions", {
  method: "POST",
  headers: {
    Authorization: "Bearer YOUR_API_KEY",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "persly-chat-v1",
    messages: [
      { role: "user", text: "What are the first-line treatments for type 2 diabetes?" },
    ],
  }),
});
const data = await response.json();
console.log(data.message);
for (const source of data.sources ?? []) {
  console.log(` Source: ${source.title} (${source.url})`);
}

Response
{
"steps": [
{
"description": "Searching medical knowledge base",
"actions": [
{
"type": "search_official_source",
"input": { "query": "type 2 diabetes first-line treatment" },
"result": [
{
"title": "ADA Standards of Medical Care in Diabetes",
"url": "https://diabetesjournals.org/care/...",
"content": "Metformin remains the preferred initial pharmacologic agent..."
}
]
}
]
}
],
"message": "The first-line treatment for type 2 diabetes is metformin, along with lifestyle modifications including diet and exercise [SW1]. According to the ADA Standards of Care, metformin remains the preferred initial pharmacologic agent due to its efficacy, safety profile, and low cost [SW1][SW2]...",
"sources": [
{
"id": "SW1",
"title": "ADA Standards of Medical Care in Diabetes",
"url": "https://diabetesjournals.org/care/article/47/Supplement_1/S1/...",
"relevance_score": 0.97
},
{
"id": "SW2",
"title": "Metformin - StatPearls",
"url": "https://www.ncbi.nlm.nih.gov/books/NBK518983/",
"relevance_score": 0.92
}
],
"follow_up_questions": null
}

With Follow-up Questions
curl https://api.persly.ai/v1/chat/completions \
-H "Authorization: Bearer $PERSLY_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "persly-chat-v1",
"messages": [{"role": "user", "text": "What is hypertension?"}],
"include_follow_ups": true
}'

{
"steps": [
{
"description": "Searching medical knowledge base",
"actions": [
{
"type": "search_official_source",
"input": { "query": "hypertension" },
"result": [
{
"title": "Hypertension - StatPearls",
"url": "https://www.ncbi.nlm.nih.gov/books/NBK539859/",
"content": "Hypertension is defined as..."
}
]
}
]
}
],
"message": "Hypertension, or high blood pressure, is a condition [SW1]...",
"sources": [
{
"id": "SW1",
"title": "Hypertension - StatPearls",
"url": "https://www.ncbi.nlm.nih.gov/books/NBK539859/",
"relevance_score": 0.95
}
],
"follow_up_questions": [
"What are the risk factors for hypertension?",
"How is hypertension diagnosed?",
"What lifestyle changes can help manage hypertension?"
]
}

With Image or PDF Input
curl https://api.persly.ai/v1/chat/completions \
-H "Authorization: Bearer $PERSLY_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "persly-chat-v1",
"messages": [
{
"role": "user",
"text": "Please summarize this health check report.",
"pdf_urls": ["https://example.com/healthcheck.pdf"]
}
]
}'

{
"steps": [
{
"description": "Searching medical knowledge base",
"actions": [
{
"type": "search_official_source",
"input": { "query": "fasting glucose reference range" },
"result": [
{
"title": "ADA Standards of Medical Care in Diabetes",
"url": "https://diabetesjournals.org/care/...",
"content": "Fasting plasma glucose..."
}
]
}
]
}
],
"message": "Your fasting glucose is 110 mg/dL [PF2], which is within the prediabetes range by ADA criteria [SW1].",
"sources": [
{
"id": "PF1",
"title": "healthcheck.pdf (p.1)",
"url": "https://example.com/healthcheck.pdf",
"relevance_score": 1.0
},
{
"id": "PF2",
"title": "healthcheck.pdf (p.2)",
"url": "https://example.com/healthcheck.pdf",
"relevance_score": 1.0
},
{
"id": "SW1",
"title": "ADA Standards of Medical Care in Diabetes",
"url": "https://diabetesjournals.org/care/article/47/Supplement_1/S1/...",
"relevance_score": 0.94
}
],
"follow_up_questions": null
}

Cost example: A persly-chat-v1 request with 2 images and a 3-page PDF counts
as 5 pages. Total cost is $0.15 + (5 × $0.003) = $0.165, plus +$0.01 only
when follow-up generation is invoked.
With Instructions and Language Hint
curl https://api.persly.ai/v1/chat/completions \
-H "Authorization: Bearer $PERSLY_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "persly-chat-v1",
"messages": [{"role": "user", "text": "Explain hypertension in simple terms."}],
"instructions": "Use a concise clinical tone and avoid bullet points.",
"language": "arabic"
}'

With Domain Filter
Use Domains to fetch common values before sending filters. Domain tokens are normalized, and invalid or non-normalized values may be ignored. Filtering is best-effort, so off-domain results may still appear.
curl https://api.persly.ai/v1/chat/completions \
-H "Authorization: Bearer $PERSLY_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "persly-chat-v1",
"messages": [{"role": "user", "text": "Summarize hypertension treatment."}],
"include_domains": ["nih.gov", "who.int"]
}'

curl https://api.persly.ai/v1/chat/completions \
-H "Authorization: Bearer $PERSLY_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "persly-chat-v1",
"messages": [{"role": "user", "text": "Summarize hypertension treatment."}],
"exclude_domains": ["wikipedia.org"]
}'

Streaming
See the Streaming guide for detailed SSE protocol documentation.
curl https://api.persly.ai/v1/chat/completions \
-H "Authorization: Bearer $PERSLY_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "persly-chat-v1",
"messages": [{"role": "user", "text": "Explain the pathophysiology of asthma."}],
"stream": true
}'

Citation prefixes
Every sources[].id — and every inline [...] token inside message — starts
with a two-letter prefix that identifies where the source came from. Clients
can parse all citations with the regex \[([A-Z]{2,}\d+)\].
| Prefix | Name | Source |
|---|---|---|
| SW | Search Web | Official medical source retrieval from the curated web index (government / clinical / hospital domains). Default retrieval channel. |
| PM | PubMed | PubMed article retrieval. persly-chat-pro-v1 only. |
| PF | PDF File | One page of a user-attached PDF (via messages[].pdf_urls). OCR-extracted markdown. One PF* entry per page, globally sequential across the request. See PDF attachments. |
The numeric suffix (1, 2, …) is a per-request sequence within each prefix
and is stable within a single response, but not across requests — treat
[SW3] in one response as unrelated to [SW3] in another. Always resolve
citations against the same response's sources[] (and the streaming
steps[].sources snapshots — see Streaming).
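A client that wants to label sources by origin could split each id into its prefix and numeric suffix. This is a sketch: `group_sources_by_prefix` is a hypothetical helper, the prefix names come from the table above, and any prefix not in that table is kept under its raw letters rather than guessed at.

```python
import re
from collections import defaultdict

# Prefix names copied from the citation prefix table above.
PREFIX_NAMES = {"SW": "Search Web", "PM": "PubMed", "PF": "PDF File"}
ID_RE = re.compile(r"([A-Z]{2,})(\d+)")


def group_sources_by_prefix(sources):
    """Map each prefix name to the list of source ids carrying that prefix."""
    groups = defaultdict(list)
    for source in sources or []:
        match = ID_RE.fullmatch(source["id"])
        if match is None:
            continue  # not a recognised citation id; skip it
        prefix = match.group(1)
        groups[PREFIX_NAMES.get(prefix, prefix)].append(source["id"])
    return dict(groups)
```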
PDF attachments
When a request includes messages[].pdf_urls, the server:
- Fetches each PDF over https with SSRF-hardening (blocks private / loopback / link-local / reserved addresses, enforces https across redirect hops, caps body at 50 MB and the full exchange at 30 s).
- Runs OCR page by page and extracts markdown for each page.
- Exposes every page as its own citable source with id PFn, where n is a request-wide sequence starting at 1.
Page-level ids
- Each page of a PDF becomes one sources[] entry.
- Ids are globally sequential across the entire request, not per-file. A request with two PDFs (3 pages + 2 pages) produces PF1…PF5.
- title is "{filename} (p.{N})" so clients can surface a human-readable page label.
- url is the original PDF URL, shared across all pages of the same PDF. Clients can group pages back into a single PDF by comparing url values.
- relevance_score is fixed at 1.0 (PDFs are user-authoritative, not retrieval-ranked).
- PF* entries are prepended to sources[] ahead of any retrieval (SW*/PM*) sources.
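Grouping page-level entries back into their PDFs, as suggested above, might look like the sketch below. `group_pdf_pages` is a hypothetical helper; it assumes the title follows the documented "{filename} (p.{N})" pattern and records `None` for the page number when a title does not match.

```python
import re

# Matches the documented title format "{filename} (p.{N})".
PAGE_TITLE_RE = re.compile(r"^(?P<filename>.+) \(p\.(?P<page>\d+)\)$")


def group_pdf_pages(sources):
    """Map each PDF url to a list of (source_id, page_number) tuples."""
    pdfs = {}
    for source in sources or []:
        if not source["id"].startswith("PF"):
            continue  # retrieval sources (SW*, PM*) are not PDF pages
        match = PAGE_TITLE_RE.match(source["title"])
        page = int(match.group("page")) if match else None
        pdfs.setdefault(source["url"], []).append((source["id"], page))
    return pdfs
```

Since ids are sequential across the whole request, the page number is taken from the title rather than from the id suffix: in a request with two PDFs, PF4 would be page 1 of the second file.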
LLM behaviour
The OCR'd markdown is passed to the model as text inside the retrieval context — not as multimodal image content. This keeps answers grounded in the actual PDF text and avoids fabricated retrieval citations for PDF-derived claims. The model is expected to cite specific pages, e.g. "fasting glucose 110 mg/dL [PF2]".
Failure modes
PDF handling is all-or-nothing per request: if any page fails at fetch, parse, or OCR, the entire request returns 422 invalid_request with param: "pdf_urls" (see Errors). This prevents partial-content answers where a missing page could mislead a patient.
Errors
| Status | Code | Cause |
|---|---|---|
| 400 | model_not_found | Invalid model name |
| 400 | missing_required_field | Empty messages array |
| 400 | invalid_request | Invalid role or empty messages |
| 400 | too_many_messages | messages array exceeds 200 items |
| 400 | content_too_long | Any messages[].text exceeds 32,000 characters |
| 422 | invalid_request | A pdf_urls entry uses a non-https scheme, resolves to a private/loopback address, fails to fetch within the 30 s / 50 MB limits, cannot be parsed as a PDF, or OCR failed on any page. param is pdf_urls. |
| 422 | content_too_long | Combined image_urls + OCR'd PDF pages exceed the model attachment cap (30 for persly-chat-v1, 60 for persly-chat-pro-v1). param is image_urls or pdf_urls depending on which input caused the overflow. |
| 422 | — | Request validation error (missing required fields, wrong type, or both domain filters provided as non-empty arrays) |
| 401 | authentication_error | Missing or invalid Authorization header, or invalid API key (including malformed, unknown, or revoked keys). |
| 402 | insufficient_credits | Credit balance too low |
| 500 | internal_error | Unexpected server-side failure for non-streaming/pre-stream requests (streaming failures surface as in-band SSE error events with the same code) |
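Client code often reduces the table above to a few handling strategies. The sketch below is one possible mapping, not guidance from the API itself: `classify_error` and the action names are hypothetical, and treating 5xx responses as retryable is an assumption rather than documented behaviour. It takes the HTTP status and error code as arguments because the shape of the error response body is not described in this section.

```python
def classify_error(status, code=None):
    """Map an HTTP status and optional error code to a suggested client action."""
    if status == 401:
        return "fix_credentials"   # authentication_error
    if status == 402:
        return "add_credits"       # insufficient_credits
    if status in (400, 422):
        # The request itself is at fault (bad model, limits, attachments);
        # resending it unchanged would fail the same way.
        return "fix_request"
    if status >= 500:
        return "retry_later"       # assumption: server-side failures may be transient
    return "unknown"
```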