Chat Completions

Generate medical AI chat responses with source citations.

POST https://api.persly.ai/v1/chat/completions

Creates a chat completion for the given messages. The response contains the AI-generated medical answer and, optionally, source citations and follow-up question suggestions. Streaming is supported.

Request Body

Parameter	Type	Required	Default	Description
`model`	string	Yes	—	Model ID. Either `persly-chat-v1` or `persly-chat-pro-v1`
`messages`	array	Yes	—	Array of input messages. Must be non-empty and at most 200 items.
`messages[].role`	string	Yes	—	Role of the message author: `user` or `assistant`
`messages[].text`	string	Yes	—	Message text content. Max 32,000 characters.
`messages[].image_urls`	string[]	No	`[]`	List of `https://` image URLs (JPG, PNG, WEBP). Each entry counts against the attachment cap (30 for `persly-chat-v1`, 60 for `persly-chat-pro-v1`), which is shared with PDF pages across the full `messages` array.
`messages[].pdf_urls`	string[]	No	`[]`	List of `https://` PDF URLs. The server fetches each PDF, runs OCR per page, and feeds the extracted markdown to the model as text (one citable source per page — see PDF attachments). The combined total of `image_urls` entries and PDF pages is capped at 30 for `persly-chat-v1` and 60 for `persly-chat-pro-v1`.
`stream`	boolean	No	`false`	If `true`, response is streamed as Server-Sent Events
`include_follow_ups`	boolean	No	`false`	Include follow-up question suggestions
`include_domains`	string[] \| null	No	`null`	Apply best-effort domain filtering during retrieval. Use Domains to discover common values.
`exclude_domains`	string[] \| null	No	`null`	Apply best-effort exclusion filtering during retrieval. Use Domains to discover common values.
`instructions`	string \| null	No	`null`	System-level instructions for answer generation. Max 4,000 characters.
`language`	string \| null	No	`null`	Output language hint (free-form string). Max 64 characters.

Request constraints:

- messages can include up to 200 items.
- Supported attachment formats: JPG/JPEG, PNG, WEBP, PDF.
- Attachment limits: persly-chat-v1 up to 30 pages, persly-chat-pro-v1 up to 60 pages. The cap applies to the combined total of image_urls entries and OCR'd PDF pages, summed across every message in the request.
- Counting rule: 1 image = 1 page, 1 PDF page = 1 page.
- Attachment URLs must use https://. Other schemes (http, data:, etc.) return 422 invalid_request.
- PDF fetch is capped at 50 MB and 30 s per file.
- Total attachment payload per request must be 40 MB or less.
- Each messages[].text value must be at most 32,000 characters.
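
Several of the constraints above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: `precheck_request` is a hypothetical helper name, and it only counts `image_urls` entries toward the cap because the number of PDF pages is not known until the server fetches the files.

```python
from urllib.parse import urlparse

# Limits copied from the constraints listed above.
MAX_MESSAGES = 200
MAX_TEXT_CHARS = 32_000
PAGE_CAPS = {"persly-chat-v1": 30, "persly-chat-pro-v1": 60}


def precheck_request(body: dict) -> list:
    """Return a list of problems found; an empty list means none were detected.

    Hypothetical client-side helper. PDF page counts are unknown before the
    server fetches the files, so only image_urls entries are counted here.
    """
    problems = []
    messages = body.get("messages") or []
    if not 1 <= len(messages) <= MAX_MESSAGES:
        problems.append(f"messages must contain 1 to {MAX_MESSAGES} items")

    image_count = 0
    for i, msg in enumerate(messages):
        if msg.get("role") not in ("user", "assistant"):
            problems.append(f"messages[{i}].role must be 'user' or 'assistant'")
        if len(msg.get("text") or "") > MAX_TEXT_CHARS:
            problems.append(f"messages[{i}].text exceeds {MAX_TEXT_CHARS} characters")
        images = list(msg.get("image_urls") or [])
        pdfs = list(msg.get("pdf_urls") or [])
        for url in images + pdfs:
            # urlparse lowercases the scheme, so "HTTPS://" also passes.
            if urlparse(url).scheme != "https":
                problems.append(f"messages[{i}]: attachment URL must use https://: {url}")
        image_count += len(images)

    cap = PAGE_CAPS.get(body.get("model"))
    if cap is not None and image_count > cap:
        problems.append(f"{image_count} images already exceed the {cap}-page cap")
    return problems
```

An empty result does not guarantee acceptance: the server still enforces the PDF page count, the fetch limits, and the total payload size.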

Requesting include_follow_ups does not guarantee follow-up questions. The follow-up surcharge is applied only when follow-up generation is actually invoked.

Chat pricing is additive:

Base request: persly-chat-v1 = $0.15, persly-chat-pro-v1 = $0.50
Follow-up generation surcharge: +$0.01 (only when invoked)
Attachment surcharge:
- persly-chat-v1: +$0.003 per page (images/PDF), up to 30 pages
- persly-chat-pro-v1: +$0.01 per page (images/PDF), up to 60 pages
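
The additive pricing above can be expressed as a small estimator. This is an illustrative sketch: `estimate_cost` and `PRICING` are hypothetical names, the prices are copied from the list above, and the follow-up surcharge is passed as a flag because it applies only when follow-up generation is actually invoked.

```python
from decimal import Decimal

# Prices copied from the list above; Decimal avoids float rounding in sums.
PRICING = {
    "persly-chat-v1": {"base": Decimal("0.15"), "per_page": Decimal("0.003"), "max_pages": 30},
    "persly-chat-pro-v1": {"base": Decimal("0.50"), "per_page": Decimal("0.01"), "max_pages": 60},
}
FOLLOW_UP_SURCHARGE = Decimal("0.01")


def estimate_cost(model: str, pages: int = 0, follow_ups_invoked: bool = False) -> Decimal:
    """Estimate the cost in USD of one request (hypothetical helper).

    pages is the combined count of images and PDF pages (1 image = 1 page).
    """
    tier = PRICING[model]
    if not 0 <= pages <= tier["max_pages"]:
        raise ValueError(f"{model} accepts 0 to {tier['max_pages']} pages, got {pages}")
    cost = tier["base"] + tier["per_page"] * pages
    if follow_ups_invoked:
        cost += FOLLOW_UP_SURCHARGE
    return cost


# 2 images + a 3-page PDF on persly-chat-v1 = 5 pages
print(estimate_cost("persly-chat-v1", pages=5))  # 0.165
```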

For persly-chat-pro-v1, if language is omitted, the internal default is auto.

include_domains and exclude_domains are mutually exclusive. Sending both as non-empty arrays returns a 422 validation error.

Response Body (Non-streaming)

Field	Type	Description
`steps`	array	AI processing steps
`steps[].description`	string	Step description (e.g. "Searching medical knowledge base")
`steps[].actions`	array	Actions performed during this step
`steps[].actions[].type`	string	Action type (e.g. `search_official_source`)
`steps[].actions[].input`	object	Action input parameters
`steps[].actions[].result`	array	Action results
`message`	string	The AI-generated response text. Inline citations appear as raw tokens such as `[SW1]`, `[PM1]`, `[PF1]` — match them against `sources[].id`. See Citation prefixes for what each prefix means.
`sources`	array \| null	Medical source citations. `null` when no citations are returned. When `pdf_urls` are provided, one `PF*` entry per PDF page is prepended ahead of retrieval sources (OCR-extracted text; see PDF attachments).
`sources[].id`	string	Stable identifier matching inline citations in `message` (e.g. `SW1`, `PF1`). Parse citations with the regex `\[([A-Z]{2,}\d+)\]` and look each captured token up in `sources[*].id`. The `PF` prefix denotes one page of a user-attached PDF (ids `PF1`, `PF2`, ... one per page, globally sequential across the request). `title` is `"{filename} (p.{N})"`, `url` is the original PDF URL (shared across all pages of the same PDF — clients can group by `url`), and `relevance_score` is fixed at `1.0`.
`sources[].title`	string	Source document title
`sources[].url`	string	Source URL
`sources[].relevance_score`	number	Relative relevance score as a float (higher means more relevant; do not assume a fixed 0.0–1.0 range). `PF*` entries are fixed at `1.0`.
`follow_up_questions`	array \| null	Suggested follow-up questions. `null` when `include_follow_ups: false` or none are generated.

Inline citations in message are preserved as raw tokens such as [SW1], [PM1], [PF1]. The API does not renumber or normalize them. To match citations with their source metadata, scan the message text with the regex \[([A-Z]{2,}\d+)\] and look each captured token up by sources[*].id. See Citation prefixes for the meaning of each two-letter prefix.
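
As a sketch of the matching step described above, the helper below scans `message` with the documented regex and looks each token up in `sources`. The name `resolve_citations` is hypothetical; returning `None` for a token with no matching source is a design choice of this sketch, not documented API behaviour.

```python
import re
from typing import Optional

# The citation pattern given in this reference.
CITATION_RE = re.compile(r"\[([A-Z]{2,}\d+)\]")


def resolve_citations(response: dict) -> list:
    """Pair each distinct citation id, in order of first appearance,
    with its sources[] entry, or None if no entry has that id."""
    # sources may be null, so fall back to an empty list.
    by_id = {s["id"]: s for s in (response.get("sources") or [])}
    ordered = []
    for cid in CITATION_RE.findall(response.get("message") or ""):
        if cid not in ordered:
            ordered.append(cid)
    result = []
    for cid in ordered:
        source: Optional[dict] = by_id.get(cid)
        result.append((cid, source))
    return result
```

A client could use the returned pairs to render footnotes, or to replace each raw token with a link to the source's `url`.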

For stream: true, see the Streaming guide for the SSE protocol.

Examples

Basic Request

curl https://api.persly.ai/v1/chat/completions \
  -H "Authorization: Bearer $PERSLY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "persly-chat-v1",
    "messages": [
      {"role": "user", "text": "What are the first-line treatments for type 2 diabetes?"}
    ]
  }'

import requests

response = requests.post(
    "https://api.persly.ai/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "model": "persly-chat-v1",
        "messages": [
            {
                "role": "user",
                "text": "What are the first-line treatments for type 2 diabetes?",
            }
        ],
    },
)

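# Fail fast on 4xx/5xx responses (see Errors) before reading the body.
response.raise_for_status()

data = response.json()
print(data["message"])

# sources is null when no citations are returned, so fall back to an empty list.
for source in (data.get("sources") or []):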
    print(f"  Source: {source['title']} ({source['url']})")

const response = await fetch("https://api.persly.ai/v1/chat/completions", {
  method: "POST",
  headers: {
    Authorization: "Bearer YOUR_API_KEY",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "persly-chat-v1",
    messages: [
      { role: "user", text: "What are the first-line treatments for type 2 diabetes?" },
    ],
  }),
});

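// Stop on 4xx/5xx responses (see Errors) before reading the body.
if (!response.ok) {
  throw new Error(`Request failed with status ${response.status}`);
}

const data = await response.json();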
console.log(data.message);

for (const source of data.sources ?? []) {
  console.log(`  Source: ${source.title} (${source.url})`);
}

Response

{
  "steps": [
    {
      "description": "Searching medical knowledge base",
      "actions": [
        {
          "type": "search_official_source",
          "input": { "query": "type 2 diabetes first-line treatment" },
          "result": [
            {
              "title": "ADA Standards of Medical Care in Diabetes",
              "url": "https://diabetesjournals.org/care/...",
              "content": "Metformin remains the preferred initial pharmacologic agent..."
            }
          ]
        }
      ]
    }
  ],
  "message": "The first-line treatment for type 2 diabetes is metformin, along with lifestyle modifications including diet and exercise [SW1]. According to the ADA Standards of Care, metformin remains the preferred initial pharmacologic agent due to its efficacy, safety profile, and low cost [SW1][SW2]...",
  "sources": [
    {
      "id": "SW1",
      "title": "ADA Standards of Medical Care in Diabetes",
      "url": "https://diabetesjournals.org/care/article/47/Supplement_1/S1/...",
      "relevance_score": 0.97
    },
    {
      "id": "SW2",
      "title": "Metformin - StatPearls",
      "url": "https://www.ncbi.nlm.nih.gov/books/NBK518983/",
      "relevance_score": 0.92
    }
  ],
  "follow_up_questions": null
}

With Follow-up Questions

curl https://api.persly.ai/v1/chat/completions \
  -H "Authorization: Bearer $PERSLY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "persly-chat-v1",
    "messages": [{"role": "user", "text": "What is hypertension?"}],
    "include_follow_ups": true
  }'

Response with follow-up questions

{
  "steps": [
    {
      "description": "Searching medical knowledge base",
      "actions": [
        {
          "type": "search_official_source",
          "input": { "query": "hypertension" },
          "result": [
            {
              "title": "Hypertension - StatPearls",
              "url": "https://www.ncbi.nlm.nih.gov/books/NBK539859/",
              "content": "Hypertension is defined as..."
            }
          ]
        }
      ]
    }
  ],
  "message": "Hypertension, or high blood pressure, is a condition [SW1]...",
  "sources": [
    {
      "id": "SW1",
      "title": "Hypertension - StatPearls",
      "url": "https://www.ncbi.nlm.nih.gov/books/NBK539859/",
      "relevance_score": 0.95
    }
  ],
  "follow_up_questions": [
    "What are the risk factors for hypertension?",
    "How is hypertension diagnosed?",
    "What lifestyle changes can help manage hypertension?"
  ]
}

With Image or PDF Input

curl https://api.persly.ai/v1/chat/completions \
  -H "Authorization: Bearer $PERSLY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "persly-chat-v1",
    "messages": [
      {
        "role": "user",
        "text": "Please summarize this health check report.",
        "pdf_urls": ["https://example.com/healthcheck.pdf"]
      }
    ]
  }'

Response with a PDF citation

{
  "steps": [
    {
      "description": "Searching medical knowledge base",
      "actions": [
        {
          "type": "search_official_source",
          "input": { "query": "fasting glucose reference range" },
          "result": [
            {
              "title": "ADA Standards of Medical Care in Diabetes",
              "url": "https://diabetesjournals.org/care/...",
              "content": "Fasting plasma glucose..."
            }
          ]
        }
      ]
    }
  ],
  "message": "Your fasting glucose is 110 mg/dL [PF2], which is within the prediabetes range by ADA criteria [SW1].",
  "sources": [
    {
      "id": "PF1",
      "title": "healthcheck.pdf (p.1)",
      "url": "https://example.com/healthcheck.pdf",
      "relevance_score": 1.0
    },
    {
      "id": "PF2",
      "title": "healthcheck.pdf (p.2)",
      "url": "https://example.com/healthcheck.pdf",
      "relevance_score": 1.0
    },
    {
      "id": "SW1",
      "title": "ADA Standards of Medical Care in Diabetes",
      "url": "https://diabetesjournals.org/care/article/47/Supplement_1/S1/...",
      "relevance_score": 0.94
    }
  ],
  "follow_up_questions": null
}

Cost example: A persly-chat-v1 request with 2 images and a 3-page PDF counts as 5 pages. Total cost is $0.15 + (5 × $0.003) = $0.165, plus +$0.01 only when follow-up generation is invoked.

With Instructions and Language Hint

curl https://api.persly.ai/v1/chat/completions \
  -H "Authorization: Bearer $PERSLY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "persly-chat-v1",
    "messages": [{"role": "user", "text": "Explain hypertension in simple terms."}],
    "instructions": "Use a concise clinical tone and avoid bullet points.",
    "language": "arabic"
  }'

With Domain Filter

Use Domains to fetch common values before sending filters. Domain tokens are normalized; invalid or non-normalized values may be ignored. Filtering is best-effort, so off-domain results may still appear.

curl https://api.persly.ai/v1/chat/completions \
  -H "Authorization: Bearer $PERSLY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "persly-chat-v1",
    "messages": [{"role": "user", "text": "Summarize hypertension treatment."}],
    "include_domains": ["nih.gov", "who.int"]
  }'

Exclude specific domains

curl https://api.persly.ai/v1/chat/completions \
  -H "Authorization: Bearer $PERSLY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "persly-chat-v1",
    "messages": [{"role": "user", "text": "Summarize hypertension treatment."}],
    "exclude_domains": ["wikipedia.org"]
  }'

Streaming

See the Streaming guide for detailed SSE protocol documentation.

curl https://api.persly.ai/v1/chat/completions \
  -H "Authorization: Bearer $PERSLY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "persly-chat-v1",
    "messages": [{"role": "user", "text": "Explain the pathophysiology of asthma."}],
    "stream": true
  }'

Citation prefixes

Every sources[].id — and every inline [...] token inside message — starts with a two-letter prefix that identifies where the source came from. Clients can parse all citations with the regex \[([A-Z]{2,}\d+)\].

Prefix	Name	Source
`SW`	Search Web	Official medical source retrieval from the curated web index (government / clinical / hospital domains). Default retrieval channel.
`PM`	PubMed	PubMed article retrieval. `persly-chat-pro-v1` only.
`PF`	PDF File	One page of a user-attached PDF (via `messages[].pdf_urls`). OCR-extracted markdown. One `PF*` entry per page, globally sequential across the request. See PDF attachments.

The numeric suffix (1, 2, …) is a per-request sequence within each prefix and is stable within a single response, but not across requests — treat [SW3] in one response as unrelated to [SW3] in another. Always resolve citations against the same response's sources[] (and the streaming steps[].sources snapshots — see Streaming).
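
For display purposes, a client might group `sources[]` entries by retrieval channel using the prefix table above. The sketch below is illustrative only: `group_sources_by_channel` is a hypothetical name, and keeping an unrecognized prefix as its own group is an assumption rather than documented behaviour.

```python
import re
from collections import defaultdict

# Prefix names copied from the table above.
PREFIX_NAMES = {"SW": "Search Web", "PM": "PubMed", "PF": "PDF File"}
ID_RE = re.compile(r"([A-Z]{2,})(\d+)")


def group_sources_by_channel(sources):
    """Group sources[] entries under a display name derived from the id prefix.

    Prefixes not in PREFIX_NAMES are used as their own group name.
    """
    grouped = defaultdict(list)
    for source in sources or []:
        match = ID_RE.fullmatch(source["id"])
        prefix = match.group(1) if match else "unknown"
        grouped[PREFIX_NAMES.get(prefix, prefix)].append(source)
    return dict(grouped)
```

Because the numeric suffixes are only meaningful within one response, the grouping should be rebuilt for every response rather than cached across requests.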

PDF attachments

When a request includes messages[].pdf_urls, the server:

1. Fetches each PDF over https with SSRF-hardening (blocks private / loopback / link-local / reserved addresses, enforces https across redirect hops, caps body at 50 MB and the full exchange at 30 s).
2. Runs OCR page by page and extracts markdown for each page.
3. Exposes every page as its own citable source with id PFn, where n is a request-wide sequence starting at 1.

Page-level ids

- Each page of a PDF becomes one sources[] entry.
- Ids are globally sequential across the entire request, not per-file. A request with two PDFs (3 pages + 2 pages) produces PF1 … PF5.
- title is "{filename} (p.{N})" so clients can surface a human-readable page label.
- url is the original PDF URL, shared across all pages of the same PDF. Clients can group pages back into a single PDF by comparing url values.
- relevance_score is fixed at 1.0 (PDFs are user-authoritative, not retrieval-ranked).
- PF* entries are prepended to sources[] ahead of any retrieval (SW* / PM*) sources.
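
Since every page of one PDF shares the same `url`, pages can be regrouped per file on the client. A minimal sketch, assuming the `sources[]` shape documented above; `group_pdf_pages` is a hypothetical helper name.

```python
import re
from collections import defaultdict

PF_ID_RE = re.compile(r"PF\d+")


def group_pdf_pages(sources):
    """Map each PDF URL to its PF* entries, preserving response order."""
    grouped = defaultdict(list)
    for source in sources or []:
        # Only PF* entries are pages of user-attached PDFs.
        if PF_ID_RE.fullmatch(source["id"]):
            grouped[source["url"]].append(source)
    return dict(grouped)


sources = [
    {"id": "PF1", "title": "healthcheck.pdf (p.1)", "url": "https://example.com/healthcheck.pdf"},
    {"id": "PF2", "title": "healthcheck.pdf (p.2)", "url": "https://example.com/healthcheck.pdf"},
    {"id": "SW1", "title": "ADA Standards of Medical Care in Diabetes", "url": "https://diabetesjournals.org/care/..."},
]
pages = group_pdf_pages(sources)
print([p["id"] for p in pages["https://example.com/healthcheck.pdf"]])  # ['PF1', 'PF2']
```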

LLM behaviour

The OCR'd markdown is passed to the model as text inside the retrieval context — not as multimodal image content. This keeps answers grounded in the actual PDF text and avoids fabricated retrieval citations for PDF-derived claims. The model is expected to cite specific pages, e.g. "fasting glucose 110 mg/dL [PF2]".

Failure modes

PDF handling is all-or-nothing per request: if any page fails at fetch, parse, or OCR, the entire request returns 422 invalid_request with param: "pdf_urls" (see Errors). This prevents partial-content answers where a missing page could mislead a patient.

Errors

Status	Code	Cause
400	`model_not_found`	Invalid model name
400	`missing_required_field`	Empty `messages` array
400	`invalid_request`	Invalid role or empty messages
400	`too_many_messages`	`messages` array exceeds 200 items
400	`content_too_long`	Any `messages[].text` exceeds 32,000 characters
422	`invalid_request`	A `pdf_urls` entry uses a non-`https` scheme, resolves to a private/loopback address, fails to fetch within the 30 s / 50 MB limits, cannot be parsed as a PDF, or OCR failed on any page. `param` is `pdf_urls`.
422	`content_too_long`	Combined `image_urls` + OCR'd PDF pages exceed the model attachment cap (30 for `persly-chat-v1`, 60 for `persly-chat-pro-v1`). `param` is `image_urls` or `pdf_urls` depending on which input caused the overflow.
422	—	Request validation error (missing required fields, wrong type, or both domain filters provided as non-empty arrays)
401	`authentication_error`	Missing or invalid Authorization header, or invalid API key (including malformed, unknown, or revoked keys).
402	`insufficient_credits`	Credit balance too low
500	`internal_error`	Unexpected server-side failure for non-streaming/pre-stream requests (streaming failures surface as in-band SSE `error` events with the same code)
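
One way a client might branch on the status codes in the table above is sketched below. The action labels and the function name `suggest_action` are hypothetical, and treating 5xx responses as worth retrying later is an assumption of this sketch, not a documented guarantee.

```python
def suggest_action(status: int) -> str:
    """Map an HTTP status from the Errors table to a coarse client-side action."""
    if status == 401:
        return "check_api_key"   # missing, malformed, unknown, or revoked key
    if status == 402:
        return "add_credits"     # credit balance too low
    if status in (400, 422):
        return "fix_request"     # validation or attachment problem in the request
    if status >= 500:
        return "retry_later"     # assumption: treat server-side failures as transient
    return "unexpected"          # status not listed in the table
```

For example, `suggest_action(422)` yields `"fix_request"`, which a client could surface together with the error's `param` value (such as `pdf_urls`) to point at the offending input.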
