Persly API
API Reference

Chat Completions

Generate medical AI chat responses with source citations.

POST https://api.persly.ai/v1/chat/completions

Creates a chat completion for the given messages. Returns a medical AI response with optional source citations, follow-up question suggestions, and streaming support.

Request Body

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| model | string | Yes | | Model ID. Either persly-chat-v1 or persly-chat-pro-v1. |
| messages | array | Yes | | Array of input messages. Must be non-empty and at most 200 items. |
| messages[].role | string | Yes | | Role of the message author: user or assistant. |
| messages[].text | string | Yes | | Message text content. Max 32,000 characters. |
| messages[].image_urls | string[] | No | [] | List of https:// image URLs (JPG, PNG, WEBP). Counted against the attachment cap (30 for persly-chat-v1, 60 for persly-chat-pro-v1, combined with rasterized PDF pages across the full messages array). |
| messages[].pdf_urls | string[] | No | [] | List of https:// PDF URLs. The server fetches each PDF, runs OCR per page, and feeds the extracted markdown to the model as text (one citable source per page; see PDF attachments). The combined total of image_urls entries and PDF pages is capped at 30 for persly-chat-v1 and 60 for persly-chat-pro-v1. |
| stream | boolean | No | false | If true, the response is streamed as Server-Sent Events. |
| include_follow_ups | boolean | No | false | Include follow-up question suggestions. |
| include_domains | string[] or null | No | null | Apply best-effort domain filtering during retrieval. Use Domains to discover common values. |
| exclude_domains | string[] or null | No | null | Apply best-effort exclusion filtering during retrieval. Use Domains to discover common values. |
| instructions | string or null | No | null | System-level instructions for answer generation. Max 4,000 characters. |
| language | string or null | No | null | Output language hint (free-form string). Max 64 characters. |

Request constraints:

  • messages can include up to 200 items.
  • Supported attachment formats: JPG/JPEG, PNG, WEBP, PDF.
  • Attachment limits: persly-chat-v1 up to 30 pages, persly-chat-pro-v1 up to 60 pages. The cap applies to the combined total of image_urls entries and OCR'd PDF pages, summed across every message in the request.
  • Counting rule: 1 image = 1 page, 1 PDF page = 1 page.
  • Attachment URLs must use https://. Other schemes (http, data:, etc.) return 422 invalid_request.
  • PDF fetch is capped at 50 MB and 30 s per file.
  • Total attachment payload per request must be 40 MB or less.
  • Each messages[].text value must be at most 32,000 characters.
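
Several of the constraints above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the Persly API: the function name validate_request and its return shape are hypothetical. It checks only what can be known locally (message count, roles, text length, URL scheme, and the image count against the model's cap); PDF page counts and payload sizes are only known once the server has fetched the files.

```python
from urllib.parse import urlparse

MAX_MESSAGES = 200
MAX_TEXT_CHARS = 32_000
# Combined image + PDF page cap per model, as listed in the constraints.
ATTACHMENT_CAP = {"persly-chat-v1": 30, "persly-chat-pro-v1": 60}


def validate_request(payload):
    """Return a list of problems found locally; an empty list means none were found."""
    problems = []
    messages = payload.get("messages") or []
    if not messages:
        problems.append("messages must be non-empty")
    if len(messages) > MAX_MESSAGES:
        problems.append(f"messages has {len(messages)} items (max {MAX_MESSAGES})")

    image_count = 0
    for i, msg in enumerate(messages):
        if msg.get("role") not in ("user", "assistant"):
            problems.append(f"messages[{i}].role must be user or assistant")
        if len(msg.get("text") or "") > MAX_TEXT_CHARS:
            problems.append(f"messages[{i}].text exceeds {MAX_TEXT_CHARS} characters")
        image_count += len(msg.get("image_urls") or [])
        for field in ("image_urls", "pdf_urls"):
            for url in msg.get(field) or []:
                if urlparse(url).scheme != "https":
                    problems.append(f"messages[{i}].{field}: {url!r} is not an https:// URL")

    # Images alone already count one page each; PDF pages are added server-side.
    cap = ATTACHMENT_CAP.get(payload.get("model"))
    if cap is not None and image_count > cap:
        problems.append(f"{image_count} images exceed the cap of {cap} pages")
    return problems
```

A passing local check does not guarantee the server accepts the request, since the server remains the authority on PDF page counts and fetch limits.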

Requesting include_follow_ups does not guarantee follow-up questions. The follow-up surcharge is applied only when follow-up generation is actually invoked.

Chat pricing is additive:

  • Base request: persly-chat-v1 = $0.15, persly-chat-pro-v1 = $0.50
  • Follow-up generation surcharge: +$0.01 (only when invoked)
  • Attachment surcharge:
    • persly-chat-v1: +$0.003 per page (images/PDF), up to 30 pages
    • persly-chat-pro-v1: +$0.01 per page (images/PDF), up to 60 pages
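
The additive pricing above can be expressed as a small estimator. This is an illustrative sketch only: estimate_cost is a hypothetical helper, not an API call, and the prices are copied from the list above. It uses Decimal so that sub-cent page surcharges add up exactly.

```python
from decimal import Decimal

BASE_PRICE = {"persly-chat-v1": Decimal("0.15"), "persly-chat-pro-v1": Decimal("0.50")}
PER_PAGE = {"persly-chat-v1": Decimal("0.003"), "persly-chat-pro-v1": Decimal("0.01")}
PAGE_CAP = {"persly-chat-v1": 30, "persly-chat-pro-v1": 60}
FOLLOW_UP_SURCHARGE = Decimal("0.01")


def estimate_cost(model, pages=0, follow_ups_invoked=False):
    """Estimate the price in USD of one request.

    pages is the combined count of images and PDF pages (1 image = 1 page).
    follow_ups_invoked should be True only if follow-up generation actually ran,
    which the client cannot know in advance.
    """
    if model not in BASE_PRICE:
        raise ValueError(f"unknown model: {model!r}")
    if not 0 <= pages <= PAGE_CAP[model]:
        raise ValueError(f"pages must be between 0 and {PAGE_CAP[model]} for {model}")
    cost = BASE_PRICE[model] + pages * PER_PAGE[model]
    if follow_ups_invoked:
        cost += FOLLOW_UP_SURCHARGE
    return cost


# 2 images plus a 3-page PDF count as 5 pages on persly-chat-v1.
print(estimate_cost("persly-chat-v1", pages=5))  # prints 0.165
```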

For persly-chat-pro-v1, if language is omitted, the internal default is auto.

include_domains and exclude_domains are mutually exclusive. Sending both as non-empty arrays returns a 422 validation error.
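
One way to avoid that 422 is to build the filter fields through a single helper that refuses to set both. The sketch below is an assumption about client-side structure; domain_filter_fields is a hypothetical name, and the returned dict is meant to be merged into the request body.

```python
def domain_filter_fields(include=None, exclude=None):
    """Return the request-body fields for at most one non-empty domain filter."""
    if include and exclude:
        raise ValueError("include_domains and exclude_domains are mutually exclusive")
    fields = {}
    if include:
        fields["include_domains"] = list(include)
    if exclude:
        fields["exclude_domains"] = list(exclude)
    return fields


body = {
    "model": "persly-chat-v1",
    "messages": [{"role": "user", "text": "Summarize hypertension treatment."}],
    **domain_filter_fields(include=["nih.gov", "who.int"]),
}
```

Empty or omitted filters produce no fields, which leaves both parameters at their null default.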

Response Body (Non-streaming)

| Field | Type | Description |
| --- | --- | --- |
| steps | array | AI processing steps. |
| steps[].description | string | Step description (e.g. "Searching medical knowledge base"). |
| steps[].actions | array | Actions performed during this step. |
| steps[].actions[].type | string | Action type (e.g. search_official_source). |
| steps[].actions[].input | object | Action input parameters. |
| steps[].actions[].result | array | Action results. |
| message | string | The AI-generated response text. Inline citations appear as raw tokens such as [SW1], [PM1], [PF1]; match them against sources[].id. See Citation prefixes for what each prefix means. |
| sources | array or null | Medical source citations. null when no citations are returned. When pdf_urls are provided, one PF* entry per PDF page is prepended ahead of retrieval sources (OCR-extracted text; see PDF attachments). |
| sources[].id | string | Stable identifier matching inline citations in message (e.g. SW1, PF1). Parse citations with the regex \[([A-Z]{2,}\d+)\] and look each captured token up in sources[*].id. The PF prefix denotes one page of a user-attached PDF (ids PF1, PF2, ... one per page, globally sequential across the request). For PF entries, title is "{filename} (p.{N})", url is the original PDF URL (shared across all pages of the same PDF, so clients can group by url), and relevance_score is fixed at 1.0. |
| sources[].title | string | Source document title. |
| sources[].url | string | Source URL. |
| sources[].relevance_score | number | Relative relevance score as a float (higher means more relevant; do not assume a fixed 0.0–1.0 range). PF* entries are fixed at 1.0. |
| follow_up_questions | array or null | Suggested follow-up questions. null when include_follow_ups is false or none are generated. |

Inline citations in message are preserved as raw tokens such as [SW1], [PM1], [PF1]. The API does not renumber or normalize them. To match citations with their source metadata, scan the message text with the regex \[([A-Z]{2,}\d+)\] and look each captured token up by sources[*].id. See Citation prefixes for the meaning of each two-letter prefix.
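
The matching step described above can be sketched in a few lines. The regex is the one given in this reference; the helper name resolve_citations and the sample data are illustrative assumptions. Tokens with no matching source resolve to None rather than raising, since the client should not assume every token is present in sources.

```python
import re

# Regex from this reference: two or more capital letters followed by digits, in brackets.
CITATION_RE = re.compile(r"\[([A-Z]{2,}\d+)\]")


def resolve_citations(message, sources):
    """Return (token, source-or-None) pairs in order of first appearance in message."""
    by_id = {source["id"]: source for source in (sources or [])}
    seen = set()
    resolved = []
    for token in CITATION_RE.findall(message):
        if token in seen:
            continue  # report each citation once, even if it is repeated
        seen.add(token)
        resolved.append((token, by_id.get(token)))
    return resolved


message = "Metformin is first-line [SW1]. It is low cost [SW1][SW2]."
sources = [
    {"id": "SW1", "title": "ADA Standards of Medical Care in Diabetes"},
    {"id": "SW2", "title": "Metformin - StatPearls"},
]
for token, source in resolve_citations(message, sources):
    print(token, "->", source["title"] if source else "(no matching source)")
```

Passing sources=None is handled, which matches the case where the response returns null for sources.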

For stream: true, see the Streaming guide for the SSE protocol.

Examples

Basic Request

curl https://api.persly.ai/v1/chat/completions \
  -H "Authorization: Bearer $PERSLY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "persly-chat-v1",
    "messages": [
      {"role": "user", "text": "What are the first-line treatments for type 2 diabetes?"}
    ]
  }'

Python

import requests

response = requests.post(
    "https://api.persly.ai/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "model": "persly-chat-v1",
        "messages": [
            {
                "role": "user",
                "text": "What are the first-line treatments for type 2 diabetes?",
            }
        ],
    },
)

data = response.json()
print(data["message"])

for source in (data.get("sources") or []):
    print(f"  Source: {source['title']} ({source['url']})")

JavaScript

const response = await fetch("https://api.persly.ai/v1/chat/completions", {
  method: "POST",
  headers: {
    Authorization: "Bearer YOUR_API_KEY",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "persly-chat-v1",
    messages: [
      { role: "user", text: "What are the first-line treatments for type 2 diabetes?" },
    ],
  }),
});

const data = await response.json();
console.log(data.message);

for (const source of data.sources ?? []) {
  console.log(`  Source: ${source.title} (${source.url})`);
}

Response

{
  "steps": [
    {
      "description": "Searching medical knowledge base",
      "actions": [
        {
          "type": "search_official_source",
          "input": { "query": "type 2 diabetes first-line treatment" },
          "result": [
            {
              "title": "ADA Standards of Medical Care in Diabetes",
              "url": "https://diabetesjournals.org/care/...",
              "content": "Metformin remains the preferred initial pharmacologic agent..."
            }
          ]
        }
      ]
    }
  ],
  "message": "The first-line treatment for type 2 diabetes is metformin, along with lifestyle modifications including diet and exercise [SW1]. According to the ADA Standards of Care, metformin remains the preferred initial pharmacologic agent due to its efficacy, safety profile, and low cost [SW1][SW2]...",
  "sources": [
    {
      "id": "SW1",
      "title": "ADA Standards of Medical Care in Diabetes",
      "url": "https://diabetesjournals.org/care/article/47/Supplement_1/S1/...",
      "relevance_score": 0.97
    },
    {
      "id": "SW2",
      "title": "Metformin - StatPearls",
      "url": "https://www.ncbi.nlm.nih.gov/books/NBK518983/",
      "relevance_score": 0.92
    }
  ],
  "follow_up_questions": null
}

With Follow-up Questions

curl https://api.persly.ai/v1/chat/completions \
  -H "Authorization: Bearer $PERSLY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "persly-chat-v1",
    "messages": [{"role": "user", "text": "What is hypertension?"}],
    "include_follow_ups": true
  }'

Response with follow-up questions

{
  "steps": [
    {
      "description": "Searching medical knowledge base",
      "actions": [
        {
          "type": "search_official_source",
          "input": { "query": "hypertension" },
          "result": [
            {
              "title": "Hypertension - StatPearls",
              "url": "https://www.ncbi.nlm.nih.gov/books/NBK539859/",
              "content": "Hypertension is defined as..."
            }
          ]
        }
      ]
    }
  ],
  "message": "Hypertension, or high blood pressure, is a condition [SW1]...",
  "sources": [
    {
      "id": "SW1",
      "title": "Hypertension - StatPearls",
      "url": "https://www.ncbi.nlm.nih.gov/books/NBK539859/",
      "relevance_score": 0.95
    }
  ],
  "follow_up_questions": [
    "What are the risk factors for hypertension?",
    "How is hypertension diagnosed?",
    "What lifestyle changes can help manage hypertension?"
  ]
}

With Image or PDF Input

curl https://api.persly.ai/v1/chat/completions \
  -H "Authorization: Bearer $PERSLY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "persly-chat-v1",
    "messages": [
      {
        "role": "user",
        "text": "Please summarize this health check report.",
        "pdf_urls": ["https://example.com/healthcheck.pdf"]
      }
    ]
  }'

Response with a PDF citation

{
  "steps": [
    {
      "description": "Searching medical knowledge base",
      "actions": [
        {
          "type": "search_official_source",
          "input": { "query": "fasting glucose reference range" },
          "result": [
            {
              "title": "ADA Standards of Medical Care in Diabetes",
              "url": "https://diabetesjournals.org/care/...",
              "content": "Fasting plasma glucose..."
            }
          ]
        }
      ]
    }
  ],
  "message": "Your fasting glucose is 110 mg/dL [PF2], which is within the prediabetes range by ADA criteria [SW1].",
  "sources": [
    {
      "id": "PF1",
      "title": "healthcheck.pdf (p.1)",
      "url": "https://example.com/healthcheck.pdf",
      "relevance_score": 1.0
    },
    {
      "id": "PF2",
      "title": "healthcheck.pdf (p.2)",
      "url": "https://example.com/healthcheck.pdf",
      "relevance_score": 1.0
    },
    {
      "id": "SW1",
      "title": "ADA Standards of Medical Care in Diabetes",
      "url": "https://diabetesjournals.org/care/article/47/Supplement_1/S1/...",
      "relevance_score": 0.94
    }
  ],
  "follow_up_questions": null
}

Cost example: A persly-chat-v1 request with 2 images and a 3-page PDF counts as 5 pages. Total cost is $0.15 + (5 × $0.003) = $0.165, plus +$0.01 only when follow-up generation is invoked.

With Instructions and Language Hint

curl https://api.persly.ai/v1/chat/completions \
  -H "Authorization: Bearer $PERSLY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "persly-chat-v1",
    "messages": [{"role": "user", "text": "Explain hypertension in simple terms."}],
    "instructions": "Use a concise clinical tone and avoid bullet points.",
    "language": "arabic"
  }'

With Domain Filter

Use Domains to fetch common values before sending filters. Domain tokens are normalized; invalid or non-normalized values may be ignored. Filtering is best-effort, so off-domain results may still appear.

curl https://api.persly.ai/v1/chat/completions \
  -H "Authorization: Bearer $PERSLY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "persly-chat-v1",
    "messages": [{"role": "user", "text": "Summarize hypertension treatment."}],
    "include_domains": ["nih.gov", "who.int"]
  }'

Exclude specific domains

curl https://api.persly.ai/v1/chat/completions \
  -H "Authorization: Bearer $PERSLY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "persly-chat-v1",
    "messages": [{"role": "user", "text": "Summarize hypertension treatment."}],
    "exclude_domains": ["wikipedia.org"]
  }'

Streaming

See the Streaming guide for detailed SSE protocol documentation.

curl https://api.persly.ai/v1/chat/completions \
  -H "Authorization: Bearer $PERSLY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "persly-chat-v1",
    "messages": [{"role": "user", "text": "Explain the pathophysiology of asthma."}],
    "stream": true
  }'

Citation prefixes

Every sources[].id — and every inline [...] token inside message — starts with a two-letter prefix that identifies where the source came from. Clients can parse all citations with the regex \[([A-Z]{2,}\d+)\].

| Prefix | Name | Source |
| --- | --- | --- |
| SW | Search Web | Official medical source retrieval from the curated web index (government / clinical / hospital domains). Default retrieval channel. |
| PM | PubMed | PubMed article retrieval. persly-chat-pro-v1 only. |
| PF | PDF File | One page of a user-attached PDF (via messages[].pdf_urls). OCR-extracted markdown. One PF* entry per page, globally sequential across the request. See PDF attachments. |

The numeric suffix (1, 2, …) is a per-request sequence within each prefix and is stable within a single response, but not across requests — treat [SW3] in one response as unrelated to [SW3] in another. Always resolve citations against the same response's sources[] (and the streaming steps[].sources snapshots — see Streaming).
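
To label citations by origin, a client can split each id into its prefix and sequence number. The sketch below assumes only the three prefixes listed in the table; split_source_id is a hypothetical helper, and unknown prefixes are reported as "unknown" rather than rejected, in case other prefixes exist.

```python
import re

PREFIX_NAMES = {"SW": "Search Web", "PM": "PubMed", "PF": "PDF File"}
SOURCE_ID_RE = re.compile(r"([A-Z]{2,})(\d+)")


def split_source_id(source_id):
    """Split an id such as 'PF12' into (prefix, sequence number, prefix name)."""
    match = SOURCE_ID_RE.fullmatch(source_id)
    if match is None:
        raise ValueError(f"not a citation id: {source_id!r}")
    prefix = match.group(1)
    number = int(match.group(2))
    return prefix, number, PREFIX_NAMES.get(prefix, "unknown")


print(split_source_id("PF2"))  # prints ('PF', 2, 'PDF File')
```

Because the sequence number is only meaningful within one response, the result is best used for display and grouping, not as a cache key across requests.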

PDF attachments

When a request includes messages[].pdf_urls, the server:

  1. Fetches each PDF over https with SSRF-hardening (blocks private / loopback / link-local / reserved addresses, enforces https across redirect hops, caps body at 50 MB and the full exchange at 30 s).
  2. Runs OCR page by page and extracts markdown for each page.
  3. Exposes every page as its own citable source with id PFn, where n is a request-wide sequence starting at 1.

Page-level ids

  • Each page of a PDF becomes one sources[] entry.
  • Ids are globally sequential across the entire request, not per-file. A request with two PDFs (3 pages + 2 pages) produces PF1 through PF5.
  • title is "{filename} (p.{N})" so clients can surface a human-readable page label.
  • url is the original PDF URL, shared across all pages of the same PDF. Clients can group pages back into a single PDF by comparing url values.
  • relevance_score is fixed at 1.0 (PDFs are user-authoritative, not retrieval-ranked).
  • PF* entries are prepended to sources[] ahead of any retrieval (SW* / PM*) sources.
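
Since all pages of one PDF share a url, a client can regroup the page-level entries into documents. The following is a minimal sketch with made-up sample data; group_pdf_pages is a hypothetical helper and relies only on the id and url fields described above.

```python
import re

PF_ID_RE = re.compile(r"PF\d+")


def group_pdf_pages(sources):
    """Map each PDF url to the list of its PF ids, in the order they appear."""
    grouped = {}
    for source in sources or []:
        if PF_ID_RE.fullmatch(source["id"]):
            grouped.setdefault(source["url"], []).append(source["id"])
    return grouped


# Two PDFs (3 pages + 2 pages) followed by one retrieval source.
sources = [
    {"id": "PF1", "url": "https://example.com/a.pdf"},
    {"id": "PF2", "url": "https://example.com/a.pdf"},
    {"id": "PF3", "url": "https://example.com/a.pdf"},
    {"id": "PF4", "url": "https://example.com/b.pdf"},
    {"id": "PF5", "url": "https://example.com/b.pdf"},
    {"id": "SW1", "url": "https://diabetesjournals.org/care/"},
]
print(group_pdf_pages(sources))
```

Retrieval sources (SW*, PM*) are skipped, so the result contains only user-attached PDFs.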

LLM behaviour

The OCR'd markdown is passed to the model as text inside the retrieval context — not as multimodal image content. This keeps answers grounded in the actual PDF text and avoids fabricated retrieval citations for PDF-derived claims. The model is expected to cite specific pages, e.g. "fasting glucose 110 mg/dL [PF2]".

Failure modes

PDF handling is all-or-nothing per request: if any page fails at fetch, parse, or OCR, the entire request returns 422 invalid_request with param: "pdf_urls" (see Errors). This prevents partial-content answers where a missing page could mislead a patient.

Errors

| Status | Code | Cause |
| --- | --- | --- |
| 400 | model_not_found | Invalid model name |
| 400 | missing_required_field | Empty messages array |
| 400 | invalid_request | Invalid role or empty messages |
| 400 | too_many_messages | messages array exceeds 200 items |
| 400 | content_too_long | Any messages[].text exceeds 32,000 characters |
| 422 | invalid_request | A pdf_urls entry uses a non-https scheme, resolves to a private/loopback address, fails to fetch within the 30 s / 50 MB limits, cannot be parsed as a PDF, or OCR failed on any page. param is pdf_urls. |
| 422 | content_too_long | Combined image_urls + OCR'd PDF pages exceed the model attachment cap (30 for persly-chat-v1, 60 for persly-chat-pro-v1). param is image_urls or pdf_urls depending on which input caused the overflow. |
| 422 | | Request validation error (missing required fields, wrong type, or both domain filters provided as non-empty arrays) |
| 401 | authentication_error | Missing or invalid Authorization header, or invalid API key (including malformed, unknown, or revoked keys). |
| 402 | insufficient_credits | Credit balance too low |
| 500 | internal_error | Unexpected server-side failure for non-streaming/pre-stream requests (streaming failures surface as in-band SSE error events with the same code) |
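
Because the same code can appear under different statuses (for example content_too_long under both 400 and 422), client-side handling is safest when keyed on the pair. The sketch below is an assumption about client structure: describe_error is a hypothetical helper, the short descriptions paraphrase the table above, and how status and code are read from the error response is left to the caller, since the error body shape is not described in this section.

```python
# (status, code) pairs from the Errors table; None stands for the 422 row without a code.
ERROR_CAUSES = {
    (400, "model_not_found"): "Invalid model name",
    (400, "missing_required_field"): "Empty messages array",
    (400, "invalid_request"): "Invalid role or empty messages",
    (400, "too_many_messages"): "messages array exceeds 200 items",
    (400, "content_too_long"): "A messages[].text value exceeds 32,000 characters",
    (422, "invalid_request"): "A pdf_urls entry could not be fetched, parsed, or OCR'd",
    (422, "content_too_long"): "Images plus PDF pages exceed the model attachment cap",
    (422, None): "Request validation error",
    (401, "authentication_error"): "Missing or invalid credentials",
    (402, "insufficient_credits"): "Credit balance too low",
    (500, "internal_error"): "Unexpected server-side failure",
}


def describe_error(status, code=None):
    """Return a short cause for a (status, code) pair, or a generic fallback."""
    fallback = f"Unrecognized error (HTTP {status}, code {code!r})"
    return ERROR_CAUSES.get((status, code), fallback)


print(describe_error(422, "content_too_long"))
```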
