API v1 — stable

Xelurel AI Documentation

Xelurel AI is a governance layer for AI output. It sits between your AI model and your end users — assessing every output against your policy, enforcing human review where required, and logging every decision immutably for compliance and audit.

You send Xelurel AI your AI's output. Xelurel AI returns a decision: allow, review, or block. Every decision is logged with a unique ID, risk score, triggered rules, and an immutable audit trail.

Your AI model → Generated output → Xelurel AI /assess → allow / review / block → End user
Xelurel AI does not modify, generate, or store your AI output in raw form. It makes a governance decision about content your model already produced, hashes the content for integrity, and records that decision permanently.

Quickstart

Get a governance decision on your first AI output in under 5 minutes.

1
Get your API key
Sign in to the Xelurel AI dashboard and navigate to API Keys. Create a key with the environment set to test for development. Keys are prefixed xel_test_ for sandbox and xel_live_ for production.
2
Make your first request
Use the official SDK (npm install @xelurelai/sdk) or send a raw HTTP POST to /api/v1/assess with your API key in the x-api-key header. See examples below.
3
Read the decision
Check the decision field: allow, review, or block. Route your output accordingly — show it, hold it for human review, or suppress it entirely.
4
Store the decision ID
Persist the decision_id alongside your record. This is your audit receipt — it permanently links your output to the governance decision that governed it.

SDK (recommended)

bash
npm install @xelurelai/sdk
typescript
import { XelurelAI } from '@xelurelai/sdk';

const client = new XelurelAI({
  apiKey: process.env.XELURELAI_API_KEY,
});

const result = await client.assess({
  prompt: "Summarize this patient visit",
  output: "Patient prescribed 500mg amoxicillin twice daily for 7 days.",
  use_case: "medical_note",
});

// result.decision → "allow" | "review" | "block"
// result.decision_id → keep this for your audit trail

cURL

bash
curl -X POST https://api.xelurel.com/v1/assess \
  -H "Content-Type: application/json" \
  -H "x-api-key: xel_test_your_key_here" \
  -d '{
    "prompt": "Summarize this patient visit",
    "output": "Patient prescribed 500mg amoxicillin twice daily for 7 days.",
    "use_case": "medical_note"
  }'

Response

json
{ "decision_id": "9f4e2a1b-3c7d-4e8f-a1b2-c3d4e5f67890", "decision": "review", "risk_score": 40, // 0–100 "risk_score_normalized": 0.4, // 0.0–1.0 "reasons": ["contains medication dosage"], "policy_id": "healthcare_default", "policy_version": "1.0.0" }

Authentication

All API requests require an API key in the x-api-key request header. Keys are tenant-scoped — all activity is isolated to your workspace.

http
POST /api/v1/assess HTTP/1.1
x-api-key: xel_test_abc123...
Content-Type: application/json

Key environments

Prefix | Environment | Behaviour
xel_test_ | sandbox | Full functionality. Decisions logged. Safe for development, staging, and integration testing.
xel_live_ | production | Full functionality. Use for any production or patient-facing workflows.
Never expose your API key client-side. All calls to /api/v1/assess must originate from your backend server — never directly from a browser, mobile app, or public script.

API Key Management

API keys are managed through the Xelurel AI dashboard. Each key is scoped to your tenant and can be independently labelled, rotated, and revoked.

Creating a key

1
Open the dashboard
Navigate to your Xelurel AI dashboard and click API Keys in the top navigation.
2
Click Create key
Choose a label (e.g. "Production backend"), select the environment (test or live), and confirm.
3
Copy your key now
The full key is shown once only at creation time. Copy it immediately and store it in your secrets manager or environment variable store. Xelurel AI only retains the last 4 characters for identification.

Key security practices

Practice | Detail
Use separate keys per environment | Never use a live key in development. Keep test and production keys strictly separated.
Store in environment variables | Never hardcode keys in source code. Use process.env.XELURELAI_API_KEY or your secrets manager.
Rotate on suspected compromise | Revoke the key immediately from the dashboard and create a replacement. Revocation is instant.
Label keys meaningfully | Use labels like prod-backend-v2 so you can identify and revoke specific keys without disruption.
Rate limits are enforced per API key as well as per tenant. Creating multiple keys does not increase your overall rate limit allocation — limits are shared across your tenant.

POST /api/v1/assess

The core endpoint. Send your AI's generated output and receive a governance decision.

POST /api/v1/assess — Assess an AI-generated output

Request body

Field | Type | Required | Description
prompt | string | required | The input sent to your AI model. Max 50,000 characters. Used for semantic overlap checks and logged as a hash.
output | string | required | The AI-generated output to assess. Max 50,000 characters. This is the content that will reach your user if allowed.
use_case | string | optional | Content type for policy and threshold selection. Defaults to general. See Use Case Types.
model | string | optional | The model name that generated the output (e.g. "gpt-4o", "claude-3-5-sonnet"). Logged for audit purposes only.
context | object | optional | Additional metadata. Can include session_id, patient_id, or any key-value pairs. Not stored in raw form.
policy_id | string | optional | Override the auto-selected policy for this assessment. Useful for multi-policy tenants.

Response

Field | Type | Description
decision_id | string (uuid) | Unique identifier for this governance decision. Store this — it is your audit receipt.
tenant_id | string | Your tenant identifier.
decision | string | The governance decision: allow, review, or block.
risk_score | integer (0–100) | Aggregate risk score scaled to 0–100. Suitable for display.
risk_score_normalized | float (0.0–1.0) | Raw risk score. Used for threshold comparisons. Matches policy threshold values directly.
reasons | string[] | Human-readable list of rules that triggered. Show these to reviewers — they explain why the decision was made.
policy_id | string | The policy that governed this assessment.
policy_version | string | The exact policy version applied. Immutably linked to this decision record.
api_key_env | string | test or live, matching the key used.

POST /api/v1/assess/batch

Assess up to 50 AI outputs in a single request. Each item is evaluated independently against your active policy. The batch endpoint returns one decision per item and creates the same immutable audit record as a single /assess call.

POST /api/v1/assess/batch — Batch assess up to 50 outputs

Request body

Field | Type | Required | Description
items | array | required | Array of assessment objects. Max 50 items per request. Each item accepts the same fields as a single /assess request: prompt, output, use_case, model, context, policy_id.

Response

Returns an array of results in the same order as the input items. Each result has the same shape as a single /assess response, plus an index field.

Field | Type | Description
results | array | Array of decision objects, one per input item, in input order.
results[n].index | integer | Zero-based position of this result in the input array.
results[n].decision_id | string (uuid) | Audit receipt for this item. Store on your record.
results[n].decision | string | allow, review, or block.
results[n].risk_score | integer (0–100) | Aggregate risk score for this item.
results[n].reasons | string[] | Rules that triggered for this item.
results[n].policy_version | string | The policy version applied to this item.

Example request

typescript
const res = await fetch('https://api.xelurel.com/v1/assess/batch', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'x-api-key': process.env.XELURELAI_API_KEY,
  },
  body: JSON.stringify({
    items: [
      { prompt: userMessage1, output: aiReply1, use_case: "medical_note" },
      { prompt: userMessage2, output: aiReply2, use_case: "medical_note" },
    ],
  }),
});

const { results } = await res.json();

for (const item of results) {
  // same handling as single /assess
  if (item.decision === 'allow') deliverToUser(item);
  if (item.decision === 'review') queueForReview(item);
  if (item.decision === 'block') suppress(item);
}
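Example response

A representative batch response built from the fields above; values are illustrative:

json
{
  "results": [
    {
      "index": 0,
      "decision_id": "2f6a9c41-8e0b-4d2a-9c31-5b7e8f0a1d22",
      "decision": "allow",
      "risk_score": 10,
      "reasons": [],
      "policy_version": "1.0.0"
    },
    {
      "index": 1,
      "decision_id": "c4d1e7b2-0a53-4f6e-8d29-3a1b5c7d9e40",
      "decision": "review",
      "risk_score": 40,
      "reasons": ["contains medication dosage"],
      "policy_version": "1.0.0"
    }
  ]
}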
Batch assess is available on the Growth plan and above. Each item counts as one assessment against your rate limit and usage quota. The batch endpoint shares the same rate limits as /assess — the per-tenant limit of 120/min applies to the total number of items across all requests.

Decision States

Every assessment returns one of three decisions. Your integration must handle all three — never assume only allow will be returned.

Decision | risk_score_normalized | Meaning | Required action
allow | 0.00 – 0.30 | Output passed all policy rules within acceptable thresholds. | Safe to deliver to end user. Store the decision_id on your record.
review | 0.31 – 0.69 | Output triggered one or more risk rules. Requires a human decision before delivery. | Do not auto-publish. Hold the output and route to your human review queue. Record reviewer action.
block | 0.70 – 1.00 | Output triggered high-weight rules. Risk exceeds automatic review threshold. | Suppress the output entirely. Inform the user that manual input is required. Log the decision_id regardless.
Never auto-publish a review decision. The entire compliance value of Xelurel AI is that review decisions require a human to approve, edit, or reject them. Auto-publishing defeats the audit trail and removes the human accountability the system is designed to prove.

Risk Scoring

The risk score is an aggregate of all rules that triggered during assessment. Each rule in your active policy carries a weight between 0 and 1. When a rule triggers, its weight is added to the running score. The final score is capped at 1.0 before thresholds are applied.

The response returns two representations: risk_score (0–100, integer, for display) and risk_score_normalized (0.0–1.0, float, matches policy threshold values directly).

json
// Example: two rules triggered
// DOSAGE_DETECTED (weight: 0.4) + LOW_SEMANTIC_OVERLAP (weight: 0.3)
{
  "risk_score": 70,              // display value (0–100)
  "risk_score_normalized": 0.7,  // 0.4 + 0.3 = 0.70
  "decision": "block",           // 0.70 exceeds reviewMax threshold
  "reasons": ["contains medication dosage", "output may not relate to prompt"]
}
Risk score thresholds are configurable per tenant through the dashboard policy editor. Changes to thresholds take effect immediately on the next assessment after publishing.

Rate Limits

Xelurel AI enforces two independent rate limits on /api/v1/assess. Both must pass for the request to proceed. When a limit is exceeded, the API returns 429 with a Retry-After header.

Limit type | Limit | Scope
Tenant | 120 / min | All requests across all API keys for your tenant
API key | 60 / min | Requests from a single API key

Handling 429 responses

typescript
async function assessWithRetry(payload, retries = 3) {
  const res = await fetch('https://api.xelurel.com/v1/assess', {
    method: 'POST',
    headers: {
      'x-api-key': process.env.XELURELAI_API_KEY,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(payload),
  });

  if (res.status === 429) {
    if (retries === 0) throw new Error("rate limited");
    const wait = parseInt(res.headers.get('Retry-After') ?? '2') * 1000;
    await new Promise(r => setTimeout(r, wait));
    return assessWithRetry(payload, retries - 1);
  }

  return res.json();
}

JavaScript SDK

The official SDK wraps the assess API with typed responses, automatic error handling, and retry-after support. Zero dependencies — works in Node.js 18+ and modern bundlers.

No framework required. Xelurel AI is a REST API — any backend that can make an HTTP POST request works: Node.js, Python, PHP, Ruby, Cloudflare Workers, Netlify Functions, or plain curl. The SDK is optional convenience. The only firm requirement is that your API key stays server-side and is never exposed in browser code or a public script. See HTML / Serverless Sites if you don't have a traditional backend.

Installation

bash
npm install @xelurelai/sdk

Initialisation

typescript
import { XelurelAI } from '@xelurelai/sdk';

// Instantiate once and reuse across your application
const xel = new XelurelAI({
  apiKey: process.env.XELURELAI_API_KEY, // required
  baseUrl: "https://app.xelurel.com",    // optional — defaults to this
  timeout: 30_000,                       // optional — ms, default 30 000
});

client.assess(params)

Param | Type | Required | Description
prompt | string | required | The input sent to your AI model.
output | string | required | The AI-generated text to evaluate.
use_case | string | optional | Use-case hint for policy routing, e.g. medical_note, legal_draft.
policy_id | string | optional | Override the policy to evaluate against.
model | string | optional | Model name — logged for audit, not used in scoring.

client.assessBatch(items)

Assess up to 50 prompt/output pairs in a single request. Results are returned in the same order as the input. Requires Growth plan or higher.

typescript
const { results } = await xel.assessBatch([
  { prompt: promptA, output: outputA, use_case: "healthcare" },
  { prompt: promptB, output: outputB, use_case: "healthcare" },
]);

for (const r of results) {
  if (XelurelAI.isBlocked(r)) continue;
  deliver(r.decision_id);
}

client.assessStream(stream, params)

Buffer a ReadableStream or Node.js async iterable, then assess the full accumulated output. Use this with streaming LLM responses to avoid calling assess on incomplete text.

typescript
// stream is an OpenAI / Anthropic streaming response
const stream = await openai.chat.completions.create({ model: "gpt-4o", stream: true, messages });

const result = await xel.assessStream(stream, {
  prompt: userMessage,
  use_case: "customer_support",
});

if (XelurelAI.isBlocked(result)) {
  // the fully accumulated output failed assessment; suppress it before delivery
}

Full usage example

typescript
import { XelurelAI, XelurelAIError } from '@xelurelai/sdk';

const xel = new XelurelAI({ apiKey: process.env.XELURELAI_API_KEY });

try {
  const result = await xel.assess({
    prompt: userMessage,
    output: aiResponse,
    use_case: "customer_support",
    model: "gpt-4o",
  });

  if (XelurelAI.isAllowed(result)) {
    return deliverToUser(result);
  }

  if (XelurelAI.needsReview(result)) {
    await queueForReview({
      output: aiResponse,
      decisionId: result.decision_id,
      reasons: result.reasons,
    });
    return { status: "pending", decisionId: result.decision_id };
  }

  // XelurelAI.isBlocked(result) — suppress the output
  return { status: "blocked", reasons: result.reasons };
} catch (err) {
  if (err instanceof XelurelAIError && err.code === "rate_limited") {
    // retry after err.retryAfterMs
  }
  throw err;
}

Static helpers

Method | Returns | Description
XelurelAI.isAllowed(result) | boolean | True when decision === "allow". Safe to deliver output.
XelurelAI.needsReview(result) | boolean | True when decision === "review". Route to human review queue.
XelurelAI.isBlocked(result) | boolean | True when decision === "block". Suppress output entirely.
XelurelAI.verifyWebhook(payload, signature, secret) | boolean | Verify the X-XelurelAI-Signature-256 HMAC on an incoming webhook. Pass the raw request body string — not a parsed object.

Webhook verification

Every webhook Xelurel AI delivers is signed with HMAC-SHA256 using the secret you configure in the dashboard. Verify the signature on every incoming request before processing the payload.

Always pass the raw request body string — not a parsed JSON object. Once you JSON.parse() and re-JSON.stringify() the body, whitespace or key ordering may differ and the signature check will fail. In Next.js App Router use await req.text(). In Express, add express.raw({ type: '*/*' }) before your route and call req.body.toString().
typescript
import { XelurelAI } from '@xelurelai/sdk';

// Next.js App Router
export async function POST(req: Request) {
  const payload = await req.text(); // raw body — do not parse first
  const signature = req.headers.get('x-xelurelai-signature-256') ?? '';

  if (!XelurelAI.verifyWebhook(payload, signature, process.env.XELUREL_WEBHOOK_SECRET!)) {
    return new Response('Invalid signature', { status: 401 });
  }

  const event = JSON.parse(payload); // safe to parse now that signature is verified
  // event.decision, event.decision_id, event.risk_score ...
  return new Response('OK', { status: 200 });
}
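For Express, the same check looks like this, following the express.raw() guidance above (route path and error message are illustrative):

typescript
import express from 'express';
import { XelurelAI } from '@xelurelai/sdk';

const app = express();

// express.raw() must run before the route so verifyWebhook sees the exact bytes sent
app.post('/webhooks/xelurel', express.raw({ type: '*/*' }), (req, res) => {
  const payload = req.body.toString(); // raw body string
  const signature = (req.headers['x-xelurelai-signature-256'] as string) ?? '';

  if (!XelurelAI.verifyWebhook(payload, signature, process.env.XELUREL_WEBHOOK_SECRET!)) {
    return res.status(401).send('Invalid signature');
  }

  const event = JSON.parse(payload); // safe to parse after verification
  res.status(200).send('OK');
});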

XelurelAIError

All API errors throw a XelurelAIError with the following properties:

Property | Type | Description
message | string | Human-readable error description.
status | number or null | HTTP status code, if the request reached the server.
code | string or null | Machine-readable code: rate_limited, timeout, network_error.
retryAfterMs | number or null | Set on rate_limited — milliseconds to wait before retrying.
The SDK is ESM-first (import) with a CommonJS wrapper (require) for legacy Node.js projects. TypeScript types are bundled — no @types/ package needed.
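For CommonJS projects, a minimal sketch of the require form (assuming the CJS wrapper mirrors the ESM exports):

typescript
// CommonJS (legacy Node.js); the ESM examples elsewhere use import
const { XelurelAI } = require('@xelurelai/sdk');

const xel = new XelurelAI({ apiKey: process.env.XELURELAI_API_KEY });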

Node.js / TypeScript

Standard integration pattern for any AI pipeline that produces text output before delivery to a user.

User input → AI model → Generated output → Xelurel AI /assess → allow / review / block → End user

Using the SDK

typescript
import { XelurelAI } from '@xelurelai/sdk';

const xel = new XelurelAI({ apiKey: process.env.XELURELAI_API_KEY });

async function assessOutput(prompt: string, output: string, useCase: string) {
  const result = await xel.assess({ prompt, output, use_case: useCase });

  switch (result.decision) {
    case 'allow':
      return { status: 'ready', output, decision_id: result.decision_id };
    case 'review':
      await queueForReview({ output, decision_id: result.decision_id, reasons: result.reasons });
      return { status: 'pending_review', decision_id: result.decision_id };
    case 'block':
      return { status: 'blocked', decision_id: result.decision_id, reasons: result.reasons };
  }
}

Raw fetch (no SDK)

typescript
async function assessOutput(prompt: string, output: string, useCase: string) {
  const res = await fetch('https://api.xelurel.com/v1/assess', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-api-key': process.env.XELURELAI_API_KEY!,
    },
    body: JSON.stringify({ prompt, output, use_case: useCase }),
  });

  if (!res.ok) throw new Error(`Xelurel AI error ${res.status}`);

  const { decision_id, decision, reasons } = await res.json();
  // handle decision as above...
}

Python

Use the official SDK for typed responses, automatic retries, and built-in webhook verification. Requires Python 3.8+ and one dependency (httpx).

Installation

bash
pip install xelurelai

Sync usage

python
import os
import time

from xelurelai import XelurelAI, XelurelAIError

# Instantiate once and reuse across your application
xel = XelurelAI(api_key=os.environ["XELURELAI_API_KEY"])

try:
    result = xel.assess(
        prompt=user_message,
        output=ai_response,
        use_case="customer_support",
        model="gpt-4o",
    )
    if result.allowed:
        deliver_to_user(result)
    elif result.needs_review:
        queue_for_review(output=ai_response, decision_id=result.decision_id)
    else:  # result.blocked
        suppress(reasons=result.reasons)
except XelurelAIError as e:
    if e.code == "rate_limited":
        time.sleep(e.retry_after_ms / 1000)
    raise

Async usage (FastAPI, asyncio)

python
import os

from xelurelai import AsyncXelurelAI

xel = AsyncXelurelAI(api_key=os.environ["XELURELAI_API_KEY"])

async def assess(prompt: str, output: str) -> dict:
    result = await xel.assess(prompt=prompt, output=output)
    return {"decision": result.decision, "decision_id": result.decision_id}

Webhook verification

Use XelurelAI.verify_webhook() to authenticate incoming webhook deliveries. The same method is available on AsyncXelurelAI for async applications.

Always pass the raw request body — not a parsed dict. In FastAPI use await request.body(). In Flask use request.get_data(). Parsing to a dict and re-serialising will change whitespace or key order and break the HMAC check.
python
import json
import os

from fastapi import FastAPI, HTTPException, Request
from xelurelai import XelurelAI

app = FastAPI()

@app.post("/webhooks/xelurel")
async def xelurel_webhook(request: Request):
    body = await request.body()  # bytes — do not parse first
    sig = request.headers.get("x-xelurelai-signature-256", "")
    if not XelurelAI.verify_webhook(body, sig, os.environ["XELUREL_WEBHOOK_SECRET"]):
        raise HTTPException(status_code=401, detail="Invalid signature")
    event = json.loads(body)  # safe to parse now
    # event["decision"], event["decision_id"], event["risk_score"] ...
    return {"ok": True}
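The equivalent Flask handler, following the request.get_data() guidance above (route path is illustrative):

python
import json
import os

from flask import Flask, abort, request
from xelurelai import XelurelAI

app = Flask(__name__)

@app.post("/webhooks/xelurel")
def xelurel_webhook():
    body = request.get_data()  # raw bytes — do not parse first
    sig = request.headers.get("x-xelurelai-signature-256", "")
    if not XelurelAI.verify_webhook(body, sig, os.environ["XELUREL_WEBHOOK_SECRET"]):
        abort(401)
    event = json.loads(body)  # safe to parse after verification
    return {"ok": True}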

Without the SDK (raw httpx / requests)

python
import os

import requests

def assess_output(prompt: str, output: str, use_case: str = "general") -> dict:
    response = requests.post(
        "https://api.xelurel.com/v1/assess",
        headers={
            "x-api-key": os.environ["XELURELAI_API_KEY"],
            "Content-Type": "application/json",
        },
        json={"prompt": prompt, "output": output, "use_case": use_case},
        timeout=10,
    )
    response.raise_for_status()
    return response.json()

LangChain

Drop XelurelCallbackHandler into any LangChain chain to govern every LLM output. Works transparently with streaming — LangChain accumulates tokens before calling the handler, so no extra setup is needed.

Python — install

bash
pip install "xelurelai[langchain]"

Python — sync usage

python
import os

from langchain_openai import ChatOpenAI
from xelurelai import XelurelAI
from xelurelai.langchain import XelurelBlockedError, XelurelCallbackHandler

client = XelurelAI(api_key=os.environ["XELURELAI_API_KEY"])
handler = XelurelCallbackHandler(client, use_case="customer_support")

llm = ChatOpenAI(model="gpt-4o", callbacks=[handler])

try:
    response = llm.invoke("Help me with my refund.")
except XelurelBlockedError as e:
    # e.result is the full AssessResult
    log_blocked_output(e.result.decision_id, e.result.reasons)

Python — async usage (FastAPI, asyncio)

python
import os

from langchain_openai import ChatOpenAI
from xelurelai import AsyncXelurelAI
from xelurelai.langchain import AsyncXelurelCallbackHandler

client = AsyncXelurelAI(api_key=os.environ["XELURELAI_API_KEY"])
handler = AsyncXelurelCallbackHandler(client, use_case="legal")

llm = ChatOpenAI(model="gpt-4o", callbacks=[handler])

response = await llm.ainvoke("Draft a contract clause.")

JavaScript / TypeScript

typescript
import { ChatOpenAI } from '@langchain/openai';
import { XelurelAI } from '@xelurelai/sdk';
import { XelurelCallbackHandler, XelurelBlockedError } from '@xelurelai/sdk/langchain';

const client = new XelurelAI({ apiKey: process.env.XELURELAI_API_KEY });
const handler = new XelurelCallbackHandler(client, { useCase: "healthcare" });

const llm = new ChatOpenAI({ model: "gpt-4o", callbacks: [handler] });

try {
  const response = await llm.invoke("Summarise patient visit.");
} catch (err) {
  if (err instanceof XelurelBlockedError) {
    // err.result is the full AssessResult
    logBlocked(err.result);
  }
}
Option | Type | Default | Description
use_case | string | — | Use-case tag forwarded to /assess.
policy_id | string | — | Override the policy to evaluate against.
on_blocked | function | — | Called with AssessResult before raising.
on_review | function | — | Called with AssessResult on review decision.
raise_on_block | bool | True | Raise XelurelBlockedError on block. Set to False to handle via on_blocked instead.
fail_open | bool | True | If True, assessment errors are silently ignored — a governance outage never breaks user traffic.
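For example, handling blocks through a callback instead of an exception: a sketch using the options above (the callback body is illustrative):

python
def on_blocked(result):
    # suppress the output but keep the audit receipt
    log_blocked_output(result.decision_id, result.reasons)

handler = XelurelCallbackHandler(
    client,
    use_case="customer_support",
    raise_on_block=False,  # handle blocks via on_blocked instead of XelurelBlockedError
    on_blocked=on_blocked,
    fail_open=True,        # governance outages never break user traffic
)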

LlamaIndex

Attach XelurelCallbackHandler to LlamaIndex's CallbackManager to govern every LLM call in your RAG pipeline — query engines, chat engines, and agents alike.

Install

bash
pip install "xelurelai[llamaindex]"

Global governance — one line

python
import os

from llama_index.core import Settings, VectorStoreIndex
from llama_index.core.callbacks import CallbackManager
from xelurelai import XelurelAI
from xelurelai.llamaindex import XelurelCallbackHandler

client = XelurelAI(api_key=os.environ["XELURELAI_API_KEY"])
handler = XelurelCallbackHandler(client, use_case="legal")

# Every LLM call in this application is now governed
Settings.callback_manager = CallbackManager([handler])

index = VectorStoreIndex.from_documents(docs)
response = index.as_query_engine().query("Summarise the contract risks.")

Per-index governance

python
# Apply governance only to a specific index, not globally
index = VectorStoreIndex.from_documents(
    docs,
    callback_manager=CallbackManager([handler]),
)

engine = index.as_query_engine()
response = engine.query("What are the indemnification terms?")
LlamaIndex integration uses the same XelurelCallbackHandler options as the LangChain handler (on_blocked, on_review, raise_on_block, fail_open). An AsyncXelurelCallbackHandler is not available for LlamaIndex — use the sync client.

PHP

Use PHP's built-in cURL extension to call /api/v1/assess from any PHP backend — WordPress, Laravel, Symfony, or a plain script.

php
<?php

function assess_output(string $prompt, string $output, string $use_case = 'general'): array
{
    $payload = json_encode([
        'prompt'   => $prompt,
        'output'   => $output,
        'use_case' => $use_case,
    ]);

    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_URL            => 'https://api.xelurel.com/v1/assess',
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => $payload,
        CURLOPT_TIMEOUT        => 10,
        CURLOPT_HTTPHEADER     => [
            'Content-Type: application/json',
            'x-api-key: ' . getenv('XELURELAI_API_KEY'),
        ],
    ]);

    $body   = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);

    if ($status !== 200) {
        throw new \RuntimeException('Xelurel AI error: ' . $status);
    }

    return json_decode($body, true);
}

// Usage
$result = assess_output(
    prompt: $user_message,
    output: $ai_response,
    use_case: 'medical_note'
);

switch ($result['decision']) {
    case 'allow':
        deliver_to_user($result);
        break;
    case 'review':
        queue_for_review($result);
        break;
    case 'block':
        suppress_output($result);
        break;
}
Store the API key in your environment ($_ENV['XELURELAI_API_KEY'] or getenv('XELURELAI_API_KEY')), not hardcoded in source. In Laravel, add it to your .env file and access it via env('XELURELAI_API_KEY').

HTML / Serverless Sites

If your frontend is plain HTML or a static site (no server), you still need a server-side function to hold your API key. The pattern is: your page calls your own endpoint, your endpoint calls Xelurel AI, and the decision comes back. A single serverless function is all it takes.

HTML page → Your serverless fn → Xelurel AI /assess → Decision back to page
Never call /assess directly from browser JavaScript. Your API key would be visible to anyone who opens DevTools. Always proxy through a server-side function you control.

Cloudflare Worker

javascript
// wrangler.toml: set XELURELAI_API_KEY as an environment secret
// npx wrangler secret put XELURELAI_API_KEY
export default {
  async fetch(request, env) {
    if (request.method === 'OPTIONS') {
      return new Response(null, {
        headers: {
          'Access-Control-Allow-Origin': '*',
          'Access-Control-Allow-Headers': 'Content-Type',
        },
      });
    }

    const { prompt, output, use_case } = await request.json();

    const res = await fetch('https://api.xelurel.com/v1/assess', {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'x-api-key': env.XELURELAI_API_KEY,
      },
      body: JSON.stringify({ prompt, output, use_case }),
    });

    const data = await res.json();
    return Response.json(data, {
      headers: { 'Access-Control-Allow-Origin': '*' },
    });
  }
}

Calling from your HTML page

html
<script>
  async function checkOutput(prompt, output) {
    const res = await fetch('https://your-worker.workers.dev/assess', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ prompt, output, use_case: 'general' }),
    });

    const { decision, decision_id, reasons } = await res.json();

    if (decision === 'allow') showOutput(output);
    if (decision === 'review') showPendingMessage();
    if (decision === 'block') showBlockedMessage(reasons);
  }
</script>

Other serverless options

Platform | Where your key lives | Docs
Cloudflare Workers | Wrangler secrets — npx wrangler secret put XELURELAI_API_KEY | workers.cloudflare.com
Netlify Functions | netlify.toml env or Netlify UI environment variables | docs.netlify.com/functions
Vercel Serverless | Vercel project environment variables — process.env.XELURELAI_API_KEY | vercel.com/docs/environment-variables
AWS Lambda | Lambda environment variables or Secrets Manager | docs.aws.amazon.com/lambda
If your backend already calls OpenAI or Anthropic, the fastest path is proxy mode — swap the provider baseURL for the Xelurel AI proxy endpoint. No new endpoint needed, no frontend changes. See Proxy Mode.

Proxy Mode

Proxy mode lets you add Xelurel AI governance to an existing OpenAI or Anthropic integration with a single line change. Instead of calling the provider directly, you point your SDK at the Xelurel AI proxy URL. The proxy forwards your request to the provider, runs governance on the response, and returns the result — all in the same call. No second API call. No restructured code.

How it works

Your app → Xelurel AI proxy → OpenAI / Anthropic → Governance assessment → Response + headers

OpenAI SDK — one line change

typescript
// Before: direct OpenAI call
const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

// After: route through Xelurel AI proxy — one baseURL change
const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: "https://api.xelurel.com/v1/proxy/openai", // ← only change
  defaultHeaders: { "x-api-key": process.env.XELURELAI_API_KEY },
});

// Your existing calls are unchanged — text outputs and tool calls are both governed
const completion = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: userMessage }],
});

Anthropic SDK

typescript
const client = new Anthropic({
  apiKey: "placeholder", // real key goes in x-upstream-key below
  baseURL: "https://api.xelurel.com/v1/proxy/anthropic",
  defaultHeaders: {
    "x-api-key": process.env.XELURELAI_API_KEY,
    "x-upstream-key": process.env.ANTHROPIC_API_KEY,
  },
});

Tool call assessment

The proxy assesses function/tool call arguments alongside text content. If a model's tool call includes PII, sensitive data, or an injection pattern in its arguments, the same policy rules that catch text violations will catch it — no extra configuration needed.

When a response includes tool calls, the header X-XelurelAI-Tool-Calls-Assessed: true is set on the response so you can confirm governance ran on the structured output.

Governance response headers

Every proxied response includes Xelurel AI governance headers so you can act on the decision without parsing a separate payload:

Header | Value | Description
X-XelurelAI-Decision | allow, review, or block | The governance decision for this response.
X-XelurelAI-Risk-Score | 0–100 | Aggregate risk score (integer).
X-XelurelAI-Decision-Id | uuid | The audit receipt. Store this on your record.
X-XelurelAI-Decision-Source | string | How the decision was reached: deterministic, semantic_rule, judge, or score_fallback.
X-XelurelAI-Judge-Risk-Score | 0–100 | Judge risk score when the LLM judge contributed to the decision.
X-XelurelAI-Tool-Calls-Assessed | true | Present when the response contained tool calls. Confirms function arguments were assessed.
Proxy mode does not auto-block. The proxy always returns 200 with the provider's response body intact — even when the governance decision is block. The X-XelurelAI-Decision header tells you how to handle it. Your application must read that header and suppress or hold the response before delivering it to your user. The audit record is created regardless of your handling.
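The official provider SDKs don't surface response headers by default, so here is a minimal sketch of reading the decision with raw fetch, assuming the proxy mirrors the OpenAI chat completions path under the baseURL shown above and accepts the provider key via the standard Authorization header:

typescript
const res = await fetch('https://api.xelurel.com/v1/proxy/openai/chat/completions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'x-api-key': process.env.XELURELAI_API_KEY!,
    'authorization': `Bearer ${process.env.OPENAI_API_KEY}`,
  },
  body: JSON.stringify({
    model: 'gpt-4o',
    messages: [{ role: 'user', content: userMessage }],
  }),
});

const decision = res.headers.get('x-xelurelai-decision');      // allow | review | block
const decisionId = res.headers.get('x-xelurelai-decision-id'); // audit receipt, store it
const completion = await res.json(); // provider response body, returned intact

if (decision !== 'allow') {
  // hold (review) or suppress (block) before anything reaches the user
}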

Self-hosted / Ollama / vLLM

Enterprise teams running Llama, Mistral, Qwen, or other open-source models via Ollama, vLLM, or LiteLLM can route those models through Xelurel AI governance using the x-upstream-base-url header. The proxy forwards your request to your self-hosted endpoint instead of OpenAI, then applies your policy to the response.

1. Add your model URL to the allowlist

Before the proxy will accept a custom upstream URL, you must add it to your tenant's allowed list. Do this once in your dashboard under Settings → Upstream URLs, or via the API:

bash
curl -X POST https://api.xelurel.com/api/admin/upstream-urls \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_SESSION_TOKEN" \
  -d '{"tenantId":"YOUR_TENANT_ID","url":"https://ollama.internal.company.com"}'

2. Point the proxy at your model

Ollama exposes an OpenAI-compatible API — use the OpenAI proxy endpoint with x-upstream-base-url:

typescript
const client = new OpenAI({
  apiKey: "not-required", // Ollama has no API key — pass anything
  baseURL: "https://api.xelurel.com/v1/proxy/openai",
  defaultHeaders: {
    "x-api-key": process.env.XELURELAI_API_KEY,
    "x-upstream-base-url": "https://ollama.internal.company.com",
  },
});

// All calls to llama3.2, mistral, qwen, etc. are now governed by your policy
const completion = await client.chat.completions.create({
  model: "llama3.2",
  messages: [{ role: "user", content: userMessage }],
});

vLLM or LiteLLM with an API key

typescript
const client = new OpenAI({
  apiKey: "placeholder",
  baseURL: "https://api.xelurel.com/v1/proxy/openai",
  defaultHeaders: {
    "x-api-key": process.env.XELURELAI_API_KEY,
    "authorization": `Bearer ${process.env.VLLM_API_KEY}`, // your vLLM/LiteLLM key
    "x-upstream-base-url": "https://vllm.prod.internal:8000",
  },
});
Header | Required | Description
x-upstream-base-url | optional | The base URL of your self-hosted model. Must be pre-registered in Dashboard → Settings → Upstream URLs. HTTPS required for non-localhost.
authorization | optional | Your model's API key (if it requires one). Omit entirely for Ollama and other unauthenticated deployments.
x-upstream-base-url is validated against your tenant's pre-registered allowlist on every request. Unregistered URLs are rejected with a 403. This prevents SSRF and ensures governance is never silently bypassed by routing to an unexpected endpoint.

Handling Decisions in Your UI

What your user-facing interface should do for each decision state.

allow — deliver the output

tsx
// Output passed governance — present normally
if (decision === 'allow') {
  return <OutputViewer output={output} governanceId={decision_id} />;
}

review — hold and route to reviewer

tsx
// Output flagged — must not be auto-published
if (decision === 'review') {
  return (
    <ReviewBanner
      reasons={reasons} // show why it was flagged
      output={output}
      onApprove={() => submitOutput(output, decision_id, 'approved')}
      onEdit={() => openEditor(output, decision_id)}
      onReject={() => discardOutput(decision_id)}
    />
  );
}

block — suppress and prompt manual input

tsx
// Output blocked — do not show AI content to user
if (decision === 'block') {
  return (
    <BlockedNotice
      message="This output could not be automatically generated safely."
      reasons={reasons}
      onManualEntry={() => openManualEditor()}
      governanceId={decision_id} // log even blocked attempts
    />
  );
}
Always store decision_id on your underlying record — even for allow decisions. This links every output to the governance decision that permitted it. Your audit trail requires this.
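If you persist outputs in a database, the write might look like this — a minimal sketch in which db, messages, and the column names are hypothetical stand-ins for your own schema:

typescript
// Hypothetical schema: db.messages, status, and governanceDecisionId are illustrative
async function saveGovernanceReceipt(
  messageId: string,
  result: { decision: string; decision_id: string },
) {
  await db.messages.update({
    where: { id: messageId },
    data: {
      status: result.decision,                  // 'allow' | 'review' | 'block'
      governanceDecisionId: result.decision_id, // the audit receipt
    },
  });
}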

Use Case Types

The use_case field tells Xelurel AI what kind of content it is assessing. Different use cases apply different policy thresholds — clinical content uses stricter thresholds than general content.

use_case value | Policy applied | Notes
medical_note | healthcare_default | SOAP notes, visit summaries, clinical documentation. Strictest thresholds — dosage, diagnosis language, allergy flags all apply.
discharge_summary | healthcare_default | Discharge documentation. Same policy as medical_note.
patient_instructions | healthcare_default | Patient-facing instructions. Dosage and allergy rules apply.
legal_draft | law_default | Legal documents and contract analysis. Flags definitive advice language and privileged terms.
legal_summary | law_default | Summarised legal content. Same policy as legal_draft.
financial_advice | finance_default | Investment, tax, or financial planning output. Flags unqualified advice language and missing disclaimers.
financial_summary | finance_default | Summarised financial content. Same policy as financial_advice.
customer_support | customer_support_default | Customer-facing support responses. Injection detection active — guards against adversarial prompts. PII check prevents leaking customer data in replies.
general | general_default | Default fallback. Standard thresholds. Appropriate when no specific use case applies.
You can define additional use cases and per-use-case threshold overrides through your policy configuration in the dashboard — no code changes needed.

How Policies Work

A policy is a versioned set of rules and thresholds that governs how AI outputs are assessed. Every tenant has their own isolated policies. Every assessment records the exact policy version that was active — the audit trail is immutable even when policies change.

Policies use semantic versioning (1.0.0, 1.0.1, etc.). You work in a draft version that you can edit freely, then publish it as a new immutable version. Past decisions remain permanently linked to the version that governed them.

Policy lifecycle

1
Edit the draft
Navigate to Policies in your dashboard. All editing happens in the draft version — changes do not affect live assessments until you publish.
2
Test your changes
Use your test API key to run assessments and verify the draft rules behave as expected before promoting to production.
3
Publish
Clicking Publish creates a new immutable version and immediately activates it for all future assessments under that policy.
4
Rollback if needed
If a published version causes unexpected behaviour, you can roll back to any previous version from the dashboard. Past decisions remain linked to the version that governed them.
Policy changes take effect immediately after publishing. There is no cache delay — every assessment fetches the active policy version directly.

Rule Types

Each policy contains an array of rules. When a rule triggers, its weight is added to the risk score. Rules are evaluated in order with an early exit once the block threshold is exceeded.
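As a sketch, the evaluation loop described above looks roughly like this (illustrative TypeScript, not the actual engine; ruleTriggers stands in for the per-type matchers listed below):

typescript
interface Rule {
  id: string;
  weight: number;    // 0–1, added to the score when the rule triggers
  reason: string;
  action?: 'block';  // optional zero-tolerance override (see the action field below)
}

interface Thresholds { allowMax: number; reviewMax: number }

// Stand-in for the per-type matchers (regex, contains_any, pii_check, ...)
declare function ruleTriggers(rule: Rule, prompt: string, output: string): boolean;

function evaluate(rules: Rule[], t: Thresholds, prompt: string, output: string) {
  let score = 0;
  const reasons: string[] = [];

  for (const rule of rules) {
    if (!ruleTriggers(rule, prompt, output)) continue;
    reasons.push(rule.reason);

    // action: "block" forces a block regardless of the accumulated score
    if (rule.action === 'block') return { decision: 'block', score, reasons };

    score = Math.min(1, score + rule.weight); // weights accumulate, capped at 1.0
    if (score > t.reviewMax) break;           // early exit once the block threshold is exceeded
  }

  const decision =
    score <= t.allowMax ? 'allow' :
    score <= t.reviewMax ? 'review' : 'block';

  return { decision, score, reasons };
}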

regex
Tests a regular expression against the target text. Use for pattern detection — medication dosages, critical values, specific terminology. Max 300 character pattern.
contains_any
Triggers if the target contains any of the provided strings. Case-insensitive substring match. Efficient for keyword lists.
length_lt
Triggers if the target text is shorter than min characters. Catches suspiciously short or empty outputs that may indicate a model failure.
token_overlap_lt
Triggers if the semantic token overlap between prompt and output is below minOverlap. Catches outputs that do not appear to address their prompt.
semantic_similarity
Triggers if the semantic similarity score between prompt and output is below minSimilarity (0.0–1.0). Embedding-based — more accurate than token overlap for paraphrased content.
pii_check
Detects personally identifiable information in the target text. Optional piiTypes array scopes detection (e.g. ["email","ssn","phone"]). Optional minConfidence ("low" | "medium" | "high") sets the detection threshold.
injection_check
Detects prompt injection and jailbreak attempts. Optional categories array scopes to specific attack patterns. Optional minSeverity string filters by severity level.
llm_classifier
Routes the text to a secondary LLM classifier with a yes/no question. Triggers if the classifier answers yes with confidence above threshold (default 0.7). Use for nuanced semantic checks regex cannot express.
llm_judge
Routes both prompt and output to an LLM judge that evaluates for specified categories (e.g. ["hallucination","harmful_advice"]). Triggers if any category score exceeds threshold. Highest fidelity — best for high-stakes content.

Rule targets

target | Description
output | Evaluate the rule against the AI-generated output only. Most rules should target output.
prompt | Evaluate against the input prompt only. Useful for detecting sensitive query patterns.
prompt_output | Evaluate against the concatenation of prompt and output. Used for overlap and relevance checks.

Example rule — regex

json
{ "id": "DOSAGE_DETECTED", "type": "regex", "target": "output", // "prompt" | "output" | "prompt_output" "pattern": "\\b\\d+(\\.\\d+)?\\s*(mg|ml|mcg|units|tablets?)\\b", "flags": "i", // case-insensitive "weight": 0.4, // added to risk_score_normalized if triggered "reason": "contains medication dosage" }

Example rule — contains_any

json
{ "id": "ALLERGY_MENTION", "type": "contains_any", "target": "output", "any": ["allerg", "anaphylax", "epipen"], "weight": 0.3, "reason": "contains allergy reference requiring review" }

Example rule — llm_classifier

json
{ "id": "CONTAINS_DIAGNOSIS", "type": "llm_classifier", "target": "output", "question": "Does this text state or imply a specific medical diagnosis?", "threshold": 0.75, // trigger if classifier confidence ≥ 0.75 "weight": 0.5, "reason": "output contains implied diagnosis" }

Example rule — llm_judge

json
{ "id": "HALLUCINATION_CHECK", "type": "llm_judge", "target": "prompt_output", // judge evaluates both together "categories": ["hallucination", "unsupported_claim"], "threshold": 0.7, // trigger if any category score ≥ 0.7 "weight": 0.6, "action": "block", // force block regardless of threshold total "reason": "output contains potential hallucination" }

Optional action field

Any rule can include action: "block" to force a block decision immediately when that rule triggers, regardless of the accumulated risk score and threshold values. Use this for zero-tolerance rules where any match should never reach a user — for example, confirmed PII leakage or a hallucination detected by an LLM judge. If omitted, the rule contributes its weight to the score and the threshold determines the final decision.

Regex rules must not contain nested quantifiers (e.g. (a+)+) to prevent ReDoS attacks. Patterns over 300 characters are rejected. Both checks are enforced at policy save time.

Thresholds

Thresholds define the risk_score_normalized (0.0–1.0) boundaries for each decision. You can set policy-wide thresholds and override them for specific use cases.

json
// Policy-level thresholds (defaults)
"thresholds": {
  "allowMax": 0.30, // 0.00 – 0.30 → allow
  "reviewMax": 0.69 // 0.31 – 0.69 → review | 0.70+ → block
},

// Per use_case override — stricter for medical_note content
"useCaseOverrides": {
  "medical_note": {
    "thresholds": {
      "allowMax": 0.19, // tighter allow band
      "reviewMax": 0.59 // lower block threshold for clinical content
    }
  }
}

Dashboard Overview

The Xelurel AI dashboard is your operational interface for governance. It provides real-time visibility into all AI decisions made through your tenant, a policy editor, analytics, and audit export.

Access the dashboard at /dashboard after signing in with your Xelurel AI account.

Decisions
Real-time log of every assessment. Filter by decision type, date range, or API key. Click any decision to see full details including the reasons, risk score, triggered rules, and the complete audit event log.
Policies
View, edit, publish, and roll back your governance policies. All edits happen in a safe draft version. Publishing creates an immutable version immediately applied to all new assessments.
Analytics
Aggregated stats over configurable time windows. Decision distribution (allow / review / block), top triggered rules, violation reasons, and risk score trends over time.

Decision Log

The decision log is the real-time record of every assessment made through your tenant. Each row represents one call to /api/v1/assess.

What you can see per decision

Field | Description
Decision ID | UUID. The audit receipt stored on your system records.
Timestamp | Server-side ISO timestamp. Tamper-evident.
Decision | allow / review / block — the governance outcome.
Risk score | Aggregate score at time of assessment (0–100).
Reasons | Human-readable list of rules that triggered and why.
Rules triggered | Exact rule IDs from your active policy that fired.
Policy / version | The exact policy version that governed this decision.
API key | Last 4 of the key used — environment (test / live) and label.
Use case | The use_case value sent in the request.
Model | The client model reported in the request, if provided.
Review status | Whether a human reviewer has acted on this decision.
Audit events | Full chronological log of every action taken on this decision.

Review Actions

Decisions with status review can receive human review actions from your team directly in the dashboard. All review actions are appended to the immutable audit log with the reviewer's identity and timestamp.

Action | Resulting status | When to use
Approve | allow | Reviewer has examined the output and determined it is safe to deliver. Output can now be sent to the end user.
Reject | block | Reviewer has determined the output should not be delivered. Adds reviewer identity and reason to the audit trail.
Send for review | review | Escalate to another team member. Decision remains in review status with an audit note.
Every review action records the reviewer's user ID, email, timestamp, and any note they add. This is the human accountability chain that makes Xelurel AI audit-ready for enterprise and regulatory use cases.

Policies in the Dashboard

The Policies tab lets you view and edit your governance configuration without writing code.

Policy list

Your tenant starts with pre-seeded policies based on the industry you selected at registration:

Policy ID | Industry | Default rules included
general_default | All | OUTPUT_TOO_SHORT, LOW_SEMANTIC_OVERLAP, PII_CHECK
healthcare_default | Healthcare | DOSAGE_DETECTED, ALLERGY_MENTION, OUTPUT_TOO_SHORT, LOW_SEMANTIC_OVERLAP, PII_CHECK
law_default | Legal | DEFINITIVE_LEGAL_ADVICE, PRIVILEGED_TERMS, OUTPUT_TOO_SHORT, LOW_SEMANTIC_OVERLAP
finance_default | Finance | FINANCIAL_ADVICE_DISCLAIMER, OUTPUT_TOO_SHORT, LOW_SEMANTIC_OVERLAP, PII_CHECK
customer_support_default | Customer Support | OUTPUT_TOO_SHORT, LOW_SEMANTIC_OVERLAP, INJECTION_CHECK, PII_CHECK

Editing a policy

Select a policy and click Edit draft. You can add, remove, and reorder rules, adjust weights and thresholds, and add use-case overrides. All changes are saved to the draft version only — live assessments continue using the published version until you explicitly publish.

Published policy versions are immutable. Once a version is published, it cannot be modified — only superseded by a new published version. This is by design: every past decision must remain permanently linked to the exact policy that governed it.

Analytics

The Analytics tab provides aggregated governance metrics over a configurable time window (default: 7 days).

Metric | Description
Total decisions | All assessments in the selected window.
Decision split | Count and percentage of allow / review / block decisions.
Flagged rate | Percentage of decisions that were review or block — a proxy for output risk rate.
Top rules | The rules that triggered most frequently — useful for tuning policy weights.
Top reasons | Most common human-readable violation reasons, sorted by frequency.
Risk trend | Daily risk score trend over the selected window. Useful for detecting model drift or prompt changes.

Audit Export

Export your complete decision log as CSV or JSON for compliance reporting, external audit, or integration with your SIEM or data warehouse. Export is available from the dashboard and via the API.

GET /api/admin/audit/export — Export decision log

Query parameters

Parameter | Type | Required | Description
tenantId | string | required | Your tenant ID.
format | string | optional | csv (default) or json.
limit | integer | optional | Max records to return. Default: 2000. Max: 10,000.
fromIso | string | optional | ISO 8601 start date filter (inclusive).
toIso | string | optional | ISO 8601 end date filter (inclusive).
decisionId | string | optional | Export a single decision by ID. Ignores limit/date filters.
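For example, pulling a week of decisions as JSON, assuming the same x-api-key authentication shown for the other admin endpoint (values are placeholders):

bash
curl "https://api.xelurel.com/api/admin/audit/export?tenantId=YOUR_TENANT_ID&format=json&fromIso=2025-01-01T00:00:00Z&toIso=2025-01-08T00:00:00Z" \
  -H "x-api-key: YOUR_SESSION_TOKEN"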

CSV fields exported

Every exported row includes: decision_id, timestamp, use_case, model_used, api_key_env, policy_id, policy_version, decision, reviewed_decision, review_status, reviewed_by, reviewed_at_iso, review_note, risk_score, risk_score_normalized, rules_triggered, reasons, prompt_hash, output_hash, audit_events_count, audit_log.

For enterprise procurement or regulatory audits, export the full JSON format. It includes the complete audit event log per decision — every assessment, review action, reviewer identity, and timestamp, in a structured format that can be ingested directly into audit tooling.

Request Schema

typescript
interface AssessRequest {
  prompt: string;                    // required — max 50,000 chars
  output: string;                    // required — max 50,000 chars
  use_case?: string;                 // default: "general"
  model?: string;                    // model name — logged for audit
  context?: Record<string, unknown>; // optional metadata
  policy_id?: string;                // override policy selection
}

Response Schema

typescript
interface AssessResponse {
  decision_id: string;               // uuid — store this on your record
  tenant_id: string;
  decision: 'allow' | 'review' | 'block';
  risk_score: number;                // 0–100 integer — for display
  risk_score_normalized: number;     // 0.0–1.0 float — matches threshold values
  reasons: string[];                 // human-readable rule triggers
  policy_id: string;
  policy_version: string;
  api_key_id: string | null;
  api_key_env: 'test' | 'live' | null;
  api_key_last4: string | null;
}

Errors

All error responses return JSON with an error field describing the problem.

Status | Error | Cause
400 | prompt and output are required | Request body missing prompt or output.
400 | prompt and output must be strings | Non-string value passed for prompt or output.
400 | prompt and output must each be under 50000 characters | Input exceeds the maximum length limit.
401 | missing api key | x-api-key header absent.
401 | invalid api key | Key not found, revoked, or inactive.
429 | rate_limited | Limit exceeded. Respect the Retry-After response header (seconds).
500 | internal server error | Unexpected server error. Retry with exponential backoff.

Error response shape

json
{ "error": "prompt and output are required" }

Rate limit response shape

json
{ "error": "rate_limited", "retryAfterMs": 15000 // milliseconds until your window resets }

Audit Trail

Every call to /api/v1/assess creates an immutable record in Xelurel AI's decision log. Records cannot be modified or deleted. The prompt and output are never stored in raw form — they are stored as HMAC-SHA256 hashes, scoped to your tenant, allowing cryptographic verification without retaining potentially sensitive content.

What is logged per decision

Field | Description
id | UUID. Your audit receipt — store this on every record your system produces.
tenantId | Your tenant. All decisions are fully isolated per tenant.
promptHash | Tenant-scoped HMAC-SHA256 of the prompt. Proves what context was provided without storing PHI.
outputHash | Tenant-scoped HMAC-SHA256 of the output. Proves exactly what was assessed.
hashVersion | Hash scheme version, for forward compatibility.
decision | allow / review / block — the governance outcome.
riskScore | The aggregate score (0–100) at time of assessment.
riskScoreNormalized | The raw score (0.0–1.0) used for threshold comparison.
reasons | Human-readable rule triggers at time of assessment.
rulesTriggered | Rule IDs from your policy that fired.
policyId + policyVersion | Exact policy snapshot that governed this decision. Permanently linked.
engineVersion | Assessment engine used (policy_v1 or assess_v1 fallback).
apiKeyId / apiKeyLast4 | Which key was used. Last 4 characters only — the full key is never stored.
clientModel | Model name reported by the caller, if provided.
createdAt | Server-side timestamp. Set by Firestore — cannot be spoofed by the caller.
reviewStatus | Current review state: null, approved, rejected, or sent_for_review.
reviewedBy / reviewedByEmail | Identity of the reviewer who acted on this decision.
reviewedAtIso | ISO timestamp of the review action.
reviewNote | Optional note the reviewer attached to their action.
auditLog | Append-only array of every event on this decision — assessment, review actions, escalations. Capped at 200 entries.
When an auditor, hospital procurement team, or regulator asks "how do you ensure AI outputs don't reach users without oversight?" — you open the Xelurel AI dashboard or export the audit log and show them this record. Every output. Every decision. Every reviewer. Timestamped and immutable.