API Reference — Corenet AI

Authentication

All API requests must include an Authorization header with a bearer token. Tokens are scoped to your organization and issued during onboarding. There are no per-user tokens; credentials are managed at the org level.

Authorization: Bearer cnai-org_<your_token>

Tokens beginning with cnai-org_ have full API access within your contractual quota. Tokens are rotated on a 90-day cycle. Rotation notices are sent to the registered org contact 14 days in advance.

Base URL

All endpoints are served from a single base URL assigned to your organization at provisioning time:

https://<org-handle>.api.corenet.ai/v1

The <org-handle> prefix is unique to your organization and listed in your onboarding document. Do not use the root domain directly — requests without an org handle will return 403 Forbidden.
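As a minimal sketch of how a client might assemble endpoint URLs from the org handle: the helper names `base_url` and `endpoint` are hypothetical, and the handle validation assumes handles follow DNS label rules, which this reference does not state.

```python
import re

# Assumption: org handles look like DNS labels (letters, digits, inner hyphens).
# The reference does not define the allowed characters.
_HANDLE_PATTERN = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def base_url(org_handle: str) -> str:
    """Return the per-organization base URL, including the /v1 version path."""
    if not _HANDLE_PATTERN.fullmatch(org_handle):
        raise ValueError(f"unexpected org handle: {org_handle!r}")
    return f"https://{org_handle}.api.corenet.ai/v1"


def endpoint(org_handle: str, path: str) -> str:
    """Join a path relative to the base URL, e.g. 'chat/completions'."""
    return f"{base_url(org_handle)}/{path.lstrip('/')}"


# "example-org" is a placeholder handle, not a real organization.
print(endpoint("example-org", "chat/completions"))
```

Rejecting an empty handle up front keeps a misconfigured client from falling back to the root domain, which would be refused with 403 Forbidden.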

Versioning

The current API version is v1. The version is specified in the URL path. Breaking changes will be introduced under a new version path, and the previous version will remain available for a deprecation window of at least 90 days.

Chat Completions

Generates a model response for the given conversation history.

POST /v1/chat/completions

Minimal request example

curl https://<org-handle>.api.corenet.ai/v1/chat/completions \
  -H "Authorization: Bearer cnai-org_<token>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "corenet-1",
    "messages": [
      { "role": "user", "content": "Summarize the attached report." }
    ]
  }'
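The same request can be sketched in Python using only the standard library. The function name `build_chat_request`, the handle `example-org`, and the token `cnai-org_example` are placeholders for illustration; the request is built but not sent.

```python
import json
import urllib.request


def build_chat_request(org_handle: str, token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request to the chat completions endpoint."""
    url = f"https://{org_handle}.api.corenet.ai/v1/chat/completions"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = {
    "model": "corenet-1",
    "messages": [{"role": "user", "content": "Summarize the attached report."}],
}
request = build_chat_request("example-org", "cnai-org_example", payload)

# Sending is left commented out because the handle and token are placeholders:
# with urllib.request.urlopen(request, timeout=30) as response:
#     print(json.load(response))
```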

Request Body

Parameter	Type	Description
model required	string	Model identifier. See Models for available values.
messages required	array	Array of message objects forming the conversation. Each object must have `role` and `content` fields.
temperature optional	number	Sampling temperature between `0` and `2`. Higher values produce more varied output. Default: `1`.
top_p optional	number	Nucleus sampling cutoff. Adjust either `temperature` or `top_p`, not both. Default: `1`.
max_tokens optional	integer	Upper bound on the number of tokens generated; the response may be shorter. Default is model-dependent. Must not exceed the model's context window.
stream optional	boolean	If `true`, partial message deltas are sent as server-sent events. See Streaming. Default: `false`.
stop optional	string \| array	Up to 4 sequences where generation stops. The stop sequence itself is not included in the output.
n optional	integer	Number of completion choices to generate. Values above `1` are counted against your quota proportionally. Default: `1`.
presence_penalty optional	number	Penalty between `-2.0` and `2.0` applied to tokens based on whether they have already appeared. Positive values reduce repetition. Default: `0`.
frequency_penalty optional	number	Penalty between `-2.0` and `2.0` applied proportional to token frequency in the output so far. Default: `0`.
user optional	string	Caller-supplied identifier for the end user. Used in audit logs. Has no effect on inference. Max 64 characters.
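The documented limits in the table above can be checked on the client before a request is sent. This is a sketch only: `validate_request_params` is a hypothetical helper, it covers just the limits stated in the table, and the server remains the authority on what is accepted.

```python
def validate_request_params(params: dict) -> list:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []

    def check_range(name: str, low: float, high: float) -> None:
        value = params.get(name)
        if value is None:
            return
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or not low <= value <= high:
            problems.append(f"{name} must be a number between {low} and {high}")

    check_range("temperature", 0, 2)
    check_range("presence_penalty", -2.0, 2.0)
    check_range("frequency_penalty", -2.0, 2.0)

    stop = params.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        problems.append("stop accepts at most 4 sequences")

    user = params.get("user")
    if isinstance(user, str) and len(user) > 64:
        problems.append("user must be at most 64 characters")

    return problems


print(validate_request_params({"temperature": 2.5, "stop": ["a", "b", "c", "d", "e"]}))
```

Collecting all problems in one pass, rather than raising on the first, lets a caller report every issue before spending quota on a request that would return 400.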

Message object

Each entry in the messages array must conform to the following structure:

Field	Type	Description
role required	string	One of `system`, `user`, or `assistant`. The `system` role may only appear as the first message.
content required	string	Text content of the message. May be an empty string for assistant messages if `tool_calls` are present (not yet supported in v1).
name optional	string	An optional label for the participant. Included in the context as-is. No semantic effect on inference.
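A client-side check of the message rules above might look like the following sketch. `validate_messages` is a hypothetical helper; it enforces only the constraints stated in the table (allowed roles, string content, `system` only in first position).

```python
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_messages(messages: list) -> None:
    """Raise ValueError if a message breaks one of the documented rules."""
    for index, message in enumerate(messages):
        role = message.get("role")
        if role not in ALLOWED_ROLES:
            raise ValueError(f"messages[{index}]: unknown role {role!r}")
        if not isinstance(message.get("content"), str):
            raise ValueError(f"messages[{index}]: content must be a string")
        if role == "system" and index != 0:
            raise ValueError(f"messages[{index}]: system role is only allowed first")


validate_messages([
    {"role": "system", "content": "You are a concise technical assistant."},
    {"role": "user", "content": "What is the capital of France?"},
])
```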

Full request example

{
  "model": "corenet-1",
  "messages": [
    {
      "role": "system",
      "content": "You are a concise technical assistant."
    },
    {
      "role": "user",
      "content": "What is the capital of France?"
    }
  ],
  "temperature": 0.4,
  "max_tokens": 256,
  "stream": false,
  "user": "session-a3f91"
}

Response Object

A successful non-streaming response returns an object with the following structure:

{
  "id": "chatcmpl-8fZ2kLmNpQr1tXwV",
  "object": "chat.completion",
  "created": 1710000000,
  "model": "corenet-1",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 28,
    "completion_tokens": 2,
    "total_tokens": 30
  }
}

finish_reason values

stop — model reached a natural stopping point or a stop sequence
length — output was truncated at max_tokens
content_filter — output was blocked by the content policy layer
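One way a caller might act on these values is sketched below. The function `extract_text` and its return shape are illustrative choices, not part of the API; the sketch reads only the first choice.

```python
def extract_text(response: dict) -> tuple:
    """Return (text, truncated) for the first choice of a non-streaming response."""
    choice = response["choices"][0]
    reason = choice["finish_reason"]
    if reason == "content_filter":
        # The output was blocked, so there is no usable text to return.
        raise RuntimeError("response blocked by the content policy layer")
    text = choice["message"]["content"]
    # "length" means generation stopped at max_tokens and may be incomplete.
    return text, reason == "length"


sample = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Paris."},
            "finish_reason": "stop",
        }
    ]
}
print(extract_text(sample))
```

Surfacing the truncation flag lets the caller decide whether to retry with a larger `max_tokens` or accept the partial output.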

Streaming

When `stream` is `true`, the API returns the response as server-sent events (content type text/event-stream). Each event contains a partial response delta. The stream terminates with a final data: [DONE] message.

data: {
  "id": "chatcmpl-8fZ2kLmNpQr1tXwV",
  "object": "chat.completion.chunk",
  "created": 1710000000,
  "model": "corenet-1",
  "choices": [{
    "index": 0,
    "delta": { "content": "Paris" },
    "finish_reason": null
  }]
}

data: {
  "choices": [{
    "delta": {},
    "finish_reason": "stop"
  }]
}

data: [DONE]

The first chunk includes role: "assistant" in the delta. Subsequent chunks carry only content. The final chunk has an empty delta and a non-null finish_reason. Token usage is not returned in streaming mode.
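The chunk sequence described above can be reassembled with a small parser. This sketch assumes each event arrives as a single `data:` line holding compact JSON (the examples above are pretty-printed for readability); `collect_stream` is a hypothetical helper that takes any iterable of decoded lines.

```python
import json
from typing import Iterable, Optional


def collect_stream(lines: Iterable[str]) -> tuple:
    """Concatenate content deltas until [DONE]; return (text, finish_reason)."""
    parts = []
    finish_reason: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not data
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        choice = json.loads(data)["choices"][0]
        # The first chunk carries only the role; the final chunk has an empty delta.
        parts.append(choice["delta"].get("content", ""))
        if choice.get("finish_reason") is not None:
            finish_reason = choice["finish_reason"]
    return "".join(parts), finish_reason


sample_lines = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}, "finish_reason": null}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Paris"}, "finish_reason": null}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "."}, "finish_reason": null}]}',
    "",
    'data: {"choices": [{"delta": {}, "finish_reason": "stop"}]}',
    "",
    "data: [DONE]",
]
print(collect_stream(sample_lines))
```

Because token usage is not returned in streaming mode, a caller that needs usage figures would have to count tokens itself or use a non-streaming request.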

Models

The following model identifiers are available to enterprise clients:

Model ID	Context	Notes
corenet-1	128k tokens	Primary production model. Recommended for most workloads.
corenet-1-fast	32k tokens	Reduced latency variant. Lower throughput cost. Suitable for latency-sensitive pipelines.
corenet-1-preview	128k tokens	Pre-release checkpoint. Behavior may differ from stable. Opt-in required per org.

Model identifiers are pinned per contract period. Access to new model versions requires explicit acknowledgment of any behavioral change notes provided by the account team.

Error Codes

Errors follow the standard HTTP status code convention. The response body is a JSON object with an error field:

{
  "error": {
    "message": "Invalid bearer token.",
    "type": "authentication_error",
    "code": "invalid_api_key"
  }
}

Status	Type	Description
400	invalid_request_error	Malformed request body or invalid parameter values.
401	authentication_error	Missing or invalid bearer token.
403	permission_error	Token valid, but lacks access to the requested model or feature.
429	rate_limit_error	Request or token quota exceeded. See `Retry-After` header.
500	api_error	Internal error. Retry with exponential backoff. Persistent errors should be reported to your account manager.
503	overloaded_error	Service temporarily at capacity. Queue or retry with backoff.
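The retry guidance in the table can be sketched as a delay calculator. The base delay, the cap, and the use of full jitter are assumptions chosen for illustration, and `retry_delay` is a hypothetical helper; none of these values are specified by this reference.

```python
import random
from typing import Optional

# Statuses the table above describes as retryable.
RETRYABLE_STATUSES = {429, 500, 503}


def retry_delay(
    status: int,
    attempt: int,
    retry_after: Optional[str] = None,
    base: float = 1.0,   # assumed starting delay in seconds
    cap: float = 60.0,   # assumed upper bound in seconds
) -> Optional[float]:
    """Return seconds to wait before retrying, or None if the error is not retryable."""
    if status not in RETRYABLE_STATUSES:
        return None
    if status == 429 and retry_after is not None:
        try:
            return float(retry_after)  # server-provided wait takes precedence
        except ValueError:
            pass  # fall through to backoff if the header is not a plain number
    # Exponential backoff with full jitter: random wait in [0, min(cap, base * 2^attempt)].
    return random.uniform(0, min(cap, base * 2 ** attempt))


print(retry_delay(429, 0, retry_after="8"))
print(retry_delay(401, 0))
```

Errors in the 400, 401, and 403 classes return None because repeating the same request cannot succeed until the request or credentials change.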

Rate Limits

Rate limits are defined per organization in the enterprise agreement and enforced at the API gateway. Limits are expressed as:

Requests per minute (RPM)
Tokens per minute (TPM)
Tokens per day (TPD)

All limit headers are returned on every response:

x-ratelimit-limit-requests: 100
x-ratelimit-remaining-requests: 87
x-ratelimit-limit-tokens: 100000
x-ratelimit-remaining-tokens: 94210
x-ratelimit-reset-requests: 8s
x-ratelimit-reset-tokens: 3s

When a limit is exceeded, the response is 429 Too Many Requests. The Retry-After header indicates the number of seconds until the limit resets. Sustained over-limit usage may trigger a temporary org suspension pending review by the account team.
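A client can read the headers shown above to pace itself before hitting a 429. The sketch below assumes reset values always use the plain seconds form from the example (such as `8s`); other duration formats are not handled, and the function names are hypothetical.

```python
def parse_reset(value: str) -> float:
    """Convert a reset value such as '8s' to seconds (assumed format)."""
    if not value.endswith("s"):
        raise ValueError(f"unsupported reset format: {value!r}")
    return float(value[:-1])


def summarize_rate_limits(headers: dict) -> dict:
    """Extract request and token budgets from the x-ratelimit-* response headers."""
    lowered = {key.lower(): value for key, value in headers.items()}
    return {
        "requests_limit": int(lowered["x-ratelimit-limit-requests"]),
        "requests_remaining": int(lowered["x-ratelimit-remaining-requests"]),
        "requests_reset_s": parse_reset(lowered["x-ratelimit-reset-requests"]),
        "tokens_limit": int(lowered["x-ratelimit-limit-tokens"]),
        "tokens_remaining": int(lowered["x-ratelimit-remaining-tokens"]),
        "tokens_reset_s": parse_reset(lowered["x-ratelimit-reset-tokens"]),
    }


example_headers = {
    "x-ratelimit-limit-requests": "100",
    "x-ratelimit-remaining-requests": "87",
    "x-ratelimit-limit-tokens": "100000",
    "x-ratelimit-remaining-tokens": "94210",
    "x-ratelimit-reset-requests": "8s",
    "x-ratelimit-reset-tokens": "3s",
}
print(summarize_rate_limits(example_headers))
```

Header names are lowercased before lookup because HTTP header names are case-insensitive and client libraries differ in how they report them.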