| Field | Type | Required | Description |
|---|---|---|---|
| `name` | string | yes | Display name for the endpoint |
| `description` | string | no | Optional description |
| `model_name` | string | yes | Template model name (e.g. `openai/gpt-oss-120b`) |
| `flavor_name` | string | yes | Template flavor (e.g. `base`, `fast`) |
| `gpu_type` | string | yes | GPU type supported by the chosen template and flavor |
| `gpu_count` | integer | yes | Number of GPUs per replica. Total maximum GPUs = `gpu_count` × `scaling.max_replicas` |
| `region` | string | yes | Deployment region: `eu-north1`, `eu-west1`, or `us-central1` |
| `scaling.min_replicas` | integer | yes | Minimum number of replicas |
| `scaling.max_replicas` | integer | yes | Maximum number of replicas |
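
As a rough illustration of how these fields fit together, the following Python sketch builds a request body, checks that the required fields are present, and computes the maximum GPU total from the formula in the table. It makes several assumptions that the table does not confirm: the dotted names `scaling.min_replicas` and `scaling.max_replicas` are taken to mean a nested `scaling` object, the `gpu_type` value is a placeholder, and the helpers `missing_required` and `max_total_gpus` are hypothetical names used only for this example. No API call is made.

```python
import json

# Required fields from the table above. Dotted paths are ASSUMED to map
# onto a nested "scaling" object in the JSON body.
REQUIRED_FIELDS = [
    "name",
    "model_name",
    "flavor_name",
    "gpu_type",
    "gpu_count",
    "region",
    "scaling.min_replicas",
    "scaling.max_replicas",
]


def missing_required(body):
    """Return the required field paths that are absent from the body."""
    missing = []
    for path in REQUIRED_FIELDS:
        node = body
        for key in path.split("."):
            if not isinstance(node, dict) or key not in node:
                missing.append(path)
                break
            node = node[key]
    return missing


def max_total_gpus(body):
    """Total maximum GPUs = gpu_count x scaling.max_replicas."""
    return body["gpu_count"] * body["scaling"]["max_replicas"]


# Illustrative values only. "example-gpu-type" is a placeholder: the valid
# GPU types depend on the chosen template and flavor.
payload = {
    "name": "my-endpoint",
    "description": "Example endpoint",  # optional
    "model_name": "openai/gpt-oss-120b",
    "flavor_name": "base",
    "gpu_type": "example-gpu-type",
    "gpu_count": 2,
    "region": "eu-north1",
    "scaling": {"min_replicas": 1, "max_replicas": 3},
}

print(json.dumps(payload, indent=2))
print("Missing required fields:", missing_required(payload))
print("Maximum total GPUs:", max_total_gpus(payload))
```

With `gpu_count` set to 2 and `scaling.max_replicas` set to 3, the endpoint can scale to at most 6 GPUs, which is worth checking against quota before sending the request.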