| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `name` | string | yes | Display name for the endpoint |
| `description` | string | no | Optional description |
| `model_name` | string | yes | Template model name (e.g. `openai/gpt-oss-120b`) |
| `flavor_name` | string | yes | Template flavor (e.g. `base`, `fast`) |
| `gpu_type` | string | yes | GPU type supported by the chosen template + flavor |
| `gpu_count` | integer | yes | `gpu_count` per replica. Total maximum GPUs = `gpu_count` × `scaling.max_replicas` |
| `region` | string | yes | `eu-north1`, `eu-west1`, `us-central1` |
| `scaling.min_replicas` | integer | yes | Minimum replicas |
| `scaling.max_replicas` | integer | yes | Maximum replicas |
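
As a rough illustration of how these fields fit together, the sketch below assembles a configuration, checks that the required fields are present, and computes the maximum GPU total from the formula in the table. It makes several assumptions that the table does not confirm: that the dotted `scaling.*` names map to a nested `scaling` object, that `min_replicas` should not exceed `max_replicas`, and that the three listed regions are the only accepted values. The helper names `validate_endpoint_config` and `max_total_gpus` are hypothetical, and the `gpu_type` value is a placeholder, not a real identifier.

```python
# Required top-level fields, taken from the table above.
REQUIRED_FIELDS = ("name", "model_name", "flavor_name", "gpu_type", "gpu_count", "region")
# Regions listed in the table; treating this as an exhaustive list is an assumption.
KNOWN_REGIONS = {"eu-north1", "eu-west1", "us-central1"}


def validate_endpoint_config(config: dict) -> list[str]:
    """Return a list of problems found in the config (empty if none). Hypothetical helper."""
    errors = []
    for field in REQUIRED_FIELDS:
        if field not in config:
            errors.append(f"missing required field: {field}")

    # Assumption: scaling.min_replicas / scaling.max_replicas live in a nested object.
    scaling = config.get("scaling", {})
    for key in ("min_replicas", "max_replicas"):
        if key not in scaling:
            errors.append(f"missing required field: scaling.{key}")

    if "region" in config and config["region"] not in KNOWN_REGIONS:
        errors.append(f"unknown region: {config['region']}")

    # Assumption: the minimum must not exceed the maximum.
    if not errors and scaling["min_replicas"] > scaling["max_replicas"]:
        errors.append("scaling.min_replicas exceeds scaling.max_replicas")
    return errors


def max_total_gpus(config: dict) -> int:
    """Total maximum GPUs = gpu_count x scaling.max_replicas (formula from the table)."""
    return config["gpu_count"] * config["scaling"]["max_replicas"]


example_config = {
    "name": "my-endpoint",
    "description": "Example endpoint",   # optional field
    "model_name": "openai/gpt-oss-120b",
    "flavor_name": "base",
    "gpu_type": "<gpu-type>",             # placeholder; use a type your template supports
    "gpu_count": 8,
    "region": "eu-north1",
    "scaling": {"min_replicas": 1, "max_replicas": 3},
}

problems = validate_endpoint_config(example_config)
total = max_total_gpus(example_config)
print(problems, total)
```

With `gpu_count` set to 8 and `scaling.max_replicas` set to 3, the endpoint can scale up to 24 GPUs in total, which is worth checking against your quota before creating it.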