POST /v0/dedicated_endpoints
Create Dedicated Endpoint
curl --request POST \
  --url https://api.tokenfactory.nebius.com/v0/dedicated_endpoints \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "name": "<string>",
  "model_name": "<string>",
  "flavor_name": "<string>",
  "gpu_type": "gpu-l40s-d",
  "region": "eu-north1",
  "gpu_count": 1,
  "scaling": {
    "min_replicas": 2,
    "max_replicas": 2
  },
  "description": "",
  "custom_weights_id": "<string>"
}
'
{
  "endpoint": {
    "id": "<string>",
    "name": "<string>",
    "description": "<string>",
    "enabled": true,
    "routing_key": "<string>",
    "model_name": "<string>",
    "flavor_name": "<string>",
    "region": "<string>",
    "gpu_type": "gpu-l40s-d",
    "gpu_count": 123,
    "custom_weights_id": "<string>",
    "scaling": {
      "min_replicas": 2,
      "max_replicas": 2
    },
    "deployment": {
      "ready_replicas": 123,
      "status": "starting",
      "readiness": "ready"
    },
    "created_at": "2023-11-07T05:31:56Z"
  }
}
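
The curl call above can also be expressed in Python. The following is a minimal sketch using only the standard library; the helper name `build_create_request` and the placeholder values in `payload` are illustrative assumptions, not part of the API. It builds the request without sending it, so the headers and body can be inspected first.

```python
import json
import urllib.request

API_URL = "https://api.tokenfactory.nebius.com/v0/dedicated_endpoints"


def build_create_request(token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates an endpoint.

    Hypothetical helper: it mirrors the curl example, nothing more.
    """
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; replace them with real names from your account.
payload = {
    "name": "my-endpoint",
    "model_name": "<model name>",
    "flavor_name": "<flavor name>",
    "gpu_type": "gpu-l40s-d",
    "region": "eu-north1",
    "gpu_count": 1,
    "scaling": {"min_replicas": 2, "max_replicas": 2},
}
request = build_create_request("<token>", payload)

# To actually send the request, uncomment the lines below:
# with urllib.request.urlopen(request) as response:
#     endpoint = json.load(response)["endpoint"]
```

Keeping construction separate from sending makes it easy to log or unit-test the request before any GPU resources are provisioned.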

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Query Parameters

ai_project_id
string | null

ID of the project in which to create the endpoint.
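
Because `ai_project_id` is a query parameter, it goes in the URL rather than in the JSON body. A small sketch of how the URL might be built follows; the function name `endpoint_url` and the sample project ID are hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://api.tokenfactory.nebius.com/v0/dedicated_endpoints"


def endpoint_url(ai_project_id: Optional[str] = None) -> str:
    """Return the request URL, appending ai_project_id only when it is given."""
    if ai_project_id is None:
        # The parameter is nullable, so it is simply left off the URL.
        return BASE_URL
    return f"{BASE_URL}?{urlencode({'ai_project_id': ai_project_id})}"


# "project-123" is a made-up ID used only for illustration.
url = endpoint_url("project-123")
```

`urlencode` percent-encodes the value, which matters if a project ID ever contains characters that are not URL-safe.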

Body

application/json
name
string
required
Required string length: 1 - 100
model_name
string
required
Minimum string length: 1
flavor_name
string
required
Minimum string length: 1
gpu_type
enum<string>
required
Available options: gpu-l40s-d, gpu-l40s-a, gpu-h100-sxm, gpu-h200-sxm, gpu-b200-sxm, gpu-b200-sxm-a, gpu-b300-sxm
region
enum<string>
required
Available options: eu-north1, us-central1, eu-west1, me-west1, uk-south1, tf-us1
gpu_count
integer
required
Required range: x >= 0
scaling
EndpointScaling · object
required
description
string
default: ""
Maximum string length: 500
custom_weights_id
string | null
Minimum string length: 1
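
The constraints listed above can be checked on the client before the request is sent. The sketch below is one possible pre-flight check, not the server's validation logic; `validate_create_body` is a hypothetical name. Since the fields of `EndpointScaling` are not described in this section, the sketch only checks that `scaling` is an object.

```python
from typing import List

GPU_TYPES = (
    "gpu-l40s-d", "gpu-l40s-a", "gpu-h100-sxm", "gpu-h200-sxm",
    "gpu-b200-sxm", "gpu-b200-sxm-a", "gpu-b300-sxm",
)
REGIONS = (
    "eu-north1", "us-central1", "eu-west1", "me-west1", "uk-south1", "tf-us1",
)


def validate_create_body(body: dict) -> List[str]:
    """Return a list of constraint violations; an empty list means none found."""
    errors = []
    name = body.get("name")
    if not isinstance(name, str) or not 1 <= len(name) <= 100:
        errors.append("name: string of 1 to 100 characters required")
    for field in ("model_name", "flavor_name"):
        value = body.get(field)
        if not isinstance(value, str) or len(value) < 1:
            errors.append(f"{field}: non-empty string required")
    if body.get("gpu_type") not in GPU_TYPES:
        errors.append("gpu_type: must be one of the listed options")
    if body.get("region") not in REGIONS:
        errors.append("region: must be one of the listed options")
    gpu_count = body.get("gpu_count")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(gpu_count, bool) or not isinstance(gpu_count, int) or gpu_count < 0:
        errors.append("gpu_count: integer >= 0 required")
    if not isinstance(body.get("scaling"), dict):
        errors.append("scaling: object required")
    description = body.get("description", "")
    if not isinstance(description, str) or len(description) > 500:
        errors.append("description: string of at most 500 characters")
    weights = body.get("custom_weights_id")
    if weights is not None and (not isinstance(weights, str) or len(weights) < 1):
        errors.append("custom_weights_id: non-empty string or null")
    return errors
```

Returning all violations at once, rather than raising on the first, lets a caller report every problem in a single pass.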

Response

Successful Response

endpoint
DedicatedEndpoint · object
required
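
A successful response wraps the new endpoint in an `endpoint` object. The sketch below shows one way to pull out the fields a caller is likely to need, based on the response example shown earlier on this page; `summarize_endpoint` is a hypothetical helper, and because this section does not say which `DedicatedEndpoint` fields are guaranteed, everything except `id` is read defensively.

```python
import json


def summarize_endpoint(response_text: str) -> dict:
    """Extract identifying and deployment fields from a create response."""
    endpoint = json.loads(response_text)["endpoint"]
    # Assumption: "deployment" may be missing or null, so fall back to {}.
    deployment = endpoint.get("deployment") or {}
    return {
        "id": endpoint["id"],
        "routing_key": endpoint.get("routing_key"),
        "status": deployment.get("status"),
        "readiness": deployment.get("readiness"),
        "ready_replicas": deployment.get("ready_replicas"),
    }


# Sample body shaped like the response example; the values are made up.
sample = json.dumps({
    "endpoint": {
        "id": "ep-1",
        "routing_key": "rk-1",
        "deployment": {"ready_replicas": 0, "status": "starting"},
    }
})
summary = summarize_endpoint(sample)
```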