PATCH /v0/dedicated_endpoints/{endpoint_id}
Update Dedicated Endpoint
curl --request PATCH \
  --url https://api.tokenfactory.nebius.com/v0/dedicated_endpoints/{endpoint_id} \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "name": "<string>",
  "description": "<string>",
  "enabled": true,
  "gpu_type": "gpu-l40s-d",
  "gpu_count": 1,
  "custom_weights_id": "UNSET",
  "scaling": {
    "min_replicas": 2,
    "max_replicas": 2
  }
}
'
Example response:
{
  "id": "<string>",
  "name": "<string>",
  "description": "<string>",
  "enabled": true,
  "routing_key": "<string>",
  "model_name": "<string>",
  "flavor_name": "<string>",
  "region": "<string>",
  "gpu_type": "gpu-l40s-d",
  "gpu_count": 123,
  "custom_weights_id": "<string>",
  "scaling": {
    "min_replicas": 2,
    "max_replicas": 2
  },
  "deployment": {
    "ready_replicas": 123,
    "status": "starting",
    "readiness": "ready"
  },
  "created_at": "2023-11-07T05:31:56Z"
}

Authorizations

Authorization: string, header, required.

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
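
As an illustration of the header format above, here is a minimal Python sketch that prepares the same PATCH request as the curl example using only the standard library. The function name build_update_request is hypothetical, and the request is only constructed, not sent; the base URL is copied from the curl example.

```python
import json
import urllib.request

# Base URL copied from the curl example on this page.
BASE_URL = "https://api.tokenfactory.nebius.com/v0/dedicated_endpoints"


def build_update_request(endpoint_id: str, token: str, body: dict) -> urllib.request.Request:
    """Prepare (but do not send) a PATCH request carrying a Bearer token.

    build_update_request is a hypothetical helper, not part of any SDK.
    """
    return urllib.request.Request(
        url=f"{BASE_URL}/{endpoint_id}",
        data=json.dumps(body).encode("utf-8"),
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to urllib.request.urlopen would send it; error handling and retries are left out of this sketch.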

Path Parameters

endpoint_id: string<uuid>, required.
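
Because endpoint_id is typed as a UUID, a client may want to reject malformed values before making a request. The sketch below shows one way to do that with Python's uuid module; endpoint_path is a hypothetical helper name, and the path is the one shown at the top of this page.

```python
import uuid


def endpoint_path(endpoint_id: str) -> str:
    """Return the request path, rejecting values that are not UUIDs.

    endpoint_path is a hypothetical helper used only for illustration.
    """
    parsed = uuid.UUID(endpoint_id)  # raises ValueError on malformed input
    # str(parsed) yields the canonical lowercase, hyphenated form.
    return f"/v0/dedicated_endpoints/{parsed}"
```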

Body

application/json
name: string | null. Length 1 to 100 characters.
description: string | null. Maximum length 500 characters.
enabled: boolean | null.
gpu_type: enum<string> | null. Available options: gpu-l40s-d, gpu-l40s-a, gpu-h100-sxm, gpu-h200-sxm, gpu-b200-sxm, gpu-b200-sxm-a, gpu-b300-sxm.
gpu_count: integer | null. Must be >= 0.
custom_weights_id: string | null. Default: UNSET. Minimum length 1 character.
scaling: EndpointScaling · object.
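
The constraints listed for the body can be checked on the client before sending. The following is a minimal sketch under stated assumptions: build_update_body is a hypothetical helper, it includes only the fields the caller passes, and it does not inspect the contents of scaling, whose schema is not expanded on this page. Whether the server leaves omitted fields unchanged is an assumption, not something this page states.

```python
# Enum values copied from the gpu_type options listed above.
GPU_TYPES = {
    "gpu-l40s-d", "gpu-l40s-a", "gpu-h100-sxm", "gpu-h200-sxm",
    "gpu-b200-sxm", "gpu-b200-sxm-a", "gpu-b300-sxm",
}
ALLOWED_FIELDS = {
    "name", "description", "enabled", "gpu_type",
    "gpu_count", "custom_weights_id", "scaling",
}


def build_update_body(**fields) -> dict:
    """Validate supplied fields against the documented constraints.

    Hypothetical helper: None values are passed through untouched because
    every body field is nullable.
    """
    unknown = set(fields) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")

    name = fields.get("name")
    if name is not None and not 1 <= len(name) <= 100:
        raise ValueError("name must be 1 to 100 characters")

    description = fields.get("description")
    if description is not None and len(description) > 500:
        raise ValueError("description must be at most 500 characters")

    gpu_type = fields.get("gpu_type")
    if gpu_type is not None and gpu_type not in GPU_TYPES:
        raise ValueError(f"unsupported gpu_type: {gpu_type}")

    gpu_count = fields.get("gpu_count")
    if gpu_count is not None and gpu_count < 0:
        raise ValueError("gpu_count must be >= 0")

    weights = fields.get("custom_weights_id")
    if weights is not None and len(weights) < 1:
        raise ValueError("custom_weights_id must not be empty")

    return dict(fields)
```

For example, build_update_body(enabled=False) produces a body containing only the enabled field.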

Response

Successful Response

id: string, required.
name: string, required.
description: string, required.
enabled: boolean, required.
routing_key: string, required.
model_name: string, required.
flavor_name: string, required.
region: string, required.
gpu_type: enum<string>, required. Available options: gpu-l40s-d, gpu-l40s-a, gpu-h100-sxm, gpu-h200-sxm, gpu-b200-sxm, gpu-b200-sxm-a, gpu-b300-sxm.
gpu_count: integer, required.
custom_weights_id: string | null, required.
scaling: EndpointScaling · object, required.
deployment: EndpointDeploymentStatus · object, required.
created_at: string<date-time>, required.
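
To show how the response fields might be consumed, here is a small Python sketch that maps a decoded response onto a dataclass. The Endpoint class and parse_endpoint function are hypothetical names, only a subset of fields is kept, and the nested keys (min_replicas, max_replicas, status) are taken from the example response on this page rather than from a full schema.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Endpoint:
    """Hypothetical container for a subset of the response fields."""
    id: str
    name: str
    enabled: bool
    gpu_type: str
    gpu_count: int
    custom_weights_id: Optional[str]
    min_replicas: int
    max_replicas: int
    status: str
    created_at: datetime


def parse_endpoint(payload: dict) -> Endpoint:
    """Build an Endpoint from a decoded JSON response body."""
    return Endpoint(
        id=payload["id"],
        name=payload["name"],
        enabled=payload["enabled"],
        gpu_type=payload["gpu_type"],
        gpu_count=payload["gpu_count"],
        custom_weights_id=payload["custom_weights_id"],
        min_replicas=payload["scaling"]["min_replicas"],
        max_replicas=payload["scaling"]["max_replicas"],
        status=payload["deployment"]["status"],
        # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
        created_at=datetime.fromisoformat(payload["created_at"].replace("Z", "+00:00")),
    )
```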