This guide shows you how to:
- Upload training (and optional validation) datasets
- Create a fine-tuning job via Python or cURL
- Monitor job status and events
- Download the resulting model checkpoints
- Understand the fine-tuning job API shape
Prerequisites
- Select a supported base model for fine-tuning.
- Create a dataset for training. Optionally, create a validation dataset as well. A typical split is:
  - 80–90% of examples → training
  - 10–20% of examples → validation
- Create an API key.
- Export the API key as an environment variable.
How to fine-tune a model
The examples below use the Python SDK; the same operations can be performed with cURL.
1. Install and import the client
- Install the openai Python SDK (Nebius exposes an OpenAI-compatible API).
- Import libraries.
- Initialize the Nebius client, as in the sketch below.
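A minimal setup sketch, assuming the API key was exported as NEBIUS_API_KEY and that the endpoint is https://api.studio.nebius.com/v1/; both the variable name and the base URL are assumptions, so substitute the values from your account.

```python
# Setup sketch: the base URL and the NEBIUS_API_KEY variable name are assumptions;
# replace them with the values shown in your Nebius account.
# Install the SDK first:  pip install openai
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.studio.nebius.com/v1/",  # assumed endpoint
    api_key=os.environ.get("NEBIUS_API_KEY"),      # key exported in the prerequisites
)
```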
2. Upload training (and optional validation) datasets
If you already uploaded datasets via the UI or API, you can skip this step and reuse the existing file IDs. Use the id fields from the upload responses when creating the fine-tuning job.
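A minimal upload sketch, assuming two local JSONL files named train.jsonl and validation.jsonl (placeholder names):

```python
# Upload the datasets and keep the returned IDs for the fine-tuning job.
training_file = client.files.create(
    file=open("train.jsonl", "rb"),       # placeholder file name
    purpose="fine-tune",
)
validation_file = client.files.create(    # optional
    file=open("validation.jsonl", "rb"),  # placeholder file name
    purpose="fine-tune",
)
print(training_file.id, validation_file.id)
```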
3. Configure fine-tuning parameters
Decide on the hyperparameters for the run (see the combined sketch after step 4). For a full list of allowed fields and defaults, see API specification for a fine-tuning job.
4. Create and run the fine-tuning job
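The sketch below covers steps 3 and 4 together: it builds a hyperparameters object and submits the job. The base model name and all hyperparameter values are illustrative placeholders, not recommendations.

```python
# Steps 3-4 combined: configure parameters and create the job.
# Model name and hyperparameter values are illustrative placeholders.
hyperparameters = {
    "batch_size": 8,
    "n_epochs": 3,
    "learning_rate": 1e-5,
}

job = client.fine_tuning.jobs.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",  # assumed base model name
    suffix="customer-support-v1",
    training_file=training_file.id,
    validation_file=validation_file.id,
    hyperparameters=hyperparameters,
)
print(job.id, job.status)
```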
5. Poll job status
Fine-tuning takes time. Poll the job until it reaches a terminal status:
- If job.status == "succeeded", training finished successfully.
- If job.status == "failed", inspect job.error for code, message, and param. For transient 5xx errors, you can safely retry.
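A simple polling sketch; the 30-second interval is arbitrary, and the set of non-terminal statuses follows the lifecycle listed in the response-shape section below.

```python
import time

# Poll until the job leaves the non-terminal statuses.
active_statuses = {"validating_files", "queued", "running"}
while True:
    job = client.fine_tuning.jobs.retrieve(job.id)
    print(job.status)
    if job.status not in active_statuses:
        break
    time.sleep(30)

if job.status == "succeeded":
    print("Training finished successfully")
elif job.status == "failed":
    print(job.error)  # structured error: code, message, param
```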
6. Inspect job events (optional but recommended)
Events help you understand the lifecycle (file validation, dataset processing, training progress), with messages such as "Dataset processed successfully" and "Training completed successfully".
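A short sketch for listing the job's events:

```python
# Print the job's events to follow validation, processing, and training progress.
events = client.fine_tuning.jobs.list_events(fine_tuning_job_id=job.id)
for event in events.data:
    print(event.created_at, event.message)
```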
7. Download checkpoints and model files
Each checkpoint represents the model after a certain number of training steps (often one per epoch). A job typically produces:
- Intermediate checkpoints (per step / epoch)
- A final checkpoint (usually the last one in the list)
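A hedged sketch for listing checkpoints and downloading result files; the exact attributes exposed per checkpoint and the file-download flow may differ on Nebius, so inspect one object before relying on specific field names.

```python
# List checkpoints produced by the job.
checkpoints = client.fine_tuning.jobs.checkpoints.list(fine_tuning_job_id=job.id)
for checkpoint in checkpoints.data:
    print(checkpoint.id, checkpoint.step_number)

# Download the job's result files by ID (assumed to be retrievable via the files API).
for file_id in job.result_files:
    response = client.files.content(file_id)
    response.write_to_file(file_id)  # pick a more descriptive local name if you prefer
```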
API specification for a fine-tuning job
This section describes the request payload when creating a fine-tuning job.
Top-level fields
- model (string, required): Base model to fine-tune.
- suffix (string, optional): Human-readable suffix appended to the model name. Use this to distinguish multiple runs, e.g., customer-support-v1.
- training_file (string, required): ID of the file with the training dataset (purpose = "fine-tune").
- validation_file (string, optional): ID of the file with the validation dataset. Same format and requirements as the training dataset.
- hyperparameters (object, optional): Fine-tuning configuration. Omitted fields fall back to defaults.
- seed (integer, optional): Random seed used during training. Using the same seed and the same data/hyperparameters improves reproducibility between runs.
- integrations (array, optional): Third-party integrations configured for this job. Each entry contains:
  - type (string, required): Integration type; "wandb" and "hf" are supported.
  - wandb (object, required when type = "wandb"): Settings for exporting metrics to Weights & Biases:
    - project (string, required): W&B project name.
    - name (string, optional): Run name.
    - entity (string, optional): W&B entity (user or team).
    - tags (array of strings, optional): Tags to attach to the run.
  - hf (object, required when type = "hf"): Settings for exporting the fine-tuned model to Hugging Face:
    - output_repo_name (string, required): Target Hugging Face repo name, e.g. "org/llama-8b-support-ft" or "username/my-finetune".
    - api_token (string, required): Hugging Face access token (PAT) with write access to output_repo_name.
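An illustrative request that exercises the optional fields; the model name, suffix, seed, W&B project name, and tags are placeholders, and an hf integration entry would follow the same pattern with the fields listed above.

```python
# Example request with optional fields; all values are placeholders.
job = client.fine_tuning.jobs.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",  # assumed base model name
    suffix="customer-support-v1",
    training_file=training_file.id,
    validation_file=validation_file.id,
    seed=42,
    hyperparameters={"n_epochs": 3},
    integrations=[
        {
            "type": "wandb",
            "wandb": {
                "project": "fine-tuning-experiments",
                "name": "support-run-1",
                "tags": ["support", "llama"],
            },
        },
    ],
)
```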
Hyperparameters
All hyperparameters are nested under hyperparameters.
- batch_size (integer, optional): Number of examples per training batch. Larger batch sizes are more efficient but require more VRAM.
  - Typical range: 8–32
  - Default: 8
- context_length (integer, optional): Maximum sequence length in tokens used during fine-tuning. Inputs longer than this limit will cause errors.
  - Units: tokens (e.g., 8192)
  - Supported values depend on the base model; see the models page.
  - Default: 8192
  To pick a value:
  - Analyze the token length distribution of your dataset.
  - Choose the smallest context length that covers your P95–P99 examples.
  - If packing = false, a context length much larger than your examples leads to heavy padding and wasted compute.
- learning_rate (float, optional): Step size for gradient descent.
  - Must be >= 0
  - Typical values: 1e-6–5e-5
  - Default: 0.00001
- n_epochs (integer, optional): Number of passes over the entire dataset.
  - Range: 1–20
  - Default: 3
- warmup_ratio (float, optional): Fraction of total training steps used for linear warmup of the learning rate from 0 to the target value.
  - Range: 0–1
  - Default: 0
- weight_decay (float, optional): L2 regularization factor applied to weights. Helps prevent overfitting and preserve generalization.
  - Must be >= 0
  - Default: 0
- lora (boolean, optional): Whether to use LoRA (Low-Rank Adaptation) instead of full-parameter fine-tuning.
  - true: only LoRA adapter weights are trained; base model weights stay frozen.
  - false: full fine-tuning is applied.
  - Default: false
- lora_r (integer, optional): Rank of the LoRA matrices. Higher values increase capacity but also the risk of overfitting and the cost.
  - Range: 8–128
  - Default: 8
- lora_alpha (integer, optional): Scaling factor for LoRA updates. Higher values increase the impact of the LoRA adapters.
  - Must be >= 8
  - Default: 8
- lora_dropout (float, optional): Dropout applied to LoRA layers. Helps prevent overfitting, especially on small datasets.
  - Range: 0–1
  - Default: 0
- packing (boolean, optional): If true, multiple shorter samples can be packed into a single sequence to better utilize the context window and improve efficiency.
  - Default: true
- max_grad_norm (float, optional): Gradient clipping threshold (L2 norm). Avoids unstable updates:
  - Too high → effectively no clipping → risk of exploding gradients.
  - Too low → overly aggressive clipping → risk of under-training.
  - Must be >= 0
  - Default: 1
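As an example of how these fields combine, here is a hyperparameters object for a LoRA run; the specific values are illustrative starting points within the documented ranges, not tuned recommendations.

```python
# Illustrative LoRA configuration within the documented ranges.
lora_hyperparameters = {
    "lora": True,          # train adapters only; base weights stay frozen
    "lora_r": 16,          # rank, within the 8-128 range
    "lora_alpha": 32,
    "lora_dropout": 0.1,
    "learning_rate": 1e-5,
    "n_epochs": 3,
    "packing": True,
}
```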
Fine-tuning job object (response shape)
When you query a job or list jobs, you get objects shaped like this:
- status: validating_files → queued → running → succeeded / failed
- trained_tokens: how many tokens have been processed so far
- trained_steps / total_steps: progress of the training loop
- error: structured error info when status = "failed"
- result_files: IDs of produced artifacts (also available via the checkpoints API)
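A small sketch for reading these fields from a retrieved job; trained_steps and total_steps are accessed defensively, since the SDK's job model may not declare these provider-specific attributes.

```python
# Inspect the documented fields on a retrieved job object.
job = client.fine_tuning.jobs.retrieve(job.id)
print(job.status, job.trained_tokens, job.result_files)
print(getattr(job, "trained_steps", None), getattr(job, "total_steps", None))
if job.status == "failed" and job.error is not None:
    print(job.error.code, job.error.message, job.error.param)
```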