Nebius Token Factory supports fine-tuning on multiple open-weight model families.
This page lists:
  • Which base models you can fine-tune
  • Which context lengths they support
  • Which fine-tuning types are available (LoRA vs full fine-tuning)
Deployment note
Not all models that can be fine-tuned can be deployed as serverless endpoints in Nebius Token Factory.

For serving options, see Deploy custom model and the list of available deployment models.

Model list

For each model listed below, Nebius Token Factory supports the following context lengths:

context_length: 8192, 16384, 32768, 65536, 131072

Unless you override it with the context_length hyperparameter, the default context length for fine-tuning is 8192 tokens. See the hyperparameters section for model-specific details on context_length.

Meta (Llama 3.1 / 3.2 / 3.3)

The Meta models hosted in Nebius Token Factory come from the Llama 3.1, Llama 3.2, and Llama 3.3 families. For acceptable use, see Meta's policies:
| Name | Supported fine-tuning type | Model card / license |
|---|---|---|
| meta-llama/Meta-Llama-3.1-8B-Instruct (Model card) | LoRA and full fine-tuning | Llama 3.1 Community License Agreement |
| meta-llama/Meta-Llama-3.1-8B (Model card) | LoRA and full fine-tuning | Llama 3.1 Community License Agreement |
| meta-llama/Llama-3.1-70B-Instruct (Model card) | LoRA and full fine-tuning | Llama 3.1 Community License Agreement |
| meta-llama/Llama-3.1-70B (Model card) | LoRA and full fine-tuning | Llama 3.1 Community License Agreement |
| meta-llama/Llama-3.2-1B-Instruct (Model card) | LoRA and full fine-tuning | Llama 3.2 Community License Agreement |
| meta-llama/Llama-3.2-1B (Model card) | LoRA and full fine-tuning | Llama 3.2 Community License Agreement |
| meta-llama/Llama-3.2-3B-Instruct (Model card) | LoRA and full fine-tuning | Llama 3.2 Community License Agreement |
| meta-llama/Llama-3.2-3B (Model card) | LoRA and full fine-tuning | Llama 3.2 Community License Agreement |
| meta-llama/Llama-3.3-70B-Instruct (Model card) | LoRA and full fine-tuning | Llama 3.3 Community License Agreement |

Qwen

Nebius Token Factory supports both dense and coder variants across the Qwen3 and Qwen2.5 families.
All Qwen models below are released under the Apache 2.0 license (see each model card for details).

Qwen3 dense + base models

| Name | Supported fine-tuning type | Model card / license |
|---|---|---|
| Qwen/Qwen3-32B (Model card) | LoRA and full fine-tuning | Apache 2.0 |
| Qwen/Qwen3-14B (Model card) | LoRA and full fine-tuning | Apache 2.0 |
| Qwen/Qwen3-14B-Base (Model card) | LoRA and full fine-tuning | Apache 2.0 |
| Qwen/Qwen3-8B (Model card) | LoRA and full fine-tuning | Apache 2.0 |
| Qwen/Qwen3-8B-Base (Model card) | LoRA and full fine-tuning | Apache 2.0 |
| Qwen/Qwen3-4B (Model card) | LoRA and full fine-tuning | Apache 2.0 |
| Qwen/Qwen3-4B-Base (Model card) | LoRA and full fine-tuning | Apache 2.0 |
| Qwen/Qwen3-1.7B (Model card) | LoRA and full fine-tuning | Apache 2.0 |
| Qwen/Qwen3-1.7B-Base (Model card) | LoRA and full fine-tuning | Apache 2.0 |
| Qwen/Qwen3-0.6B (Model card) | LoRA and full fine-tuning | Apache 2.0 |
| Qwen/Qwen3-0.6B-Base (Model card) | LoRA and full fine-tuning | Apache 2.0 |

Qwen3 coder models

| Name | Supported fine-tuning type | Model card / license |
|---|---|---|
| Qwen/Qwen3-Coder-30B-A3B-Instruct (Model card) | Full fine-tuning | Apache 2.0 |
| Qwen/Qwen3-Coder-480B-A35B-Instruct (Model card) | Full fine-tuning | Apache 2.0 |

Qwen2.5 dense + coder models

| Name | Supported fine-tuning type | Model card / license |
|---|---|---|
| Qwen/Qwen2.5-0.5B (Model card) | LoRA and full fine-tuning | Apache 2.0 |
| Qwen/Qwen2.5-0.5B-Instruct (Model card) | LoRA and full fine-tuning | Apache 2.0 |
| Qwen/Qwen2.5-7B (Model card) | LoRA and full fine-tuning | Apache 2.0 |
| Qwen/Qwen2.5-7B-Instruct (Model card) | LoRA and full fine-tuning | Apache 2.0 |
| Qwen/Qwen2.5-14B (Model card) | LoRA and full fine-tuning | Apache 2.0 |
| Qwen/Qwen2.5-14B-Instruct (Model card) | LoRA and full fine-tuning | Apache 2.0 |
| Qwen/Qwen2.5-32B (Model card) | LoRA and full fine-tuning | Apache 2.0 |
| Qwen/Qwen2.5-32B-Instruct (Model card) | LoRA and full fine-tuning | Apache 2.0 |
| Qwen/Qwen2.5-72B (Model card) | LoRA and full fine-tuning | Apache 2.0 |
| Qwen/Qwen2.5-72B-Instruct (Model card) | LoRA and full fine-tuning | Apache 2.0 |
| Qwen/Qwen2.5-Coder-32B (Model card) | LoRA and full fine-tuning | Apache 2.0 |
| Qwen/Qwen2.5-Coder-32B-Instruct (Model card) | LoRA and full fine-tuning | Apache 2.0 |

OpenAI / Unsloth GPT-OSS

These models are OpenAI GPT-OSS weights (bf16) packaged by Unsloth.
They are Apache 2.0–licensed and suitable for both research and commercial use (subject to the license).
To convert the weights to MXFP4, follow the instructions here.
| Name | Supported fine-tuning type | Model card / license |
|---|---|---|
| unsloth/gpt-oss-20b-BF16 (Model card) | LoRA and full fine-tuning | Apache 2.0 |
| unsloth/gpt-oss-120b-BF16 (Model card) | LoRA and full fine-tuning | Apache 2.0 |
To merge MoE LoRA adapter weights, follow the guide here.

DeepSeek

Nebius Token Factory integrates DeepSeek V3 models for high-capacity reasoning workloads.
DeepSeek V3 and its variants are released under the MIT License (see model cards for details).
| Name | Supported fine-tuning type | Model card / license |
|---|---|---|
| deepseek-ai/DeepSeek-V3-0324 (Model card) | Full fine-tuning | MIT License |
| deepseek-ai/DeepSeek-V3.1 (Model card) | Full fine-tuning | MIT License |
DeepSeek V3 and DeepSeek V3.1 are currently only available in Nebius US data centers.

Base LoRA adapter models available for deployment

You can deploy serverless LoRA adapter models in Nebius Token Factory with per-token billing.
To deploy a LoRA-adapted model, first fine-tune an adapter on one of the base models below:
| Name | Supported fine-tuning type for adapters | License |
|---|---|---|
| meta-llama/Meta-Llama-3.1-8B-Instruct (Model card) | LoRA and full fine-tuning (LoRA deployable as serverless) | Llama 3.1 Community License Agreement |
| meta-llama/Llama-3.3-70B-Instruct (Model card) | LoRA fine-tuning (LoRA deployable as serverless) | Llama 3.3 Community License Agreement |
For other models listed on this page, fine-tuning is supported, but deployment options may differ (for example, only via custom hosting or future releases of serverless runtimes).
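A pre-flight check against the serverless-deployable base models above can be sketched as follows. The two model names are transcribed from the table on this page; the helper itself is illustrative and not part of any Nebius API:

```python
# Base models whose LoRA adapters can be deployed as serverless endpoints,
# per the table above. The helper function is illustrative only.

SERVERLESS_LORA_BASES = {
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    "meta-llama/Llama-3.3-70B-Instruct",
}


def can_deploy_lora_serverless(base_model: str) -> bool:
    """Return True if a LoRA adapter on this base is serverless-deployable."""
    return base_model in SERVERLESS_LORA_BASES
```

Checking the base model before launching a fine-tuning run avoids training an adapter that can only be served through custom hosting.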