The following functionality will no longer be available in Nebius Token Factory:
  • Some of the text models
  • LoRA model per-token deployment and inference
  • Text-to-Image models
These models will no longer be supported in the UI Playground or the API. We strongly encourage you to migrate to newer, actively supported models to ensure continued stability and performance.

Deprecation timeline

On Apr 13, 2026, the affected model APIs and UI will be disabled.

Models affected

Text-to-Text

Model id
zai_org/glm_4.7_fp8
minimaxai/minimax_m2.1
deepseek_ai/deepseek_r1_0528
deepseek_ai/deepseek_v3_0324
meta_llama/llama_3.3_70b_instruct_fast
qwen/qwen3_coder_480b_a35b_instruct
moonshotai/kimi_k2_instruct
moonshotai/kimi_k2_thinking
deepseek_ai/deepseek_r1_0528_fast
deepseek_ai/deepseek_v3_0324_fast
openai/gpt_oss_20b
zai_org/glm_4.5
qwen/qwen3_32b_fast
qwen/qwen3_235b_a22b_thinking_2507
zai_org/glm_4.5_air
qwen/qwen3_30b_a3b_thinking_2507
qwen/qwen3_coder_30b_a3b_instruct
meta_llama/meta_llama_3.1_8b_instruct
meta_llama/meta_llama_3.1_8b_instruct_fast
google/gemma_2_9b_it_fast
qwen/qwen2.5_coder_7b_fast
baai/bge_en_icl
nvidia/nemotron_nano_v2_12b
google/gemma_3_27b_it_fast
meta_llama/llama_guard_3_8b
baai/bge_multilingual_gemma2
intfloat/e5_mistral_7b_instruct
google/gemma_2_2b_it

Text-to-Image

We are also deprecating all Text-to-Image models as we streamline supported modalities and focus our infrastructure on excellence in text-based workloads. We may continue supporting image generation through more robust and scalable alternatives in the future. For now, both the UI and the API will no longer be available.
Model id
black_forest_labs/flux_schnell
black_forest_labs/flux_dev

LoRA per-token serverless endpoints

We are deprecating LoRA per-token deployments as part of a shift toward a more scalable and production-ready deployment option: Dedicated Endpoints. If you are currently using LoRA-based setups, we recommend transitioning to:
  • Standard public model deployments
  • Dedicated Endpoints for controlled and predictable performance
Model id
meta_llama/meta_llama_3.1_8b_instruct_lora
gemma-2-2b-it-lora
llama-3.3-70b-lora

What you should do

  • Review your current usage for any dependencies on deprecated models
  • Migrate to supported models available in the platform
  • Reach out to our Sales team to discuss Dedicated Endpoints if the option you’re looking for is not available on the platform.
For production workloads, higher stability requirements, or custom configurations, we recommend using Dedicated Endpoints, which provide:
  • Full control over model versions
  • Predictable performance and scaling
  • Enterprise-grade reliability and isolation
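
As a starting point for the usage review described above, the following minimal sketch scans text (for example, the contents of a config file) for model IDs listed in this notice. The helper names `normalize` and `find_deprecated` are hypothetical, and only a subset of the IDs is shown. The sketch assumes that the model strings in your code may differ from the IDs listed here only in letter case and in hyphens versus underscores; verify that assumption against the model names you actually send to the API.

```python
import re

# Subset of model IDs copied from this notice; extend with the full lists above.
DEPRECATED_MODEL_IDS = {
    "deepseek_ai/deepseek_r1_0528",
    "meta_llama/llama_3.3_70b_instruct_fast",
    "black_forest_labs/flux_schnell",
    "gemma-2-2b-it-lora",
}


def normalize(model_id: str) -> str:
    """Lowercase and map hyphens to underscores (an assumed convention)."""
    return model_id.strip().lower().replace("-", "_")


# Normalized form -> ID as written in this notice.
_NORMALIZED = {normalize(m): m for m in DEPRECATED_MODEL_IDS}

# Candidate tokens: runs of letters, digits, '.', '_', '-', and '/'.
_TOKEN = re.compile(r"[A-Za-z0-9._/\-]+")


def find_deprecated(text: str) -> list[tuple[int, str]]:
    """Return (line number, matched token) pairs for deprecated model IDs."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for token in _TOKEN.findall(line):
            token = token.strip(".")  # drop sentence-ending periods
            if normalize(token) in _NORMALIZED:
                hits.append((lineno, token))
    return hits


sample = (
    'model: "deepseek-ai/DeepSeek-R1-0528"\n'
    'model: "some/other-model"\n'
    "lora: gemma-2-2b-it-lora\n"
)
for lineno, token in find_deprecated(sample):
    print(f"line {lineno}: {token}")
```

Running the same function over each source and configuration file in a repository gives a quick list of places to update before the cutoff date.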

Need help?

If you’re impacted or want to ensure a smooth transition, our team is ready to help.
👉 Contact our Sales team: Link
👉 Contact our Support team: tokenfactory-support@nebius.com
We can support you with:
  • Migration planning
  • Model selection
  • Dedicated endpoint setup