Data Lab is the unified workspace inside Nebius Token Factory for working with inference logs (chat completions) and datasets. It centralizes inference logs, dataset preparation, batch inference outputs, and fine-tuning artifacts, providing a consistent, clean, and reusable data layer for all model-development workflows. By bringing your data into a single workflow, Data Lab reduces reliance on manual scripts and custom pipelines:
  • A single place to view, filter, and export inference logs (chat completions).
  • A unified interface for preparing datasets via SQL queries.
  • Reusable datasets for batch inference and fine-tuning.

What lives inside Data Lab

Data Lab stores and manages several types of data:
  • Inference Logs: automatically collected chat completions generated via API or Playground (unless Zero Data Retention is enabled).
  • Filtered Datasets: datasets created from inference logs using SQL queries.
  • Uploaded Datasets: user-provided datasets uploaded manually.
  • Batch Inference Outputs: results generated by batch inference jobs.
  • Fine-tuning Outputs: artifacts produced by fine-tuning jobs, including model checkpoints and the resulting fine-tuned model.
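Filtered Datasets can be illustrated with a small local sketch. Data Lab's actual SQL dialect and log schema are not documented here, so the column names (model, prompt, completion, tokens) and the in-memory SQLite store are assumptions used only to show the idea of carving a dataset out of inference logs with a query:

```python
import json
import sqlite3

# Hypothetical inference-log rows; the real Data Lab log schema is not
# specified here, so these column names are assumptions for illustration.
logs = [
    {"model": "model-a", "prompt": "Hi", "completion": "Hello!", "tokens": 12},
    {"model": "model-b", "prompt": "Sum 2+2", "completion": "4", "tokens": 20},
    {"model": "model-a", "prompt": "Bye", "completion": "Goodbye!", "tokens": 9},
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logs (model TEXT, prompt TEXT, completion TEXT, tokens INT)"
)
conn.executemany(
    "INSERT INTO logs VALUES (:model, :prompt, :completion, :tokens)", logs
)

# A "filtered dataset": only the completions produced by one model.
rows = conn.execute(
    "SELECT prompt, completion FROM logs WHERE model = 'model-a'"
).fetchall()
dataset = [{"prompt": p, "completion": c} for p, c in rows]
print(json.dumps(dataset, indent=2))
```

The same pattern generalizes to any predicate the log columns support, e.g. filtering by time window or token count.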

Import Chat Completions

Data Lab allows you to import historical chat completion logs into structured datasets for analysis and reuse.
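As a rough illustration of the import step, the sketch below turns exported chat-completion log lines into dataset records. The export shape assumed here (one JSON object per line with a "messages" list and a "response" message, in the common OpenAI-style chat format) is an assumption, not Data Lab's documented format:

```python
import json

# Assumed export format: one chat completion per JSON line, with the
# request messages and the model's response. This shape is hypothetical.
raw_lines = [
    json.dumps({
        "messages": [{"role": "user", "content": "What is 2+2?"}],
        "response": {"role": "assistant", "content": "4"},
    }),
    json.dumps({
        "messages": [{"role": "user", "content": "Name a color."}],
        "response": {"role": "assistant", "content": "Blue"},
    }),
]

# Fold each log line into a single conversation record, ready to reuse
# as a dataset row for analysis or fine-tuning.
records = []
for line in raw_lines:
    entry = json.loads(line)
    records.append({"messages": entry["messages"] + [entry["response"]]})

print(len(records))
```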

Batch Inference

Batch inference enables asynchronous processing of large datasets when real-time responses are not required.
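Batch jobs typically consume a request file with one request per line. The per-line fields used below (custom_id, body) follow the widely used OpenAI-style batch format; whether Token Factory's batch input uses the same shape is an assumption, and the model name is a placeholder:

```python
import json

# Build a batch-inference input file (JSONL: one request per line) from a
# dataset of prompts. Field names follow the OpenAI-style batch format,
# which is an assumption here; "my-model" is a placeholder.
dataset = ["Summarize the meeting notes.", "Translate 'hello' to French."]

lines = []
for i, prompt in enumerate(dataset):
    lines.append(json.dumps({
        "custom_id": f"request-{i}",
        "body": {
            "model": "my-model",
            "messages": [{"role": "user", "content": prompt}],
        },
    }))

batch_input = "\n".join(lines)
print(batch_input.count("\n") + 1)  # number of requests in the file
```

Each output line can later be matched back to its input via custom_id, which is what makes the asynchronous round trip reliable.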

Fine-tuning

Fine-tuning workflows support preparing, validating, and managing training datasets through the Data Lab interface.
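The validation step can be sketched with a minimal checker for chat-format training data. The schema assumed here (each example carries a "messages" list ending with an assistant reply) mirrors the common chat fine-tuning format; Data Lab's exact validation rules are an assumption:

```python
# Minimal validation sketch for a chat-format fine-tuning dataset.
# The required schema is assumed, not taken from Data Lab's documentation.
VALID_ROLES = {"system", "user", "assistant"}

def validate_example(example: dict) -> list[str]:
    """Return a list of problems found in one training example."""
    errors = []
    messages = example.get("messages")
    if not messages:
        return ["missing or empty 'messages' list"]
    for msg in messages:
        if msg.get("role") not in VALID_ROLES:
            errors.append(f"invalid role: {msg.get('role')!r}")
        if not msg.get("content"):
            errors.append("empty message content")
    if messages[-1].get("role") != "assistant":
        errors.append("last message should be the assistant's reply")
    return errors

good = {"messages": [{"role": "user", "content": "Hi"},
                     {"role": "assistant", "content": "Hello!"}]}
bad = {"messages": [{"role": "user", "content": "Hi"}]}
print(validate_example(good), validate_example(bad))
```

Running such checks before submitting a fine-tuning job catches malformed examples early, instead of failing mid-job.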

Data Processing

This section explains how your data is processed, where it is processed, and what level of control you have.