Skip to main content
Data Lab offers tools for preparing and managing datasets for fine-tuning workflows. For more information on fine-tuning, see the Post-training documentation.

Dataset preparation

You can create fine-tuning datasets by:
  • Importing and filtering inference logs (chat completions)
  • Uploading and filtering structured datasets
Prepared datasets are stored in Data Lab and can be reused across fine-tuning jobs.

Reproducibility

By keeping datasets versioned and centralized, Data Lab enables:
  • Consistent training inputs across experiments
  • Easier comparison of fine-tuning results
  • Safer iteration without accidental data changes

Data location and compliance

  • Fine-tuning jobs run in EU or US data centers, for more information, see our Legal Quick Guide (Data Location part)
  • All datasets are stored centrally in the eu-north1 (Finland) data center.
This design balances compliance requirements with operational simplicity.

Typical use cases

  • Adapting models to internal domain-specific data
  • Improving response quality for specific tasks
  • Training student models for distillation pipelines