open-sci/sft__ot30k_Qwen2.5-1.5B-SFT-Tulu3-decontaminated
The open-sci/sft__ot30k_Qwen2.5-1.5B-SFT-Tulu3-decontaminated model is a 1.5-billion-parameter language model fine-tuned from ali-elganzory/Qwen2.5-1.5B-SFT-Tulu3-decontaminated. It was trained on the open_thoughts3-1.2_m_30000_samples dataset with a 32768-token context length, and is best suited for tasks that match the content and domain of that dataset.
Model Overview
This model, open-sci/sft__ot30k_Qwen2.5-1.5B-SFT-Tulu3-decontaminated, is a 1.5-billion-parameter language model produced by supervised fine-tuning (SFT) of the ali-elganzory/Qwen2.5-1.5B-SFT-Tulu3-decontaminated base model.
Training Details
The model was fine-tuned on a local Hugging Face cache copy of the open_thoughts3-1.2_m_30000_samples dataset (path: /gpfs/scratch/ehpc524/ot/hf_hub/datasets/open_thoughts_open_thoughts3-1.2_m_30000_samples/default/0.0.0/f679a5c592c8dffb). Key training hyperparameters, also expressed as a configuration sketch after this list, included:
- Learning Rate: 4e-05
- Batch Size: 1 per device (train), 8 per device (eval)
- Gradient Accumulation: 4 steps, for a total effective batch size of 128 across devices
- Optimizer: ADAMW_TORCH_FUSED
- Scheduler: Cosine with 0.1 warmup ratio
- Epochs: 5.0
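The listed values can be expressed with the Hugging Face transformers TrainingArguments API. This is a minimal sketch of such a configuration, not the original training script; output_dir is a placeholder, and any settings not listed above (precision, logging, evaluation strategy) are omitted.

```python
from transformers import TrainingArguments

# Sketch of the reported hyperparameters; output_dir is a placeholder,
# not taken from the original run.
training_args = TrainingArguments(
    output_dir="sft__ot30k_Qwen2.5-1.5B-SFT-Tulu3-decontaminated",
    learning_rate=4e-05,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,   # effective batch size of 128 across devices
    optim="adamw_torch_fused",
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=5.0,
)
```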
Intended Use
Given its fine-tuning on a specific dataset, this model is best suited for tasks and applications that align with the content and domain of the open_thoughts3-1.2_m_30000_samples dataset. Its 1.5-billion-parameter size keeps deployment costs modest, and its 32768-token context length allows it to handle long inputs.
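The snippet below is a minimal loading and generation example, assuming the checkpoint is available on the Hugging Face Hub under the model ID above and uses the standard Qwen2.5 chat template; the prompt is illustrative only.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "open-sci/sft__ot30k_Qwen2.5-1.5B-SFT-Tulu3-decontaminated"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Build a chat-formatted prompt (assumes the tokenizer ships a chat template).
messages = [{"role": "user", "content": "Explain gradient accumulation in one paragraph."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```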