open-sci/sft__ot30k_Qwen3-1.7B-Base-DPO-Tulu3-decontaminated
The open-sci/sft__ot30k_Qwen3-1.7B-Base-DPO-Tulu3-decontaminated model is an approximately 2-billion-parameter language model fine-tuned from ali-elganzory's Qwen3-1.7B-Base-DPO-Tulu3-decontaminated. It features a 32K context length and is specialized through supervised fine-tuning on the open_thoughts3-1.2_m_30000_samples dataset, making it best suited for tasks that match the domain and style of that data.
Model Overview
This model, sft__ot30k_Qwen3-1.7B-Base-DPO-Tulu3-decontaminated, is a specialized version of the Qwen3-1.7B-Base-DPO-Tulu3-decontaminated architecture, with approximately 2 billion parameters and a 32K token context length. It has undergone supervised fine-tuning (SFT) on the open_thoughts3-1.2_m_30000_samples dataset, tailoring it to tasks related to the content and style of that dataset.
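The model can be loaded with the standard Hugging Face transformers API. A minimal sketch, assuming the repository ID matches the model name above and that accelerate is installed for device_map="auto":

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "open-sci/sft__ot30k_Qwen3-1.7B-Base-DPO-Tulu3-decontaminated"

# Load tokenizer and weights; torch_dtype="auto" keeps the checkpoint's
# native precision, and device_map="auto" places layers on available GPUs.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)
```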
Training Details
The fine-tuning process used a learning rate of 4e-05 with a total training batch size of 128 across 32 devices. The optimizer was ADAMW_TORCH_FUSED, and a cosine learning rate scheduler with a warmup ratio of 0.1 was employed over 5 epochs. This configuration reflects a focused effort to adapt the base model's capabilities to the target dataset.
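For reference, the reported hyperparameters map onto a transformers TrainingArguments setup roughly as follows. This is a sketch, not the actual training script; the per-device batch size of 4 (128 total / 32 devices, with no gradient accumulation) is an assumption about how the total batch was split:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="sft__ot30k_Qwen3-1.7B-Base-DPO-Tulu3-decontaminated",
    learning_rate=4e-5,
    per_device_train_batch_size=4,   # assumption: 128 total / 32 devices
    gradient_accumulation_steps=1,   # assumption: no accumulation
    num_train_epochs=5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch_fused",
)
```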
Key Characteristics
- Base Model: Fine-tuned from ali-elganzory/Qwen3-1.7B-Base-DPO-Tulu3-decontaminated.
- Parameter Count: Approximately 2 billion parameters.
- Context Length: Supports a context window of 32,768 tokens (see the config check below).
- Fine-tuning Data: Specialized on the open_thoughts3-1.2_m_30000_samples dataset.
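The advertised context window can be confirmed programmatically from the checkpoint's config. A small sketch, assuming the Qwen-family max_position_embeddings field carries the context length:

```python
from transformers import AutoConfig

# Inspect the context window reported by the checkpoint's config.
config = AutoConfig.from_pretrained(
    "open-sci/sft__ot30k_Qwen3-1.7B-Base-DPO-Tulu3-decontaminated"
)
print(config.max_position_embeddings)  # expected to report the 32K window
```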
Potential Use Cases
This model is best suited for applications where its fine-tuning on the open_thoughts3-1.2_m_30000_samples dataset provides a distinct advantage. Developers should favor it for tasks that align with the domain, style, or content distribution of that training data, since its performance is optimized for such scenarios.
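A minimal generation sketch, reusing the tokenizer and model objects from the loading snippet above; the prompt is illustrative only, and decoding settings should be tuned per task:

```python
# Tokenize an illustrative prompt and generate a completion.
prompt = "Explain the trade-offs between supervised fine-tuning and DPO."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```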