open-sci/sft__ot30k_Qwen3-1.7B-Base-SFT-Tulu3-decontaminated
The open-sci/sft__ot30k_Qwen3-1.7B-Base-SFT-Tulu3-decontaminated model is an approximately 2-billion-parameter language model fine-tuned from ali-elganzory/Qwen3-1.7B-Base-SFT-Tulu3-decontaminated. It was trained on the open_thoughts3-1.2_m_30000_samples dataset, which suggests it is tuned for conversational or instruction-following tasks. With a 32K-token context length, the model is suitable for applications that need to process longer text sequences.
Model Overview
This model, open-sci/sft__ot30k_Qwen3-1.7B-Base-SFT-Tulu3-decontaminated, is a fine-tuned variant of the ali-elganzory/Qwen3-1.7B-Base-SFT-Tulu3-decontaminated base model. It features approximately 2 billion parameters and supports a context length of 32,768 tokens, enabling it to handle extensive textual inputs and outputs.
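The checkpoint can be loaded with the standard Hugging Face transformers API. The snippet below is a minimal sketch, assuming the repository id above resolves on the Hub and that torch and accelerate are installed; it is not an official usage example from the model authors.

```python
# Minimal loading sketch using the Hugging Face transformers library.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "open-sci/sft__ot30k_Qwen3-1.7B-Base-SFT-Tulu3-decontaminated"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the checkpoint's native precision
    device_map="auto",    # requires accelerate; places weights on available GPU(s)/CPU
)
```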
Training Details
The model was fine-tuned on a 30,000-sample subset of the OpenThoughts3 data, stored locally at /gpfs/scratch/ehpc524/ot/hf_hub/datasets/open_thoughts_open_thoughts3-1.2_m_30000_samples/default/0.0.0/f679a5c592c8dffb. Key training hyperparameters included a learning rate of 4e-05, a total train batch size of 128 (32 devices with 4 gradient accumulation steps, i.e. a per-device batch size of 1), and 5 epochs. The optimizer was ADAMW_TORCH_FUSED with a cosine learning-rate schedule and a warmup ratio of 0.1.
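The training script itself is not included with the card, but the reported hyperparameters map naturally onto transformers.TrainingArguments. The following is a hypothetical reconstruction for reference only; the output path and the per-device batch size of 1 are inferred rather than confirmed.

```python
# Hypothetical reconstruction of the reported hyperparameters; not the
# authors' actual training configuration.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="sft__ot30k",            # placeholder output path
    learning_rate=4e-5,
    per_device_train_batch_size=1,      # inferred: 32 devices x 4 accumulation steps x 1 = 128 total
    gradient_accumulation_steps=4,
    num_train_epochs=5,
    optim="adamw_torch_fused",
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,                   # assumption: "0.1 warmup" read as a warmup ratio
)
```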
Potential Use Cases
Given its supervised fine-tuning on a sample drawn from the OpenThoughts3 dataset, this model is potentially well suited for:
- Instruction following and dialogue generation: its SFT (Supervised Fine-Tuning) lineage suggests improved performance when responding to explicit instructions or engaging in conversational exchanges (see the generation sketch after this list).
- Long-context understanding: The 32K context window makes it capable of processing and generating coherent text over extended passages, useful for summarization or detailed content creation.
- Research and experimentation: As a fine-tuned model, it provides a base for further domain-specific adaptation or exploration of its learned capabilities.
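As a concrete illustration of the instruction-following use case, the sketch below reuses the model and tokenizer from the loading example above and assumes the tokenizer ships a chat template, as Qwen3-derived checkpoints typically do; the prompt and generation settings are placeholders.

```python
# Reuses `model` and `tokenizer` from the loading sketch above.
messages = [
    {"role": "user", "content": "Summarize the key points of the text above in three bullets."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```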