rl-rag/qwen3-8B-sft-mix-v20250921

Text generation · Concurrency cost: 1 · Model size: 8B · Quantization: FP8 · Context length: 32k · Published: Sep 21, 2025 · License: other · Architecture: Transformer

rl-rag/qwen3-8B-sft-mix-v20250921 is an 8 billion parameter language model fine-tuned from Qwen/Qwen3-8B. It was trained on the rl-rag/sft-mix-v20250921 dataset, which suggests an emphasis on instruction-following or mixed-task performance. With a context length of 32768 tokens, it is suited to applications that process moderately long inputs and generate coherent, relevant outputs aligned with its fine-tuning.


Overview

This model, rl-rag/qwen3-8B-sft-mix-v20250921, is an 8 billion parameter language model derived from the Qwen3-8B base architecture. It has undergone specific fine-tuning using the rl-rag/sft-mix-v20250921 dataset, suggesting an emphasis on instruction-following or a diverse set of tasks. The model supports a substantial context length of 32768 tokens, making it suitable for processing and generating content from moderately long inputs.
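
Below is a minimal loading and generation sketch, assuming the checkpoint is published under the repo id above and follows the standard transformers causal-LM layout; the prompt and generation settings are illustrative only.

```python
# Hedged sketch: load the fine-tuned checkpoint and generate a completion.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rl-rag/qwen3-8B-sft-mix-v20250921"  # repo id from this card

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # let transformers pick the checkpoint dtype
    device_map="auto",    # place layers across available devices
)

prompt = "Summarize the key ideas of retrieval-augmented generation."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```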

Training Details

Fine-tuning used a learning rate of 4e-05 and a total training batch size of 128 (8 devices × 16 gradient accumulation steps, implying a per-device batch size of 1), and ran for 5 epochs. The optimizer was ADAMW_TORCH with a cosine learning rate scheduler and a warmup ratio of 0.1.
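
For reference, the sketch below mirrors these hyperparameters in a transformers TrainingArguments configuration; the per-device batch size of 1 is an inference from 128 = 8 × 16 × 1, and the output directory name is illustrative.

```python
# Hedged sketch of a TrainingArguments setup matching the reported hyperparameters.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen3-8B-sft-mix-v20250921",  # hypothetical output path
    learning_rate=4e-5,
    per_device_train_batch_size=1,   # assumed: 8 devices x 16 accumulation steps -> 128 total
    gradient_accumulation_steps=16,
    num_train_epochs=5,
    optim="adamw_torch",
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
)
```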

Intended Use

Given its fine-tuning and 32k-token context window, this model is suited to applications that need specialized instruction-following or that work with long prompts. Developers should check whether the rl-rag/sft-mix-v20250921 dataset aligns with their use case; an instruction-style usage sketch follows.
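
Continuing from the loading sketch above, the example below formats an instruction-style request with the tokenizer's chat template; it assumes the fine-tune keeps Qwen3's chat format, which is not stated on this card.

```python
# Hedged usage sketch: instruction-style prompting via the chat template
# (assumes the tokenizer and model objects from the loading example above).
messages = [
    {"role": "user", "content": "List three practical uses of a 32k-token context window."},
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
# Strip the prompt tokens so only the model's reply is printed.
reply = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(reply)
```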