`rl-rag/qwen3-8B-sft-mix-v20250921-plus-v20251001-onpolicy-rs-longform_0921` is an 8-billion-parameter language model fine-tuned from `rl-rag/qwen3-8B-sft-mix-v20250921`. It features a 32,768-token context length and was trained on an on-policy reinforcement-learning dataset for long-form generation, optimizing it for tasks that require extended, coherent text output.
## Model Overview
This model, `qwen3-8B-sft-mix-v20250921-plus-v20251001-onpolicy-rs-longform_0921`, is an 8-billion-parameter language model with a substantial 32,768-token context length. It is a fine-tuned iteration of the `rl-rag/qwen3-8B-sft-mix-v20250921` base model.
## Key Characteristics
- Base Model: Fine-tuned from `rl-rag/qwen3-8B-sft-mix-v20250921`.
- Context Length: Supports a long context window of 32,768 tokens, enabling processing and generation of extensive text.
- Specialized Fine-tuning: The model underwent specific fine-tuning on the `rl-rag/sft-mix-v20251001-onpolicy-rs-longform_0921` dataset. This indicates an optimization for tasks involving long-form content generation, likely leveraging on-policy reinforcement learning strategies.
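Before sending a long prompt to a 32,768-token model it is useful to check that it will fit. The sketch below is illustrative only: it uses a rough ~4-characters-per-token heuristic (an assumption; accurate counts require the model's actual Qwen3 tokenizer) and a hypothetical `fits_in_context` helper.

```python
# Rough check that a prompt fits the 32,768-token context window.
# Uses a ~4 chars/token heuristic instead of the real tokenizer (assumption).
CONTEXT_LENGTH = 32_768
CHARS_PER_TOKEN = 4  # rough average for English text

def fits_in_context(prompt: str, reserved_for_output: int = 2_048) -> bool:
    """Return True if the prompt likely fits, leaving room for generation."""
    estimated_tokens = len(prompt) // CHARS_PER_TOKEN + 1
    return estimated_tokens + reserved_for_output <= CONTEXT_LENGTH

print(fits_in_context("hello " * 100))   # short prompt -> True
print(fits_in_context("x" * 1_000_000))  # ~250k estimated tokens -> False
```

In practice you would replace the heuristic with a call to the model's tokenizer and count the returned token IDs.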
## Training Details
The training procedure involved:
- Learning Rate: 4e-05
- Batch Size: A `train_batch_size` of 1 with `gradient_accumulation_steps` of 16, resulting in a `total_train_batch_size` of 128 (consistent with distributed training across 8 devices: 1 × 16 × 8 = 128).
- Optimizer: AdamW with betas=(0.9, 0.999) and epsilon=1e-08.
- Scheduler: Cosine learning rate scheduler with a 0.1 warmup ratio.
- Epochs: Trained for 5 epochs.
- Frameworks: Utilized Transformers 4.52.4, PyTorch 2.8.0+cu128, Datasets 3.6.0, and Tokenizers 0.21.1.
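The schedule above can be sketched in a few lines: linear warmup over the first 10% of steps up to the 4e-05 peak, then cosine decay to zero. The peak learning rate and warmup ratio come from the card; the total step count below is an illustrative placeholder, not a value reported for this training run.

```python
import math

# Sketch of the reported schedule: cosine decay with a 0.1 warmup ratio and a
# peak learning rate of 4e-05. TOTAL_STEPS is illustrative (assumption).
PEAK_LR = 4e-05
TOTAL_STEPS = 1_000
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)

def lr_at(step: int) -> float:
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS  # linear warmup
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return 0.5 * PEAK_LR * (1 + math.cos(math.pi * progress))  # cosine decay

print(lr_at(0))             # 0.0
print(lr_at(WARMUP_STEPS))  # 4e-05 (peak, reached at end of warmup)
print(lr_at(TOTAL_STEPS))   # 0.0 (fully decayed)
```

This mirrors what `transformers`' cosine scheduler computes per optimizer step; the real trainer derives the step count from the dataset size, batch size, and epoch count.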
## Potential Use Cases
Given its fine-tuning on a long-form dataset, this model is likely well-suited for applications requiring:
- Extended Text Generation: Creating detailed articles, reports, stories, or other lengthy documents.
- Summarization of Long Documents: Processing and summarizing very long texts due to its large context window.
- Complex Question Answering: Answering questions that require synthesizing information from extensive source material.
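For documents that exceed even a 32,768-token window, a common pattern is map-reduce summarization: summarize chunks independently, then summarize the partial summaries. The sketch below shows only the control flow; `summarize_chunk` is a hypothetical stand-in for a model call (here it just truncates, so the example runs without model weights), and the chunk size assumes the ~4-chars-per-token heuristic.

```python
# Map-reduce summarization sketch for documents longer than the context window.
# `summarize_chunk` is a hypothetical placeholder for an actual model call.
def summarize_chunk(text: str, max_chars: int = 200) -> str:
    return text[:max_chars]  # stand-in: a real call would generate a summary

def summarize_long_document(document: str, chunk_chars: int = 100_000) -> str:
    # Map: split into chunks that each fit the context window (~4 chars/token).
    chunks = [document[i:i + chunk_chars]
              for i in range(0, len(document), chunk_chars)]
    partial = [summarize_chunk(c) for c in chunks]
    # Reduce: condense the concatenated partial summaries in one final pass.
    return summarize_chunk(" ".join(partial), max_chars=500)

doc = "section text " * 50_000  # ~650k characters, far beyond one window
print(len(summarize_long_document(doc)) <= 500)  # True
```

With a real model behind `summarize_chunk`, the reduce step benefits directly from the long context window, since many partial summaries can be condensed in a single pass.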