shubhamrgandhi/qwen3-8b-full-sft-prm-opus-distill-32k-lr5e6-multiturn
The shubhamrgandhi/qwen3-8b-full-sft-prm-opus-distill-32k-lr5e6-multiturn model is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. It was trained on the prm_sft_train dataset with a 32K context length and a 5e-6 learning rate, and is aimed at multi-turn conversational tasks.
Model Overview
This model, shubhamrgandhi/qwen3-8b-full-sft-prm-opus-distill-32k-lr5e6-multiturn, is an 8-billion-parameter language model derived from Qwen/Qwen3-8B. It has been fine-tuned on the prm_sft_train dataset, which suggests it is optimized for instruction-following and conversational tasks.
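As a minimal loading sketch, assuming the standard Hugging Face transformers API (the dtype and device settings below are illustrative, not from the model card):

```python
# Minimal loading sketch; the model ID comes from this card,
# torch_dtype/device_map settings are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "shubhamrgandhi/qwen3-8b-full-sft-prm-opus-distill-32k-lr5e6-multiturn"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # pick the checkpoint's native precision
    device_map="auto",    # requires the accelerate package
)
```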
Key Training Details
- Base Model: Qwen/Qwen3-8B
- Fine-tuning Dataset: prm_sft_train
- Context Length: 32,768 tokens
- Learning Rate: 5e-06
- Optimizer: AdamW (fused) with specific beta and epsilon values
- Scheduler: Cosine learning rate scheduler with 0.1 warmup ratio
- Epochs: 3.0
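The hyperparameters above could be expressed with a transformers TrainingArguments configuration along these lines. This is an illustrative reconstruction, not the original training script; the batch size, gradient accumulation, precision, and AdamW beta/epsilon values are assumptions not taken from the card:

```python
# Illustrative reconstruction of the listed hyperparameters.
# Batch size, gradient accumulation, and bf16 are assumptions.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen3-8b-full-sft-prm-opus-distill-32k-lr5e6-multiturn",
    learning_rate=5e-6,                 # from the card
    lr_scheduler_type="cosine",         # cosine schedule, per the card
    warmup_ratio=0.1,                   # per the card
    num_train_epochs=3.0,               # per the card
    optim="adamw_torch_fused",          # fused AdamW, per the card
    per_device_train_batch_size=1,      # assumption
    gradient_accumulation_steps=8,      # assumption
    bf16=True,                          # assumption
)
```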
Potential Use Cases
Given its supervised fine-tuning (SFT) on the prm_sft_train dataset and its multi-turn designation, this model is likely suitable for the following (a brief usage sketch appears at the end of this section):
- Multi-turn dialogue systems
- Instruction-following applications
- Chatbot development
Further details on intended uses, limitations, and specific evaluation results are not provided in the original model card.
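As a quick illustration of multi-turn use, assuming the tokenizer ships the standard Qwen3 chat template and reusing the model and tokenizer loaded in the earlier sketch (the conversation content and generation length are purely examples):

```python
# Multi-turn chat sketch; assumes the standard Qwen3 chat template.
# `model` and `tokenizer` are the objects loaded in the earlier sketch.
messages = [
    {"role": "user", "content": "Summarize the main idea of gradient descent."},
    {"role": "assistant", "content": "Gradient descent iteratively updates parameters in the direction that reduces the loss."},
    {"role": "user", "content": "Now give a two-line code example."},
]

# Render the multi-turn history into model inputs and append the assistant prompt.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated assistant turn.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```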