shubhamrgandhi/qwen3-8b-full-sft-prm-opus-distill-32k-lr5e6-multiturn

Text generation · Model size: 8B · Quantization: FP8 · Context length: 32K · Published: Apr 29, 2026 · License: other · Architecture: Transformer

The shubhamrgandhi/qwen3-8b-full-sft-prm-opus-distill-32k-lr5e6-multiturn model is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B on the prm_sft_train dataset with a 32K context length. It is optimized for multi-turn conversational tasks and was trained with a 5e-6 learning rate over three epochs.


Model Overview

This model, shubhamrgandhi/qwen3-8b-full-sft-prm-opus-distill-32k-lr5e6-multiturn, is an 8-billion-parameter language model fine-tuned from the Qwen/Qwen3-8B base model. Its training on the prm_sft_train dataset indicates an emphasis on instruction-following and multi-turn conversational tasks.

Key Training Details

  • Base Model: Qwen/Qwen3-8B
  • Fine-tuning Dataset: prm_sft_train
  • Context Length: 32,768 tokens
  • Learning Rate: 5e-06
  • Optimizer: AdamW (fused) with specific beta and epsilon values
  • Scheduler: Cosine learning rate scheduler with 0.1 warmup ratio
  • Epochs: 3.0
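
The hyperparameters above map naturally onto a Hugging Face `transformers` `TrainingArguments` configuration. The following is a minimal sketch under the assumption that training used the standard `Trainer` API; batch size, gradient accumulation, precision, and the output directory are illustrative assumptions not stated in the original card.

```python
from transformers import TrainingArguments

# Sketch of a TrainingArguments setup matching the reported hyperparameters.
# Values not listed in the card (batch size, precision, output_dir) are assumptions.
training_args = TrainingArguments(
    output_dir="qwen3-8b-prm-sft",      # assumed output path
    num_train_epochs=3.0,               # reported: 3 epochs
    learning_rate=5e-6,                 # reported: 5e-06
    lr_scheduler_type="cosine",         # reported: cosine scheduler
    warmup_ratio=0.1,                   # reported: 0.1 warmup ratio
    optim="adamw_torch_fused",          # reported: fused AdamW
    per_device_train_batch_size=1,      # assumption, not in the card
    gradient_accumulation_steps=8,      # assumption, not in the card
    bf16=True,                          # assumption, common for full SFT at this scale
)
```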

Potential Use Cases

Given its supervised fine-tuning (SFT) on the prm_sft_train dataset and its multi-turn designation, this model is likely suitable for the following (a minimal usage sketch follows the list):

  • Multi-turn dialogue systems
  • Instruction-following applications
  • Chatbot development
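
The original card does not include a usage snippet, so the following is a minimal multi-turn inference sketch using the Hugging Face `transformers` library. The example conversation and generation settings are illustrative assumptions, not part of the original card.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "shubhamrgandhi/qwen3-8b-full-sft-prm-opus-distill-32k-lr5e6-multiturn"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# A multi-turn conversation expressed as a chat-template message list.
messages = [
    {"role": "user", "content": "What is supervised fine-tuning?"},
    {"role": "assistant", "content": "Supervised fine-tuning trains a model on labeled input-output pairs."},
    {"role": "user", "content": "How does it differ from pretraining?"},
]

# Render the conversation with the model's chat template and generate the next assistant turn.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```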

Further details on intended uses, limitations, and specific evaluation results are not provided in the original model card.