shubhamrgandhi/qwen3-8b-full-sft-prm-opus-distill-32k-lr5e6_clean_think

Text Generation · Model Size: 8B · Quantization: FP8 · Context Length: 32k · Published: Mar 28, 2026 · License: other · Architecture: Transformer

shubhamrgandhi/qwen3-8b-full-sft-prm-opus-distill-32k-lr5e6_clean_think is an 8-billion-parameter language model fine-tuned by shubhamrgandhi from the Qwen3-8B base model. It was trained on the prm_sft_train dataset and supports a context length of 32,768 tokens, making it suitable for tasks that require extensive contextual understanding.


Model Overview

This model, qwen3-8b-full-sft-prm-opus-distill-32k-lr5e6_clean_think, is a fully fine-tuned variant of Qwen3-8B, trained by shubhamrgandhi on the prm_sft_train dataset. It retains the base model's 32,768-token context window, allowing it to process and generate long sequences of text.
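Since the model is a Qwen3-8B fine-tune, it should load with the standard Hugging Face transformers workflow. The sketch below is illustrative rather than taken from the card: it assumes the checkpoint is published on the Hub under the repo id shown, and the prompt is a placeholder.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: the checkpoint is available on the Hugging Face Hub under this id.
model_id = "shubhamrgandhi/qwen3-8b-full-sft-prm-opus-distill-32k-lr5e6_clean_think"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the dtype stored in the checkpoint
    device_map="auto",    # requires `accelerate`; spreads layers across GPUs
)

# Placeholder prompt; the model inherits the standard Qwen3 chat template.
messages = [{"role": "user", "content": "Explain why the sum of two even numbers is even."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```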

Training Details

The fine-tuning run used a learning rate of 5e-6 over 3 epochs, with a cosine learning rate scheduler and a warmup ratio of 0.1. Training was distributed across 8 GPUs and used the AdamW optimizer, a conventional recipe for full-parameter supervised fine-tuning at this scale.
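For reference, these hyperparameters map onto a transformers TrainingArguments configuration roughly as follows. This is a reconstruction, not the author's actual script: the batch size, precision, and output path are placeholders, and the 8-GPU layout would come from the launcher (e.g. torchrun --nproc_per_node=8) rather than from these arguments.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="qwen3-8b-full-sft-prm",  # placeholder path
    learning_rate=5e-6,                  # reported learning rate
    num_train_epochs=3,                  # reported epoch count
    lr_scheduler_type="cosine",          # reported scheduler
    warmup_ratio=0.1,                    # reported warmup ratio
    optim="adamw_torch",                 # AdamW, as reported
    per_device_train_batch_size=1,       # placeholder; not reported
    gradient_accumulation_steps=8,       # placeholder; not reported
    bf16=True,                           # placeholder precision choice
)
```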

Potential Use Cases

Given its Qwen3-8B foundation and specialized fine-tuning, this model should be best suited to applications within the prm_sft_train dataset's domain. Its large context window also makes it a reasonable choice for tasks that depend on long-range context, such as long-form content generation, multi-step question answering, or detailed summarization of lengthy documents (see the sketch below).
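As one concrete illustration of the long-context use case, the sketch below feeds a long document through the chat template while capping the input at the 32,768-token window. It reuses the model and tokenizer objects from the loading example above; the file name is hypothetical.

```python
# Hypothetical input file; any long plain-text document works.
with open("long_report.txt") as f:
    document = f.read()

messages = [{"role": "user", "content": f"Summarize the following report in detail:\n\n{document}"}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
    truncation=True,
    max_length=32768,  # stay within the model's 32,768-token context
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=1024)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```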