sagnikM/grpo_sgd_qwen3-8b_3k_seqlen
TEXT GENERATION | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 32k | Published: Dec 25, 2025 | Architecture: Transformer | Cold

sagnikM/grpo_sgd_qwen3-8b_3k_seqlen is an 8-billion-parameter language model based on the Qwen3 architecture, with a 32,768-token (32k) context length and FP8 quantization. The name suggests a GRPO fine-tune of Qwen3-8B trained with SGD at a 3k sequence length, but the listing does not detail specific differentiators or primary use cases, so it may be a base model or a work in progress.
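As a minimal sketch, assuming the checkpoint is published on the Hugging Face Hub under the same id and loads with the standard `transformers` causal-LM API, it could be used like this:

```python
# Hypothetical usage sketch -- assumes this checkpoint is hosted on the
# Hugging Face Hub under the id below and is compatible with
# AutoModelForCausalLM / AutoTokenizer (not confirmed by the listing).

MODEL_ID = "sagnikM/grpo_sgd_qwen3-8b_3k_seqlen"
MAX_CONTEXT = 32_768  # 32k-token context length from the model card


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Load the model and run a single greedy generation pass."""
    # Imports are inside the function so the module can be inspected
    # without transformers/torch installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(generate("Summarize GRPO in one sentence."))
```

Note that an 8B FP8 checkpoint still requires roughly 8-10 GB of accelerator memory to serve, so a GPU-backed environment is assumed.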
