junseojang/Qwen3-1.7B-MATH-RLVR-250-RE
Text Generation | Concurrency Cost: 1 | Model Size: 2B | Quant: BF16 | Context Length: 32k | Published: Mar 14, 2026 | Architecture: Transformer

The junseojang/Qwen3-1.7B-MATH-RLVR-250-RE model is a 1.7-billion-parameter language model based on the Qwen3 architecture, developed by junseojang. It is fine-tuned for mathematical reasoning tasks using Reinforcement Learning with Verifiable Rewards (RLVR) for 250 training steps, as the model name indicates. With a context length of 32,768 tokens, it is suited to multi-step problem solving and detailed analytical work.
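A minimal usage sketch, assuming the model is hosted on the Hugging Face Hub under this repo id and loads with the standard `transformers` causal-LM API (the prompt template and generation settings below are illustrative, not prescribed by the model card):

```python
MODEL_ID = "junseojang/Qwen3-1.7B-MATH-RLVR-250-RE"


def build_prompt(problem: str) -> str:
    # Illustrative instruction-style prompt for a math question;
    # the model card does not specify a required template.
    return (
        "Solve the following problem step by step.\n\n"
        f"Problem: {problem}\nSolution:"
    )


def generate(problem: str, max_new_tokens: int = 512) -> str:
    # Heavy dependencies are imported lazily so the sketch can be
    # inspected without transformers/torch installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16  # card lists BF16 quantization
    )
    inputs = tokenizer(build_prompt(problem), return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    print(generate("What is 17 * 24?"))
```

Keeping the prompt text short leaves most of the 32k-token window available for the model's step-by-step solution.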
