y-ohtani/qwen3-4b-ra-sft-epoch3
TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kPublished:Feb 19, 2026License:apache-2.0Architecture:Transformer Open Weights Warm

The y-ohtani/qwen3-4b-ra-sft-epoch3 is a 4 billion parameter Qwen3-based model, full fine-tuned by y-ohtani, specifically designed for multi-turn agentic reasoning with tool use. It excels at iteratively solving mathematical and coding problems by calling a code interpreter. This model is an intermediate checkpoint, optimized for agentic loops like Think-Code-Execute-Observe-Answer, and serves as a cold-start for subsequent reinforcement learning.

Loading preview...