farffadet/syllogym-judge-qwen3-4b-grpo-v3
TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kPublished:Mar 24, 2026License:apache-2.0Architecture:Transformer Open Weights Warm
The farffadet/syllogym-judge-qwen3-4b-grpo-v3 is a 4 billion parameter Qwen3 model, developed by farffadet, fine-tuned using Unsloth and Huggingface's TRL library. This model was trained significantly faster, leveraging optimized techniques for efficiency. It is designed for general language tasks, building upon the Qwen3 architecture with a 32768 token context length.
Loading preview...