farffadet/syllogym-judge-qwen3-4b-grpo-v2
TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kPublished:Mar 22, 2026License:apache-2.0Architecture:Transformer Open Weights Warm

The farffadet/syllogym-judge-qwen3-4b-grpo-v2 is a 4 billion parameter Qwen3 model developed by farffadet, fine-tuned from unsloth/Qwen3-4B-unsloth-bnb-4bit. This model was trained using Unsloth and Huggingface's TRL library, enabling faster training. It is designed for general language tasks, leveraging its Qwen3 architecture and 32768 token context length.

Loading preview...