farffadet/syllogym-judge-qwen3-4b-grpo-v3

Hosted on Hugging Face

Text generation · Model size: 4B · Quant: BF16 · Context length: 32k · Published: Mar 24, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

farffadet/syllogym-judge-qwen3-4b-grpo-v3 is a 4-billion-parameter Qwen3-based model, developed by farffadet and fine-tuned using Unsloth and Hugging Face's TRL library. Unsloth's optimizations let the model train roughly 2x faster than a standard fine-tuning setup. It is designed for general language tasks and inherits the Qwen3 architecture's 32,768-token context length.


Model Overview

farffadet/syllogym-judge-qwen3-4b-grpo-v3 is a 4-billion-parameter Qwen3-based language model developed by farffadet. It was fine-tuned from the unsloth/Qwen3-4B-unsloth-bnb-4bit base model, using Unsloth and Hugging Face's TRL library for accelerated training.

Key Characteristics

  • Architecture: Based on the Qwen3 family of models.
  • Parameter Count: 4 billion parameters, offering a balance between performance and computational efficiency.
  • Training Efficiency: Fine-tuned with Unsloth, which the base model's tooling advertises as roughly 2x faster than standard fine-tuning implementations.
  • Context Length: Supports a substantial context window of 32768 tokens, suitable for processing longer inputs and generating coherent, extended outputs.
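The card's name and base checkpoint suggest a GRPO fine-tune built on Unsloth's 4-bit Qwen3 base. A minimal sketch of what such a setup can look like with Unsloth and TRL is below; the dataset path, reward logic, LoRA rank, and hyperparameters are illustrative assumptions, not the author's actual recipe.

```python
# Hedged sketch: GRPO fine-tuning with Unsloth + TRL, assuming the
# unsloth/Qwen3-4B-unsloth-bnb-4bit base named on this card.
# The reward function and dataset below are toy placeholders.

def keyword_reward(completions, **kwargs):
    """Toy reward: 1.0 if a completion states a verdict keyword, else 0.0.
    TRL's GRPOTrainer accepts plain functions mapping a batch of
    completions to per-sample float rewards (string completions assumed)."""
    return [1.0 if "valid" in c.lower() else 0.0 for c in completions]

if __name__ == "__main__":
    # Heavy imports deferred so the reward helper stays importable on its own.
    from unsloth import FastLanguageModel
    from trl import GRPOConfig, GRPOTrainer
    from datasets import load_dataset

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name="unsloth/Qwen3-4B-unsloth-bnb-4bit",  # base from the card
        max_seq_length=32768,   # matches the advertised context window
        load_in_4bit=True,
    )
    model = FastLanguageModel.get_peft_model(model, r=16)  # LoRA adapters (rank is a guess)

    dataset = load_dataset("json", data_files="prompts.jsonl", split="train")  # placeholder data
    trainer = GRPOTrainer(
        model=model,
        reward_funcs=[keyword_reward],
        args=GRPOConfig(output_dir="grpo-out", max_prompt_length=1024),
        train_dataset=dataset,
    )
    trainer.train()
```

In GRPO, several completions are sampled per prompt and each group's rewards are normalized against the group mean, so even a coarse scalar reward like the one above can provide a usable learning signal.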

Potential Use Cases

This model is suitable for a variety of general language understanding and generation tasks, benefiting from its efficient training and robust base architecture. Its 32K context length makes it particularly useful for applications requiring detailed comprehension of lengthy documents or generating comprehensive responses.
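For readers who want to try the model, a minimal inference sketch with the standard Hugging Face transformers API follows. It assumes the checkpoint is public and that enough GPU memory is available for BF16 weights; the example prompt is purely illustrative.

```python
# Hedged sketch: loading farffadet/syllogym-judge-qwen3-4b-grpo-v3 with
# transformers and generating a short response via the chat template.

MODEL_ID = "farffadet/syllogym-judge-qwen3-4b-grpo-v3"

def build_messages(user_prompt: str) -> list[dict]:
    """Wrap a user prompt in the chat-message format consumed by
    tokenizer.apply_chat_template."""
    return [{"role": "user", "content": user_prompt}]

if __name__ == "__main__":
    # Heavy imports deferred until the model is actually loaded.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )

    messages = build_messages(
        "All A are B; all B are C; therefore all A are C. Is this valid?"
    )
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    out = model.generate(inputs, max_new_tokens=256)
    print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```

With the 32K context window, much longer documents can be placed in the user message, subject to available memory for the resulting KV cache.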