minsu0567/Uni-IAD-R2-Qwen3.5-GRPO-si

VISION | Concurrency Cost: 1 | Model Size: 4.5B | Quant: BF16 | Ctx Length: 32k | Tool Calling: Supported | Published: Jun 9, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

minsu0567/Uni-IAD-R2-Qwen3.5-GRPO-si is a 4.5-billion-parameter Qwen3.5 model developed by minsu0567 and fine-tuned from minsu0567/Uni-IAD-R2-Qwen3.5-si. It was trained 2x faster using Unsloth and Hugging Face's TRL library, and it supports a context length of 32,768 tokens. The model targets applications that need a capable yet resource-conscious language model.

Model Overview

minsu0567/Uni-IAD-R2-Qwen3.5-GRPO-si is a 4.5-billion-parameter language model fine-tuned by minsu0567. It is based on the Qwen3.5 architecture and was fine-tuned from the minsu0567/Uni-IAD-R2-Qwen3.5-si model.

Key Characteristics

  • Efficient Training: The model was trained 2x faster by using the Unsloth library together with Hugging Face's TRL (Transformer Reinforcement Learning) library.
  • Context Length: It supports a context window of 32,768 tokens (32K), allowing longer inputs and extended outputs.
  • License: The model is released under the Apache-2.0 license, permitting broad use and distribution.
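The context length above can be turned into a simple token budget. The following is a minimal sketch, assuming that the prompt and the generated tokens share the same 32,768-token window, as is typical for decoder-only transformers. The helper names `remaining_generation_budget` and `fits_in_context` are hypothetical, and the token counts are expected to come from the model's own tokenizer, which is not loaded here.

```python
CONTEXT_LENGTH = 32768  # context window stated on this model card


def remaining_generation_budget(prompt_tokens: int,
                                context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many tokens may still be generated after the prompt.

    Assumption: prompt and output share one window of `context_length` tokens.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # A prompt longer than the window leaves no room for generation.
    return max(context_length - prompt_tokens, 0)


def fits_in_context(prompt_tokens: int, max_new_tokens: int,
                    context_length: int = CONTEXT_LENGTH) -> bool:
    """Check whether prompt plus requested output fits in the window."""
    return prompt_tokens + max_new_tokens <= context_length


# Example: a 30,000-token prompt leaves 2,768 tokens for the response.
print(remaining_generation_budget(30000))  # 2768
print(fits_in_context(30000, 4096))        # False
```

A check like this can run before a request is sent, so that an over-long prompt is truncated or split up front instead of failing at generation time.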

Use Cases

This model is particularly well-suited for developers and researchers looking for:

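  • Resource-efficient deployments: At 4.5B parameters in BF16 (2 bytes per parameter), the weights occupy roughly 9 GB. The 2x training speedup describes how the model was trained and does not by itself imply faster inference.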
  • Applications requiring long context: The 32K context window is beneficial for tasks like summarization of lengthy documents, complex question answering, or maintaining extended conversational history.
  • Further experimentation and fine-tuning: Because it is itself a fine-tuned checkpoint, it can serve as a starting point for domain-specific adaptation.
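For deployment planning, the weight footprint can be estimated from the figures listed on this card (4.5B parameters, BF16). This is a rough sketch, not a measurement: it assumes 2 bytes per BF16 parameter and counts only the weights, so the KV cache, activations, and framework overhead come on top. The function name `estimate_weight_memory_gib` is hypothetical.

```python
def estimate_weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Estimate the memory needed for model weights, in GiB.

    Assumption: every parameter is stored at `bytes_per_param` bytes
    (2 for BF16). KV cache and activations are not included.
    """
    return num_params * bytes_per_param / 2**30


# 4.5e9 parameters * 2 bytes = 9e9 bytes, i.e. about 8.38 GiB of weights.
weights_gib = estimate_weight_memory_gib(4.5e9)
print(f"{weights_gib:.2f} GiB")  # 8.38 GiB
```

Long-context workloads need additional headroom beyond this figure, since the KV cache grows with the number of tokens held in the window.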