g4me/QwenRolina3-Base-LR1e5-wsd-b32g2gc8-order-domain-2ep

Text generation · Model size: 2B · Quant: BF16 · Context length: 32k · Published: Mar 3, 2026 · Architecture: Transformer

QwenRolina3-Base-LR1e5-wsd-b32g2gc8-order-domain-2ep is an approximately 2-billion-parameter language model fine-tuned from Qwen3-1.7B-Base, with a 32768-token context length. It was trained with the TRL framework and specializes in general text generation. It is designed for applications that need a compact yet capable base model, either for further adaptation or for direct use in conversational AI.

Model Overview

This model, QwenRolina3-Base-LR1e5-wsd-b32g2gc8-order-domain-2ep, is a 2 billion parameter language model derived from the Qwen3-1.7B-Base architecture. It was fine-tuned with the TRL (Transformer Reinforcement Learning) library using supervised fine-tuning (SFT).
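
A minimal loading-and-generation sketch with the Hugging Face transformers library is shown below. The prompt, sampling settings, and generation length are illustrative choices, not values taken from this card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "g4me/QwenRolina3-Base-LR1e5-wsd-b32g2gc8-order-domain-2ep"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the card lists BF16 weights ("dtype" on newer transformers releases)
    device_map="auto",           # requires the accelerate package
)

prompt = "Briefly explain what supervised fine-tuning does to a base language model."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```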

Key Characteristics

  • Base Model: Fine-tuned from Qwen/Qwen3-1.7B-Base.
  • Parameter Count: Approximately 2 billion parameters.
  • Context Length: Supports a substantial context window of 32768 tokens.
  • Training Method: Supervised fine-tuning (SFT) via the TRL library; a hedged reconstruction sketch follows this list.
  • Framework Versions: Developed with TRL 0.29.0, Transformers 5.2.0, PyTorch 2.8.0a0, Datasets 4.6.0, and Tokenizers 0.22.2.
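
The card does not publish the training script, but parts of the model name (LR1e5, wsd, 2ep) hint at hyperparameters. A hypothetical reconstruction with TRL's SFTTrainer might look like the following; the dataset and every hyperparameter decoded from the name are assumptions, not confirmed values.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder dataset: the actual training data is not disclosed on this card.
dataset = load_dataset("trl-lib/Capybara", split="train")

config = SFTConfig(
    output_dir="QwenRolina3-Base-sft",
    learning_rate=1e-5,               # "LR1e5" in the model name presumably means 1e-5
    num_train_epochs=2,               # "2ep" presumably means two epochs
    lr_scheduler_type="warmup_stable_decay",  # "wsd" likely names this schedule; some
                                              # transformers versions also need lr_scheduler_kwargs
    per_device_train_batch_size=2,    # "b32g2gc8" may encode batch/accumulation sizes; guesses
    gradient_accumulation_steps=8,
    bf16=True,                        # matches the BF16 weights listed above
)

trainer = SFTTrainer(
    model="Qwen/Qwen3-1.7B-Base",     # TRL accepts a model id string here
    args=config,
    train_dataset=dataset,
)
trainer.train()
```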

Intended Use Cases

This model is suited to general text generation tasks, particularly where a compact model with a large context window is beneficial. As a fine-tune, it should outperform its base model on tasks aligned with its training data. Developers can use it for applications requiring robust language understanding and generation, or as a foundation for further domain-specific fine-tuning, as sketched below.
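
For domain adaptation on modest hardware, one option (not prescribed by this card) is parameter-efficient fine-tuning with LoRA via the peft library. The rank, alpha, and target module names below are common starting points for Qwen-style attention layers and are assumptions, not values from this repository.

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model_id = "g4me/QwenRolina3-Base-LR1e5-wsd-b32g2gc8-order-domain-2ep"
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

lora_config = LoraConfig(
    r=16,                        # adapter rank; a typical starting point, not tuned
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections in Qwen-style blocks
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only adapter weights train; the ~2B base stays frozen
```

The wrapped model can then be trained with the SFTTrainer shown earlier; alternatively, TRL's SFTTrainer accepts a peft_config argument and applies the adapter itself.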