g4me/QwenRolina3-Base-LR1e5-WSD-b32g2gc8-order-domain-3ep

Text Generation · Concurrency Cost: 1 · Model Size: 2B · Quant: BF16 · Ctx Length: 32k · Published: Mar 8, 2026 · Architecture: Transformer Gated Cold

g4me/QwenRolina3-Base-LR1e5-WSD-b32g2gc8-order-domain-3ep is a 2-billion-parameter language model fine-tuned from Qwen/Qwen3-1.7B-Base. It was trained with the TRL library at a context length of 32768 tokens and is intended for general text generation, producing coherent, contextually relevant output.


Model Overview

This model, g4me/QwenRolina3-Base-LR1e5-WSD-b32g2gc8-order-domain-3ep, is a 2-billion-parameter language model built on the Qwen3-1.7B-Base architecture. It has been fine-tuned with the TRL (Transformer Reinforcement Learning) library, i.e., optimized beyond standard pre-training.

Key Characteristics

  • Base Model: Fine-tuned from Qwen/Qwen3-1.7B-Base.
  • Parameter Count: 2 billion parameters, offering a balance between performance and computational efficiency.
  • Context Length: Supports a substantial context window of 32768 tokens, allowing for processing and generating longer sequences of text.
  • Training Framework: Fine-tuned with TRL (version 0.29.0) via Supervised Fine-Tuning (SFT); see the configuration sketch after this list.
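
The exact training recipe is not published. As a rough illustration, the run name appears to encode hyperparameters such as a 1e-5 learning rate ("LR1e5"), a warmup-stable-decay schedule ("WSD"), and 3 epochs ("3ep"). The following is a minimal, hypothetical SFT sketch with TRL's SFTTrainer under those assumptions; the dataset and the remaining values are placeholders, not the actual recipe, and parameter names follow recent TRL versions.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder corpus; the model's actual SFT data is not documented.
dataset = load_dataset("trl-lib/Capybara", split="train")

config = SFTConfig(
    output_dir="qwen3-sft",
    learning_rate=1e-5,                       # "LR1e5" in the run name (assumed)
    lr_scheduler_type="warmup_stable_decay",  # "WSD" in the run name (assumed)
    lr_scheduler_kwargs={"num_decay_steps": 100},  # WSD needs explicit decay steps; placeholder value
    num_train_epochs=3,                       # "3ep" in the run name (assumed)
    max_length=32768,                         # matches the advertised context window
    bf16=True,                                # matches the published BF16 precision
)

trainer = SFTTrainer(
    model="Qwen/Qwen3-1.7B-Base",
    args=config,
    train_dataset=dataset,
)
trainer.train()
```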

Intended Use Cases

This model is suitable for a variety of text generation tasks where a robust, fine-tuned base model with a large context window is beneficial. Its fine-tuning with TRL suggests potential for improved instruction following or specific task performance, making it a good candidate for:

  • General conversational AI.
  • Content creation requiring longer context understanding.
  • Question answering over extensive input (see the token-budget sketch after this list).
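
For long-input use cases, it helps to verify that the prompt fits the 32k window before generating. A minimal sketch, where the file name and the 256-token answer budget are arbitrary placeholders:

```python
from transformers import AutoTokenizer

model_id = "g4me/QwenRolina3-Base-LR1e5-WSD-b32g2gc8-order-domain-3ep"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Hypothetical long document; reserve part of the 32768-token window for the answer.
with open("report.txt") as f:
    document = f.read()

prompt = f"{document}\n\nQuestion: What are the main findings?\nAnswer:"
input_ids = tokenizer(prompt, truncation=True, max_length=32768 - 256).input_ids
print(f"Prompt occupies {len(input_ids)} of 32768 tokens")
```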

Developers can quickly integrate this model using the transformers library, as shown in the quick start example below.
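
Since the original quick start snippet is not reproduced here, the following is a minimal sketch of standard transformers usage; the prompt text is arbitrary:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "g4me/QwenRolina3-Base-LR1e5-WSD-b32g2gc8-order-domain-3ep"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the model is published in BF16
    device_map="auto",
)

# Base-model-style completion; no chat template is assumed.
inputs = tokenizer(
    "The key advantages of long-context language models are",
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```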