g4me/QwenRolina3-06B-base-LR1e5-b32g2gc8-AR-order-batch
g4me/QwenRolina3-06B-base-LR1e5-b32g2gc8-AR-order-batch is a 0.8-billion-parameter language model fine-tuned from Qwen/Qwen3-0.6B-Base using Supervised Fine-Tuning (SFT) with the TRL framework. It is designed for general text generation, retaining the Qwen3 architecture and its 32768-token context length.
Overview
This model, g4me/QwenRolina3-06B-base-LR1e5-b32g2gc8-AR-order-batch, is a 0.8-billion-parameter language model fine-tuned from the Qwen3-0.6B-Base checkpoint. Fine-tuning was performed with the TRL (Transformer Reinforcement Learning) library using a Supervised Fine-Tuning (SFT) approach.
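A minimal sketch of loading the checkpoint for text generation with the standard transformers API. The repository id comes from this card; the prompt and generation settings are illustrative defaults, not values recommended by the authors.

```python
# Hedged usage sketch: load the fine-tuned checkpoint and generate a
# continuation. Generation parameters here are illustrative, not tuned.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "g4me/QwenRolina3-06B-base-LR1e5-b32g2gc8-AR-order-batch"

def generate(prompt: str, max_new_tokens: int = 128) -> str:
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

if __name__ == "__main__":
    print(generate("The Qwen3 architecture is"))
```

Because this is a base-style SFT model rather than a chat model, plain-text prompting as above is the simplest starting point.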
Key Capabilities
- Base Model: Built upon the robust Qwen3-0.6B-Base, providing a strong foundation for language understanding and generation.
- Fine-Tuned Performance: Enhanced through SFT, suggesting improved performance on specific tasks or domains compared to its base model.
- Context Length: Supports a substantial context window of 32768 tokens, enabling the processing and generation of longer texts while maintaining coherence.
- Framework: Developed using TRL, a library known for facilitating advanced fine-tuning techniques for transformer models.
Training Details
The model was adapted with SFT, a common method for specializing pre-trained language models by training on labeled examples. Training used TRL 0.29.0, Transformers 5.2.0, PyTorch 2.8.0a0, Datasets 4.6.0, and Tokenizers 0.22.2.
Good For
- General text generation tasks requiring a model with a moderate parameter count.
- Applications benefiting from a large context window for processing extensive inputs or generating detailed outputs.
- Developers looking for a fine-tuned Qwen3 variant for experimentation or deployment.