khaire/qwen3-finetuned
khaire/qwen3-finetuned is a 0.8-billion-parameter causal language model fine-tuned from Qwen/Qwen3-0.6B, with a context length of 32,768 tokens. It was further trained on an unspecified dataset; its primary characteristics and specific use cases are not documented, but it represents a fine-tuned iteration of the Qwen3 architecture.
Model Overview
The khaire/qwen3-finetuned model is a 0.8-billion-parameter language model derived from the Qwen/Qwen3-0.6B base model. Its 32,768-token context length makes it suitable, in principle, for processing long input sequences.
Training Details
This model was fine-tuned using the following hyperparameters:
- Learning Rate: 2e-05
- Batch Size: 2 (train), 8 (eval)
- Gradient Accumulation Steps: 8 (resulting in a total effective batch size of 16)
- Optimizer: AdamW Torch Fused
- Epochs: 3
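The hyperparameters above can be collected into a single configuration. This is an illustrative sketch only: the field names mirror Hugging Face `TrainingArguments` conventions, but the actual trainer setup used for khaire/qwen3-finetuned is an assumption, not documented in the card.

```python
# Hypothetical reconstruction of the reported fine-tuning configuration.
# Field names follow Hugging Face TrainingArguments conventions; the
# actual training script for this model is not published.
hyperparams = {
    "learning_rate": 2e-05,
    "per_device_train_batch_size": 2,
    "per_device_eval_batch_size": 8,
    "gradient_accumulation_steps": 8,
    "optim": "adamw_torch_fused",
    "num_train_epochs": 3,
}

# Effective train batch size = per-device batch * accumulation steps
# (assuming a single device, which the card does not confirm):
# 2 * 8 = 16, matching the stated total effective batch size.
effective_batch = (
    hyperparams["per_device_train_batch_size"]
    * hyperparams["gradient_accumulation_steps"]
)
print(effective_batch)  # 16
```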
During training, the model's validation loss decreased from 3.1107 in the first epoch to 3.0508 by the third epoch. The specific dataset used for fine-tuning is not disclosed in the available documentation.
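The reported loss values correspond to a modest relative improvement, which can be checked directly:

```python
# Quick check of the reported validation-loss improvement
# (figures taken from the model card above).
loss_epoch_1 = 3.1107
loss_epoch_3 = 3.0508

relative_drop_pct = (loss_epoch_1 - loss_epoch_3) / loss_epoch_1 * 100
print(round(relative_drop_pct, 1))  # 1.9
```

That is, validation loss fell by roughly 1.9% over three epochs.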
Current Status and Limitations
The available documentation does not describe the model's specific capabilities, intended uses, or the nature of its training and evaluation data. Users should treat it as an experimental fine-tune whose differentiators and optimal use cases have not yet been established.