krishnaTO/qwen3-finetuned
  • Task: Text generation
  • Concurrency Cost: 1
  • Model Size: 0.8B
  • Quantization: BF16
  • Context Length: 32k
  • Published: Mar 31, 2026
  • License: apache-2.0
  • Architecture: Transformer (open weights)

The krishnaTO/qwen3-finetuned model is a fine-tuned version of the Qwen/Qwen3-0.6B architecture, with 0.8 billion parameters and a context length of 32768 tokens. Developed by krishnaTO, it was fine-tuned for a single epoch, reaching a validation loss of 3.7385. Its specific applications and differentiators are not detailed in the available documentation, which suggests it may be a foundational or experimental fine-tune.


Model Overview

krishnaTO/qwen3-finetuned is built on the Qwen/Qwen3-0.6B base model. It has 0.8 billion parameters and supports a 32768-token context window. Fine-tuning ran for a single epoch and ended at a validation loss of 3.7385.
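
Assuming the repository follows the standard Hugging Face layout for Qwen3 checkpoints, a minimal loading-and-generation sketch might look like the following; the prompt and generation settings are illustrative, not part of the card:

```python
# Minimal sketch: load the checkpoint with Hugging Face transformers.
# Assumes a standard Qwen3 repo layout; adjust dtype/device for your hardware.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "krishnaTO/qwen3-finetuned"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the card lists BF16 weights
    device_map="auto",
)

prompt = "Explain what a context window is."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```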

Training Details

The model was trained with the following key hyperparameters; a configuration sketch follows the list:

  • Learning Rate: 2e-08
  • Batch Sizes: train_batch_size of 4 and eval_batch_size of 8, with gradient_accumulation_steps of 4, for an effective total_train_batch_size of 16
  • Optimizer: ADAMW_TORCH_FUSED with default betas and epsilon
  • LR Scheduler: linear
  • Epochs: 1
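
For readers who want to reproduce a comparable setup, here is a hedged sketch of how these hyperparameters map onto Hugging Face TrainingArguments. The output directory is hypothetical, and the dataset, model, and Trainer wiring are omitted because the card does not document them:

```python
# Sketch of TrainingArguments mirroring the hyperparameters reported above.
# Not the author's actual training script; the card only lists these values.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen3-finetuned",     # hypothetical output path
    learning_rate=2e-08,
    per_device_train_batch_size=4,    # train_batch_size
    per_device_eval_batch_size=8,     # eval_batch_size
    gradient_accumulation_steps=4,    # 4 x 4 = effective batch of 16
    num_train_epochs=1,
    lr_scheduler_type="linear",
    optim="adamw_torch_fused",        # ADAMW_TORCH_FUSED
)
```

Note that a learning rate of 2e-08 is several orders of magnitude below typical fine-tuning values, which is consistent with the relatively high final validation loss reported above.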

Limitations

The available documentation does not specify the dataset used for fine-tuning, nor does it detail the intended uses or known limitations of this particular fine-tuned version. Users should exercise caution and conduct further evaluation to determine its suitability for specific tasks.