russwest404/Qwen3-4B-ReTool-SFT

Hosted on Hugging Face · Text generation · Model size: 4B · Precision: BF16 · Context length: 32k · Published: May 2, 2025 · License: other · Architecture: Transformer

The russwest404/Qwen3-4B-ReTool-SFT model is a fine-tuned version of Qwen/Qwen3-4B, trained on the retool dataset via supervised fine-tuning. It achieves a loss of 0.3798 on the evaluation set and is intended for applications that benefit from its specialization on this dataset, offering a tailored alternative to the general-purpose base model for those use cases.


Model Overview

The russwest404/Qwen3-4B-ReTool-SFT is a specialized language model derived from the Qwen3-4B base architecture. It has undergone supervised fine-tuning (SFT) using the retool dataset, indicating an optimization for tasks and data patterns present within this specific dataset.

Key Characteristics

  • Base Model: Qwen/Qwen3-4B, a 4 billion parameter model from the Qwen family.
  • Fine-tuning: Specifically fine-tuned on the retool dataset.
  • Performance Metric: Achieved a loss of 0.3798 on the evaluation set, with training loss decreasing steadily over the course of fine-tuning.

Training Details

The model was trained with a learning rate of 1e-05 using the adamw_torch optimizer. A per-device train_batch_size of 2, gradient_accumulation_steps of 4, and 8 devices yield a total_train_batch_size of 64 (2 × 4 × 8). Training ran for 2 epochs with a cosine learning rate scheduler and a warmup ratio of 0.1.
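The hyperparameters above can be sanity-checked in a few lines of plain Python: the effective batch size follows from multiplying the three parallelism factors, and the scheduler can be sketched as a linear warmup over the first 10% of steps followed by cosine decay. This is an illustrative reconstruction of the schedule described in the card, not the exact implementation used in training:

```python
import math

# Hyperparameters as reported in the model card
base_lr = 1e-5
per_device_batch = 2   # train_batch_size
grad_accum = 4         # gradient_accumulation_steps
num_devices = 8
warmup_ratio = 0.1

# Effective (total) batch size: 2 * 4 * 8 = 64
total_batch = per_device_batch * grad_accum * num_devices

def lr_at(step, total_steps):
    """Cosine schedule with linear warmup over the first warmup_ratio
    fraction of steps (a sketch of the scheduler named in the card;
    the original training code may differ in detail)."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)  # linear warmup
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))  # cosine decay
```

The learning rate rises linearly to 1e-05 by the end of warmup, then follows a half-cosine down to zero at the final step.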

Intended Use Cases

The model card does not document detailed intended uses or limitations. The fine-tuning on the retool dataset, however, suggests that its primary utility lies in applications aligned with the characteristics and content of that dataset. Developers should consider this model for tasks where its specialized training on retool data would offer an advantage over general-purpose models.
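A minimal way to try the model is with the Hugging Face transformers library. The snippet below is a generic causal-LM loading and generation sketch, assuming a transformers version with Qwen3 support; the prompt and generation settings are illustrative and not taken from the model card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "russwest404/Qwen3-4B-ReTool-SFT"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="bfloat16",  # matches the BF16 precision listed above
    device_map="auto",
)

# Illustrative prompt; the tokenizer's chat template formats it for the model
messages = [{"role": "user", "content": "Compute 3 * (4 + 5)."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Running this downloads the model weights from the Hugging Face Hub on first use.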