Name: huseyinatahaninan/appworld_distillation_sft_v2-SFT-Qwen3-14B API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: huseyinatahaninan

Model Overview

This model, huseyinatahaninan/appworld_distillation_sft_v2-SFT-Qwen3-14B, is a 14 billion parameter language model built upon the Qwen3-14B architecture. It has undergone supervised fine-tuning (SFT) using the appworld_distillation_sft_v2 dataset.

Key Characteristics

Base Model: Qwen3-14B, a large language model developed by Qwen.
Fine-tuning Dataset: Specifically trained on the appworld_distillation_sft_v2 dataset, implying a focus on tasks or data distributions present within this dataset.
Performance: Achieved a final validation loss of 0.6408 during training, indicating its learned performance on the evaluation set.

Training Details

The model was trained for 25 epochs using a learning rate of 5e-06, a total batch size of 32, and the AdamW optimizer. The training utilized 8 GPUs with a cosine learning rate scheduler and a warmup ratio of 0.1.

Intended Use Cases

Given its fine-tuning on the appworld_distillation_sft_v2 dataset, this model is best suited for applications and tasks that align with the nature and content of that specific dataset. Users should evaluate its performance on their particular use case, especially if it falls within the domain of the training data.