Name: waleko/Qwen3-8B-SFT-envbench_qwen-green-yellow API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: waleko

Model Overview

waleko/Qwen3-8B-SFT-envbench_qwen-green-yellow is an 8 billion parameter language model, fine-tuned from the base Qwen/Qwen3-8B architecture. This specific iteration has undergone supervised fine-tuning (SFT) on the envbench_qwen-green-yellow dataset.

Performance Highlights

During its evaluation, the model demonstrated notable performance metrics:

Loss: 0.1656
Accuracy: 0.9472
Input Tokens Seen: 2,242,920

These results indicate its proficiency in tasks related to the envbench_qwen-green-yellow dataset.

Training Details

The fine-tuning process utilized the following key hyperparameters:

Learning Rate: 5e-05
Optimizer: AdamW with betas=(0.9, 0.999) and epsilon=1e-08
Batch Size: 1 (train), 1 (eval) with 4 gradient accumulation steps, resulting in a total train batch size of 16
Epochs: 5.0
LR Scheduler: Cosine type with a 0.1 warmup ratio

The model was trained across 4 multi-GPU devices, ensuring efficient processing. It leverages Transformers 4.52.4, Pytorch 2.6.0a0+df5bbc09d1.nv24.12, Datasets 3.6.0, and Tokenizers 0.21.1.

Intended Use Cases

Given its fine-tuning on the envbench_qwen-green-yellow dataset, this model is best suited for applications and tasks that align closely with the characteristics and domain of its training data. Its high accuracy on the evaluation set suggests strong performance in similar environments.

Overview

Model Overview

Performance Highlights

Training Details

Intended Use Cases

Full Model Card (README)