Name: waleko/Qwen3-8B-SFT-envbench_qwen-all API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: waleko

Model Overview

waleko/Qwen3-8B-SFT-envbench_qwen-all is a fine-tuned version of the Qwen/Qwen3-8B base model. It has 8 billion parameters and supports a context length of 32768 tokens, which makes it suitable for longer inputs.
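As a rough illustration of what the 32768-token context length means in practice, the sketch below computes how many tokens remain for generation once a prompt has been tokenized. It assumes the prompt and the generated tokens share one context window, and the helper name max_new_tokens_for is hypothetical, not part of any library.

```python
# Context window stated in the model overview.
CONTEXT_LENGTH = 32768


def max_new_tokens_for(prompt_tokens: int, context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many tokens can still be generated after the prompt.

    Assumes prompt and completion share a single context window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # A prompt longer than the window leaves no room for generation.
    return max(context_length - prompt_tokens, 0)


if __name__ == "__main__":
    print(max_new_tokens_for(30000))  # room left after a 30000-token prompt
```

A prompt that already exceeds the window returns 0 here; in a real application it would need to be truncated or split before being sent to the model.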

Training Details

The model was fine-tuned on the envbench_qwen-all dataset. It reports the following evaluation results:

Loss: 0.1477
Accuracy: 0.9511
Num Input Tokens Seen: 36,600,520
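To put the reported loss in more familiar terms, it can be converted to a perplexity. This is a small worked calculation that assumes the loss is a mean per-token cross-entropy measured in nats, which the listing does not state explicitly.

```python
import math

# Evaluation loss reported above (assumed: mean cross-entropy in nats).
eval_loss = 0.1477

# Perplexity is the exponential of the mean cross-entropy.
perplexity = math.exp(eval_loss)

print(f"{perplexity:.3f}")  # 1.159
```

Under that assumption, a perplexity near 1.16 means the model assigns high probability to most target tokens in the evaluation set. It says nothing about performance on data outside that set.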

Training ran for 5 epochs with a learning rate of 5e-05, a total batch size of 16 (reached through gradient accumulation), and a cosine learning rate scheduler with a 0.1 warmup ratio. The training environment included Transformers 4.52.4 and PyTorch 2.6.0a0.
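The shape of a cosine schedule with a 0.1 warmup ratio can be sketched in plain Python as follows. The peak learning rate and warmup ratio come from the description above; the total step count of 1000 is a hypothetical placeholder, since the listing does not give the real number, and the function lr_at is illustrative rather than the code used in training.

```python
import math

PEAK_LR = 5e-05       # learning rate from the training details
WARMUP_RATIO = 0.1    # warmup ratio from the training details
TOTAL_STEPS = 1000    # hypothetical: the real step count is not given


def lr_at(step: int, total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate at a given optimizer step.

    Linear warmup from 0 to PEAK_LR, then half-cosine decay to 0.
    """
    warmup_steps = int(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        # Linear ramp during warmup.
        return PEAK_LR * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for s in (0, 50, 100, 550, 1000):
        print(s, lr_at(s))
```

With these placeholder values the rate climbs linearly over the first 100 steps, peaks at 5e-05, falls to half the peak midway through the decay phase, and approaches zero at the final step.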

Intended Use

Because it was fine-tuned specifically on the envbench_qwen-all dataset, this model is best suited to tasks that resemble that dataset's content. Users should review the nature of the training data before applying the model to other use cases.