laion/swesmith-31600-opt100k__Qwen3-8B

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Context Length: 32k · Published: Mar 29, 2026 · License: other · Architecture: Transformer

The laion/swesmith-31600-opt100k__Qwen3-8B model is an 8-billion-parameter language model fine-tuned by LAION from Qwen3-8B. It was fine-tuned on the /e/data1/datasets/playground/ot/hf_hub/datasets--laion--swesmith-unified-31600 dataset, so its behavior is specialized toward that training data. With a 32768-token context length, it is designed for tasks that benefit from extensive contextual understanding.


Model Overview

This model, laion/swesmith-31600-opt100k__Qwen3-8B, is an 8-billion-parameter language model based on the Qwen3-8B architecture. LAION fine-tuned it on the /e/data1/datasets/playground/ot/hf_hub/datasets--laion--swesmith-unified-31600 dataset, which suggests it is optimized for tasks matching that dataset's characteristics.

Key Training Details

The model was trained with a learning rate of 4e-05 for 5 epochs on a multi-GPU setup of 32 devices with a total batch size of 96. The optimizer was AdamW_Torch_Fused with cosine learning-rate scheduling and a warmup ratio of 0.1. Training used Transformers 4.57.6, PyTorch 2.9.1+cu130, Datasets 4.7.0, and Tokenizers 0.22.2.
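For reference, these hyperparameters map onto the Hugging Face Trainer API roughly as sketched below. Only the totals are reported on this card, so the per-device batch size and gradient accumulation split (3 × 32 × 1 = 96) and the bf16 precision flag are assumptions for illustration.

```python
from transformers import TrainingArguments

# Sketch of the reported training configuration; split of the total
# batch size across devices and the precision flag are assumptions.
training_args = TrainingArguments(
    output_dir="swesmith-31600-opt100k__Qwen3-8B",
    learning_rate=4e-05,
    num_train_epochs=5,
    per_device_train_batch_size=3,   # assumed: 3 * 32 GPUs = 96 total
    gradient_accumulation_steps=1,   # assumed split
    optim="adamw_torch_fused",
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    bf16=True,                       # assumed; precision not stated on the card
)
```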

Intended Use

Given its Qwen3-8B base and its fine-tuning data, this model is likely best suited to applications that match the data distribution and characteristics of the laion/swesmith-unified-31600 dataset. Developers should also factor in its 32768-token context length when handling tasks that require extensive input understanding.
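A minimal generation sketch follows, assuming the checkpoint loads through the standard transformers AutoModel API; the prompt, dtype, and device placement are illustrative choices, not taken from this card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "laion/swesmith-31600-opt100k__Qwen3-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumed dtype
    device_map="auto",           # assumed device placement
)

# Hypothetical prompt; the card does not document a prompt format.
prompt = "Fix the failing test in the following repository:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```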