Name: laion/swesmith-1000__Qwen3-8B API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: laion

Model Overview

laion/swesmith-1000__Qwen3-8B is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. It has a 32,768-token context window, so long inputs can be processed in a single pass.
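As a rough illustration of what the 32,768-token limit means in practice, the sketch below checks whether a request fits in the window. It assumes that prompt tokens and generated tokens share the same window, which is typical for decoder-only models but is not stated on this page. The helper names are hypothetical, and the token counts would come from the model's tokenizer.

```python
CONTEXT_WINDOW = 32_768  # context length stated in the overview above


def fits_in_context(prompt_tokens: int, max_new_tokens: int,
                    window: int = CONTEXT_WINDOW) -> bool:
    """Return True if the prompt plus the requested completion fits in the window.

    Assumption: prompt and completion tokens count against one shared window.
    """
    if prompt_tokens < 0 or max_new_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + max_new_tokens <= window


def max_completion_tokens(prompt_tokens: int,
                          window: int = CONTEXT_WINDOW) -> int:
    """Largest completion that still fits after the prompt (0 if the prompt overflows)."""
    if prompt_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return max(0, window - prompt_tokens)
```

A caller could use `max_completion_tokens` to cap the generation length before sending a long prompt, rather than discovering a truncation afterwards.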

Training Details

The model was fine-tuned on the dataset recorded at the local path /e/data1/datasets/playground/ot/hf_hub/datasets--laion--swesmith-unified-1000/snapshots/031ef1b66d8d55421f68d0afcbf7872ef3644c1e_thinking_preprocessed, which appears to be a preprocessed snapshot of laion/swesmith-unified-1000. Key training hyperparameters:

Learning Rate: 4e-05
Batch Size: 1 (train), 8 (eval)
Optimizer: ADAMW_TORCH_FUSED
Epochs: 7.0
Distributed Training: Multi-GPU setup with 32 devices and 3 gradient accumulation steps, giving a total train batch size of 96 (1 per device × 32 devices × 3 accumulation steps).
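The total train batch size follows directly from the values listed above. The sketch below collects them in a plain dictionary and reproduces the arithmetic. The key names are illustrative (they loosely mirror common trainer configuration fields) and are not taken from the original training script.

```python
# Hyperparameters as listed above; key names are illustrative assumptions.
hyperparameters = {
    "learning_rate": 4e-05,
    "per_device_train_batch_size": 1,
    "per_device_eval_batch_size": 8,
    "optimizer": "ADAMW_TORCH_FUSED",
    "num_train_epochs": 7.0,
    "num_devices": 32,
    "gradient_accumulation_steps": 3,
}


def total_train_batch_size(per_device: int, num_devices: int,
                           grad_accum_steps: int) -> int:
    """Number of samples contributing to one optimizer step across all devices."""
    return per_device * num_devices * grad_accum_steps


total = total_train_batch_size(
    hyperparameters["per_device_train_batch_size"],
    hyperparameters["num_devices"],
    hyperparameters["gradient_accumulation_steps"],
)
print(total)  # 96
```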

Intended Use Cases

Because the model was fine-tuned on a single dataset, it is likely best suited to tasks that resemble the content of laion/swesmith-unified-1000. Developers should evaluate it on tasks from that domain before relying on it.
