Name: laion/swesmith-1000-opt1k__Qwen3-8B API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: laion

Model Overview

The laion/swesmith-1000-opt1k__Qwen3-8B is an 8-billion-parameter language model fine-tuned from the base Qwen/Qwen3-8B model. It was trained on the laion/swesmith-unified-1000 dataset, so its behavior is likely adapted to the characteristics of that data.

Training Details

The fine-tuning process involved several key hyperparameters:

Learning Rate: 4e-05
Batch Size: A per-device train_batch_size of 1 with gradient_accumulation_steps of 3 across 32 devices, giving a total_train_batch_size of 96 (1 × 3 × 32).
Optimizer: ADAMW_TORCH_FUSED with specific beta and epsilon values.
Scheduler: A cosine learning rate scheduler with a 0.1 warmup ratio.
Epochs: Trained for 7.0 epochs across 32 devices.
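
The figures above can be checked with a short sketch. It reproduces the total batch size arithmetic and shows one common formulation of a cosine schedule with linear warmup. The function names and the total_steps value in the usage note are hypothetical, chosen for illustration only, and the exact schedule used by the training framework may differ in details such as how warmup steps are rounded.

```python
import math

# Values taken from the hyperparameter list above.
PER_DEVICE_BATCH_SIZE = 1
GRADIENT_ACCUMULATION_STEPS = 3
NUM_DEVICES = 32
PEAK_LEARNING_RATE = 4e-05
WARMUP_RATIO = 0.1


def effective_batch_size(per_device, accumulation_steps, num_devices):
    """Samples contributing to one optimizer update across all devices."""
    return per_device * accumulation_steps * num_devices


def cosine_lr_with_warmup(step, total_steps,
                          peak_lr=PEAK_LEARNING_RATE,
                          warmup_ratio=WARMUP_RATIO):
    """Linear warmup to peak_lr, then cosine decay to zero.

    Hypothetical helper: a common formulation, not necessarily the
    exact one used to train this model.
    """
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


total_batch = effective_batch_size(
    PER_DEVICE_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, NUM_DEVICES
)
```

With an assumed total_steps of 1000, cosine_lr_with_warmup(100, 1000) returns the peak rate at the end of warmup, and the rate then decays toward zero by the final step.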

Intended Use

Specific intended uses and limitations are not yet documented. As a fine-tuned Qwen3-8B model, it is generally suitable for a broad range of natural language processing tasks, including text generation, summarization, and question answering, and it benefits from a 32768-token context length. Its performance on these tasks depends on the nature of the swesmith-unified-1000 dataset.
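
As a rough illustration of working within the 32768-token context length, the sketch below computes how many tokens remain for generation after a prompt. The helper name is hypothetical, and it assumes that prompt and generated tokens share one context window, as is typical for decoder-only models.

```python
MAX_CONTEXT_TOKENS = 32768  # context length stated above


def max_new_tokens_allowed(prompt_tokens, context_length=MAX_CONTEXT_TOKENS):
    """Tokens left for generation once the prompt is counted.

    Hypothetical helper; assumes prompt and output share the window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # Clamp at zero when the prompt already fills or exceeds the window.
    return max(0, context_length - prompt_tokens)
```

A prompt that already fills the window leaves no room for output, so callers would need to truncate or summarize the input first.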
