Name: laion/swesmith-unified-1000__Qwen3-8B API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: laion

Overview

This model, laion/swesmith-unified-1000__Qwen3-8B, is a fine-tuned variant of the Qwen3-8B base model developed by Qwen. It was further trained on the dataset recorded at the following path: /e/data1/datasets/playground/ot/hf_hub/datasets--laion--swesmith-unified-1000/snapshots/f36966d2485fc81ece28e25248939b0db9f34677_thinking_preprocessed.

Training Details

The fine-tuning process utilized the following key hyperparameters:

Learning Rate: 4e-05
Optimizer: ADAMW_TORCH_FUSED with betas=(0.9, 0.98) and epsilon=1e-08
Batch Size: A total training batch size of 96 (1 per device with 32 devices and 3 gradient accumulation steps)
Epochs: 7.0
Scheduler: Cosine learning rate scheduler with a 0.1 warmup ratio
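
The hyperparameters above can be sanity-checked with a few lines of Python. The sketch below recomputes the total batch size from its three factors and evaluates a cosine schedule with linear warmup at the listed peak learning rate. It is a minimal illustration, not the training code: the total step count is a hypothetical input (the card does not state it), and the warmup length is assumed to be the warmup ratio times the total steps, rounded up.

```python
import math

# Values taken from the hyperparameter list above.
PEAK_LR = 4e-5
WARMUP_RATIO = 0.1
PER_DEVICE_BATCH = 1
NUM_DEVICES = 32
GRAD_ACCUM_STEPS = 3


def effective_batch_size(per_device: int, devices: int, accum_steps: int) -> int:
    """Samples consumed per optimizer step across all devices."""
    return per_device * devices * accum_steps


def cosine_lr(step: int, total_steps: int,
              peak_lr: float = PEAK_LR,
              warmup_ratio: float = WARMUP_RATIO) -> float:
    """Learning rate at `step` for linear warmup followed by cosine decay.

    `total_steps` is a hypothetical value supplied by the caller; the
    warmup length (ratio * total, rounded up) is an assumption.
    """
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, from 0.0 to 1.0.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Half-cosine decay from the peak learning rate down to 0.
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print(effective_batch_size(PER_DEVICE_BATCH, NUM_DEVICES, GRAD_ACCUM_STEPS))
    for step in (0, 50, 100, 550, 1000):  # 1000 total steps is illustrative only
        print(step, cosine_lr(step, total_steps=1000))
```

With these inputs the batch size works out to 1 x 32 x 3 = 96, matching the stated total, and the learning rate reaches 4e-05 at the end of warmup before decaying toward zero.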

Framework Versions

The training environment included:

Transformers 4.57.6
PyTorch 2.9.1+cu130
Datasets 4.7.0
Tokenizers 0.22.2
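
To compare a local environment against the versions listed above, a small check along these lines may help. It is a sketch using only the standard library; the pip distribution names (transformers, torch, datasets, tokenizers) are assumed to correspond to the frameworks named in the card, and a mismatch does not necessarily mean the model will fail to load.

```python
from importlib import metadata

# Versions as listed in the card; the distribution names are assumptions.
EXPECTED = {
    "transformers": "4.57.6",
    "torch": "2.9.1+cu130",
    "datasets": "4.7.0",
    "tokenizers": "0.22.2",
}


def compare_versions(expected=EXPECTED):
    """Return {package: (wanted, installed_or_None, matches)} for each entry."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    for name, (wanted, installed, ok) in compare_versions().items():
        status = "ok" if ok else "differs"
        print(f"{name}: wanted {wanted}, found {installed} ({status})")
```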

Intended Use

Because the model was fine-tuned on a single dataset, it is likely best suited to tasks that resemble the content and structure of the swesmith-unified-1000 dataset. Users should review that dataset before deciding whether the model fits their use case.
