Name: laion/swesmith-316__Qwen3-8B API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: laion

Model Overview

laion/swesmith-316__Qwen3-8B is an 8 billion parameter language model, fine-tuned from the base Qwen/Qwen3-8B architecture. This model was specifically trained on the laion/swesmith-unified-316 dataset, indicating a focus on the characteristics and data distribution of that particular corpus.

Training Details

The fine-tuning process utilized a learning rate of 4e-05, with a total train batch size of 96 across 32 devices and 3 gradient accumulation steps. The optimizer used was ADAMW_TORCH_FUSED with standard beta values and epsilon, employing a cosine learning rate scheduler with a 0.1 warmup ratio over 7 epochs. The training was conducted using Transformers 4.57.6, Pytorch 2.9.1+cu130, Datasets 4.7.0, and Tokenizers 0.22.2.

Intended Use

While specific intended uses and limitations are not detailed in the provided information, as a fine-tuned Qwen3-8B model, it is generally suitable for a broad range of natural language processing tasks, including text generation, summarization, and question answering, especially those aligned with the characteristics of its training dataset.

Overview

Model Overview

Training Details

Intended Use

Full Model Card (README)