Overview
DCAgent2/swesmith-stack-over5050 is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. It was trained on two datasets: penfever/GLM-4.6-swesmith-32ep-131k-nosumm-reasoning and penfever/GLM-4.6-stackexchange-overflow-sandboxes-32eps-65k-reasoning. This dataset selection points to a focus on strengthening the model's reasoning capabilities, particularly for the kind of complex problem-solving and technical discussion found on Stack Exchange platforms.
Key Training Details
- Base Model: Qwen/Qwen3-8B
- Fine-tuning Datasets:
  - penfever/GLM-4.6-swesmith-32ep-131k-nosumm-reasoning
  - penfever/GLM-4.6-stackexchange-overflow-sandboxes-32eps-65k-reasoning
- Hyperparameters:
- Learning Rate: 4e-05
- Optimizer: AdamW_Torch_Fused with betas=(0.9, 0.98) and epsilon=1e-08
- Epochs: 7.0
- Total Train Batch Size: 16 (across 8 GPUs with gradient accumulation)
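As a quick sanity check, the reported total train batch size factors as per-device batch size × number of GPUs × gradient accumulation steps. A minimal sketch, assuming a per-device batch of 1 and 2 accumulation steps (neither value is stated on the card; they are one combination consistent with the reported total of 16 across 8 GPUs):

```python
# Decomposition of the reported total train batch size.
# num_gpus comes from the card; the other two values are assumptions
# chosen to be consistent with the reported total, not card values.
num_gpus = 8
per_device_batch_size = 1          # assumption
gradient_accumulation_steps = 2    # assumption

total_train_batch_size = (
    per_device_batch_size * num_gpus * gradient_accumulation_steps
)
print(total_train_batch_size)  # 16
```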
Intended Use Cases
Given its specialized training, this model is likely best suited for applications requiring:
- Reasoning and Problem Solving: Particularly in technical or domain-specific contexts.
- Content Generation: For responses or explanations similar to those found on Stack Exchange.
- Information Retrieval and Summarization: Where detailed, reasoned answers are preferred.
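For the use cases above, a minimal inference sketch using Hugging Face transformers. The model id comes from this card; the example question, the generation settings, and the `RUN_MODEL_DEMO` environment-variable guard are illustrative assumptions, not documented usage:

```python
import os

# Model id from this card; everything else below is an illustrative assumption.
MODEL_ID = "DCAgent2/swesmith-stack-over5050"

# Example Stack Exchange-style question in the chat-message format
# Qwen3-family models expect.
messages = [
    {"role": "user",
     "content": "Why does list.sort() return None in Python?"},
]

# Guarded so the script is safe to run as-is: actually downloading and
# running an 8B model requires transformers, torch, and suitable hardware.
if os.environ.get("RUN_MODEL_DEMO") == "1":
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=512)
    print(tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    ))
```

Set `RUN_MODEL_DEMO=1` to perform the actual download and generation; without it, the script only defines the model id and prompt structure.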