Overview
DCAgent/a1-swesmith is a specialized language model based on the Qwen3-8B architecture, fine-tuned by DCAgent on the dataset /e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--swesmith-sandboxes-with_tests-gpt-5-mini-passed_glm_4.7_traces/snapshots/b9b0e0d113e9c37dd035f03644315478acc04487_thinking_preprocessed. The fine-tuning adapts the base Qwen3-8B model to the applications and performance characteristics reflected in that training data.
Key Training Details
The model was trained with a learning rate of 4e-05 over 7 epochs on a multi-GPU setup of 16 devices, with a total batch size of 16. The optimizer was ADAMW_TORCH_FUSED with specific beta and epsilon parameters, paired with a cosine learning rate scheduler and a 0.1 warmup ratio. Training used Transformers 4.57.6 and PyTorch 2.9.1+cu130.
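The reported hyperparameters can be collected into a small sketch; the key names below follow the usual transformers TrainingArguments conventions and are an assumption, not taken from the model card. Note that a total batch size of 16 across 16 devices implies a per-device batch size of 1 (absent gradient accumulation, which the card does not mention).

```python
# Sketch of the hyperparameters reported in this model card.
# Key names mirror transformers' TrainingArguments (assumed, not confirmed);
# the beta/epsilon values for AdamW are unspecified in the card, so they are omitted.
NUM_DEVICES = 16
TOTAL_BATCH_SIZE = 16

hparams = {
    "learning_rate": 4e-05,
    "num_train_epochs": 7,
    "per_device_train_batch_size": TOTAL_BATCH_SIZE // NUM_DEVICES,  # 16 / 16 = 1
    "optim": "adamw_torch_fused",
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.1,
}

print(hparams["per_device_train_batch_size"])  # → 1
```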
Intended Use & Limitations
The model card does not state intended uses or limitations. However, the name of the fine-tuning dataset, swesmith-sandboxes-with_tests-gpt-5-mini-passed_glm_4.7_traces, suggests a focus on code generation, testing, or agentic reasoning within sandboxed environments. Users should examine the dataset's characteristics to infer the model's specialized capabilities and potential limitations.
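A minimal inference sketch is shown below, assuming the model is published on the Hugging Face Hub under the id DCAgent/a1-swesmith and loads like any other causal LM checkpoint. The transformers import is deferred into the function because loading an 8B-parameter model requires substantial memory and a GPU.

```python
# Hypothetical usage sketch; the Hub id is taken from this model card,
# but availability, chat template, and resource requirements are assumptions.
MODEL_ID = "DCAgent/a1-swesmith"

def generate(prompt: str, max_new_tokens: int = 512) -> str:
    """Load the fine-tuned model and return a completion for `prompt`.

    Calling this downloads the full checkpoint, so the heavy imports and
    loading are deferred until the function is actually invoked.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Given the dataset's apparent focus, prompts describing a failing test or a sandboxed repository task are a plausible fit, but this is an inference from the dataset name rather than documented behavior.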