Name: DCAgent/a1-nemotron_rust API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: DCAgent

Overview

DCAgent/a1-nemotron_rust is a fine-tuned language model based on the Qwen3-8B architecture, developed by DCAgent. It was fine-tuned on the dataset stored at /e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--exp_rpt_nemotron-rust_10k_glm_4.7_traces_jupiter/snapshots/3132525161b49015e6ef5c5bf75c3a14ca21c34b_thinking_preprocessed.

Training Details

The model was trained using the following key hyperparameters:

Learning Rate: 4e-05
Batch Size (per device): 1 (train), 8 (eval)
Distributed Training: Multi-GPU setup with 16 devices, giving a total train batch size of 16 (1 x 16) and a total eval batch size of 128 (8 x 16).
Optimizer: ADAMW_TORCH_FUSED with betas=(0.9, 0.98) and epsilon=1e-08.
Scheduler: Cosine learning rate scheduler with a warmup ratio of 0.1.
Epochs: 7.0
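
The hyperparameters above can be checked with a short sketch. The Python below reproduces the total batch sizes and shows one common form of a cosine schedule with warmup. It is an illustration under stated assumptions, not the training code: it assumes a gradient accumulation factor of 1 (consistent with 1 x 16 = 16), linear warmup, and decay to zero. The names total_batch_size and lr_at_step are hypothetical, and the total step count of 1000 is a placeholder, since the model card does not report it.

```python
import math

# Values taken from the hyperparameter list above.
PER_DEVICE_TRAIN_BATCH = 1
PER_DEVICE_EVAL_BATCH = 8
NUM_DEVICES = 16
PEAK_LR = 4e-05
WARMUP_RATIO = 0.1

# Assumption: no gradient accumulation (not stated in the model card).
GRAD_ACCUM_STEPS = 1


def total_batch_size(per_device, num_devices, grad_accum=1):
    """Effective batch size across all devices."""
    return per_device * num_devices * grad_accum


def lr_at_step(step, total_steps, peak_lr=PEAK_LR, warmup_ratio=WARMUP_RATIO):
    """Learning rate at a given step.

    Assumes linear warmup from 0 to peak_lr over the first
    warmup_ratio * total_steps steps, then half-cosine decay to 0.
    """
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


train_total = total_batch_size(PER_DEVICE_TRAIN_BATCH, NUM_DEVICES, GRAD_ACCUM_STEPS)
eval_total = total_batch_size(PER_DEVICE_EVAL_BATCH, NUM_DEVICES)

if __name__ == "__main__":
    print("total train batch size:", train_total)
    print("total eval batch size:", eval_total)
    # Hypothetical run length; the real step count is not in the model card.
    for step in (0, 50, 100, 550, 1000):
        print(step, lr_at_step(step, total_steps=1000))
```

With a warmup ratio of 0.1, the learning rate reaches its peak of 4e-05 after the first 10% of steps and then falls along the cosine curve for the remaining 90%.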

Intended Uses & Limitations

The model card does not document intended uses or limitations. Developers should consult additional resources for guidance on deployment and capabilities.
