Name: DCAgent/a1-r2egym API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: DCAgent

Model Overview

DCAgent/a1-r2egym is a language model fine-tuned from Qwen3-8B. The training dataset is identified only by its local path, /e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--r2egym_sandboxes_10k_glm_4.7_traces_jupiter/snapshots/bf10c6912b106ea55b7b06e79c99fc4d038a8437_thinking_preprocessed. The dataset name suggests a focus on tasks related to agent environments or reinforcement learning.

Training Details

The model was trained using the following key hyperparameters:

Learning Rate: 4e-05
Batch Size: 1 (train), 8 (eval)
Distributed Training: Multi-GPU setup with 16 devices, resulting in a total effective batch size of 16 for training and 128 for evaluation.
Optimizer: ADAMW_TORCH_FUSED with betas=(0.9, 0.98) and epsilon=1e-08.
Scheduler: Cosine learning rate scheduler with a warmup ratio of 0.1.
Epochs: 7.0
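
The relationship between these values can be illustrated with a short Python sketch. It is not taken from the model's training code. It assumes a gradient accumulation of 1 (the card does not state it, but 1 x 16 devices matches the reported total of 16), linear warmup to the peak rate, and a hypothetical total of 1000 optimizer steps, since the card gives no step count. The names effective_batch_size and lr_at_step are illustrative only.

```python
import math

# Values taken from the hyperparameter list above.
PEAK_LR = 4e-05
WARMUP_RATIO = 0.1
NUM_DEVICES = 16
TRAIN_BATCH_PER_DEVICE = 1
EVAL_BATCH_PER_DEVICE = 8


def effective_batch_size(per_device: int, num_devices: int, grad_accum_steps: int = 1) -> int:
    """Total samples per step across all devices.

    grad_accum_steps=1 is an assumption; the card does not report it.
    """
    return per_device * num_devices * grad_accum_steps


def lr_at_step(step: int, total_steps: int) -> float:
    """Learning rate under linear warmup followed by cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return PEAK_LR * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("train batch:", effective_batch_size(TRAIN_BATCH_PER_DEVICE, NUM_DEVICES))
    print("eval batch:", effective_batch_size(EVAL_BATCH_PER_DEVICE, NUM_DEVICES))
    total = 1000  # hypothetical step count, for illustration only
    for s in (0, 50, 100, 550, 1000):
        print(s, lr_at_step(s, total))
```

Under these assumptions the rate rises during the first 10% of steps, reaches 4e-05 at the end of warmup, and then decays along a half cosine toward zero by the final step.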

Taken together, the training data and hyperparameters indicate that the base Qwen3-8B model was adapted for interactive or decision-making applications, most likely in simulated or sandbox environments, as the dataset name suggests.
