DCAgent/a1-r2egym
Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: Mar 25, 2026 · License: other · Architecture: Transformer
DCAgent/a1-r2egym is a fine-tuned version of the Qwen3-8B causal language model. It was trained on the r2egym_sandboxes_10k_glm_4.7_traces_jupiter dataset, which suggests a specialization in reinforcement-learning or agent-based environments. Training used a 16-GPU multi-device setup with a cosine learning-rate scheduler over 7 epochs, a configuration consistent with fine-tuning for interactive or decision-making scenarios.
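The cosine learning-rate schedule mentioned above can be sketched in a few lines. The function below is a minimal illustration of cosine decay with optional linear warmup; the hyperparameter values (base learning rate, warmup length) are placeholders, not the actual settings used to train this model.

```python
import math

def cosine_lr(step, total_steps, base_lr=1e-5, min_lr=0.0, warmup_steps=0):
    """Cosine learning-rate decay with optional linear warmup.

    Illustrative only: base_lr/min_lr/warmup_steps are assumed values,
    not the hyperparameters of DCAgent/a1-r2egym's training run.
    """
    if step < warmup_steps:
        # Linear warmup from 0 to base_lr
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, in [0, 1]
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Cosine anneal from base_lr down to min_lr
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))

# The rate starts at base_lr, halves at the midpoint, and reaches min_lr at the end
print(cosine_lr(0, 1000))    # start of training: full base_lr
print(cosine_lr(500, 1000))  # midpoint: half of base_lr
print(cosine_lr(1000, 1000)) # end of training: min_lr
```

In practice a trainer would call such a schedule once per optimizer step, with `total_steps` equal to steps-per-epoch times the number of epochs (7 here).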