Overview

This model, exp-uns-r2egym-4_2x_glm_4_7_traces_jupiter, is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. The training data was read from a local Hugging Face cache snapshot at /data/cat/ws/befe330h-befe330h-otagent/huggingface/hub/datasets--DCAgent--exp-uns-r2egym-4_2x_glm_4.7_traces_jupiter/snapshots/755851ab1bce4a626f500aaf4e6827f1642f1699_thinking_preprocessed.
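The dataset path above follows the Hugging Face hub cache layout, in which a repository is stored under a folder named `datasets--<org>--<name>` and each revision under `snapshots/<commit hash>`. As a minimal sketch, assuming that layout applies here, the path can be split into a repository id and a revision. The function name `parse_hub_cache_path` is hypothetical, and the `_thinking_preprocessed` suffix is treated as a local addition to the commit hash, which is an assumption.

```python
import re

def parse_hub_cache_path(path: str) -> dict:
    """Split a hub cache snapshot path into repo type, repo id, revision, and suffix."""
    match = re.search(r"/(models|datasets|spaces)--(.+?)/snapshots/([^/]+)", path)
    if match is None:
        raise ValueError(f"not a hub cache snapshot path: {path}")
    repo_type, folder, snapshot = match.groups()
    # In the cache layout, the "/" of a repo id is stored as "--".
    repo_id = folder.replace("--", "/")
    # Assumption: the snapshot name is a 40-character commit hash,
    # optionally followed by a locally added suffix.
    rev = re.match(r"([0-9a-f]{40})(.*)", snapshot)
    revision, suffix = (rev.group(1), rev.group(2)) if rev else (snapshot, "")
    return {
        "repo_type": repo_type,
        "repo_id": repo_id,
        "revision": revision,
        "suffix": suffix,
    }

DATASET_PATH = (
    "/data/cat/ws/befe330h-befe330h-otagent/huggingface/hub/"
    "datasets--DCAgent--exp-uns-r2egym-4_2x_glm_4.7_traces_jupiter/snapshots/"
    "755851ab1bce4a626f500aaf4e6827f1642f1699_thinking_preprocessed"
)

info = parse_hub_cache_path(DATASET_PATH)
```

Under these assumptions, `info["repo_id"]` names the dataset repository and `info["revision"]` the commit it was snapshotted from. Whether that repository is publicly accessible is not stated in the source.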

Training Details

Fine-tuning used the following hyperparameters:

Learning Rate: 4e-05
Batch Size: 1 (train), 8 (eval)
Gradient Accumulation Steps: 2, for a total train batch size of 16 (with a train batch size of 1 and 2 accumulation steps, this implies 8 devices)
Optimizer: AdamW (fused PyTorch implementation, adamw_torch_fused) with betas=(0.9, 0.98) and epsilon=1e-08
LR Scheduler: Cosine with a warmup ratio of 0.1
Epochs: 7.0
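The arithmetic behind the total batch size and the shape of the learning-rate schedule can be sketched in a few lines. This is an illustration under stated assumptions, not the training code: the device count of 8 is inferred from 16 = 1 × 2 × 8, the total step count of 1000 is hypothetical (the dataset size is not given), and the function names are made up for this example. The schedule is linear warmup followed by a half-cosine decay to zero, which is the usual meaning of a cosine scheduler with warmup.

```python
import math

def effective_batch_size(per_device: int, grad_accum: int, num_devices: int) -> int:
    """Total examples contributing to one optimizer step."""
    return per_device * grad_accum * num_devices

def lr_at_step(step: int, total_steps: int,
               peak_lr: float = 4e-5, warmup_ratio: float = 0.1) -> float:
    """Linear warmup to peak_lr, then half-cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: progress runs from 0 to 1 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Assumption: 8 devices, inferred from the reported total of 16.
total_batch = effective_batch_size(per_device=1, grad_accum=2, num_devices=8)

# Assumption: 1000 optimizer steps in total, chosen only for illustration.
TOTAL_STEPS = 1000
schedule = [lr_at_step(s, TOTAL_STEPS) for s in range(TOTAL_STEPS + 1)]
```

With a warmup ratio of 0.1, the first tenth of training ramps the learning rate up to 4e-05 and the remainder decays it smoothly toward zero.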

Framework Versions

The model was trained using:

Transformers 4.57.6
PyTorch 2.9.0+cu128
Datasets 4.4.1
Tokenizers 0.22.2
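To reproduce the environment, it can help to compare installed package versions against the list above. The following is a minimal sketch using only the standard library; the helper name `compare_versions` and the mapping of the listed names to pip distribution names (for example, PyTorch to `torch`) are assumptions made for this example.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions reported for training, keyed by assumed pip distribution name.
EXPECTED = {
    "transformers": "4.57.6",
    "torch": "2.9.0+cu128",
    "datasets": "4.4.1",
    "tokenizers": "0.22.2",
}

def compare_versions(expected: dict, lookup=version) -> dict:
    """Return {package: (wanted, installed_or_None, matches)} for each package."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = lookup(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report

if __name__ == "__main__":
    for name, (wanted, installed, ok) in compare_versions(EXPECTED).items():
        status = "ok" if ok else "differs"
        print(f"{name}: wanted {wanted}, installed {installed} ({status})")
```

An exact match is a strict criterion. The `+cu128` local tag in particular depends on which PyTorch build is installed, so a differing tag does not by itself mean the model cannot be loaded.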

Intended Use

The model card does not document intended uses or limitations. Because the model was fine-tuned on a single specialized dataset, it is likely best suited to tasks that resemble that data. Users should examine the training data before relying on the model for other applications.
