DCAgent/g1_weighted_100k_8b_v2
Model Overview
DCAgent/g1_weighted_100k_8b_v2 is an 8 billion parameter language model fine-tuned from the base model Qwen/Qwen3-8B. It was developed by DCAgent and trained on the dataset stored at /e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--g1_min_episodes_e1_weighted_top4_100k_glm47_traces, a Hugging Face Hub cache path corresponding to the dataset DCAgent/g1_min_episodes_e1_weighted_top4_100k_glm47_traces. The fine-tuning targets the tasks and data patterns represented in that corpus.
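Because the model is derived from Qwen/Qwen3-8B, it should load with the standard transformers causal-LM classes. The sketch below assumes the checkpoint is published on the Hugging Face Hub under the id DCAgent/g1_weighted_100k_8b_v2 and that torch and accelerate are installed; the prompt is purely illustrative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: the checkpoint is available on the Hub under this id.
model_id = "DCAgent/g1_weighted_100k_8b_v2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the precision stored in the checkpoint
    device_map="auto",    # requires accelerate; places weights automatically
)

# Qwen3-derived checkpoints ship a chat template, so format the prompt with it.
messages = [{"role": "user", "content": "Summarize what supervised fine-tuning does."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```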
Training Details
The model underwent fine-tuning with the following key hyperparameters:
- Learning Rate: 4e-05
- Optimizer: ADAMW_TORCH_FUSED with betas=(0.9, 0.98) and epsilon=1e-08
- Batch Size: effective training batch size of 96 (per-device train batch size 1 × gradient accumulation steps 2 × 48 devices)
- Epochs: 5.0
- Scheduler: Cosine learning rate scheduler with a warmup ratio of 0.1
This configuration adapts the base Qwen3-8B model to the DCAgent-specific dataset over five epochs, with a warmup phase followed by a cosine decay of the learning rate. The model retains a 32768 token context length, allowing it to handle substantial input sequences.
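For reference, here is a minimal sketch of how these hyperparameters might be expressed as Hugging Face TrainingArguments. The output directory is a placeholder, and the 48-device layout would be supplied by the distributed launcher rather than by this object; only the values listed above are taken from the card.

```python
from transformers import TrainingArguments

# Minimal sketch mapping the reported hyperparameters onto TrainingArguments.
# output_dir is a placeholder; the 48 devices come from the launcher, giving
# an effective batch size of 1 * 2 * 48 = 96.
training_args = TrainingArguments(
    output_dir="./g1_weighted_100k_8b_v2",  # hypothetical path
    learning_rate=4e-05,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-08,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=2,
    num_train_epochs=5.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
)
```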