DCAgent/g1_min_episodes_sampled_swesmith_psu

Text Generation | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 32k | Published: Apr 16, 2026 | License: other | Architecture: Transformer

DCAgent/g1_min_episodes_sampled_swesmith_psu is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B on a dataset derived from 'g1_min_episodes_e1_gpt_long_sampled_swesmith_psu_d1_original_40k_glm47_traces'. The dataset's contents are not documented here; its name points to sampled episodes and traces, so the model is likely aimed at task-oriented dialogues or agentic behaviors.

Overview

DCAgent/g1_min_episodes_sampled_swesmith_psu is an 8-billion-parameter language model built on Qwen3-8B, developed by Qwen. The fine-tuning data is recorded as a preprocessed local snapshot of the dataset at /e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--g1_min_episodes_e1_gpt_long_sampled_swesmith_psu_d1_original_40k_glm47_traces/snapshots/857b3ce8060050ded9af40dc129460f566d0c635_thinking_preprocessed.

Training Details

Fine-tuning used a learning rate of 4e-05 and a total training batch size of 16 across 16 GPUs. The optimizer was ADAMW_TORCH_FUSED (its beta and epsilon values are not reproduced here), with a cosine learning rate scheduler and a warmup ratio of 0.1, run for 7 epochs. Training used Transformers 4.57.6, PyTorch 2.9.1+cu130, and Datasets 4.7.0.
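To make the schedule concrete, here is a minimal sketch of a linear-warmup, cosine-decay learning rate curve using the stated peak rate (4e-05) and warmup ratio (0.1). The function names and the `total_steps` value are hypothetical, since the total step count is not given here, and the warmup step count is rounded down for simplicity. The batch-size helper assumes no gradient accumulation, which is also not stated.

```python
import math

PEAK_LR = 4e-5       # learning rate stated in the training details
WARMUP_RATIO = 0.1   # warmup ratio stated in the training details


def lr_at_step(step: int, total_steps: int,
               peak_lr: float = PEAK_LR,
               warmup_ratio: float = WARMUP_RATIO) -> float:
    """Linear warmup to peak_lr, then a half-cosine decay to zero."""
    # Assumption: warmup length is total_steps * warmup_ratio, rounded down.
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


def per_device_batch_size(total_batch: int, num_gpus: int,
                          grad_accum_steps: int = 1) -> int:
    """Per-GPU batch size implied by a total batch size.

    Assumption: grad_accum_steps defaults to 1 (no gradient accumulation).
    """
    return total_batch // (num_gpus * grad_accum_steps)


if __name__ == "__main__":
    total_steps = 1000  # hypothetical; the real step count is not documented
    for step in (0, 50, 100, 550, 1000):
        print(step, lr_at_step(step, total_steps))
    print("per-device batch size:", per_device_batch_size(16, 16))
```

Under these assumptions, a total batch size of 16 on 16 GPUs works out to one sequence per GPU per optimizer step.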

Key Characteristics

  • Base Model: Qwen3-8B
  • Parameter Count: 8 billion
  • Context Length: 32768 tokens
  • Fine-tuning Data: Dataset derived from 'g1_min_episodes_e1_gpt_long_sampled_swesmith_psu_d1_original_40k_glm47_traces'. Its contents are not documented here; the name suggests sampled episodes and traces.
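Because the prompt and the generated tokens share the same 32768-token window, long multi-turn inputs leave less room for output. The following is a small illustrative helper (the function name is hypothetical, and token counts are assumed to come from the model's own tokenizer) for checking how many tokens remain for generation:

```python
CONTEXT_LENGTH = 32768  # context length listed under Key Characteristics


def generation_budget(prompt_tokens: int,
                      context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many tokens can still be generated after the prompt.

    Returns 0 when the prompt alone fills or exceeds the window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(0, context_length - prompt_tokens)


if __name__ == "__main__":
    print(generation_budget(30000))
    print(generation_budget(40000))
```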

Potential Use Cases

Because it was fine-tuned on a single specialized dataset, this model is likely best suited to applications that match the domain of the g1_min_episodes_e1_gpt_long_sampled_swesmith_psu_d1_original_40k_glm47_traces data. Developers should inspect that dataset before adopting the model, particularly for tasks involving agentic behaviors or structured interactions.