DCAgent/c1_kimi_k2.5_fixed

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kPublished:Apr 10, 2026License:otherArchitecture:Transformer Warm

DCAgent/c1_kimi_k2.5_fixed is an 8 billion parameter language model fine-tuned from Qwen/Qwen3-8B. This model was trained on a specific dataset, /e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--c1_kimi_k2.5_fixed/snapshots/5807137b49d0d1d27e7b100da3e8d4156ddb94e3_thinking_preprocessed, suggesting a specialization in processing or generating content related to "thinking" or internal monologue. With a 32K context length, it is suitable for tasks requiring extended conversational or document understanding.

Loading preview...

Model Overview

DCAgent/c1_kimi_k2.5_fixed is an 8 billion parameter language model, fine-tuned from the base model Qwen/Qwen3-8B. This model was developed by DCAgent and utilizes a substantial 32,768 token context window, enabling it to process and generate longer sequences of text.

Key Characteristics

  • Base Model: Fine-tuned from Qwen/Qwen3-8B.
  • Parameter Count: 8 billion parameters.
  • Context Length: Supports a 32,768 token context window.
  • Training Data: Fine-tuned on the /e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--c1_kimi_k2.5_fixed/snapshots/5807137b49d0d1d27e7b100da3e8d4156ddb94e3_thinking_preprocessed dataset, indicating a potential specialization in tasks related to internal thought processes or reasoning.

Training Details

The model was trained with a learning rate of 4e-05, using a distributed setup across 16 devices with a total training batch size of 16. The optimizer used was ADAMW_TORCH_FUSED with cosine learning rate scheduler and a warmup ratio of 0.1 over 7 epochs. This configuration suggests a robust training process aimed at optimizing performance on its specific fine-tuning dataset.

Potential Use Cases

Given its fine-tuning on a "thinking" related dataset and large context window, this model could be particularly effective for:

  • Complex Reasoning Tasks: Analyzing and generating text that involves logical steps or internal monologues.
  • Long-form Content Generation: Creating detailed narratives, reports, or conversational turns that require maintaining context over extended periods.
  • Specialized Conversational AI: Developing agents that can simulate or understand complex thought processes.