Model Overview
This model, sft__Kimi-2-5-swesmith-sandboxes-with_tests-oracle_verified_120s-maxeps-32k__40-0__Qwen3-8B, is a fine-tuned version of Qwen's Qwen3-8B base model. It has 8 billion parameters and supports a context length of 32,768 tokens, making it suitable for long inputs and for maintaining conversational coherence over extended interactions.
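If the checkpoint is published under this name on the Hugging Face Hub, it should load through the standard transformers API like any other Qwen3 checkpoint. The sketch below is illustrative only: the repo id is taken from the model name above and may need an organization prefix, and the prompt is a placeholder.

```python
# Minimal usage sketch (assumptions: repo id, chat-template support, standard transformers API).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "sft__Kimi-2-5-swesmith-sandboxes-with_tests-oracle_verified_120s-maxeps-32k__40-0__Qwen3-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "Explain what this function does:\n\ndef add(a, b):\n    return a + b"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```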
Training Details
The model was fine-tuned with a learning rate of 4e-05 for 7 epochs, using a total training batch size of 96 spread across 32 GPUs. Training used the fused AdamW optimizer (adamw_torch_fused) and a cosine learning rate scheduler with a warmup ratio of 0.1, a conventional supervised fine-tuning configuration for adapting the base model to the target tasks efficiently at multi-GPU scale.
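The reported hyperparameters map onto the Hugging Face TrainingArguments API roughly as follows. This is a sketch, not the actual training script: the per-device batch size (96 / 32 GPUs = 3, with no gradient accumulation), the output directory, and the use of bf16 are assumptions.

```python
# Approximate reconstruction of the reported hyperparameters (illustrative only).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen3-8b-sft",       # assumption: placeholder path
    learning_rate=4e-5,
    num_train_epochs=7,
    per_device_train_batch_size=3,   # assumption: 3 x 32 GPUs = effective batch size of 96
    optim="adamw_torch_fused",
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    bf16=True,                       # assumption: mixed-precision training
)
```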
Potential Use Cases
Given its fine-tuning on the Kimi-2.5-swesmith-sandboxes-with_tests-oracle_verified_120s-maxeps-32k dataset, this model is likely optimized for tasks involving:
- Code analysis and generation: The dataset name implies interaction with sandboxed environments and tests, which are common in software development.
- Problem-solving in structured environments: the 'oracle_verified' and 'maxeps' components of the dataset name suggest a focus on tasks requiring precise, verifiable outputs, potentially in technical or logical domains.
- Extended context understanding: The 32k context length is beneficial for handling complex problem descriptions, multi-file codebases, or lengthy technical documentation.
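As a rough illustration of how the 32,768-token window could be used for multi-file code analysis, the sketch below concatenates several source files into a single review prompt and checks that it fits within the context budget. The project path, prompt wording, and token budget are hypothetical.

```python
# Hedged example of packing a small codebase into one long-context prompt (assumptions noted above).
from pathlib import Path
from transformers import AutoTokenizer

model_id = "sft__Kimi-2-5-swesmith-sandboxes-with_tests-oracle_verified_120s-maxeps-32k__40-0__Qwen3-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Gather every Python file under a hypothetical project directory.
sources = []
for path in sorted(Path("my_project/src").glob("*.py")):
    sources.append(f"### {path}\n{path.read_text()}")

prompt = "Review the following modules and point out failing edge cases:\n\n" + "\n\n".join(sources)

# Keep the prompt comfortably inside the 32k window, leaving room for the model's reply.
tokens = tokenizer(prompt, truncation=True, max_length=30_000)
print(f"Prompt uses {len(tokens['input_ids'])} of 32,768 tokens")
```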