laion/Kimi-K2T-ling-coder-sft-sandboxes-1-maxeps-32k

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: Jan 20, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights · Cold

The laion/Kimi-K2T-ling-coder-sft-sandboxes-1-maxeps-32k model is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B on a dataset focused on coding tasks, making it specifically optimized for code generation and understanding. With a context length of 32,768 tokens, it is designed for applications requiring robust performance in programming-related contexts.


Model Overview

laion/Kimi-K2T-ling-coder-sft-sandboxes-1-maxeps-32k is an 8-billion-parameter language model fine-tuned from the Qwen/Qwen3-8B architecture. It was specialized through fine-tuning on the penfever/Kimi-K2T-ling-coder-sft-sandboxes-1-maxeps-32k dataset, indicating a focus on coding and programming-related tasks. It supports a substantial context length of 32,768 tokens, making it suitable for processing and generating longer code snippets or complex programming instructions.

Key Capabilities

  • Code-centric Fine-tuning: Optimized for tasks related to code generation, completion, and understanding due to its specialized training data.
  • Large Context Window: Benefits from a 32k token context length, allowing for handling extensive codebases or detailed technical specifications.
  • Qwen3-8B Base: Inherits the foundational capabilities of the Qwen3-8B model, providing a strong base for its specialized performance.
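To illustrate working within the 32k window, here is a minimal sketch of trimming a tokenized prompt so that prompt plus generated tokens fit in the context length. The helper name and the reserved generation budget are illustrative assumptions, not part of the model card:

```python
# Illustrative helper (an assumption, not part of the model card): trim a
# token-id sequence so prompt + generated tokens fit in the 32,768-token window.
CTX_LENGTH = 32_768  # context length stated on the model card

def fit_to_context(token_ids, max_new_tokens=1024, ctx_length=CTX_LENGTH):
    """Keep the most recent tokens, reserving room for generation."""
    budget = ctx_length - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens exceeds the context length")
    # Keep the tail of the prompt: for code continuation, the most
    # recent context is usually the most relevant.
    return token_ids[-budget:]

prompt = list(range(40_000))   # a prompt longer than the window
trimmed = fit_to_context(prompt)
print(len(trimmed))            # 31744 tokens remain for the prompt
```

In practice the token ids would come from the model's tokenizer; the point is simply that even very large codebases must be budgeted against the 32,768-token limit.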

Training Details

The model was trained for 7 epochs with a learning rate of 4e-05 and a total batch size of 16 across 8 GPUs. The optimizer was ADAMW_TORCH_FUSED with a cosine learning-rate scheduler and a warmup ratio of 0.1.
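The scheduler settings can be sketched numerically. Below is a minimal reimplementation assuming the common linear-warmup-then-cosine-decay shape; the trainer's actual scheduler implementation may differ in detail:

```python
import math

LR = 4e-5           # peak learning rate from the training details
WARMUP_RATIO = 0.1  # fraction of total steps spent warming up

def lr_at(step, total_steps, peak_lr=LR, warmup_ratio=WARMUP_RATIO):
    """Linear warmup to peak_lr, then cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return peak_lr * 0.5 * (1 + math.cos(math.pi * progress))

total = 1000
print(lr_at(100, total))   # end of warmup: peak LR, 4e-05
print(lr_at(1000, total))  # final step: decayed to 0.0
```

With a warmup ratio of 0.1, the first 10% of training steps ramp the learning rate linearly from zero to 4e-05, after which it follows a cosine curve back down.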

Good For

  • Developers and researchers working on code generation and analysis.
  • Applications requiring a model proficient in understanding and producing programming language constructs.
  • Scenarios where a large context window is crucial for handling complex coding problems.