laion/Kimi-K2T-ling-coder-sft-sandboxes-1-maxeps-32k
The laion/Kimi-K2T-ling-coder-sft-sandboxes-1-maxeps-32k model is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. It is optimized for code generation and understanding via a fine-tuning dataset focused on coding tasks, and it supports a context length of 32768 tokens, making it suited to applications that demand robust performance in programming-related contexts.
Model Overview
laion/Kimi-K2T-ling-coder-sft-sandboxes-1-maxeps-32k is an 8-billion-parameter language model fine-tuned from the Qwen/Qwen3-8B architecture. It was specialized by fine-tuning on the penfever/Kimi-K2T-ling-coder-sft-sandboxes-1-maxeps-32k dataset, reflecting a focus on coding and programming-related tasks. It supports a context length of 32768 tokens, making it suitable for processing and generating long code files or complex programming instructions.
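A minimal inference sketch using Hugging Face transformers, assuming the repository ships standard Qwen3-style tokenizer and config files; the generation settings below are illustrative, not recommended defaults:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "laion/Kimi-K2T-ling-coder-sft-sandboxes-1-maxeps-32k"

def generate_completion(prompt: str, max_new_tokens: int = 256) -> str:
    """Generate a completion for a coding prompt (illustrative settings)."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

# Example call (downloads the model weights on first use):
# print(generate_completion("Write a Python function that reverses a string."))
```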
Key Capabilities
- Code-centric Fine-tuning: Optimized for tasks related to code generation, completion, and understanding due to its specialized training data.
- Large Context Window: Benefits from a 32k token context length, allowing for handling extensive codebases or detailed technical specifications.
- Qwen3-8B Base: Inherits the foundational capabilities of the Qwen3-8B model, providing a strong base for its specialized performance.
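To make the 32k window concrete, here is a sketch of budgeting input against the context length. The fixed-size windowing and the `reserve` parameter are generic illustrations, not behavior specific to this model; real code would count tokens with the model's own tokenizer rather than assume a pre-tokenized sequence:

```python
CONTEXT_LENGTH = 32768  # model's maximum context, per the card above

def split_into_windows(token_ids, max_tokens=CONTEXT_LENGTH, reserve=1024):
    """Split a token-id sequence into chunks that each fit the context
    window, reserving `reserve` tokens of headroom for generated output."""
    budget = max_tokens - reserve
    return [token_ids[i:i + budget] for i in range(0, len(token_ids), budget)]
```

For example, a 70000-token file would be split into three windows of at most 31744 tokens each, leaving room for the model's response in every call.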
Training Details
The model was trained for 7 epochs with a peak learning rate of 4e-05 and a total batch size of 16 across 8 GPUs. The optimizer was AdamW (adamw_torch_fused), with a cosine learning-rate scheduler and a warmup ratio of 0.1.
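The schedule above (linear warmup over the first 10% of steps, then cosine decay to zero) can be sketched as follows; this mirrors the usual shape of a warmup-plus-cosine schedule and the step counts are illustrative, not taken from the actual training run:

```python
import math

def lr_at_step(step, total_steps, base_lr=4e-05, warmup_ratio=0.1):
    """Learning rate at a given step: linear warmup, then cosine decay to 0."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

With 1000 total steps, the rate ramps to 4e-05 at step 100 and decays back to zero by step 1000.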
Good For
- Developers and researchers working on code generation and analysis.
- Applications requiring a model proficient in understanding and producing programming language constructs.
- Scenarios where a large context window is crucial for handling complex coding problems.