DR.Kernel-14B-ColdStart Overview
hkust-nlp/drkernel-14b-coldstart is a 14-billion-parameter model built on the Qwen3 architecture and developed by hkust-nlp. This release is the cold-start supervised fine-tuning (SFT) checkpoint of the DR.Kernel project: it was trained exclusively on multi-turn SFT data (the hkust-nlp/drkernel-coldstart-8k dataset) to teach the model kernel-generation and refinement behaviors.
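As a sketch of how the checkpoint might be loaded, the snippet below uses the standard Hugging Face transformers API. Note that `build_prompt` is a hypothetical placeholder: the official DR.Kernel prompt template is not described here, so the instruction wording is illustrative only.

```python
# Minimal loading sketch for the cold-start checkpoint via transformers.
# build_prompt is a hypothetical wrapper, not the official prompt format.
model_id = "hkust-nlp/drkernel-14b-coldstart"

def build_prompt(pytorch_src: str) -> str:
    # Illustrative instruction for a kernel-rewrite request.
    return (
        "Rewrite the following PyTorch code using custom Triton kernels:\n\n"
        + pytorch_src
    )

if __name__ == "__main__":
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )
    prompt = build_prompt("def forward(self, x):\n    return x.relu()\n")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=512)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```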
Key Capabilities
- Structured Kernel Optimization: Specializes in generating and refining code that replaces standard PyTorch operators with custom Triton kernels to improve runtime performance.
- Initialization for RL: Primarily intended as an initialization checkpoint for subsequent reinforcement learning (RL) stages, including TRLOO, MRS, PR, and PRS.
- Strong SFT Baseline: Functions as a robust baseline for kernel generation tasks, useful for ablations comparing cold-start versus post-RL checkpoints.
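The first capability above, replacing a PyTorch operator with a hand-written Triton kernel, can be sketched with the textbook Triton elementwise add. This is not actual model output, and it assumes `triton` and a CUDA device are available when the function is called:

```python
def cdiv(n: int, block: int) -> int:
    # Ceiling division: how many Triton program instances cover n elements.
    return (n + block - 1) // block

def triton_add(x, y, block_size: int = 1024):
    """Elementwise add via a custom Triton kernel (needs triton + CUDA).

    Stands in for x + y; the kind of rewrite DR.Kernel is trained to emit.
    """
    # Imports are local so the file can be read without a GPU stack installed.
    import torch
    import triton
    import triton.language as tl

    @triton.jit
    def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
        pid = tl.program_id(axis=0)
        offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
        mask = offsets < n_elements  # guard the ragged final block
        a = tl.load(x_ptr + offsets, mask=mask)
        b = tl.load(y_ptr + offsets, mask=mask)
        tl.store(out_ptr + offsets, a + b, mask=mask)

    out = torch.empty_like(x)
    n = x.numel()
    add_kernel[(cdiv(n, block_size),)](x, y, out, n, BLOCK_SIZE=block_size)
    return out
```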
Intended Use Cases
- RL Training Initialization: Use this model as the starting point for further RL training to develop the full DR.Kernel model.
- Kernel Generation Baseline: Employ it as a strong SFT baseline for tasks involving the generation of optimized Triton kernels.
- Ablation Studies: Useful in research for comparing the performance and behavior of a cold-start SFT model against checkpoints that have undergone additional RL training.