DR.Kernel-8B-ColdStart Overview
hkust-nlp/drkernel-8b-coldstart is an 8-billion-parameter model based on the Qwen3 architecture, developed by hkust-nlp. It is the cold-start supervised fine-tuning (SFT) checkpoint for the DR.Kernel project, focused on generating structured kernel-optimization responses. The model is trained exclusively on multi-turn SFT data from the hkust-nlp/drkernel-coldstart-8k dataset, which teaches kernel-generation and refinement behaviors.
Key Capabilities & Purpose
- Structured Kernel Optimization: Specializes in producing optimized kernel code, particularly for Triton kernels, by transforming existing PyTorch operators.
- Initialization for RL: Designed as the foundational checkpoint for subsequent reinforcement learning (RL) stages (TRLOO, MRS, PR, PRS) within the DR.Kernel framework.
- Strong SFT Baseline: Provides a robust supervised fine-tuning baseline for tasks involving kernel generation and optimization.
- Ablation Studies: Useful for researchers conducting ablations to compare performance between cold-start and post-RL checkpoints.
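Because the checkpoint is trained on multi-turn generation-and-refinement dialogues, interacting with it typically means feeding each revision round back as a new conversational turn. Below is a minimal sketch of such a conversation builder; the system prompt, feedback phrasing, and `build_refinement_messages` helper are illustrative assumptions, not the official DR.Kernel prompt format.

```python
# Sketch: assemble a multi-turn kernel-refinement conversation for a chat model.
# The prompts below are illustrative assumptions; consult the DR.Kernel
# repository for the actual template used during SFT.

def build_refinement_messages(pytorch_op: str, rounds: list) -> list:
    """Turn an initial request plus (kernel, feedback) rounds into chat messages."""
    messages = [
        {"role": "system",
         "content": "You are a kernel-optimization assistant. "
                    "Rewrite PyTorch operators as optimized Triton kernels."},
        {"role": "user",
         "content": f"Optimize this PyTorch operator as a Triton kernel:\n{pytorch_op}"},
    ]
    for r in rounds:
        # The model's previous attempt, then profiler/correctness feedback
        # appended as the next user turn.
        messages.append({"role": "assistant", "content": r["kernel"]})
        messages.append({"role": "user",
                         "content": f"Feedback on the kernel above:\n{r['feedback']}\n"
                                    "Please refine it."})
    return messages

msgs = build_refinement_messages(
    "torch.nn.functional.relu(x)",
    [{"kernel": "@triton.jit\ndef relu_kernel(...): ...",
      "feedback": "Correct, but memory-bound; try vectorized loads."}],
)
```

The resulting message list can then be rendered with the Qwen3 chat template (e.g. via the tokenizer's `apply_chat_template`) before generation.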
Intended Use Cases
- RL Training Initialization: The primary use is to serve as the starting point for DR.Kernel's reinforcement learning training.
- Kernel Generation Baseline: Can be used as a strong SFT model for generating optimized Triton kernels.
- Research & Development: Ideal for experimental setups and comparative analysis in kernel optimization research.
This checkpoint does not reflect the final performance of the full DR.Kernel RL pipeline and is not intended for safety-critical production deployment without further verification.
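One concrete form of the verification mentioned above is checking that a generated kernel numerically agrees with the reference PyTorch operator on sample inputs. Below is a framework-free sketch of such a check; `outputs_match` and the ReLU stand-ins are hypothetical placeholders for a real reference operator and a model-generated candidate kernel.

```python
import math

def outputs_match(reference, candidate, inputs, rel_tol=1e-5, abs_tol=1e-6):
    """Compare a candidate kernel against a reference implementation elementwise."""
    for x in inputs:
        ref, out = reference(x), candidate(x)
        if len(ref) != len(out):
            return False
        if not all(math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
                   for a, b in zip(ref, out)):
            return False
    return True

# Toy stand-ins: a reference ReLU and a candidate that should agree with it.
relu_ref = lambda xs: [max(0.0, v) for v in xs]
relu_candidate = lambda xs: [v if v > 0.0 else 0.0 for v in xs]

ok = outputs_match(relu_ref, relu_candidate, [[-1.0, 0.0, 2.5], [3.0, -4.0]])
```

In practice the same pattern would run on GPU tensors with a tolerance appropriate to the kernel's precision, alongside a latency comparison against the PyTorch baseline.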