tencent/DRIVE-SFT

Text Generation · Concurrency Cost: 2 · Model Size: 32.8B · Quant: FP8 · Context Length: 32k · Published: Nov 12, 2025 · Architecture: Transformer

DRIVE-SFT is a 32.8-billion-parameter supervised fine-tuned model developed by the Hunyuan Team at Tencent, based on Qwen2.5. It is specifically optimized for competitive code generation, using a difficulty-aware sampling strategy during SFT to focus on challenging problems. This model serves as the initial stage of the DRIVE-RL pipeline, which further improves performance on complex coding tasks through a two-stage reinforcement learning process.


DRIVE-SFT: Supervised Fine-Tuning for Competitive Code Generation

DRIVE-SFT, developed by the Hunyuan Team at Tencent, is the Supervised Fine-Tuning (SFT) component of the larger DRIVE pipeline, designed for competitive programming code generation. Built upon Qwen2.5-32B, this model incorporates a key innovation: Difficulty-Aware Sampling. During training, competitive programming prompts are categorized by difficulty, and hard samples are duplicated to force the model to focus on more challenging problems. This SFT phase also augments training with general-purpose coding and reasoning-intensive data to enhance overall capabilities.
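The duplication idea behind Difficulty-Aware Sampling can be sketched as follows. This is a minimal illustration, not the team's implementation: the duplication factors and the `"difficulty"` field name are assumptions made for the example.

```python
import random

# Assumed duplication factors for illustration; the actual ratios used
# for DRIVE-SFT are not specified here.
DUP_FACTOR = {"easy": 1, "medium": 2, "hard": 3}

def difficulty_aware_sample(dataset, dup_factor=DUP_FACTOR, seed=0):
    """Duplicate harder prompts so the SFT mix skews toward challenging problems.

    Each example is a dict with at least a "difficulty" key; a hard example
    appears dup_factor["hard"] times in the resulting training mix.
    """
    expanded = []
    for example in dataset:
        expanded.extend([example] * dup_factor[example["difficulty"]])
    random.Random(seed).shuffle(expanded)  # reproducible shuffle of the mix
    return expanded
```

Because hard problems are repeated, the model sees proportionally more gradient updates on them during SFT, without any change to the loss or training loop itself.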

Key Capabilities

  • Enhanced Code Generation: Specifically fine-tuned to improve performance on competitive programming tasks.
  • Difficulty-Aware Training: Prioritizes learning from harder coding problems through strategic data sampling.
  • Foundation for RL: Serves as a robust base model before undergoing a two-stage Reinforcement Learning process (DRIVE-RL) for further performance gains.

Good For

  • Developers and researchers interested in advanced code generation, particularly for competitive programming.
  • A strong baseline model for further fine-tuning or reinforcement learning on coding tasks.
  • Exploring techniques for improving model performance on challenging, complex problems through data curation.