zhuyaoyu/CodeV-R1-RL-Qwen-7B

Public · 7.6B parameters · FP8 · 32768-token context · Jun 3, 2025 · Hugging Face
Overview

CodeV-R1-RL-Qwen-7B: Specialized Verilog Generation

CodeV-R1-RL-Qwen-7B is a 7.6-billion-parameter model developed by Yaoyu Zhu et al. for generating Verilog, a hardware description language (HDL), from natural-language specifications. It is built on the Qwen-2.5 series and fine-tuned with CodeV-R1, a Reinforcement Learning with Verifiable Reward (RLVR) framework.

Key Capabilities & Innovations

  • Verilog Generation & Completion: Achieves state-of-the-art performance on VerilogEval v2 and RTLLM v1.1 benchmarks for both specification-to-RTL translation and code completion tasks.
  • RLVR Framework: Employs a novel RLVR framework that includes a rule-based testbench generator for robust equivalence checking and a round-trip data synthesis method to create high-quality NL-code pairs.
  • Adaptive DAPO: Trained with a two-stage distill-then-RL pipeline. Distillation first gives the model initial reasoning abilities; adaptive DAPO, an RL algorithm based on DAPO (Decoupled Clip and Dynamic sAmpling Policy Optimization), then reduces training cost by adaptively adjusting the sampling rate.
  • Computational Efficiency: Achieves better results than models such as DeepSeek-R1 while using fewer computational resources, based on a comparison of inference time and FLOPs consumption.
  • Reasoning Enhancement: Learning to reason about Verilog problems also improves the model's out-of-domain mathematical capabilities.
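
The verifiable-reward idea behind the RLVR bullet above can be illustrated with a toy sketch. It assumes, purely for illustration, that the reference and candidate designs are available as Python callables standing in for simulated combinational modules. The names `equivalence_reward`, `adder_ref`, `adder_ok`, and `adder_bad`, the random-stimulus strategy, and the binary reward are all hypothetical; the actual CodeV-R1 pipeline operates on Verilog through its rule-based testbench generator and may differ substantially.

```python
import random
from typing import Callable, Sequence

# Hypothetical stand-in for a simulated combinational module:
# it maps a list of integer input values to one integer output value.
Module = Callable[[Sequence[int]], int]


def equivalence_reward(
    reference: Module,
    candidate: Module,
    input_widths: Sequence[int],
    num_vectors: int = 256,
    seed: int = 0,
) -> float:
    """Return 1.0 if both modules agree on every random stimulus, else 0.0."""
    rng = random.Random(seed)  # fixed seed keeps the check reproducible
    for _ in range(num_vectors):
        # Draw one random value per input port, respecting its bit width.
        stimulus = [rng.getrandbits(width) for width in input_widths]
        try:
            if candidate(stimulus) != reference(stimulus):
                return 0.0
        except Exception:
            # A candidate that crashes is treated like a failed simulation.
            return 0.0
    return 1.0


def adder_ref(x: Sequence[int]) -> int:
    """Reference 8-bit adder with the carry dropped."""
    return (x[0] + x[1]) & 0xFF


def adder_ok(x: Sequence[int]) -> int:
    """Functionally equivalent candidate written differently."""
    return (x[0] + x[1]) % 256


def adder_bad(x: Sequence[int]) -> int:
    """Buggy candidate that ORs the inputs instead of adding them."""
    return x[0] | x[1]
```

Random stimulus can only show agreement on the vectors that were tried, so a reward of 1.0 here is evidence of equivalence rather than a proof.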

Performance Highlights

CodeV-R1-RL-Qwen-7B outperforms many general-purpose and coding-specific LLMs, including GPT-4o, Llama3.1, CodeLlama, and DeepSeek Coder, on Verilog-specific benchmarks. For instance, it achieves 68.8% on VerilogEval v2 (Spec-to-RTL) and 72.9% on RTLLM v1.1 (Pass@1), surpassing its distillation-based precursor, CodeV-R1-Distill-Qwen-7B.
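
For readers unfamiliar with the Pass@1 metric, the sketch below shows the unbiased pass@k estimator that is widely used for code-generation benchmarks: with n samples per problem, of which c pass, pass@k = 1 - C(n-c, k) / C(n, k). Whether the figures above were computed with exactly this estimator and which sample counts were used is not stated here, so treat this as background rather than a description of the authors' evaluation script. The function name `pass_at_k` is chosen for illustration.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: number of samples generated for the problem
    c: number of those samples that pass all tests
    k: number of attempts the metric allows
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: every size-k draw contains a pass.
        return 1.0
    # Probability that a random size-k subset contains no passing sample.
    all_fail = comb(n - c, k) / comb(n, k)
    return 1.0 - all_fail
```

For k = 1 the estimator reduces to c / n, the fraction of samples that pass; the benchmark score is the mean of this value over all problems.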

Usage Recommendations

Use the model's recommended system prompt during inference. It guides the reasoning process and the output format so that the final Verilog code is enclosed within <answer> ```verilog ... ``` </answer> tags.
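
Given that output format, a caller will usually want to pull the Verilog source out of the raw response. The following is a minimal sketch, assuming the response follows the tag layout described above; the helper name `extract_verilog` and the choice to take the last matching block are illustrative assumptions, not part of the model's tooling.

```python
import re
from typing import Optional

# Build the three-backtick fence programmatically so this example can
# itself sit inside a fenced code block without ambiguity.
FENCE = "`" * 3

# Matches <answer> FENCE verilog ... FENCE </answer>, capturing the code.
_ANSWER_RE = re.compile(
    rf"<answer>\s*{FENCE}verilog\s*(.*?){FENCE}\s*</answer>",
    re.DOTALL,
)


def extract_verilog(response: str) -> Optional[str]:
    """Return the Verilog code from the last answer block, or None."""
    matches = _ANSWER_RE.findall(response)
    if not matches:
        # The model did not follow the expected output format.
        return None
    # Assumption: if several blocks appear, the last one is the final answer.
    return matches[-1].strip()
```

Returning `None` instead of raising lets the caller decide whether a malformed response should be retried or counted as a failure.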