zhuyaoyu/CodeV-R1-RL-Qwen-7B

TEXT GENERATIONConcurrency Cost:1Model Size:7.6BQuant:FP8Ctx Length:32kTool Calling:SupportedPublished:Jun 3, 2025Architecture:Transformer0.0K Cold

CodeV-R1-RL-Qwen-7B is a 7.6 billion parameter language model developed by zhuyaoyu, fine-tuned for Verilog generation using a novel reinforcement learning with verifiable reward (RLVR) framework. Built upon the Qwen-2.5 series, it excels at translating natural language specifications into Verilog and performing code completion tasks. This model demonstrates superior efficiency and performance on Verilog benchmarks like VerilogEval v2 and RTLLM v1.1, achieving 68.8% on VerilogEval spec-to-RTL and 72.9% on RTLLM pass@1.

Loading preview...

Overview

CodeV-R1-RL-Qwen-7B is a 7.6 billion parameter language model specifically fine-tuned for generating Verilog hardware description language (HDL) from natural language specifications and for Verilog code completion. Developed by zhuyaoyu, this model is built upon the Qwen-2.5 series and leverages a novel reinforcement learning with verifiable reward (RLVR) framework called CodeV-R1. It addresses key challenges in electronic design automation (EDA) by employing a rule-based testbench generator for robust equivalence checking, a round-trip data synthesis method for high-quality NL-code pairs, and an adaptive DAPO RLVR algorithm to reduce training costs.

Key Capabilities

  • Verilog Generation: Translates natural language specifications into Verilog RTL code.
  • Code Completion: Completes Verilog code snippets.
  • Reinforcement Learning Fine-tuning: Utilizes an adaptive DAPO algorithm for efficient RLVR training.
  • High Performance: Achieves 68.8% on VerilogEval v2 (spec-to-RTL) and 72.9% on RTLLM v1.1 (pass@1), outperforming many general-purpose and coding-specific LLMs of similar size.
  • Computational Efficiency: Demonstrates better results with fewer computational resources compared to models like DeepSeek-R1, as highlighted by FLOPs consumption analysis.
  • Reasoning Enhancement: The acquisition of reasoning processes for Verilog problems also enhances its out-of-domain mathematical capabilities.

Good for

  • Developers and engineers working on electronic design automation (EDA) tasks.
  • Automating the generation of Verilog HDL from natural language descriptions.
  • Improving Verilog code completion workflows.
  • Research and development in RLVR for code generation, particularly in specialized domains like hardware design.

For more details on the training methodology and evaluation, refer to the project page and the associated paper.