Leopo1d/OpenVul-Qwen3-4B-SFT-ep1

TEXT GENERATION | Concurrency Cost: 1 | Model Size: 4B | Quant: BF16 | Ctx Length: 32k | Tool Calling: Supported | Published: Feb 14, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

Leopo1d/OpenVul-Qwen3-4B-SFT-ep1 is a 4 billion parameter Qwen3-based language model developed by Youpeng Li, Fuxun Yu, and Xinda Wang, fine-tuned for vulnerability detection in C/C++ code. It identifies security flaws by analyzing inter-procedural context rather than isolated functions. Training on high-quality vulnerability reasoning Chain-of-Thought (CoT) data gives the model basic security expertise and instruction-following capabilities. Its primary use is context-level vulnerability detection, with findings labeled by Common Weakness Enumeration (CWE) identifier.

OpenVul-Qwen3-4B-SFT: Vulnerability Detection Model

OpenVul-Qwen3-4B-SFT is a 4 billion parameter model built on the Qwen3 architecture, specifically fine-tuned for vulnerability detection in C/C++ code. Developed by Youpeng Li, Fuxun Yu, and Xinda Wang, this model is designed to identify security flaws with a focus on Common Weakness Enumeration (CWE) standards.

Key Capabilities & Features

  • Context-Level Vulnerability Detection: Unlike models that analyze isolated functions, OpenVul-Qwen3-4B-SFT uses inter-procedural context, including global variables, type definitions, and callee functions, for more comprehensive analysis.
  • Specialized Training: It has been fine-tuned on high-quality vulnerability reasoning Chain-of-Thought (CoT) data, curated by rejection sampling from DeepSeek-R1-0528. Rejection sampling was chosen to keep "ground-truth leakage" and reasoning hallucinations out of the training data.
  • Instruction Following: The model demonstrates strong instruction-following capabilities tailored for security analysis tasks.
  • Evidence-Based Analysis: It is designed to provide precise, evidence-based analysis without speculation, clearly labeling detected vulnerabilities and their CWE identifiers.
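As a rough illustration of what "context-level" input could look like, the sketch below assembles a target function together with its global variables, type definitions, and callee functions into one prompt string. The `FunctionContext` class, the section headings, and the instruction sentence are all hypothetical; the prompt format the model was actually trained on is not described here and should be taken from the model's own documentation.

```python
from dataclasses import dataclass, field


@dataclass
class FunctionContext:
    """Inter-procedural context for one target function (hypothetical layout)."""
    target: str
    global_variables: list[str] = field(default_factory=list)
    type_definitions: list[str] = field(default_factory=list)
    callees: list[str] = field(default_factory=list)


def build_prompt(ctx: FunctionContext) -> str:
    """Assemble a single prompt string; empty context sections are omitted."""
    sections = [
        ("Global variables", ctx.global_variables),
        ("Type definitions", ctx.type_definitions),
        ("Callee functions", ctx.callees),
    ]
    # Assumed instruction wording, not the model's documented prompt.
    parts = ["Analyze the target function for vulnerabilities and report CWE identifiers."]
    for title, snippets in sections:
        if snippets:
            parts.append(f"### {title}\n" + "\n\n".join(snippets))
    # The target function goes last so the context precedes it.
    parts.append(f"### Target function\n{ctx.target}")
    return "\n\n".join(parts)


example = FunctionContext(
    target="void copy_name(const char *src) { char buf[16]; fill(buf, src); }",
    global_variables=["static int g_name_count;"],
    callees=["void fill(char *dst, const char *src) { strcpy(dst, src); }"],
)
prompt = build_prompt(example)
```

Here the unsafe `strcpy` lives in the callee `fill`, so a function-only view of `copy_name` would not show it; passing the callee alongside the target is the kind of input the context-level capability refers to.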

Recommended Use Cases

  • Automated Security Audits: Ideal for integrating into pipelines for automated scanning of C/C++ codebases to identify potential security vulnerabilities.
  • Developer Tools: Can be used to assist developers in identifying and understanding security flaws during the coding process.
  • Research in Code Security: Provides a strong foundation for further research and development in LLM-based vulnerability detection, particularly for post-training pipelines as detailed in its associated paper.
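
For pipeline integration, one possible post-processing step is to pull CWE identifiers out of the model's text response and compare them with a block list. The sketch below assumes only that identifiers appear in the usual `CWE-<number>` form; the function names, the sample response, and the block-list policy are hypothetical and not part of the model's documented interface.

```python
import re

# Matches identifiers such as CWE-787; the output format is an assumption.
CWE_PATTERN = re.compile(r"\bCWE-(\d+)\b")


def extract_cwes(response: str) -> list[str]:
    """Return unique CWE identifiers in order of first appearance."""
    seen = []
    for match in CWE_PATTERN.finditer(response):
        cwe = f"CWE-{match.group(1)}"
        if cwe not in seen:
            seen.append(cwe)
    return seen


def should_fail_build(response: str, blocked: set[str]) -> bool:
    """True when any CWE mentioned in the response is on the block list."""
    return any(cwe in blocked for cwe in extract_cwes(response))


# Made-up response text for illustration only.
sample_response = "The call to strcpy overflows buf (CWE-787). Related: CWE-120, CWE-787."
found = extract_cwes(sample_response)
```

A plain pattern match also picks up identifiers the model mentions while ruling them out, so a real pipeline would likely want a stricter, structured verdict before failing a build.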