VulnHunter: AI Security Agent
VulnHunter is a specialized 7.6-billion-parameter AI agent developed by gateremark, fine-tuned from the Qwen2.5-Coder-7B-Instruct model. It uses Group Relative Policy Optimization (GRPO) with a custom security reward function to identify and fix web application security vulnerabilities. The model was trained with Unsloth and Hugging Face's TRL library. Because GRPO scores each completion relative to the other completions sampled for the same prompt, rather than training a separate value model, it targets PPO-quality learning without the high memory overhead of traditional PPO.
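The group-relative idea behind GRPO can be illustrated with a minimal sketch. It assumes the common formulation in which each reward is normalized by the mean and standard deviation of its own sampling group. The function name, the `eps` constant, and the use of the population standard deviation are choices made for this illustration, not details taken from VulnHunter's training code.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against the group it was sampled with.

    Illustrative only: `eps` and the population standard deviation
    are assumptions of this sketch, not VulnHunter's actual settings.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps keeps the division defined when every reward in the group is equal.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four completions sampled for the same prompt, scored by a reward function.
rewards = [1.0, 0.0, 0.5, 0.5]
advantages = group_relative_advantages(rewards)
```

Completions that score above the group mean receive a positive advantage and those below it a negative one, so no learned value model is needed to provide a baseline.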
Key Capabilities
- Vulnerability Detection: Identifies SQL Injection, Cross-Site Scripting (XSS), and Path Traversal vulnerabilities in code.
- Automatic Fix Generation: Suggests and generates secure code patches that remediate detected vulnerabilities.
- Code Understanding: Built on a code-pretrained base model (Qwen2.5-Coder-7B-Instruct), enabling strong comprehension of programming patterns and security contexts.
- Reinforcement Learning: Utilizes GRPO for effective policy optimization, guided by a reward function that incentivizes identifying vulnerability types, generating valid patches, and blocking exploits.
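A reward function covering the three signals named above (vulnerability type, patch validity, exploit blocking) might look roughly like the following sketch. The function names, the weights, and the use of `ast.parse` as a validity check for Python patches are all hypothetical and chosen for illustration. The actual VulnHunter reward function is not reproduced here.

```python
import ast


def patch_is_valid(patch_source: str) -> bool:
    """Hypothetical validity check: the patch must parse as Python."""
    try:
        ast.parse(patch_source)
        return True
    except SyntaxError:
        return False


def security_reward(predicted_type: str, true_type: str,
                    patch_source: str, exploit_blocked: bool) -> float:
    """Combine the three signals into one scalar (weights are assumptions)."""
    reward = 0.0
    # Signal 1: the reported vulnerability class matches the label.
    if predicted_type.strip().lower() == true_type.strip().lower():
        reward += 1.0
    # Signal 2: the generated patch is syntactically valid.
    if patch_is_valid(patch_source):
        reward += 1.0
        # Signal 3: only a valid patch can earn credit for blocking the exploit.
        if exploit_blocked:
            reward += 2.0
    return reward


patch = 'cursor.execute("SELECT * FROM users WHERE id = ?", (user_id,))'
full_score = security_reward("SQL Injection", "sql injection", patch, True)
```

Gating the exploit-blocking term on patch validity is one possible design choice. It prevents a malformed patch from being rewarded just because the exploit no longer runs against broken code.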
Training and Performance
The model was trained on a single NVIDIA A100 GPU for approximately 90 minutes, using 4-bit quantization to keep memory use low. Its base model, Qwen2.5-Coder-7B-Instruct, was chosen for its code pre-training, instruction-following capabilities, and compatibility with Unsloth for accelerated training. VulnHunter also includes an OpenEnv-compatible RL environment and an A2A-compatible agent wrapper for integration into agent-based systems.
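A back-of-the-envelope calculation shows why 4-bit quantization matters for a model of this size. The sketch below estimates weight storage only. It ignores quantization metadata, activations, adapter weights, and optimizer state, so the figures are rough lower bounds rather than measured values from the VulnHunter training run.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


PARAMS = 7.6e9  # parameter count stated for VulnHunter

fp16_gb = weight_memory_gb(PARAMS, 16)  # 16-bit weights
int4_gb = weight_memory_gb(PARAMS, 4)   # 4-bit quantized weights

print(f"16-bit weights: ~{fp16_gb:.1f} GB, 4-bit weights: ~{int4_gb:.1f} GB")
```

Under these assumptions the weights shrink from roughly 15.2 GB to roughly 3.8 GB, which leaves far more accelerator memory for sampling multiple completions per prompt during GRPO.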
Good For
- Automated security analysis of web application codebases.
- Generating secure code suggestions and patches.
- Integrating into CI/CD pipelines for proactive vulnerability management.
- Research and development in AI-driven cybersecurity tools.