RedSage-Qwen3-8B-DPO: Cybersecurity LLM with DPO Alignment
RISys-Lab/RedSage-Qwen3-8B-DPO is the fourth and final stage of RISysLab's multi-stage RedSage cybersecurity LLM training pipeline. The model is fine-tuned from RedSage-Qwen3-8B-Ins using Direct Preference Optimization (DPO) on the allenai/llama-3.1-tulu-3-8b-preference-mixture dataset. DPO alignment builds on the model's specialized cybersecurity domain expertise while significantly improving its general reasoning capabilities and safety behavior.
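DPO optimizes the policy directly on preference pairs, without training a separate reward model: it increases the policy's likelihood margin of the chosen response over the rejected one, relative to a frozen reference model. A minimal sketch of the per-pair loss (the function name, argument layout, and β default are illustrative, not the actual training configuration):

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair.

    Each argument is the summed log-probability of the chosen or rejected
    response under the trainable policy or the frozen reference model.
    """
    chosen_ratio = policy_chosen_logp - ref_chosen_logp      # log pi/pi_ref (chosen)
    rejected_ratio = policy_rejected_logp - ref_rejected_logp  # log pi/pi_ref (rejected)
    margin = beta * (chosen_ratio - rejected_ratio)
    # Loss is -log(sigmoid(margin)); compute it in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy still matches the reference, both ratios are zero and the loss is log 2; it falls below that as soon as the policy prefers the chosen response more strongly than the reference does.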
Key Capabilities & Performance
- Cybersecurity Expertise: Achieves strong results across cybersecurity benchmarks, including RedSage-Bench (84.83% macro average), CTI-Bench, CyberMetric, MMLU (Security), SecBench, SecEva, and SECURE, consistently outperforming the base Qwen3-8B model.
- Enhanced General Reasoning: DPO alignment also improves performance on general benchmarks, yielding a mean score of 74.33% on the OpenLLM Leaderboard, with notable gains on MMLU, ARC-C, GSM8K, HellaSwag, and TruthfulQA over the non-reasoning Qwen3-8B.
- Instruction Following: DPO alignment on human-preference data improves adherence to user instructions.
Ideal Use Cases
- General-purpose cybersecurity assistance: Answering queries related to cybersecurity concepts, threats, and best practices.
- Log analysis: Identifying potential indicators of compromise (IOCs) within log entries.
- Threat intelligence summarization: Processing and summarizing threat intelligence reports.
- Educational queries: Providing explanations and information on cybersecurity topics.
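For the use cases above, the model would typically be loaded with the transformers library and prompted through its tokenizer's chat template (`tokenizer.apply_chat_template`), which for Qwen-family models produces a ChatML-style prompt. A minimal sketch of that wire format, for illustration only (the helper function and system prompt are hypothetical; in practice the tokenizer's own template should be used):

```python
def build_chatml_prompt(messages):
    """Format a message list in the ChatML style used by Qwen-family chat
    templates. Illustrative only: prefer tokenizer.apply_chat_template
    from transformers, which applies the model's actual template."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # Open the assistant turn so the model generates the reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a cybersecurity assistant."},
    {"role": "user", "content": "What is an indicator of compromise (IOC)?"},
])
```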
Limitations
- While the model is aligned for safety and helpfulness, it may still produce incorrect information; users should always verify outputs, especially in critical security environments.