Overview
Llama-Primus-Nemotron-70B-Instruct Overview
Llama-Primus-Nemotron-70B-Instruct is a 70 billion parameter instruction-tuned model developed by trend-cybertron, extending NVIDIA's Llama-3.1-Nemotron-70B-Instruct. This model underwent continued pre-training on over 10 billion tokens of large-scale cybersecurity corpora, followed by supervised fine-tuning and a DELLA-based merging process. It is designed to significantly enhance performance in cybersecurity-specific applications while preserving its general instruction-following capabilities.
Key Capabilities and Performance
- Cybersecurity Specialization: Achieves an 18.18% improvement in aggregate scores across public cybersecurity benchmarks, including CTI-Bench, CyberMetric, SecEval, and CISSP exam questions.
- Maintained General Performance: Demonstrates comparable performance to its base model on general-purpose instruction following benchmarks like Arena Hard.
- Extensive Training Data: Pre-trained on specialized cybersecurity datasets such as Primus-Seed-V2, Primus-FineWeb, and Primus-Nemotron-CC, focusing on blogs, news, books, websites, MITRE, and Trend Micro knowledge.
- Robust Context Window: Features a 32768 token context length, suitable for processing detailed cybersecurity reports and analyses.
Ideal Use Cases
- Cyber Threat Intelligence (CTI): Excels in tasks like CVE to CWE mapping, CVSS scoring, and analyzing attack techniques and epidemiology (ATE).
- Security Operations: Useful for evaluating cybersecurity knowledge, answering CISSP-level questions, and general security analysis.
- Research and Development: Provides a strong foundation for further fine-tuning or research in specialized cybersecurity domains.