Nitish-Garikoti/Foundation-Sec-8B
Foundation-Sec-8B: Cybersecurity-Specialized LLM
Foundation-Sec-8B is an 8-billion parameter base language model from Foundation AI at Cisco, built on the Llama-3.1-8B architecture. It underwent continued pretraining on a curated corpus of 5.1 billion cybersecurity-specific tokens, encompassing threat intelligence, vulnerability databases, and security standards. This specialization gives it a strong grasp of security concepts, terminology, and practice.
Key Capabilities & Performance
The model shows notable gains on cybersecurity benchmarks, scoring 3 to 9 points higher than Llama-3.1-8B on tasks such as CTI-MCQA (cybersecurity knowledge) and CTI-RCM (vulnerability root cause mapping). On these specialized tasks it performs comparably to or better than Llama-3.1-70B, while giving up only about 2 points on general language reasoning (MMLU).
Intended Use Cases
Foundation-Sec-8B is optimized for three core use-case categories:
- SOC Acceleration: Automating triage, summarization, and evidence collection for Security Operations Centers.
- Proactive Threat Defense: Simulating attacks, prioritizing vulnerabilities, and modeling attacker behavior.
- Engineering Enablement: Assisting with security validation, configuration assessment, and compliance.
It is particularly suited for local deployment, addressing concerns around data security and regulatory compliance. The model serves as a robust foundation for fine-tuning across various cybersecurity workflows, including summarization, classification, named entity recognition, and question answering in security contexts. For more details, refer to the technical report.
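As one sketch of how the model could slot into a SOC-acceleration workflow, the following Python builds a completion-style triage prompt and hands it to a pluggable local inference callable. The `fdtn-ai/Foundation-Sec-8B` Hub ID mentioned in the comments, the function and field names, and the stubbed model call are all illustrative assumptions, not part of this card.

```python
# Sketch of a SOC triage helper built around a locally deployed
# Foundation-Sec-8B. The `llm` callable is a stand-in: in practice it
# would wrap the model served locally (e.g. via Hugging Face
# `transformers` with the fdtn-ai/Foundation-Sec-8B checkpoint -- an
# assumed Hub ID). All names here are illustrative.
from dataclasses import dataclass
from typing import Callable


@dataclass
class Alert:
    source: str       # e.g. "EDR", "IDS", "SIEM correlation rule"
    description: str  # raw analyst-facing alert text


def build_triage_prompt(alert: Alert) -> str:
    """Format an alert into a prompt for the base model.

    Foundation-Sec-8B is a base (non-chat) model, so a completion-style
    prompt with an explicit instruction is used rather than a chat
    template.
    """
    return (
        "You are assisting a Security Operations Center.\n"
        f"Alert source: {alert.source}\n"
        f"Alert details: {alert.description}\n"
        "Summarize the likely threat, its severity, and the next triage step:\n"
    )


def triage(alert: Alert, llm: Callable[[str], str]) -> str:
    """Run one alert through the (locally hosted) model."""
    return llm(build_triage_prompt(alert))


# Stub model for demonstration; replace with a real local inference call.
def fake_llm(prompt: str) -> str:
    return "Likely credential-stuffing attempt; escalate to Tier 2."


alert = Alert(
    source="SIEM correlation rule",
    description="500 failed logins across 40 accounts in 5 minutes",
)
print(triage(alert, fake_llm))
```

Because the model call is behind a plain callable, the same pipeline can later be pointed at a fine-tuned variant (summarization, classification, NER, or QA) without changing the triage logic.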