trendmicro-ailab/Llama-Primus-Merged

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kPublished:Feb 16, 2025License:mitArchitecture:Transformer0.0K Open Weights Warm

Llama-Primus-Merged is an 8 billion parameter instruction-tuned language model developed by trendmicro-ailab, built upon Llama-3.1-8B-Instruct. It was pre-trained on a large cybersecurity corpus (Primus-Seed and Primus-FineWeb) and instruction fine-tuned on cybersecurity QA tasks. This model is specifically optimized for cybersecurity applications, demonstrating a 14.84% improvement in aggregated scores across multiple cybersecurity benchmarks compared to its base model. It maintains a 32768 token context length and is designed for specialized cybersecurity use cases.

Loading preview...

Overview

Llama-Primus-Merged is an 8 billion parameter instruction-tuned language model from trendmicro-ailab, specifically engineered for cybersecurity applications. It builds upon Llama-3.1-8B-Instruct, having undergone extensive pre-training on a proprietary cybersecurity corpus, including Primus-Seed and Primus-FineWeb datasets. The model was further instruction fine-tuned using approximately 1,000 curated cybersecurity QA tasks (Primus-Instruct) to enhance its instruction-following capabilities within the domain.

Key Capabilities

  • Enhanced Cybersecurity Performance: Achieves a 14.84% improvement in aggregated scores across various cybersecurity benchmarks, including CTI-Bench, CyberMetric, and SecEval, compared to Llama-3.1-8B-Instruct.
  • Specialized Training Data: Leverages a unique collection of open-source datasets (Primus-Seed, Primus-FineWeb, Primus-Instruct) tailored for cybersecurity LLM training.
  • Instruction Following: Retains strong instruction-following abilities, crucial for practical application in cybersecurity tasks.

Good For

  • Developers and researchers focused on cybersecurity-specific natural language processing tasks.
  • Applications requiring robust performance in areas like cyber threat intelligence (CTI), vulnerability analysis, and security operations.
  • Use cases where domain-specific knowledge and instruction adherence are critical for accurate and reliable outputs.