qcy98/MACyber-12B

Vision | Concurrency Cost: 1 | Model Size: 12B | Quant: FP8 | Ctx Length: 32k | Published: May 19, 2026 | Architecture: Transformer

MACyber-12B is a 12.18 billion parameter cybersecurity language model developed by qcy98, built on the Gemma3ForConditionalGeneration architecture with a 131,072 token context length. It is designed to analyze structured security data from heterogeneous sources, with a focus on anomaly detection, situation assessment, and response recommendation. The model integrates a dual-channel threat-intelligence retrieval mechanism and CyberLoRA for rapid adaptation to unseen threats, and it outperforms the average score of 13 LLM baselines by 21.42% on the MACyber-INT benchmark.


Overview

MACyber-12B is a 12.18 billion parameter cybersecurity language model developed by qcy98 on the Gemma3ForConditionalGeneration architecture. It is designed to bridge security data silos by analyzing operational records from diverse, structured security domains. Rather than answering decontextualized questions, it produces unified, evidence-grounded outputs for anomaly detection, situation assessment, reasoning, and response recommendation.

Key Capabilities & Features

  • Adaptive Threat Intelligence: Incorporates a dual-channel threat-intelligence retrieval mechanism for both known and unknown attacks.
  • CyberLoRA: A LoRA-based mechanism for rapid, single-step adaptation to previously unseen threats, achieving an average 3.18-point gain with 23.2 ms adaptation latency.
  • Performance: On the MACyber-INT benchmark, MACyber-12B outperforms the average score of 13 LLM baselines by 21.42% across four cybersecurity tasks.
  • Structured Output: Generates structured assessments in JSON format, including fields for evidence, analysis, action (none, monitor, block), official threat label, and severity (benign, suspicious, low, medium, high).
  • High Context Length: Configured with a maximum context length of 131,072 tokens.
  • Multimodal Potential: Although the model is primarily intended for structured text, the underlying Gemma3 architecture supports vision input (896x896 pixels, 256 image tokens).
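
The structured output described above can be checked before it reaches downstream tooling. The following is a minimal sketch, not the model's official schema: the key names (evidence, analysis, action, threat_label, severity) are assumptions, since the card lists the fields but not their exact JSON keys. Only the allowed action and severity values come from the card, and the example response is made up for illustration.

```python
import json

# Allowed values as listed in the model card.
ALLOWED_ACTIONS = ("none", "monitor", "block")
ALLOWED_SEVERITIES = ("benign", "suspicious", "low", "medium", "high")

# Hypothetical key names: the card names the fields but not the JSON keys.
REQUIRED_KEYS = ("evidence", "analysis", "action", "threat_label", "severity")


def parse_assessment(raw: str) -> dict:
    """Parse a model response and check it against the assumed schema.

    Raises ValueError if the text is not valid JSON, a field is missing,
    or action/severity falls outside the values listed in the card.
    """
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("assessment must be a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    if data["action"] not in ALLOWED_ACTIONS:
        raise ValueError(f"unexpected action: {data['action']!r}")
    if data["severity"] not in ALLOWED_SEVERITIES:
        raise ValueError(f"unexpected severity: {data['severity']!r}")
    return data


# Illustrative response only; not real model output.
EXAMPLE_RESPONSE = json.dumps({
    "evidence": "Repeated failed SSH logins from a single source address.",
    "analysis": "Pattern is consistent with password guessing.",
    "action": "monitor",
    "threat_label": "brute-force",
    "severity": "suspicious",
})

assessment = parse_assessment(EXAMPLE_RESPONSE)
```

Rejecting a response that fails validation, rather than guessing at a default action, keeps malformed output from being treated as a recommendation.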

Intended Uses

  • Security Telemetry Analysis: Analyzing threat-intelligence records.
  • Threat Classification: Structured classification, severity assessment, and response recommendation.
  • Situational Awareness: Retrieval-augmented cybersecurity situational awareness.
  • Benchmark Evaluation: Research and evaluation on the MACyber-INT benchmark.

Limitations

  • Model outputs may contain inaccuracies or inappropriate recommendations.
  • Retrieval-augmented outputs depend on the quality and coverage of reference records.
  • Not intended for autonomous incident response; human review is required for operational security workflows.
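
One way to honor the human-review requirement above is to make sure no recommended action is executed directly. The sketch below is an assumption about how a consuming application might do that, not part of the model or its tooling: ReviewQueue and route_assessment are hypothetical names, and the "action" key is an assumed JSON key for the action field.

```python
from dataclasses import dataclass, field


@dataclass
class ReviewQueue:
    """Hypothetical in-memory queue of assessments awaiting an analyst."""
    pending: list = field(default_factory=list)

    def submit(self, assessment: dict) -> None:
        self.pending.append(assessment)


def route_assessment(assessment: dict, queue: ReviewQueue) -> str:
    """Decide what happens to a model assessment; never execute it.

    An action of "none" is only logged. Anything else, including a
    missing or unrecognized action, is queued for human review.
    """
    if assessment.get("action") == "none":
        return "logged"
    queue.submit(assessment)
    return "pending_review"


queue = ReviewQueue()
status_block = route_assessment({"action": "block", "severity": "high"}, queue)
status_none = route_assessment({"action": "none", "severity": "benign"}, queue)
```

Routing unrecognized actions to review as well means a malformed response fails toward an analyst rather than toward silence.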