RISys-Lab/RedSage-Qwen3-8B-CFW

Text Generation | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 32k | Published: Oct 20, 2025 | Architecture: Transformer | Cold

RedSage-Qwen3-8B-CFW is an 8-billion-parameter large language model developed by RISys-Lab. It is built on Qwen3-8B-Base, has a 32,768-token context length, and was continually pre-trained on the CyberFineWeb corpus, a specialized dataset of 11.7 billion cybersecurity tokens. As a base model it is optimized for cybersecurity text completion and generation: it scores higher than its base model on cybersecurity benchmarks while retaining general reasoning capabilities through data replay. It is primarily intended for further fine-tuning on downstream cybersecurity tasks and for research into domain adaptation.

RedSage-Qwen3-8B-CFW: A Cybersecurity-Specialized LLM

RedSage-Qwen3-8B-CFW is an 8-billion-parameter Large Language Model (LLM) developed by RISys-Lab. It is the foundational stage of the RedSage pipeline: the general-purpose Qwen3-8B-Base model adapted to the cybersecurity domain through Continual Pre-training (CPT) on the CyberFineWeb corpus.

Key Capabilities

  • Domain Adaptation: Specialized for cybersecurity through CPT on ~11.7 billion tokens of high-quality cybersecurity web data.
  • General Knowledge Retention: Utilizes a data replay strategy with educational content to prevent catastrophic forgetting and maintain general reasoning abilities.
  • Improved Cybersecurity Performance: Outperforms its base model on cybersecurity benchmarks, including RedSage-Bench and external benchmarks such as CTI-Bench and MMLU (Security).
  • Base Model Functionality: Serves as a completion engine, ideal for generating cybersecurity-related text.
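
The data replay idea in the list above can be illustrated with a small sketch. The card does not state the replay ratio or how the mixing was actually done, so the function name `mix_with_replay`, the `replay_fraction` parameter, and the toy document lists below are all hypothetical; this only shows the general technique of interleaving general-purpose documents with domain documents.

```python
import random


def mix_with_replay(domain_docs, replay_docs, replay_fraction, seed=0):
    """Return a shuffled stream where about `replay_fraction` of the
    documents come from the general-purpose replay set.

    Hypothetical helper: the real RedSage mixing recipe is not described
    in the model card.
    """
    if not 0.0 <= replay_fraction < 1.0:
        raise ValueError("replay_fraction must be in [0, 1)")
    rng = random.Random(seed)
    # Solve n_replay / (n_domain + n_replay) = replay_fraction for n_replay.
    n_replay = round(len(domain_docs) * replay_fraction / (1.0 - replay_fraction))
    # Never ask for more replay documents than are available.
    n_replay = min(n_replay, len(replay_docs))
    mixed = list(domain_docs) + rng.sample(list(replay_docs), n_replay)
    rng.shuffle(mixed)
    return mixed


# Toy data standing in for cybersecurity and educational documents.
domain = [f"cyber-{i}" for i in range(70)]
replay = [f"edu-{i}" for i in range(100)]
stream = mix_with_replay(domain, replay, replay_fraction=0.3)
```

With 70 domain documents and a replay fraction of 0.3, the stream holds 100 documents, 30 of them drawn from the replay set. In practice the mixing would usually be done at the token level rather than per document.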

Good For

  • Further Fine-tuning: Excellent as a base model for subsequent fine-tuning on specific downstream cybersecurity tasks.
  • Research: Suitable for research into domain adaptation techniques and continual pre-training dynamics in LLMs.
  • Cybersecurity Text Generation: Effective for generating and completing text within the cybersecurity domain.

Note: As a base model, RedSage-Qwen3-8B-CFW has not been instruction-tuned or aligned. For a chat-ready assistant, users should refer to RISys-Lab/RedSage-Qwen3-8B-DPO.
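
Because the model is a plain completion engine, a prompt is best written as the start of a passage to be continued rather than as a chat message. The following is a minimal sketch of that usage, assuming the weights are available on the Hugging Face Hub under the identifier shown at the top of this page and that the `transformers`, `torch`, and `accelerate` packages are installed. The helper names `prompt_budget` and `complete` are illustrative and not part of any RedSage tooling.

```python
MODEL_ID = "RISys-Lab/RedSage-Qwen3-8B-CFW"  # assumed to resolve on the Hugging Face Hub
CONTEXT_LENGTH = 32768  # context length stated in this model card


def prompt_budget(max_new_tokens, context_length=CONTEXT_LENGTH):
    """Tokens left for the prompt once room for the completion is reserved."""
    if not 0 < max_new_tokens < context_length:
        raise ValueError("max_new_tokens must be between 1 and context_length - 1")
    return context_length - max_new_tokens


def complete(prompt, max_new_tokens=128):
    """Continue `prompt` with greedy decoding and return only the new text."""
    # Imported lazily so prompt_budget() works without these packages installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # Keep the end of an over-long prompt, since that is what gets continued.
    tokenizer.truncation_side = "left"
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype="auto",  # newer transformers releases may prefer `dtype`
        device_map="auto",   # requires the accelerate package
    )
    inputs = tokenizer(
        prompt,
        return_tensors="pt",
        truncation=True,
        max_length=prompt_budget(max_new_tokens),
    ).to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Drop the prompt tokens so only the generated continuation is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

A call such as `complete("A SQL injection attack occurs when")` would then return the model's continuation of that sentence. No chat template is applied, which matches the note above that the model has not been instruction-tuned.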