empero-ai/Qwythos-9B-Claude-Mythos-5-1M

VISION | Concurrency Cost: 1 | Model Size: 9B | Quant: FP8 | Ctx Length: 32k | Tool Calling: Supported | Published: Jun 19, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Warm

Qwythos-9B-Claude-Mythos-5-1M by Empero is a 9-billion-parameter reasoning model built on a Qwen3.5-9B base and post-trained on over 500 million tokens of Claude Mythos and Fable traces. It offers a 1,048,576-token context window, native function calling, and self-correction with tools. The model targets technically demanding questions in cybersecurity, biomedicine, and quantitative reasoning, and it reports gains of +34 points on MMLU and +30 points on GSM8K-strict over its base model.


Qwythos-9B: A Powerful Reasoning Model

Empero's Qwythos-9B is a 9-billion-parameter model, fine-tuned from a Qwen3.5-9B base and designed for advanced reasoning tasks. It is trained on over 500 million tokens of high-quality Claude Mythos and Fable traces, processed with Empero AI's internal rethink tool for chain-of-thought generation.

Key Capabilities

  • 1,048,576-token context window: Achieved through YaRN rope-scaling, enabling whole-codebase reasoning, multi-document research, and long agentic trajectories.
  • Enhanced Reasoning Performance: Gains +34 MMLU points and +30 GSM8K-strict points over the base Qwen3.5-9B under matched evaluation conditions.
  • Native Function Calling: Supports OpenAI/Qwen3.5-style function calling out of the box, without additional wrappers or tool-specific fine-tuning.
  • Self-Correction with Tools: Proven to produce source-cited, factually correct answers on complex prompts by integrating Python execution and web search tools.
  • Uncensored Design: Intentionally uncensored to engage substantively with technically demanding questions in domains like cybersecurity, red-teaming, biology, pharmacology, and clinical medicine.
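
The YaRN rope-scaling mentioned above can be illustrated with a minimal sketch. It computes the scaling factor needed to stretch a base context length to 1,048,576 tokens and builds a `rope_scaling` entry using the key names found in Hugging Face Transformers configs (`rope_type`, `factor`, `original_max_position_embeddings`). The card does not state the base model's native context length or the factor actually used, so the 262,144 value below is a hypothetical placeholder, and `yarn_rope_scaling` is an illustrative helper rather than part of any library.

```python
TARGET_CONTEXT = 1_048_576  # context window stated in the model card


def yarn_rope_scaling(native_context: int, target_context: int = TARGET_CONTEXT) -> dict:
    """Build a YaRN rope_scaling entry that stretches native_context to target_context."""
    if native_context <= 0:
        raise ValueError("native_context must be positive")
    if target_context < native_context:
        raise ValueError("target_context must be at least native_context")
    # YaRN is parameterised by the ratio of extended length to native length.
    factor = target_context / native_context
    return {
        "rope_type": "yarn",
        "factor": factor,
        "original_max_position_embeddings": native_context,
    }


# ASSUMPTION: 262,144 is a hypothetical native context length, not a value from the card.
cfg = yarn_rope_scaling(262_144)
print(cfg)
```

Under that assumed native length the factor works out to 4.0; a different native length would give a different factor, so the real value should be read from the model's published config.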

Good For

  • Complex Problem Solving: Excels in multi-step reasoning, especially in cybersecurity, biomedical, and quantitative fields.
  • Agentic Workflows: Ideal for retrieval-augmented agentic settings where models need to verify specifics and integrate information from tools.
  • Long-Context Applications: Suitable for tasks requiring extensive context, such as analyzing large codebases or synthesizing multiple research papers.
  • Technical Domain Expertise: Provides detailed and accurate responses to specialized technical inquiries without refusal or hedging.
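
To make the OpenAI-style function calling behind the agentic workflows concrete, here is a minimal offline sketch of the tool side of that exchange: a tool schema, a registry of local functions, and a dispatcher that turns an assistant tool call into a `tool` message. It does not call the model; the tool call is simulated, and `word_count`, `REGISTRY`, and `dispatch` are hypothetical names chosen for illustration. Only the message and schema shapes follow the OpenAI chat-completions convention.

```python
import json

# OpenAI-style tool schema. `word_count` is a hypothetical example tool.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "word_count",
        "description": "Count whitespace-separated words in a text.",
        "parameters": {
            "type": "object",
            "properties": {"text": {"type": "string"}},
            "required": ["text"],
        },
    },
}]


def word_count(text: str) -> int:
    return len(text.split())


# Maps tool names from the schema to local Python callables.
REGISTRY = {"word_count": word_count}


def dispatch(tool_call: dict) -> dict:
    """Run one assistant tool call and return the matching `tool` message."""
    fn = tool_call["function"]
    name = fn["name"]
    if name not in REGISTRY:
        content = json.dumps({"error": f"unknown tool: {name}"})
    else:
        # In this convention the arguments arrive as a JSON-encoded string.
        args = json.loads(fn["arguments"])
        content = json.dumps({"result": REGISTRY[name](**args)})
    return {"role": "tool", "tool_call_id": tool_call["id"], "content": content}


# Simulated assistant output; a real run would take this from the model response.
simulated_call = {
    "id": "call_0",
    "type": "function",
    "function": {
        "name": "word_count",
        "arguments": json.dumps({"text": "long agentic trajectories"}),
    },
}
tool_message = dispatch(simulated_call)
print(tool_message)
```

In a full agent loop, `TOOLS` would be sent with the request, each returned tool call would pass through `dispatch`, and the resulting message would be appended to the conversation before the next model turn.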