Gryphe/Pantheon-Reasoning-26B-A4B-1.1

Vision | Concurrency Cost: 2 | Model Size: 26B | Quant: FP8 | Ctx Length: 32k | Tool Calling: Supported | Published: Jun 6, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

Gryphe/Pantheon-Reasoning-26B-A4B-1.1 is a 26-billion-parameter sparse Mixture-of-Experts (MoE) model based on Google's Gemma 4 architecture, fine-tuned to reason before it writes in roleplay scenarios. Its training data includes full thinking traces, so the model can plan narrative beats and character responses ahead of the reply. The goal is deeper, higher-quality roleplay with more grounded, prose-forward writing.


Model Overview

Gryphe/Pantheon-Reasoning-26B-A4B-1.1 is a 26-billion-parameter model built on Google's Gemma 4 MoE architecture, specifically the google/gemma-4-26B-A4B-it variant. It is an experimental fine-tune focused on bringing reasoning into roleplay generation. Where typical Qwen models often reason well but write less well, this Pantheon series model aims to combine reasoning with high-quality narrative output.

Key Capabilities

  • Enhanced Roleplay Quality: Designed to improve roleplay by enabling the model to "think" through character motivations, tone, and narrative planning before generating responses.
  • Reasoning Integration: Trained with full reasoning traces on every assistant turn. The training data draws on Pantheon roleplay data, general roleplay, text adventures, and the Opus-4.6-Reasoning-24k dataset.
  • Back-Generated Reasoning: Utilizes DeepSeek 3.2 to back-generate planning-oriented thinking traces for roleplay and text adventure data, validated by a judge model to ensure quality and relevance.
  • Grounded Prose: Incorporates text adventure data to foster a more grounded and prose-forward writing style.
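
The back-generation step described above can be pictured roughly as follows. This is a minimal sketch, not the author's pipeline: the `Turn` type, the `back_generate` function, the `min_score` threshold, and both toy callables are hypothetical names introduced for illustration. In practice `generate_trace` would presumably wrap a call to the trace-writing model and `judge` a call to the judge model, neither of which is specified in this card.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class Turn:
    context: str                    # conversation so far
    response: str                   # assistant reply already present in the dataset
    thinking: Optional[str] = None  # planning trace added by back-generation


def back_generate(
    turns: Iterable[Turn],
    generate_trace: Callable[[str, str], str],
    judge: Callable[[Turn], float],
    min_score: float = 0.7,  # assumed threshold, not taken from the model card
) -> List[Turn]:
    """Attach a generated trace to each turn and keep those the judge accepts."""
    kept: List[Turn] = []
    for turn in turns:
        trace = generate_trace(turn.context, turn.response)
        candidate = Turn(turn.context, turn.response, trace)
        if judge(candidate) >= min_score:
            kept.append(candidate)
    return kept


# Toy stand-ins so the sketch runs without any model calls.
def toy_generator(context: str, response: str) -> str:
    return f"Plan: answer <{context}> in the tone of <{response}>."


def toy_judge(turn: Turn) -> float:
    # Accept only traces that mention the reply they are supposed to plan.
    ok = bool(turn.thinking) and turn.response in turn.thinking
    return 1.0 if ok else 0.0
```

The key design point is that the existing reply is kept fixed: only the thinking trace is generated after the fact, and a turn whose trace the judge rejects is dropped rather than trained on.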

What Makes it Different

What sets this model apart is that reasoning is built directly into roleplay generation. It is trained on explicit thinking traces that simulate a writer's planning process, with the aim of producing more coherent, in-character, and narratively sound responses. The 1.1 update further refines these traces through multiple QA stages and a self-iterating pipeline, intended to yield higher-quality, more effective planning. The model is an exploration of whether explicit reasoning can meaningfully improve creative, character-driven text generation.
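
One way to read "multiple QA stages and a self-iterating pipeline" is as a loop that revises a trace until every check passes or a round budget runs out. The sketch below illustrates that general pattern only. The stage names, the `refine_trace` and `toy_revise` functions, and the `max_rounds` limit are all assumptions, since the card does not describe the actual stages or how revisions are produced.

```python
from typing import Callable, List, Sequence, Tuple

# A QA stage is a (name, check) pair; the check returns True when the trace passes.
QAStage = Tuple[str, Callable[[str], bool]]


def refine_trace(
    trace: str,
    stages: Sequence[QAStage],
    revise: Callable[[str, List[str]], str],
    max_rounds: int = 3,  # assumed budget, not taken from the model card
) -> Tuple[str, bool]:
    """Revise a trace until all QA stages pass or the round budget is spent.

    Returns (final_trace, passed).
    """
    for _ in range(max_rounds):
        failed = [name for name, check in stages if not check(trace)]
        if not failed:
            return trace, True
        trace = revise(trace, failed)
    # Re-check once after the last revision.
    return trace, all(check(trace) for _, check in stages)


# Hypothetical stages and reviser, for illustration only.
TOY_STAGES: List[QAStage] = [
    ("non_empty", lambda t: bool(t.strip())),
    ("mentions_plan", lambda t: "plan" in t.lower()),
]


def toy_revise(trace: str, failed: List[str]) -> str:
    # A real pipeline would presumably ask a model to rewrite the trace.
    if "mentions_plan" in failed:
        return "Plan: " + trace
    return trace
```

Calling `refine_trace("keep her guarded", TOY_STAGES, toy_revise)` fails the `mentions_plan` stage once, is revised, and then passes on the second round.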