rpDungeon/Gemma-4-E4B-Luchador

VISION
Concurrency Cost: 1
Model Size: 7.9B
Quant: FP8
Ctx Length: 32k
Tool Calling: Supported
Published: May 24, 2026
License: gemma
Architecture: Transformer

rpDungeon/Gemma-4-E4B-Luchador is a 7.9 billion parameter, Gemma 4 E4B-scale language model developed by rpDungeon and optimized for roleplay and creative writing. It is built with merging techniques that include SVD instruct-subspace masks and Fisher importance-based dampening, which are intended to preserve instruction-following while improving prose quality. The stated goal is creative text with less repetition and better narrative flow.

Overview

Gemma-4-E4B-Luchador is a Gemma 4 merge by rpDungeon with 7.9 billion parameters (effective ~4B), built for roleplay and creative writing. It keeps strong instruction-following, with an IFEval strict-P score of 85.95%, while improving prose quality and reducing repetition, with a +3.07 h_delta vs. the IT (instruction-tuned) baseline.

Key Capabilities

  • Enhanced Prose Quality: Achieves a +3.07 h_delta vs. the IT baseline, which is reported as an indicator of improved writing style and narrative flow.
  • Reduced Repetition: Scores lower than the baseline on the 'slop' (22.6) and 'rep3g' (5.7) counts, which should make output more varied and engaging.
  • Instruction-Following Preservation: Uses SVD instruct-subspace masks and Fisher importance with per-layer dampening so that the base model's instruction-following ability remains intact.
  • Specialized Fine-tuning: Incorporates style and prose-quality signals from multiple upstream adapters and datasets, including a 7-corpus blend of long-form RP/creative-writing data.
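
This card does not define how 'rep3g' is computed. One plausible reading is a repeated 3-gram (trigram) rate, and the sketch below illustrates that reading only. The function name `repeated_trigram_rate`, the whitespace tokenization, and the percentage scaling are all assumptions for illustration, not the rpDungeon scoreboard's actual metric.

```python
from collections import Counter


def repeated_trigram_rate(text: str) -> float:
    """Percentage of word trigrams that repeat an earlier trigram.

    Hypothetical stand-in for a 'rep3g'-style metric; the real
    definition used for this model is not documented in the card.
    """
    tokens = text.lower().split()  # naive whitespace tokenization (assumption)
    trigrams = [tuple(tokens[i:i + 3]) for i in range(len(tokens) - 2)]
    if not trigrams:
        return 0.0
    counts = Counter(trigrams)
    # Every occurrence after the first counts as a repeat.
    repeated = sum(c - 1 for c in counts.values() if c > 1)
    return 100.0 * repeated / len(trigrams)
```

A lower value means fewer recycled phrases, which matches the direction of improvement the card reports.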

Techniques Used

The merge relies on rpDungeon's in-house tooling:

  • SVD Instruct-Subspace Mask: Protects the instruction-following manifold during training by projecting style-LoRA gradients out of the IT model's low-rank subspace.
  • Fisher Importance + Per-Layer Dampening: Applies a multiplicative dampener based on parameter and layer importance to control how much merge delta is applied, preserving critical instruction-following parameters.
  • Slerp Heal: Uses spherical linear interpolation (slerp) for the final merge, which is intended to give a smoother loss landscape and better performance for tightly coupled weights.
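
The three techniques above can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not rpDungeon's tooling: the function names (`project_out_subspace`, `dampened_delta`, `slerp`), the use of left-singular vectors for the protected subspace, the `layer_scale / (1 + strength * fisher)` dampening form, and every parameter are hypothetical choices made for the example.

```python
import numpy as np


def project_out_subspace(grad, it_weight, rank):
    """Remove the part of `grad` that lies in the top-`rank`
    left-singular subspace of the IT weight matrix (assumed form)."""
    u, _, _ = np.linalg.svd(it_weight, full_matrices=False)
    basis = u[:, :rank]                      # orthonormal columns, shape (out, rank)
    return grad - basis @ (basis.T @ grad)   # subtract the in-subspace component


def dampened_delta(delta, fisher, layer_scale=1.0, strength=1.0):
    """Shrink a merge delta where (normalized) Fisher importance is high.

    `layer_scale` is a hypothetical per-layer multiplier in [0, 1].
    """
    norm = fisher / (fisher.max() + 1e-12)        # normalize importance to [0, 1]
    damp = layer_scale / (1.0 + strength * norm)  # important params move less
    return delta * damp


def slerp(w_a, w_b, t, eps=1e-8):
    """Spherical linear interpolation between two weight tensors."""
    a, b = w_a.ravel(), w_b.ravel()
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps)
    omega = np.arccos(np.clip(cos, -1.0, 1.0))    # angle between the two vectors
    so = np.sin(omega)
    if so < 1e-6:
        # Nearly parallel or anti-parallel: fall back to plain lerp.
        return (1.0 - t) * w_a + t * w_b
    out = (np.sin((1.0 - t) * omega) / so) * a + (np.sin(t * omega) / so) * b
    return out.reshape(w_a.shape)
```

In a merge pipeline these would typically be applied per weight matrix: the projection during style-LoRA training, the dampener when adding the merge delta to the IT weights, and slerp as the last blending step. The actual order and granularity used for this model are not stated in the card.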

Limitations

  • Evaluation has focused on IFEval for instruction-following and an internal scoreboard for style; external benchmarks (MMLU, HellaSwag) have not been run.
  • Not recommended for code, math, or safety-critical generation, because it was trained on conversational and roleplay data.