electroglyph/gemma-3-4b-it-unslop-GRPO-v3

Vision | Concurrency Cost: 1 | Model Size: 4.3B | Quant: BF16 | Ctx Length: 32k | Published: Aug 20, 2025 | License: gemma | Architecture: Transformer

electroglyph/gemma-3-4b-it-unslop-GRPO-v3 is a 4.3 billion parameter instruction-tuned language model developed by electroglyph, fine-tuned from Google's Gemma-3-4b-it. This model focuses on refining text generation characteristics, specifically targeting reduced unusual phrasing and improved lexical diversity through a novel reward system during training. It is optimized for generating more natural and varied text outputs, making it suitable for applications requiring high-quality, less repetitive conversational or creative content.


Overview

electroglyph/gemma-3-4b-it-unslop-GRPO-v3 is a 4.3 billion parameter instruction-tuned model built on Google's gemma-3-4b-it. It is the third iteration of electroglyph's 'unslop' experiments, which aim to improve text generation quality and reduce the idiosyncratic outputs often seen in large language models. Its main difference from earlier iterations is the training setup, which adjusts the sampling temperature and the reward functions to produce more natural and diverse language.

Key Capabilities

  • Improved Text Coherence: Training at a temperature of 1.0 has significantly reduced 'weird' or nonsensical outputs, leading to more coherent and logical responses.
  • Reduced Repetitive Phrasing: The reward system was adjusted to allow a controlled number of complex sentences (e.g., with multiple commas), thereby cutting down on excessive parenthetical phrases without eliminating natural sentence structures.
  • Enhanced Lexical Diversity: A lexical diversity score based on the Measure of Textual Lexical Diversity (MTLD), measured on a large corpus of books, was integrated into the reward system. This encourages the model to produce text with a richer vocabulary, aiming for an MTLD score between 80 and 120.
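
The two reward adjustments above can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's training code: the tokenizer, the 0.72 TTR threshold (the value commonly used for MTLD), the linear falloff outside the 80 to 120 band, and the names `mtld`, `band_reward`, and `complex_sentence_penalty` with their default parameters are all assumptions made for this example.

```python
import re


def tokenize(text):
    # Assumption: lowercase word tokens; the real reward may tokenize differently.
    return re.findall(r"[a-z']+", text.lower())


def _mtld_one_pass(tokens, threshold=0.72):
    # Count "factors": segments whose running type-token ratio falls to the threshold.
    factors = 0.0
    types = set()
    count = 0
    for tok in tokens:
        count += 1
        types.add(tok)
        if len(types) / count <= threshold:
            factors += 1.0
            types.clear()
            count = 0
    if count > 0:
        # Partial credit for the unfinished final segment.
        ttr = len(types) / count
        factors += (1.0 - ttr) / (1.0 - threshold)
    if factors == 0:
        # Choice made for this sketch: text with no repeats scores its own length.
        return float(len(tokens))
    return len(tokens) / factors


def mtld(text, threshold=0.72):
    # MTLD is conventionally the mean of a forward and a backward pass.
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    forward = _mtld_one_pass(tokens, threshold)
    backward = _mtld_one_pass(tokens[::-1], threshold)
    return (forward + backward) / 2.0


def band_reward(score, low=80.0, high=120.0, falloff=80.0):
    # Full reward inside the target band, linear decay to zero outside it (assumed shape).
    if low <= score <= high:
        return 1.0
    distance = low - score if score < low else score - high
    return max(0.0, 1.0 - distance / falloff)


def complex_sentence_penalty(text, comma_threshold=3, allowed=2, per_excess=0.1):
    # Hypothetical rule: sentences with many commas are tolerated up to `allowed`,
    # and only the excess beyond that allowance is penalized.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    complex_count = sum(1 for s in sentences if s.count(",") >= comma_threshold)
    return per_excess * max(0, complex_count - allowed)
```

A combined reward could then be something like `band_reward(mtld(text)) - complex_sentence_penalty(text)`, although how the actual training run weights these terms is not stated in this card.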

Good for

  • Applications requiring natural language generation where output quality and readability are paramount.
  • Use cases demanding less repetitive and more varied text, such as creative writing, content generation, or advanced chatbots.
  • Developers looking for a Gemma-based model with refined conversational characteristics and improved stylistic control.