nazdef/gemma-3-1b-it-ghigliottina-grpo-merged-ckpt564

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:1BQuant:BF16Ctx Length:32kPublished:Mar 4, 2026License:gemmaArchitecture:Transformer Warm

The nazdef/gemma-3-1b-it-ghigliottina-grpo-merged-ckpt564 is a 1 billion parameter Gemma-3-1B-IT model, fine-tuned by nazdef, specifically optimized for solving the Italian "Ghigliottina" word game. This model integrates a LoRA adapter via `merge_and_unload()` to provide a standalone solution, excelling at identifying a common word linking five given clues. It is designed for structured output, including a thinking process and a precise solution format.

Loading preview...

Overview

This model, gemma-3-1b-it-ghigliottina-grpo-merged-ckpt564, is a 1 billion parameter Gemma-3-1B-IT base model fine-tuned by nazdef. It's specifically designed for the Italian word game "Ghigliottina," where the goal is to find a single common word linking five given clues. The model is a merged version of the base model and a LoRA adapter, making it a standalone, directly loadable model without needing separate adapter application.

Key Capabilities

  • Ghigliottina Game Solving: Optimized to identify the common word connecting five bullet-point clues in Italian.
  • Structured Output: Trained to produce output in a specific format, including a <think> section for reasoning and a soluzione: <parola>. for the final answer.
  • GRPO Training: Utilizes a custom GRPO (Generative Reinforcement Learning with Policy Optimization) pipeline with multi-component reward shaping, including format rewards, exact match, embedding similarity, and reasoning rewards.

Use Cases

  • Italian Word Game Applications: Ideal for integrating into applications that require solving the Ghigliottina game or similar word association tasks in Italian.
  • Baseline for Further Development: Serves as a merged baseline model for continued experimentation and improvement in structured reasoning tasks.

Limitations

  • As an intermediate checkpoint, the model may not always perfectly adhere to the strict output format.
  • The exact_match performance is noted as not yet high at this specific checkpoint.