Overview
This model, gemma-3-1b-it-ghigliottina-grpo-merged-ckpt564, is a 1-billion-parameter fine-tune of the Gemma-3-1B-IT base model by nazdef. It is designed for the Italian word game "Ghigliottina," where the goal is to find the single common word linking five given clues. The model is a merged version of the base model and a LoRA adapter, so it can be loaded directly as a standalone model without applying the adapter separately.
Key Capabilities
- Ghigliottina Game Solving: Optimized to identify the common word connecting five bullet-point clues in Italian.
- Structured Output: Trained to produce output in a specific format, including a `<think>` section for reasoning and a `soluzione: <parola>.` line for the final answer.
- GRPO Training: Utilizes a custom GRPO (Group Relative Policy Optimization) pipeline with multi-component reward shaping, including format rewards, exact match, embedding similarity, and reasoning rewards.
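Because the model emits a fixed structure, downstream code can recover both the reasoning and the answer with simple pattern matching. The sketch below assumes exactly the format described above (a `<think>…</think>` block followed by a `soluzione: <parola>.` line); the function name and regexes are illustrative, not part of the model's tooling.

```python
import re

def parse_ghigliottina_output(text: str):
    """Extract reasoning and final answer from the model's structured output.

    Assumes the format described in the model card: an optional
    <think>...</think> block followed by a line like `soluzione: parola.`
    """
    think_match = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    reasoning = think_match.group(1).strip() if think_match else None

    answer_match = re.search(r"soluzione:\s*(\w+)", text, re.IGNORECASE)
    answer = answer_match.group(1).lower() if answer_match else None
    return reasoning, answer

example = "<think>Tutti gli indizi richiamano il tempo.</think>\nsoluzione: tempo."
print(parse_ghigliottina_output(example))
# → ('Tutti gli indizi richiamano il tempo.', 'tempo')
```

Returning `None` for missing parts is useful here, since (as noted under Limitations) this checkpoint does not always adhere to the strict format.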
Use Cases
- Italian Word Game Applications: Ideal for integrating into applications that require solving the Ghigliottina game or similar word association tasks in Italian.
- Baseline for Further Development: Serves as a merged baseline model for continued experimentation and improvement in structured reasoning tasks.
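To integrate the model into such an application, the five clues have to be presented as the bullet-point list the model was trained on. The exact prompt wording below is an assumption (the card only states that five clues are given as bullet points in Italian); only the five-clue, one-bullet-per-clue structure is taken from the source.

```python
def build_prompt(clues):
    """Format five Ghigliottina clues as a bullet-point prompt.

    The instruction sentence is a hypothetical example; the model card
    does not specify the training prompt verbatim.
    """
    if len(clues) != 5:
        raise ValueError("La Ghigliottina richiede esattamente 5 indizi")
    bullet_list = "\n".join(f"- {clue}" for clue in clues)
    return (
        "Trova la parola che collega i seguenti cinque indizi:\n"
        f"{bullet_list}"
    )

print(build_prompt(["perso", "libero", "pieno", "reale", "scaduto"]))
```

The resulting string can then be passed to the merged model through a standard `transformers` text-generation pipeline, with no PEFT/LoRA loading step required.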
Limitations
- As an intermediate checkpoint, the model may not always perfectly adhere to the strict output format.
- The `exact_match` reward is still low at this specific checkpoint.