Overview
KeeganC/gemma-3-1b-it-amr_thinking is a 1-billion-parameter instruction-tuned model built on the Gemma architecture, designed specifically for generating structured reasoning. Developed by KeeganC, the model was trained with Group Relative Policy Optimization (GRPO), a reinforcement-learning method that distinguishes it from standard fine-tuning approaches.
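The core idea in GRPO is to sample a group of completions per prompt and normalize each completion's reward against the group's own statistics, rather than against a learned value function. A minimal sketch of that group-relative advantage step (illustrative only, not the actual training code):

```python
from statistics import mean, stdev

def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each completion's reward against its group's statistics.

    GRPO samples several completions for the same prompt and uses the
    group mean (and standard deviation) as the baseline, so no separate
    value model is needed.
    """
    mu = mean(rewards)
    sigma = stdev(rewards) if len(rewards) > 1 else 0.0
    return [(r - mu) / (sigma + eps) for r in rewards]

# Four sampled completions for one prompt, scored by a reward function:
advs = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
```

Completions scoring above the group mean get positive advantages and are reinforced; those below the mean are discouraged.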
Key Capabilities
- Structured Reasoning Output: The model is trained to produce output in a specific format, wrapping its step-by-step thought process in a <reasoning> tag and the final solution in an <answer> tag. This makes its decision-making process transparent.
- GRPO Training: Utilizes GRPO on top of a base model (chimbiwide/gemma-3-1b-it-thinking-32k-sft-base) that was first supervised fine-tuned (SFT). This training method aims to enhance its ability to generate coherent and logical reasoning traces.
- Extended Context Length: Features a 32,768-token context window, allowing it to process and reason over longer and more complex inputs.
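Downstream code can recover the two fields from a completion with a simple parse. A sketch assuming the trained format of one <reasoning> block followed by one <answer> block (the helper name here is illustrative, not part of the model card):

```python
import re

def parse_thinking_output(text):
    """Split a completion into its reasoning trace and final answer.

    Assumes the model's trained format: a <reasoning>...</reasoning>
    block followed by an <answer>...</answer> block.
    """
    reasoning = re.search(r"<reasoning>(.*?)</reasoning>", text, re.DOTALL)
    answer = re.search(r"<answer>(.*?)</answer>", text, re.DOTALL)
    return (
        reasoning.group(1).strip() if reasoning else None,
        answer.group(1).strip() if answer else None,
    )

completion = (
    "<reasoning>2 apples + 3 apples = 5 apples.</reasoning>"
    "<answer>5</answer>"
)
steps, final = parse_thinking_output(completion)
```

Returning None for a missing tag lets callers detect completions that drifted from the expected format.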
Use Cases
This model is particularly well-suited for applications where not just the answer, but also the methodology and thought process leading to that answer, are crucial. This includes tasks such as:
- Problem-solving requiring explicit logical steps.
- Educational tools that demonstrate how to arrive at solutions.
- Automated reasoning systems where transparency is key.
Training Details
The model was trained using the Tunix (JAX) framework on a single v6e-1 TPU, with LoRA rank 32 and LoRA alpha 64.0, a parameter-efficient fine-tuning setup that updates only small low-rank adapter matrices rather than the full weights.
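In LoRA, each adapted weight gets two small trainable matrices whose product is scaled by alpha/rank; with the reported rank 32 and alpha 64.0 the scaling factor is 2.0. A minimal numpy sketch of how the low-rank update folds into a frozen base weight (illustrative, not Tunix code; the matrix sizes are made up):

```python
import numpy as np

rank, alpha = 32, 64.0
scaling = alpha / rank  # 2.0 for this model's configuration

d_out, d_in = 64, 48
rng = np.random.default_rng(0)
W = rng.normal(size=(d_out, d_in))        # frozen base weight
A = rng.normal(size=(rank, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, rank))               # trainable up-projection, zero-init

# Effective weight during (and after merging) LoRA fine-tuning:
W_eff = W + scaling * (B @ A)
```

Because B is conventionally zero-initialized, the adapter contributes nothing at the start of training, and only A and B are updated, which is what makes the run parameter-efficient.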