KeeganC/gemma-3-1b-it-amr_thinking-2
Text generation · Concurrency cost: 1 · Model size: 1B · Quant: BF16 · Context length: 32k · Published: Apr 10, 2026 · License: Gemma · Architecture: Transformer
KeeganC/gemma-3-1b-it-amr_thinking-2 is a 1-billion-parameter, Gemma-based, instruction-tuned model, fine-tuned with Group Relative Policy Optimization (GRPO) to generate structured reasoning traces. The model is trained to emit a step-by-step thinking process alongside its final answer, making it suitable for tasks that benefit from explicit reasoning. It builds on the chimbiwide/gemma-3-1b-it-thinking-32k-sft-base model and supports a context length of 32768 tokens.
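Because the model interleaves a reasoning trace with the final answer, downstream code typically needs to separate the two. The sketch below is a minimal, hypothetical post-processing helper: it assumes the trace is wrapped in `<think>`/`</think>` delimiters, a common convention for thinking fine-tunes, but the actual delimiters used by this model's chat template may differ and should be checked against its tokenizer config.

```python
# Sketch: splitting a generated string into (reasoning trace, final answer).
# ASSUMPTION: the model wraps its reasoning in <think>...</think> tags; adjust
# the delimiters to match this model's actual output format.

def split_reasoning(text: str,
                    open_tag: str = "<think>",
                    close_tag: str = "</think>") -> tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty if no trace is found."""
    start = text.find(open_tag)
    end = text.find(close_tag)
    if start == -1 or end == -1 or end < start:
        # No well-formed trace: treat the whole output as the answer.
        return "", text.strip()
    reasoning = text[start + len(open_tag):end].strip()
    answer = text[end + len(close_tag):].strip()
    return reasoning, answer

sample = "<think>2 + 2 equals 4 by basic addition.</think>The answer is 4."
print(split_reasoning(sample))
# → ('2 + 2 equals 4 by basic addition.', 'The answer is 4.')
```

In practice the `sample` string would come from a `transformers` text-generation pipeline loaded with this model; the helper itself is model-agnostic.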