Name: KeeganCarey/gemma-3-1b-it-amr_thinking API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: KeeganCarey

Model Overview

KeeganCarey/gemma-3-1b-it-amr_thinking is a 1-billion-parameter model built on the Gemma 3 architecture and fine-tuned to generate structured reasoning. It is trained with Group Relative Policy Optimization (GRPO) so that it produces an explicit, step-by-step thought process alongside its final answer.
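For readers unfamiliar with GRPO, its central idea is to score each sampled completion relative to the other completions drawn for the same prompt. The following is a minimal sketch of that group-relative normalization only, not the training code used for this model; the function name, the reward values, and the choice of population standard deviation are illustrative assumptions.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of completions for the same prompt.

    Hypothetical helper: each advantage is the reward's distance from the
    group mean, divided by the group standard deviation (eps avoids a
    division by zero when all rewards are equal).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up rewards for four completions sampled from one prompt.
rewards = [1.0, 0.0, 0.5, 0.5]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Completions that score above the group average receive a positive advantage and are reinforced; those below the average receive a negative one.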

Key Capabilities

Structured Reasoning Output: Generates output in a distinct <reasoning>step-by-step thinking process</reasoning><answer>final answer</answer> format.
GRPO Training: Utilizes Group Relative Policy Optimization for improved reasoning trace generation.
Extended Context Window: Supports a 32k-token context length, which allows longer inputs and more complex reasoning tasks.
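Because the output follows a fixed tag format, downstream code can separate the reasoning trace from the final answer. The sketch below assumes the completion contains exactly the <reasoning> and <answer> tags described above; the function name and the sample string are hypothetical, and a real completion may need more lenient handling.

```python
import re

# Matches the <reasoning>...</reasoning><answer>...</answer> layout.
# DOTALL lets the reasoning span multiple lines.
_PATTERN = re.compile(
    r"<reasoning>(?P<reasoning>.*?)</reasoning>\s*<answer>(?P<answer>.*?)</answer>",
    re.DOTALL,
)


def parse_completion(text):
    """Return (reasoning, answer), or None if the tags are missing."""
    match = _PATTERN.search(text)
    if match is None:
        return None
    return match.group("reasoning").strip(), match.group("answer").strip()


# Hypothetical completion text, used only to exercise the parser.
sample = "<reasoning>2 + 2 is 4, and 4 * 3 is 12.</reasoning><answer>12</answer>"
parsed = parse_completion(sample)
print(parsed)
```

Returning None for malformed output lets the caller decide whether to retry the request or fall back to the raw text.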

Training Details

This model was trained with a combination of Supervised Fine-Tuning (SFT) and GRPO, starting from the base model chimbiwide/gemma-3-1b-it-thinking-32k-sft-base. Training ran on the Tunix (JAX) framework on a v6e-1 TPU, with LoRA rank 32 and LoRA alpha 64.0.
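To put the LoRA settings in context: under the common LoRA convention, the low-rank update is scaled by alpha / rank, which for rank 32 and alpha 64.0 gives a factor of 2.0. The sketch below assumes that convention applies here (the source does not confirm how Tunix scales the update) and uses tiny made-up matrices with a hypothetical helper name.

```python
LORA_RANK = 32
LORA_ALPHA = 64.0


def lora_scale(alpha, rank):
    """Scaling factor under the common alpha / rank convention (assumed)."""
    return alpha / rank


def lora_delta(B, A, alpha, rank):
    """Return scale * (B @ A) using plain lists.

    B has shape (d_out, r) and A has shape (r, d_in); the result is the
    weight update that would be added to the frozen base weight.
    """
    scale = lora_scale(alpha, rank)
    return [
        [scale * sum(B[i][k] * A[k][j] for k in range(len(A))) for j in range(len(A[0]))]
        for i in range(len(B))
    ]


print(lora_scale(LORA_ALPHA, LORA_RANK))

# Toy example with rank 2 and alpha 4.0, which gives the same 2.0 scale.
B = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0], [3.0, 4.0]]
delta = lora_delta(B, A, alpha=4.0, rank=2)
print(delta)
```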

Ideal Use Cases

This model is well suited to applications where the explicit thought process matters as much as the answer itself. Examples include:

Problem-solving requiring transparent steps.
Educational tools that explain solutions.
Debugging or diagnostic systems that outline reasoning.
