KeeganCarey/gemma-3-1b-it-amr_thinking-2

Text Generation | Concurrency Cost: 1 | Model Size: 1B | Quant: BF16 | Ctx Length: 32k | Published: Apr 10, 2026 | License: gemma | Architecture: Transformer

KeeganCarey/gemma-3-1b-it-amr_thinking-2 is a 1 billion parameter, Gemma-based, instruction-tuned language model, fine-tuned with Group Relative Policy Optimization (GRPO) to generate structured reasoning traces. It is designed to output a step-by-step thinking process followed by a final answer, which makes it suitable for tasks that require explicit reasoning. With a 32k token context length, it targets problem-solving tasks where the intermediate thought process matters.


Overview

This model is a 1 billion parameter instruction-tuned variant of the Gemma architecture, optimized for generating structured reasoning. It was developed by KeeganCarey and fine-tuned with Group Relative Policy Optimization (GRPO), a reinforcement learning method, starting from a Supervised Fine-Tuning (SFT) base model.
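For readers unfamiliar with GRPO, the core idea is that several completions are sampled for the same prompt and each one is scored relative to the others in its group. The following is a minimal sketch of that group-relative advantage step only, not this model's training code. The reward values, the function name, and the use of a population standard deviation with a small epsilon are assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of completions for a single prompt.

    Each advantage is (reward - group mean) / (group std + eps), so
    completions that beat the group average get a positive advantage.
    """
    if not rewards:
        raise ValueError("rewards must contain at least one value")
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an illustrative choice
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled completions of one prompt,
# e.g. 1.0 when the output format and answer are acceptable, else 0.0.
example_rewards = [1.0, 0.0, 1.0, 0.0]
example_advantages = group_relative_advantages(example_rewards)
```

When every completion in a group receives the same reward, all advantages come out as zero, so that group contributes no learning signal.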

Key Capabilities

  • Structured Reasoning Output: The model is engineered to produce a distinct <reasoning> block detailing its thought process, followed by an <answer> block for the final result. This explicit output format is highly beneficial for transparency and debugging in AI applications.
  • Enhanced Problem Solving: By focusing on generating intermediate reasoning steps, the model aims to improve performance on tasks that require complex logical deduction or multi-step problem-solving.
  • 32k Context Length: Built upon a base model with a 32,768 token context window, it can accept long inputs, such as multi-page documents or extended conversations, within a single prompt.
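Downstream code usually needs to separate the reasoning trace from the final answer. The following is a minimal sketch of such a parser, assuming the model closes each block with `</reasoning>` and `</answer>` tags (the closing tags are not stated in this card). The names `ParsedOutput` and `parse_output` are hypothetical helpers, not part of any library.

```python
import re
from typing import NamedTuple, Optional


class ParsedOutput(NamedTuple):
    reasoning: Optional[str]  # text inside <reasoning>...</reasoning>, if present
    answer: Optional[str]     # text inside <answer>...</answer>, if present


# Assumption: blocks are closed with matching </reasoning> and </answer> tags.
_REASONING_RE = re.compile(r"<reasoning>(.*?)</reasoning>", re.DOTALL)
_ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def parse_output(text: str) -> ParsedOutput:
    """Extract the first reasoning block and the first answer block."""
    reasoning = _REASONING_RE.search(text)
    answer = _ANSWER_RE.search(text)
    return ParsedOutput(
        reasoning=reasoning.group(1).strip() if reasoning else None,
        answer=answer.group(1).strip() if answer else None,
    )


# Hand-written sample string, not real model output.
sample = "<reasoning>\n2 + 2 = 4\n</reasoning>\n<answer>4</answer>"
parsed = parse_output(sample)
```

Returning `None` for a missing block, rather than raising, lets callers decide how to handle generations that do not follow the expected format.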

Training Details

The model was trained with Supervised Fine-Tuning (SFT) followed by GRPO, using a LoRA configuration with a rank of 32 and an alpha of 64.0. Training was conducted with the Tunix (JAX) framework on a v6e-1 TPU.
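To give a sense of what a rank of 32 and an alpha of 64.0 imply, the sketch below computes the adapter scaling factor and the adapter size for a single weight matrix. It assumes the common LoRA convention in which the low-rank update is scaled by alpha / rank; whether Tunix applies exactly this convention is not stated in this card. The 1024 x 1024 layer shape is hypothetical and chosen only for the arithmetic.

```python
def lora_scaling(rank: int, alpha: float) -> float:
    """Scaling applied to the low-rank update under the alpha / rank convention."""
    if rank <= 0:
        raise ValueError("rank must be positive")
    return alpha / rank


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters in one adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


RANK = 32     # from the training details above
ALPHA = 64.0  # from the training details above

scale = lora_scaling(RANK, ALPHA)

# Hypothetical layer shape, for illustration only.
D_IN, D_OUT = 1024, 1024
adapter_params = lora_param_count(D_IN, D_OUT, RANK)
full_params = D_IN * D_OUT
fraction_trained = adapter_params / full_params
```

Under these assumptions the update is scaled by 2.0, and the adapter for the hypothetical layer holds 65,536 parameters, about 6% of the 1,048,576 parameters in the full matrix.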

Good for

  • Applications requiring explainable AI outputs.
  • Tasks where the step-by-step thought process is as important as the final answer.
  • Educational tools that demonstrate problem-solving methodologies.
  • Automated systems needing to justify their decisions or conclusions.