lmassaron/gemma-2-2b-it-grpo-gsm8k
The lmassaron/gemma-2-2b-it-grpo-gsm8k model is a 2.6-billion-parameter variant of Gemma-2-2b-it, fine-tuned by lmassaron with GRPOTrainer on the GSM8K dataset. GRPO (Group Relative Policy Optimization) is known for strengthening mathematical problem-solving in language models, and the model is intended for applications that require robust arithmetic and logical deduction.
Model Overview
This model, lmassaron/gemma-2-2b-it-grpo-gsm8k, is a fine-tuned version of Google's Gemma-2-2b-it with 2.6 billion parameters and an 8192-token context length. It was developed by lmassaron with a primary focus on strengthening mathematical reasoning.
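Since this is a standard Hugging Face checkpoint, it should load with the usual transformers chat API. A minimal inference sketch (the word problem is illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "lmassaron/gemma-2-2b-it-grpo-gsm8k"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# A GSM8K-style word problem, formatted with the Gemma chat template.
messages = [{
    "role": "user",
    "content": "Natalia sold clips to 48 of her friends in April, and then "
               "she sold half as many clips in May. How many clips did "
               "Natalia sell altogether in April and May?",
}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```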
Key Capabilities
- Mathematical Reasoning: Fine-tuned on the GSM8K dataset, the model is particularly adept at solving grade-school math word problems.
- GRPO Training Method: It was trained with GRPOTrainer, implementing GRPO (Group Relative Policy Optimization) as introduced in the DeepSeekMath paper, a method designed to push the limits of mathematical reasoning in open language models (see the sketch after this list).
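For reference, here is a minimal sketch of this style of GRPO fine-tuning, assuming TRL's GRPOTrainer (which matches the trainer named above) and the openai/gsm8k dataset on the Hugging Face Hub. The correctness reward below is a hypothetical example; the reward functions actually used for this checkpoint are not documented in this card.

```python
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# GSM8K reference answers end with "#### <number>".
def extract_final_answer(text: str) -> str:
    return text.split("####")[-1].strip()

# GRPOTrainer expects a "prompt" column; GSM8K provides "question"/"answer".
dataset = load_dataset("openai/gsm8k", "main", split="train")
dataset = dataset.rename_column("question", "prompt")

# Hypothetical reward: 1.0 when the completion's final "#### ..." answer
# matches the reference, 0.0 otherwise. Dataset columns (here, "answer")
# are passed to reward functions as keyword arguments.
def correctness_reward(completions, answer, **kwargs):
    return [
        1.0 if extract_final_answer(c) == extract_final_answer(a) else 0.0
        for c, a in zip(completions, answer)
    ]

trainer = GRPOTrainer(
    model="google/gemma-2-2b-it",
    reward_funcs=correctness_reward,
    args=GRPOConfig(output_dir="gemma-2-2b-it-grpo-gsm8k"),
    train_dataset=dataset,
)
trainer.train()
```

GRPO samples a group of completions per prompt and scores them against each other with the reward functions, which is why a simple per-completion correctness signal like the one above suffices as a starting point.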
Good For
- Arithmetic Problem Solving: Ideal for tasks requiring numerical comparison, basic arithmetic, and multi-step mathematical deduction.
- Educational Applications: Can be used in tools or systems that assist with or evaluate mathematical understanding at a foundational level.
- Research in Mathematical LLMs: Provides a specific example of a model trained with GRPO for mathematical reasoning, useful for comparative studies or further development.