Name: VikramR/cypherbench-grpo-4.3 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: VikramR

Overview

VikramR/cypherbench-grpo-4.3 is a 5.1 billion parameter language model, fine-tuned from the google/gemma-4-E2B-it base model. Fine-tuning was done with the TRL library using GRPO (Group Relative Policy Optimization).
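To make the GRPO step concrete, here is a minimal sketch of its core idea: several completions are sampled for one prompt, and each completion's reward is normalized against the group's mean and standard deviation to produce an advantage. This is an illustration only, not the training code used for this model; the function name, the `eps` value, and the use of the population standard deviation are assumptions made for the example.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-4):
    """Score each completion relative to the others sampled for the same prompt.

    `rewards` holds one scalar reward per sampled completion (hypothetical input).
    `eps` is an assumed small constant that avoids division by zero when all
    rewards in the group are identical.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation (assumption)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four completions for one prompt: two rewarded, two not.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Completions that score above the group mean get a positive advantage and those below get a negative one, so no separate value model is needed to provide a baseline.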

Key Capabilities

Enhanced Mathematical Reasoning: The model's training with GRPO, a method introduced in the DeepSeekMath paper, suggests a focus on improving mathematical problem-solving and logical deduction.
Instruction Following: As a fine-tuned instruction model, it is designed to respond effectively to user prompts and questions.

Training Details

The model was trained with the following framework versions:

TRL: 1.6.0
Transformers: 5.12.0
PyTorch: 2.10.0+cu129
Datasets: 4.8.5
Tokenizers: 0.22.2
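If you want to compare a local environment against these versions, a small check along the following lines may help. It is a sketch: the package names are assumed to be the usual PyPI distribution names (`torch` for PyTorch), and `version_mismatches` is a hypothetical helper, not part of any of the listed libraries.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions as listed in this model card; distribution names are assumptions.
EXPECTED = {
    "trl": "1.6.0",
    "transformers": "5.12.0",
    "torch": "2.10.0+cu129",
    "datasets": "4.8.5",
    "tokenizers": "0.22.2",
}


def version_mismatches(expected, lookup=version):
    """Return {package: (wanted, found)} for every package that differs.

    `found` is None when the package is not installed. `lookup` defaults to
    importlib.metadata.version and can be swapped out for testing.
    """
    mismatches = {}
    for package, wanted in expected.items():
        try:
            found = lookup(package)
        except PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


for package, (wanted, found) in version_mismatches(EXPECTED).items():
    print(f"{package}: card lists {wanted}, found {found}")
```

A mismatch does not necessarily mean the model will fail to load; the check only reports where the environment differs from the one recorded for training.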

Good For

Applications requiring strong mathematical reasoning.
Tasks involving complex logical problem-solving.
General instruction-following scenarios where a robust understanding of numerical and logical relationships is beneficial.
