odats/rl_nmt_2026_04_03_17_04

Hosted on: Hugging Face
Task: Text generation
Concurrency cost: 1
Model size: 1B parameters
Quantization: BF16
Context length: 32k tokens
Published: Apr 3, 2026
Architecture: Transformer

odats/rl_nmt_2026_04_03_17_04 is a 1-billion-parameter language model fine-tuned from google/gemma-3-1b-it using the GRPO method, a reinforcement-learning approach designed to enhance mathematical reasoning. Building on its instruction-tuned Gemma base, the model targets tasks that require mathematical problem-solving and logical deduction.


Model Overview

odats/rl_nmt_2026_04_03_17_04 is a 1-billion-parameter language model fine-tuned from google/gemma-3-1b-it. It was trained with the TRL framework.
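A minimal loading sketch, assuming the model is hosted on the Hugging Face Hub under the id shown in this card and follows the standard transformers API for causal language models (the bfloat16 dtype matches the BF16 quantization listed above):

```python
MODEL_ID = "odats/rl_nmt_2026_04_03_17_04"  # model id from this card

def load_model(model_id: str = MODEL_ID):
    """Load tokenizer and model in bfloat16 via the standard transformers API.

    Imports are kept local so this sketch can be read without torch or
    transformers installed; in a real script, hoist them to the top.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # matches the card's BF16 quantization
    )
    return tokenizer, model
```

At 1B parameters in BF16, the weights occupy roughly 2 GB, so the model fits comfortably on a single consumer GPU or even CPU for light use.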

Key Capabilities

  • Enhanced Mathematical Reasoning: The model was trained using the GRPO (Group Relative Policy Optimization) method, introduced in the paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models" (arXiv:2402.03300). This training approach is intended to improve performance on mathematical and logical reasoning tasks.
  • Instruction Following: As a fine-tuned version of an instruction-tuned model, it is designed to follow user prompts effectively.
  • Context Length: Supports a substantial context length of 32768 tokens, allowing for processing longer inputs and maintaining coherence over extended interactions.
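The core idea of GRPO can be sketched in a few lines: instead of a learned value function, each sampled completion's reward is normalized against the mean and standard deviation of its own group of samples for the same prompt (per the DeepSeekMath paper cited above). A minimal illustration, with rewards and group size chosen for the example:

```python
import statistics

def group_relative_advantages(rewards: list[float]) -> list[float]:
    """GRPO-style advantages: z-score each completion's reward within its group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # guard against zero-variance groups
    return [(r - mean) / std for r in rewards]

# Four completions sampled for one math problem, scored 1.0 when the
# final answer checks out and 0.0 otherwise (a hypothetical reward scheme).
advantages = group_relative_advantages([1.0, 0.0, 1.0, 1.0])
```

Correct completions receive a positive advantage and incorrect ones a negative advantage, and the advantages within each group sum to zero, which is what lets GRPO dispense with a separate critic model.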

Ideal Use Cases

This model is particularly well-suited for applications that require:

  • Solving complex mathematical problems.
  • Logical deduction and reasoning tasks.
  • Generating responses that demand a structured, analytical approach.
  • Scenarios where a smaller, specialized model can outperform larger general-purpose models on specific reasoning benchmarks.
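For such use cases, prompts should follow the chat format of the Gemma base model. The sketch below hand-writes that format for illustration; the turn markers are an assumption carried over from google/gemma-3-1b-it, and in practice `tokenizer.apply_chat_template` is the safer way to produce them:

```python
def build_math_prompt(question: str) -> str:
    """Format a single-turn prompt in the Gemma chat style (assumed to carry
    over from the google/gemma-3-1b-it base; verify against the tokenizer's
    chat template before relying on it)."""
    return (
        "<start_of_turn>user\n"
        f"{question}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

prompt = build_math_prompt("What is the sum of the first 20 positive integers?")
```

The resulting string can be tokenized and passed to the model's `generate` method; ending the prompt at the opening of the model turn cues the model to produce its answer next.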